Complete Solutions Manual
Linear Algebra
A Modern Introduction
FOURTH EDITION
David Poole
Trent University
Prepared by
Roger Lipsett
Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States
ISBN-13: 978-128586960-5
ISBN-10: 1-28586960-5
© 2015 Cengage Learning
ALL RIGHTS RESERVED. No part of this work covered by the
copyright herein may be reproduced, transmitted, stored, or
used in any form or by any means graphic, electronic, or
mechanical, including but not limited to photocopying,
recording, scanning, digitizing, taping, Web distribution,
information networks, or information storage and retrieval
systems, except as permitted under Section 107 or 108 of the
1976 United States Copyright Act, without the prior written
permission of the publisher except as may be permitted by the
license terms below.
For product information and technology assistance, contact us at
Cengage Learning Customer & Sales Support,
1-800-354-9706.
For permission to use material from this text or product, submit
all requests online at www.cengage.com/permissions
Further permissions questions can be emailed to
permissionrequest@cengage.com.
Cengage Learning
200 First Stamford Place, 4th Floor
Stamford, CT 06902
USA
Cengage Learning is a leading provider of customized
learning solutions with office locations around the globe,
including Singapore, the United Kingdom, Australia,
Mexico, Brazil, and Japan. Locate your local office at:
www.cengage.com/global.
Cengage Learning products are represented in
Canada by Nelson Education, Ltd.
To learn more about Cengage Learning Solutions,
visit www.cengage.com.
Purchase any of our products at your local college
store or at our preferred online store
www.cengagebrain.com.
NOTE: UNDER NO CIRCUMSTANCES MAY THIS MATERIAL OR ANY PORTION THEREOF BE SOLD, LICENSED, AUCTIONED,
OR OTHERWISE REDISTRIBUTED EXCEPT AS MAY BE PERMITTED BY THE LICENSE TERMS HEREIN.
READ IMPORTANT LICENSE INFORMATION
Dear Professor or Other Supplement Recipient:

Cengage Learning has provided you with this product (the “Supplement”) for your review and, to the extent that you adopt the associated textbook for use in connection with your course (the “Course”), you and your students who purchase the textbook may use the Supplement as described below. Cengage Learning has established these use limitations in response to concerns raised by authors, professors, and other users regarding the pedagogical problems stemming from unlimited distribution of Supplements.

Cengage Learning hereby grants you a nontransferable license to use the Supplement in connection with the Course, subject to the following conditions. The Supplement is for your personal, noncommercial use only and may not be reproduced, posted electronically or distributed, except that portions of the Supplement may be provided to your students IN PRINT FORM ONLY in connection with your instruction of the Course, so long as such students are advised that they may not copy or distribute any portion of the Supplement to any third party. You may not sell, license, auction, or otherwise redistribute the Supplement in any form. We ask that you take reasonable steps to protect the Supplement from unauthorized use, reproduction, or distribution. Your use of the Supplement indicates your acceptance of the conditions set forth in this Agreement. If you do not accept these conditions, you must return the Supplement unused within 30 days of receipt.

All rights (including without limitation, copyrights, patents, and trade secrets) in the Supplement are and will remain the sole and exclusive property of Cengage Learning and/or its licensors. The Supplement is furnished by Cengage Learning on an “as is” basis without any warranties, express or implied. This Agreement will be governed by and construed pursuant to the laws of the State of New York, without regard to such State’s conflict of law rules.

Thank you for your assistance in helping to safeguard the integrity of the content contained in this Supplement. We trust you find the Supplement a useful teaching tool.

Printed in the United States of America
1 2 3 4 5 6 7 17 16 15 14 13
Contents
1 Vectors 3
1.1 The Geometry and Algebra of Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Length and Angle: The Dot Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Exploration: Vectors and Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3 Lines and Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Exploration: The Cross Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
1.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2 Systems of Linear Equations 53
2.1 Introduction to Systems of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.2 Direct Methods for Solving Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Exploration: Lies My Computer Told Me . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Exploration: Partial Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Exploration: An Introduction to the Analysis of Algorithms . . . . . . . . . . . . . . . . . . . . . . 77
2.3 Spanning Sets and Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
2.5 Iterative Methods for Solving Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
3 Matrices 129
3.1 Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
3.2 Matrix Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
3.3 The Inverse of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
3.4 The LU Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
3.5 Subspaces, Basis, Dimension, and Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
3.6 Introduction to Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
3.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
4 Eigenvalues and Eigenvectors 235
4.1 Introduction to Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
4.2 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Exploration: Geometric Applications of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . 263
4.3 Eigenvalues and Eigenvectors of n × n Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 270
4.4 Similarity and Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
4.5 Iterative Methods for Computing Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
4.6 Applications and the Perron-Frobenius Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 326
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
5 Orthogonality 371
5.1 Orthogonality in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
5.2 Orthogonal Complements and Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . 379
5.3 The Gram-Schmidt Process and the QR Factorization . . . . . . . . . . . . . . . . . . . . . . 388
Exploration: The Modified QR Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
Exploration: Approximating Eigenvalues with the QR Algorithm . . . . . . . . . . . . . . . . . . . 402
5.4 Orthogonal Diagonalization of Symmetric Matrices . . . . . . . . . . . . . . . . . . . . . . . . 405
5.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
6 Vector Spaces 451
6.1 Vector Spaces and Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
6.2 Linear Independence, Basis, and Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Exploration: Magic Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
6.3 Change of Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
6.4 Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
6.5 The Kernel and Range of a Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . . 498
6.6 The Matrix of a Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
Exploration: Tiles, Lattices, and the Crystallographic Restriction . . . . . . . . . . . . . . . . . . . 525
6.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
7 Distance and Approximation 537
7.1 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
Exploration: Vectors and Matrices with Complex Entries . . . . . . . . . . . . . . . . . . . . . . . 546
Exploration: Geometric Inequalities and Optimization Problems . . . . . . . . . . . . . . . . . . . 553
7.2 Norms and Distance Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
7.3 Least Squares Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
7.4 The Singular Value Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
7.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
8 Codes 633
8.1 Code Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
8.2 Error-Correcting Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
8.3 Dual Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
8.4 Linear Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
8.5 The Minimum Distance of a Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 650
Chapter 1
Vectors

1.1 The Geometry and Algebra of Vectors
1. [Figure: the points (−2, 3), (2, 3), (3, 0), and (3, −2) plotted in the plane.]
2. Since
[2, −3] + [3, 0] = [5, −3],
[2, −3] + [2, 3] = [4, 0],
[2, −3] + [−2, 3] = [0, 0],
[2, −3] + [3, −2] = [5, −5],
plotting those vectors gives
[Figure: the four sums a, b, c, d plotted in standard position.]
3. [Figure: the four vectors a, b, c, d plotted in R³.]
4. Since the heads are all at (3, 2, 1), the tails are at
[3, 2, 1] − [0, 2, 0] = [3, 0, 1],
[3, 2, 1] − [0, 2, 1] = [3, 0, 0],
[3, 2, 1] − [1, −2, 1] = [2, 4, 0],
[3, 2, 1] − [−1, −1, −2] = [4, 3, 3].
5. The four vectors AB are shown below.
[Figure: the four vectors a, b, c, d.]
In standard position, the vectors are
(a) AB = [4 − 1, 2 − (−1)] = [3, 3].
(b) AB = [2 − 0, −1 − (−2)] = [2, 1].
(c) AB = [1/2 − 2, 3 − 3/2] = [−3/2, 3/2].
(d) AB = [1/6 − 1/3, 1/2 − 1/3] = [−1/6, 1/6].
[Figure: the four vectors in standard position.]
     
6. Recall the notation that [a, b] denotes a move of a units horizontally and b units vertically. Then during the first part of the walk, the hiker walks 4 km north, so a = [0, 4]. During the second part of the walk, the hiker walks a distance of 5 km northeast, so
b = [5 cos 45°, 5 sin 45°] = [5√2/2, 5√2/2].
Thus the net displacement vector is
c = a + b = [5√2/2, 4 + 5√2/2].
7. a + b = [3, 0] + [2, 3] = [3 + 2, 0 + 3] = [5, 3].
[Figure: a, b, and a + b.]
8. b − c = [2, 3] − [−2, 3] = [2 − (−2), 3 − 3] = [4, 0].
[Figure: b, −c, and b − c.]
−2
5
9. d − c =
−
=
.
−2
3
−5
2
3
4
1
2
3
4
5
d
-1
-2
d-c
-3
-c
-4
-5
10. a + d = [3, 0] + [3, −2] = [3 + 3, 0 + (−2)] = [6, −2].
[Figure: a, d, and a + d.]
11. 2a + 3c = 2[0, 2, 0] + 3[1, −2, 1] = [2 · 0, 2 · 2, 2 · 0] + [3 · 1, 3 · (−2), 3 · 1] = [3, −2, 3].
12. 3b − 2c + d = 3[3, 2, 1] − 2[1, −2, 1] + [−1, −1, −2] = [9, 6, 3] + [−2, 4, −2] + [−1, −1, −2] = [6, 9, −1].
13. u = [cos 60°, sin 60°] = [1/2, √3/2], and v = [cos 210°, sin 210°] = [−√3/2, −1/2], so that
u + v = [1/2 − √3/2, √3/2 − 1/2],
u − v = [1/2 + √3/2, √3/2 + 1/2].
14. (a) AB = b − a.
(b) Since OC = AB, we have BC = OC − b = (b − a) − b = −a.
(c) AD = −2a.
(d) CF = −2 OC = −2 AB = −2(b − a) = 2(a − b).
(e) AC = AB + BC = (b − a) + (−a) = b − 2a.
(f) Note that FA and OB are equal, and that DE = −AB. Then
BC + DE + FA = −a − AB + OB = −a − (b − a) + b = 0.
15. 2(a − 3b) + 3(2b + a) = (2a − 6b) + (6b + 3a)   [distributivity, property (e)]
    = (2a + 3a) + (−6b + 6b)   [associativity, property (b)]
    = 5a.
16. −3(a − c) + 2(a + 2b) + 3(c − b) = (−3a + 3c) + (2a + 4b) + (3c − 3b)   [distributivity, property (e)]
    = (−3a + 2a) + (4b − 3b) + (3c + 3c)   [associativity, property (b)]
    = −a + b + 6c.
17. x − a = 2(x − 2a) = 2x − 4a ⇒ x − 2x = a − 4a ⇒ −x = −3a ⇒ x = 3a.
18. x + 2a − b = 3(x + a) − 2(2a − b) = 3x + 3a − 4a + 2b ⇒ x − 3x = −a − 2a + 2b + b ⇒ −2x = −3a + 3b ⇒ x = (3/2)a − (3/2)b.
19. We have 2u + 3v = 2[1, −1] + 3[1, 1] = [2 · 1 + 3 · 1, 2 · (−1) + 3 · 1] = [5, 1].
[Figure: u, v, and w = 2u + 3v plotted in the plane.]
20. We have −u − 2v = −[−2, 1] − 2[2, −2] = [−(−2) − 2 · 2, −1 − 2 · (−2)] = [−2, 3].
[Figure: u, v, and w = −u − 2v plotted in the plane.]
21. From the diagram, we see that w = −2u + 4v.
[Figure: w decomposed along u and v.]
22. From the diagram, we see that w = 2u + 3v.
[Figure: w decomposed along u and v.]
23. Property (d) states that u + (−u) = 0. The first diagram below shows u along with −u. Then, as the
diagonal of the parallelogram, the resultant vector is 0.
Property (e) states that c(u + v) = cu + cv. The second figure illustrates this.
[Figures: u and −u as the sides of a degenerate parallelogram whose diagonal is 0; the parallelogram with sides cu and cv whose diagonal is c(u + v) = cu + cv.]
24. Let u = [u1 , u2 , . . . , un ] and v = [v1 , v2 , . . . , vn ], and let c and d be scalars in R.
Property (d):
u + (−u) = [u1 , u2 , . . . , un ] + (−1[u1 , u2 , . . . , un ])
= [u1 , u2 , . . . , un ] + [−u1 , −u2 , . . . , −un ]
= [u1 − u1 , u2 − u2 , . . . , un − un ]
= [0, 0, . . . , 0] = 0.
Property (e):
c(u + v) = c ([u1 , u2 , . . . , un ] + [v1 , v2 , . . . , vn ])
= c ([u1 + v1 , u2 + v2 , . . . , un + vn ])
= [c(u1 + v1 ), c(u2 + v2 ), . . . , c(un + vn )]
= [cu1 + cv1 , cu2 + cv2 , . . . , cun + cvn ]
= [cu1 , cu2 , . . . , cun ] + [cv1 , cv2 , . . . , cvn ]
= c[u1 , u2 , . . . , un ] + c[v1 , v2 , . . . , vn ]
= cu + cv.
Property (f):
(c + d)u = (c + d)[u1 , u2 , . . . , un ]
= [(c + d)u1 , (c + d)u2 , . . . , (c + d)un ]
= [cu1 + du1 , cu2 + du2 , . . . , cun + dun ]
= [cu1 , cu2 , . . . , cun ] + [du1 , du2 , . . . , dun ]
= c[u1 , u2 , . . . , un ] + d[u1 , u2 , . . . , un ]
= cu + du.
Property (g):
c(du) = c(d[u1 , u2 , . . . , un ])
= c[du1 , du2 , . . . , dun ]
= [cdu1 , cdu2 , . . . , cdun ]
= [(cd)u1 , (cd)u2 , . . . , (cd)un ]
= (cd)[u1 , u2 , . . . , un ]
= (cd)u.
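The componentwise proofs above can be spot-checked on concrete vectors; the following is a minimal Python sketch (the sample vectors, scalars, and helper names are my own, not from the text):

```python
# Spot-check properties (d)-(g) on concrete vectors in R^3.
u = [1.0, -2.0, 3.0]
v = [4.0, 0.5, -1.0]
c, d = 2.0, -3.0

def add(x, y):
    # Componentwise vector addition.
    return [a + b for a, b in zip(x, y)]

def scale(k, x):
    # Scalar multiplication.
    return [k * a for a in x]

assert add(u, scale(-1, u)) == [0.0, 0.0, 0.0]               # (d) u + (-u) = 0
assert scale(c, add(u, v)) == add(scale(c, u), scale(c, v))  # (e) c(u + v) = cu + cv
assert scale(c + d, u) == add(scale(c, u), scale(d, u))      # (f) (c + d)u = cu + du
assert scale(c, scale(d, u)) == scale(c * d, u)              # (g) c(du) = (cd)u
print("properties (d)-(g) hold for the sample vectors")
```

A numerical check like this does not prove the properties, of course — the componentwise algebra above does — but it catches transcription errors quickly.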
25. u + v = [0, 1] + [1, 1] = [1, 0].
26. u + v = [1, 1, 0] + [1, 1, 1] = [0, 0, 1].
1.1. THE GEOMETRY AND ALGEBRA OF VECTORS
9
27. u + v = [1, 0, 1, 1] + [1, 1, 1, 1] = [0, 1, 0, 0].
28. u + v = [1, 1, 0, 1, 0] + [0, 1, 1, 1, 0] = [1, 0, 1, 0, 0].
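The sums in Exercises 25–28 are componentwise addition modulo 2. A quick Python sketch, using the exercises above as test cases (the helper name `add_mod` is mine):

```python
def add_mod(u, v, m=2):
    """Add two vectors componentwise modulo m (m = 2 for binary vectors)."""
    return [(a + b) % m for a, b in zip(u, v)]

print(add_mod([0, 1], [1, 1]))                    # [1, 0]        (Exercise 25)
print(add_mod([1, 1, 0], [1, 1, 1]))              # [0, 0, 1]     (Exercise 26)
print(add_mod([1, 0, 1, 1], [1, 1, 1, 1]))        # [0, 1, 0, 0]  (Exercise 27)
print(add_mod([1, 1, 0, 1, 0], [0, 1, 1, 1, 0]))  # [1, 0, 1, 0, 0] (Exercise 28)
```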
29. The addition and multiplication tables for Z4:

    + | 0 1 2 3        · | 0 1 2 3
    --+--------        --+--------
    0 | 0 1 2 3        0 | 0 0 0 0
    1 | 1 2 3 0        1 | 0 1 2 3
    2 | 2 3 0 1        2 | 0 2 0 2
    3 | 3 0 1 2        3 | 0 3 2 1

30. The addition and multiplication tables for Z5:

    + | 0 1 2 3 4      · | 0 1 2 3 4
    --+----------      --+----------
    0 | 0 1 2 3 4      0 | 0 0 0 0 0
    1 | 1 2 3 4 0      1 | 0 1 2 3 4
    2 | 2 3 4 0 1      2 | 0 2 4 1 3
    3 | 3 4 0 1 2      3 | 0 3 1 4 2
    4 | 4 0 1 2 3      4 | 0 4 3 2 1
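Tables like those in Exercises 29 and 30 can be generated mechanically for any modulus. A minimal Python sketch (the function name `cayley_tables` is my own):

```python
def cayley_tables(m):
    """Return the addition and multiplication tables of Z_m as lists of rows."""
    add = [[(i + j) % m for j in range(m)] for i in range(m)]
    mul = [[(i * j) % m for j in range(m)] for i in range(m)]
    return add, mul

add4, mul4 = cayley_tables(4)
print(mul4[2])  # row for 2 in the Z_4 multiplication table: [0, 2, 0, 2]
print(mul4[3])  # row for 3 in the Z_4 multiplication table: [0, 3, 2, 1]
```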
31. 2 + 2 + 2 = 6 = 0 in Z3 .
32. 2 · 2 · 2 = 8 = 2 · 3 + 2 = 2 in Z3 .
33. 2(2 + 1 + 2) = 2 · 2 = 3 · 1 + 1 = 1 in Z3 .
34. 3 + 1 + 2 + 3 = 4 · 2 + 1 = 1 in Z4 .
35. 2 · 3 · 2 = 4 · 3 + 0 = 0 in Z4 .
36. 3(3 + 3 + 2) = 4 · 6 + 0 = 0 in Z4 .
37. 2 + 1 + 2 + 2 + 1 = 2 in Z3 , 2 + 1 + 2 + 2 + 1 = 0 in Z4 , 2 + 1 + 2 + 2 + 1 = 3 in Z5 .
38. (3 + 4)(3 + 2 + 4 + 2) = 2 · 1 = 2 in Z5 .
39. 8(6 + 4 + 3) = 8 · 4 = 5 in Z9 .
40. 2^100 = (2^10)^10 = 1024^10 = 1^10 = 1 in Z11 , since 1024 = 93 · 11 + 1 = 1 in Z11 .
41. [2, 1, 2] + [2, 0, 1] = [1, 1, 0] in Z3^3.
42. 2[2, 2, 1] = [2 · 2, 2 · 2, 2 · 1] = [1, 1, 2] in Z3^3.
43. 2([3, 1, 1, 2] + [3, 3, 2, 1]) = 2[2, 0, 3, 3] = [2 · 2, 2 · 0, 2 · 3, 2 · 3] = [0, 0, 2, 2] in Z4^4, and
2([3, 1, 1, 2] + [3, 3, 2, 1]) = 2[1, 4, 3, 3] = [2 · 1, 2 · 4, 2 · 3, 2 · 3] = [2, 3, 1, 1] in Z5^4.
44. x = 2 + (−3) = 2 + 2 = 4 in Z5 .
45. x = 1 + (−5) = 1 + 1 = 2 in Z6 .
46. x = 2⁻¹ = 2 in Z3 .
47. No solution. 2 times anything is always even, so it cannot leave a remainder of 1 when divided by 4.
48. x = 2⁻¹ = 3 in Z5 .
49. x = 3⁻¹ · 4 = 2 · 4 = 3 in Z5 .
50. No solution. 3 times anything is always a multiple of 3, so it cannot leave a remainder of 4 when divided by 6 (which is also a multiple of 3).
51. No solution. 6 times anything is always even, so it cannot leave an odd number as a remainder when divided by 8.
52. x = 8⁻¹ · 9 = 7 · 9 = 8 in Z11 .
53. x = 2⁻¹(2 + (−3)) = 3(2 + 2) = 2 in Z5 .
54. No solution. This equation is the same as 4x = 2 − 5 = −3 = 3 in Z6 . But 4 times anything is even, so it cannot leave a remainder of 3 when divided by 6 (which is also even).
55. Add 5 to both sides to get 6x = 6, so that x = 1 or x = 5 (since 6 · 1 = 6 and 6 · 5 = 30 = 6 in Z8 ).
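The case analysis in Exercises 44–55 amounts to solving ax = b in Z_m; since Z_m is finite, a brute-force search settles each case. A sketch (the helper name `solve_mod` is mine):

```python
def solve_mod(a, b, m):
    """Return all x in Z_m with a*x = b (mod m); an empty list means no solution."""
    return [x for x in range(m) if (a * x) % m == b % m]

print(solve_mod(2, 1, 4))  # []     -- no solution, as in Exercise 47
print(solve_mod(3, 4, 5))  # [3]    -- matches Exercise 49
print(solve_mod(6, 6, 8))  # [1, 5] -- matches Exercise 55
```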
56. (a) All values. (b) All values. (c) All values.
57. (a) All a ≠ 0 in Z5 have a solution because 5 is a prime number.
(b) a = 1 and a = 5, because they have no common factors with 6 other than 1.
(c) a and m can have no common factors other than 1; that is, the greatest common divisor of a and m is gcd(a, m) = 1.
1.2 Length and Angle: The Dot Product
1. Following Example 1.15, u · v = [−1, 2] · [3, 1] = (−1) · 3 + 2 · 1 = −3 + 2 = −1.
2. Following Example 1.15, u · v = [3, −2] · [4, 6] = 3 · 4 + (−2) · 6 = 12 − 12 = 0.
3. u · v = [1, 2, 3] · [2, 3, 1] = 1 · 2 + 2 · 3 + 3 · 1 = 2 + 6 + 3 = 11.
4. u · v = 3.2 · 1.5 + (−0.6) · 4.1 + (−1.4) · (−0.2) = 4.8 − 2.46 + 0.28 = 2.62.
  

5. u · v = [1, √2, √3, 0] · [4, −√2, 0, −5] = 1 · 4 + √2 · (−√2) + √3 · 0 + 0 · (−5) = 4 − 2 = 2.

 

6. u · v = [1.12, −3.25, 2.07, −1.83] · [−2.29, 1.72, 4.33, −1.54] = 1.12 · (−2.29) + (−3.25) · 1.72 + 2.07 · 4.33 + (−1.83) · (−1.54) = 3.6265.
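The dot products above are just sums of componentwise products, so they are easy to verify mechanically. A one-function Python sketch (the helper name is mine):

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

print(round(dot([3.2, -0.6, -1.4], [1.5, 4.1, -0.2]), 2))   # 2.62 (Exercise 4)
print(round(dot([1.12, -3.25, 2.07, -1.83],
                [-2.29, 1.72, 4.33, -1.54]), 4))            # 3.6265 (Exercise 6)
```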
7. Finding a unit vector v in the same direction as a given vector u is called normalizing the vector u. Proceed as in Example 1.19:
||u|| = √((−1)² + 2²) = √5,
so a unit vector v in the same direction as u is
v = (1/||u||) u = (1/√5)[−1, 2] = [−1/√5, 2/√5].
8. Proceed as in Example 1.19:
||u|| = √(3² + (−2)²) = √(9 + 4) = √13,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/√13)[3, −2] = [3/√13, −2/√13].
9. Proceed as in Example 1.19:
||u|| = √(1² + 2² + 3²) = √14,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/√14)[1, 2, 3] = [1/√14, 2/√14, 3/√14].
10. Proceed as in Example 1.19:
||u|| = √(3.2² + (−0.6)² + (−1.4)²) = √(10.24 + 0.36 + 1.96) = √12.56 ≈ 3.544,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/3.544)[3.2, −0.6, −1.4] ≈ [0.903, −0.169, −0.395].
11. Proceed as in Example 1.19:
||u|| = √(1² + (√2)² + (√3)² + 0²) = √6,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/√6)[1, √2, √3, 0] = [√6/6, √3/3, √2/2, 0].
12. Proceed as in Example 1.19:
||u|| = √(1.12² + (−3.25)² + 2.07² + (−1.83)²) = √(1.2544 + 10.5625 + 4.2849 + 3.3489) = √19.4507 ≈ 4.410,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/4.410)[1.12, −3.25, 2.07, −1.83] ≈ [0.254, −0.737, 0.469, −0.415].
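Normalizing, as in Exercises 7–12, divides a vector by its own length; the result always has norm 1. A small Python sketch (the function name is mine):

```python
import math

def normalize(u):
    """Return the unit vector in the direction of a nonzero vector u."""
    norm = math.sqrt(sum(x * x for x in u))
    return [x / norm for x in u]

# Exercise 9: u = [1, 2, 3] normalizes to [1/sqrt(14), 2/sqrt(14), 3/sqrt(14)]
v = normalize([1, 2, 3])
print([round(x, 3) for x in v])  # [0.267, 0.535, 0.802]
```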
13. Following Example 1.20, we compute: u − v = [−1, 2] − [3, 1] = [−4, 1], so
d(u, v) = ||u − v|| = √((−4)² + 1²) = √17.
14. Following Example 1.20, we compute: u − v = [3, −2] − [4, 6] = [−1, −8], so
d(u, v) = ||u − v|| = √((−1)² + (−8)²) = √65.
     
15. Following Example 1.20, we compute: u − v = [1, 2, 3] − [2, 3, 1] = [−1, −1, 2], so
d(u, v) = ||u − v|| = √((−1)² + (−1)² + 2²) = √6.

 
 

16. Following Example 1.20, we compute: u − v = [3.2, −0.6, −1.4] − [1.5, 4.1, −0.2] = [1.7, −4.7, −1.2], so
d(u, v) = ||u − v|| = √(1.7² + (−4.7)² + (−1.2)²) = √26.42 ≈ 5.14.
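The distance computations in Exercises 13–16 all follow the same pattern, d(u, v) = ||u − v||; a short Python sketch (the function name is mine):

```python
import math

def distance(u, v):
    """Euclidean distance d(u, v) = ||u - v||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

print(round(distance([-1, 2], [3, 1]), 3))   # 4.123, i.e. sqrt(17) (Exercise 13)
print(round(distance([3.2, -0.6, -1.4],
                     [1.5, 4.1, -0.2]), 2))  # 5.14 (Exercise 16)
```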
17. (a) u · v is a real number, so ku · vk is the norm of a number, which is not defined.
(b) u · v is a scalar, while w is a vector. Thus u · v + w adds a scalar to a vector, which is not a
defined operation.
(c) u is a vector, while v · w is a scalar. Thus u · (v · w) is the dot product of a vector and a scalar,
which is not defined.
(d) c · (u + v) is the dot product of a scalar and a vector, which is not defined.
18. Let θ be the angle between u and v. Then
cos θ = u · v/(||u|| ||v||) = (3 · (−1) + 0 · 1)/(√(3² + 0²) √((−1)² + 1²)) = −3/(3√2) = −√2/2.
Thus cos θ < 0 (in fact, θ = 3π/4), so the angle between u and v is obtuse.
19. Let θ be the angle between u and v. Then
cos θ = u · v/(||u|| ||v||) = (2 · 1 + (−1) · (−2) + 1 · (−1))/(√(2² + (−1)² + 1²) √(1² + (−2)² + (−1)²)) = 3/6 = 1/2.
Thus cos θ > 0 (in fact, θ = π/3), so the angle between u and v is acute.
20. Let θ be the angle between u and v. Then
cos θ = u · v/(||u|| ||v||) = (4 · 1 + 3 · (−1) + (−1) · 1)/(√(4² + 3² + (−1)²) √(1² + (−1)² + 1²)) = 0/(√26 √3) = 0.
Thus the angle between u and v is a right angle.
21. Let θ be the angle between u and v. Note that we can determine whether θ is acute, right, or obtuse by examining the sign of u · v/(||u|| ||v||), which is determined by the sign of u · v. Since
u · v = 0.9 · (−4.5) + 2.1 · 2.6 + 1.2 · (−0.8) = 0.45 > 0,
we have cos θ > 0 so that θ is acute.
22. Let θ be the angle between u and v. Note that we can determine whether θ is acute, right, or obtuse by examining the sign of u · v/(||u|| ||v||), which is determined by the sign of u · v. Since
u · v = 1 · (−3) + 2 · 1 + 3 · 2 + 4 · (−2) = −3,
we have cos θ < 0 so that θ is obtuse.
23. Since the components of both u and v are positive, it is clear that u · v > 0, so the angle between
them is acute since it has a positive cosine.
24. From Exercise 18, cos θ = −√2/2, so that θ = cos⁻¹(−√2/2) = 3π/4 = 135°.
25. From Exercise 19, cos θ = 1/2, so that θ = cos⁻¹(1/2) = π/3 = 60°.
26. From Exercise 20, cos θ = 0, so that θ = π/2 = 90° is a right angle.
27. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors:
u · v = 0.9 · (−4.5) + 2.1 · 2.6 + 1.2 · (−0.8) = 0.45,
||u|| = √(0.9² + 2.1² + 1.2²) = √6.66,
||v|| = √((−4.5)² + 2.6² + (−0.8)²) = √27.65.
So if θ is the angle between u and v, then
cos θ = u · v/(||u|| ||v||) = 0.45/(√6.66 √27.65) = 0.45/√184.149,
so that
θ = cos⁻¹(0.45/√184.149) ≈ 1.5376 ≈ 88.10°.
Note that it is important to maintain as much precision as possible until the last step, or roundoff errors may build up.
28. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors:
u · v = 1 · (−3) + 2 · 1 + 3 · 2 + 4 · (−2) = −3,
||u|| = √(1² + 2² + 3² + 4²) = √30,
||v|| = √((−3)² + 1² + 2² + (−2)²) = √18.
So if θ is the angle between u and v, then
cos θ = u · v/(||u|| ||v||) = −3/(√30 √18) = −1/(2√15),
so that
θ = cos⁻¹(−1/(2√15)) ≈ 1.70 ≈ 97.42°.
29. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors:
u · v = 1 · 5 + 2 · 6 + 3 · 7 + 4 · 8 = 70,
||u|| = √(1² + 2² + 3² + 4²) = √30,
||v|| = √(5² + 6² + 7² + 8²) = √174.
So if θ is the angle between u and v, then
cos θ = u · v/(||u|| ||v||) = 70/(√30 √174) = 35/(3√145),
so that
θ = cos⁻¹(35/(3√145)) ≈ 0.2502 ≈ 14.34°.
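Angle computations like those in Exercises 27–29 follow one recipe: divide the dot product by the product of the norms and take the inverse cosine. A Python sketch (the function name is mine):

```python
import math

def angle_deg(u, v):
    """Angle between u and v in degrees, via cos(theta) = u.v / (||u|| ||v||)."""
    d = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(d / (nu * nv)))

print(angle_deg([1, 2, 3, 4], [-3, 1, 2, -2]))  # about 97.42 (Exercise 28)
print(angle_deg([1, 2, 3, 4], [5, 6, 7, 8]))    # about 14.34 (Exercise 29)
```

As the solution to Exercise 27 notes, rounding should be deferred to the very last step; the function above keeps full floating-point precision throughout.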
30. To show that △ABC is right, we need only show that one pair of its sides meets at a right angle. So let u = AB, v = BC, and w = AC. Then we must show that one of u · v, u · w, or v · w is zero in order to show that one of these pairs is orthogonal. Then
u = AB = [1 − (−3), 0 − 2] = [4, −2],  v = BC = [4 − 1, 6 − 0] = [3, 6],  w = AC = [4 − (−3), 6 − 2] = [7, 4],
and
u · v = 4 · 3 + (−2) · 6 = 12 − 12 = 0.
Since this dot product is zero, these two vectors are orthogonal, so that AB ⊥ BC and thus △ABC is a right triangle. It is unnecessary to test the remaining pairs of sides.
31. To show that △ABC is right, we need only show that one pair of its sides meets at a right angle. So let u = AB, v = BC, and w = AC. Then we must show that one of u · v, u · w, or v · w is zero in order to show that one of these pairs is orthogonal. Then
u = AB = [−3 − 1, 2 − 1, (−2) − (−1)] = [−4, 1, −1],
v = BC = [2 − (−3), 2 − 2, −4 − (−2)] = [5, 0, −2],
w = AC = [2 − 1, 2 − 1, −4 − (−1)] = [1, 1, −3],
and
u · v = −4 · 5 + 1 · 0 − 1 · (−2) = −18,
u · w = −4 · 1 + 1 · 1 − 1 · (−3) = 0.
Since this dot product is zero, these two vectors are orthogonal, so that AB ⊥ AC and thus △ABC is a right triangle. It is unnecessary to test the remaining pair of sides.
32. As in Example 1.22, the dimensions of the cube do not matter, so we work with a cube with side length 1. Since the cube is symmetric, we need only consider one diagonal and adjacent edge. Orient the cube as shown in Figure 1.34; take the diagonal to be [1, 1, 1] and the adjacent edge to be [1, 0, 0]. Then the angle θ between these two vectors satisfies
cos θ = (1 · 1 + 1 · 0 + 1 · 0)/(√3 · 1) = 1/√3,  so  θ = cos⁻¹(1/√3) ≈ 54.74°.
Thus the diagonal and an adjacent edge meet at an angle of 54.74°.
33. As in Example 1.22, the dimensions of the cube do not matter, so we work with a cube with side length
1. Since the cube is symmetric, we need only consider one pair of diagonals. Orient the cube as shown
in Figure 1.34; take the diagonals to be u = [1, 1, 1] and v = [1, 1, 0] − [0, 0, 1] = [1, 1, −1]. Then the
dot product is
u · v = 1 · 1 + 1 · 1 + 1 · (−1) = 1 + 1 − 1 = 1 ≠ 0.
Since the dot product is nonzero, the diagonals are not orthogonal.
34. To show a parallelogram is a rhombus, it suffices to show that its diagonals are perpendicular (Euclid). But
d₁ · d₂ = [2, 2, 0] · [1, −1, 3] = 2 · 1 + 2 · (−1) + 0 · 3 = 0.
To determine its side length, note that since the diagonals are perpendicular, one half of each diagonal forms the legs of a right triangle whose hypotenuse is one side of the rhombus. So we can use the Pythagorean Theorem. Since
||d₁||² = 2² + 2² + 0² = 8,  ||d₂||² = 1² + (−1)² + 3² = 11,
we have for the side length
s² = (||d₁||/2)² + (||d₂||/2)² = 8/4 + 11/4 = 19/4,
so that s = √19/2 ≈ 2.18.
35. Since ABCD is a rectangle, opposite sides BA and CD are parallel and congruent. So we can use the method of Example 1.1 in Section 1.1 to find the coordinates of vertex D: we compute BA = [1 − 3, 2 − 6, 3 − (−2)] = [−2, −4, 5]. If BA is then translated to CD, where C = (0, 5, −4), then
D = (0 + (−2), 5 + (−4), −4 + 5) = (−2, 1, 1).
36. The resultant velocity of the airplane is the sum of the velocity of the airplane and the velocity of the wind:
r = p + w = [200, 0] + [0, −40] = [200, −40].
37. Let the x direction be east, in the direction of the current, and the y direction be north, across the river. The speed of the boat is 4 mph north, and the current is 3 mph east, so the velocity of the boat is
v = [0, 4] + [3, 0] = [3, 4].
38. Let the x direction be the direction across the river, and the y direction be downstream. Since vt = d, use the given information to find v, then solve for t and compute d. Since the speed of the boat is 20 km/h and the speed of the current is 5 km/h, we have v = [20, 5]. The width of the river is 2 km, and the distance downstream is unknown; call it y. Then d = [2, y]. Thus
vt = [20, 5]t = [2, y].
Thus 20t = 2 so that t = 0.1, and then y = 5 · 0.1 = 0.5. Therefore
(a) Ann lands 0.5 km, or half a kilometer, downstream;
(b) it takes Ann 0.1 hours, or six minutes, to cross the river.
Note that the river flow does not increase the time required to cross the river, since its velocity is
perpendicular to the direction of travel.
39. We want to find the angle between Bert's resultant vector, r, and his velocity vector upstream, v. Let the first coordinate of the vector be the direction across the river, and the second be the direction upstream. Bert's velocity vector directly across the river is unknown, say u = [x, 0]. His velocity vector upstream compensates for the downstream flow, so v = [0, 1]. So the resultant vector is
r = u + v = [x, 0] + [0, 1] = [x, 1].
Since Bert's speed is 2 mph, we have ||r|| = 2. Thus
x² + 1 = ||r||² = 4,  so that  x = √3.
If θ is the angle between r and v, then
cos θ = r · v/(||r|| ||v||) = 1/2,
so that
θ = cos⁻¹(1/2) = 60°.
40. We have
proj_u(v) = ((u · v)/(u · u)) u = ((−1) · (−2) + 1 · 4)/((−1) · (−1) + 1 · 1) [−1, 1] = 3[−1, 1] = [−3, 3].
A graph of the situation (with proj_u(v) in gray, and the perpendicular from v to the projection also drawn):
[Figure: u, v, and proj_u(v).]
41. We have
proj_u(v) = ((u · v)/(u · u)) u = ((3/5) · 1 + (−4/5) · 2)/((3/5) · (3/5) + (−4/5) · (−4/5)) u = (−1/1) u = −u = [−3/5, 4/5].
A graph of the situation (with proj_u(v) in gray, and the perpendicular from v to the projection also drawn):
[Figure: u, v, and proj_u(v).]
42. We have
proj_u(v) = ((u · v)/(u · u)) u = ((1/2) · 2 + (−1/4) · 2 + (−1/2) · (−2))/((1/2)² + (−1/4)² + (−1/2)²) u = ((3/2)/(9/16)) u = (8/3)[1/2, −1/4, −1/2] = [4/3, −2/3, −4/3].
43. We have
proj_u(v) = ((u · v)/(u · u)) u = (1 · 2 + (−1) · (−3) + 1 · (−1) + (−1) · (−2))/(1 · 1 + (−1) · (−1) + 1 · 1 + (−1) · (−1)) u = (6/4) u = (3/2) u = [3/2, −3/2, 3/2, −3/2].
44. We have
proj_u(v) = ((u · v)/(u · u)) u = (0.5 · 2.1 + 1.5 · 1.2)/(0.5 · 0.5 + 1.5 · 1.5) [0.5, 1.5] = (2.85/2.5)[0.5, 1.5] = 1.14 u = [0.57, 1.71].
45. We have
proj_u(v) = ((u · v)/(u · u)) u = (3.01 · 1.34 + (−0.33) · 4.25 + 2.52 · (−1.66))/(3.01² + (−0.33)² + 2.52²) u = −(1.5523/15.5194) u ≈ −(1/10)[3.01, −0.33, 2.52] ≈ [−0.301, 0.033, −0.252].
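All of the projections in Exercises 40–45 apply the same formula, proj_u(v) = ((u · v)/(u · u)) u. A Python sketch (the helper name is mine):

```python
def proj(u, v):
    """Projection of v onto u: (u . v / u . u) u."""
    duv = sum(a * b for a, b in zip(u, v))  # u . v
    duu = sum(a * a for a in u)             # u . u
    return [duv / duu * a for a in u]

print(proj([-1, 1], [-2, 4]))        # [-3.0, 3.0] (Exercise 40)
print(proj([0.5, 1.5], [2.1, 1.2]))  # about [0.57, 1.71] (Exercise 44)
```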
46. Let u = AB = [2 − 1, 2 − (−1)] = [1, 3] and v = AC = [4 − 1, 0 − (−1)] = [3, 1].
(a) To compute the projection, we need
u · v = [1, 3] · [3, 1] = 6,  u · u = [1, 3] · [1, 3] = 10.
Thus
proj_u(v) = ((u · v)/(u · u)) u = (3/5)[1, 3] = [3/5, 9/5],
so that
v − proj_u(v) = [3, 1] − [3/5, 9/5] = [12/5, −4/5].
Then
||u|| = √(1² + 3²) = √10,  ||v − proj_u(v)|| = √((12/5)² + (−4/5)²) = 4√10/5,
so that finally
A = (1/2) ||u|| ||v − proj_u(v)|| = (1/2) √10 · 4√10/5 = 4.
(b) We already know u · v = 6 and ||u|| = √10 from part (a). Also, ||v|| = √(3² + 1²) = √10. So
cos θ = u · v/(||u|| ||v||) = 6/(√10 √10) = 3/5,
so that
sin θ = √(1 − cos² θ) = √(1 − (3/5)²) = 4/5.
Thus
A = (1/2) ||u|| ||v|| sin θ = (1/2) √10 √10 · (4/5) = 4.
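The area recipe of part (a) — half the base ||u|| times the height ||v − proj_u(v)|| — works in any dimension. A Python sketch (the function name is mine):

```python
import math

def triangle_area(u, v):
    """Area of the triangle spanned by u and v: (1/2) ||u|| ||v - proj_u(v)||."""
    duv = sum(a * b for a, b in zip(u, v))  # u . v
    duu = sum(a * a for a in u)             # u . u
    perp = [b - duv / duu * a for a, b in zip(u, v)]  # v minus its projection onto u
    norm = lambda w: math.sqrt(sum(x * x for x in w))
    return 0.5 * norm(u) * norm(perp)

print(triangle_area([1, 3], [3, 1]))          # 4.0, up to rounding (Exercise 46)
print(triangle_area([1, -1, 2], [2, 1, -2]))  # about 3.354 = 3*sqrt(5)/2 (Exercise 47)
```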

  

  
47. Let u = AB = [4 − 3, −2 − (−1), 6 − 4] = [1, −1, 2] and v = AC = [5 − 3, 0 − (−1), 2 − 4] = [2, 1, −2].
(a) To compute the projection, we need
u · v = [1, −1, 2] · [2, 1, −2] = −3,  u · u = [1, −1, 2] · [1, −1, 2] = 6.
Thus
proj_u(v) = ((u · v)/(u · u)) u = −(1/2)[1, −1, 2] = [−1/2, 1/2, −1],
so that
v − proj_u(v) = [2, 1, −2] − [−1/2, 1/2, −1] = [5/2, 1/2, −1].
Then
||u|| = √(1² + (−1)² + 2²) = √6,  ||v − proj_u(v)|| = √((5/2)² + (1/2)² + (−1)²) = √30/2,
so that finally
A = (1/2) ||u|| ||v − proj_u(v)|| = (1/2) √6 · √30/2 = 3√5/2.
(b) We already know u · v = −3 and ||u|| = √6 from part (a). Also, ||v|| = √(2² + 1² + (−2)²) = 3. So
cos θ = u · v/(||u|| ||v||) = −3/(3√6) = −√6/6,
so that
sin θ = √(1 − cos² θ) = √(1 − 6/36) = √30/6.
Thus
A = (1/2) ||u|| ||v|| sin θ = (1/2) √6 · 3 · √30/6 = 3√5/2.
48. Two vectors u and v are orthogonal if and only if their dot product u·v = 0. So we set u·v = 0 and solve for k:
u·v = [2, 3]·[k + 1, k − 1] = 0 ⇒ 2(k + 1) + 3(k − 1) = 0 ⇒ 5k − 1 = 0 ⇒ k = 1/5.
Substituting into the formula for v gives
v = [1/5 + 1, 1/5 − 1] = [6/5, −4/5].
As a check, we compute
u·v = [2, 3]·[6/5, −4/5] = 12/5 − 12/5 = 0,
and the vectors are indeed orthogonal.
49. Two vectors u and v are orthogonal if and only if their dot product u·v = 0. So we set u·v = 0 and solve for k:
u·v = [1, −1, 2]·[k², k, −3] = 0 ⇒ k² − k − 6 = 0 ⇒ (k + 2)(k − 3) = 0 ⇒ k = −2, 3.
Substituting into the formula for v gives
k = −2 : v₁ = [(−2)², −2, −3] = [4, −2, −3],
k = 3 : v₂ = [3², 3, −3] = [9, 3, −3].
As a check, we compute
u·v₁ = [1, −1, 2]·[4, −2, −3] = 1·4 − 1·(−2) + 2·(−3) = 0,
u·v₂ = [1, −1, 2]·[9, 3, −3] = 1·9 − 1·3 + 2·(−3) = 0,
and the vectors are indeed orthogonal.
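A quick numeric check of Exercises 48 and 49 (a plain-Python sketch; note that the roots of k² − k − 6 = 0 are k = −2 and k = 3):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Exercise 48: u = [2, 3], v = [k + 1, k - 1]; 5k - 1 = 0 gives k = 1/5
k = 1 / 5
assert abs(dot([2, 3], [k + 1, k - 1])) < 1e-12

# Exercise 49: u = [1, -1, 2], v = [k**2, k, -3]; k**2 - k - 6 = 0 at k = -2, 3
for k in (-2, 3):
    assert dot([1, -1, 2], [k ** 2, k, -3]) == 0

print("orthogonality checks pass")
```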
50. Two vectors u and v are orthogonal if and only if their dot product u·v = 0. So we set u·v = 0 and solve for y in terms of x:
u·v = [3, 1]·[x, y] = 0 ⇒ 3x + y = 0 ⇒ y = −3x.
Substituting y = −3x back into the formula for v gives
v = [x, −3x] = x[1, −3].
Thus any vector orthogonal to [3, 1] is a multiple of [1, −3]. As a check,
[3, 1]·[x, −3x] = 3x − 3x = 0 for any value of x,
so that the vectors are indeed orthogonal.
51. As noted in the remarks just prior to Example 1.16, the zero vector 0 is orthogonal to all vectors in R². So if [a, b] = 0, any vector [x, y] will do. Now assume that [a, b] ≠ 0; that is, that either a or b is nonzero. Two vectors u and v are orthogonal if and only if their dot product u·v = 0. So we set u·v = 0 and solve for y in terms of x:
u·v = [a, b]·[x, y] = 0 ⇒ ax + by = 0.
First assume b ≠ 0. Then y = −(a/b)x, so substituting back into the expression for v we get
v = [x, −(a/b)x] = x[1, −a/b] = (x/b)[b, −a].
Next, if b = 0, then a ≠ 0, so that x = −(b/a)y, and substituting back into the expression for v gives
v = [−(b/a)y, y] = y[−b/a, 1] = −(y/a)[b, −a].
So in either case, a vector orthogonal to [a, b], if it is not the zero vector, is a multiple of [b, −a]. As a check, note that
[a, b]·[rb, −ra] = rab − rab = 0 for all values of r.
52. (a) The geometry of the vectors in Figure 1.26 suggests that if ‖u + v‖ = ‖u‖ + ‖v‖, then u and v point in the same direction. This means that the angle between them must be 0. So we first prove
Lemma 1. For all vectors u and v in R² or R³, u·v = ‖u‖ ‖v‖ if and only if the vectors point in the same direction.
Proof. Let θ be the angle between u and v. Then
cos θ = (u·v)/(‖u‖ ‖v‖),
so that cos θ = 1 if and only if u·v = ‖u‖ ‖v‖. But cos θ = 1 if and only if θ = 0, which means that u and v point in the same direction.
We can now show
Theorem 2. For all vectors u and v in R² or R³, ‖u + v‖ = ‖u‖ + ‖v‖ if and only if u and v point in the same direction.
Proof. First assume that u and v point in the same direction. Then u·v = ‖u‖ ‖v‖, and thus
‖u + v‖² = u·u + 2u·v + v·v        (by Example 1.9)
= ‖u‖² + 2u·v + ‖v‖²        (since w·w = ‖w‖² for any vector w)
= ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²        (by the lemma)
= (‖u‖ + ‖v‖)².
Since ‖u + v‖ and ‖u‖ + ‖v‖ are both nonnegative, taking square roots gives ‖u + v‖ = ‖u‖ + ‖v‖.
For the other direction, if ‖u + v‖ = ‖u‖ + ‖v‖, then their squares are equal, so that
(‖u‖ + ‖v‖)² = ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²  and  ‖u + v‖² = u·u + 2u·v + v·v
are equal. But ‖u‖² = u·u and similarly for v, so that canceling those terms gives 2u·v = 2‖u‖ ‖v‖ and thus u·v = ‖u‖ ‖v‖. Using the lemma again shows that u and v point in the same direction.
(b) The geometry of the vectors in Figure 1.26 suggests that if ‖u + v‖ = ‖u‖ − ‖v‖, then u and v point in opposite directions. In addition, since ‖u + v‖ ≥ 0, we must also have ‖u‖ ≥ ‖v‖. If they point in opposite directions, the angle between them must be π. This entire proof is exactly analogous to the proof in part (a). We first prove
Lemma 3. For all vectors u and v in R² or R³, u·v = −‖u‖ ‖v‖ if and only if the vectors point in opposite directions.
Proof. Let θ be the angle between u and v. Then
cos θ = (u·v)/(‖u‖ ‖v‖),
so that cos θ = −1 if and only if u·v = −‖u‖ ‖v‖. But cos θ = −1 if and only if θ = π, which means that u and v point in opposite directions.
We can now show
Theorem 4. For all vectors u and v in R² or R³, ‖u + v‖ = ‖u‖ − ‖v‖ if and only if u and v point in opposite directions and ‖u‖ ≥ ‖v‖.
Proof. First assume that u and v point in opposite directions and ‖u‖ ≥ ‖v‖. Then u·v = −‖u‖ ‖v‖, and thus
‖u + v‖² = u·u + 2u·v + v·v        (by Example 1.9)
= ‖u‖² + 2u·v + ‖v‖²        (since w·w = ‖w‖² for any vector w)
= ‖u‖² − 2‖u‖ ‖v‖ + ‖v‖²        (by the lemma)
= (‖u‖ − ‖v‖)².
Now, since ‖u‖ ≥ ‖v‖ by assumption, we see that both ‖u + v‖ and ‖u‖ − ‖v‖ are nonnegative, so that taking square roots gives ‖u + v‖ = ‖u‖ − ‖v‖. For the other direction, if ‖u + v‖ = ‖u‖ − ‖v‖, then first of all, since the left-hand side is nonnegative, the right-hand side must be as well, so that ‖u‖ ≥ ‖v‖. Next, we can square both sides of the equality, so that
(‖u‖ − ‖v‖)² = ‖u‖² − 2‖u‖ ‖v‖ + ‖v‖²  and  ‖u + v‖² = u·u + 2u·v + v·v
are equal. But ‖u‖² = u·u and similarly for v, so that canceling those terms gives 2u·v = −2‖u‖ ‖v‖ and thus u·v = −‖u‖ ‖v‖. Using the lemma again shows that u and v point in opposite directions.
53. Prove Theorem 1.2(b) by applying the definition of the dot product:
u·(v + w) = u₁(v₁ + w₁) + u₂(v₂ + w₂) + ··· + uₙ(vₙ + wₙ)
= u₁v₁ + u₁w₁ + u₂v₂ + u₂w₂ + ··· + uₙvₙ + uₙwₙ
= (u₁v₁ + u₂v₂ + ··· + uₙvₙ) + (u₁w₁ + u₂w₂ + ··· + uₙwₙ)
= u·v + u·w.
54. Prove the three parts of Theorem 1.2(d) by applying the definition of the dot product and various properties of real numbers:
Part 1: For any vector u, we must show u·u ≥ 0. But
u·u = u₁u₁ + u₂u₂ + ··· + uₙuₙ = u₁² + u₂² + ··· + uₙ².
Since for any real number x we know that x² ≥ 0, it follows that this sum is also nonnegative, so that u·u ≥ 0.
Part 2: We must show that if u = 0 then u·u = 0. But u = 0 means that uᵢ = 0 for all i, so that
u·u = 0·0 + 0·0 + ··· + 0·0 = 0.
Part 3: We must show that if u·u = 0, then u = 0. From Part 1, we know that
u·u = u₁² + u₂² + ··· + uₙ²,
and that uᵢ² ≥ 0 for all i. So if the dot product is to be zero, each uᵢ² must be zero, which means that uᵢ = 0 for all i and thus u = 0.
55. We must show d(u, v) = ‖u − v‖ = ‖v − u‖ = d(v, u). By definition, d(u, v) = ‖u − v‖. Then by Theorem 1.3(b) with c = −1, we have ‖−w‖ = ‖w‖ for any vector w; applying this to the vector u − v gives
‖u − v‖ = ‖−(u − v)‖ = ‖v − u‖,
which is by definition equal to d(v, u).
56. We must show that for any vectors u, v, and w that d(u, w) ≤ d(u, v) + d(v, w). This is equivalent to showing that ‖u − w‖ ≤ ‖u − v‖ + ‖v − w‖. Now substitute u − v for x and v − w for y in Theorem 1.5, giving
‖u − w‖ = ‖(u − v) + (v − w)‖ ≤ ‖u − v‖ + ‖v − w‖.
57. We must show that d(u, v) = ‖u − v‖ = 0 if and only if u = v. This follows immediately from Theorem 1.3(a), ‖w‖ = 0 if and only if w = 0, upon setting w = u − v.
58. Apply the definitions:
u·(cv) = [u₁, u₂, . . . , uₙ]·[cv₁, cv₂, . . . , cvₙ]
= u₁cv₁ + u₂cv₂ + ··· + uₙcvₙ
= cu₁v₁ + cu₂v₂ + ··· + cuₙvₙ
= c(u₁v₁ + u₂v₂ + ··· + uₙvₙ)
= c(u·v).
59. We want to show that ‖u − v‖ ≥ ‖u‖ − ‖v‖. This is equivalent to showing that ‖u‖ ≤ ‖u − v‖ + ‖v‖. This follows immediately upon setting x = u − v and y = v in Theorem 1.5.
60. If u·v = u·w, it does not follow that v = w. For example, since 0·v = 0 for every vector v in Rⁿ, the zero vector is orthogonal to every vector v. So if u = 0 in the above equality, we know nothing about v and w (as an example, 0·[1, 2] = 0·[−17, 12]). Note, however, that u·v = u·w implies that u·v − u·w = u·(v − w) = 0, so that u is orthogonal to v − w.
61. We must show that (u + v)·(u − v) = ‖u‖² − ‖v‖² for all vectors in Rⁿ. Recall that for any w in Rⁿ we have w·w = ‖w‖², and also that u·v = v·u. Then
(u + v)·(u − v) = u·u + v·u − u·v − v·v = ‖u‖² + u·v − u·v − ‖v‖² = ‖u‖² − ‖v‖².
62. (a) Let u, v ∈ Rⁿ. Then
‖u + v‖² + ‖u − v‖² = (u + v)·(u + v) + (u − v)·(u − v)
= (u·u + v·v + 2u·v) + (u·u + v·v − 2u·v)
= ‖u‖² + ‖v‖² + 2u·v + ‖u‖² + ‖v‖² − 2u·v
= 2‖u‖² + 2‖v‖².
(b) Part (a) tells us that the sum of the squares of the lengths of the diagonals of a parallelogram is equal to the sum of the squares of the lengths of its four sides.
[Figure: parallelogram with sides u, v and diagonals u + v, u − v.]
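The parallelogram law of part (a) can be sanity-checked on arbitrary numeric vectors; the vectors below are just an example.

```python
def norm_sq(u):
    """Squared length of a vector: ||u||^2 = u . u."""
    return sum(x * x for x in u)

u, v = [1.0, 2.0, -3.0], [4.0, 0.5, 2.0]   # arbitrary test vectors
s = [a + b for a, b in zip(u, v)]           # u + v
d = [a - b for a, b in zip(u, v)]           # u - v

lhs = norm_sq(s) + norm_sq(d)
rhs = 2 * norm_sq(u) + 2 * norm_sq(v)
assert abs(lhs - rhs) < 1e-9
print(lhs, rhs)  # equal, as the identity predicts
```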
63. Let u, v ∈ Rⁿ. Then
(1/4)‖u + v‖² − (1/4)‖u − v‖² = (1/4)[(u + v)·(u + v) − (u − v)·(u − v)]
= (1/4)[(u·u + v·v + 2u·v) − (u·u + v·v − 2u·v)]
= (1/4)[‖u‖² − ‖u‖² + ‖v‖² − ‖v‖² + 4u·v]
= u·v.
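This polarization identity recovers u·v from lengths alone; a quick numeric illustration on arbitrary vectors:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v = [2.0, -1.0, 0.5], [1.0, 3.0, -2.0]   # arbitrary test vectors
s = [a + b for a, b in zip(u, v)]            # u + v
d = [a - b for a, b in zip(u, v)]            # u - v

# (1/4)||u + v||^2 - (1/4)||u - v||^2 equals u . v
lhs = 0.25 * dot(s, s) - 0.25 * dot(d, d)
assert abs(lhs - dot(u, v)) < 1e-9
print(lhs, dot(u, v))
```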
64. (a) Let u, v ∈ Rⁿ. Then using the previous exercise,
‖u + v‖ = ‖u − v‖ ⇔ ‖u + v‖² = ‖u − v‖²
⇔ ‖u + v‖² − ‖u − v‖² = 0
⇔ (1/4)‖u + v‖² − (1/4)‖u − v‖² = 0
⇔ u·v = 0
⇔ u and v are orthogonal.
(b) Part (a) tells us that a parallelogram is a rectangle if and only if the lengths of its diagonals are equal.
[Figure: parallelogram with sides u, v and diagonals u + v, u − v.]
65. (a) By Exercise 61, (u + v)·(u − v) = ‖u‖² − ‖v‖². Thus (u + v)·(u − v) = 0 if and only if ‖u‖² = ‖v‖². It follows immediately that u + v and u − v are orthogonal if and only if ‖u‖ = ‖v‖.
(b) Part (a) tells us that the diagonals of a parallelogram are perpendicular if and only if the lengths of its sides are equal, i.e., if and only if it is a rhombus.
[Figure: parallelogram with sides u, v and diagonals u + v, u − v.]
66. From Example 1.9 and the fact that w·w = ‖w‖², we have ‖u + v‖² = ‖u‖² + 2u·v + ‖v‖². Taking the square root of both sides yields ‖u + v‖ = √(‖u‖² + 2u·v + ‖v‖²). Now substitute in the given values ‖u‖ = 2, ‖v‖ = √3, and u·v = 1, giving
‖u + v‖ = √(2² + 2·1 + (√3)²) = √(4 + 2 + 3) = √9 = 3.
67. From Theorem 1.4 (the Cauchy–Schwarz Inequality), we have |u·v| ≤ ‖u‖ ‖v‖. If ‖u‖ = 1 and ‖v‖ = 2, then |u·v| ≤ 2, so we cannot have u·v = 3.
68. (a) If u is orthogonal to both v and w, then u · v = u · w = 0. Then
u · (v + w) = u · v + u · w = 0 + 0 = 0,
so that u is orthogonal to v + w.
(b) If u is orthogonal to both v and w, then u · v = u · w = 0. Then
u · (sv + tw) = u · (sv) + u · (tw) = s(u · v) + t(u · w) = s · 0 + t · 0 = 0,
so that u is orthogonal to sv + tw.
69. We have
u·(v − proj_u(v)) = u·v − u·(((u·v)/(u·u)) u)
= u·v − ((u·v)/(u·u))(u·u)
= u·v − u·v
= 0.
70. (a) proj_u(proj_u(v)) = proj_u(((u·v)/(u·u)) u) = ((u·v)/(u·u)) proj_u(u) = ((u·v)/(u·u)) u = proj_u(v).
(b) Using part (a),
proj_u(v − proj_u(v)) = proj_u(v) − proj_u(((u·v)/(u·u)) u) = ((u·v)/(u·u)) u − ((u·v)/(u·u)) proj_u(u) = ((u·v)/(u·u)) u − ((u·v)/(u·u)) u = 0.
(c) From the diagram, we see that proj_u(v) ∥ u, so that proj_u(proj_u(v)) = proj_u(v). Also, (v − proj_u(v)) ⊥ u, so that proj_u(v − proj_u(v)) = 0.
[Figure: u, v, proj_u(v), v − proj_u(v), and −proj_u(v).]
71. (a) We have
(u₁² + u₂²)(v₁² + v₂²) − (u₁v₁ + u₂v₂)² = u₁²v₁² + u₁²v₂² + u₂²v₁² + u₂²v₂² − u₁²v₁² − 2u₁v₁u₂v₂ − u₂²v₂²
= u₁²v₂² + u₂²v₁² − 2u₁u₂v₁v₂
= (u₁v₂ − u₂v₁)².
But the final expression is nonnegative since it is a square. Thus the original expression is as well, showing that (u₁² + u₂²)(v₁² + v₂²) − (u₁v₁ + u₂v₂)² ≥ 0.
(b) We have
(u₁² + u₂² + u₃²)(v₁² + v₂² + v₃²) − (u₁v₁ + u₂v₂ + u₃v₃)²
= u₁²v₁² + u₁²v₂² + u₁²v₃² + u₂²v₁² + u₂²v₂² + u₂²v₃² + u₃²v₁² + u₃²v₂² + u₃²v₃²
− u₁²v₁² − 2u₁v₁u₂v₂ − u₂²v₂² − 2u₁v₁u₃v₃ − u₃²v₃² − 2u₂v₂u₃v₃
= u₁²v₂² + u₁²v₃² + u₂²v₁² + u₂²v₃² + u₃²v₁² + u₃²v₂² − 2u₁u₂v₁v₂ − 2u₁v₁u₃v₃ − 2u₂v₂u₃v₃
= (u₁v₂ − u₂v₁)² + (u₁v₃ − u₃v₁)² + (u₃v₂ − u₂v₃)².
But the final expression is nonnegative since it is the sum of three squares. Thus the original expression is as well, showing that (u₁² + u₂² + u₃²)(v₁² + v₂² + v₃²) − (u₁v₁ + u₂v₂ + u₃v₃)² ≥ 0.
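Part (b) is Lagrange's identity in R³, the algebraic heart of the Cauchy–Schwarz Inequality; a numeric spot-check on arbitrary integer vectors:

```python
u = [1, -2, 3]   # arbitrary test vectors
v = [4, 0, -5]

dot_uv = sum(a * b for a, b in zip(u, v))
lhs = sum(a * a for a in u) * sum(b * b for b in v) - dot_uv ** 2

# Sum of the three squared 2x2 "cross" differences
rhs = ((u[0] * v[1] - u[1] * v[0]) ** 2
       + (u[0] * v[2] - u[2] * v[0]) ** 2
       + (u[1] * v[2] - u[2] * v[1]) ** 2)

assert lhs == rhs and lhs >= 0
print(lhs)  # 453 for these vectors
```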
72. (a) Since proj_u(v) = ((u·v)/(u·u)) u, we have
proj_u(v) · (v − proj_u(v)) = ((u·v)/(u·u)) u · (v − ((u·v)/(u·u)) u)
= ((u·v)/(u·u)) (u·v − ((u·v)/(u·u))(u·u))
= ((u·v)/(u·u)) (u·v − u·v)
= 0,
so that proj_u(v) is orthogonal to v − proj_u(v). Since their vector sum is v, those three vectors form a right triangle with hypotenuse v, so by Pythagoras' Theorem,
‖proj_u(v)‖² ≤ ‖proj_u(v)‖² + ‖v − proj_u(v)‖² = ‖v‖².
Since norms are always nonnegative, taking square roots gives ‖proj_u(v)‖ ≤ ‖v‖.
(b)
‖proj_u(v)‖ ≤ ‖v‖ ⇔ ‖((u·v)/(u·u)) u‖ ≤ ‖v‖
⇔ ‖((u·v)/‖u‖²) u‖ ≤ ‖v‖
⇔ (|u·v|/‖u‖²) ‖u‖ ≤ ‖v‖
⇔ |u·v|/‖u‖ ≤ ‖v‖
⇔ |u·v| ≤ ‖u‖ ‖v‖,
which is the Cauchy–Schwarz Inequality.
73. Suppose proj_u(v) = cu. From the figure, we see that cos θ = c‖u‖/‖v‖. But also cos θ = (u·v)/(‖u‖ ‖v‖). Thus these two expressions are equal, i.e.,
c‖u‖/‖v‖ = (u·v)/(‖u‖ ‖v‖) ⇒ c‖u‖ = (u·v)/‖u‖ ⇒ c = (u·v)/(‖u‖ ‖u‖) = (u·v)/(u·u).
74. The basis for induction is the cases n = 1 and n = 2. The n = 1 case is the assertion that ‖v₁‖ ≤ ‖v₁‖, which is obviously true. The n = 2 case is the Triangle Inequality, which is also true.
Now assume the statement holds for n = k ≥ 2; that is, for any v₁, v₂, . . . , vₖ,
‖v₁ + v₂ + ··· + vₖ‖ ≤ ‖v₁‖ + ‖v₂‖ + ··· + ‖vₖ‖.
Let v₁, v₂, . . . , vₖ, vₖ₊₁ be any vectors. Then
‖v₁ + v₂ + ··· + vₖ + vₖ₊₁‖ = ‖v₁ + v₂ + ··· + (vₖ + vₖ₊₁)‖ ≤ ‖v₁‖ + ‖v₂‖ + ··· + ‖vₖ + vₖ₊₁‖
using the inductive hypothesis. But then using the Triangle Inequality (or the case n = 2 in this theorem), ‖vₖ + vₖ₊₁‖ ≤ ‖vₖ‖ + ‖vₖ₊₁‖. Substituting into the above gives
‖v₁ + v₂ + ··· + vₖ + vₖ₊₁‖ ≤ ‖v₁‖ + ‖v₂‖ + ··· + ‖vₖ + vₖ₊₁‖ ≤ ‖v₁‖ + ‖v₂‖ + ··· + ‖vₖ‖ + ‖vₖ₊₁‖,
which is what we were trying to prove.
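The generalized Triangle Inequality just proved can be checked numerically for any finite list of vectors; a small plain-Python sketch:

```python
import math

def norm(u):
    return math.sqrt(sum(x * x for x in u))

vectors = [[1, 2], [-3, 0.5], [4, -1], [0.25, 7]]  # arbitrary test vectors
total = [sum(components) for components in zip(*vectors)]

# ||v1 + ... + vn|| <= ||v1|| + ... + ||vn||
assert norm(total) <= sum(norm(v) for v in vectors)
print(norm(total), sum(norm(v) for v in vectors))
```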
Exploration: Vectors and Geometry
1. As in Example 1.25, let p = OP. Then p − a = AP = (1/3)AB = (1/3)(b − a), so that p = a + (1/3)(b − a) = (1/3)(2a + b). More generally, if P is the point 1/n of the way from A to B along AB, then p − a = AP = (1/n)AB = (1/n)(b − a), so that p = a + (1/n)(b − a) = (1/n)((n − 1)a + b).
2. Use the notation that the vector OX is written x. Then from Exercise 1, we have p = (1/2)(a + c) and q = (1/2)(b + c), so that
PQ = q − p = (1/2)(b + c) − (1/2)(a + c) = (1/2)(b − a) = (1/2)AB.
3. Draw AC. Then from Exercise 2, we have PQ = (1/2)AC = SR. Also draw BD. Again from Exercise 2, we have PS = (1/2)BD = QR. Thus opposite sides of the quadrilateral PQRS are equal. They are also parallel: indeed, △BPQ and △BAC are similar, since they share an angle and BP : BA = BQ : BC. Thus ∠BPQ = ∠BAC; since these angles are equal, PQ ∥ AC. Similarly, SR ∥ AC, so that PQ ∥ SR. In a like manner, we see that PS ∥ RQ. Thus PQRS is a parallelogram.
4. Following the hint, we find m, the point that is two-thirds of the distance from A to P. From Exercise 1, we have
p = (1/2)(b + c), so that m = (1/3)(2p + a) = (1/3)(2·(1/2)(b + c) + a) = (1/3)(a + b + c).
Next we find m′, the point that is two-thirds of the distance from B to Q. Again from Exercise 1, we have
q = (1/2)(a + c), so that m′ = (1/3)(2q + b) = (1/3)(2·(1/2)(a + c) + b) = (1/3)(a + b + c).
Finally we find m″, the point that is two-thirds of the distance from C to R. Again from Exercise 1, we have
r = (1/2)(a + b), so that m″ = (1/3)(2r + c) = (1/3)(2·(1/2)(a + b) + c) = (1/3)(a + b + c).
Since m = m′ = m″, all three medians intersect at the centroid, G.
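The median calculation above can be replayed for a concrete triangle; the coordinates below are an arbitrary example, and each two-thirds point lands on the same centroid (a + b + c)/3.

```python
# Vertices of an arbitrary test triangle
a, b, c = [0.0, 0.0], [4.0, 0.0], [1.0, 3.0]

# Midpoints of the opposite sides: P on BC, Q on AC, R on AB
p = [(x + y) / 2 for x, y in zip(b, c)]
q = [(x + y) / 2 for x, y in zip(a, c)]
r = [(x + y) / 2 for x, y in zip(a, b)]

# Point two-thirds of the way along each median: (1/3)(2 * midpoint + vertex)
medians = [[(2 * mi + vi) / 3 for mi, vi in zip(m, vtx)]
           for m, vtx in ((p, a), (q, b), (r, c))]
centroid = [(x + y + z) / 3 for x, y, z in zip(a, b, c)]

for m in medians:
    assert all(abs(mi - gi) < 1e-12 for mi, gi in zip(m, centroid))
print(centroid)
```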
5. With notation as in the figure, we know that AH is orthogonal to BC; that is, AH · BC = 0. Also BH is orthogonal to AC; that is, BH · AC = 0. We must show that CH · AB = 0. But
AH · BC = 0 ⇒ (h − a)·(b − c) = 0 ⇒ h·b − h·c − a·b + a·c = 0
BH · AC = 0 ⇒ (h − b)·(c − a) = 0 ⇒ h·c − h·a − b·c + b·a = 0.
Adding these two equations together and canceling like terms gives
0 = h·b − h·a − c·b + a·c = (h − c)·(b − a) = CH · AB,
so that these two are orthogonal. Thus all the altitudes intersect at the orthocenter H.
6. We are given that QK is orthogonal to AC and that PK is orthogonal to CB, and must show that RK is orthogonal to AB. By Exercise 1, we have q = (1/2)(a + c), p = (1/2)(b + c), and r = (1/2)(a + b). Thus
QK · AC = 0 ⇒ (k − q)·(c − a) = 0 ⇒ (k − (1/2)(a + c))·(c − a) = 0
PK · CB = 0 ⇒ (k − p)·(b − c) = 0 ⇒ (k − (1/2)(b + c))·(b − c) = 0.
Expanding the two dot products gives
k·c − k·a − (1/2)a·c + (1/2)a·a − (1/2)c·c + (1/2)a·c = 0
k·b − k·c − (1/2)b·b + (1/2)b·c − (1/2)c·b + (1/2)c·c = 0.
Add these two together and cancel like terms to get
0 = k·b − k·a − (1/2)b·b + (1/2)a·a = (k − (1/2)(b + a))·(b − a) = (k − r)·(b − a) = RK · AB.
Thus RK and AB are indeed orthogonal, so all the perpendicular bisectors intersect at the circumcenter.
7. Let O, the center of the circle, be the origin. Then b = −a and ‖a‖² = ‖c‖² = r², where r is the radius of the circle. We want to show that AC is orthogonal to BC. But
AC · BC = (c − a)·(c − b)
= (c − a)·(c + a)
= ‖c‖² + c·a − ‖a‖² − a·c
= (a·c − c·a) + (r² − r²) = 0.
Thus the two are orthogonal, so that ∠ACB is a right angle.
8. As in Exercise 5, we first find m, the point that is halfway from P to R. We have p = (1/2)(a + b) and r = (1/2)(c + d), so that
m = (1/2)(p + r) = (1/2)((1/2)(a + b) + (1/2)(c + d)) = (1/4)(a + b + c + d).
Similarly, we find m′, the point that is halfway from Q to S. We have q = (1/2)(b + c) and s = (1/2)(a + d), so that
m′ = (1/2)(q + s) = (1/2)((1/2)(b + c) + (1/2)(a + d)) = (1/4)(a + b + c + d).
Thus m = m′, so that PR and QS intersect at their mutual midpoints; thus, they bisect each other.
1.3 Lines and Planes
1. (a) The normal form is n·(x − p) = 0, or [3, 2]·(x − [0, 0]) = 0.
(b) Letting x = [x, y], we get
[3, 2]·([x, y] − [0, 0]) = [3, 2]·[x, y] = 3x + 2y = 0.
The general form is 3x + 2y = 0.
2. (a) The normal form is n·(x − p) = 0, or [3, −4]·(x − [1, 2]) = 0.
(b) Letting x = [x, y], we get
[3, −4]·([x, y] − [1, 2]) = [3, −4]·[x − 1, y − 2] = 3(x − 1) − 4(y − 2) = 0.
Expanding and simplifying gives the general form 3x − 4y = −5.
3. (a) In vector form, the equation of the line is x = p + td, or x = [1, 0] + t[−1, 3].
(b) Letting x = [x, y] and expanding the vector form from part (a) gives [x, y] = [1 − t, 3t], which yields the parametric form x = 1 − t, y = 3t.
4. (a) In vector form, the equation of the line is x = p + td, or x = [−4, 4] + t[1, 1].
(b) Letting x = [x, y] and expanding the vector form from part (a) gives the parametric form x = −4 + t, y = 4 + t.
5. (a) In vector form, the equation of the line is x = p + td, or x = [0, 0, 0] + t[1, −1, 4].
(b) Letting x = [x, y, z] and expanding the vector form from part (a) gives the parametric form x = t, y = −t, z = 4t.
6. (a) In vector form, the equation of the line is x = p + td, or x = [3, 0, −2] + t[2, 5, 0].
(b) Letting x = [x, y, z] and expanding the vector form from part (a) gives [x, y, z] = [3 + 2t, 5t, −2], which yields the parametric form x = 3 + 2t, y = 5t, z = −2.
7. (a) The normal form is n·(x − p) = 0, or [3, 2, 1]·(x − [0, 1, 0]) = 0.
(b) Letting x = [x, y, z], we get
[3, 2, 1]·([x, y, z] − [0, 1, 0]) = [3, 2, 1]·[x, y − 1, z] = 3x + 2(y − 1) + z = 0.
Expanding and simplifying gives the general form 3x + 2y + z = 2.
8. (a) The normal form is n·(x − p) = 0, or [2, 5, 0]·(x − [3, 0, −2]) = 0.
(b) Letting x = [x, y, z], we get
[2, 5, 0]·([x, y, z] − [3, 0, −2]) = [2, 5, 0]·[x − 3, y, z + 2] = 2(x − 3) + 5y = 0.
Expanding and simplifying gives the general form 2x + 5y = 6.
9. (a) In vector form, the equation of the plane is x = p + su + tv, or
x = [0, 0, 0] + s[2, 1, 2] + t[−3, 2, 1].
(b) Letting x = [x, y, z] and expanding the vector form from part (a) gives
[x, y, z] = [2s − 3t, s + 2t, 2s + t],
which yields the parametric form x = 2s − 3t, y = s + 2t, z = 2s + t.
10. (a) In vector form, the equation of the plane is x = p + su + tv, or
x = [6, −4, −3] + s[0, 1, 1] + t[−1, 1, 1].
(b) Letting x = [x, y, z] and expanding the vector form from part (a) gives the parametric form x = 6 − t, y = −4 + s + t, z = −3 + s + t.
11. Any pair of points on ℓ determines a direction vector, so we use P and Q. We choose P to represent the point on the line. Then a direction vector for the line is d = PQ = (3, 0) − (1, −2) = (2, 2). The vector equation for the line is x = p + td, or x = [1, −2] + t[2, 2].
12. Any pair of points on ℓ determines a direction vector, so we use P and Q. We choose P to represent the point on the line. Then a direction vector for the line is d = PQ = (−2, 1, 3) − (0, 1, −1) = (−2, 0, 4). The vector equation for the line is x = p + td, or x = [0, 1, −1] + t[−2, 0, 4].
13. We must find two direction vectors, u and v. Since P, Q, and R lie in a plane, we compute two direction vectors
u = PQ = q − p = (4, 0, 2) − (1, 1, 1) = (3, −1, 1)
v = PR = r − p = (0, 1, −1) − (1, 1, 1) = (−1, 0, −2).
Since u and v are not scalar multiples of each other, they will serve as direction vectors (if they were parallel to each other, we would have not a plane but a line). So the vector equation for the plane is x = p + su + tv, or x = [1, 1, 1] + s[3, −1, 1] + t[−1, 0, −2].
14. We must find two direction vectors, u and v. Since P, Q, and R lie in a plane, we compute two direction vectors
u = PQ = q − p = (1, 0, 1) − (1, 1, 0) = (0, −1, 1)
v = PR = r − p = (0, 1, 1) − (1, 1, 0) = (−1, 0, 1).
Since u and v are not scalar multiples of each other, they will serve as direction vectors (if they were parallel to each other, we would have not a plane but a line). So the vector equation for the plane is x = p + su + tv, or x = [1, 1, 0] + s[0, −1, 1] + t[−1, 0, 1].
15. The parametric and associated vector forms x = p + td found below are not unique.
(a) As in the remarks prior to Example 1.20, we start by letting x = t. Substituting x = t into y = 3x − 1 gives y = 3t − 1. So we get parametric equations x = t, y = 3t − 1, and corresponding vector form
[x, y] = [0, −1] + t[1, 3].
(b) In this case, since the coefficient of y is 2, we start by letting x = 2t. Substituting x = 2t into 3x + 2y = 5 gives 3·2t + 2y = 5, which gives y = −3t + 5/2. So we get parametric equations x = 2t, y = 5/2 − 3t, with corresponding vector equation
[x, y] = [0, 5/2] + t[2, −3].
Note that the equation was of the form ax + by = c with a = 3, b = 2, and that a direction vector was given by [b, −a]. This is true in general.
16. Note that x = p + t(q − p) is the line that passes through p (when t = 0) and q (when t = 1). We write d = q − p; this is a direction vector for the line through p and q.
(a) As noted above, the line p + td passes through P at t = 0 and through Q at t = 1. So as t varies from 0 to 1, the line describes the line segment PQ.
(b) As shown in Exploration: Vectors and Geometry, to find the midpoint of PQ, we start at P and travel half the length of PQ in the direction of the vector PQ = q − p. That is, the midpoint of PQ is the head of the vector p + (1/2)(q − p). Since x = p + t(q − p), we see that this line passes through the midpoint at t = 1/2, and that the midpoint is in fact p + (1/2)(q − p) = (1/2)(p + q).
(c) From part (b), the midpoint is (1/2)([2, −3] + [0, 1]) = (1/2)[2, −2] = [1, −1].
(d) From part (b), the midpoint is (1/2)([1, 0, 1] + [4, 1, −2]) = (1/2)[5, 1, −1] = [5/2, 1/2, −1/2].
(e) Again from Exploration: Vectors and Geometry, the vector whose head is 1/3 of the way from P to Q along PQ is x₁ = (1/3)(2p + q). Similarly, the vector whose head is 2/3 of the way from P to Q along PQ is also the vector one third of the way from Q to P along QP; applying the same formula gives for this point x₂ = (1/3)(2q + p). When p = [2, −3] and q = [0, 1], we get
x₁ = (1/3)(2[2, −3] + [0, 1]) = (1/3)[4, −5] = [4/3, −5/3]
x₂ = (1/3)(2[0, 1] + [2, −3]) = (1/3)[2, −1] = [2/3, −1/3].
(f) Using the formulas from part (e) with p = [1, 0, −1] and q = [4, 1, −2] gives
x₁ = (1/3)(2[1, 0, −1] + [4, 1, −2]) = (1/3)[6, 1, −4] = [2, 1/3, −4/3]
x₂ = (1/3)(2[4, 1, −2] + [1, 0, −1]) = (1/3)[9, 2, −5] = [3, 2/3, −5/3].
17. A line ℓ₁ with slope m₁ has equation y = m₁x + b₁, or −m₁x + y = b₁. Similarly, a line ℓ₂ with slope m₂ has equation y = m₂x + b₂, or −m₂x + y = b₂. Thus the normal vector for ℓ₁ is n₁ = [−m₁, 1], and the normal vector for ℓ₂ is n₂ = [−m₂, 1]. Now, ℓ₁ and ℓ₂ are perpendicular if and only if their normal vectors are perpendicular, i.e., if and only if n₁·n₂ = 0. But
n₁·n₂ = [−m₁, 1]·[−m₂, 1] = m₁m₂ + 1,
so that the normal vectors are perpendicular if and only if m₁m₂ + 1 = 0, i.e., if and only if m₁m₂ = −1.
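The slope condition can be illustrated numerically; m₁ = 2 and m₂ = −1/2 below are an arbitrary perpendicular pair.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

m1, m2 = 2.0, -0.5   # arbitrary slopes with m1 * m2 = -1
n1 = [-m1, 1]        # normal vector of y = m1 x + b1
n2 = [-m2, 1]        # normal vector of y = m2 x + b2

assert dot(n1, n2) == m1 * m2 + 1 == 0
print(dot(n1, n2))  # 0.0
```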
18. Suppose the line ℓ has direction vector d, and the plane P has normal vector n. If d·n = 0 (d and n are orthogonal), then the line ℓ is parallel to the plane P. If on the other hand d and n are parallel, so that d = cn for some scalar c, then ℓ is perpendicular to P. Here d = [2, 3, −1].
(a) Since the general form of P is 2x + 3y − z = 1, its normal vector is n = [2, 3, −1]. Since d = n, we see that ℓ is perpendicular to P.
(b) Since the general form of P is 4x − y + 5z = 0, its normal vector is n = [4, −1, 5]. Since
d·n = [2, 3, −1]·[4, −1, 5] = 2·4 + 3·(−1) − 1·5 = 0,
ℓ is parallel to P.
(c) Since the general form of P is x − y − z = 3, its normal vector is n = [1, −1, −1]. Since
d·n = [2, 3, −1]·[1, −1, −1] = 2·1 + 3·(−1) − 1·(−1) = 0,
ℓ is parallel to P.
(d) Since the general form of P is 4x + 6y − 2z = 0, its normal vector is n = [4, 6, −2]. Since
d = [2, 3, −1] = (1/2)[4, 6, −2] = (1/2)n,
ℓ is perpendicular to P.
19. Suppose the plane P₁ has normal vector n₁, and the plane P has normal vector n. If n₁·n = 0 (n₁ and n are orthogonal), then P₁ is perpendicular to P. If on the other hand n₁ and n are parallel, so that n₁ = cn, then P₁ is parallel to P. Note that in this exercise, P₁ has the equation 4x − y + 5z = 2, so that n₁ = [4, −1, 5].
(a) Since the general form of P is 2x + 3y − z = 1, its normal vector is n = [2, 3, −1]. Since
n₁·n = [4, −1, 5]·[2, 3, −1] = 4·2 − 1·3 + 5·(−1) = 0,
the normal vectors are perpendicular, and thus P₁ is perpendicular to P.
(b) Since the general form of P is 4x − y + 5z = 0, its normal vector is n = [4, −1, 5]. Since n₁ = n, P₁ is parallel to P.
(c) Since the general form of P is x − y − z = 3, its normal vector is n = [1, −1, −1]. Since
n₁·n = [4, −1, 5]·[1, −1, −1] = 4·1 − 1·(−1) + 5·(−1) = 0,
the normal vectors are perpendicular, and thus P₁ is perpendicular to P.
(d) Since the general form of P is 4x + 6y − 2z = 0, its normal vector is n = [4, 6, −2]. Since
n₁·n = [4, −1, 5]·[4, 6, −2] = 4·4 − 1·6 + 5·(−2) = 0,
the normal vectors are perpendicular, and thus P₁ is perpendicular to P.
20. Since the vector form is x = p + td, we use the given information to determine p and d. The general equation of the given line is 2x − 3y = 1, so its normal vector is n = [2, −3]. Our line is perpendicular to that line, so it has direction vector d = n = [2, −3]. Furthermore, since our line passes through the point P = (2, −1), we have p = [2, −1]. Thus the vector form of the line perpendicular to 2x − 3y = 1 through the point P = (2, −1) is
[x, y] = [2, −1] + t[2, −3].
21. Since the vector form is x = p + td, we use the given information to determine p and d. The general equation of the given line is 2x − 3y = 1, so its normal vector is n = [2, −3]. Our line is parallel to that line, so it has direction vector d = [3, 2] (note that d·n = 0). Since our line passes through the point P = (2, −1), we have p = [2, −1], so that the vector equation of the line parallel to 2x − 3y = 1 through the point P = (2, −1) is
[x, y] = [2, −1] + t[3, 2].
22. Since the vector form is x = p + td, we use the given information to determine p and d. A line is perpendicular to a plane if its direction vector d is the normal vector n of the plane. The general equation of the given plane is x − 3y + 2z = 5, so its normal vector is n = [1, −3, 2]. Thus the direction vector of our line is d = [1, −3, 2]. Furthermore, since our line passes through the point P = (−1, 0, 3), we have p = [−1, 0, 3]. So the vector form of the line perpendicular to x − 3y + 2z = 5 through P = (−1, 0, 3) is
[x, y, z] = [−1, 0, 3] + t[1, −3, 2].
23. Since the vector form is x = p + td, we use the given information to determine p and d. Since the given line has parametric equations x = 1 − t, y = 2 + 3t, z = −2 − t, it has vector form
[x, y, z] = [1, 2, −2] + t[−1, 3, −1].
So its direction vector is [−1, 3, −1], and this must be the direction vector d of the line we want, which is parallel to the given line. Since our line passes through the point P = (−1, 0, 3), we have p = [−1, 0, 3]. So the vector form of the line parallel to the given line through P = (−1, 0, 3) is
[x, y, z] = [−1, 0, 3] + t[−1, 3, −1].
24. Since the normal form is n·x = n·p, we use the given information to determine n and p. Note that a plane is parallel to a given plane if their normal vectors are equal. Since the general form of the given plane is 6x − y + 2z = 3, its normal vector is n = [6, −1, 2], so this must be a normal vector of the desired plane as well. Furthermore, since our plane passes through the point P = (0, −2, 5), we have p = [0, −2, 5]. So the normal form of the plane parallel to 6x − y + 2z = 3 through (0, −2, 5) is
[6, −1, 2]·[x, y, z] = [6, −1, 2]·[0, −2, 5], or [6, −1, 2]·[x, y, z] = 12.
25. Using Figure 1.34 in Section 1.2 for reference, we will find a normal vector n and a point vector p for each of the sides, then substitute into n·x = n·p to get an equation for each plane.
(a) Start with P₁, determined by the face of the cube in the yz-plane. Clearly a normal vector for P₁ is n = [1, 0, 0], or any vector parallel to the x-axis. Also, the plane passes through P = (0, 0, 0), so we set p = [0, 0, 0]. Then substituting gives
[1, 0, 0]·[x, y, z] = [1, 0, 0]·[0, 0, 0], or x = 0.
So the general equation for P₁ is x = 0. Applying the same argument to the plane P₂ determined by the face in the xz-plane gives a general equation of y = 0, and similarly the plane P₃ determined by the face in the xy-plane gives a general equation of z = 0.
Now consider P₄, the plane containing the face parallel to the face in the yz-plane but passing through (1, 1, 1). Since P₄ is parallel to P₁, its normal vector is also [1, 0, 0]; since it passes through (1, 1, 1), we set p = [1, 1, 1]. Then substituting gives
[1, 0, 0]·[x, y, z] = [1, 0, 0]·[1, 1, 1], or x = 1.
So the general equation for P₄ is x = 1. Similarly, the general equations for P₅ and P₆ are y = 1 and z = 1.
(b) Let n = [x, y, z] be a normal vector for the desired plane P. Since P is perpendicular to the xy-plane, their normal vectors must be orthogonal. Thus
[x, y, z]·[0, 0, 1] = x·0 + y·0 + z·1 = z = 0.
Thus z = 0, so the normal vector is of the form n = [x, y, 0]. But the normal vector is also perpendicular to the plane in question, by definition. Since that plane contains both the origin and (1, 1, 1), the normal vector is orthogonal to (1, 1, 1) − (0, 0, 0):
[x, y, 0]·[1, 1, 1] = x·1 + y·1 + 0·1 = x + y = 0.
Thus x + y = 0, so that y = −x. So finally, a normal vector to P is given by n = [x, −x, 0] for any nonzero x. We may as well choose x = 1, giving n = [1, −1, 0]. Since the plane passes through (0, 0, 0), we let p = 0. Then substituting in n·x = n·p gives
[1, −1, 0]·[x, y, z] = [1, −1, 0]·[0, 0, 0], or x − y = 0.
Thus the general equation for the plane perpendicular to the xy-plane and containing the diagonal from the origin to (1, 1, 1) is x − y = 0.
(c) As in Example 1.22 (Figure 1.34) in Section 1.2, use u = [0, 1, 1] and v = [1, 0, 1] as two vectors in the required plane. If n = [x, y, z] is a normal vector to the plane, then n·u = 0 = n·v:
n·u = [x, y, z]·[0, 1, 1] = y + z = 0 ⇒ y = −z,  n·v = [x, y, z]·[1, 0, 1] = x + z = 0 ⇒ x = −z.
Thus the normal vector is of the form n = [−z, −z, z] for any z. Taking z = −1 gives n = [1, 1, −1]. Now, the side diagonals pass through (0, 0, 0), so set p = 0. Then n·x = n·p yields
[1, 1, −1]·[x, y, z] = [1, 1, −1]·[0, 0, 0], or x + y − z = 0.
The general equation for the plane containing the side diagonals is x + y − z = 0.
26. Finding the distance between points A and B is equivalent to finding d(a, b), where a is the vector
from the origin to A, and similarly for b. Given x = [x, y, z], p = [1, 0, −2], and q = [5, 2, 4], we want
to solve d(x, p) = d(x, q); that is,
d(x, p) = √((x − 1)² + (y − 0)² + (z + 2)²) = √((x − 5)² + (y − 2)² + (z − 4)²) = d(x, q).
1.3. LINES AND PLANES
35
Squaring both sides gives

(x − 1)² + (y − 0)² + (z + 2)² = (x − 5)² + (y − 2)² + (z − 4)²
⇒ x² − 2x + 1 + y² + z² + 4z + 4 = x² − 10x + 25 + y² − 4y + 4 + z² − 8z + 16
⇒ 8x + 4y + 12z = 40
⇒ 2x + y + 3z = 10.

Thus all such points (x, y, z) lie on the plane 2x + y + 3z = 10.
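Since this exercise is purely computational, it is easy to sanity-check. The following sketch (plain Python, standard library only; the sample points are my own choices) verifies that several points satisfying 2x + y + 3z = 10 are indeed equidistant from P and Q.

```python
import math

p = (1, 0, -2)
q = (5, 2, 4)

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# A few points chosen to satisfy 2x + y + 3z = 10
for pt in [(5, 0, 0), (0, 10, 0), (1, 2, 2)]:
    assert 2 * pt[0] + pt[1] + 3 * pt[2] == 10
    assert math.isclose(dist(pt, p), dist(pt, q))
```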
27. To calculate d(Q, ℓ) = |ax₀ + by₀ − c| / √(a² + b²), we first put ℓ into general form. With d = [1, −1], we get n = [1, 1], since then n · d = 0. Then we have

n · x = n · p ⇒ [1, 1] · [x, y] = [1, 1] · [−1, 2] = 1.

Thus x + y = 1, and thus a = b = c = 1. Since Q = (2, 2) = (x₀, y₀), we have

d(Q, ℓ) = |1 · 2 + 1 · 2 − 1| / √(1² + 1²) = 3/√2 = 3√2/2.
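The formula used here is easy to package as a helper (the function name is my own; standard library only):

```python
import math

def dist_point_line_2d(q, a, b, c):
    """Distance from the point q = (x0, y0) to the line ax + by = c."""
    x0, y0 = q
    return abs(a * x0 + b * y0 - c) / math.hypot(a, b)

# Exercise 27: Q = (2, 2) and the line x + y = 1
assert math.isclose(dist_point_line_2d((2, 2), 1, 1, 1), 3 * math.sqrt(2) / 2)
```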
 
28. Comparing the given equation to x = p + td, we get P = (1, 1, 1) and d = [−2, 0, 3]. As suggested by Figure 1.63, we need to calculate the length of RQ, where R is the point on the line at the foot of the perpendicular from Q. So if v = PQ, then

PR = proj_d v,    RQ = v − proj_d v.

Now, v = PQ = q − p = [0, 1, 0] − [1, 1, 1] = [−1, 0, −1], so that

proj_d v = (d · v / d · d) d = ((−2) · (−1) + 3 · (−1)) / ((−2)² + 3²) · [−2, 0, 3] = −(1/13)[−2, 0, 3] = [2/13, 0, −3/13].

Thus

v − proj_d v = [−1, 0, −1] − [2/13, 0, −3/13] = [−15/13, 0, −10/13].

Then the distance d(Q, ℓ) from Q to ℓ is

‖v − proj_d v‖ = (5/13)‖[3, 0, 2]‖ = (5/13)√(3² + 2²) = 5√13/13.
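The projection computation above can be checked numerically; here is a minimal sketch (standard library only), where Q = (0, 1, 0) is read off from v = q − p in the solution:

```python
import math

def dist_point_line(q, p, d):
    """Distance from q to the line x = p + t d, computed as ||v - proj_d(v)||."""
    v = [qi - pi for qi, pi in zip(q, p)]
    t = sum(di * vi for di, vi in zip(d, v)) / sum(di * di for di in d)
    return math.sqrt(sum((vi - t * di) ** 2 for vi, di in zip(v, d)))

# Exercise 28: Q = (0, 1, 0), line through (1, 1, 1) with direction (-2, 0, 3)
assert math.isclose(dist_point_line((0, 1, 0), (1, 1, 1), (-2, 0, 3)),
                    5 * math.sqrt(13) / 13)
```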
29. To calculate d(Q, P) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²), we first note that the plane has equation x + y − z = 0, so that a = b = 1, c = −1, and d = 0. Also, Q = (2, 2, 2), so that x₀ = y₀ = z₀ = 2. Hence

d(Q, P) = |1 · 2 + 1 · 2 − 1 · 2 − 0| / √(1² + 1² + (−1)²) = 2/√3 = 2√3/3.
30. To calculate d(Q, P) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²), we first note that the plane has equation x − 2y + 2z = 1, so that a = 1, b = −2, c = 2, and d = 1. Also, Q = (0, 0, 0), so that x₀ = y₀ = z₀ = 0. Hence

d(Q, P) = |1 · 0 − 2 · 0 + 2 · 0 − 1| / √(1² + (−2)² + 2²) = 1/3.
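The point-to-plane distance formula used in Exercises 29 and 30 can be sketched as a short helper (function name my own, standard library only):

```python
import math

def dist_point_plane(q, a, b, c, d):
    """Distance from q = (x0, y0, z0) to the plane ax + by + cz = d."""
    x0, y0, z0 = q
    return abs(a * x0 + b * y0 + c * z0 - d) / math.sqrt(a * a + b * b + c * c)

assert math.isclose(dist_point_plane((2, 2, 2), 1, 1, -1, 0), 2 * math.sqrt(3) / 3)  # Exercise 29
assert math.isclose(dist_point_plane((0, 0, 0), 1, -2, 2, 1), 1 / 3)                 # Exercise 30
```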
31. Figure 1.66 suggests that we let v = PQ; then w = PR = proj_d v. Comparing the given line ℓ to x = p + td, we get P = (−1, 2) and d = [1, −1]. Then v = PQ = q − p = [2, 2] − [−1, 2] = [3, 0]. Next,

w = proj_d v = (d · v / d · d) d = (1 · 3 + (−1) · 0) / (1 · 1 + (−1) · (−1)) · [1, −1] = (3/2)[1, −1].

So

r = p + PR = p + proj_d v = p + w = [−1, 2] + [3/2, −3/2] = [1/2, 1/2].

So the point R on ℓ that is closest to Q is (1/2, 1/2).
32. Figure 1.66 suggests that we let v = PQ; then PR = proj_d v. Comparing the given line ℓ to x = p + td, we get P = (1, 1, 1) and d = [−2, 0, 3]. Then v = PQ = q − p = [0, 1, 0] − [1, 1, 1] = [−1, 0, −1]. Next,

proj_d v = (d · v / d · d) d = ((−2) · (−1) + 3 · (−1)) / ((−2)² + 3²) · [−2, 0, 3] = [2/13, 0, −3/13].

So

r = p + PR = p + proj_d v = [1, 1, 1] + [2/13, 0, −3/13] = [15/13, 1, 10/13].

So the point R on ℓ that is closest to Q is (15/13, 1, 10/13).
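The construction in Exercises 31 and 32 (foot of the perpendicular as p + proj_d v) can be checked exactly with rational arithmetic; a minimal sketch (function name my own):

```python
from fractions import Fraction

def closest_point_on_line(q, p, d):
    """Closest point R on the line x = p + t d to the point q."""
    v = [qi - pi for qi, pi in zip(q, p)]
    t = Fraction(sum(di * vi for di, vi in zip(d, v)),
                 sum(di * di for di in d))
    return [pi + t * di for pi, di in zip(p, d)]

# Exercise 31: line through (-1, 2) with direction (1, -1), Q = (2, 2)
assert closest_point_on_line((2, 2), (-1, 2), (1, -1)) == [Fraction(1, 2), Fraction(1, 2)]
# Exercise 32: line through (1, 1, 1) with direction (-2, 0, 3), Q = (0, 1, 0)
assert closest_point_on_line((0, 1, 0), (1, 1, 1), (-2, 0, 3)) == [Fraction(15, 13), 1, Fraction(10, 13)]
```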
33. Figure 1.67 suggests we let v = PQ, where P is some point on the plane; then QR = proj_n v. The equation of the plane is x + y − z = 0, so n = [1, 1, −1]. Setting y = 0 shows that P = (1, 0, 1) is a point on the plane. Then

v = PQ = q − p = [2, 2, 2] − [1, 0, 1] = [1, 2, 1],

so that

proj_n v = (n · v / n · n) n = ((1 · 1 + 1 · 2 − 1 · 1)/(1² + 1² + (−1)²)) [1, 1, −1] = (2/3)[1, 1, −1] = [2/3, 2/3, −2/3].

Finally,

r = p + v − proj_n v = [2, 2, 2] − [2/3, 2/3, −2/3] = [4/3, 4/3, 8/3].

Therefore, the point R in P that is closest to Q is (4/3, 4/3, 8/3).
34. Figure 1.67 suggests we let v = PQ, where P is some point on the plane; then QR = proj_n v. The equation of the plane is x − 2y + 2z = 1, so n = [1, −2, 2]. Setting y = z = 0 shows that P = (1, 0, 0) is a point on the plane. Then

v = PQ = q − p = [0, 0, 0] − [1, 0, 0] = [−1, 0, 0],

so that

proj_n v = (n · v / n · n) n = (1 · (−1)) / (1² + (−2)² + 2²) · [1, −2, 2] = −(1/9)[1, −2, 2] = [−1/9, 2/9, −2/9].

Finally,

r = p + v − proj_n v = [0, 0, 0] − [−1/9, 2/9, −2/9] = [1/9, −2/9, 2/9].

Therefore, the point R in P that is closest to Q is (1/9, −2/9, 2/9). (Note that this point does lie on the plane: 1/9 − 2(−2/9) + 2(2/9) = 1.)
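The computations in Exercises 33 and 34 can be checked exactly; a small sketch (function name my own) that moves q along n until the plane n · x = d is reached:

```python
from fractions import Fraction

def closest_point_on_plane(q, n, d):
    """Closest point on the plane n . x = d to the point q, namely q + t n."""
    t = Fraction(d - sum(ni * qi for ni, qi in zip(n, q)),
                 sum(ni * ni for ni in n))
    return [qi + t * ni for qi, ni in zip(q, n)]

# Exercise 33: plane x + y - z = 0, Q = (2, 2, 2)
assert closest_point_on_plane((2, 2, 2), (1, 1, -1), 0) == [Fraction(4, 3), Fraction(4, 3), Fraction(8, 3)]
# Exercise 34: plane x - 2y + 2z = 1, Q = (0, 0, 0); the result really lies on the plane
r = closest_point_on_plane((0, 0, 0), (1, -2, 2), 1)
assert r == [Fraction(1, 9), Fraction(-2, 9), Fraction(2, 9)]
assert r[0] - 2 * r[1] + 2 * r[2] == 1
```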
35. Since the given lines ℓ1 and ℓ2 are parallel, choose arbitrary points Q on ℓ1 and P on ℓ2, say Q = (1, 1) and P = (5, 4). The direction vector of ℓ2 is d = [−2, 3]. Then

v = PQ = q − p = [1, 1] − [5, 4] = [−4, −3],

so that

proj_d v = (d · v / d · d) d = (−2 · (−4) + 3 · (−3)) / ((−2)² + 3²) · [−2, 3] = −(1/13)[−2, 3].

Then the distance between the lines is given by

‖v − proj_d v‖ = ‖[−4, −3] + (1/13)[−2, 3]‖ = (1/13)‖[−54, −36]‖ = (18/13)‖[3, 2]‖ = (18/13)√13 = 18√13/13.
36. Since the given lines ℓ1 and ℓ2 are parallel, choose arbitrary points Q on ℓ1 and P on ℓ2, say Q = (1, 0, −1) and P = (0, 1, 1). The direction vector of ℓ2 is d = [1, 1, 1]. Then

v = PQ = q − p = [1, 0, −1] − [0, 1, 1] = [1, −1, −2],

so that

proj_d v = (d · v / d · d) d = (1 · 1 + 1 · (−1) + 1 · (−2)) / (1² + 1² + 1²) · [1, 1, 1] = −(2/3)[1, 1, 1].

Then the distance between the lines is given by

‖v − proj_d v‖ = ‖[1, −1, −2] + (2/3)[1, 1, 1]‖ = ‖[5/3, −1/3, −4/3]‖ = (1/3)√(5² + (−1)² + (−4)²) = √42/3.
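The procedure of Exercises 35 and 36 (pick one point on each parallel line, project the difference onto the common direction) can be sketched as follows (function name my own, standard library only):

```python
import math

def dist_parallel_lines(p1, p2, d):
    """Distance between parallel lines through p1 and p2 with common direction d."""
    v = [a - b for a, b in zip(p1, p2)]
    t = sum(di * vi for di, vi in zip(d, v)) / sum(di * di for di in d)
    return math.sqrt(sum((vi - t * di) ** 2 for vi, di in zip(v, d)))

# Exercise 35: Q = (1, 1) on l1, P = (5, 4) on l2, direction (-2, 3)
assert math.isclose(dist_parallel_lines((1, 1), (5, 4), (-2, 3)), 18 * math.sqrt(13) / 13)
# Exercise 36: Q = (1, 0, -1) on l1, P = (0, 1, 1) on l2, direction (1, 1, 1)
assert math.isclose(dist_parallel_lines((1, 0, -1), (0, 1, 1), (1, 1, 1)), math.sqrt(42) / 3)
```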
37. Since P1 and P2 are parallel, we choose an arbitrary point on P1 , say Q = (0, 0, 0), and compute
d(Q, P2 ). Since the equation of P2 is 2x + y − 2z = 5, we have a = 2, b = 1, c = −2, and d = 5; since
Q = (0, 0, 0), we have x0 = y0 = z0 = 0. Thus the distance is
d(P1, P2) = d(Q, P2) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²) = |2 · 0 + 1 · 0 − 2 · 0 − 5| / √(2² + 1² + 2²) = 5/3.
38. Since P1 and P2 are parallel, we choose an arbitrary point on P1 , say Q = (1, 0, 0), and compute
d(Q, P2 ). Since the equation of P2 is x + y + z = 3, we have a = b = c = 1 and d = 3; since
Q = (1, 0, 0), we have x0 = 1, y0 = 0, and z0 = 0. Thus the distance is
d(P1, P2) = d(Q, P2) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²) = |1 · 1 + 1 · 0 + 1 · 0 − 3| / √(1² + 1² + 1²) = 2/√3 = 2√3/3.
39. We wish to show that d(B, ℓ) = |ax₀ + by₀ − c| / √(a² + b²), where n = [a, b], n · a = c, and B = (x₀, y₀). If v = AB = b − a, then

n · v = n · (b − a) = n · b − n · a = [a, b] · [x₀, y₀] − c = ax₀ + by₀ − c.

Then from Figure 1.65, we see that

d(B, ℓ) = ‖proj_n v‖ = ‖(n · v / n · n) n‖ = |n · v| / ‖n‖ = |ax₀ + by₀ − c| / √(a² + b²).
 
40. We wish to show that d(B, P) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²), where n = [a, b, c], n · a = d, and B = (x₀, y₀, z₀). If v = AB = b − a, then

n · v = n · (b − a) = n · b − n · a = [a, b, c] · [x₀, y₀, z₀] − d = ax₀ + by₀ + cz₀ − d.

Then from Figure 1.65, we see that

d(B, P) = ‖proj_n v‖ = ‖(n · v / n · n) n‖ = |n · v| / ‖n‖ = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²).
41. Choose B = (x₀, y₀) on ℓ1; since ℓ1 and ℓ2 are parallel, the distance between them is d(B, ℓ2). Then since B lies on ℓ1, we have n · b = [a, b] · [x₀, y₀] = ax₀ + by₀ = c₁. Choose A on ℓ2, so that n · a = c₂. Set v = b − a. Then using the formula in Exercise 39, the distance is

d(ℓ1, ℓ2) = d(B, ℓ2) = |n · v| / ‖n‖ = |n · (b − a)| / ‖n‖ = |n · b − n · a| / ‖n‖ = |c₁ − c₂| / ‖n‖.
42. Choose B = (x₀, y₀, z₀) on P1; since P1 and P2 are parallel, the distance between them is d(B, P2). Then since B lies on P1, we have n · b = [a, b, c] · [x₀, y₀, z₀] = ax₀ + by₀ + cz₀ = d₁. Choose A on P2, so that n · a = d₂. Set v = b − a. Then using the formula in Exercise 40, the distance is

d(P1, P2) = d(B, P2) = |n · v| / ‖n‖ = |n · (b − a)| / ‖n‖ = |n · b − n · a| / ‖n‖ = |d₁ − d₂| / ‖n‖.
 
 
43. Since P1 has normal vector n₁ = [1, 1, 1] and P2 has normal vector n₂ = [2, 1, −2], the angle θ between the normal vectors satisfies

cos θ = n₁ · n₂ / (‖n₁‖ ‖n₂‖) = (1 · 2 + 1 · 1 + 1 · (−2)) / (√(1² + 1² + 1²) √(2² + 1² + (−2)²)) = 1/(3√3).

Thus θ = cos⁻¹(1/(3√3)) ≈ 78.9°.


 
44. Since P1 has normal vector n₁ = [3, −1, 2] and P2 has normal vector n₂ = [1, 4, −1], the angle θ between the normal vectors satisfies

cos θ = n₁ · n₂ / (‖n₁‖ ‖n₂‖) = (3 · 1 − 1 · 4 + 2 · (−1)) / (√(3² + (−1)² + 2²) √(1² + 4² + (−1)²)) = −3/(√14 √18) = −1/√28.

This is an obtuse angle, so the acute angle is

π − cos⁻¹(−1/√28) ≈ 79.1°.
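Exercises 43 and 44 both compute the angle between two planes from their normals; taking the absolute value of the dot product (so that an obtuse angle is automatically replaced by its acute supplement) gives a compact sketch (function names my own):

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def angle_between_planes(n1, n2):
    """Acute angle, in degrees, between planes with normal vectors n1 and n2."""
    cosang = abs(sum(a * b for a, b in zip(n1, n2))) / (norm(n1) * norm(n2))
    return math.degrees(math.acos(cosang))

assert round(angle_between_planes([1, 1, 1], [2, 1, -2]), 1) == 78.9   # Exercise 43
assert round(angle_between_planes([3, -1, 2], [1, 4, -1]), 1) == 79.1  # Exercise 44
```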
45. First, to see that P and ℓ intersect, substitute the parametric equations for ℓ into the equation for P, giving

x + y + 2z = (2 + t) + (1 − 2t) + 2(3 + t) = 9 + t = 0,

so that t = −9 represents the point of intersection, which is thus (2 + (−9), 1 − 2(−9), 3 + (−9)) = (−7, 19, −6). Now, the normal to P is n = [1, 1, 2], and a direction vector for ℓ is d = [1, −2, 1]. So if θ is the angle between n and d, then θ satisfies

cos θ = n · d / (‖n‖ ‖d‖) = (1 · 1 + 1 · (−2) + 2 · 1) / (√(1² + 1² + 2²) √(1² + (−2)² + 1²)) = 1/6,

so that θ = cos⁻¹(1/6) ≈ 80.4°. Thus the angle between the line and the plane is 90° − 80.4° ≈ 9.6°.
46. First, to see that P and ℓ intersect, substitute the parametric equations for ℓ into the equation for P, giving

4x − y − z = 4 · t − (1 + 2t) − (2 + 3t) = −t − 3 = 6,

so that t = −9 represents the point of intersection, which is thus (−9, 1 + 2 · (−9), 2 + 3 · (−9)) = (−9, −17, −25). Now, the normal to P is n = [4, −1, −1], and a direction vector for ℓ is d = [1, 2, 3]. So if θ is the angle between n and d, then θ satisfies

cos θ = n · d / (‖n‖ ‖d‖) = (4 · 1 − 1 · 2 − 1 · 3) / (√(4² + 1² + 1²) √(1² + 2² + 3²)) = −1/(√18 √14).

This corresponds to an obtuse angle, so the acute angle between the two is

π − cos⁻¹(−1/(√18 √14)) ≈ 86.4°.

Thus the angle between the line and the plane is 90° − 86.4° ≈ 3.6°.
47. We have p = v − c n, so that c n = v − p. Take the dot product of both sides with n, giving

(c n) · n = (v − p) · n ⇒ c(n · n) = v · n − p · n ⇒ c(n · n) = v · n (since p and n are orthogonal) ⇒ c = n · v / (n · n).

Note that another interpretation of the figure is that c n = proj_n v = (n · v / n · n) n, which also implies that c = n · v / (n · n). Now substitute this value of c into the original equation, giving

p = v − c n = v − (n · v / n · n) n.
 
48. (a) A normal vector to the plane x + y + z = 0 is n = [1, 1, 1]. Then

n · v = [1, 1, 1] · [1, 0, −2] = 1 · 1 + 1 · 0 + 1 · (−2) = −1,
n · n = [1, 1, 1] · [1, 1, 1] = 1 · 1 + 1 · 1 + 1 · 1 = 3,

so that c = −1/3. Then

p = v − (n · v / n · n) n = [1, 0, −2] + (1/3)[1, 1, 1] = [4/3, 1/3, −5/3].

(b) A normal vector to the plane 3x − y + z = 0 is n = [3, −1, 1]. Then

n · v = [3, −1, 1] · [1, 0, −2] = 3 · 1 − 1 · 0 + 1 · (−2) = 1,
n · n = [3, −1, 1] · [3, −1, 1] = 3 · 3 − 1 · (−1) + 1 · 1 = 11,

so that c = 1/11. Then

p = v − (n · v / n · n) n = [1, 0, −2] − (1/11)[3, −1, 1] = [8/11, 1/11, −23/11].

(c) A normal vector to the plane x − 2z = 0 is n = [1, 0, −2]. Then

n · v = [1, 0, −2] · [1, 0, −2] = 1 · 1 + 0 · 0 − 2 · (−2) = 5,
n · n = [1, 0, −2] · [1, 0, −2] = 1 · 1 + 0 · 0 − 2 · (−2) = 5,

so that c = 1. Then

p = v − (n · v / n · n) n = [1, 0, −2] − [1, 0, −2] = 0.

Note that the projection is 0 because the vector is normal to the plane, so its projection onto the plane is a single point.

(d) A normal vector to the plane 2x − 3y + z = 0 is n = [2, −3, 1]. Then

n · v = [2, −3, 1] · [1, 0, −2] = 2 · 1 − 3 · 0 + 1 · (−2) = 0,
n · n = [2, −3, 1] · [2, −3, 1] = 2 · 2 − 3 · (−3) + 1 · 1 = 14,

so that c = 0. Thus p = v = [1, 0, −2]. Note that the projection is the vector itself because the vector is parallel to the plane, so it is orthogonal to the normal vector.
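The formula from Exercise 47 that drives all four parts of Exercise 48 can be checked exactly with rational arithmetic (function name my own):

```python
from fractions import Fraction

def project_onto_plane(v, n):
    """Exercise 47's formula p = v - (n.v / n.n) n, for a plane through the origin with normal n."""
    c = Fraction(sum(ni * vi for ni, vi in zip(n, v)),
                 sum(ni * ni for ni in n))
    return [vi - c * ni for vi, ni in zip(v, n)]

v = [1, 0, -2]
assert project_onto_plane(v, [1, 1, 1]) == [Fraction(4, 3), Fraction(1, 3), Fraction(-5, 3)]  # (a)
assert project_onto_plane(v, [1, 0, -2]) == [0, 0, 0]                                         # (c)
assert project_onto_plane(v, [2, -3, 1]) == v                                                 # (d)
```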

Exploration: The Cross Product
  

 
1. (a) u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = [1 · 2 − 1 · (−1), 1 · 3 − 0 · 2, 0 · (−1) − 1 · 3] = [3, 3, −3].

(b) u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = [−1 · 1 − 2 · 1, 2 · 0 − 3 · 1, 3 · 1 − (−1) · 0] = [−3, −3, 3].

(c) u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = [2 · (−6) − 3 · (−4), 3 · 2 − (−1) · (−6), −1 · (−4) − 2 · 2] = [0, 0, 0].

(d) u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = [1 · 3 − 1 · 2, 1 · 1 − 1 · 3, 1 · 2 − 1 · 1] = [1, −2, 1].
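The componentwise definition used above translates directly into code; a minimal sketch (assuming, as reconstructed above, that part (a) uses u = [0, 1, 1], v = [3, −1, 2] and part (d) uses u = [1, 1, 1], v = [1, 2, 3]):

```python
def cross(u, v):
    """Cross product of two vectors in R^3, componentwise as in Exercise 1."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1]

assert cross([0, 1, 1], [3, -1, 2]) == [3, 3, -3]    # part (a)
assert cross([1, 1, 1], [1, 2, 3]) == [1, -2, 1]     # part (d)
```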
2. We have

e₁ × e₂ = [1, 0, 0] × [0, 1, 0] = [0 · 0 − 0 · 1, 0 · 0 − 1 · 0, 1 · 1 − 0 · 0] = [0, 0, 1] = e₃,
e₂ × e₃ = [0, 1, 0] × [0, 0, 1] = [1 · 1 − 0 · 0, 0 · 0 − 0 · 1, 0 · 0 − 1 · 0] = [1, 0, 0] = e₁,
e₃ × e₁ = [0, 0, 1] × [1, 0, 0] = [0 · 0 − 1 · 0, 1 · 1 − 0 · 0, 0 · 0 − 0 · 1] = [0, 1, 0] = e₂.
3. Two vectors are orthogonal if their dot product equals zero. But

(u × v) · u = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] · [u₁, u₂, u₃]
= (u₂v₃ − u₃v₂)u₁ + (u₃v₁ − u₁v₃)u₂ + (u₁v₂ − u₂v₁)u₃
= (u₂v₃u₁ − u₁v₃u₂) + (u₃v₁u₂ − u₂v₁u₃) + (u₁v₂u₃ − u₃v₂u₁) = 0,

(u × v) · v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] · [v₁, v₂, v₃]
= (u₂v₃ − u₃v₂)v₁ + (u₃v₁ − u₁v₃)v₂ + (u₁v₂ − u₂v₁)v₃
= (u₂v₃v₁ − u₂v₁v₃) + (u₃v₁v₂ − u₃v₂v₁) + (u₁v₂v₃ − u₁v₃v₂) = 0.
4. (a) By Exercise 1, a vector normal to the plane is

n = u × v = [0, 1, 1] × [3, −1, 2] = [1 · 2 − 1 · (−1), 1 · 3 − 0 · 2, 0 · (−1) − 1 · 3] = [3, 3, −3].

So the normal form for the equation of this plane is n · x = n · p, or

[3, 3, −3] · [x, y, z] = [3, 3, −3] · [1, 0, −2] = 9.

This simplifies to 3x + 3y − 3z = 9, or x + y − z = 3.

(b) Two vectors in the plane are u = PQ = [2, 1, 1] and v = PR = [1, 3, −2]. So by Exercise 1, a vector normal to the plane is

n = u × v = [1 · (−2) − 1 · 3, 1 · 1 − 2 · (−2), 2 · 3 − 1 · 1] = [−5, 5, 5].

So the normal form for the equation of this plane is n · x = n · p, or

[−5, 5, 5] · [x, y, z] = [−5, 5, 5] · [0, −1, 1] = 0.

This simplifies to −5x + 5y + 5z = 0, or x − y − z = 0.




5. (a) v × u = [v₂u₃ − v₃u₂, v₃u₁ − v₁u₃, v₁u₂ − v₂u₁] = −[u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = −(u × v).

(b) u × 0 = [u₁, u₂, u₃] × [0, 0, 0] = [u₂ · 0 − u₃ · 0, u₃ · 0 − u₁ · 0, u₁ · 0 − u₂ · 0] = [0, 0, 0] = 0.

(c) u × u = [u₂u₃ − u₃u₂, u₃u₁ − u₁u₃, u₁u₂ − u₂u₁] = [0, 0, 0] = 0.

(d) u × kv = [u₂kv₃ − u₃kv₂, u₃kv₁ − u₁kv₃, u₁kv₂ − u₂kv₁] = k[u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = k(u × v).

(e) u × ku = k(u × u) = k0 = 0 by parts (d) and (c).

(f) Compute the cross product:

u × (v + w) = [u₂(v₃ + w₃) − u₃(v₂ + w₂), u₃(v₁ + w₁) − u₁(v₃ + w₃), u₁(v₂ + w₂) − u₂(v₁ + w₁)]
= [(u₂v₃ − u₃v₂) + (u₂w₃ − u₃w₂), (u₃v₁ − u₁v₃) + (u₃w₁ − u₁w₃), (u₁v₂ − u₂v₁) + (u₁w₂ − u₂w₁)]
= u × v + u × w.
6. In each case, simply compute:

(a)

u · (v × w) = [u₁, u₂, u₃] · [v₂w₃ − v₃w₂, v₃w₁ − v₁w₃, v₁w₂ − v₂w₁]
= u₁v₂w₃ − u₁v₃w₂ + u₂v₃w₁ − u₂v₁w₃ + u₃v₁w₂ − u₃v₂w₁
= (u₂v₃ − u₃v₂)w₁ + (u₃v₁ − u₁v₃)w₂ + (u₁v₂ − u₂v₁)w₃
= (u × v) · w.

(b)

u × (v × w) = [u₁, u₂, u₃] × [v₂w₃ − v₃w₂, v₃w₁ − v₁w₃, v₁w₂ − v₂w₁]
= [u₂(v₁w₂ − v₂w₁) − u₃(v₃w₁ − v₁w₃), u₃(v₂w₃ − v₃w₂) − u₁(v₁w₂ − v₂w₁), u₁(v₃w₁ − v₁w₃) − u₂(v₂w₃ − v₃w₂)]
= [(u₁w₁ + u₂w₂ + u₃w₃)v₁ − (u₁v₁ + u₂v₂ + u₃v₃)w₁, (u₁w₁ + u₂w₂ + u₃w₃)v₂ − (u₁v₁ + u₂v₂ + u₃v₃)w₂, (u₁w₁ + u₂w₂ + u₃w₃)v₃ − (u₁v₁ + u₂v₂ + u₃v₃)w₃]
= (u₁w₁ + u₂w₂ + u₃w₃)[v₁, v₂, v₃] − (u₁v₁ + u₂v₂ + u₃v₃)[w₁, w₂, w₃]
= (u · w)v − (u · v)w.

(c)

‖u × v‖² = (u₂v₃ − u₃v₂)² + (u₃v₁ − u₁v₃)² + (u₁v₂ − u₂v₁)²
= (u₁² + u₂² + u₃²)(v₁² + v₂² + v₃²) − (u₁v₁ + u₂v₂ + u₃v₃)²
= ‖u‖² ‖v‖² − (u · v)².
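Part (c) is Lagrange's identity; since both sides are polynomials in the six components, a quick randomized integer check (a sketch, not a proof) is reassuring:

```python
import random

def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

random.seed(0)
for _ in range(100):
    u = [random.randint(-5, 5) for _ in range(3)]
    v = [random.randint(-5, 5) for _ in range(3)]
    # Lagrange's identity: ||u x v||^2 = ||u||^2 ||v||^2 - (u . v)^2, exactly, in integers
    assert dot(cross(u, v), cross(u, v)) == dot(u, u) * dot(v, v) - dot(u, v) ** 2
```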
7. For problem 2, use the computation in the solution above to show that e₁ × e₂ = e₃. Then

e₂ × e₃ = e₂ × (e₁ × e₂) = (e₂ · e₂)e₁ − (e₂ · e₁)e₂   by 6(b)
= 1 · e₁ − 0 · e₂   since eᵢ · eⱼ = 0 if i ≠ j and eᵢ · eᵢ = 1
= e₁.

Similarly,

e₃ × e₁ = e₃ × (e₂ × e₃) = (e₃ · e₃)e₂ − (e₃ · e₂)e₃ = 1 · e₂ − 0 · e₃ = e₂.
For problem 3, we have u · (u × v) = (u × u) · v by 6(a), and then since u × u = 0 by Exercise 5(c), this
reduces to 0 · v = 0. Thus u is orthogonal to u × v. Similarly, v · (u × v) = v · (−v × u) = −v · (v × u)
by Exercise 5(a), and then as before, −v · (v × u) = −(v × v) · u = −0 · u = 0, so that v is also
orthogonal to u × v.
8. (a) We have

‖u × v‖² = ‖u‖² ‖v‖² − (u · v)²
= ‖u‖² ‖v‖² − ‖u‖² ‖v‖² cos² θ
= ‖u‖² ‖v‖² (1 − cos² θ)
= ‖u‖² ‖v‖² sin² θ.

Since the angle between u and v is always between 0 and π, it always has a nonnegative sine, so taking square roots gives ‖u × v‖ = ‖u‖ ‖v‖ sin θ.
(b) Recall that the area of a triangle is A = 21 (base)(height). If the angle between u and v is θ, then
the length of the perpendicular from the head of v to the line determined by u is an altitude of
the triangle; the corresponding base is u. Thus the area is
A = (1/2)‖u‖ · (‖v‖ sin θ) = (1/2)‖u‖ ‖v‖ sin θ = (1/2)‖u × v‖.
 
 
(c) Let u = AB = [1, −1, −1] and v = AC = [4, −3, 2]. Then from part (b), we see that the area is

A = (1/2)‖u × v‖ = (1/2)‖[1, −1, −1] × [4, −3, 2]‖ = (1/2)‖[−5, −6, 1]‖ = (1/2)√62.
1.4 Applications
1. Use the method of Example 1.34. The magnitude of the resultant force r is

‖r‖ = √(‖f₁‖² + ‖f₂‖²) = √(12² + 5²) = 13 N,

while the angle between r and east (the direction of f₂) is

θ = tan⁻¹(12/5) ≈ 67.4°.

Note that the resultant is closer to north than east; the larger force to the north pulls the object more strongly to the north.
2. Use the method of Example 1.34. The magnitude of the resultant force r is

‖r‖ = √(‖f₁‖² + ‖f₂‖²) = √(15² + 20²) = 25 N,

while the angle between r and west (the direction of f₁) is

θ = tan⁻¹(20/15) ≈ 53.1°.

Note that the resultant is closer to south than west; the larger force to the south pulls the object more strongly to the south.
3. Use the method of Example 1.34. If we let f₁ = [8, 0], then f₂ = [8 cos 60°, 8 sin 60°] = [4, 4√3]. So the resultant force is

r = f₁ + f₂ = [8, 0] + [4, 4√3] = [12, 4√3].

The magnitude of r is ‖r‖ = √(12² + (4√3)²) = √192 = 8√3 N, and the angle formed by r and f₁ is

θ = tan⁻¹(4√3/12) = tan⁻¹(1/√3) = 30°.

Note that the resultant also forms a 30° angle with f₂; since the magnitudes of the two forces are the same, the resultant points equally between them.
4. Use the method of Example 1.34. If we let f₁ = [4, 0], then f₂ = [6 cos 135°, 6 sin 135°] = [−3√2, 3√2]. So the resultant force is

r = f₁ + f₂ = [4, 0] + [−3√2, 3√2] = [4 − 3√2, 3√2].

The magnitude of r is

‖r‖ = √((4 − 3√2)² + (3√2)²) = √(52 − 24√2) ≈ 4.24 N,

and the angle formed by r and f₁ is

θ = tan⁻¹(3√2/(4 − 3√2)) ≈ 93.3°.
5. Use the method of Example 1.34. If we let f₁ = [2, 0], then f₂ = [−6, 0], and f₃ = [4 cos 60°, 4 sin 60°] = [2, 2√3]. So the resultant force is

r = f₁ + f₂ + f₃ = [2, 0] + [−6, 0] + [2, 2√3] = [−2, 2√3].

The magnitude of r is

‖r‖ = √((−2)² + (2√3)²) = √16 = 4 N,

and the angle formed by r and f₁ is

θ = tan⁻¹(2√3/(−2)) = tan⁻¹(−√3) = 120°.

(Note that many CAS's will return −60° for tan⁻¹(−√3); by convention we require an angle between 0° and 180°, so we add 180° to that answer, since tan θ = tan(180° + θ).)
6. Use the method of Example 1.34. If we let f₁ = [10, 0], then f₂ = [0, 13], f₃ = [−5, 0], and f₄ = [0, −8]. So the resultant force is

r = f₁ + f₂ + f₃ + f₄ = [10, 0] + [0, 13] + [−5, 0] + [0, −8] = [5, 5].

The magnitude of r is

‖r‖ = √(5² + 5²) = √50 = 5√2 N,

and the angle formed by r and f₁ is

θ = tan⁻¹(5/5) = tan⁻¹ 1 = 45°.
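Resultant-force problems like Exercises 1 through 6 reduce to summing components; a small sketch using `atan2`, which picks the correct quadrant automatically (avoiding the adjustment noted in Exercise 5):

```python
import math

forces = [(10, 0), (0, 13), (-5, 0), (0, -8)]   # Exercise 6, in newtons
rx = sum(f[0] for f in forces)
ry = sum(f[1] for f in forces)
magnitude = math.hypot(rx, ry)
angle = math.degrees(math.atan2(ry, rx))        # angle measured from the direction of f1

assert (rx, ry) == (5, 5)
assert math.isclose(magnitude, 5 * math.sqrt(2))
assert math.isclose(angle, 45.0)
```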
7. Following Example 1.35, we draw a diagram in which f is resolved into a horizontal component f_x and a vertical component f_y, with f making a 60° angle with f_y. Then

‖f_y‖ = ‖f‖ cos 60°,    ‖f_x‖ = ‖f‖ sin 60°.

Thus ‖f_x‖ = 10 · (√3/2) = 5√3 ≈ 8.66 N and ‖f_y‖ = 10 · (1/2) = 5 N. Finally, this gives

f_x = [5√3, 0],    f_y = [0, 5].
8. The force that must be applied parallel to the ramp is the force needed to counteract the component
of the force due to gravity that acts parallel to the ramp. Let f be the force due to gravity; this
acts downwards. We can decompose it into components fp , acting parallel to the ramp, and fr , acting
orthogonally to the ramp. Since the ramp makes an angle of 30◦ with the horizontal, it makes an angle
of 60◦ with f . Thus
‖f_p‖ = ‖f‖ cos 60° = (1/2)‖f‖ = 5 N.
9. The vertical force is the vertical component of the force vector f . Since this acts at an angle of 45◦ to
the horizontal, the magnitude of the component of this force in the vertical direction is
‖f_y‖ = ‖f‖ cos 45° = 1500 · (√2/2) = 750√2 ≈ 1060.66 N.
10. The vertical force is the vertical component of the force vector f , which has magnitude 100 N and acts
at an angle of 45◦ to the horizontal. Since it acts at an angle of 45◦ to the horizontal, the magnitude
of the component of this force in the vertical direction is
‖f_y‖ = ‖f‖ cos 45° = 100 · (√2/2) = 50√2 ≈ 70.7 N.
Note that the mass of the lawnmower itself is irrelevant; we are not considering the gravitational force
in this exercise, only the force imparted by the person mowing the lawn.
1.4. APPLICATIONS
47
11. Use the method of Example 1.36. Let t be the force vector on the cable; then the tension on the cable
is ktk. The force imparted by the hanging sign is the y component of t; call it ty . Since t and ty form
an angle of 60◦ , we have
‖t_y‖ = ‖t‖ cos 60° = (1/2)‖t‖.
The gravitational force on the sign is its mass times the acceleration due to gravity, which is 50·9.8 = 490
N. Thus ktk = 2 kty k = 2 · 490 = 980 N.
12. Use the method of Example 1.36. Let w be the force vector created by the sign. Then kwk = 1·9.8 = 9.8
N, since the sign weighs 1 kg. By symmetry, each string carries half the weight of the sign, since the
angles each string forms with the vertical are the same. Let s be the tension in the left-hand string.
Since the angle between s and w is 45◦ , we have
(1/2)‖w‖ = ‖s‖ cos 45° = (√2/2)‖s‖.

Thus ‖s‖ = (1/√2)‖w‖ = 9.8/√2 = 4.9√2 ≈ 6.9 N.
13. A diagram of the situation shows wires of lengths 15 and 20 running from the 15 kg painting to two points on the ceiling that are 25 apart. The triangle is a right triangle, since 15² + 20² = 625 = 25². Thus if θ₁ is the angle that the left-hand wire makes with the ceiling, then sin θ₁ = 20/25 = 4/5; likewise, if θ₂ is the angle that the right-hand wire makes with the ceiling, then sin θ₂ = 15/25 = 3/5. Let f₁ be the force on the left-hand wire and f₂ the force on the right-hand wire, and let r be the force due to gravity acting on the painting. Then following Example 1.36, we have, using the Law of Sines,

‖f₁‖/sin θ₁ = ‖f₂‖/sin θ₂ = ‖r‖/sin 90° = (15 · 9.8)/1 = 147 N.

Then

‖f₁‖ = 147 sin θ₁ = 147 · (4/5) = 117.6 N,    ‖f₂‖ = 147 sin θ₂ = 147 · (3/5) = 88.2 N.
14. Let r be the force due to gravity acting on the painting, f1 be the tension on the wire opposite the
30◦ angle, and f2 be the tension on the wire opposite the 45◦ angle (don’t these people know to hang
paintings straight?). Then krk = 20 · 9.8 = 196 N. Note that the remaining angle in the triangle is
105◦ . Then using the method of Example 1.36, we have, using the Law of Sines,
‖f₁‖/sin 30° = ‖f₂‖/sin 45° = ‖r‖/sin 105°.

Thus

‖f₁‖ = ‖r‖ sin 30°/sin 105° ≈ (196 · (1/2))/0.9659 ≈ 101.46 N,
‖f₂‖ = ‖r‖ sin 45°/sin 105° ≈ (196 · (√2/2))/0.9659 ≈ 143.48 N.
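The Law of Sines computation in Exercise 14 can be reproduced directly (variable names my own):

```python
import math

m, g = 20, 9.8                       # Exercise 14: 20 kg sign
r = m * g                            # weight, 196 N
a1, a2 = 30, 45                      # angles opposite each wire's tension
a3 = 180 - a1 - a2                   # remaining angle of the triangle, 105 degrees
f1 = r * math.sin(math.radians(a1)) / math.sin(math.radians(a3))
f2 = r * math.sin(math.radians(a2)) / math.sin(math.radians(a3))

assert round(f1, 2) == 101.46
assert round(f2, 2) == 143.48
```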
Chapter Review
1. (a) True. This follows from the properties of Rⁿ listed in Theorem 1.1 in Section 1.1:

u = u + 0   Zero Property, Property (c)
= u + (w + (−w))   Additive Inverse Property, Property (d)
= (u + w) + (−w)   Associativity, Property (b)
= (v + w) + (−w)   by the given condition u + w = v + w
= v + (w + (−w))   Associativity, Property (b)
= v + 0   Additive Inverse Property, Property (d)
= v   Zero Property, Property (c).
(b) False. See Exercise 60 in Section 1.2. For one counterexample, let w = 0 and u and v be arbitrary
vectors. Then u · w = v · w = 0 since the dot product of any vector with the zero vector is zero.
But certainly u and v need not be equal. As a second counterexample, suppose that both u and
v are orthogonal to w; clearly they need not be equal, but u · w = v · w = 0.
(c) False. For example, let u be any nonzero vector, and v any vector orthogonal to u. Let w = u.
Then u is orthogonal to v and v is orthogonal to w = u, but certainly u and w = u are not
orthogonal.
(d) False. When a line is parallel to a plane, then d · n = 0; that is, d is orthogonal to the normal
vector of the plane.
(e) True. Since a normal vector n for P and the line ` are both perpendicular to P, they must be
parallel. See Figure 1.62.
(f ) True. See the remarks following Example 1.31 in Section 1.3.
(g) False. They can be skew lines, which are nonintersecting lines with nonparallel direction vectors. For example, let ℓ1 be the line x = t[1, 0, 0] (the x-axis), and let ℓ2 be the line x = [0, 0, 1] + t[0, 1, 0] (the line through (0, 0, 1) that is parallel to the y-axis). These two lines do not intersect, yet they are not parallel.
   
(h) False. For example, [1, 0, 1] · [1, 0, 1] = 1 + 0 + 1 = 0 in Z₂, even though [1, 0, 1] ≠ 0.
(i) True. If ab = 0 in Z₅, then ab must be a multiple of 5. But 5 is prime, so either a or b must be divisible by 5, so that either a = 0 or b = 0 in Z₅.
(j) False. For example, 2 · 3 = 6 = 0 in Z6 , but neither 2 nor 3 is zero in Z6 .
2. Let w = [10, −10]. Then the head of the resulting vector is at

4u + v + w = 4[−1, 5] + [3, 2] + [10, −10] = [9, 12].

So the coordinates of the point at the head of 4u + v are (9, 12).
3. Since 2x + u = 3(x − v) = 3x − 3v, simplifying gives u + 3v = x. Thus

x = u + 3v = [−1, 5] + 3[3, 2] = [8, 11].
4. Since ABCD is a square, OC = −OA, so that

BC = OC − OB = −OA − OB = −a − b.
5. We have

u · v = [−1, 1, 2] · [2, 1, −1] = −1 · 2 + 1 · 1 + 2 · (−1) = −3,
‖u‖ = √((−1)² + 1² + 2²) = √6,
‖v‖ = √(2² + 1² + (−1)²) = √6.

Then if θ is the angle between u and v, it satisfies

cos θ = u · v / (‖u‖ ‖v‖) = −3/(√6 √6) = −1/2.

Thus θ = cos⁻¹(−1/2) = 120°.
6. We have

proj_u v = (u · v / u · u) u = ((1 · 1 − 2 · 1 + 2 · 1) / (1 · 1 + (−2) · (−2) + 2 · 2)) [1, −2, 2] = (1/9)[1, −2, 2] = [1/9, −2/9, 2/9].
7. We are looking for a vector in the xy-plane; any such vector has a z-coordinate of 0. So the vector we are looking for is u = [a, b, 0] for some a, b. Then we want

u · [1, 2, 3] = [a, b, 0] · [1, 2, 3] = a + 2b = 0,

so that a = −2b. So for example choose b = 1; then a = −2, and the vector u = [−2, 1, 0] is orthogonal to [1, 2, 3]. Finally, to get a unit vector, we must normalize u. We have ‖u‖ = √((−2)² + 1² + 0²) = √5, giving

w = (1/‖u‖) u = [−2/√5, 1/√5, 0],

which is orthogonal to the given vector. We could have chosen any value for b, but we would have gotten either w or −w for the normalized vector.
8. The vector form of the given line is

x = [2, 3, −1] + t[−1, 2, 1],

so the line has a direction vector d = [−1, 2, 1]. Since the plane is perpendicular to this line, d is a normal vector n to the plane. The plane passes through P = (1, 1, 1), so the equation n · x = n · p becomes

[−1, 2, 1] · [x, y, z] = [−1, 2, 1] · [1, 1, 1] = 2.

Expanding gives the general equation −x + 2y + z = 2.


9. Parallel planes have parallel normals, so the vector n = [2, 3, −1], which is a normal to the given plane, is also a normal to the desired plane. The plane we want must pass through P = (3, 2, 5), so the equation n · x = n · p becomes

[2, 3, −1] · [x, y, z] = [2, 3, −1] · [3, 2, 5] = 7.

Expanding gives the general equation 2x + 3y − z = 7.
10. The three points give us two vectors that lie in the plane:

d₁ = AB = [1, 0, 1] − [1, 1, 0] = [0, −1, 1],
d₂ = BC = [0, 1, 2] − [1, 0, 1] = [−1, 1, 1].

A normal vector n = [a, b, c] must be orthogonal to both of these vectors, so

n · d₁ = [a, b, c] · [0, −1, 1] = −b + c = 0 ⇒ b = c,
n · d₂ = [a, b, c] · [−1, 1, 1] = −a + b + c = 0 ⇒ a = b + c.

Since b = c and a = b + c, we get a = 2c, so n = [2c, c, c] for any value of c. Choosing c = 1 gives the vector n = [2, 1, 1]. Let P be the point A = (1, 1, 0) (we could equally well choose B or C), and compute n · x = n · p:

[2, 1, 1] · [x, y, z] = [2, 1, 1] · [1, 1, 0] = 3.

Expanding gives the general equation 2x + y + z = 3.
11. Let

u = AB = [1 − 1, 0 − 1, 1 − 0] = [0, −1, 1],
v = AC = [0 − 1, 1 − 1, 2 − 0] = [−1, 0, 2].

Using the first method from Figure 1.39 (Exercises 46 and 47) in Section 1.2, we have

proj_u v = (u · v / u · u) u = ((0 · (−1) − 1 · 0 + 1 · 2) / (0² + (−1)² + 1²)) [0, −1, 1] = (2/2)[0, −1, 1] = [0, −1, 1],
v − proj_u v = [−1, 0, 2] − [0, −1, 1] = [−1, 1, 1].

Then the area of the triangle is

(1/2)‖u‖ ‖v − proj_u v‖ = (1/2)√(0² + (−1)² + 1²) √((−1)² + 1² + 1²) = (1/2)√2 √3 = (1/2)√6.
12. From the first example in Exploration: Vectors and Geometry, we have a formula for the midpoint:

m = (1/2)(a + b) = (1/2)([5, 1, −2] + [3, −7, 0]) = (1/2)[8, −6, −2] = [4, −3, −1].
13. Suppose that ‖u‖ = 2 and ‖v‖ = 3. Then from the Cauchy-Schwarz inequality, we have

|u · v| ≤ ‖u‖ ‖v‖ = 2 · 3 = 6.

Thus −6 ≤ u · v ≤ 6, so the dot product cannot equal −7.
14. We will apply the formula

d(A, P) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²),

where A = (x₀, y₀, z₀) is a point, and a general equation for the plane is ax + by + cz = d. Here, we have a = 2, b = 3, c = −1, and d = 0; since the point is (3, 2, 5), we have x₀ = 3, y₀ = 2, and z₀ = 5. So the distance from the point to the plane is

d(A, P) = |2 · 3 + 3 · 2 − 1 · 5 − 0| / √(2² + 3² + (−1)²) = 7/√14 = √14/2.
15. As in Example 1.32 in Section 1.3, we have B = (3, 2, 5), and the line ℓ has vector form

x = [0, 1, 2] + t[1, 1, 1],

so that A = (0, 1, 2) lies on the line, and a direction vector for the line is d = [1, 1, 1]. Then

v = AB = [3, 2, 5] − [0, 1, 2] = [3, 1, 3],

and then

proj_d v = (d · v / d · d) d = ((1 · 3 + 1 · 1 + 1 · 3) / (1² + 1² + 1²)) [1, 1, 1] = (7/3)[1, 1, 1].

Then the vector from B that is perpendicular to the line is the vector

v − proj_d v = [3, 1, 3] − [7/3, 7/3, 7/3] = [2/3, −4/3, 2/3].

So the distance from B to ℓ is

‖v − proj_d v‖ = √((2/3)² + (−4/3)² + (2/3)²) = (1/3)√(4 + 16 + 4) = (2/3)√6.
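The steps of Exercise 15 can be replayed numerically (standard library only):

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Review 15: distance from B = (3, 2, 5) to the line x = (0, 1, 2) + t(1, 1, 1)
b, a, d = (3, 2, 5), (0, 1, 2), (1, 1, 1)
v = [bi - ai for bi, ai in zip(b, a)]
t = sum(di * vi for di, vi in zip(d, v)) / sum(di * di for di in d)
perp = [vi - t * di for vi, di in zip(v, d)]

assert math.isclose(norm(perp), 2 * math.sqrt(6) / 3)
```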
16. We have

3 − (2 + 4)³(4 + 3)² = 3 − 1³ · 2² = 3 − 1 · 4 = −1 = 4

in Z₅. Note that 2 + 4 = 6 = 1 in Z₅, 4 + 3 = 7 = 2 in Z₅, and −1 = 4 in Z₅.
17. 3(x + 2) = 5 implies that 5 · 3(x + 2) = 5 · 5 = 25 = 4. But 5 · 3 = 15 = 1 in Z7 , so this is the same as
x + 2 = 4, so that x = 2. To check the answer, we have
3(2 + 2) = 3 · 4 = 12 = 5 in Z7 .
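Modular computations like Exercises 16 and 17 are convenient to check with Python's `%` operator (a brute-force search also confirms that the solution of Exercise 17 is unique):

```python
# Review 16, in Z_5: 3 - (2 + 4)^3 (4 + 3)^2
assert (3 - (2 + 4) ** 3 * (4 + 3) ** 2) % 5 == 4

# Review 17, in Z_7: solve 3(x + 2) = 5 by brute force
assert [x for x in range(7) if (3 * (x + 2)) % 7 == 5] == [2]
```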
18. This has no solutions. For any value of x, the left-hand side is a multiple of 3, so it cannot leave a
remainder of 5 when divided by 9 (which is also a multiple of 3).
19. Compute the dot product in Z₅⁴ (that is, of vectors of length 4 with entries in Z₅):

[2, 1, 3, 3] · [3, 4, 4, 2] = 2 · 3 + 1 · 4 + 3 · 4 + 3 · 2 = 6 + 4 + 12 + 6 = 1 + 4 + 2 + 1 = 8 = 3.
20. Suppose that
[1, 1, 1, 0] · [d1 , d2 , d3 , d4 ] = d1 + d2 + d3 = 0 in Z2 .
Then an even number (either zero or two) of d1 , d2 , and d3 must be 1 and the others must be zero. d4
is arbitrary (either 0 or 1). So the eight possible vectors are
[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 1, 0], [0, 1, 1, 1].
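The counting argument above can be confirmed by enumerating all of Z₂⁴:

```python
from itertools import product

# All vectors d in Z_2^4 with [1, 1, 1, 0] . d = d1 + d2 + d3 = 0 (mod 2)
vecs = [d for d in product([0, 1], repeat=4) if (d[0] + d[1] + d[2]) % 2 == 0]

assert len(vecs) == 8
assert (1, 1, 0, 1) in vecs
assert (1, 0, 0, 0) not in vecs
```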
Chapter 2
Systems of Linear Equations

2.1 Introduction to Systems of Linear Equations
1. This equation is linear, since each of x, y, and z appears to the first power with constant coefficients; 1, π, and ∛5 are constants.
2. This is not linear since each of x, y, and z appears to a power other than 1.
3. This is not linear since x appears to a power other than 1.
4. This is not linear since the equation contains the term xy, which is a product of two of the variables.
5. This is not linear since cos x is a function of x other than a constant times a multiple of x.
6. This is linear, since here cos 3 is a constant, and x, y, and z all appear to the first power with constant coefficients cos 3, −4, and 1, and the remaining term, √3, is also a constant.
7. Put the equation into the general form ax + by = c by adding y to both sides to get 2x + 4y = 7. There
are no restrictions on x and y since there were none in the original equation.
8. In the given equation, the denominator must not be zero, so we must have x ≠ y. With this restriction, we have
(x² − y²)/(x − y) = 1 ⟺ ((x + y)(x − y))/(x − y) = 1 ⟺ x + y = 1, x ≠ y.
The corresponding linear equation is x + y = 1, for x ≠ y.
9. Since x and y each appear in the denominator, neither one can be zero. With this restriction, we have
1/x + 1/y = 4/(xy) ⟺ y/(xy) + x/(xy) = 4/(xy) ⟺ y + x = 4 ⟺ x + y = 4.
The corresponding linear equation is x + y = 4, for x ≠ 0 and y ≠ 0.
10. Since log10 x and log10 y are defined only for x > 0 and y > 0, these are the restrictions on the variables. With these restrictions, we have
log10 x − log10 y = 2 ⟺ log10(x/y) = 2 ⟺ x/y = 10² = 100 ⟺ x = 100y ⟺ x − 100y = 0.
The corresponding linear equation is x − 100y = 0 for x > 0 and y > 0.
11. As in Example 2.2(a), we set x = t and solve for y. Setting x = t in 3x − 6y = 0 gives 3t − 6y = 0, so that 6y = 3t and thus y = (1/2)t. So the set of solutions has the parametric form [t, (1/2)t]. Note that if we had set x = 2t to start with, we would have gotten the simpler (but equivalent) parametric form [2t, t].
12. As in Example 2.2(a), we set x1 = t and solve for x2. Setting x1 = t in 2x1 + 3x2 = 5 gives 2t + 3x2 = 5, so that 3x2 = 5 − 2t and thus x2 = 5/3 − (2/3)t. So the set of solutions has the parametric form [t, 5/3 − (2/3)t]. Note that if we had set x1 = 1 + 3t to start with, we would have gotten the simpler (but equivalent) parametric form [1 + 3t, 1 − 2t].
13. As in Example 2.2(b), we set y = s and z = t and solve for x. (We choose y and z since the coefficient
of the remaining variable, x, is one, so we will not have to do any division). Then x + 2s + 3t = 4, so
that x = 4 − 2s − 3t. Thus a complete set of solutions written in parametric form is [4 − 2s − 3t, s, t].
14. As in Example 2.2(b), we set x1 = s and x2 = t and solve for x3. This substitution yields 4s + 3t + 2x3 = 1, so that 2x3 = 1 − 4s − 3t and x3 = 1/2 − 2s − (3/2)t. Thus a complete set of solutions written in parametric form is [s, t, 1/2 − 2s − (3/2)t].
15. Subtract the first equation from the second to get x = 3; substituting x = 3
back into the first equation gives y =
−3. Thus the unique intersection point
is (3, −3).
[Graph: the lines 2x + y = 3 and x + y = 0, intersecting at (3, −3).]
16. Add the second equation to twice the first equation to get 7x = 21 and thus x = 3. Substitute x = 3 back into the first equation to get y = −2. So the unique intersection point is (3, −2).
[Graph: the lines 3x + y = 7 and x − 2y = 7, intersecting at (3, −2).]
17. Add three times the second equation to
the first equation. This gives 0 = 6,
which is impossible. So this system has
no solutions — the two lines are parallel.
[Graph: the parallel lines 3x − 6y = 3 and −x + 2y = 1.]
18. Add 3/5 times the first equation to the second, giving 0 = 0. Thus the system has an infinite number of solutions: the two lines are the same line (they are coincident).
[Graph: the coincident lines 0.1x − 0.05y = 0.2 and −0.06x + 0.03y = −0.12.]
19. Starting from the second equation, we find y = 3. Substitute y = 3 into x − 2y = 1 to get x − 2 · 3 = 1,
so that x = 7. The unique solution is [x, y] = [7, 3].
20. Starting from the second equation, we find 2v = 6 so that v = 3. Substitute v = 3 into 2u − 3v = 5 to
get 2u − 3 · 3 = 5, or 2u = 14, so that u = 7. The unique solution is [u, v] = [7, 3].
21. Starting from the third equation, we have 3z = −1, so that z = −1/3. Now substitute this value into the second equation:
2y − z = 1 ⇒ 2y − (−1/3) = 1 ⇒ 2y + 1/3 = 1 ⇒ 2y = 2/3 ⇒ y = 1/3.
Finally, substitute these values for y and z into the first equation:
x − y + z = 0 ⇒ x − 1/3 + (−1/3) = 0 ⇒ x − 2/3 = 0 ⇒ x = 2/3.
So the unique solution is [x, y, z] = [2/3, 1/3, −1/3].
22. The third equation gives x3 = 0; substituting into the second equation gives −5x2 = 0, so that
x2 = 0. Substituting these values into the first equation gives x1 = 0. So the unique solution is
[x1 , x2 , x3 ] = [0, 0, 0].
23. Substitute x4 = 1 from the fourth equation into the third equation, giving x3 − 1 = 0, so that x3 = 1.
Substitute these values into the second equation, giving x2 + 1 + 1 = 0 and thus x2 = −2. Finally,
substitute into the first equation:
x1 + x2 − x3 − x4 = 1 ⇒ x1 + (−2) − 1 − 1 = 1 ⇒ x1 − 4 = 1 ⇒ x1 = 5.
The unique solution is [x1 , x2 , x3 , x4 ] = [5, −2, 1, 1].
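The back substitution used in Exercises 21–23 can be written as a short routine. A sketch (the function name is ours; the coefficients below are read off from the substitutions in the Exercise 23 solution):

```python
def back_substitute(U, b):
    """Solve Ux = b for an upper-triangular U with nonzero diagonal entries."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x

# Exercise 23: x1 + x2 - x3 - x4 = 1, x2 + x3 + x4 = 0, x3 - x4 = 0, x4 = 1
U = [[1, 1, -1, -1],
     [0, 1,  1,  1],
     [0, 0,  1, -1],
     [0, 0,  0,  1]]
x = back_substitute(U, [1, 0, 0, 1])  # [5.0, -2.0, 1.0, 1.0]
```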
24. Let z = t, so that y = 2t − 1. Substitute for y and z in the first equation, giving
x − 3y + z = 5 ⇒ x − 3(2t − 1) + t = 5 ⇒ x − 5t + 3 = 5 ⇒ x = 5t + 2.
This system has an infinite number of solutions: [x, y, z] = [5t + 2, 2t − 1, t].
25. Working forward, we start with x = 2. Substitute that into the second equation to get 2 · 2 + y = −3,
or 4 + y = −3, so that y = −7. Finally, substitute those two values into the third equation:
−3x − 4y + z = −10 ⇒ −3 · 2 − 4 · (−7) + z = −10 ⇒ 22 + z = −10 ⇒ z = −32.
The unique solution is [x, y, z] = [2, −7, −32].
26. Working forward, we start with x1 = −1. Substituting into the second equation gives −(1/2) · (−1) + x2 = 5, or 1/2 + x2 = 5. Thus x2 = 9/2. Finally, substitute those two values into the third equation:
(3/2)x1 + 2x2 + x3 = 7 ⇒ (3/2) · (−1) + 2 · (9/2) + x3 = 7 ⇒ 15/2 + x3 = 7 ⇒ x3 = −1/2.
The unique solution is [x1, x2, x3] = [−1, 9/2, −1/2].
27. As in Example 2.6, we create the augmented matrix from the coefficients:
[ 1 −1 | 0 ]
[ 2  1 | 3 ]
28. As in Example 2.6, we create the augmented matrix from the coefficients:
[  2 3 −1 | 1 ]
[  1 0  1 | 0 ]
[ −1 2 −2 | 0 ]
29. As in Example 2.6, we create the augmented matrix from the coefficients:
[  1 5 | −1 ]
[ −1 1 | −5 ]
[  2 4 |  4 ]
30. As in Example 2.6, we create the augmented matrix from the coefficients:
[  1 −2  0  1 | 2 ]
[ −1  1 −1 −3 | 1 ]
31. There are three variables, since there are three columns not counting the last one. Use x, y, and z. Then the system becomes
        y + z = 1
 x  −   y     = 1
2x  −   y + z = 1.
32. There are five variables, since there are five columns not counting the last one. Use x1, x2, x3, x4, and x5. Then the system becomes
x1 − x2       + 3x4 +  x5 = 2
x1 + x2 + 2x3 +  x4 −  x5 = 4
     x2       + 2x4 + 3x5 = 0.
33. Add the first equation to the second, giving 3x = 3, so that x = 1. Substituting into the first equation
gives 1 − y = 0, so that y = 1. The unique solution is [x, y] = [1, 1].
34. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 28) to get it into a form where we can read off the solution.

[  2 3 −1 | 1 ]
[  1 0  1 | 0 ]
[ −1 2 −2 | 0 ]

R1 ↔ R2:
[  1 0  1 | 0 ]
[  2 3 −1 | 1 ]
[ −1 2 −2 | 0 ]

R2 − 2R1, R3 + R1:
[ 1 0  1 | 0 ]
[ 0 3 −3 | 1 ]
[ 0 2 −1 | 0 ]

(1/3)R2, then R3 − 2R2:
[ 1 0  1 | 0    ]
[ 0 1 −1 | 1/3  ]
[ 0 0  1 | −2/3 ]

R1 − R3, R2 + R3:
[ 1 0 0 | 2/3  ]
[ 0 1 0 | −1/3 ]
[ 0 0 1 | −2/3 ]

From this we get the unique solution [x1, x2, x3] = [2/3, −1/3, −2/3].
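The Gauss–Jordan reduction used here mechanizes easily. A sketch (the function name is ours), using exact rational arithmetic and applied to the system of Exercise 28:

```python
from fractions import Fraction

def rref(M):
    """Reduce an augmented matrix to reduced row echelon form, exactly."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]       # bring a nonzero pivot up
        A[r] = [x / A[r][c] for x in A[r]]    # scale the pivot row to a leading 1
        for i in range(rows):
            if i != r and A[i][c] != 0:       # clear the rest of the column
                A[i] = [a - A[i][c] * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return A

result = rref([[2, 3, -1, 1], [1, 0, 1, 0], [-1, 2, -2, 0]])
# last column is the solution [2/3, -1/3, -2/3]
```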
35. Add the first two equations to give 6y = −6, so that y = −1. Substituting back into the second
equation gives −x − 1 = −5, so that x = 4. Further, 2 · 4 + 4 · (−1) = 4, so this solution satisfies the
third equation as well. Thus the unique solution is [x, y] = [4, −1].
36. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 30) to get it into a simpler form.

[  1 −2  0  1 | 2 ]
[ −1  1 −1 −3 | 1 ]

R2 + R1:
[ 1 −2  0  1 | 2 ]
[ 0 −1 −1 −2 | 3 ]

−R2:
[ 1 −2 0 1 |  2 ]
[ 0  1 1 2 | −3 ]

R1 + 2R2:
[ 1 0 2 5 | −4 ]
[ 0 1 1 2 | −3 ]

This augmented matrix corresponds to the linear system
a      + 2c + 5d = −4
     b +  c + 2d = −3
so that, letting c = s and d = t, we get a = −4 − 2s − 5t and b = −3 − s − 2t. So a complete set of solutions is parametrized by [a, b, c, d] = [−4 − 2s − 5t, −3 − s − 2t, s, t].
37. Starting with the linear system given in Exercise 31, add the first equation to the second and to the
third, giving
x+ z=2
2x + 2z = 2 .
Subtracting twice the first of these from the second gives 0 = −2, which is impossible. So this system
is inconsistent and has no solution.
38. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 32) to get it into a simpler form.

[ 1 −1 0 3  1 | 2 ]
[ 1  1 2 1 −1 | 4 ]
[ 0  1 0 2  3 | 0 ]

R2 − R1, then R2 ↔ R3:
[ 1 −1 0  3  1 | 2 ]
[ 0  1 0  2  3 | 0 ]
[ 0  2 2 −2 −2 | 2 ]

R3 − 2R2, then (1/2)R3:
[ 1 −1 0  3  1 | 2 ]
[ 0  1 0  2  3 | 0 ]
[ 0  0 1 −3 −4 | 1 ]

R1 + R2:
[ 1 0 0  5  4 | 2 ]
[ 0 1 0  2  3 | 0 ]
[ 0 0 1 −3 −4 | 1 ]

Then setting x4 = s and x5 = t, we have
x1 + 5s + 4t = 2,   x2 + 2s + 3t = 0,   x3 − 3s − 4t = 1,
so that x1 = 2 − 5s − 4t, x2 = −2s − 3t, and x3 = 1 + 3s + 4t.
So the general solution to this system has parametric form
[x1, x2, x3, x4, x5] = [2 − 5s − 4t, −2s − 3t, 1 + 3s + 4t, s, t].
39. (a) Since t = x and y = 3 − 2t, we can substitute x for t in the equation for y to get y = 3 − 2x, or
2x + y = 3.
(b) If y = s, then substituting s for y in the general equation above gives 2x + s = 3, or 2x = 3 − s, so that x = 3/2 − (1/2)s. So the parametric solution is [x, y] = [3/2 − (1/2)s, s].
40. (a) Since x1 = t, substitute x1 for t in the second and third equations, giving
x2 = 1 + x1 and x3 = 2 − x1 .
This gives the linear system
−x1 + x2      = 1
 x1      + x3 = 2.
(b) Substitute s for x3 into the second equation, giving x1 + s = 2, so that x1 = 2 − s. Substitute that
value for x1 into the first equation, giving −(2 − s) + x2 = 1, or x2 = 3 − s. So the parametric
form given in this way is
[x1 , x2 , x3 ] = [2 − s, 3 − s, s].
41. Let u = 1/x and v = 1/y. Then the system of equations becomes 2u + 3v = 0 and 3u + 4v = 1. Solving the second equation for v gives v = 1/4 − (3/4)u. Substitute that value into the first equation, giving
2u + 3(1/4 − (3/4)u) = 0, or −(1/4)u = −3/4, so u = 3.
Then v = 1/4 − (3/4) · 3 = −2, so the solution is [u, v] = [3, −2], or [x, y] = [1/u, 1/v] = [1/3, −1/2].
42. Let u = x² and v = y². Then the system of equations becomes u + 2v = 6 and u − v = 3. Subtracting the second of these from the first gives 3v = 3, so that v = 1. Substituting that into the second equation gives u − 1 = 3, so that u = 4. The solution in terms of u and v is thus [u, v] = [x², y²] = [4, 1]. Since 1 and 4 have both a positive and a negative square root, the original system has the four solutions
[x, y] = [2, 1], [−2, 1], [2, −1], or [−2, −1].
43. Let u = tan x, v = sin y, and w = cos z. Then the system becomes
u − 2v     =  2
u −  v + w =  2
     v − w = −1.
Subtract the first equation from the second, giving v + w = 0. Add that to the third equation, so that 2v = −1 and v = −1/2. Thus w = 1/2. Substituting these into the second equation gives u − (−1/2) + 1/2 = 2, so that u = 1. The solution is [u, v, w] = [1, −1/2, 1/2], so that
x = tan⁻¹ u = tan⁻¹ 1 = π/4 + kπ,
y = sin⁻¹ v = sin⁻¹(−1/2) = 7π/6 + 2ℓπ or −π/6 + 2ℓπ,
z = cos⁻¹ w = cos⁻¹(1/2) = ±π/3 + 2nπ,
where k, ℓ, and n are integers.
44. Let r = 2^a and s = 3^b. Then the system becomes −r + 2s = 1 and 3r − 4s = 1. Adding three times the first equation to the second gives 2s = 4, so that s = 2. Substituting that back into the first equation gives −r + 4 = 1, so that r = 3. So the solution is [r, s] = [2^a, 3^b] = [3, 2]. In terms of a and b, this gives [a, b] = [log2 3, log3 2].
2.2
Direct Methods for Solving Linear Systems
1. This matrix is not in row echelon form, since the leading entry in row 2 appears to the right of the
leading entry in row 3, violating condition 2 of the definition.
2. This matrix is in row echelon form, since the row of zeros is at the bottom and the leading entry in
each nonzero row is to the left of the ones in all succeeding rows. It is not in reduced row echelon form
since the leading entry in the first row is not 1.
3. This matrix is in reduced row echelon form: the leading entry in each nonzero row is 1 and is to the left
of the ones in all succeeding rows, and any column containing a leading 1 contains zeros everywhere
else.
4. This matrix is in reduced row echelon form: all zero rows are at the bottom, and all other conditions
are vacuously satisfied (that is, they are true since no rows or columns satisfy the conditions).
5. This matrix is not in row echelon form since it has a zero row, but that row is not at the bottom.
6. This matrix is not in row echelon form since the leading entries in rows 1 and 2 appear to the right of
leading entries in succeeding rows.
7. This matrix is not in row echelon form since the leading entry in row 1 does not appear to the left of
the leading entry in row 2. It appears in the same column, but that is not sufficient for the definition.
8. This matrix is in row echelon form, since the zero row is at the bottom and the leading entry in each
row appears in a column to the left of the leading entries in succeeding rows. However, it is not in
reduced row echelon form since the leading entries in rows 1 and 3 are not 1. (Also, it is not in reduced
row echelon form since the third column contains a leading 1 but also contains other nonzero entries).
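The conditions checked in Exercises 1–8 can be encoded directly; the functions below are a sketch (names ours) of the row echelon and reduced row echelon tests:

```python
def is_row_echelon(A):
    """Zero rows at the bottom; each leading entry strictly right of the one above."""
    lead = -1
    seen_zero_row = False
    for row in A:
        nz = [j for j, x in enumerate(row) if x != 0]
        if not nz:
            seen_zero_row = True
            continue
        if seen_zero_row or nz[0] <= lead:
            return False
        lead = nz[0]
    return True

def is_reduced_row_echelon(A):
    """Additionally: every leading entry is 1 and is alone in its column."""
    if not is_row_echelon(A):
        return False
    for row in A:
        nz = [j for j, x in enumerate(row) if x != 0]
        if nz:
            j = nz[0]
            if row[j] != 1 or any(r[j] != 0 for r in A if r is not row):
                return False
    return True
```

For example, [[2, 1], [0, 3]] is in row echelon form but not reduced (its leading entries are not 1), matching the discussion in Exercise 2.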
9. (a)
[ 0 0 1 ]            [ 1 1 1 ]
[ 0 1 1 ]  R1↔R3 →  [ 0 1 1 ]
[ 1 1 1 ]            [ 0 0 1 ]
(b)
        R1−R3, R2−R3  [ 1 1 0 ]  R1−R2  [ 1 0 0 ]
· · ·        →        [ 0 1 0 ]    →    [ 0 1 0 ]
                      [ 0 0 1 ]         [ 0 0 1 ]
10. (a)
[ 4 3 ]  R2↔R1  [ 2 1 ]  (1/2)R1  [ 1 1/2 ]  R2−4R1  [ 1 1/2 ]
[ 2 1 ]    →    [ 4 3 ]     →     [ 4  3  ]     →     [ 0  1  ]
(b)
· · ·  R1−(1/2)R2  [ 1 0 ]
            →      [ 0 1 ]
11. (a)
[ 3  5 ]            [ 2  4 ]            [ 1  2 ]                    [ 1   2 ]
[ 5 −2 ]  R1↔R3 →  [ 5 −2 ]  (1/2)R1 →  [ 5 −2 ]  R2−5R1, R3−3R1 →  [ 0 −12 ]
[ 2  4 ]            [ 3  5 ]            [ 3  5 ]                    [ 0  −1 ]
(b)
· · ·  −(1/12)R2  [ 1  2 ]  R1−2R2, R3+R2  [ 1 0 ]
           →      [ 0  1 ]        →        [ 0 1 ]
                  [ 0 −1 ]                 [ 0 0 ]
12. (a)
[ 2 −4 −2 6 ]  (1/2)R1  [ 1 −2 −1 3 ]  R2−3R1  [ 1 −2 −1  3 ]
[ 3  1  6 6 ]     →     [ 3  1  6 6 ]     →    [ 0  7  9 −3 ]
(b)
· · ·  (1/7)R2  [ 1 −2  −1    3  ]  R1+2R2  [ 1 0 11/7 15/7 ]
          →     [ 0  1  9/7 −3/7 ]     →    [ 0 1  9/7 −3/7 ]
13. (a)
[ 3 −2 −1 ]  3R2, −3R3  [   3 −2 −1 ]  R2−2R1, R3+4R1  [ 3 −2 −1 ]  R3−R2  [ 3 −2 −1 ]
[ 2 −1 −1 ]      →      [   6 −3 −3 ]        →          [ 0  1 −1 ]    →    [ 0  1 −1 ]
[ 4 −3 −1 ]             [ −12  9  3 ]                   [ 0  1 −1 ]         [ 0  0  0 ]
(b)
· · ·  R1+2R2  [ 3 0 −3 ]  (1/3)R1  [ 1 0 −1 ]
          →    [ 0 1 −1 ]     →     [ 0 1 −1 ]
               [ 0 0  0 ]           [ 0 0  0 ]


2 −3
1
R +3R1
0
−6 10 −−2−−−→
R3 +2R1
−4
7 −
−−−−→ 0
2
0
0


−3
1
0
1
R3 −R2
1 −
−−−−→ 0

R +2R2
3
−−1−−−→
0
···
0
0
1
0
14. (a)

−2
−3
1


−4
7
1
R1 ↔R3 
−6 10 −
−3
−−−−→
2 −3
−2
(b)

R −3R2
1
−−1−−−→
0
···
0
2
0
0

0
1
0
2
0
0

3
1 · · ·
0
15. Undoing the row operations, each in reverse order, recovers the original matrix:

[ 1  2 −4 −4  5 ]            [ 1  2 −4 −4  5 ]         [ 1  2 −4 −4  5 ]
[ 0 −1 10  9 −5 ]  R4+29R3  [ 0 −1 10  9 −5 ]   8R3   [ 0 −1 10  9 −5 ]
[ 0  0  1  1 −1 ]     →     [ 0  0  1  1 −1 ]    →    [ 0  0  8  8 −8 ]
[ 0  0  0  0 24 ]            [ 0  0 29 29 −5 ]         [ 0  0 29 29 −5 ]

R4 − 3R2:                    R2 ↔ R3:
[ 1  2 −4 −4  5 ]            [ 1  2 −4 −4  5 ]
[ 0 −1 10  9 −5 ]            [ 0  0  8  8 −8 ]
[ 0  0  8  8 −8 ]            [ 0 −1 10  9 −5 ]
[ 0  3 −1  2 10 ]            [ 0  3 −1  2 10 ]

R2 + 2R1, R3 + 2R1, R4 − R1:
[  1 2 −4 −4 5 ]
[  2 4  0  0 2 ]
[  2 3  2  1 5 ]
[ −1 1  3  6 5 ]
16. Ri ↔ Rj undoes itself; (1/k)Ri undoes kRi; and Ri − kRj undoes Ri + kRj.
17.
A = [ 1 2 ]  R2−2R1  [ 1 2 ]  −(1/2)R1  [ −1/2 −1 ]  R1+(7/2)R2  [ 3 −1 ]
    [ 3 4 ]    →     [ 1 0 ]     →      [   1   0 ]      →       [ 1  0 ]  = B.
So A and B are row equivalent, and we can convert A to B by the sequence R2 − 2R1, −(1/2)R1, and R1 + (7/2)R2.
18.
A = [  2 0 −1 ]  R1+R2  [  3 1 −1 ]  R2+2R3  [  3 1 −1 ]
    [  1 1  0 ]    →    [  1 1  0 ]     →    [ −1 3  2 ]
    [ −1 1  1 ]         [ −1 1  1 ]          [ −1 1  1 ]

R2+R1, R3+R1  [ 3 1 −1 ]  R2+(1/2)R3  [ 3 1 −1 ]
     →        [ 2 4  1 ]       →      [ 3 5  1 ]  = B.
              [ 2 2  0 ]              [ 2 2  0 ]

Therefore the matrices A and B are row equivalent.
19. Performing R2 + R1 gives a new second row R2′ = R2 + R1. Then performing R1 + R2 gives a new first row with value R1 + R2′ = 2R1 + R2. Only one row operation can be performed at a time (although in a sequence of row operations, we may perform more than one in one step as long as any rows affected do not appear in other transformations in that step, just for efficiency).
20.
[ x1 ]  R2+R1  [   x1    ]  R1−R2  [   −x2   ]  R2+R1  [ −x2 ]  −R1  [ x2 ]
[ x2 ]    →    [ x2 + x1 ]    →    [ x2 + x1 ]    →    [  x1 ]   →   [ x1 ].
The net effect is to interchange the first and second rows. So this sequence of elementary row operations is the same as R1 ↔ R2.
21. Note that 3R2 − 2R1 is not an elementary row operation since it is not of the form Ri ↔ Rj, kRi, or Ri + kRj (it is not of the third form since the coefficient of R2 is 3, not 1). However,
[ 3 1 ]  R2−(2/3)R1  [ 3   1  ]  3R2  [ 3  1 ]
[ 2 4 ]      →       [ 0 10/3 ]   →   [ 0 10 ],
so this is in fact a composition of elementary row operations.
22. We can create a 1 in the top left corner as follows:
[ 3 2 ]  R1↔R2  [ 1 4 ]      [ 3 2 ]  (1/3)R1  [ 1 2/3 ]      [ 3 2 ]  R1−2R2  [ 1 −6 ]
[ 1 4 ]    →    [ 3 2 ],     [ 1 4 ]     →     [ 1  4  ],     [ 1 4 ]    →     [ 1  4 ].
The first method, simply interchanging the two rows, requires the least computation and seems simplest. Our next choice would probably be the third method, since at least it produces integer results.
23. The rank of a matrix is the number of nonzero rows in its row echelon form.
(1) Since A is not in row echelon form, we first put it in row echelon form:
[ 1 0 1 ]            [ 1 0 1 ]
[ 0 0 3 ]  R2↔R3 →  [ 0 1 0 ]
[ 0 1 0 ]            [ 0 0 3 ]
Since the row echelon form has three nonzero rows, rank A = 3.
(2) A is already in row echelon form, and it has two nonzero rows, so rank A = 2.
(3) A is already in row echelon form, and it has two nonzero rows, so rank A = 2.
(4) A is already in row echelon form, and it has no nonzero rows, so rank A = 0.
(5) Since A is not in row echelon form, we first put it in row echelon form:
[ 1 0 3 −4 0 ]            [ 1 0 3 −4 0 ]
[ 0 0 0  0 0 ]  R2↔R3 →  [ 0 1 5  0 1 ]
[ 0 1 5  0 1 ]            [ 0 0 0  0 0 ]
Since the row echelon form has two nonzero rows, rank A = 2.
(6) Since A is not in row echelon form, we first put it in row echelon form:
[ 0 0 1 ]            [ 1 0 0 ]
[ 0 1 0 ]  R1↔R3 →  [ 0 1 0 ]
[ 1 0 0 ]            [ 0 0 1 ]
Since the row echelon form has three nonzero rows, rank A = 3.
(7) Since A is not in row echelon form, we first put it in row echelon form:
[ 1 2 3 ]          [ 1  2  3 ]              [ 1  2   3  ]            [ 1  2   3  ]
[ 1 0 0 ]  R2−R1  [ 0 −2 −3 ]  R3+(1/2)R2  [ 0 −2  −3  ]  R4+2R3 →  [ 0 −2  −3  ]
[ 0 1 1 ]    →    [ 0  1  1 ]      →       [ 0  0 −1/2 ]            [ 0  0 −1/2 ]
[ 0 0 1 ]          [ 0  0  1 ]              [ 0  0   1  ]            [ 0  0   0  ]
Since the row echelon form has three nonzero rows, rank A = 3.
(8) A is already in row echelon form, and it has three nonzero rows, so rank A = 3.
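Counting nonzero rows after forward elimination, as in this exercise, can be sketched as follows (the function name is ours; exact arithmetic avoids rounding issues):

```python
from fractions import Fraction

def rank(M):
    """Rank = number of nonzero rows after forward (Gaussian) elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][c] / A[r][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return r
```

Applied to the matrices of parts (1) and (7) this gives 3, matching the answers above.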
24. Such a matrix has 0, 1, 2, or 3 nonzero rows.
• If it has no nonzero rows, then all rows are zero, so the only such matrix is the zero matrix
[ 0 0 0 ]
[ 0 0 0 ]
[ 0 0 0 ].
• If it has one nonzero row, that must be the first row, and any entries in that row following the 1 are arbitrary, so the possibilities are
[ 1 ∗ ∗ ]     [ 0 1 ∗ ]     [ 0 0 1 ]
[ 0 0 0 ],    [ 0 0 0 ],    [ 0 0 0 ],
[ 0 0 0 ]     [ 0 0 0 ]     [ 0 0 0 ]
where ∗ denotes that any entry is allowed.
• If it has two nonzero rows, then the third row is zero. There are several possibilities for the arrangements of 1s in the first two rows: they could be in columns 1 and 2, 1 and 3, or 2 and 3. In each case, the entries following each 1 are arbitrary, except for the entry in the first row above the 1 in the second row. This gives the possibilities
[ 1 0 ∗ ]     [ 1 ∗ 0 ]     [ 0 1 0 ]
[ 0 1 ∗ ],    [ 0 0 1 ],    [ 0 0 1 ].
[ 0 0 0 ]     [ 0 0 0 ]     [ 0 0 0 ]
• If the matrix has all nonzero rows, then the only possibility is the identity matrix
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ].
Note that all other entries must be zero since each column contains a leading 1.
25. First row-reduce the augmented matrix of the system:

[ 1  2 −3 | 9 ]
[ 2 −1  1 | 0 ]
[ 4 −1  1 | 4 ]

R2 − 2R1, R3 − 4R1:
[ 1  2 −3 |   9 ]
[ 0 −5  7 | −18 ]
[ 0 −9 13 | −32 ]

−(1/5)R2:
[ 1  2  −3  |    9 ]
[ 0  1 −7/5 | 18/5 ]
[ 0 −9  13  |  −32 ]

R3 + 9R2, then (5/2)R3:
[ 1 2  −3  |    9 ]
[ 0 1 −7/5 | 18/5 ]
[ 0 0   1  |    1 ]

R1 + 3R3, R2 + (7/5)R3:
[ 1 2 0 | 12 ]
[ 0 1 0 |  5 ]
[ 0 0 1 |  1 ]

R1 − 2R2:
[ 1 0 0 | 2 ]
[ 0 1 0 | 5 ]
[ 0 0 1 | 1 ]

Thus the solution is [x1, x2, x3] = [2, 5, 1].
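As an independent cross-check of this solution, one can solve the same 3×3 system by Cramer's rule (a different method from the row reduction used here; function names are ours):

```python
def det3(M):
    """3x3 determinant by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer3(A, b):
    """Solve a 3x3 system Ax = b by Cramer's rule (requires det A != 0)."""
    D = det3(A)
    x = []
    for j in range(3):
        Aj = [row[:] for row in A]
        for i in range(3):
            Aj[i][j] = b[i]       # replace column j by the right-hand side
        x.append(det3(Aj) / D)
    return x

A = [[1, 2, -3], [2, -1, 1], [4, -1, 1]]
x = cramer3(A, [9, 0, 4])  # [2.0, 5.0, 1.0]
```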
26. First row-reduce the augmented matrix of the system:

[  1 −1 1 | 0 ]
[ −1  3 1 | 5 ]
[  3  1 7 | 2 ]

R2 + R1, R3 − 3R1:
[ 1 −1 1 | 0 ]
[ 0  2 2 | 5 ]
[ 0  4 4 | 2 ]

(1/2)R2:
[ 1 −1 1 |   0 ]
[ 0  1 1 | 5/2 ]
[ 0  4 4 |   2 ]

R3 − 4R2:
[ 1 −1 1 |   0 ]
[ 0  1 1 | 5/2 ]
[ 0  0 0 |  −8 ]
This is an inconsistent system since the bottom row of the matrix is zero but the constant is nonzero.
Thus the system has no solution.
27. First row-reduce the augmented matrix of the system:

[  1 −3 −2 | 0 ]
[ −1  2  1 | 0 ]
[  2  4  6 | 0 ]

R2 + R1, R3 − 2R1:
[ 1 −3 −2 | 0 ]
[ 0 −1 −1 | 0 ]
[ 0 10 10 | 0 ]

−R2, then R3 − 10R2:
[ 1 −3 −2 | 0 ]
[ 0  1  1 | 0 ]
[ 0  0  0 | 0 ]

R1 + 3R2:
[ 1 0 1 | 0 ]
[ 0 1 1 | 0 ]
[ 0 0 0 | 0 ]
Since there is a row of zeros, and no leading 1 in the third column, we know that x3 is a free variable, say x3 = t. Back substituting gives x2 + t = 0 and x1 + t = 0, so that x1 = x2 = −t. So the solution is
[x1, x2, x3] = [−t, −t, t] = t[−1, −1, 1].
28. First row-reduce the augmented matrix of the system:

[ 2  3 −1  4 | 1 ]
[ 3 −1  0  1 | 1 ]
[ 3 −4  1 −1 | 2 ]

(1/2)R1:
[ 1 3/2 −1/2 2 | 1/2 ]
[ 3 −1    0  1 |  1  ]
[ 3 −4    1 −1 |  2  ]

R2 − 3R1, R3 − 3R1:
[ 1   3/2 −1/2  2 |  1/2 ]
[ 0 −11/2  3/2 −5 | −1/2 ]
[ 0 −17/2  5/2 −7 |  1/2 ]

−(2/11)R2, then R3 + (17/2)R2:
[ 1 3/2  −1/2     2  |  1/2  ]
[ 0  1  −3/11  10/11 |  1/11 ]
[ 0  0   2/11   8/11 | 14/11 ]

(11/2)R3:
[ 1 3/2  −1/2     2  | 1/2  ]
[ 0  1  −3/11  10/11 | 1/11 ]
[ 0  0    1      4   |  7   ]

This matrix is in row echelon form, but we can continue, putting it in reduced row echelon form:

R1 − (3/2)R2:
[ 1 0 −1/11   7/11 | 4/11 ]
[ 0 1 −3/11  10/11 | 1/11 ]
[ 0 0   1      4   |  7   ]

R1 + (1/11)R3, R2 + (3/11)R3:
[ 1 0 0 1 | 1 ]
[ 0 1 0 2 | 2 ]
[ 0 0 1 4 | 7 ]
Using z = t as the parameter gives w = 1 − t, x = 2 − 2t, y = 7 − 4t, z = t for the solution, or
[w, x, y, z] = [1 − t, 2 − 2t, 7 − 4t, t] = [1, 2, 7, 0] + t[−1, −2, −4, 1].
29. First row-reduce the augmented matrix of the system:

[ 2 1 |  3 ]
[ 4 1 |  7 ]
[ 2 5 | −1 ]

(1/2)R1:
[ 1 1/2 | 3/2 ]
[ 4  1  |  7  ]
[ 2  5  | −1  ]

R2 − 4R1, R3 − 2R1:
[ 1 1/2 | 3/2 ]
[ 0 −1  |  1  ]
[ 0  4  | −4  ]

−R2:
[ 1 1/2 | 3/2 ]
[ 0  1  | −1  ]
[ 0  4  | −4  ]

R1 − (1/2)R2, R3 − 4R2:
[ 1 0 |  2 ]
[ 0 1 | −1 ]
[ 0 0 |  0 ]

The solution is [x1, x2] = [2, −1].
30. First row-reduce the augmented matrix of the system:

[ −1  3 −2  4 |  0 ]
[  2 −6  1 −2 | −3 ]
[  1 −3  4 −8 |  2 ]

−R1:
[ 1 −3  2 −4 |  0 ]
[ 2 −6  1 −2 | −3 ]
[ 1 −3  4 −8 |  2 ]

R2 − 2R1, R3 − R1:
[ 1 −3  2 −4 |  0 ]
[ 0  0 −3  6 | −3 ]
[ 0  0  2 −4 |  2 ]

−(1/3)R2, then R3 − 2R2:
[ 1 −3 2 −4 | 0 ]
[ 0  0 1 −2 | 1 ]
[ 0  0 0  0 | 0 ]

R1 − 2R2:
[ 1 −3 0  0 | −2 ]
[ 0  0 1 −2 |  1 ]
[ 0  0 0  0 |  0 ]

The system has been reduced to
x1 − 3x2 = −2
x3 − 2x4 =  1.
x2 and x4 are free variables; setting x2 = s and x4 = t and back-substituting gives for the solution
[x1, x2, x3, x4] = [3s − 2, s, 2t + 1, t] = [−2, 0, 1, 0] + s[3, 1, 0, 0] + t[0, 0, 2, 1].
31. First row-reduce the augmented matrix of the system:

[ 1/2  1  −1 −6   0 |  2 ]
[ 1/6 1/2  0 −3   1 | −1 ]
[ 1/3  0  −2  0  −4 |  8 ]

2R1, 6R2, 3R3 (to clear the fractions):
[ 1 2 −2 −12   0 |  4 ]
[ 1 3  0 −18   6 | −6 ]
[ 1 0 −6   0 −12 | 24 ]

R2 − R1, R3 − R1:
[ 1  2 −2 −12   0 |   4 ]
[ 0  1  2  −6   6 | −10 ]
[ 0 −2 −4  12 −12 |  20 ]

R3 + 2R2:
[ 1 2 −2 −12 0 |   4 ]
[ 0 1  2  −6 6 | −10 ]
[ 0 0  0   0 0 |   0 ]

R1 − 2R2:
[ 1 0 −6  0 −12 |  24 ]
[ 0 1  2 −6   6 | −10 ]
[ 0 0  0  0   0 |   0 ]
Since the matrix has rank 2 but there are 5 variables, there are three free variables, say x3 = r, x4 = s, and x5 = t. Back-substituting gives x1 − 6r − 12t = 24 and x2 + 2r − 6s + 6t = −10, so that x1 = 24 + 6r + 12t and x2 = −10 − 2r + 6s − 6t. Thus the solution is
[x1, x2, x3, x4, x5] = [24, −10, 0, 0, 0] + r[6, −2, 1, 0, 0] + s[0, 6, 0, 1, 0] + t[12, −6, 0, 0, 1].
32. First row-reduce the augmented matrix of the system:

[ √2  1   2 |  1  ]
[  0 √2  −3 | −√2 ]
[  0 −1  √2 |  1  ]

(1/√2)R1, (1/√2)R2:
[ 1 √2/2    √2   | √2/2 ]
[ 0   1  −3√2/2 |  −1  ]
[ 0  −1     √2   |   1  ]

R1 − (√2/2)R2 and R3 + R2:
[ 1 0 √2 + 3/2 | √2 ]
[ 0 1  −3√2/2  | −1 ]
[ 0 0  −√2/2   |  0 ]

−√2 R3, then R1 − (√2 + 3/2)R3 and R2 + (3√2/2)R3:
[ 1 0 0 | √2 ]
[ 0 1 0 | −1 ]
[ 0 0 1 |  0 ]

so the solution is [x, y, z] = [√2, −1, 0].
33. First row-reduce the augmented matrix of the system:

[ 1  1  2 1 |  1 ]
[ 1 −1 −1 1 |  0 ]
[ 0  1  1 0 | −1 ]
[ 1  1  0 1 |  2 ]

R2 − R1, R4 − R1:
[ 1  1  2 1 |  1 ]
[ 0 −2 −3 0 | −1 ]
[ 0  1  1 0 | −1 ]
[ 0  0 −2 0 |  1 ]

R2 ↔ R3:
[ 1  1  2 1 |  1 ]
[ 0  1  1 0 | −1 ]
[ 0 −2 −3 0 | −1 ]
[ 0  0 −2 0 |  1 ]

R3 + 2R2:
[ 1  1  2 1 |  1 ]
[ 0  1  1 0 | −1 ]
[ 0  0 −1 0 | −3 ]
[ 0  0 −2 0 |  1 ]

R4 − 2R3:
[ 1  1  2 1 |  1 ]
[ 0  1  1 0 | −1 ]
[ 0  0 −1 0 | −3 ]
[ 0  0  0 0 |  7 ]
7
We need not proceed any further — since the last row says 0 = 7, we know the system is inconsistent
and has no solution.
34. First row-reduce the augmented matrix of the system:

[ 1 1  1  1 |  4 ]
[ 1 2  3  4 | 10 ]
[ 1 3  6 10 | 20 ]
[ 1 4 10 20 | 35 ]

R2 − R1, R3 − R1, R4 − R1:
[ 1 1 1  1 |  4 ]
[ 0 1 2  3 |  6 ]
[ 0 2 5  9 | 16 ]
[ 0 3 9 19 | 31 ]

R3 − 2R2, R4 − 3R2:
[ 1 1 1  1 |  4 ]
[ 0 1 2  3 |  6 ]
[ 0 0 1  3 |  4 ]
[ 0 0 3 10 | 13 ]

R4 − 3R3:
[ 1 1 1 1 | 4 ]
[ 0 1 2 3 | 6 ]
[ 0 0 1 3 | 4 ]
[ 0 0 0 1 | 1 ]

R1 − R4, R2 − 3R4, R3 − 3R4:
[ 1 1 1 0 | 3 ]
[ 0 1 2 0 | 3 ]
[ 0 0 1 0 | 1 ]
[ 0 0 0 1 | 1 ]

R1 − R3, R2 − 2R3:
[ 1 1 0 0 | 2 ]
[ 0 1 0 0 | 1 ]
[ 0 0 1 0 | 1 ]
[ 0 0 0 1 | 1 ]

R1 − R2:
[ 1 0 0 0 | 1 ]
[ 0 1 0 0 | 1 ]
[ 0 0 1 0 | 1 ]
[ 0 0 0 1 | 1 ]

so that [a, b, c, d] = [1, 1, 1, 1] is the unique solution.
35. If we reverse the first and third rows, we get a system in row echelon form with a leading 1 in every
row, so rank A = 3. Thus this is a system of three equations in three variables with 3 − 3 = 0 free
variables, so it has a unique solution.
36. Note that R3 −2R2 results in the row 0 0 0 0 | 2, so that the system is inconsistent and has no solutions.
37. This system is a homogeneous system with four variables and only three equations, so the rank of the
matrix is at most 3 and thus there is at least one free variable. So this system has infinitely many
solutions.
38. Note that R3 ← R3 − R1 − R2 gives a zero row, and then R2 ← R2 − 6R1 gives a matrix in row echelon
form. So there are three free variables, and this system has infinitely many solutions.
39. It suffices to show that the row-reduced matrix has no zero rows, since it then has rank 2, so there are
no free variables and no possibility of an inconsistent system.
First, if a = 0, then since ad − bc = −bc ≠ 0, we see that both b and c are nonzero. Then row-reduce the augmented matrix:
[ 0 b | r ]  R1↔R2  [ c d | s ]
[ c d | s ]    →    [ 0 b | r ].
Since b and c are both nonzero, this is a consistent system with a unique solution.
Second, if c = 0, then since ad − bc = ad ≠ 0, we see that both a and d are nonzero. Then the augmented matrix is
[ a b | r ]
[ 0 d | s ].
This is row-reduced since both a and d are nonzero, and it clearly has rank 2, so this is a consistent system with a unique solution.
Finally, if a ≠ 0 and c ≠ 0, then row-reducing gives
[ a b | r ]  cR1, aR2  [ ac bc | rc ]  R2−R1  [ ac    bc   |    rc    ]
[ c d | s ]     →      [ ac ad | as ]    →    [  0 ad − bc | as − rc ].
Since ac ≠ 0 and also ad − bc ≠ 0, this is a rank 2 matrix, so it has a unique solution.
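When ad − bc ≠ 0 the unique solution also has the explicit closed form x = (rd − bs)/(ad − bc), y = (as − rc)/(ad − bc). A sketch (the function name is ours):

```python
def solve2(a, b, c, d, r, s):
    """Unique solution of ax + by = r, cx + dy = s when ad - bc != 0."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0: no unique solution")
    return ((r * d - b * s) / det, (a * s - r * c) / det)

x, y = solve2(1, 1, 1, -1, 3, 1)  # x + y = 3, x - y = 1  ->  (2.0, 1.0)
```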
40. First reduce the augmented matrix of the given system to row echelon form:
[ k  2 |  3 ]  R1↔R2  [ 2 −4 | −6 ]  R2−(k/2)R1  [ 2   −4    |   −6   ]
[ 2 −4 | −6 ]    →    [ k  2 |  3 ]      →       [ 0 2 + 2k | 3 + 3k ]
(a) The system has no solution when this matrix has a zero row with a corresponding nonzero constant.
Now, 2 + 2k = 0 for k = −1, in which case 3 + 3k = 0 as well. Thus there is no value of k for
which the system has no solutions.
(b) The system has a unique solution when rank A = 2, which happens when the bottom row is nonzero, i.e., for k ≠ −1.
(c) The system has an infinite number of solutions when A has a zero row with a zero constant. From
part (a), this happens when k = −1.
41. First row-reduce the augmented matrix of the system:
[ 1 k | 1 ]  R2−kR1  [ 1    k   |   1   ]
[ k 1 | 1 ]     →    [ 0 1 − k² | 1 − k ]
(a) If 1 − k² = 0 but 1 − k ≠ 0, this system will be inconsistent and have no solution. 1 − k² = 0 for k = 1 and k = −1; of these, only k = −1 produces 1 − k ≠ 0. Thus the system has no solution for k = −1.
(b) The system has a unique solution when 1 − k² ≠ 0, so for k ≠ ±1.
(c) The system has infinitely many solutions when the last row is zero, so when 1 − k² = 1 − k = 0. This happens for k = 1.
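The case analysis for k can be automated. This sketch (names ours) classifies the system x + ky = 1, kx + y = 1 from the reduced second row [0, 1 − k² | 1 − k]:

```python
from fractions import Fraction

def classify(k):
    """'none', 'unique', or 'infinite' solutions of x + ky = 1, kx + y = 1."""
    # After R2 - k R1 the second row is [0, 1 - k^2 | 1 - k].
    a, b = Fraction(1) - k * k, Fraction(1) - k
    if a == 0:
        return "infinite" if b == 0 else "none"
    return "unique"

results = {k: classify(Fraction(k)) for k in (-1, 0, 1, 2)}
# {-1: 'none', 0: 'unique', 1: 'infinite', 2: 'unique'}
```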
42. First reduce the augmented matrix of the given system to row echelon form:

[ 1 −2 3 |  2 ]
[ 1  1 1 |  k ]
[ 2 −1 4 | k² ]

R2 − R1, R3 − 2R1:
[ 1 −2  3 |   2    ]
[ 0  3 −2 | k − 2  ]
[ 0  3 −2 | k² − 4 ]

R3 − R2:
[ 1 −2  3 |      2      ]
[ 0  3 −2 |    k − 2    ]
[ 0  0  0 | k² − k − 2 ]
(a) The system has no solution when the matrix has a zero row with a corresponding nonzero constant. This occurs for k² − k − 2 ≠ 0, i.e. for k ≠ 2, −1.
(b) Since the augmented matrix has a zero row, the system does not have a unique solution for any value of k; z is a free variable.
(c) The system has an infinite number of solutions when the augmented matrix has a zero row with a zero constant. This occurs for k² − k − 2 = 0, i.e. for k = 2, −1.
43. First reduce the augmented matrix of the given system to row echelon form:

[ 1 1 k |  1 ]
[ 1 k 1 |  1 ]
[ k 1 1 | −2 ]

R2 − R1, R3 − kR1:
[ 1   1     k    |    1   ]
[ 0 k − 1 1 − k  |    0   ]
[ 0 1 − k 1 − k² | −2 − k ]

R3 + R2:
[ 1   1       k      |    1   ]
[ 0 k − 1   1 − k    |    0   ]
[ 0   0   2 − k − k² | −2 − k ]
(a) The system has no solution when the matrix has a zero row with a corresponding nonzero constant. This occurs when 2 − k − k² = (1 − k)(2 + k) = 0 but −2 − k ≠ 0. For k = 1, −2 − k = −3 ≠ 0, but for k = −2, −2 − k = 0. Thus the system has no solution only when k = 1.
(b) The system has a unique solution when 2 − k − k² ≠ 0; that is, when k ≠ 1 and k ≠ −2.
(c) The system has an infinite number of solutions when the augmented matrix has a zero row with a zero constant. This occurs when 2 − k − k² = 0 and −2 − k = 0. From part (a), we see that this is exactly when k = −2.
44. There are many examples. For instance,
(a) The following system of n equations in n variables has infinitely many solutions (for example,
choose k and set x1 = k and x2 = −k, with the other xi = 0):
x1 + x2 + · · · + xn = 0
2x1 + 2x2 + · · · + 2xn = 0
..
.
nx1 + nx2 + · · · + nxn = 0 .
Let m be any integer greater than n. Then the following system of m equations in n variables
has infinitely many solutions (for example, choose k and set x1 = k and x2 = −k, with the other
xi = 0):
x1 + x2 + · · · + xn = 0
2x1 + 2x2 + · · · + 2xn = 0
..
.
mx1 + mx2 + · · · + mxn = 0 .
(b) The following system of n equations in n variables has a unique solution:
x1 = 0
x2 = 0
..
.
xn = 0 .
Let m be any integer greater than n. Then a system whose first n equations are as above, and
which continues with equations x1 + x2 = 0, x1 + 2x2 = 0, and so forth until there are m equations
clearly also has a unique solution.
45. Observe that the normal vectors to the planes, which are [3, 2, 1] and [2, −1, 4], are not parallel, so the
two planes will indeed intersect in a line. The line of intersection is the points in the solution set of
the system
3x + 2y + z = −1
2x −  y + 4z = 5
Reduce the corresponding augmented matrix to row echelon form:

[ 3  2 1 | −1 ]
[ 2 −1 4 |  5 ]

(1/3)R1:
[ 1 2/3 1/3 | −1/3 ]
[ 2 −1   4  |   5  ]

R2 − 2R1:
[ 1  2/3  1/3 | −1/3 ]
[ 0 −7/3 10/3 | 17/3 ]

−(3/7)R2:
[ 1 2/3   1/3  |  −1/3 ]
[ 0  1  −10/7 | −17/7 ]

R1 − (2/3)R2:
[ 1 0   9/7  |   9/7 ]
[ 0 1 −10/7 | −17/7 ]

Thus z is a free variable; we set z = 7t to eliminate some fractions. Then
x = −9t + 9/7,    y = 10t − 17/7,
so that the line of intersection is
[x, y, z] = [9/7, −17/7, 0] + t[−9, 10, 7].
68
CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS
46. Observe that the normal vectors to the planes, which are [4, 1, 1] and [2, −1, 3], are not parallel, so the
two planes will indeed intersect in a line. The line of intersection is the points in the solution set of
the system
4x + y + z = 0
2x − y + 3z = 2
Reduce the corresponding augmented matrix to row echelon form:

    [ 4    1   1 | 0 ]  (1/4)R1   [ 1   1/4   1/4 | 0 ]  R2−2R1   [ 1    1/4   1/4 | 0 ]
    [ 2   −1   3 | 2 ]  −−−−→     [ 2   −1     3  | 2 ]  −−−−→    [ 0   −3/2   5/2 | 2 ]

    (−2/3)R2   [ 1   1/4    1/4 |    0 ]  R1−(1/4)R2   [ 1   0    2/3 |  1/3 ]
    −−−−−→     [ 0    1   −5/3  | −4/3 ]  −−−−−−→      [ 0   1   −5/3 | −4/3 ].

Letting the parameter be z = t, we get the parametric form of the solution:

    x = 1/3 − (2/3)t,    y = −4/3 + (5/3)t,    z = t.
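The two parametrizations just found can be checked by substituting them back into the plane equations of Exercises 45 and 46; a short numerical sketch:

```python
import numpy as np

# Substitute the parametric solutions back into both plane equations for
# several values of the parameter t.
for t in np.linspace(-2.0, 2.0, 9):
    # Exercise 45: x = 9/7 - 9t, y = -17/7 + 10t, z = 7t
    x, y, z = 9/7 - 9*t, -17/7 + 10*t, 7*t
    assert abs(3*x + 2*y + z - (-1)) < 1e-12   # 3x + 2y + z = -1
    assert abs(2*x - y + 4*z - 5) < 1e-12      # 2x -  y + 4z = 5
    # Exercise 46: x = 1/3 - (2/3)t, y = -4/3 + (5/3)t, z = t
    x, y, z = 1/3 - 2*t/3, -4/3 + 5*t/3, t
    assert abs(4*x + y + z) < 1e-12            # 4x + y + z = 0
    assert abs(2*x - y + 3*z - 2) < 1e-12      # 2x - y + 3z = 2
print("both parametrizations lie on their planes")
```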
47. We are looking for examples, so we should keep in mind simple planes that we know, such as x = 0,
y = 0, and z = 0.
(a) Starting with x = 0 and y = 0, these planes intersect when x = y = 0, so they intersect on the
z-axis. So we must find another plane that contains the z-axis. Since the line y = x in R2 passes
through the origin, the plane y = x in R3 will have z as a free variable, and it contains (0, 0, 0),
so it contains the z-axis. Thus the three planes x = 0, y = 0, and x = y intersect in a line, the
z-axis. A sketch of these three planes is:
(b) Again start with x = 0 and y = 0; we are looking for a plane that intersects both of these, but not
on the z-axis. Looking down from the top, it seems that any plane whose projection onto the xy
plane is a line in the first quadrant will suffice. Since x + y = 1 is such a line, the plane x + y = 1
should intersect each of the other planes, but there will be no common points of intersection.
To find the line of intersection of x = 0 and x + y = 1, solve that pair of equations. Using
forward substitution we get y = 1, so the line of intersection is the line x = 0, y = 1, which is
the line [0, 1, t]. To find the line of intersection of y = 0 and x + y = 1, again solve using forward
substitution, giving the line x = 1, y = 0, which is the line [1, 0, t]. Clearly these two lines have
no common points. A sketch of these three planes is:
(c) Clearly x = 0 and x = 1 are parallel, while y = 0 intersects each of them in a line. A sketch of
these three planes is:
(d) The three planes x = 0, y = 0, and z = 0 intersect only at the origin:
48. The two lines have parametric forms p + su and q + tv, where s and t are the parameters. These two
are equal when
p + su = q + tv, or su − tv = q − p.
Substitute the given values of the four vectors to get

      [  1 ]     [ −1 ]   [ 2 ]   [ −1 ]   [  3 ]        s +  t =  3
    s [  2 ] − t [  1 ] = [ 2 ] − [  2 ] = [  0 ]   ⇒   2s −  t =  0
      [ −1 ]     [  0 ]   [ 0 ]   [  1 ]   [ −1 ]       −s      = −1 .

From the third equation, s = 1; substituting into the first equation gives t = 2. Since s = 1, t = 2
satisfies the second equation, this is the solution. Thus the point of intersection is

    [ x ]             [ 0 ]
    [ y ] = p + 1u =  [ 4 ].
    [ z ]             [ 0 ]
49. The two lines have parametric forms p + su and q + tv, where s and t are the parameters. These two
are equal when
p + su = q + tv, or su − tv = q − p.
Substitute the given values of the four vectors to get

      [ 1 ]     [ 2 ]   [ −1 ]   [ 3 ]   [ −4 ]       s − 2t = −4
    s [ 0 ] − t [ 3 ] = [  1 ] − [ 1 ] = [  0 ]   ⇒     − 3t =  0
      [ 1 ]     [ 1 ]   [ −1 ]   [ 0 ]   [ −1 ]       s −  t = −1 .

From the second equation, t = 0; setting t = 0 in the other two equations gives s = −4 and s = −1.
This is impossible, so the system has no solution and the lines do not intersect.
50. Let Q = (a, b, c); then we want to find values of Q such that the line q + sv intersects the line p + tu.
The two lines intersect when
p + su = q + tv, or su − tv = q − p.
Substitute the values of the vectors to get

      [  1 ]     [ 2 ]   [ a ]   [  1 ]   [ a − 1 ]       s − 2t = a − 1
    s [  1 ] − t [ 1 ] = [ b ] − [  2 ] = [ b − 2 ]   ⇒   s −  t = b − 2
      [ −1 ]     [ 0 ]   [ c ]   [ −3 ]   [ c + 3 ]       −s     = c + 3 .

Solving those equations for a, b, and c gives

    Q = (a, b, c) = (s − 2t + 1, s − t + 2, −s − 3).

Thus any pair of numbers s and t determine a point Q at which the two given lines intersect.
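The computations in Exercises 48 and 49 both amount to solving su − tv = q − p and checking consistency; as a sketch, the same test can be automated with numpy's least-squares solver (a zero residual means the lines meet):

```python
import numpy as np

# Decide whether the lines p + su and q + tv intersect by solving
# su - tv = q - p in the least-squares sense and testing the residual.
def intersect(p, u, q, v):
    A = np.column_stack([u, -v])
    st, *_ = np.linalg.lstsq(A, q - p, rcond=None)
    if np.allclose(A @ st, q - p):
        return p + st[0] * u       # a point lying on both lines
    return None                    # su - tv = q - p is inconsistent

# Exercise 48: the lines meet at (0, 4, 0)
print(intersect(np.array([-1.0, 2.0, 1.0]), np.array([1.0, 2.0, -1.0]),
                np.array([2.0, 2.0, 0.0]), np.array([-1.0, 1.0, 0.0])))

# Exercise 49: the lines do not meet
print(intersect(np.array([3.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0]),
                np.array([-1.0, 1.0, -1.0]), np.array([2.0, 3.0, 1.0])))
```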
51. If x = [x1 , x2 , x3 ] satisfies u · x = v · x = 0, then the components of x satisfy the system
u1 x1 + u2 x2 + u3 x3 = 0
v1 x1 + v2 x2 + v3 x3 = 0.
There are several cases. First, if either vector is zero, then the cross product is zero, and the statement
does not hold: any vector that is orthogonal to the other vector is orthogonal to both, but is not a
multiple of the zero vector. Otherwise, both vectors are nonzero. Assume that u is the vector with a
lowest-numbered nonzero component (that is, if u1 = 0 but u2 ≠ 0, while v1 ≠ 0, then reverse u and
v). The augmented matrix of the system is

    [ u1   u2   u3 | 0 ]
    [ v1   v2   v3 | 0 ].

If u1 = u2 = 0 but u3 ≠ 0, then v1 = v2 = 0 and v3 ≠ 0 as well, so that u and v are parallel and thus
u × v = 0. However, any vector of the form

    [ s ]
    [ t ]
    [ 0 ]

is orthogonal to both u and v, so the result does not hold in this case either.
Next, if u1 = 0 but u2 ≠ 0, then v1 = 0 as well, so the augmented matrix has the form

    [ 0   u2   u3 | 0 ]  u2·R2   [ 0    u2      u3   | 0 ]  R2−v2·R1   [ 0   u2        u3       | 0 ]
    [ 0   v2   v3 | 0 ]  −−−→    [ 0   u2v2    u2v3  | 0 ]  −−−−−→     [ 0    0   u2v3 − u3v2   | 0 ].

Then x1 is a free variable. If u2v3 − u3v2 = 0, then u = [0; u2; u3] and v = [0; v2; v3] are again
parallel, so their cross product is zero and again the result does not hold. But if u2v3 − u3v2 ≠ 0,
then x3 = 0, which implies that x2 = 0 as well. So any vector that is orthogonal to both u and v in
this case is of the form

    [ x1 ]   [ s ]
    [ x2 ] = [ 0 ],
    [ x3 ]   [ 0 ]

and

    [ 0  ]   [ 0  ]   [ u2·v3 − u3·v2 ]   [ u2v3 − u3v2 ]
    [ u2 ] × [ v2 ] = [ u3·0 − 0·v3   ] = [      0      ],
    [ u3 ]   [ v3 ]   [ 0·v2 − 0·u2   ]   [      0      ]

and since u2v3 − u3v2 ≠ 0, the orthogonal vectors are multiples of the cross product.
Finally, assume that u1 ≠ 0. Then the augmented matrix reduces as follows:

    [ u1   u2   u3 | 0 ]  u1·R2   [  u1     u2     u3   | 0 ]  R2−v1·R1   [ u1       u2            u3        | 0 ]
    [ v1   v2   v3 | 0 ]  −−−→    [ u1v1   u1v2   u1v3  | 0 ]  −−−−−→     [  0   u1v2 − u2v1   u1v3 − u3v1   | 0 ].
Now consider the last row of this matrix. If it consists of all zeros, then u1v2 = u2v1 and u1v3 = u3v1.
If v1 ≠ 0, then u2/u1 = v2/v1 and also u3/u1 = v3/v1, so that again u and v are parallel and the result
does not hold. Otherwise, if v1 = 0, then u1v2 = u1v3 = 0, and u1 was assumed nonzero, so that
v2 = v3 = 0 and thus v is the zero vector, which it was assumed not to be. So at least one of
u1v2 − u2v1 and u1v3 − u3v1 is nonzero.
If they are both nonzero, then x3 = t is free, and

    (u1v2 − u2v1)x2 = (u3v1 − u1v3)t,    so that    x2 = ((u3v1 − u1v3)/(u1v2 − u2v1)) t.

But then u1x1 + u2x2 + u3x3 = 0, so that

    u1x1 + u2 ((u3v1 − u1v3)/(u1v2 − u2v1)) t + u3t = u1x1 + (u1(u3v2 − u2v3)/(u1v2 − u2v1)) t = 0,

and thus

    x1 = ((u2v3 − u3v2)/(u1v2 − u2v1)) t.
Thus the orthogonal vectors are

    [ x1 ]       [ (u2v3 − u3v2)/(u1v2 − u2v1) ]
    [ x2 ] = t   [ (u3v1 − u1v3)/(u1v2 − u2v1) ],
    [ x3 ]       [              1              ]

or, clearing fractions,

    [ x1 ]       [ u2v3 − u3v2 ]
    [ x2 ] = t   [ u3v1 − u1v3 ],
    [ x3 ]       [ u1v2 − u2v1 ]

so that orthogonal vectors are multiples of the cross product.
If u1v2 − u2v1 ≠ 0 but u1v3 − u3v1 = 0, then again v1 = 0 forces v = 0, which is impossible. So v1 ≠ 0
and thus u3/u1 = v3/v1. Now, x3 = t is free and x2 = 0, so that u1x1 + u3t = 0 and then x1 = −(u3/u1)t.
Let t = (u1v2 − u2v1)s; then

    x1 = −(u3/u1)(u1v2 − u2v1)s = −u3v2 s + (u3/u1)u2v1 s = −u3v2 s + (v3/v1)u2v1 s = (u2v3 − u3v2)s.

Thus orthogonal vectors are of the form

    [ x1 ]       [ u2v3 − u3v2 ]
    [ x2 ] = s   [      0      ],
    [ x3 ]       [ u1v2 − u2v1 ]

and since u1v3 − u3v1 = 0, this column vector is s(u × v), and again the result holds.
The final case is that u1v3 − u3v1 ≠ 0 but u1v2 − u2v1 = 0. Again v1 = 0 forces v = 0, which is
impossible, so that v1 ≠ 0 and thus u2/u1 = v2/v1. Now, x2 = t is free and x3 = 0, so that u1x1 + u2t = 0
and then x1 = −(u2/u1)t. Let t = (u1v3 − u3v1)s; then

    x1 = −(u2/u1)(u1v3 − u3v1)s = −u2v3 s + (u2/u1)u3v1 s = −u2v3 s + (v2/v1)u3v1 s = (u3v2 − u2v3)s.

Thus orthogonal vectors are of the form

    [ x1 ]       [ u3v2 − u2v3 ]
    [ x2 ] = s   [ u1v3 − u3v1 ],
    [ x3 ]       [      0      ]

and since u1v2 − u2v1 = 0, this column vector is −s(u × v), so again the orthogonal vectors are
multiples of the cross product.
Whew.
52. To show that the lines are skew, note first that they have nonparallel direction vectors, so that they
are not parallel. Suppose there is some x such that
x = p + su = q + tv.
Then su − tv = q − p. Substituting the given values into this equation gives

      [  2 ]     [  0 ]   [  0 ]   [ 1 ]   [ −1 ]        2s      = −1
    s [ −3 ] − t [  6 ] = [  1 ] − [ 1 ] = [  0 ]   ⇒   −3s − 6t =  0
      [  1 ]     [ −1 ]   [ −1 ]   [ 0 ]   [ −1 ]        s +  t = −1 .

But the first equation gives s = −1/2, so that from the second equation t = 1/4. But this pair of values
does not satisfy the third equation. So the system has no solution and thus the lines do not intersect.
Since they are not parallel, they are skew.
Next, we wish to find a pair of parallel planes, one containing each line. Since u and v are direction
vectors for the lines, those direction vectors will lie in both planes since the planes are required to be
parallel. So a normal vector is given by

            [ −3 · (−1) − 1 · 6 ]   [ −3 ]
    u × v = [  1 · 0 − 2 · (−1) ] = [  2 ].
            [  2 · 6 − (−3) · 0 ]   [ 12 ]
So the planes we are looking for will have equations of the form −3x + 2y + 12z = d. The plane that
passes through P = (1, 1, 0) must satisfy −3 · 1 + 2 · 1 + 12 · 0 = d, so that d = −1. The plane that
passes through Q = (0, 1, −1) must satisfy −3 · 0 + 2 · 1 + 12 · (−1) = d, so that d = −10. Hence the
two planes are
−3x + 2y + 12z = −1 containing p + su,
−3x + 2y + 12z = −10 containing q + tv.
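A quick numerical check of this answer: the cross product gives the common normal, and dotting it with P and Q recovers the two constants d.

```python
import numpy as np

# Direction vectors of the two lines and a point on each (Exercise 52).
u, v = np.array([2.0, -3.0, 1.0]), np.array([0.0, 6.0, -1.0])
P, Q = np.array([1.0, 1.0, 0.0]), np.array([0.0, 1.0, -1.0])

n = np.cross(u, v)   # normal to both parallel planes
print(n)             # [-3.  2. 12.]
print(n @ P)         # -1.0:  the plane -3x + 2y + 12z = -1
print(n @ Q)         # -10.0: the plane -3x + 2y + 12z = -10
```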
53. Over Z3 ,

    [ 1   2 | 1 ]  R2+2R1   [ 1   2 | 1 ]  2R2   [ 1   2 | 1 ]  R1+R2   [ 1   0 | 0 ]
    [ 1   1 | 2 ]  −−−−→    [ 0   2 | 1 ]  −−→   [ 0   1 | 2 ]  −−−→    [ 0   1 | 2 ].

Thus the solution is

    [ x ]   [ 0 ]
    [ y ] = [ 2 ].
54. Over Z2 ,

    [ 1   1   0 | 1 ]  R3+R1   [ 1   1   0 | 1 ]  R3+R2   [ 1   1   0 | 1 ]  R1+R2   [ 1   0   1 | 1 ]
    [ 0   1   1 | 0 ]  −−−→    [ 0   1   1 | 0 ]  −−−→    [ 0   1   1 | 0 ]  −−−→    [ 0   1   1 | 0 ]
    [ 1   0   1 | 1 ]          [ 0   1   1 | 0 ]          [ 0   0   0 | 0 ]          [ 0   0   0 | 0 ]

Thus z = t is a free variable, and we have x + t = 1, so that x = 1 − t = 1 + t, and also y + t = 0, so
that y = −t = t. Thus the solution is

    [ x ]   [ 1 + t ]    [ 1 ]  [ 0 ]
    [ y ] = [   t   ] ∈ {[ 0 ], [ 1 ]},
    [ z ]   [   t   ]    [ 0 ]  [ 1 ]

since the only possible values for t in Z2 are 0 and 1.
55. Over Z3 ,

    [ 1   1   0 | 1 ]  R3+2R1   [ 1   1   0 | 1 ]  R3+R2   [ 1   1   0 | 1 ]  R1+2R2   [ 1   0   2 | 1 ]
    [ 0   1   1 | 0 ]  −−−−→    [ 0   1   1 | 0 ]  −−−→    [ 0   1   1 | 0 ]  −−−−→    [ 0   1   1 | 0 ]
    [ 1   0   1 | 1 ]           [ 0   2   1 | 0 ]          [ 0   0   2 | 0 ]           [ 0   0   2 | 0 ]

    2R3   [ 1   0   2 | 1 ]  R1+R3    [ 1   0   0 | 1 ]
    −−→   [ 0   1   1 | 0 ]  R2+2R3   [ 0   1   0 | 0 ]
          [ 0   0   1 | 0 ]  −−−−→    [ 0   0   1 | 0 ]

Thus the unique solution is

    [ x ]   [ 1 ]
    [ y ] = [ 0 ].
    [ z ]   [ 0 ]
56. Over Z5 ,

    [ 3   2 | 1 ]  2R1   [ 1   4 | 2 ]  R2+4R1   [ 1   4 | 2 ]
    [ 1   4 | 1 ]  −−→   [ 1   4 | 1 ]  −−−−→    [ 0   0 | 4 ].

Since the bottom row says 0 = 4, this system is inconsistent and has no solutions.
57. Over Z7 ,

    [ 3   2 | 1 ]  5R1   [ 1   3 | 5 ]  R2+6R1   [ 1   3 | 5 ]  R1+4R2   [ 1   0 | 3 ]
    [ 1   4 | 1 ]  −−→   [ 1   4 | 1 ]  −−−−→    [ 0   1 | 3 ]  −−−−→    [ 0   1 | 3 ].

Thus the solution is

    [ x ]   [ 3 ]
    [ y ] = [ 3 ].
58. Over Z5 ,

    [ 1   0   0   4 | 1 ]  R2+4R1   [ 1   0   0   4 | 1 ]  R3+4R2   [ 1   0   0   4 | 1 ]
    [ 1   2   4   0 | 3 ]  R3+3R1   [ 0   2   4   1 | 2 ]  −−−−→    [ 0   2   4   1 | 2 ]
    [ 2   2   0   1 | 1 ]  R4+4R1   [ 0   2   0   3 | 4 ]           [ 0   0   1   2 | 2 ]
    [ 1   0   3   0 | 2 ]  −−−−→    [ 0   0   3   1 | 1 ]           [ 0   0   3   1 | 1 ]

    3R2   [ 1   0   0   4 | 1 ]  R4+2R3   [ 1   0   0   4 | 1 ]  R2+3R3   [ 1   0   0   4 | 1 ]
    −−→   [ 0   1   2   3 | 1 ]  −−−−→    [ 0   1   2   3 | 1 ]  −−−−→    [ 0   1   0   4 | 2 ]
          [ 0   0   1   2 | 2 ]           [ 0   0   1   2 | 2 ]           [ 0   0   1   2 | 2 ]
          [ 0   0   3   1 | 1 ]           [ 0   0   0   0 | 0 ]           [ 0   0   0   0 | 0 ]

Thus x4 = t is a free variable, and

    [ x1 ]   [ 1 − 4t ]   [ 1 + t  ]
    [ x2 ] = [ 2 − 4t ] = [ 2 + t  ]
    [ x3 ]   [ 2 − 2t ]   [ 2 + 3t ].
    [ x4 ]   [    t   ]   [    t   ]

Since t = 0, 1, 2, 3, or 4, there are five solutions:

    [ 1 ]  [ 2 ]  [ 3 ]  [ 4 ]      [ 0 ]
    [ 2 ]  [ 3 ]  [ 4 ]  [ 0 ]      [ 1 ]
    [ 2 ], [ 0 ], [ 3 ], [ 1 ], and [ 4 ].
    [ 0 ]  [ 1 ]  [ 2 ]  [ 3 ]      [ 4 ]
59. The Rank Theorem, which holds over Zp just as over R, says that if A is the matrix of a consistent
system, then the number of free variables is n − rank(A), where n is the number of variables. If there
is one free variable, then it can take on any of the p values in Zp ; since 1 = n − rank(A), the system
has exactly p = p1 = pn−rank(A) solutions. More generally, suppose there are k free variables. Then
k = n − rank(A). Each of the free variables can take any of p possible values, so altogether there are
pk choices for the values of the free variables. Each of these yields a different solution, so the total
number of solutions is pk = pn−rank(A) .
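The count p^(n − rank(A)) is easy to confirm by brute force. The sketch below (assuming the coefficient matrix shown in Exercises 54 and 55, i.e. the same equations read over Z2 and over Z3) row-reduces modulo a prime p to find the rank and then enumerates all p^n candidate vectors:

```python
from itertools import product

# Rank of an integer matrix over Z_p (p prime), by Gauss-Jordan mod p.
def rank_mod_p(A, p):
    A = [[x % p for x in row] for row in A]
    rank = 0
    for col in range(len(A[0])):
        piv = next((r for r in range(rank, len(A)) if A[r][col]), None)
        if piv is None:
            continue
        A[rank], A[piv] = A[piv], A[rank]
        inv = pow(A[rank][col], -1, p)        # multiplicative inverse mod p
        A[rank] = [inv * x % p for x in A[rank]]
        for r in range(len(A)):
            if r != rank and A[r][col]:
                f = A[r][col]
                A[r] = [(x - f * y) % p for x, y in zip(A[r], A[rank])]
        rank += 1
    return rank

# Count solutions of Ax = b over Z_p by trying every vector in (Z_p)^n.
def count_solutions(A, b, p):
    n = len(A[0])
    return sum(all(sum(a * x for a, x in zip(row, xs)) % p == bi % p
                   for row, bi in zip(A, b))
               for xs in product(range(p), repeat=n))

A, b = [[1, 1, 0], [0, 1, 1], [1, 0, 1]], [1, 0, 1]
for p in (2, 3):
    assert count_solutions(A, b, p) == p ** (3 - rank_mod_p(A, p))
print(count_solutions(A, b, 2), count_solutions(A, b, 3))   # 2 1
```

Over Z2 the rank is 2 and there are 2^(3−2) = 2 solutions; over Z3 the rank is 3 and the solution is unique, matching Exercises 54 and 55. (The three-argument pow with exponent −1 requires Python 3.8 or later.)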
60. The first step in row-reducing the augmented matrix of this system would be to multiply by the
appropriate number to get a 1 in the first column. However, neither 2 nor 3 has a multiplicative
inverse in Z6 , so this is impossible. So the best we can do in terms of row-reduction is
    [ 2   3 | 4 ]  R2+R1   [ 2   3 | 4 ]
    [ 4   3 | 2 ]  −−−→    [ 0   0 | 0 ]
Thus y is a free variable, and 2x + 3y = 4. To solve this equation, choose each possible value for y in
turn:
• When y = 0, y = 2, or y = 4 we get 2x = 4, so that x = 2 or x = 5.
• When y = 1 or y = 3, we get 2x + 3 = 4, or 2x = 1, which has no solution.
So altogether there are six solutions:

    [ 2 ]  [ 5 ]  [ 2 ]  [ 5 ]  [ 2 ]  [ 5 ]
    [ 0 ], [ 0 ], [ 2 ], [ 2 ], [ 4 ], [ 4 ].
Exploration: Lies My Computer Told Me
1. Solving the system exactly:

        x +           y = 0        −800x − 800y =   0        x = −800
        x + (801/800) y = 1    ⇒    800x + 801y = 800   ⇒    y =  800 .

2. Rounding the coefficient 801/800 = 1.00125 to five significant digits gives 1.0012, so we get

        x +          y = 0        −x −         y = 0        x = −833.33
        x + 1.0012 y = 1     ⇒     x + 1.0012 y = 1    ⇒    y =  833.33 .

3. At four significant digits, we get

        x +         y = 0        −x −        y = 0        x = −1000
        x + 1.001 y = 1     ⇒     x + 1.001 y = 1    ⇒    y =  1000 ,

   while at three significant digits, we get

        x +        y = 0        −x −       y = 0
        x + 1.00 y = 1     ⇒     x + 1.00 y = 1    ⇒    Inconsistent system; no solution.
4. For the equations we just looked at, the slopes of the two lines were very close (the angle between the
lines was very small). That means that changing one of the slopes even a little bit will result in a
large change in their intersection point. That is the cause of the wild changes in the computed output
values above — rounding to five significant digits is indeed the same as a small change in the slope of
the second line, and it caused a large change in the computed solution.
For the second example, note that the two slopes are

    7.083/4.552 ≈ 1.55602    and    2.693/1.731 ≈ 1.55575,

so that again the slopes are close, so we would expect the same kind of behavior.
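The behavior described above is easy to reproduce in floating-point arithmetic. A small numpy sketch (the 1.0012 entry is the five-significant-digit rounding of 801/800 used in question 2):

```python
import numpy as np

b = np.array([0.0, 1.0])
A_exact   = np.array([[1.0, 1.0], [1.0, 801.0 / 800.0]])
A_rounded = np.array([[1.0, 1.0], [1.0, 1.0012]])   # one coefficient rounded

print(np.linalg.solve(A_exact, b))     # close to [-800, 800]
print(np.linalg.solve(A_rounded, b))   # close to [-833.33, 833.33]
print(np.linalg.cond(A_exact) > 1000)  # True: the system is ill conditioned
```

A large condition number is exactly the "nearly parallel lines" phenomenon: a tiny change in one coefficient moves the intersection point a long way.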
Exploration: Partial Pivoting
1. (a) Solving 0.00021x = 1 to five significant digits gives x = 1/0.00021 ≈ 4761.9.
   (b) If only four significant digits are kept, we have 0.0002x = 1, so that x = 1/0.0002 = 5000. Thus
       the effect of an error of 0.00001 in the input is an error of 238.1 in the result.

2. (a) Start by pivoting on 0.4. Multiply the first row by 1/0.4 = 250 to get [1, 249, 250]. Now we want
       to subtract 75.3 times that row from the second row. Multiplying by 75.3 and keeping three
       significant digits gives

           75.3 · [1, 249, 250] = [75.3, 18749.7, 18825.0] ≈ [75.3, 18700, 18800].

       Then the second row becomes (again keeping only three significant digits)

           [75.3, −45.3, 30] − [75.3, 18700, 18800] = [0, −18745.3, −18770] ≈ [0, −18700, −18800].
This gives y = (−18800)/(−18700) ≈ 1.00535 ≈ 1.01. Substituting back into the modified first row,
which says x + 249y = 250, gives x = 250 − 251.49 ≈ 250 − 251 = −1.
Substituting x = y = 1 into the original system, however, yields two true equations, so this is in
fact the solution. The problem was that the large numbers in the second row, combined with the
fact that we were keeping only three significant digits, overwhelmed the information from adding
in the first row, so all that information was lost.
(b) First interchanging the rows gives the matrix

        [ 75.3   −45.3 |  30 ]
        [  0.4    99.6 | 100 ].

    Multiply the first row by 1/75.3 to get (after reducing to three significant digits) [1, −0.602, 0.398].
    Then subtract 0.4 times that row from the second row:

        [0.4, 99.6, 100] − 0.4 · [1, −0.602, 0.398] ≈ [0, 99.8, 99.8].

    This gives 99.8y = 99.8, so that y = 1; from the first equation, we have x − 0.602y = 0.398, so
    that x − 0.602 · 1 = 0.398 and thus x = 1. We get the correct answer.
3. (a) Without partial pivoting, we pivot on 0.001 to three significant digits, so we divide the first row
       by 0.001, giving [1, 995, 1000], then multiply it by 10.2 and add the result to row 2, giving the
       matrix

           [ 1     995 |  1000 ]
           [ 0   10100 | 10200 ]   ⇒   y ≈ 1.01.

       Back-substituting into the original first equation gives 0.001x = 1.00 − 1.00 = 0, so that x = 0
       (again remembering to keep only three significant digits). This error was introduced because 1.00
       and −50.0 were ignored when reducing to three significant digits in the original row-reduction,
       being overwhelmed by numbers in the range of 10000.

       Using partial pivoting, we pivot on −10.2 to three significant digits: the second row becomes
       [1, −0.098, 4.90]. Then multiply it by 0.001 and subtract it from the first row, giving
       [0, 0.995, 0.995]. Thus 0.995y = 0.995, so that to three significant digits y = 1.00.
       Back-substituting gives x = (−50.0 − 1.00)/(−10.2) = 51.0/10.2 = 5.00. Pivoting on the largest
       absolute value reduced the error in the solution.
(b) Start by pivoting on the 10.0 at upper left:

        [ 10.0   −7.00   0.00 | 7.00 ]      [ 10.0   −7.00   0.00 | 7.00 ]
        [ −3.00   2.09   6.00 | 3.91 ]  →   [  0.0   −0.01   6.00 | 6.01 ].
        [  5.00  −1.00   5.00 | 6.00 ]      [  0.0    2.50   5.00 | 2.50 ]

    If we continue, pivoting on −0.01 to three significant digits, the second row then becomes
    [0, 1, −600, −601]. Multiply it by 2.50 and subtract the result from the third row:

        [ 10.0   −7.00   0.00 | 7.00 ]      [ 10.0   −7.00   0.00 | 7.00 ]
        [  0.0   −0.01   6.00 | 6.01 ]  →   [  0.0   −0.01   6.00 | 6.01 ].
        [  0.0    2.50   5.00 | 2.50 ]      [  0.0    0      1500 | 1500 ]

    This gives z = 1.00, y = −1.00, and x = 0.00.

    If instead we reversed the second and third rows and pivoted on the 2.50, we would first divide
    that row by 2.50, giving [0, 1, 2, 1], then multiply it by 0.01 and add it to the third row:

        [ 10.0   −7.00   0.00 | 7.00 ]      [ 10.0   −7.00   0.00 | 7.00 ]      [ 10.0   −7.00   0.00 | 7.00 ]
        [  0.0    2.50   5.00 | 2.50 ]  →   [  0.0    1      2    | 1    ]  →   [  0.0    1      2    | 1    ].
        [  0.0   −0.01   6.00 | 6.01 ]      [  0.0   −0.01   6.00 | 6.01 ]      [  0.0    0      6.02 | 6.02 ]

    This also gives z = 1.00, y = −1.00, and x = 0.00.
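The whole experiment of question 2 can be simulated by rounding every intermediate result to three significant digits. The sketch below uses a helper sig3 for the rounding; the intermediate values differ slightly from the hand computation above, but the conclusion is the same: without the row interchange the computed x lands far from the true value 1, and with it the method returns (1, 1).

```python
import numpy as np

def sig3(x):
    # round x to three significant digits
    return float(np.format_float_positional(x, precision=3, unique=False,
                                            fractional=False))

def solve2(M):
    # Gaussian elimination on a 2x3 augmented matrix, rounding each
    # product, difference, and quotient to three significant digits.
    M = [row[:] for row in M]
    m = sig3(M[1][0] / M[0][0])
    M[1] = [sig3(b - sig3(m * a)) for a, b in zip(M[0], M[1])]
    y = sig3(M[1][2] / M[1][1])
    x = sig3(sig3(M[0][2] - sig3(M[0][1] * y)) / M[0][0])
    return x, y

no_pivot = solve2([[0.4, 99.6, 100.0], [75.3, -45.3, 30.0]])
pivot    = solve2([[75.3, -45.3, 30.0], [0.4, 99.6, 100.0]])
print(no_pivot)   # far from the true solution (1, 1)
print(pivot)      # (1.0, 1.0)
```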
Exploration: An Introduction to the Analysis of Algorithms
1. We count the number of operations, one step at a time.

    [  2   4    6 |  8 ]  (1/2)R1   [  1   2    3 |  4 ]
    [  3   9    6 | 12 ]  −−−−→     [  3   9    6 | 12 ]
    [ −1   1   −1 |  1 ]            [ −1   1   −1 |  1 ]
There were three operations here: (1/2) · 4 = 2, (1/2) · 6 = 3, and (1/2) · 8 = 4. We don't count
(1/2) · 2 = 1 since we don't actually perform that operation — once we chose 2, we knew we were going
to turn it into 1.

    [  1   2    3 |  4 ]  R2−3R1   [  1   2    3 | 4 ]
    [  3   9    6 | 12 ]  −−−−→    [  0   3   −3 | 0 ]
    [ −1   1   −1 |  1 ]           [ −1   1   −1 | 1 ]
There were three operations here: 3 · 2 = 6, 3 · 3 = 9, and 3 · 4 = 12, for a total of six operations so far.
Note that we are counting only multiplications and divisions, and that again we don't count 3 · 1 = 3
since we didn't actually have to perform it — we just replace the 3 in the second row by a 0.

    [  1   2    3 | 4 ]  R3+R1   [ 1   2    3 | 4 ]
    [  0   3   −3 | 0 ]  −−−→    [ 0   3   −3 | 0 ]
    [ −1   1   −1 | 1 ]          [ 0   3    2 | 5 ]
There were 3 operations here: 1 · 2 = 2, 1 · 3 = 3, and 1 · 4 = 4, for a total of nine operations so far.

    [ 1   2    3 | 4 ]  (1/3)R2   [ 1   2    3 | 4 ]
    [ 0   3   −3 | 0 ]  −−−−→     [ 0   1   −1 | 0 ]
    [ 0   3    2 | 5 ]            [ 0   3    2 | 5 ]
In this step, there were 2 operations: (1/3) · (−3) = −1 and (1/3) · 0 = 0, for a total of eleven so far.

    [ 1   2    3 | 4 ]  R3−3R2   [ 1   2    3 | 4 ]
    [ 0   1   −1 | 0 ]  −−−−→    [ 0   1   −1 | 0 ]
    [ 0   3    2 | 5 ]           [ 0   0    5 | 5 ]
In this step, there were 2 operations: −3 · (−1) = 3 and −3 · 0 = 0, for a total of 13 so far.

    [ 1   2    3 | 4 ]  (1/5)R3   [ 1   2    3 | 4 ]
    [ 0   1   −1 | 0 ]  −−−−→     [ 0   1   −1 | 0 ]
    [ 0   0    5 | 5 ]            [ 0   0    1 | 1 ]
In this step, there was one operation, (1/5) · 5 = 1, for a total of 14.
Finally, to complete the back-substitution, we need three more operations: −1 · x3 to find x2 , and 2 · x2
and 3 · x3 to find x1 . So in total 17 operations were required.
2. To compute the reduced row echelon form, we first compute the row echelon form above, which took
14 operations. Then:

    [ 1   2    3 | 4 ]  R1−2R2   [ 1   0    5 | 4 ]
    [ 0   1   −1 | 0 ]  −−−−→    [ 0   1   −1 | 0 ]
    [ 0   0    1 | 1 ]           [ 0   0    1 | 1 ]
This required two operations, −2 · (−1) and −2 · 0, for a total of 16. Finally,

    [ 1   0    5 | 4 ]  R1−5R3   [ 1   0   0 | −1 ]
    [ 0   1   −1 | 0 ]  R2+R3    [ 0   1   0 |  1 ]
    [ 0   0    1 | 1 ]  −−−→     [ 0   0   1 |  1 ]
This required two operations, both in the constant column: 1 · 1 and −5 · 1, for a total of 18. So for this
example, Gauss-Jordan required one more operation than Gaussian elimination. We would probably
need more data to conclude that it is generally less efficient.
3. (a) There are n operations required to create the first leading 1 because we have to divide every entry
in the first row, except a11 , by a11 (we don't have to divide a11 by itself because the 1 results
from the choice of pivot). Next, there are n operations required to create the first zero in column
1, because we have to multiply every entry in the first row, except the 1 in column one, by a21 .
Again, we don’t have to multiply the 1 by itself, since we know we are simply replacing a21 by
zero. Similarly, there are n operations required to create each zero in column 1. Since there are
n − 1 rows excluding the first row, introducing the 1 in row one and the zeros below it requires
n + (n − 1)n = n2 operations.
(b) We can think of the resulting matrix, below and to the right of a22 , as an (n − 1) × n augmented
matrix. Applying the process in part (a) to this matrix, we see that turning a22 into a 1 and
creating zeros below it requires (n − 1) + (n − 2)(n − 1) = (n − 1)2 operations. Continuing to the
matrix starting at row 3, column 3; this requires (n − 2)2 operations. So altogether, to reduce the
matrix to row echelon form requires n2 + (n − 1)2 + · · · + 22 + 12 operations.
(c) To find xn−1 requires one operation: we must multiply xn by the entry an−1,n . Then there are
two operations required to find xn−2 , and so forth, until we require n − 1 operations to find x1 .
So the total number required for the back-substitution is 1 + 2 + · · · + (n − 1).
(d) From Appendix B, we see that

    1² + 2² + · · · + k² = k(k + 1)(2k + 1)/6,        1 + 2 + · · · + k = k(k + 1)/2.

Thus the total number of operations required is

    (1² + 2² + · · · + n²) + (1 + 2 + · · · + (n − 1)) = n(n + 1)(2n + 1)/6 + n(n − 1)/2
                                                       = (1/3)n³ + (1/2)n² + (1/6)n + (1/2)n² − (1/2)n
                                                       = (1/3)n³ + n² − (1/3)n.

For large values of n, the cubic term dominates, so the total number of operations required is
approximately (1/3)n³.
4. As we saw in Exercise 2 in this section, we get the same n2 + (n − 1)2 + · · · + 22 + 12 operations to get
the matrix into row echelon form. Then to create the 1 zero above row 2 requires 1(n − 1) operations,
since each of the remaining n − 1 entries in row 2 must be multiplied. To create the 2 zeros above row
3 requires 2(n − 2) operations, since each of the remaining entries in row 3 must be multiplied twice.
Continuing, we see that the total number of operations is
(n − 1) + 2(n − 2) + · · · + (n − 1)(n − (n − 1))
    = (1 · n + 2 · n + · · · + (n − 1) · n) − (1² + 2² + · · · + (n − 1)²)
    = n(1 + 2 + · · · + (n − 1)) − (1² + 2² + · · · + (n − 1)²).
Adding that to the operations required to get the matrix into row echelon form, and using the formulas
from Appendix B, we get for the total number of operations

    n² + (n − 1)² + · · · + 2² + 1² + n(1 + 2 + · · · + (n − 1)) − (1² + 2² + · · · + (n − 1)²)
        = n² + n · n(n − 1)/2
        = n² + (1/2)n³ − (1/2)n²
        = (1/2)n³ + (1/2)n².

Again, for large values of n, the cubic term dominates, so the total number of operations required is
approximately (1/2)n³.
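The operation counts above are easy to tabulate programmatically. The sketch below recomputes them for several n and checks the two closed forms; note that n = 3 reproduces the counts 17 and 18 found in questions 1 and 2 of this exploration.

```python
# Multiplications/divisions for Gaussian elimination with back-substitution,
# and for Gauss-Jordan elimination, counted as in the derivations above.
def ops_gauss(n):
    ref  = sum(k * k for k in range(1, n + 1))   # n^2 + (n-1)^2 + ... + 1^2
    back = sum(range(1, n))                      # 1 + 2 + ... + (n-1)
    return ref + back

def ops_gauss_jordan(n):
    ref   = sum(k * k for k in range(1, n + 1))
    above = sum(k * (n - k) for k in range(1, n))  # clearing above the pivots
    return ref + above

for n in (3, 10, 100):
    assert ops_gauss(n) == (n**3 + 3 * n**2 - n) // 3   # n^3/3 + n^2 - n/3
    assert ops_gauss_jordan(n) == (n**3 + n**2) // 2    # n^3/2 + n^2/2
print(ops_gauss(3), ops_gauss_jordan(3))   # 17 18
```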
2.3
Spanning Sets and Linear Independence
1. Yes, it is. We want to write

    x [  1 ] + y [  2 ] = [ 1 ],
      [ −1 ]     [ −1 ]   [ 2 ]

so we want to solve the system of linear equations

     x + 2y = 1
    −x −  y = 2.

Row-reducing the augmented matrix for the system gives

    [  1    2 | 1 ]  R2+R1   [ 1   2 | 1 ]  R1−2R2   [ 1   0 | −5 ]
    [ −1   −1 | 2 ]  −−−→    [ 0   1 | 3 ]  −−−−→    [ 0   1 |  3 ].

So the solution is x = −5 and y = 3, and the linear combination is v = −5u1 + 3u2 .
2. No, it is not. We want to write

    x [  4 ] + y [ −2 ] = [ 2 ],
      [ −2 ]     [  1 ]   [ 1 ]

so we want to solve the system of linear equations

     4x − 2y = 2
    −2x +  y = 1.

Row-reducing the augmented matrix for the system gives

    [  4   −2 | 2 ]  (1/4)R1   [  1   −1/2 | 1/2 ]  R2+2R1   [ 1   −1/2 | 1/2 ]
    [ −2    1 | 1 ]  −−−−→     [ −2    1   |  1  ]  −−−−→    [ 0    0   |  2  ].

So the system is inconsistent, and the answer follows from Theorem 2.4.
3. No, it is not. We want to write

    x [ 1 ] + y [ 0 ] = [ 1 ],
      [ 1 ]     [ 1 ]   [ 2 ]
      [ 0 ]     [ 1 ]   [ 3 ]
so we want to solve the system of linear equations
x
=1
x+y =2
y = 3.
Rather than constructing the matrix and reducing it, note that the first and third equations force x = 1
and y = 3, and that these values of x and y do not satisfy the second equation. Thus the system is
inconsistent, so that v is not a linear combination of u1 and u2 .
4. Yes, it is. We want to write

    x [ 1 ] + y [ 0 ] = [  3 ],
      [ 1 ]     [ 1 ]   [  2 ]
      [ 0 ]     [ 1 ]   [ −1 ]
so we want to solve the system of linear equations
x
=
3
x+y =
2
y = −1
Rather than constructing the matrix and reducing it, note that the first and third equations force x = 3
and y = −1, and that these values of x and y also satisfy the second equation. Thus v = 3u1 − u2 .
5. Yes, it is. We want to write

    x [ 1 ] + y [ 0 ] + z [ 1 ] = [ 1 ],
      [ 1 ]     [ 1 ]     [ 0 ]   [ 2 ]
      [ 0 ]     [ 1 ]     [ 1 ]   [ 3 ]

so we want to solve the system of linear equations

    x     + z = 1
    x + y     = 2
        y + z = 3.

Row-reducing the associated augmented matrix gives

    [ 1   0   1 | 1 ]  R2−R1   [ 1   0    1 | 1 ]  R3−R2   [ 1   0    1 | 1 ]  (1/2)R3
    [ 1   1   0 | 2 ]  −−−→    [ 0   1   −1 | 1 ]  −−−→    [ 0   1   −1 | 1 ]  −−−−→
    [ 0   1   1 | 3 ]          [ 0   1    1 | 3 ]          [ 0   0    2 | 2 ]

    [ 1   0    1 | 1 ]  R1−R3   [ 1   0   0 | 0 ]
    [ 0   1   −1 | 1 ]  R2+R3   [ 0   1   0 | 2 ]
    [ 0   0    1 | 1 ]  −−−→    [ 0   0   1 | 1 ].

This gives the solution x = 0, y = 2, and z = 1, so that v = 2u2 + u3 .
6. Yes, it is. The associated linear system is
1.0x + 3.4y − 1.2z =
3.2
0.4x + 1.4y + 0.2z =
2.0
4.8x − 6.4y − 1.0z = −2.6
Solving this system with a CAS gives x = 1.0, y = 1.0, z = 1.0, so that v = u1 + u2 + u3 .
7. Yes, it is. b is in the span of the columns of A if and only if the system Ax = b has a solution, which
is to say if and only if the linear system
x + 2y = 5
3x + 4y = 6
has a solution. Row-reducing the augmented matrix for the system gives

    [ 1   2 | 5 ]  R2−3R1   [ 1    2 |  5 ]  (−1/2)R2   [ 1   2 |  5  ]  R1−2R2   [ 1   0 |  −4 ]
    [ 3   4 | 6 ]  −−−−→    [ 0   −2 | −9 ]  −−−−−→     [ 0   1 | 9/2 ]  −−−−→    [ 0   1 | 9/2 ].

Thus x = −4 and y = 9/2, so that

    A [ −4  ] = [ 5 ].
      [ 9/2 ]   [ 6 ]
8. Yes, it is. b is in the span of the columns of A if and only if the system Ax = b has a solution, which
is to say if and only if the linear system

     x +  2y +  3z =  4
    5x +  6y +  7z =  8
    9x + 10y + 11z = 12

has a solution. Row-reducing the augmented matrix for the system gives

    [ 1    2    3 |  4 ]  R2−5R1   [ 1    2     3 |   4 ]  (−1/4)R2   [ 1    2    3 |   4 ]
    [ 5    6    7 |  8 ]  R3−9R1   [ 0   −4    −8 | −12 ]  −−−−−→     [ 0    1    2 |   3 ]
    [ 9   10   11 | 12 ]  −−−−→    [ 0   −8   −16 | −24 ]             [ 0   −8  −16 | −24 ]

    R3+8R2   [ 1   2   3 | 4 ]
    −−−−→    [ 0   1   2 | 3 ].
             [ 0   0   0 | 0 ]

This system has an infinite number of solutions, so that b is indeed in the span of the columns of A.
9. We must show that for any vector [ a; b ], we can write

    x [ 1 ] + y [  1 ] = [ a ]
      [ 1 ]     [ −1 ]   [ b ]

for some x, y. Row-reduce the associated augmented matrix:

    [ 1    1 | a ]  R2−R1   [ 1    1 |   a   ]  (−1/2)R2   [ 1   1 |    a     ]  R1−R2   [ 1   0 | (a + b)/2 ]
    [ 1   −1 | b ]  −−−→    [ 0   −2 | b − a ]  −−−−−→     [ 0   1 | (a − b)/2 ]  −−−→   [ 0   1 | (a − b)/2 ].

So given a and b, we have

    (a + b)/2 · [ 1 ] + (a − b)/2 · [  1 ] = [ a ].
                [ 1 ]               [ −1 ]   [ b ]
10. We want to show that for any vector [ a; b ], we can write

    x [  3 ] + y [ 0 ] = [ a ]
      [ −2 ]     [ 1 ]   [ b ]

for some x, y. Row-reduce the associated augmented matrix:

    [  3   0 | a ]  (1/3)R1   [  1   0 | a/3 ]  R2+2R1   [ 1   0 |    a/3    ]
    [ −2   1 | b ]  −−−−→     [ −2   1 |  b  ]  −−−−→    [ 0   1 | b + 2a/3  ],

which shows that the vector can be written

    [ a ] = (a/3) · [  3 ] + (b + 2a/3) · [ 0 ].
    [ b ]           [ −2 ]                [ 1 ]
11. We want to show that any vector can be written as a linear combination of the three given vectors, i.e.
that

    [ a ]     [ 1 ]     [ 1 ]     [ 0 ]
    [ b ] = x [ 0 ] + y [ 1 ] + z [ 1 ]
    [ c ]     [ 1 ]     [ 0 ]     [ 1 ]

for some x, y, z. Row-reduce the associated augmented matrix:

    [ 1   1   0 | a ]  R3−R1   [ 1    1   0 |   a   ]  R3+R2   [ 1   1   0 |     a     ]  (1/2)R3
    [ 0   1   1 | b ]  −−−→    [ 0    1   1 |   b   ]  −−−→    [ 0   1   1 |     b     ]  −−−−→
    [ 1   0   1 | c ]          [ 0   −1   1 | c − a ]          [ 0   0   2 | b + c − a ]

    [ 1   1   0 |       a       ]  R2−R3   [ 1   1   0 |       a       ]  R1−R2   [ 1   0   0 | (a − b + c)/2 ]
    [ 0   1   1 |       b       ]  −−−→    [ 0   1   0 | (a + b − c)/2 ]  −−−→    [ 0   1   0 | (a + b − c)/2 ]
    [ 0   0   1 | (b + c − a)/2 ]          [ 0   0   1 | (b + c − a)/2 ]          [ 0   0   1 | (b + c − a)/2 ].

Thus

    [ a ]                     [ 1 ]                     [ 1 ]                      [ 0 ]
    [ b ] = (1/2)(a − b + c) · [ 0 ] + (1/2)(a + b − c) · [ 1 ] + (1/2)(−a + b + c) · [ 1 ].
    [ c ]                     [ 1 ]                     [ 0 ]                      [ 1 ]
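A spot check of the coefficient formulas of Exercises 9 and 11 on a few randomly chosen targets:

```python
import numpy as np

rng = np.random.default_rng(1)
for a, b, c in rng.standard_normal((5, 3)):
    # Exercise 9: (a+b)/2 [1,1] + (a-b)/2 [1,-1] = [a,b]
    v = (a + b) / 2 * np.array([1.0, 1.0]) + (a - b) / 2 * np.array([1.0, -1.0])
    assert np.allclose(v, [a, b])
    # Exercise 11: the three coefficients found above
    w = ((a - b + c) / 2 * np.array([1.0, 0.0, 1.0])
         + (a + b - c) / 2 * np.array([1.0, 1.0, 0.0])
         + (-a + b + c) / 2 * np.array([0.0, 1.0, 1.0]))
    assert np.allclose(w, [a, b, c])
print("coefficient formulas check out")
```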
12. We want to show that any vector can be written as a linear combination of the three given vectors, i.e.
that

    [ a ]     [ 1 ]     [ 1 ]     [  2 ]
    [ b ] = x [ 1 ] + y [ 2 ] + z [  1 ]
    [ c ]     [ 0 ]     [ 3 ]     [ −1 ]

for some x, y, z. Row-reduce the associated augmented matrix:

    [ 1   1    2 | a ]  R2−R1   [ 1   1    2 |   a   ]  R3−3R2   [ 1   1    2 |       a        ]  (1/2)R3
    [ 1   2    1 | b ]  −−−→    [ 0   1   −1 | b − a ]  −−−−→    [ 0   1   −1 |     b − a      ]  −−−−→
    [ 0   3   −1 | c ]          [ 0   3   −1 |   c   ]           [ 0   0    2 |  c + 3a − 3b   ]

    [ 1   1    2 |        a         ]  R1−2R3   [ 1   1   0 |   3b − 2a − c    ]  R1−R2   [ 1   0   0 | (7b − 5a − 3c)/2 ]
    [ 0   1   −1 |      b − a       ]  R2+R3    [ 0   1   0 |  (c + a − b)/2   ]  −−−→    [ 0   1   0 |  (c + a − b)/2   ]
    [ 0   0    1 | (c + 3a − 3b)/2  ]  −−−→     [ 0   0   1 | (c + 3a − 3b)/2  ]          [ 0   0   1 | (c + 3a − 3b)/2  ].

Thus

    [ a ]                        [ 1 ]                      [ 1 ]                         [  2 ]
    [ b ] = (1/2)(7b − 5a − 3c) · [ 1 ] + (1/2)(c + a − b) · [ 2 ] + (1/2)(c + 3a − 3b) ·  [  1 ].
    [ c ]                        [ 0 ]                      [ 3 ]                         [ −1 ]
13. (a) Since

    [  2 ]      [ −1 ]
    [ −4 ] = −2 [  2 ],

these vectors are parallel, so their span is the line through the origin with direction vector [ −1; 2 ].

(b) The vector equation of this line is

    [ x ]     [ −1 ]
    [ y ] = t [  2 ].

Note that this is just another way of saying that the line is the span of the given vector. To derive
the general equation of the line, expand the above, giving the parametric equations x = −t and
y = 2t. Substitute −x for t in the second equation, giving y = −2x, or 2x + y = 0.
14. (a) Since the span of [ 0; 0 ] is just the origin, and the origin is in the span of [ 3; 4 ], the span of
these two vectors is just the span of [ 3; 4 ], which is the line through the origin with direction
vector [ 3; 4 ].

(b) The vector equation of this line is

    [ x ]     [ 3 ]
    [ y ] = t [ 4 ].

Note that this is just another way of saying that the line is the span of the given vector. To derive
the general equation of the line, expand the above, giving the parametric equations x = 3t and
y = 4t. Substitute (1/3)x for t in the second equation, giving y = (4/3)x. Clearing fractions gives
3y = 4x, or 4x − 3y = 0.
15. (a) Since the given vectors are not parallel, their span is a plane through the origin containing these
two vectors as direction vectors.
(b) The vector equation of the plane is

    [ x ]     [ 1 ]     [  3 ]
    [ y ] = s [ 2 ] + t [  2 ].
    [ z ]     [ 0 ]     [ −1 ]

One way of deriving the general equation of the plane is to solve the system of equations arising
from the above vector equation. Row-reducing the corresponding augmented system gives

    [ 1    3 | x ]  R2−2R1   [ 1    3 |    x   ]  R3−(1/4)R2   [ 1    3 |        x        ]
    [ 2    2 | y ]  −−−−→    [ 0   −4 | y − 2x ]  −−−−−→       [ 0   −4 |     y − 2x      ].
    [ 0   −1 | z ]           [ 0   −1 |    z   ]               [ 0    0 | z + (2x − y)/4  ]

Since we know the system is consistent, the bottom row must consist entirely of zeros. Thus
the points on the plane must satisfy z + (1/4)(2x − y) = 0, or 4z + 2x − y = 0. This is the plane
2x − y + 4z = 0.

An alternative method is to find a normal vector to the plane by taking the cross product of the
two given vectors:

    [  3 ]   [ 1 ]   [ 2 · 0 − (−1) · 2 ]   [  2 ]
    [  2 ] × [ 2 ] = [ (−1) · 1 − 3 · 0 ] = [ −1 ].
    [ −1 ]   [ 0 ]   [ 3 · 2 − 2 · 1    ]   [  4 ]

So the equation of the plane is 2x − y + 4z = d; since it passes through the origin, we again get
2x − y + 4z = 0.
16. (a) First, note that


   
0
1
−1
−1 = −  0 −  1 ,
1
−1
0
so that the third vector is in the span of the first two. So we need onlyconsider
the

 first
 two
1
−1
vectors. Their span is the plane through the origin with direction vectors  0 and  1.
−1
0
(b) The vector equation of this plane is
 
   
1
−1
x
s  0 + t  1 = y  .
−1
0
z
To find the general equation of the plane, we can solve the corresponding system of equations for
s and t. Row-reducing the augmented matrix gives






1 −1
x
1 −1
x
1 −1 x






1 y
y 
y
0 1
0 1

0
R3 +R1
R3 +R2
−1 0 z −−−
0
−1
x
+
z
0
0
x
+
y
+
z
−−→
−−−−−→
84
CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS
Since we know the system is consistent, the bottom row must consist of all zeros, so that points
on the plane satisfy the equation x + y + z = 0.
Alternatively, we could find a normal vector to the plane by taking the cross product of the
direction vectors:
\[
\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \times \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \cdot 0 - (-1) \cdot 1 \\ (-1) \cdot (-1) - 1 \cdot 0 \\ 1 \cdot 1 - 0 \cdot (-1) \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.
\]
Thus the equation of the plane is x + y + z = d; since it passes through the origin, we again get
x + y + z = 0.
17. Substituting the two points (1, 0, 3) and (−1, 1, −3) into the equation for the plane gives a linear system
in a, b, and c:
a + 3c = 0
−a + b − 3c = 0.
Row-reducing the associated augmented matrix gives
\[
\left[\begin{array}{ccc|c} 1 & 0 & 3 & 0 \\ -1 & 1 & -3 & 0 \end{array}\right]
\xrightarrow{R_2 + R_1}
\left[\begin{array}{ccc|c} 1 & 0 & 3 & 0 \\ 0 & 1 & 0 & 0 \end{array}\right],
so that b = 0 and a = −3c. Thus the equation of the plane is −3cx + cz = 0. Clearly we must have
c ≠ 0, since otherwise we do not get a plane. So dividing through by c gives −3x + z = 0, or 3x − z = 0,
for the general equation of the plane.
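The same computation can be done symbolically (a sketch, assuming SymPy is available); `linsolve` solves the homogeneous system for a, b, c:

```python
from sympy import symbols, linsolve

a, b, c = symbols('a b c')

# The plane ax + by + cz = 0 must contain (1, 0, 3) and (-1, 1, -3).
sol, = linsolve([a + 3*c, -a + b - 3*c], (a, b, c))
print(sol)  # (-3*c, 0, c); taking c = 1 gives -3x + z = 0
```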
18. Note that
u = 1u + 0v + 0w,
v = 0u + 1v + 0w,
w = 0u + 0v + 1w.
Or, if one adopts the usual convention of not writing terms with coefficients of zero, u is in the span
because u = u.
19. Note that
u = u,
v = (u + v) − u,
w = (u + v + w) − (u + v),
and the terms in parentheses on the right, together with u, are the elements of the given set.
20. (a) To prove the statement, we must show that anything that can be written as a linear combination
of the elements of S can also be written as a linear combination of the elements of T . But if x is
a linear combination of elements of S, then
x = a1 u1 + a2 u2 + · · · + ak uk
= a1 u1 + a2 u2 + · · · + ak uk + 0uk+1 + 0uk+2 + · · · + 0um .
This shows that x is also a linear combination of elements of T , proving the result.
(b) We know that span(S) ⊆ span(T ) from part (a). Since T is a set of vectors in Rn , we know that
span(T ) ⊆ Rn . But then
Rn = span(S) ⊆ span(T ) ⊆ Rn ,
so that span(T ) = Rn .
2.3. SPANNING SETS AND LINEAR INDEPENDENCE
85
21. (a) Suppose that
ui = ui1 v1 + ui2 v2 + · · · + uim vm ,
i = 1, 2, . . . , k.
Suppose that w is a linear combination of the ui . Then
w = w1 u1 + w2 u2 + · · · + wk uk
= w1 (u11 v1 + u12 v2 + · · · + u1m vm ) + w2 (u21 v1 + u22 v2 + · · · + u2m vm ) + · · ·
+ wk (uk1 v1 + uk2 v2 + · · · + ukm vm )
= (w1 u11 + w2 u21 + · · · + wk uk1 )v1 + (w1 u12 + w2 u22 + · · · + wk uk2 )v2 + · · ·
+ (w1 u1m + w2 u2m + · · · + wk ukm )vm
= w1′ v1 + w2′ v2 + · · · + wm′ vm .
This shows that w is a linear combination of the vj . Thus span(u1 , . . . , uk ) ⊆ span(v1 , . . . , vm ).
(b) Reversing the roles of the u’s and the v’s in part (a) gives span(v1 , . . . , vk ) ⊆ span(u1 , . . . , uk ).
Combining that with part (a) shows that each of the spans is contained in the other, so they are
equal.
(c) Let
\[
u_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
u_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad
u_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.
\]
We want to show that each ui is a linear combination of e1 , e2 , and e3 , and the reverse. The
first of these is obvious: since e1 , e2 , and e3 span R3 , any vector in R3 is a linear combination of
those, so in particular u1 , u2 , and u3 are. Next, note that
e1 = u1
e2 = u2 − u1
e3 = u3 − u2 ,
so that the ej are linear combinations of the ui . Part (b) then implies that span(u1 , u2 , u3 ) =
span(e1 , e2 , e3 ) = R3 .
 
 
22. The vectors $\begin{bmatrix} -1 \\ -1 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}$ are not scalar multiples of one another, so they are linearly independent.
23. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that
\[
c_1\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + c_3\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 1 & 2 & -1 & 0 \\ 1 & 3 & 2 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 - R_1 \\ R_3 - R_1}]{}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 2 & 1 & 0 \end{array}\right]
\xrightarrow{R_3 - 2R_2}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 5 & 0 \end{array}\right]
\xrightarrow{\frac15 R_3}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right]
\xrightarrow[\substack{R_1 - R_3 \\ R_2 + 2R_3}]{}
\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right]
\xrightarrow{R_1 - R_2}
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right].
\]
Thus the system has only the solution c1 = c2 = c3 = 0, so the vectors are linearly independent.
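The independence conclusion in Exercise 23 can be double-checked numerically (a sketch, assuming NumPy): the three vectors are independent exactly when the matrix having them as columns has full column rank.

```python
import numpy as np

# Columns are the three vectors from Exercise 23.
A = np.column_stack(([1, 1, 1], [1, 2, 3], [1, -1, 2]))

# Full column rank (3) means Ac = 0 has only the trivial solution,
# i.e. the vectors are linearly independent.
print(np.linalg.matrix_rank(A))  # 3
```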
24. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that
\[
c_1\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} + c_3\begin{bmatrix} 1 \\ -5 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{ccc|c} 2 & 3 & 1 & 0 \\ 2 & 1 & -5 & 0 \\ 1 & 2 & 2 & 0 \end{array}\right]
\xrightarrow{R_1 \leftrightarrow R_3}
\left[\begin{array}{ccc|c} 1 & 2 & 2 & 0 \\ 2 & 1 & -5 & 0 \\ 2 & 3 & 1 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 - 2R_1 \\ R_3 - 2R_1}]{}
\left[\begin{array}{ccc|c} 1 & 2 & 2 & 0 \\ 0 & -3 & -9 & 0 \\ 0 & -1 & -3 & 0 \end{array}\right]
\xrightarrow{-\frac13 R_2}
\left[\begin{array}{ccc|c} 1 & 2 & 2 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & -1 & -3 & 0 \end{array}\right]
\xrightarrow{R_3 + R_2}
\left[\begin{array}{ccc|c} 1 & 2 & 2 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\xrightarrow{R_1 - 2R_2}
\left[\begin{array}{ccc|c} 1 & 0 & -4 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus the system has a nontrivial solution, and the vectors are linearly dependent. From the reduced
matrix, the general solution is c1 = 4t, c2 = −3t, c3 = t. Setting t = 1, for example, we get the relation
\[
4\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} - 3\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 1 \\ -5 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
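The same dependence relation can be recovered exactly with SymPy (a sketch; `nullspace` returns a basis for the solutions of Ac = 0, where the columns of A are the given vectors):

```python
from sympy import Matrix

# Columns are the three vectors from Exercise 24.
A = Matrix([[2, 3, 1],
            [2, 1, -5],
            [1, 2, 2]])

# A nonzero null-space vector gives the coefficients of a dependence relation.
(k,) = A.nullspace()
k = k / k[-1]          # scale so that c3 = 1
print(k.T)             # Matrix([[4, -3, 1]]): 4*v1 - 3*v2 + v3 = 0
```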
25. Since by inspection
\[
v_1 - v_2 - v_3 = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} - \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]
the vectors are linearly dependent.
26. Since we are given 4 vectors in R3 , we know they are linearly dependent vectors by Theorem 2.8. To
find a dependence relation, we want to find scalars c1 , c2 , c3 , c4 such that
\[
c_1\begin{bmatrix} -2 \\ 3 \\ 7 \end{bmatrix} + c_2\begin{bmatrix} 4 \\ -1 \\ 5 \end{bmatrix} + c_3\begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} + c_4\begin{bmatrix} 5 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{cccc|c} -2 & 4 & 3 & 5 & 0 \\ 3 & -1 & 1 & 0 & 0 \\ 7 & 5 & 3 & 2 & 0 \end{array}\right]
\xrightarrow{-\frac12 R_1}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 3 & -1 & 1 & 0 & 0 \\ 7 & 5 & 3 & 2 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 - 3R_1 \\ R_3 - 7R_1}]{}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 0 & 5 & \frac{11}{2} & \frac{15}{2} & 0 \\ 0 & 19 & \frac{27}{2} & \frac{39}{2} & 0 \end{array}\right]
\]
\[
\xrightarrow{\frac15 R_2}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 0 & 1 & \frac{11}{10} & \frac32 & 0 \\ 0 & 19 & \frac{27}{2} & \frac{39}{2} & 0 \end{array}\right]
\xrightarrow{R_3 - 19R_2}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 0 & 1 & \frac{11}{10} & \frac32 & 0 \\ 0 & 0 & -\frac{37}{5} & -9 & 0 \end{array}\right]
\xrightarrow{-\frac{5}{37} R_3}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 0 & 1 & \frac{11}{10} & \frac32 & 0 \\ 0 & 0 & 1 & \frac{45}{37} & 0 \end{array}\right]
\]
\[
\xrightarrow[\substack{R_1 + \frac32 R_3 \\ R_2 - \frac{11}{10} R_3}]{}
\left[\begin{array}{cccc|c} 1 & -2 & 0 & -\frac{25}{37} & 0 \\ 0 & 1 & 0 & \frac{6}{37} & 0 \\ 0 & 0 & 1 & \frac{45}{37} & 0 \end{array}\right]
\xrightarrow{R_1 + 2R_2}
\left[\begin{array}{cccc|c} 1 & 0 & 0 & -\frac{13}{37} & 0 \\ 0 & 1 & 0 & \frac{6}{37} & 0 \\ 0 & 0 & 1 & \frac{45}{37} & 0 \end{array}\right].
\]
Thus the system has a nontrivial solution, and the vectors are linearly dependent. From the reduced
matrix, the general solution is c1 = (13/37)t, c2 = −(6/37)t, c3 = −(45/37)t, c4 = t. Setting t = 37, for
example, we get the relation
\[
13\begin{bmatrix} -2 \\ 3 \\ 7 \end{bmatrix} - 6\begin{bmatrix} 4 \\ -1 \\ 5 \end{bmatrix} - 45\begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} + 37\begin{bmatrix} 5 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
 
27. Since v3 = 0 is the zero vector, these vectors are linearly dependent. For example,
\[
0 \cdot \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} + 0 \cdot \begin{bmatrix} 6 \\ 7 \\ 8 \end{bmatrix} + 137\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Any set of vectors containing the zero vector is linearly dependent.
28. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that
\[
c_1\begin{bmatrix} -1 \\ 1 \\ 2 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 3 \\ 2 \\ 2 \\ 4 \end{bmatrix} + c_3\begin{bmatrix} 2 \\ 3 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{ccc|c} -1 & 3 & 2 & 0 \\ 1 & 2 & 3 & 0 \\ 2 & 2 & 1 & 0 \\ 1 & 4 & -1 & 0 \end{array}\right]
\xrightarrow{-R_1}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 1 & 2 & 3 & 0 \\ 2 & 2 & 1 & 0 \\ 1 & 4 & -1 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 - R_1 \\ R_3 - 2R_1 \\ R_4 - R_1}]{}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 0 & 5 & 5 & 0 \\ 0 & 8 & 5 & 0 \\ 0 & 7 & 1 & 0 \end{array}\right]
\xrightarrow{\frac15 R_2}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 8 & 5 & 0 \\ 0 & 7 & 1 & 0 \end{array}\right]
\]
\[
\xrightarrow[\substack{R_3 - 8R_2 \\ R_4 - 7R_2}]{}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & -3 & 0 \\ 0 & 0 & -6 & 0 \end{array}\right]
\xrightarrow{-\frac13 R_3}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -6 & 0 \end{array}\right]
\xrightarrow[\substack{R_1 + 2R_3 \\ R_2 - R_3 \\ R_4 + 6R_3}]{}
\left[\begin{array}{ccc|c} 1 & -3 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\xrightarrow{R_1 + 3R_2}
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
Since the system has only the trivial solution, the given vectors are linearly independent.
29. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 , c4 so that
\[
c_1\begin{bmatrix} 1 \\ -1 \\ 1 \\ 0 \end{bmatrix} + c_2\begin{bmatrix} -1 \\ 1 \\ 0 \\ 1 \end{bmatrix} + c_3\begin{bmatrix} 1 \\ 0 \\ 1 \\ -1 \end{bmatrix} + c_4\begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ -1 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & -1 & 0 \\ 0 & 1 & -1 & 1 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 + R_1 \\ R_3 - R_1}]{}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 1 & 0 \end{array}\right]
\xrightarrow{R_2 \leftrightarrow R_3}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & -1 & 1 & 0 \end{array}\right]
\]
\[
\xrightarrow{R_4 - R_2}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & -1 & 2 & 0 \end{array}\right]
\xrightarrow{R_4 + R_3}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 3 & 0 \end{array}\right]
\xrightarrow{\frac13 R_4}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right]
\]
\[
\xrightarrow[\substack{R_2 + R_4 \\ R_3 - R_4}]{}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right]
\xrightarrow{R_1 + R_2 - R_3}
\left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right].
\]
Since the system has only the trivial solution, the given vectors are linearly independent.
30. The vectors
\[
v_1 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \quad
v_2 = \begin{bmatrix} 0 \\ 0 \\ 2 \\ 1 \end{bmatrix}, \quad
v_3 = \begin{bmatrix} 0 \\ 3 \\ 2 \\ 1 \end{bmatrix}, \quad
v_4 = \begin{bmatrix} 4 \\ 3 \\ 2 \\ 1 \end{bmatrix}
\]
are linearly independent. For suppose c1 v1 + c2 v2 + c3 v3 + c4 v4 = 0. Then c4 must be zero in order for
the first component of the sum to be zero. But then the only vector containing a nonzero entry in the
second component is v3 , so that c3 = 0 as well. Now for the same reason c2 = 0, looking at the third
component. Finally, that implies that c1 = 0. So the vectors are linearly independent.
31. By inspection
\[
v_1 + v_2 - v_3 + v_4 = \begin{bmatrix} 3 \\ -1 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} -1 \\ 3 \\ 1 \\ -1 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},
\]
so the vectors are linearly dependent.
32. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 2 & -1 & 3 \\ -1 & 2 & 3 \end{bmatrix}
\xrightarrow{R_1' = R_1 + R_2}
\begin{bmatrix} 1 & 1 & 6 \\ -1 & 2 & 3 \end{bmatrix}
\xrightarrow{R_2' = R_2 + R_1'}
\begin{bmatrix} 1 & 1 & 6 \\ 0 & 3 & 9 \end{bmatrix}.
\]
This matrix is in row echelon form and has no zero rows, so the matrix has rank 2 and the vectors are
linearly independent by Theorem 2.7.
33. Construct a matrix A with these vectors as its rows and try to produce a zero row:

1
1
1


1 1
1
0
R =R2 −R1
2 3 −−2−−−−−→ 0
−1 2 R30 =R3 −R1 0
−−−−−−−→


1
1 1
0
1 2 00 0
R3 =R3 +2R20
−2 1 −
−−−−−−−→ 0
1
1
0

1
2
5
This matrix is in row echelon form and has no zero rows, so the matrix has rank 3 and the vectors are
linearly independent by Theorem 2.7.
34. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 2 & 2 & 1 \\ 3 & 1 & 2 \\ 1 & -5 & 2 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_3}
\begin{bmatrix} 1 & -5 & 2 \\ 3 & 1 & 2 \\ 2 & 2 & 1 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 - 3R_1 \\ R_3' = R_3 - 2R_1}]{}
\begin{bmatrix} 1 & -5 & 2 \\ 0 & 16 & -4 \\ 0 & 12 & -3 \end{bmatrix}
\xrightarrow{R_3'' = R_3' - \frac34 R_2'}
\begin{bmatrix} 1 & -5 & 2 \\ 0 & 16 & -4 \\ 0 & 0 & 0 \end{bmatrix}.
\]
Since there is a zero row, the matrix has rank less than 3, so the vectors are linearly dependent by
Theorem 2.7. Working backwards, with R1 , R2 , R3 the rows after the swap, we have
\[
[0, 0, 0] = R_3'' = R_3' - \tfrac34 R_2' = (R_3 - 2R_1) - \tfrac34 (R_2 - 3R_1) = \tfrac14 R_1 - \tfrac34 R_2 + R_3
= \tfrac14 \begin{bmatrix} 1 & -5 & 2 \end{bmatrix} - \tfrac34 \begin{bmatrix} 3 & 1 & 2 \end{bmatrix} + \begin{bmatrix} 2 & 2 & 1 \end{bmatrix}.
\]
35. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 0 & 1 & 2 \\ 2 & 1 & 3 \\ 2 & 0 & 1 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_2}
\begin{bmatrix} 2 & 1 & 3 \\ 0 & 1 & 2 \\ 2 & 0 & 1 \end{bmatrix}
\xrightarrow{R_3 - R_1}
\begin{bmatrix} 2 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & -1 & -2 \end{bmatrix}
\xrightarrow{R_3 + R_2}
\begin{bmatrix} 2 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}.
\]
This matrix is in row echelon form and has a zero row, so the vectors are linearly dependent by Theorem
2.7. Working backwards in terms of the original rows, the zero row is R3 − R2 + R1 , so
\[
[0, 0, 0] = R_1 - R_2 + R_3 = \begin{bmatrix} 0 & 1 & 2 \end{bmatrix} - \begin{bmatrix} 2 & 1 & 3 \end{bmatrix} + \begin{bmatrix} 2 & 0 & 1 \end{bmatrix}.
\]
36. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} -2 & 3 & 7 \\ 4 & -1 & 5 \\ 3 & 1 & 3 \\ 5 & 0 & 2 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 + 2R_1 \\ R_3' = R_3 + \frac32 R_1 \\ R_4' = R_4 + \frac52 R_1}]{}
\begin{bmatrix} -2 & 3 & 7 \\ 0 & 5 & 19 \\ 0 & \frac{11}{2} & \frac{27}{2} \\ 0 & \frac{15}{2} & \frac{39}{2} \end{bmatrix}
\xrightarrow[\substack{R_3'' = 2R_3' \\ R_4'' = 2R_4'}]{}
\begin{bmatrix} -2 & 3 & 7 \\ 0 & 5 & 19 \\ 0 & 11 & 27 \\ 0 & 15 & 39 \end{bmatrix}
\]
\[
\xrightarrow[\substack{R_3''' = R_3'' - \frac{11}{5} R_2' \\ R_4''' = R_4'' - 3R_2'}]{}
\begin{bmatrix} -2 & 3 & 7 \\ 0 & 5 & 19 \\ 0 & 0 & -\frac{74}{5} \\ 0 & 0 & -18 \end{bmatrix}
\xrightarrow{R_4'''' = R_4''' - \frac{5 \cdot 18}{74} R_3'''}
\begin{bmatrix} -2 & 3 & 7 \\ 0 & 5 & 19 \\ 0 & 0 & -\frac{74}{5} \\ 0 & 0 & 0 \end{bmatrix}.
\]
Since this matrix has a zero row, it has rank less than 4, so the vectors are linearly dependent. Working
backwards, we have
\[
\begin{aligned}
[0, 0, 0] = R_4'''' &= R_4''' - \tfrac{90}{74} R_3''' \\
&= R_4'' - 3R_2' - \tfrac{45}{37}\bigl(R_3'' - \tfrac{11}{5} R_2'\bigr) \\
&= 2R_4' - 3R_2' - \tfrac{90}{37} R_3' + \tfrac{99}{37} R_2' \\
&= 2\bigl(R_4 + \tfrac52 R_1\bigr) - 3(R_2 + 2R_1) - \tfrac{90}{37}\bigl(R_3 + \tfrac32 R_1\bigr) + \tfrac{99}{37}(R_2 + 2R_1) \\
&= \tfrac{26}{37} R_1 - \tfrac{12}{37} R_2 - \tfrac{90}{37} R_3 + 2R_4 .
\end{aligned}
\]
Clearing fractions gives
\[
[0, 0, 0] = 26R_1 - 12R_2 - 90R_3 + 74R_4 .
\]
Thus
\[
26\begin{bmatrix} -2 \\ 3 \\ 7 \end{bmatrix} - 12\begin{bmatrix} 4 \\ -1 \\ 5 \end{bmatrix} - 90\begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} + 74\begin{bmatrix} 5 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Note that instead of doing all this work, we could just have applied Theorem 2.8.
37. Construct a matrix A with these vectors and try to produce a zero row:
\[
\begin{bmatrix} 3 & 4 & 5 \\ 6 & 7 & 8 \\ 0 & 0 & 0 \end{bmatrix}
\]
This matrix already has a zero row, so we know by Theorem 2.7 that the vectors are linearly dependent,
and we have an obvious dependence relation: let the coefficients of each of the first two vectors be zero,
and the coefficient of the zero vector be any nonzero number.
38. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} -1 & 1 & 2 & 1 \\ 3 & 2 & 2 & 4 \\ 2 & 3 & 1 & -1 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 + 3R_1 \\ R_3' = R_3 + 2R_1}]{}
\begin{bmatrix} -1 & 1 & 2 & 1 \\ 0 & 5 & 8 & 7 \\ 0 & 5 & 5 & 1 \end{bmatrix}
\xrightarrow{R_3'' = R_3' - R_2'}
\begin{bmatrix} -1 & 1 & 2 & 1 \\ 0 & 5 & 8 & 7 \\ 0 & 0 & -3 & -6 \end{bmatrix}.
\]
This matrix is in row echelon form, and it has no zero row, so it has rank 3 and thus the vectors are
linearly independent by Theorem 2.7.
39. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 1 & -1 & 1 & 0 \\ -1 & 1 & 0 & 1 \\ 1 & 0 & 1 & -1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 + R_1 \\ R_3' = R_3 - R_1}]{}
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
\xrightarrow{R_2 \leftrightarrow R_3}
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
\]
\[
\xrightarrow{R_4 - R_2}
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & -1 & 2 \end{bmatrix}
\xrightarrow{R_4 + R_3}
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 3 \end{bmatrix}.
\]
This matrix is in row echelon form, and it has no zero row, so it has rank 4 and thus the vectors are
linearly independent by Theorem 2.7.
40. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 1 \\ 0 & 3 & 2 & 1 \\ 4 & 3 & 2 & 1 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 - R_1 \\ R_3' = R_3 - R_1 \\ R_4' = R_4 - R_1}]{}
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 3 & 2 & 0 \\ 4 & 3 & 2 & 0 \end{bmatrix}
\xrightarrow[\substack{R_3'' = R_3' - R_2' \\ R_4'' = R_4' - R_2'}]{}
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 3 & 0 & 0 \\ 4 & 3 & 0 & 0 \end{bmatrix}
\]
\[
\xrightarrow{R_4''' = R_4'' - R_3''}
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 3 & 0 & 0 \\ 4 & 0 & 0 & 0 \end{bmatrix}
\xrightarrow{\text{reorder the rows}}
\begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]
This matrix is in row echelon form, and it has no zero row, so it has rank 4 and thus the vectors are
linearly independent by Theorem 2.7.
41. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 3 & -1 & 1 & -1 \\ -1 & 3 & 1 & -1 \\ 1 & 1 & 3 & 1 \\ -1 & -1 & 1 & 3 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_3}
\begin{bmatrix} 1 & 1 & 3 & 1 \\ -1 & 3 & 1 & -1 \\ 3 & -1 & 1 & -1 \\ -1 & -1 & 1 & 3 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 + R_1 \\ R_3' = R_3 - 3R_1 \\ R_4' = R_4 + R_1}]{}
\begin{bmatrix} 1 & 1 & 3 & 1 \\ 0 & 4 & 4 & 0 \\ 0 & -4 & -8 & -4 \\ 0 & 0 & 4 & 4 \end{bmatrix}
\xrightarrow{R_3'' = R_3' + R_2' + R_4'}
\begin{bmatrix} 1 & 1 & 3 & 1 \\ 0 & 4 & 4 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 4 & 4 \end{bmatrix}.
\]
Since this matrix has a zero row, it has rank less than 4, so the vectors are linearly dependent. Working
backwards, with R1 , . . . , R4 the original rows (so the swap puts R3 on top and R1 third), we have
\[
[0, 0, 0, 0] = R_3'' = R_3' + R_2' + R_4' = (R_1 - 3R_3) + (R_2 + R_3) + (R_4 + R_3) = R_1 + R_2 - R_3 + R_4 .
\]
Thus
\[
\begin{bmatrix} 3 \\ -1 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} -1 \\ 3 \\ 1 \\ -1 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
\]
42. (a) In this case, rank(A) = n. To see this, recall that v1 , v2 , . . . , vn are linearly independent if and
only if the only solution of [A | 0] = [v1 v2 . . . vn | 0] is the trivial solution. Since the trivial
solution is always a solution, it is the only one if and only if the number of free variables in the
associated system is zero. Then the Rank Theorem (Theorem 2.2 in Section 2.2) says that the
number of free variables is n − rank(A), so that n = rank(A).
(b) Let
\[
A = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}.
\]
Then by Theorem 2.7, the vectors, which are the rows of A, are linearly independent if and only
if the rank of A is greater than or equal to n. But A has n rows, so its rank is at most n. Thus,
the rows of A are linearly independent if and only if the rank of A is exactly n.
43. (a) Yes, they are. Suppose that c1 (u + v) + c2 (v + w) + c3 (u + w) = 0. Expanding gives
c1 u + c1 v + c2 v + c2 w + c3 u + c3 w = (c1 + c3 )u + (c1 + c2 )v + (c2 + c3 )w = 0.
Since u, v, and w are linearly independent, all three of these coefficients are zero. So to see what
values of c1 , c2 , and c3 satisfy this equation, we must solve the linear system
c1
+ c3 = 0
c1 + c2
=0
c2 + c3 = 0.
Row-reducing the coefficient matrix gives
\[
\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\longrightarrow
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 2 \end{bmatrix}.
\]
Since the rank of the reduced matrix is 3, the only solution is the trivial solution, so c1 = c2 =
c3 = 0. Thus the original vectors u + v, v + w, and u + w are linearly independent.
(b) No, they are not. Suppose that c1 (u − v) + c2 (v − w) + c3 (u − w) = 0. Expanding gives
c1 u − c1 v + c2 v − c2 w + c3 u − c3 w = (c1 + c3 )u + (−c1 + c2 )v + (−c2 − c3 )w = 0.
Since u, v, and w are linearly independent, all three of these coefficients are zero. So to see what
values of c1 , c2 , and c3 satisfy this equation, we must solve the linear system
c1
+ c3 = 0
−c1 + c2
=0
− c2 − c3 = 0.
Row-reducing the coefficient matrix gives
\[
\begin{bmatrix} 1 & 0 & 1 \\ -1 & 1 & 0 \\ 0 & -1 & -1 \end{bmatrix}
\longrightarrow
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}.
\]
So c3 is a free variable, and a solution to the original system is c1 = c2 = −c3 . Taking c3 = −1,
for example, we have
(u − v) + (v − w) − (u − w) = 0.
44. Let v1 and v2 be the two vectors. Suppose first that at least one of them, say v1 , is zero. Then they
are linearly dependent since v1 + 0v2 = 0 + 0 = 0; further, v1 = 0v2 , so one is a scalar multiple of the
other.
Now suppose that neither vector is the zero vector. We want to show that one is a multiple of the
other if and only if they are linearly dependent. If one is a multiple of the other, say v1 = kv2 , then
clearly 1v1 − kv2 = 0, so that they are linearly dependent. To go the other way, suppose they are
linearly dependent. Then c1 v1 + c2 v2 = 0, where c1 and c2 are not both zero. Suppose that c1 ≠ 0 (if
instead c2 ≠ 0, reverse the roles of the first and second subscripts). Then
\[
c_1 v_1 + c_2 v_2 = 0 \quad\Rightarrow\quad c_1 v_1 = -c_2 v_2 \quad\Rightarrow\quad v_1 = -\frac{c_2}{c_1} v_2 ,
\]
so that v1 is a multiple of v2 , as was to be shown.
45. Suppose we have m row vectors v1 , v2 , . . . , vm in Rn with m > n. Let
\[
A = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{bmatrix}.
\]
By Theorem 2.7, the rows (the vi ) are linearly dependent if and only if rank(A) < m. By definition,
rank(A) is the number of nonzero rows in its row echelon form. But since there can be at most one
leading 1 in each column, the number of nonzero rows is at most n < m. Thus rank(A) ≤ n < m, so
that by Theorem 2.7, the vectors are linearly dependent.
2.4. APPLICATIONS
93
46. Suppose that A = {v1 , v2 , . . . , vn } is a set of linearly independent vectors, and that B is a subset of
A, say B = {vi1 , vi2 , . . . , vik } where 1 ≤ i1 < i2 < · · · < ik ≤ n. We want to show
that B is a linearly independent set. Suppose that
ci1 vi1 + ci2 vi2 + · · · + cik vik = 0.
Set cr = 0 if r is not in {i1 , i2 , . . . , ik }, and consider
c1 v1 + c2 v2 + · · · + cn vn .
This sum is zero since the sum of the i1 , i2 , . . . , ik elements is zero, and the other coefficients are zero.
But A forms a linearly independent set, so all the coefficients must in fact be zero. Thus the linear
combination of the vij above must be the trivial linear combination, so B is a linearly independent set.
47. Since S 0 = {v1 , . . . , vk } ⊂ S = {v1 , . . . , vk , v}, Exercise 20(a) proves that span(S 0 ) ⊆ span(S). Using
Exercise 21(a), to show that span(S) ⊆ span(S 0 ), we must show that each element of S is a linear
combination of elements of S 0 . But clearly each vi is a linear combination of elements of S 0 , since
vi ∈ S 0 . Also, we are given that v is a linear combination of the vi , so it too is a linear combination
of elements of S 0 . Thus span(S) ⊆ span(S 0 ). Since we have proven both inclusions, the two sets are
equal.
48. Suppose bv + b2 v2 + · · · + bk vk = 0. Substitute for v from the given equation, to get
b(c1 v1 + c2 v2 + · · · + ck vk ) + b2 v2 + · · · + bk vk = bc1 v1 + (bc2 + b2 )v2 + · · · + (bck + bk )vk = 0.
Since the vi are a linearly independent set, all of the coefficients must be zero:
bc1 = 0,
bc2 + b2 = 0,
bc3 + b3 = 0,
...
bck + bk = 0.
In particular, bc1 = 0; since we are given that c1 6= 0, it follows that b = 0. Then substituting 0 for b
in the remaining equations above
b2 = 0,
b3 = 0,
...
bk = 0.
Thus b and all of the bi are zero, so the original linear combination was the trivial one. It follows that
{v, v2 , . . . , vk } is a linearly independent set.
2.4
Applications
1. Let x1 , x2 , and x3 be the number of bacteria of strains I, II, and III, respectively. Then from the
consumption of A, B, and C, we get the following system:
\[
\begin{aligned}
x_1 + 2x_2 \phantom{{}+ 2x_3} &= 400 \\
2x_1 + x_2 + x_3 &= 600 \\
x_1 + x_2 + 2x_3 &= 600
\end{aligned}
\quad\Rightarrow\quad
\left[\begin{array}{ccc|c} 1 & 2 & 0 & 400 \\ 2 & 1 & 1 & 600 \\ 1 & 1 & 2 & 600 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 160 \\ 0 & 1 & 0 & 120 \\ 0 & 0 & 1 & 160 \end{array}\right].
\]
So 160 bacteria of strain I, 120 bacteria of strain II, and 160 bacteria of strain III can coexist.
2. Let x1 , x2 , and x3 be the number of bacteria of strains I, II, and III, respectively. Then from the
consumption of A, B, and C, we get the following system:
\[
\begin{aligned}
x_1 + 2x_2 \phantom{{}+ 3x_3} &= 400 \\
2x_1 + x_2 + 3x_3 &= 500 \\
x_1 + x_2 + x_3 &= 600
\end{aligned}
\quad\Rightarrow\quad
\left[\begin{array}{ccc|c} 1 & 2 & 0 & 400 \\ 2 & 1 & 3 & 500 \\ 1 & 1 & 1 & 600 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 2 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right].
\]
This is an inconsistent system, so the bacteria cannot coexist.
3. Let x1 , x2 , and x3 be the number of small, medium, and large arrangements. Then from the
consumption of flowers in each arrangement we get:
\[
\begin{aligned}
x_1 + 2x_2 + 4x_3 &= 24 \\
3x_1 + 4x_2 + 8x_3 &= 50 \\
3x_1 + 6x_2 + 6x_3 &= 48
\end{aligned}
\quad\Rightarrow\quad
\left[\begin{array}{ccc|c} 1 & 2 & 4 & 24 \\ 3 & 4 & 8 & 50 \\ 3 & 6 & 6 & 48 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 4 \end{array}\right].
\]
So 2 small, 3 medium, and 4 large arrangements were sold that day.
4. (a) Let x1 , x2 , and x3 be the number of nickels, dimes, and quarters. Then we get
\[
\begin{aligned}
x_1 + x_2 + x_3 &= 20 \\
2x_1 - x_2 \phantom{{}+ x_3} &= 0 \\
5x_1 + 10x_2 + 25x_3 &= 300
\end{aligned}
\quad\Rightarrow\quad
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 20 \\ 2 & -1 & 0 & 0 \\ 5 & 10 & 25 & 300 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & 8 \end{array}\right].
\]
So there are 4 nickels, 8 dimes, and 8 quarters.
(b) This is similar to part (a), except that the constraint 2x1 − x2 = 0 no longer applies, so we get
\[
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 20 \\ 5 & 10 & 25 & 300 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & -3 & -20 \\ 0 & 1 & 4 & 40 \end{array}\right].
\]
The general solution is x1 = −20 + 3t, x2 = 40 − 4t, and x3 = t. From the first equation, we must
have 3t − 20 ≥ 0, so that t ≥ 7. From the second, we have t ≤ 10. So the possible values for t are
7, 8, 9, and 10, which give the following combinations of nickels, dimes, and quarters:
[x1 , x2 , x3 ] = [1, 12, 7] or [4, 8, 8] or [7, 4, 9] or [10, 0, 10].
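The four combinations can be confirmed by brute force (a short Python sketch):

```python
# Exercise 4(b): 20 coins (nickels, dimes, quarters) worth 300 cents.
solutions = [(n, d, 20 - n - d)
             for n in range(21)
             for d in range(21 - n)
             if 5 * n + 10 * d + 25 * (20 - n - d) == 300]
print(solutions)  # [(1, 12, 7), (4, 8, 8), (7, 4, 9), (10, 0, 10)]
```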
5. Let x1 , x2 , and x3 be the number of house, special, and gourmet blends. Then from the consumption
of beans in each blend we get
300x1 + 200x2 + 100x3 = 30,000
200x2 + 200x3 = 15,000
200x1 + 100x2 + 200x3 = 25,000 .
Forming the augmented matrix and reducing it gives
\[
\left[\begin{array}{ccc|c} 300 & 200 & 100 & 30000 \\ 0 & 200 & 200 & 15000 \\ 200 & 100 & 200 & 25000 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 65 \\ 0 & 1 & 0 & 30 \\ 0 & 0 & 1 & 45 \end{array}\right].
\]
The merchant should make 65 house blend, 30 special blend, and 45 gourmet blend.
6. Let x1 , x2 , and x3 be the number of house, special, and gourmet blends. Then from the consumption
of beans in each blend we get
300x1 + 200x2 + 100x3 = 30,000
50x1 + 200x2 + 350x3 = 15,000
150x1 + 100x2 + 50x3 = 15,000 .
Forming the augmented matrix and reducing it gives
\[
\left[\begin{array}{ccc|c} 300 & 200 & 100 & 30000 \\ 50 & 200 & 350 & 15000 \\ 150 & 100 & 50 & 15000 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & -1 & 60 \\ 0 & 1 & 2 & 60 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus x1 = 60 + t, x2 = 60 − 2t, and x3 = t. From the first equation, we get t ≥ −60; from the second,
we get t ≤ 30, and the third gives us t ≥ 0. So the range of possible values is 0 ≤ t ≤ 30. We wish to
maximize the profit
\[
P = \tfrac12 x_1 + \tfrac32 x_2 + 2x_3 = \tfrac12(60 + t) + \tfrac32(60 - 2t) + 2t = 120 - \tfrac12 t.
\]
Since 0 ≤ t ≤ 30, the profit is maximized if t = 0, in which case x1 = x2 = 60 and x3 = 0. Therefore
the merchant should make 60 house and special blends, and no gourmet blends.
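A quick sketch confirming that the profit is maximized at t = 0 (Python, using the profit formula derived above):

```python
# Exercise 6: x1 = 60 + t, x2 = 60 - 2t, x3 = t with 0 <= t <= 30.
def profit(t):
    # P = (1/2)x1 + (3/2)x2 + 2*x3 = 120 - t/2
    return 0.5 * (60 + t) + 1.5 * (60 - 2 * t) + 2 * t

best = max(range(31), key=profit)
print(best, profit(best))  # 0 120.0
```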
7. Let x, y, z, and w be the number of FeS2 , O2 , Fe2 O3 , and SO2 molecules respectively. Then compare
the number of iron, sulfur, and oxygen atoms in reactants and products:
Iron:    x = 2z
Sulfur:  2x = w
Oxygen:  2y = 3z + 2w,
so we get the matrix
\[
\left[\begin{array}{cccc|c} 1 & 0 & -2 & 0 & 0 \\ 2 & 0 & 0 & -1 & 0 \\ 0 & 2 & -3 & -2 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c} 1 & 0 & 0 & -\frac12 & 0 \\ 0 & 1 & 0 & -\frac{11}{8} & 0 \\ 0 & 0 & 1 & -\frac14 & 0 \end{array}\right].
\]
Thus z = (1/4)w, y = (11/8)w, and x = (1/2)w. The smallest positive value of w that will produce integer values
for all four variables is 8, so letting w = 8 gives x = 4, y = 11, z = 2, and w = 8. The balanced
equation is 4FeS2 + 11O2 −→ 2Fe2 O3 + 8SO2 .
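Balancing by hand, as above, amounts to finding a null-space vector of the atom-balance matrix; a SymPy sketch of Exercise 7 (assuming SymPy is available):

```python
from sympy import Matrix

# Exercise 7: columns are x, y, z, w for FeS2 + O2 -> Fe2O3 + SO2.
# Rows balance iron, sulfur, and oxygen atoms (reactants minus products).
A = Matrix([[1, 0, -2, 0],    # Fe:  x = 2z
            [2, 0, 0, -1],    # S:  2x = w
            [0, 2, -3, -2]])  # O:  2y = 3z + 2w

(k,) = A.nullspace()
coeffs = k * (8 / k[-1])      # scale so that w = 8
print(coeffs.T)               # Matrix([[4, 11, 2, 8]])
```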
8. Let x, y, z, and w be the number of CO2 , H2 O, C6 H12 O6 , and O2 molecules respectively. Then compare
the number of carbon, oxygen, and hydrogen atoms in reactants and products:
Carbon:   x = 6z
Oxygen:   2x + y = 6z + 2w
Hydrogen: 2y = 12z.
This system is easy to solve using back-substitution: the first equation gives x = 6z and the third gives
y = 6z. Substituting 6z for both x and y in the second equation gives 2 · 6z + 6z = 6z + 2w, so that
w = 6z as well. The smallest positive value of z giving integers is z = 1, so that x = y = w = 6 and
z = 1. The balanced equation is 6CO2 + 6H2 O −→ C6 H12 O6 + 6O2 .
9. Let x, y, z, and w be the number of C4 H10 , O2 , CO2 , and H2 O molecules respectively. Then compare
the number of carbon, oxygen, and hydrogen atoms in reactants and products:
Carbon:   4x = z
Oxygen:   2y = 2z + w
Hydrogen: 10x = 2w.
This system is easy to solve using back-substitution: the first equation gives z = 4x and the third gives
w = 5x. Substituting these values for z and w in the second equation gives 2y = 2 · 4x + 5x = 13x, so
that y = (13/2)x. The smallest positive value of x giving integers for all four variables is x = 2, resulting
in x = 2, y = 13, z = 8, and w = 10. The balanced equation is 2C4 H10 + 13O2 −→ 8CO2 + 10H2 O.
10. Let x, y, z, and w be the number of C7 H6 O2 , O2 , H2 O, and CO2 molecules respectively. Then compare
the number of carbon, oxygen, and hydrogen atoms in reactants and products:
Carbon:   7x = w
Oxygen:   2x + 2y = z + 2w
Hydrogen: 6x = 2z.
This system is easy to solve using back-substitution: the first equation gives w = 7x and the third
gives z = 3x. Substituting these values for w and z in the second equation gives 2x + 2y = 3x + 2 · 7x,
so that y = (15/2)x. The smallest positive value for x giving integer values for all four variables is
x = 2, so that the solution is x = 2, y = 15, z = 6, and w = 14. The balanced equation is
2C7 H6 O2 + 15O2 −→ 6H2 O + 14CO2 .
11. Let x, y, z, and w be the number of C5 H11 OH, O2 , H2 O, and CO2 molecules respectively. Then compare
the number of carbon, oxygen, and hydrogen atoms in reactants and products:
Carbon:   5x = w
Oxygen:   x + 2y = z + 2w
Hydrogen: 12x = 2z.
This system is easy to solve using back-substitution: the first equation gives w = 5x and the third
gives z = 6x. Substituting these values for w and z in the second equation gives x + 2y = 6x + 2 · 5x,
so that y = (15/2)x. The smallest positive value for x giving integer values for all four variables is
x = 2, so that the solution is x = 2, y = 15, z = 12, and w = 10. The balanced equation is
2C5 H11 OH + 15O2 −→ 12H2 O + 10CO2 .
12. Let x, y, z, and w be the number of HClO4 , P4 O10 , H3 PO4 , and Cl2 O7 molecules respectively. Then
compare the number of hydrogen, chlorine, oxygen, and phosphorus atoms in reactants and products:
Hydrogen:   x = 3z
Chlorine:   x = 2w
Oxygen:     4x + 10y = 4z + 7w
Phosphorus: 4y = z.
This gives the matrix
\[
\left[\begin{array}{cccc|c} 1 & 0 & -3 & 0 & 0 \\ 1 & 0 & 0 & -2 & 0 \\ 4 & 10 & -4 & -7 & 0 \\ 0 & 4 & -1 & 0 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c} 1 & 0 & 0 & -2 & 0 \\ 0 & 1 & 0 & -\frac16 & 0 \\ 0 & 0 & 1 & -\frac23 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus the general solution is x = 2t, y = (1/6)t, z = (2/3)t, w = t. The smallest positive value of t giving all
integer answers is t = 6, so the desired solution is x = 12, y = 1, z = 4, and w = 6. The balanced
equation is 12HClO4 + P4 O10 −→ 4H3 PO4 + 6Cl2 O7 .
13. Let x1 , x2 , x3 , x4 , and x5 be the number of Na2 CO3 , C, N2 , NaCN, and CO molecules respectively.
Then compare the number of sodium, carbon, oxygen, and nitrogen atoms in reactants and products:
Sodium:   2x1 = x4
Carbon:   x1 + x2 = x4 + x5
Oxygen:   3x1 = x5
Nitrogen: 2x3 = x4 .
This gives the matrix
\[
\left[\begin{array}{ccccc|c} 2 & 0 & 0 & -1 & 0 & 0 \\ 1 & 1 & 0 & -1 & -1 & 0 \\ 3 & 0 & 0 & 0 & -1 & 0 \\ 0 & 0 & 2 & -1 & 0 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & -\frac13 & 0 \\ 0 & 1 & 0 & 0 & -\frac43 & 0 \\ 0 & 0 & 1 & 0 & -\frac13 & 0 \\ 0 & 0 & 0 & 1 & -\frac23 & 0 \end{array}\right].
\]
Thus x5 is a free variable, and
\[
x_1 = \tfrac13 x_5 , \quad x_2 = \tfrac43 x_5 , \quad x_3 = \tfrac13 x_5 , \quad x_4 = \tfrac23 x_5 .
\]
The smallest value of x5 that produces integer solutions is 3, so the desired solution is x1 = 1, x2 = 4,
x3 = 1, x4 = 2, and x5 = 3. The balanced equation is Na2 CO3 + 4C + N2 −→ 2NaCN + 3CO.
14. Let x1 , x2 , x3 , x4 , and x5 be the number of C2 H2 Cl4 , Ca(OH)2 , C2 HCl3 , CaCl2 , and H2 O molecules
respectively. Then compare the number of carbon, hydrogen, chlorine, calcium, and oxygen atoms in
reactants and products:
Carbon:   2x1 = 2x3
Hydrogen: 2x1 + 2x2 = x3 + 2x5
Chlorine: 4x1 = 3x3 + 2x4
Calcium:  x2 = x4
Oxygen:   2x2 = x5 .
This gives the matrix
\[
\left[\begin{array}{ccccc|c} 2 & 0 & -2 & 0 & 0 & 0 \\ 2 & 2 & -1 & 0 & -2 & 0 \\ 4 & 0 & -3 & -2 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 & 0 \\ 0 & 2 & 0 & 0 & -1 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 & -\frac12 & 0 \\ 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 & -\frac12 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus x5 is a free variable, and
\[
x_1 = x_5 , \quad x_2 = \tfrac12 x_5 , \quad x_3 = x_5 , \quad x_4 = \tfrac12 x_5 .
\]
The smallest value of x5 that produces integer solutions is 2, so the desired solution is x1 = 2, x2 = 1,
x3 = 2, x4 = 1, and x5 = 2. The balanced equation is 2C2 H2 Cl4 + Ca(OH)2 −→ 2C2 HCl3 + CaCl2 +
2H2 O.
15. (a) By applying the conservation of flow rule to each node we obtain the system of equations
f1 + f2 = 20
f2 − f3 = −10
f1 + f3 = 30.
Form the augmented matrix and row-reduce it:
\[
\left[\begin{array}{ccc|c} 1 & 1 & 0 & 20 \\ 0 & 1 & -1 & -10 \\ 1 & 0 & 1 & 30 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 1 & 30 \\ 0 & 1 & -1 & -10 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
Letting f3 = t, the possible flows are f1 = 30 − t, f2 = t − 10, f3 = t.
Letting f3 = t, the possible flows are f1 = 30 − t, f2 = t − 10, f3 = t.
(b) The flow through AB is f2 , so in this case 5 = t − 10 and t = 15. So the other flows are
f1 = f3 = 15.
(c) Since each flow must be nonnegative, the first equation gives t ≤ 30, the second gives t ≥ 10, and
the third gives t ≥ 0. So we have 10 ≤ t ≤ 30, which means that
0 ≤ f1 ≤ 20
0 ≤ f2 ≤ 20
10 ≤ f3 ≤ 30.
(d) If a negative flow were allowed, it would mean a flow in the opposite direction, so that a negative
flow on f2 would mean a flow from B into A.
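The bounds in part (c) can be spot-checked by brute force over integer values of t (a Python sketch; the argument above does not require t to be an integer, this is just a check):

```python
# Exercise 15: f1 = 30 - t, f2 = t - 10, f3 = t, with every flow nonnegative.
feasible = [t for t in range(0, 41)
            if 30 - t >= 0 and t - 10 >= 0]
print(min(feasible), max(feasible))  # 10 30

# Resulting ranges for f1 and f2:
print(min(30 - t for t in feasible), max(30 - t for t in feasible))  # 0 20
print(min(t - 10 for t in feasible), max(t - 10 for t in feasible))  # 0 20
```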
16. (a) By applying the conservation of flow rule to each node we obtain the system of equations
f1 + f2 = 20
f1 + f3 = 25
f2 + f4 = 25
f3 + f4 = 30.
Form the augmented matrix and row-reduce it:
\[
\left[\begin{array}{cccc|c} 1 & 1 & 0 & 0 & 20 \\ 1 & 0 & 1 & 0 & 25 \\ 0 & 1 & 0 & 1 & 25 \\ 0 & 0 & 1 & 1 & 30 \end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c} 1 & 0 & 0 & -1 & -5 \\ 0 & 1 & 0 & 1 & 25 \\ 0 & 0 & 1 & 1 & 30 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Letting f4 = t, the possible flows are f1 = t − 5, f2 = 25 − t, f3 = 30 − t, f4 = t.
(b) If f4 = t = 10, then the average flows on the other streets will be f1 = 10−5 = 5, f2 = 25−10 = 15,
and f3 = 30 − 10 = 20.
(c) Each flow must be nonnegative, so we get the constraints t ≥ 5, t ≤ 25, t ≤ 30, and t ≥ 0. Thus
5 ≤ t ≤ 25. This gives the constraints
0 ≤ f1 ≤ 20
0 ≤ f2 ≤ 20
5 ≤ f3 ≤ 25
5 ≤ f4 ≤ 25.
(d) Reversing all of the directions would have no effect on the solution. This would mean multiplying
all rows of the augmented matrix by −1. However, multiplying a row by −1 is an elementary row
operation, so the new matrix would have the same reduced row echelon form and thus the same
solution.
17. (a) By applying the conservation of flow rule to each node we obtain the system of equations
f1 + f2 = 100
f2 + f3 = f4 + 150
f4 + f5 = 150
f1 + 200 = f3 + f5 .
Rearrange the equations, set up the augmented matrix, and row-reduce it:
\[
\left[\begin{array}{ccccc|c} 1 & 1 & 0 & 0 & 0 & 100 \\ 0 & 1 & 1 & -1 & 0 & 150 \\ 0 & 0 & 0 & 1 & 1 & 150 \\ 1 & 0 & -1 & 0 & -1 & -200 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccccc|c} 1 & 0 & -1 & 0 & -1 & -200 \\ 0 & 1 & 1 & 0 & 1 & 300 \\ 0 & 0 & 0 & 1 & 1 & 150 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Then both f3 = s and f5 = t are free variables, so the flows are
\[
f_1 = -200 + s + t, \quad f_2 = 300 - s - t, \quad f_3 = s, \quad f_4 = 150 - t, \quad f_5 = t.
\]
(b) If DC is closed, then f5 = t = 0, so that f1 = −200 + s and f2 = 300 − s. Thus s, which is the
flow through DB, must satisfy 200 ≤ s ≤ 300 liters per day.
(c) If DB were closed, then f3 = s = 0, so f1 = −200 + t and thus t = f5 ≥ 200. But that would mean
that f4 = 150 − t ≤ −50 < 0, which is impossible.
(d) Since f4 = 150 − t, we must have 0 ≤ t ≤ 150, so that 0 ≤ f4 ≤ 150. Note that all of these values
for t are consistent with the first three equations as well: if t = 0, then s = 250 produces positive
values for all flows, while if t = 150, then s = 75 produces all positive values. So the flow through
DB is between 0 and 150 liters per day.
18. (a) By applying the conservation of flow rule to each node we obtain:
f3 + 200 = f1 + 100
f1 + 150 = f2 + f4
f2 + f5 = 300
f6 + 100 = f3 + 200
f4 + f7 = f6 + 100
f5 + f7 = 250.
Create and row-reduce the augmented matrix for the system:

    [ 1  0 −1  0 0  0 0 |  100 ]
    [ 1 −1  0 −1 0  0 0 | −150 ]
    [ 0  1  0  0 1  0 0 |  300 ]
    [ 0  0  1  0 0 −1 0 | −100 ]
    [ 0  0  0  1 0 −1 1 |  100 ]
    [ 0  0  0  0 1  0 1 |  250 ]

which reduces to

    [ 1 0 0 0 0 −1  0 |    0 ]
    [ 0 1 0 0 0  0 −1 |   50 ]
    [ 0 0 1 0 0 −1  0 | −100 ]
    [ 0 0 0 1 0 −1  1 |  100 ]
    [ 0 0 0 0 1  0  1 |  250 ]
    [ 0 0 0 0 0  0  0 |    0 ]
Let f7 = t and f6 = s, so the flows are
f1 = s
f2 = 50 + t
f3 = −100 + s
f4 = 100 + s − t
f5 = 250 − t
f6 = s
f7 = t.
(b) From the solution in part (a), since f1 = s = f6 , it is not possible for f1 and f6 to have different
values.
2.4. APPLICATIONS
(c) If f4 = 0, then t − s = 100, so that t = 100 + s. Substituting this for t everywhere gives

    f1 = s,  f2 = 150 + s,  f3 = −100 + s,  f4 = 0,  f5 = 150 − s,  f6 = s,  f7 = 100 + s.

The requirement that all of the fi be nonnegative means that 100 ≤ s ≤ 150, so that

    100 ≤ f1 ≤ 150,  250 ≤ f2 ≤ 300,  0 ≤ f3 ≤ 50,  f4 = 0,  0 ≤ f5 ≤ 50,  100 ≤ f6 ≤ 150,  200 ≤ f7 ≤ 250.
19. Applying the current law to node A gives I1 + I3 = I2 or I1 − I2 + I3 = 0. Applying the voltage
law to the top circuit, ABCA, gives −I2 − I1 + 8 = 0. Similarly, for the circuit ABDA we obtain
−I2 + 13 − 4I3 = 0. This gives the system

    I1 − I2 +  I3 = 0
    I1 + I2       = 8
         I2 + 4I3 = 13

with augmented matrix

    [ 1 −1 1 |  0 ]
    [ 1  1 0 |  8 ]
    [ 0  1 4 | 13 ]

which row reduces to

    [ 1 0 0 | 3 ]
    [ 0 1 0 | 5 ]
    [ 0 0 1 | 2 ]
Thus I1 = 3 amps, I2 = 5 amps, and I3 = 2 amps.
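Since the coefficient matrix here is square and invertible, the currents can also be cross-checked with Cramer's rule. The sketch below (our own helpers, not the manual's method) does this for the system of Exercise 19:

```python
# Cross-check of Exercise 19 via Cramer's rule.
def det3(m):
    # 3x3 determinant by cofactor expansion along the first row
    a, b, c = m[0]; d, e, f = m[1]; g, h, i = m[2]
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def replace_col(M, j, v):
    # copy of M with column j replaced by the vector v
    return [row[:j] + [v[k]] + row[j+1:] for k, row in enumerate(M)]

A = [[1, -1, 1],
     [1,  1, 0],
     [0,  1, 4]]
b = [0, 8, 13]
D = det3(A)
I1, I2, I3 = (det3(replace_col(A, j, b)) / D for j in range(3))
print(I1, I2, I3)   # currents in amps
```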
20. Applying the current law to node B and the voltage law to the upper and lower circuits gives

    I1 − I2 +  I3 = 0
    I1 + 2I2      = 5
        2I2 + 4I3 = 8

with augmented matrix

    [ 1 −1 1 | 0 ]
    [ 1  2 0 | 5 ]
    [ 0  2 4 | 8 ]

which row reduces to

    [ 1 0 0 | 1 ]
    [ 0 1 0 | 2 ]
    [ 0 0 1 | 1 ]

Thus I1 = 1 amp, I2 = 2 amps, and I3 = 1 amp.
21. (a) Applying the current and voltage laws to the circuit gives the system:
Node B:
I = I1 + I4
Node C:
I1 = I2 + I3
Node D:
I2 + I5 = I
Node E:
I3 + I4 = I5
Circuit ABEDA: −2I4 − I5 + 14 = 0
Circuit BCEB:
−I1 − I3 + 2I4 = 0
Circuit CDEC: −2I2 + I3 + I5 = 0
Construct the augmented matrix and row-reduce it:

    [  1 −1  0  0 −1  0 |  0 ]
    [  0  1 −1 −1  0  0 |  0 ]
    [ −1  0  1  0  0  1 |  0 ]
    [  0  0  0  1  1 −1 |  0 ]
    [  0  0  0  0  2  1 | 14 ]
    [  0 −1  0 −1  2  0 |  0 ]
    [  0  0 −2  1  0  1 |  0 ]

which reduces to

    [ 1 0 0 0 0 0 | 10 ]
    [ 0 1 0 0 0 0 |  6 ]
    [ 0 0 1 0 0 0 |  4 ]
    [ 0 0 0 1 0 0 |  2 ]
    [ 0 0 0 0 1 0 |  4 ]
    [ 0 0 0 0 0 1 |  6 ]
    [ 0 0 0 0 0 0 |  0 ]
Thus the currents are I = 10, I1 = 6, I2 = 4, I3 = 2, I4 = 4, and I5 = 6 amps.
(b) From Ohm's Law, the effective resistance is Reff = V/I = 14/10 = 7/5 ohms.
(c) To force the current in CE to be zero, we force I3 = 0 and let r be the resistance in BC. This
gives the following system of equations:
    Node B:        I = I1 + I4
    Node C:        I1 = I2
    Node D:        I2 + I5 = I
    Node E:        I4 = I5
    Circuit ABEDA: −2I4 − I5 + 14 = 0
    Circuit BCEB:  −rI1 + 2I4 = 0
    Circuit CDEC:  −2I2 + I5 = 0
It is easier to solve this system by substitution. Substituting I4 = I5 into the circuit equation for
ABEDA gives −3I5 + 14 = 0, so that I4 = I5 = 14/3. Then −2I2 + 14/3 = 0, so that I1 = I2 = 7/3.
The circuit equation for BCEB then gives

    −r · (7/3) + 2 · (14/3) = 0  ⇒  r = 4.
So if we use a 4 ohm resistor in BC, the current in CE will be zero.
22. (a) Applying the voltage law to Figure 2.23(a) gives IR1 +IR2 = E. But from Ohm’s Law, E = IReff ,
so that I(R1 + R2 ) = IReff and thus Reff = R1 + R2 .
(b) Applying the current and voltage laws to Figure 2.23(b) gives the system of equations I = I1 + I2 ,
−I1 R1 + I2 R2 = 0, −I1 R1 + E = 0, and −I2 R2 + E = 0. Gauss-Jordan elimination gives

    [ 1  −1   −1 |  0 ]
    [ 0 −R1   R2 |  0 ]
    [ 0 −R1    0 | −E ]
    [ 0   0  −R2 | −E ]

which reduces to

    [ 1 0 0 | (R1 + R2)E/(R1R2) ]
    [ 0 1 0 | E/R1 ]
    [ 0 0 1 | E/R2 ]
    [ 0 0 0 | 0 ]

Thus I = (R1 + R2)E/(R1R2), I1 = E/R1, and I2 = E/R2. But E = IReff; substituting this value of E into the
equation for I gives

    I = (R1 + R2)IReff/(R1R2)  ⇒  Reff = R1R2/(R1 + R2) = 1/(1/R1 + 1/R2).
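A tiny numeric sanity check of the two formulas just derived, with hypothetical resistor values R1 = 2 and R2 = 4 ohms:

```python
# Series and parallel effective resistance, as in Exercise 22.
from fractions import Fraction

def series(R1, R2):
    # Reff = R1 + R2
    return R1 + R2

def parallel(R1, R2):
    # 1/Reff = 1/R1 + 1/R2
    return 1 / (Fraction(1, 1) / R1 + Fraction(1, 1) / R2)

print(series(2, 4), parallel(2, 4))  # parallel(2, 4) = 8/6 = 4/3 ohms
```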
23. The matrix for this system, with inputs on the left, outputs along the top, is
                   Farming   Manufacturing
    Farming          1/2         1/3
    Manufacturing    1/2         2/3

Let the output of the farm sector be f and the output of the manufacturing sector be m. Then

    (1/2)f + (1/3)m = f
    (1/2)f + (2/3)m = m.

Simplifying gives

    −(1/2)f + (1/3)m = 0
     (1/2)f − (1/3)m = 0

and row-reducing the corresponding augmented matrix gives

    [ −1/2   1/3 | 0 ]      [ 1 −2/3 | 0 ]
    [  1/2  −1/3 | 0 ]  →   [ 0    0 | 0 ]

Thus f = (2/3)m, so that farming and manufacturing must be in a 2 : 3 ratio.
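The closed-exchange balance can be confirmed directly; in this sketch we take m = 3 (any multiple of 3 avoids fractions) and check both balance equations:

```python
# Closed exchange model of Exercise 23, checked with exact fractions.
from fractions import Fraction

half, third, two_thirds = Fraction(1, 2), Fraction(1, 3), Fraction(2, 3)
# From -(1/2)f + (1/3)m = 0 we get f = (2/3)m; take m = 3 to clear fractions.
m = 3
f = two_thirds * m
# Verify both balance equations of the exchange:
assert half * f + third * m == f
assert half * f + two_thirds * m == m
print(f, m)  # outputs in a 2 : 3 ratio
```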
24. The matrix for this system, with needed resources on the left, outputs along the top, is
           Coal  Steel
    Coal    0.3   0.8
    Steel   0.7   0.2

Let the coal output be c and the steel output be s. Then

    0.3c + 0.8s = c
    0.7c + 0.2s = s.

Simplifying and row-reducing the corresponding augmented matrix gives

    −0.7c + 0.8s = 0        [ −0.7  0.8 | 0 ]      [ 1 −8/7 | 0 ]
     0.7c − 0.8s = 0   ⇒    [  0.7 −0.8 | 0 ]  →   [ 0    0 | 0 ]

Thus c = (8/7)s, so that coal and steel must be in an 8 : 7 ratio. For a total output of $20 million, coal
output must be (8/15) · 20 = 32/3 ≈ $10.667 million, while steel output is (7/15) · 20 = 28/3 ≈ $9.333 million.
25. Let x be the painter’s rate, y the plumber’s rate, and z the electrician’s rate. From the given schedule,
we get the linear system
2x + y + 5z = 10x
4x + 5y + z = 10y
4x + 4y + 4z = 10z.
Simplifying gives the homogeneous system

    −8x +  y + 5z = 0
     4x − 5y +  z = 0
     4x + 4y − 6z = 0

whose augmented matrix row reduces as

    [ −8  1  5 | 0 ]      [ 1 0 −13/18 | 0 ]
    [  4 −5  1 | 0 ]  →   [ 0 1  −7/9  | 0 ]
    [  4  4 −6 | 0 ]      [ 0 0    0   | 0 ]

Thus x = (13/18)z and y = (7/9)z. Let z = 54; this is a whole number in the required range such that the
others will be whole numbers as well. Then x = 39 and y = 42. The painter should charge $39 per
hour, the plumber $42 per hour, and the electrician $54 per hour.
26. From the given matrix, we get the linear system
    (1/4)L + (1/8)T + (1/6)Z = B
    (1/2)B + (1/4)L + (1/4)T + (1/6)Z = L
    (1/4)B + (1/4)L + (1/2)T + (1/3)Z = T
    (1/4)B + (1/4)L + (1/8)T + (1/3)Z = Z.

Simplify and row-reduce the associated augmented matrix:

    [ −1    1/4   1/8   1/6 | 0 ]      [ 1 0 0 −2/3 | 0 ]
    [ 1/2  −3/4   1/4   1/6 | 0 ]  →   [ 0 1 0 −6/5 | 0 ]
    [ 1/4   1/4  −1/2   1/3 | 0 ]      [ 0 0 1 −8/5 | 0 ]
    [ 1/4   1/4   1/8  −2/3 | 0 ]      [ 0 0 0   0  | 0 ]

Thus B = (2/3)Z, L = (6/5)Z, and T = (8/5)Z. Furthermore, B is the smallest price of the four, so B = (2/3)Z
must be at least 50, in which case Z = 75. With that value of Z, we get B = 50, L = 90, and T = 120.
27. The matrix for this system, with needed resources on the left, outputs along the top, is
           Coal   Steel
    Coal    0.15   0.25
    Steel   0.2    0.1

(a) Let the coal output be c and the steel output be s. Then since $45 million of external demand is
required for coal and $124 million for steel, we have

    c = 0.15c + 0.25s + 45
    s = 0.2c + 0.1s + 124.

Simplify and row-reduce the associated matrix:

    0.85c − 0.25s = 45        [ 0.85 −0.25 |  45 ]      [ 1 0 | 100 ]
    −0.2c + 0.9s = 124   ⇒    [ −0.2   0.9 | 124 ]  →   [ 0 1 | 160 ]

So the coal industry should produce $100 million and the steel industry should produce $160
million.

(b) The new external demands for coal and steel are $40 and $130 million, respectively. The only thing
that changes in the equations is the constant terms, so we have

    0.85c − 0.25s = 40        [ 0.85 −0.25 |  40 ]      [ 1 0 |  95.8  ]
    −0.2c + 0.9s = 130   ⇒    [ −0.2   0.9 | 130 ]  →   [ 0 1 | 165.73 ]

So the coal industry should reduce production by 100 − 95.8 = $4.2 million, while the steel industry
should increase production by 165.73 − 160 = $5.73 million.
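Part (a) is the open Leontief equation (I − C)x = d; the sketch below solves the 2x2 case by hand-rolled elimination with exact fractions (the helper names are ours):

```python
# Open Leontief model of Exercise 27(a): solve (I - C)x = d exactly.
from fractions import Fraction as F

C = [[F('0.15'), F('0.25')],
     [F('0.2'),  F('0.1')]]
d = [F(45), F(124)]
# Coefficients of (I - C)x = d:
a11, a12 = 1 - C[0][0], -C[0][1]
a21, a22 = -C[1][0], 1 - C[1][1]
det = a11 * a22 - a12 * a21
# 2x2 Cramer solution:
x1 = (d[0] * a22 - a12 * d[1]) / det
x2 = (a11 * d[1] - d[0] * a21) / det
print(x1, x2)  # coal and steel output, in millions of dollars
```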
28. This is an open system with external demand given by the other city departments. From the given
matrix, we get the system
1 + 0.2A + 0.1H + 0.2T = A
1.2 + 0.1A + 0.1H + 0.2T = H
0.8 + 0.2A + 0.4H + 0.3T = T .
Simplifying and row-reducing the associated augmented matrix gives

 
0.8 −0.1 −0.2
0.8A − 0.1H − 0.2T = 1
1
1
−0.1A + 0.9H − 0.2T = 1.2 ⇒ −0.1
0.9 −0.2 1.2 →
− 0
−0.2A − 0.4H + 0.7T = 0.8
−0.2 −0.4
0.7 0.8
0
0
1
0
0
0
1

2.31
2.28 .
3.11
So the Administrative department should produce $2.31 million of services, the Health department
$2.28 million, and the Transportation department $3.11 million.
29. (a) Over Z2, we need to solve x1 a + x2 b + · · · + x5 e = s + t. Row-reducing the augmented matrix
[a b c d e | s + t], where s + t = [0, 1, 0, 1, 0]^T, gives

    [ 1 1 0 0 0 | 0 ]
    [ 1 1 1 0 0 | 1 ]
    [ 0 1 1 1 0 | 0 ]
    [ 0 0 1 1 1 | 1 ]
    [ 0 0 0 1 1 | 0 ]

which reduces to

    [ 1 0 0 0 1 | 1 ]
    [ 0 1 0 0 1 | 1 ]
    [ 0 0 1 0 0 | 1 ]
    [ 0 0 0 1 1 | 0 ]
    [ 0 0 0 0 0 | 0 ]

Hence x5 is a free variable, so there are exactly two solutions (with x5 = 0 and with x5 = 1).
Solving for the other variables, we get x1 = 1 + x5, x2 = 1 + x5, x3 = 1, and x4 = x5. So the two
solutions are

    [x1, x2, x3, x4, x5] = [1, 1, 1, 0, 0]   and   [x1, x2, x3, x4, x5] = [0, 0, 1, 1, 1].
So we can push switches 1, 2, and 3 or switches 3, 4, and 5.
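Over Z2 the same Gauss-Jordan process works, with all addition done mod 2; this sketch (our own `rref_mod2` helper) reduces the augmented matrix from part (a):

```python
# Gaussian elimination over Z2 for the five-light puzzle of Exercise 29(a).
def rref_mod2(M):
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols - 1):          # last column is the RHS
        piv = next((i for i in range(r, rows) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(rows):
            if i != r and M[i][c]:
                # adding rows mod 2 is XOR
                M[i] = [(a + b) % 2 for a, b in zip(M[i], M[r])]
        r += 1
    return M

aug = [[1, 1, 0, 0, 0, 0],
       [1, 1, 1, 0, 0, 1],
       [0, 1, 1, 1, 0, 0],
       [0, 0, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 0]]
R = rref_mod2(aug)
print(R)
```

The result reproduces the reduced matrix above, with x5 free.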
(b) In this case, t = e2 over Z2, so the matrix becomes

    [ 1 1 0 0 0 | 0 ]      [ 1 0 0 0 1 | 0 ]
    [ 1 1 1 0 0 | 1 ]      [ 0 1 0 0 1 | 0 ]
    [ 0 1 1 1 0 | 0 ]  →   [ 0 0 1 0 0 | 0 ]
    [ 0 0 1 1 1 | 0 ]      [ 0 0 0 1 1 | 0 ]
    [ 0 0 0 1 1 | 0 ]      [ 0 0 0 0 0 | 1 ]
This is an inconsistent system, so there is no solution. Thus it is impossible to start with all the
lights off and turn on only the second light.
30. (a) With s = e4 and t = e2 + e4 over Z2, we have s + t = e2, and we get the matrix

    [ 1 1 0 0 0 | 0 ]      [ 1 0 0 0 1 | 0 ]
    [ 1 1 1 0 0 | 1 ]      [ 0 1 0 0 1 | 0 ]
    [ 0 1 1 1 0 | 0 ]  →   [ 0 0 1 0 0 | 0 ]
    [ 0 0 1 1 1 | 0 ]      [ 0 0 0 1 1 | 0 ]
    [ 0 0 0 1 1 | 0 ]      [ 0 0 0 0 0 | 1 ]
Since the system is inconsistent, there is no solution in this case: it is impossible to start with
only the fourth light on and end up with only the second and fourth lights on.
(b) In this case t = e2 over Z2, so that s + t = e2 + e4, and we get

    [ 1 1 0 0 0 | 0 ]      [ 1 0 0 0 1 | 1 ]
    [ 1 1 1 0 0 | 1 ]      [ 0 1 0 0 1 | 1 ]
    [ 0 1 1 1 0 | 0 ]  →   [ 0 0 1 0 0 | 1 ]
    [ 0 0 1 1 1 | 1 ]      [ 0 0 0 1 1 | 0 ]
    [ 0 0 0 1 1 | 0 ]      [ 0 0 0 0 0 | 0 ]

So there are two solutions:

    [x1, x2, x3, x4, x5] = [1, 1, 1, 0, 0]   and   [x1, x2, x3, x4, x5] = [0, 0, 1, 1, 1].
So we can push switches 1, 2, and 3 or switches 3, 4, and 5.
31. Recall from Example 2.35 that it does not matter what order we push the switches in, and that pushing
a switch twice is the same as not pushing it at all. So with

    a = [1, 1, 0, 0, 0]^T,  b = [1, 1, 1, 0, 0]^T,  c = [0, 1, 1, 1, 0]^T,  d = [0, 0, 1, 1, 1]^T,  e = [0, 0, 0, 1, 1]^T,

there are 2^5 = 32 possible choices, since there are 5 buttons and we have the choice of pushing or not
pushing each one. If we make a list of the 32 resulting light states and eliminate duplicates,
we see that the list of possible states is

    [0, 0, 0, 0, 0], [0, 0, 0, 1, 1], [0, 0, 1, 1, 1], [0, 0, 1, 0, 0], [0, 1, 1, 1, 0], [0, 1, 1, 0, 1], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0],
    [1, 1, 1, 0, 0], [1, 1, 1, 1, 1], [1, 1, 0, 1, 1], [1, 1, 0, 0, 0], [1, 0, 0, 1, 0], [1, 0, 0, 0, 1], [1, 0, 1, 0, 1], [1, 0, 1, 1, 0].
32. (a) Over Z3, we need to solve x1 a + x2 b + x3 c = t, where t = [0, 2, 1]^T, so that

    [ 1 1 0 | 0 ]      [ 1 0 0 | 1 ]
    [ 1 1 1 | 2 ]  →   [ 0 1 0 | 2 ]
    [ 0 1 1 | 1 ]      [ 0 0 1 | 2 ]

Thus x1 = 1, x2 = 2, and x3 = 2, so we must push switch A once and switches B and C twice
each.
(b) Over Z3, we need to solve x1 a + x2 b + x3 c = t, where t = [1, 0, 1]^T, so that

    [ 1 1 0 | 1 ]      [ 1 0 0 | 2 ]
    [ 1 1 1 | 0 ]  →   [ 0 1 0 | 2 ]
    [ 0 1 1 | 1 ]      [ 0 0 1 | 2 ]

In other words, we must push each switch twice.
(c) Let x, y, and z be the final states of the three lights. Then we want to solve

    [ 1 1 0 | x ]      [ 1 0 0 | y + 2z ]
    [ 1 1 1 | y ]  →   [ 0 1 0 | x + 2y + z ]
    [ 0 1 1 | z ]      [ 0 0 1 | 2x + y ]

So we can achieve this configuration if we push switch A y + 2z times, switch B x + 2y + z times,
and switch C 2x + y times (adding in Z3).
33. The switches here correspond to the same vectors as in Example 2.35, but now the numbers are
interpreted in Z3 rather than in Z2. So in Z3^5, with s = 0 (all lights initially off), we need to solve
x1 a + x2 b + · · · + x5 e = t, where t = [2, 1, 2, 1, 2]^T, so that

    [ 1 1 0 0 0 | 2 ]      [ 1 0 0 0 1 | 1 ]
    [ 1 1 1 0 0 | 1 ]      [ 0 1 0 0 2 | 1 ]
    [ 0 1 1 1 0 | 2 ]  →   [ 0 0 1 0 0 | 2 ]
    [ 0 0 1 1 1 | 1 ]      [ 0 0 0 1 1 | 2 ]
    [ 0 0 0 1 1 | 2 ]      [ 0 0 0 0 0 | 0 ]

So x5 is a free variable, and there are three solutions, corresponding to x5 = 0, 1, or 2. Solving for the
other variables gives

    x1 = 1 − x5,   x2 = 1 − 2x5,   x3 = 2,   x4 = 2 − x5,

so that the solutions are (remembering that 1 − 2 = 2 and 2 · 2 = 1 in Z3)

    [x1, x2, x3, x4, x5] = [1, 1, 2, 2, 0], [0, 2, 2, 1, 1], or [2, 0, 2, 0, 2].
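The analogous reduction over Z3: every nonzero residue mod 3 is invertible (1 · 1 = 2 · 2 = 1), so Gauss-Jordan goes through unchanged. A sketch with our own helper:

```python
# Gauss-Jordan elimination over Z3 for the augmented matrix of Exercise 33.
def rref_mod3(M):
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols - 1):          # last column is the RHS
        piv = next((i for i in range(r, rows) if M[i][c] % 3), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = M[r][c]                  # in Z3 every nonzero element is its own inverse
        M[r] = [(x * inv) % 3 for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] % 3:
                f = M[i][c]
                M[i] = [(a - f * b) % 3 for a, b in zip(M[i], M[r])]
        r += 1
    return M

aug = [[1, 1, 0, 0, 0, 2],
       [1, 1, 1, 0, 0, 1],
       [0, 1, 1, 1, 0, 2],
       [0, 0, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 2]]
R = rref_mod3(aug)
print(R)
```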
34. The possible configurations are given by

    [ 1 1 0 0 0 ] [ s1 ]   [ s1 + s2 ]
    [ 1 1 1 0 0 ] [ s2 ]   [ s1 + s2 + s3 ]
    [ 0 1 1 1 0 ] [ s3 ] = [ s2 + s3 + s4 ]
    [ 0 0 1 1 1 ] [ s4 ]   [ s3 + s4 + s5 ]
    [ 0 0 0 1 1 ] [ s5 ]   [ s4 + s5 ]

where si ∈ {0, 1, 2} is the number of times that switch i is pushed (and all arithmetic is done in Z3).
35. (a) Let a value of 1 represent white, and 0 represent black. Let ai correspond to touching the number
i; it has a 1 in a given entry if the value of that square changes when i is pressed, and a 0 if it does
not (thus the 1’s correspond to the squares with the blue dots in them in the diagram). Since
each square has two states, white and black, we need to solve, over Z2 , the equation
s + x1 a1 + x2 a2 + · · · + x9 a9 = t.
Here

    s = [0, 1, 1, 1, 0, 1, 1, 1, 0]^T,   t = [0, 0, 0, 0, 0, 0, 0, 0, 0]^T,
so that
x1 a1 + x2 a2 + · · · + x9 a9 = 0 − s = s.
Row-reducing the corresponding augmented matrix gives

    [ 1 1 0 1 0 0 0 0 0 | 0 ]      [ 1 0 0 0 0 0 0 0 0 | 0 ]
    [ 1 1 1 0 1 0 0 0 0 | 1 ]      [ 0 1 0 0 0 0 0 0 0 | 0 ]
    [ 0 1 1 0 0 1 0 0 0 | 1 ]      [ 0 0 1 0 0 0 0 0 0 | 1 ]
    [ 1 0 0 1 1 0 1 0 0 | 1 ]      [ 0 0 0 1 0 0 0 0 0 | 0 ]
    [ 1 0 1 0 1 0 1 0 1 | 0 ]  →   [ 0 0 0 0 1 0 0 0 0 | 0 ]
    [ 0 0 1 0 1 1 0 0 1 | 1 ]      [ 0 0 0 0 0 1 0 0 0 | 0 ]
    [ 0 0 0 1 0 0 1 1 0 | 1 ]      [ 0 0 0 0 0 0 1 0 0 | 1 ]
    [ 0 0 0 0 1 0 1 1 1 | 1 ]      [ 0 0 0 0 0 0 0 1 0 | 0 ]
    [ 0 0 0 0 0 1 0 1 1 | 0 ]      [ 0 0 0 0 0 0 0 0 1 | 0 ]

So touching the third and seventh squares will turn all the squares black.
(b) In part (a), note that the row-reduced matrix consists of the identity matrix and a column vector
corresponding to the constants. Changing the constants cannot produce an inconsistent system,
since the matrix will still reduce to the identity matrix. So any configuration is reachable, since
we can solve x1 a1 + x2 a2 + · · · + x9 a9 = s for any s.
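The answer to part (a) can be checked without any row reduction at all: over Z2, just apply the effect of pressing squares 3 and 7 to the starting state. The matrix rows below are the ones reduced above.

```python
# Direct check of Exercise 35(a): over Z2, pressing squares 3 and 7
# changes the starting state s to the all-black state 0.
A = [[1, 1, 0, 1, 0, 0, 0, 0, 0],   # row i: which presses toggle square i
     [1, 1, 1, 0, 1, 0, 0, 0, 0],
     [0, 1, 1, 0, 0, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0, 1, 0, 0],
     [1, 0, 1, 0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0, 0, 1],
     [0, 0, 0, 1, 0, 0, 1, 1, 0],
     [0, 0, 0, 0, 1, 0, 1, 1, 1],
     [0, 0, 0, 0, 0, 1, 0, 1, 1]]
s = [0, 1, 1, 1, 0, 1, 1, 1, 0]     # starting state (1 = white)
x = [0, 0, 1, 0, 0, 0, 1, 0, 0]     # press squares 3 and 7
final = [(si + sum(a * b for a, b in zip(row, x))) % 2 for si, row in zip(s, A)]
print(final)
```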
36. In this version of the puzzle, the squares have three states, so we work over Z3 , with 2 corresponding
to black, 1 to gray, and 0 to white. Then we need to go from

    s = [2, 2, 0, 1, 1, 0, 2, 0, 1]^T   to   t = [2, 2, 2, 2, 2, 2, 2, 2, 2]^T.
Thus we want to solve

    x1 a1 + x2 a2 + · · · + x9 a9 = t − s = [0, 0, 2, 1, 1, 2, 0, 2, 1]^T.
Row-reducing the corresponding augmented matrix gives

    [ 1 1 0 1 0 0 0 0 0 | 0 ]      [ 1 0 0 0 0 0 0 0 0 | 0 ]
    [ 1 1 1 0 1 0 0 0 0 | 0 ]      [ 0 1 0 0 0 0 0 0 0 | 1 ]
    [ 0 1 1 0 0 1 0 0 0 | 2 ]      [ 0 0 1 0 0 0 0 0 0 | 0 ]
    [ 1 0 0 1 1 0 1 0 0 | 1 ]      [ 0 0 0 1 0 0 0 0 0 | 2 ]
    [ 1 0 1 0 1 0 1 0 1 | 1 ]  →   [ 0 0 0 0 1 0 0 0 0 | 2 ]
    [ 0 0 1 0 1 1 0 0 1 | 2 ]      [ 0 0 0 0 0 1 0 0 0 | 1 ]
    [ 0 0 0 1 0 0 1 1 0 | 0 ]      [ 0 0 0 0 0 0 1 0 0 | 0 ]
    [ 0 0 0 0 1 0 1 1 1 | 2 ]      [ 0 0 0 0 0 0 0 1 0 | 1 ]
    [ 0 0 0 0 0 1 0 1 1 | 1 ]      [ 0 0 0 0 0 0 0 0 1 | 2 ]
showing that a solution exists.
37. Let Grace and Hans’ ages be g and h. Then g = 3h and g + 5 = 2(h + 5), or g = 2(h + 5) − 5 = 2h + 5.
It follows that g = 3h = 2h + 5, so that h = 5 and g = 15. So Hans is 5 and Grace is 15.
38. Let the ages be a, b, and c. Then the first two conditions give us a + b + c = 60 and a − b = b − c.
For the third condition, Bert will be as old as Annie is now in a − b years; at that time, Annie will be
a + (a − b) = 2a − b years, and we are given that 2a − b = 3c. So we get the system

    a + b + c = 60        [ 1  1  1 | 60 ]      [ 1 0 0 | 28 ]
    a − 2b + c = 0   ⇒    [ 1 −2  1 |  0 ]  →   [ 0 1 0 | 20 ]
    2a − b − 3c = 0       [ 2 −1 −3 |  0 ]      [ 0 0 1 | 12 ]

Annie is 28, Bert is 20, and Chris is 12.
39. Let the areas of the fields be a and b. We have the two equations
    a + b = 1800
    (2/3)a + (1/2)b = 1100.

Multiply the second equation by 2 and subtract the first equation from the result to get (1/3)a = 400, so
that a = 1200 square yards and thus b = 1800 − a = 600 square yards.
40. Let x1 , x2 , and x3 be the number of bundles of the first, second, and third types of corn. Then from
the given conditions we get
3x1 + 2x2 + x3 = 39
2x1 + 3x2 + x3 = 34
x1 + 2x2 + 3x3 = 26.
Row-reduce the augmented matrix to get

    [ 3 2 1 | 39 ]      [ 1 0 0 | 37/4 ]
    [ 2 3 1 | 34 ]  →   [ 0 1 0 | 17/4 ]
    [ 1 2 3 | 26 ]      [ 0 0 1 | 11/4 ]

Therefore there are 9.25 measures of corn in a bundle of the first type, 4.25 measures of corn in a
bundle of the second type, and 2.75 measures of corn in a bundle of the third type.
41. (a) The addition table gives a + c = 2, a + d = 4, b + c = 3, b + d = 5. Construct and row-reduce the
corresponding augmented matrix:

    [ 1 0 1 0 | 2 ]      [ 1 0 0  1 |  4 ]
    [ 1 0 0 1 | 4 ]      [ 0 1 0  1 |  5 ]
    [ 0 1 1 0 | 3 ]  →   [ 0 0 1 −1 | −2 ]
    [ 0 1 0 1 | 5 ]      [ 0 0 0  0 |  0 ]
So d is a free variable. Solving for the other variables in terms of d gives a = 4 − d, b = 5 − d, and
c = d − 2. So there are an infinite number of solutions; for example, with d = 1, we get a = 3,
b = 4, and c = −1.
(b) We have a + c = 3, a + d = 4, b + c = 6, b + d = 5. Construct and row-reduce the corresponding
augmented matrix:

    [ 1 0 1 0 | 3 ]      [ 1 0 0  1 | 0 ]
    [ 1 0 0 1 | 4 ]      [ 0 1 0  1 | 0 ]
    [ 0 1 1 0 | 6 ]  →   [ 0 0 1 −1 | 0 ]
    [ 0 1 0 1 | 5 ]      [ 0 0 0  0 | 1 ]
So this is an inconsistent system, and there are no such values of a, b, c, and d.
42. The addition table gives a + c = w, a + d = y, b + c = x, b + d = z. Construct and row-reduce the
corresponding augmented matrix:

    [ 1 0 1 0 | w ]      [ 1 0 0  1 | y ]
    [ 1 0 0 1 | y ]      [ 0 1 0  1 | z ]
    [ 0 1 1 0 | x ]  →   [ 0 0 1 −1 | w − y ]
    [ 0 1 0 1 | z ]      [ 0 0 0  0 | w − x − y + z ]
In order for this to be a consistent system, we must have z − x − y + w = 0, or z − y = x − w. Another
way of phrasing this is x + y = z + w; in other words, the sum of the diagonal entries must equal the
sum of the off-diagonal entries.
43. (a) The addition table gives the following system of equations:
a+d=3
b+d=2
c+d=1
a+e=5
b+e=4
c+e=3
a+f =4
b+f =3
c + f = 1.
Construct and row-reduce the corresponding augmented matrix:

    [ 1 0 0 1 0 0 | 3 ]      [ 1 0 0 0 0  1 | 0 ]
    [ 1 0 0 0 1 0 | 5 ]      [ 0 1 0 0 0  1 | 0 ]
    [ 1 0 0 0 0 1 | 4 ]      [ 0 0 1 0 0  1 | 0 ]
    [ 0 1 0 1 0 0 | 2 ]      [ 0 0 0 1 0 −1 | 0 ]
    [ 0 1 0 0 1 0 | 4 ]  →   [ 0 0 0 0 1 −1 | 0 ]
    [ 0 1 0 0 0 1 | 3 ]      [ 0 0 0 0 0  0 | 1 ]
    [ 0 0 1 1 0 0 | 1 ]      [ 0 0 0 0 0  0 | 0 ]
    [ 0 0 1 0 1 0 | 3 ]      [ 0 0 0 0 0  0 | 0 ]
    [ 0 0 1 0 0 1 | 1 ]      [ 0 0 0 0 0  0 | 0 ]
The sixth row shows that this is an inconsistent system.
(b) The addition table gives the following system of equations:
a+d=1
b+d=2
c+d=3
a+e=3
b+e=4
c+e=5
a+f =4
b+f =5
c + f = 6.
Construct and row-reduce the corresponding augmented matrix:

    [ 1 0 0 1 0 0 | 1 ]      [ 1 0 0 0 0  1 |  4 ]
    [ 1 0 0 0 1 0 | 3 ]      [ 0 1 0 0 0  1 |  5 ]
    [ 1 0 0 0 0 1 | 4 ]      [ 0 0 1 0 0  1 |  6 ]
    [ 0 1 0 1 0 0 | 2 ]      [ 0 0 0 1 0 −1 | −3 ]
    [ 0 1 0 0 1 0 | 4 ]  →   [ 0 0 0 0 1 −1 | −1 ]
    [ 0 1 0 0 0 1 | 5 ]      [ 0 0 0 0 0  0 |  0 ]
    [ 0 0 1 1 0 0 | 3 ]      [ 0 0 0 0 0  0 |  0 ]
    [ 0 0 1 0 1 0 | 5 ]      [ 0 0 0 0 0  0 |  0 ]
    [ 0 0 1 0 0 1 | 6 ]      [ 0 0 0 0 0  0 |  0 ]
So f is a free variable, and there are an infinite number of solutions of the form a = 4 − f,
b = 5 − f, c = 6 − f, d = f − 3, e = f − 1.
44. Generalize the addition table to
    +   d   e   f
    a   m   p   s
    b   n   q   t
    c   o   r   u
The addition table gives the following system of equations:
a+d=m
b+d=n
c+d=o
a+e=p
b+e=q
c+e=r
a+f =s
b+f =t
c + f = u.
Construct and row-reduce the corresponding augmented matrix:

    [ 1 0 0 1 0 0 | m ]      [ 1 0 0 0 0  1 | s ]
    [ 1 0 0 0 1 0 | p ]      [ 0 1 0 0 0  1 | t ]
    [ 1 0 0 0 0 1 | s ]      [ 0 0 1 0 0  1 | u ]
    [ 0 1 0 1 0 0 | n ]      [ 0 0 0 1 0 −1 | m − s ]
    [ 0 1 0 0 1 0 | q ]  →   [ 0 0 0 0 1 −1 | p − s ]
    [ 0 1 0 0 0 1 | t ]      [ 0 0 0 0 0  0 | m − n − p + q ]
    [ 0 0 1 1 0 0 | o ]      [ 0 0 0 0 0  0 | n − o − q + r ]
    [ 0 0 1 0 1 0 | r ]      [ 0 0 0 0 0  0 | p − q − s + t ]
    [ 0 0 1 0 0 1 | u ]      [ 0 0 0 0 0  0 | q − r − t + u ]
For the table to be valid, the last entry in the zero rows must also be zero, so we need the conditions
m + q = n + p, n + r = o + q, p + t = q + s, and q + u = r + t. In terms of the addition table, this means
that for each 2 × 2 submatrix, the sum of the diagonal entries must equal the sum of the off-diagonal
entries (compare with Exercise 42).
45. (a) Since the three points (0, 1), (−1, 4), and (2, 1) must satisfy the equation y = ax2 + bx + c,
substitute those values to get the three equations c = 1, a − b + c = 4, and 4a + 2b + c = 1.
Construct and row-reduce the corresponding augmented matrix:

    [ 0  0 1 | 1 ]      [ 1 0 0 |  1 ]
    [ 1 −1 1 | 4 ]  →   [ 0 1 0 | −2 ]
    [ 4  2 1 | 1 ]      [ 0 0 1 |  1 ]

Thus a = 1, b = −2, and c = 1, so that the equation of the parabola is y = x2 − 2x + 1.
(b) Since the three points (−3, 1), (−2, 2), and (−1, 5) must satisfy the equation y = ax2 + bx + c,
substitute those values to get the three equations 9a − 3b + c = 1, 4a − 2b + c = 2, and a − b + c = 5.
Construct and row-reduce the corresponding augmented matrix:

    [ 9 −3 1 | 1 ]      [ 1 0 0 |  1 ]
    [ 4 −2 1 | 2 ]  →   [ 0 1 0 |  6 ]
    [ 1 −1 1 | 5 ]      [ 0 0 1 | 10 ]

Thus a = 1, b = 6, and c = 10, so that the equation of the parabola is y = x2 + 6x + 10.
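Both parts follow the same recipe: solve a 3x3 Vandermonde-style system for the coefficients. A sketch (the `fit_parabola` helper is ours) using exact fractions:

```python
# Fit y = a*x^2 + b*x + c through three points, as in Exercise 45.
from fractions import Fraction

def fit_parabola(points):
    # Build augmented rows [x^2, x, 1 | y] and run Gauss-Jordan.
    M = [[Fraction(x * x), Fraction(x), Fraction(1), Fraction(y)] for x, y in points]
    for c in range(3):
        piv = next(i for i in range(c, 3) if M[i][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for i in range(3):
            if i != c and M[i][c] != 0:
                M[i] = [u - M[i][c] * v for u, v in zip(M[i], M[c])]
    return [row[3] for row in M]

a, b, c = fit_parabola([(0, 1), (-1, 4), (2, 1)])
print(a, b, c)   # coefficients of y = x^2 - 2x + 1
```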
46. (a) Since the three points (0, 1), (−1, 4), and (2, 1) must satisfy the equation x2 + y 2 + ax + by + c = 0,
substitute those values to get the three equations
1 + b + c = 0
1 + 16 − a + 4b + c = 0
4 + 1 + 2a + b + c = 0
⇒
b + c = −1
a − 4b − c = 17
2a + b + c = −5 .
Construct and row-reduce the corresponding augmented matrix:

    [ 0  1  1 | −1 ]      [ 1 0 0 | −2 ]
    [ 1 −4 −1 | 17 ]  →   [ 0 1 0 | −6 ]
    [ 2  1  1 | −5 ]      [ 0 0 1 |  5 ]

Thus a = −2, b = −6, and c = 5, so that the equation of the circle is x2 + y2 − 2x − 6y + 5 = 0.
Completing the square gives (x − 1)2 + (y − 3)2 + 5 − 1 − 9 = 0, so that (x − 1)2 + (y − 3)2 = 5.
This is a circle whose center is (1, 3), with radius √5. A sketch of the circle follows part (b).
(b) Since the three points (−3, 1), (−2, 2), and (−1, 5) must satisfy the equation x2 +y 2 +ax+by+c = 0,
substitute those values to get the three equations
9 + 1 − 3a + b + c = 0
4 + 4 − 2a + 2b + c = 0
1 + 25 − a + 5b + c = 0
⇒
3a − b − c = 10
2a − 2b − c = 8
a − 5b − c = 26 .
Construct and row-reduce the corresponding augmented matrix:

    [ 3 −1 −1 | 10 ]      [ 1 0 0 |  12 ]
    [ 2 −2 −1 |  8 ]  →   [ 0 1 0 | −10 ]
    [ 1 −5 −1 | 26 ]      [ 0 0 1 |  36 ]

Thus a = 12, b = −10, and c = 36, so that the equation of the circle is x2 + y2 + 12x − 10y + 36 = 0.
Completing the square gives (x + 6)2 + (y − 5)2 + 36 − 36 − 25 = 0, so that (x + 6)2 + (y − 5)2 = 25.
This is a circle whose center is (−6, 5), with radius 5. Sketches of both circles are:
[Sketches of the two circles: the first centered at (1, 3) through the points (0, 1), (−1, 4), and (2, 1);
the second centered at (−6, 5) through the points (−3, 1), (−2, 2), and (−1, 5).]
47. We have (3x + 1)/(x2 + 2x − 3) = (3x + 1)/((x − 1)(x + 3)) = A/(x − 1) + B/(x + 3), so that clearing fractions, we get
(x + 3)A + (x − 1)B = 3x + 1. Collecting terms gives (A + B)x + (3A − B) = 3x + 1. This gives the linear system A + B = 3,
3A − B = 1. Construct and row-reduce the corresponding augmented matrix:

    [ 1  1 | 3 ]      [ 1 0 | 1 ]
    [ 3 −1 | 1 ]  →   [ 0 1 | 2 ]

Thus A = 1 and B = 2, so that (3x + 1)/(x2 + 2x − 3) = 1/(x − 1) + 2/(x + 3).
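Partial-fraction identities like this one are easy to spot-check by evaluating both sides at a few sample points away from the poles:

```python
# Spot-check of the decomposition in Exercise 47:
# (3x + 1)/(x^2 + 2x - 3) = 1/(x - 1) + 2/(x + 3), with exact fractions.
from fractions import Fraction

def lhs(x):
    return Fraction(3 * x + 1, x * x + 2 * x - 3)

def rhs(x):
    return Fraction(1, x - 1) + Fraction(2, x + 3)

# sample points avoiding the poles x = 1 and x = -3
for x in [0, 2, 5, -2, 10]:
    assert lhs(x) == rhs(x)
print("decomposition verified")
```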
48. We have

    (x2 − 3x + 3)/(x3 + 2x2 + x) = (x2 − 3x + 3)/(x(x + 1)2) = A/x + B/(x + 1) + C/(x + 1)2
    ⇒ x2 − 3x + 3 = A(x + 1)2 + Bx(x + 1) + Cx
    ⇒ x2 − 3x + 3 = (A + B)x2 + (2A + B + C)x + A
    ⇒ A + B = 1,  2A + B + C = −3,  A = 3.

This is most easily solved by substitution: A = 3, so that A + B = 1 gives B = −2. Substituting into
the remaining equation gives 2 · 3 − 2 + C = −3, so that C = −7. Thus

    (x2 − 3x + 3)/(x3 + 2x2 + x) = 3/x − 2/(x + 1) − 7/(x + 1)2.
49. We have

    (x − 1)/((x + 1)(x2 + 1)(x2 + 4)) = A/(x + 1) + (Bx + C)/(x2 + 1) + (Dx + E)/(x2 + 4)
    ⇒ x − 1 = A(x2 + 1)(x2 + 4) + (Bx + C)(x + 1)(x2 + 4) + (Dx + E)(x + 1)(x2 + 1).

Expand the right-hand side and collect terms to get

    x − 1 = (A + B + D)x4 + (B + C + D + E)x3 + (5A + 4B + C + D + E)x2
            + (4B + 4C + D + E)x + (4A + 4C + E).
This gives the linear system

    A + B + D = 0
    B + C + D + E = 0
    5A + 4B + C + D + E = 0
    4B + 4C + D + E = 1
    4A + 4C + E = −1.

Construct and row-reduce the corresponding augmented matrix:

    [ 1 1 0 1 0 |  0 ]      [ 1 0 0 0 0 |  −1/5 ]
    [ 0 1 1 1 1 |  0 ]      [ 0 1 0 0 0 |   1/3 ]
    [ 5 4 1 1 1 |  0 ]  →   [ 0 0 1 0 0 |     0 ]
    [ 0 4 4 1 1 |  1 ]      [ 0 0 0 1 0 | −2/15 ]
    [ 4 0 4 0 1 | −1 ]      [ 0 0 0 0 1 |  −1/5 ]

Thus A = −1/5, B = 1/3, C = 0, D = −2/15, and E = −1/5, giving

    (x − 1)/((x + 1)(x2 + 1)(x2 + 4)) = −1/(5(x + 1)) + x/(3(x2 + 1)) + (−(2/15)x − 1/5)/(x2 + 4)
                                      = −1/(5(x + 1)) + x/(3(x2 + 1)) − (2x + 3)/(15(x2 + 4)).
50. We have

    (x3 + x + 1)/(x(x − 1)(x2 + x + 1)(x2 + 1)3)
        = A/x + B/(x − 1) + (Cx + D)/(x2 + x + 1) + (Ex + F)/(x2 + 1) + (Gx + H)/(x2 + 1)2 + (Ix + J)/(x2 + 1)3.
Multiply both sides of the equation by x(x − 1)(x2 + x + 1)(x2 + 1)3 , simplify, and equate coefficients
of xn to get the system
    A + B + C + E = 0
    B − C + D + F = 0
    3A + 4B + 3C − D + 2E + G = 0
    −A + 3B − 3C + 3D − E + 2F + H = 0
    3A + 6B + 3C − 3D + E − F + G + I = 0
    −3A + 3B − 3C + 3D − 2E + F − G + H + J = 0
    A + 4B + C − 3D − 2F − H = 1
    −3A + B − C + D − E − G − I = 0
    B − D − F − H − J = 1
    −A = 1.

Construct and row-reduce the corresponding augmented matrix:

    [  1 1  1  0  1  0  0  0  0  0 | 0 ]
    [  0 1 −1  1  0  1  0  0  0  0 | 0 ]
    [  3 4  3 −1  2  0  1  0  0  0 | 0 ]
    [ −1 3 −3  3 −1  2  0  1  0  0 | 0 ]
    [  3 6  3 −3  1 −1  1  0  1  0 | 0 ]
    [ −3 3 −3  3 −2  1 −1  1  0  1 | 0 ]
    [  1 4  1 −3  0 −2  0 −1  0  0 | 1 ]
    [ −3 1 −1  1 −1  0 −1  0 −1  0 | 0 ]
    [  0 1  0 −1  0 −1  0 −1  0 −1 | 1 ]
    [ −1 0  0  0  0  0  0  0  0  0 | 1 ]

Gauss-Jordan elimination produces the identity matrix together with the solution column

    A = −1,  B = 1/8,  C = −1,  D = 0,  E = 15/8,  F = −9/8,  G = 7/4,  H = −1/4,  I = 1/2,  J = 1/2.
This gives the partial fraction decomposition

    (x3 + x + 1)/(x(x − 1)(x2 + x + 1)(x2 + 1)3)
        = −1/x + 1/(8(x − 1)) − x/(x2 + x + 1) + ((15/8)x − 9/8)/(x2 + 1) + ((7/4)x − 1/4)/(x2 + 1)2 + ((1/2)x + 1/2)/(x2 + 1)3
        = −1/x + 1/(8(x − 1)) − x/(x2 + x + 1) + (15x − 9)/(8(x2 + 1)) + (7x − 1)/(4(x2 + 1)2) + (x + 1)/(2(x2 + 1)3).
51. Suppose that 1 + 2 + · · · + n = an2 + bn + c, and let n = 0, 1, and 2 to get the system
c=0
a+ b+c=1
4a + 2b + c = 3.
Construct and row-reduce the associated augmented matrix:

    [ 0 0 1 | 0 ]      [ 1 0 0 | 1/2 ]
    [ 1 1 1 | 1 ]  →   [ 0 1 0 | 1/2 ]
    [ 4 2 1 | 3 ]      [ 0 0 1 |  0  ]

Thus a = b = 1/2 and c = 0, so that

    1 + 2 + · · · + n = (1/2)n2 + (1/2)n = n(n + 1)/2.
52. Suppose that 12 + 22 + · · · + n2 = an3 + bn2 + cn + d, and let n = 0, 1, 2, and 3 to get the system
d= 0
a+ b+ c+d= 1
8a + 4b + 2c + d = 5
27a + 9b + 3c + d = 14.
Construct and row-reduce the associated augmented matrix:

    [  0 0 0 1 |  0 ]      [ 1 0 0 0 | 1/3 ]
    [  1 1 1 1 |  1 ]  →   [ 0 1 0 0 | 1/2 ]
    [  8 4 2 1 |  5 ]      [ 0 0 1 0 | 1/6 ]
    [ 27 9 3 1 | 14 ]      [ 0 0 0 1 |  0  ]

Thus a = 1/3, b = 1/2, c = 1/6, and d = 0, so that

    12 + 22 + · · · + n2 = (1/3)n3 + (1/2)n2 + (1/6)n = (1/6)(2n3 + 3n2 + n) = n(n + 1)(2n + 1)/6.
53. Suppose that 13 + 23 + · · · + n3 = an4 + bn3 + cn2 + dn + e, and let n = 0, 1, 2, 3, and 4 to get the
system
    e = 0
    a + b + c + d + e = 1
    16a + 8b + 4c + 2d + e = 9
    81a + 27b + 9c + 3d + e = 36
    256a + 64b + 16c + 4d + e = 100.
Construct and row-reduce the associated augmented matrix:

    [   0  0  0 0 1 |   0 ]      [ 1 0 0 0 0 | 1/4 ]
    [   1  1  1 1 1 |   1 ]      [ 0 1 0 0 0 | 1/2 ]
    [  16  8  4 2 1 |   9 ]  →   [ 0 0 1 0 0 | 1/4 ]
    [  81 27  9 3 1 |  36 ]      [ 0 0 0 1 0 |  0  ]
    [ 256 64 16 4 1 | 100 ]      [ 0 0 0 0 1 |  0  ]

Thus a = 1/4, b = 1/2, c = 1/4, and d = e = 0, so that

    13 + 23 + · · · + n3 = (1/4)n4 + (1/2)n3 + (1/4)n2 = (1/4)n2(n2 + 2n + 1) = (1/4)n2(n + 1)2 = (n(n + 1)/2)2.
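The three closed forms derived in Exercises 51-53 can be confirmed against brute-force sums:

```python
# Check the interpolated formulas of Exercises 51-53 for small n.
def sum1(n): return n * (n + 1) // 2
def sum2(n): return n * (n + 1) * (2 * n + 1) // 6
def sum3(n): return (n * (n + 1) // 2) ** 2

for n in range(1, 20):
    assert sum1(n) == sum(range(1, n + 1))
    assert sum2(n) == sum(k * k for k in range(1, n + 1))
    assert sum3(n) == sum(k ** 3 for k in range(1, n + 1))
print("all three closed forms verified")
```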
2.5    Iterative Methods for Solving Linear Systems
1. Start by solving the first equation for x1 and the second for x2:

    x1 = 6/7 + (1/7)x2
    x2 = 4/5 + (1/5)x1.

Using the initial vector [x1, x2] = [0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6
    x1   0   0.857  0.971  0.996  0.999  1.000  1.000
    x2   0   0.800  0.971  0.994  0.999  1.000  1.000
Solving the first equation for x2 gives x2 = 7x1 − 6; substituting into the second equation gives
x1 − 5(7x1 − 6) = −4
so that
− 34x1 = −34
⇒
x1 = 1.
Substituting this value back into either equation gives x2 = 1, so the exact solution is [x1 , x2 ] = [1, 1].
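The iteration just tabulated is a two-line program; this sketch runs the same Jacobi update from [0, 0] and converges to the exact solution [1, 1]:

```python
# Jacobi iteration for the system of Exercise 1.
def jacobi_step(x):
    x1, x2 = x
    # each new component uses only the OLD values
    return (6/7 + x2/7, 4/5 + x1/5)

x = (0.0, 0.0)
for _ in range(20):
    x = jacobi_step(x)
print(x)   # close to the fixed point (1, 1)
```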
2. Solving for x1 and x2 gives

    x1 = 5/2 − (1/2)x2
    x2 = −1 + x1.

Using the initial vector [x1, x2] = [0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6      7      8      9      10     11     12
    x1   0   2.5    3.0    1.75   1.5    2.125  2.25   1.938  1.875  2.031  2.062  1.984  1.969
    x2   0   −1.0   1.5    2.0    0.75   0.5    1.125  1.25   0.938  0.875  1.031  1.062  0.984

    n    13     14     15     16     17     18     19     20     21     22     23     24     25
    x1   2.008  2.016  1.996  1.992  2.002  2.004  1.999  1.998  2.000  2.001  2.000  2.000  2.000
    x2   0.969  1.008  1.016  0.996  0.992  1.002  1.004  0.999  0.998  1.000  1.001  1.000  1.000

Using x2 = x1 − 1 in the first equation gives x1 = 5/2 − (1/2)(x1 − 1), so that x1 = 2 and x2 = 1. So the
exact solution is [x1, x2] = [2, 1].
3. Solving for x1 and x2 gives

    x1 = (1/9)x2 + 2/9
    x2 = (2/7)x1 + 2/7.

Using the initial vector [x1, x2] = [0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6
    x1   0   0.222  0.254  0.261  0.262  0.262  0.262
    x2   0   0.286  0.349  0.358  0.360  0.361  0.361

Solving the first equation for x2 gives x2 = 9x1 − 2. Substituting this into the second equation gives
x1 − 3.5(9x1 − 2) = −1, so that −30.5x1 = −8 and thus x1 = 8/30.5 ≈ 0.262295. This gives x2 = 9 · (8/30.5) − 2 ≈
0.360656.
4. Solving for x1, x2, and x3 gives

    x1 = −(1/20)x2 + (1/20)x3 + 17/20
    x2 = (1/10)x1 + (1/10)x3 − 13/10
    x3 = (1/10)x1 − (1/10)x2 + 9/5.

Using the initial vector [x1, x2, x3] = [0, 0, 0], we get a sequence of approximations:

    n    0   1      2       3       4       5       6
    x1   0   0.85   1.005   1.002   1.000   1.000   1.000
    x2   0   −1.3   −1.035  −0.998  −0.999  −1.000  −1.000
    x3   0   1.8    2.015   2.004   2.000   2.000   2.000

To solve the system directly, row-reduce the augmented matrix:

    [ 20   1  −1 | 17 ]      [ 1 0 0 |  1 ]
    [  1 −10   1 | 13 ]  →   [ 0 1 0 | −1 ]
    [ −1   1  10 | 18 ]      [ 0 0 1 |  2 ]

giving the exact solution [x1, x2, x3] = [1, −1, 2].
5. Solve to get

    x1 = −(1/3)x2 + 1/3
    x2 = −(1/4)x1 − (1/4)x3 + 1/4
    x3 = −(1/3)x2 + 1/3.

Using the initial vector [x1, x2, x3] = [0, 0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6      7      8      9
    x1   0   0.333  0.250  0.306  0.292  0.301  0.299  0.300  0.300  0.300
    x2   0   0.250  0.083  0.125  0.097  0.104  0.100  0.101  0.100  0.100
    x3   0   0.333  0.250  0.306  0.292  0.301  0.299  0.300  0.300  0.300

To solve the system directly, row-reduce the augmented matrix:

    [ 3 1 0 | 1 ]      [ 1 0 0 | 3/10 ]
    [ 1 4 1 | 1 ]  →   [ 0 1 0 | 1/10 ]
    [ 0 1 3 | 1 ]      [ 0 0 1 | 3/10 ]

giving the exact solution [x1, x2, x3] = [3/10, 1/10, 3/10].
6. Solve to get

    x1 = (1/3)x2 + 1/3
    x2 = (1/3)x1 + (1/3)x3
    x3 = (1/3)x2 + (1/3)x4 + 1/3
    x4 = (1/3)x3 + 1/3.

Using the initial vector [x1, x2, x3, x4] = [0, 0, 0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6      7      8      9      10     11     12
    x1   0   0.333  0.333  0.407  0.420  0.440  0.444  0.450  0.452  0.453  0.454  0.454  0.454
    x2   0   0      0.222  0.259  0.321  0.333  0.351  0.355  0.360  0.361  0.363  0.363  0.363
    x3   0   0.333  0.444  0.556  0.580  0.613  0.620  0.630  0.632  0.634  0.635  0.636  0.636
    x4   0   0.333  0.444  0.481  0.519  0.527  0.538  0.540  0.543  0.544  0.545  0.545  0.545

To solve the system directly, row-reduce the augmented matrix:

    [  3 −1  0  0 | 1 ]      [ 1 0 0 0 | 5/11 ]
    [ −1  3 −1  0 | 0 ]      [ 0 1 0 0 | 4/11 ]
    [  0 −1  3 −1 | 1 ]  →   [ 0 0 1 0 | 7/11 ]
    [  0  0 −1  3 | 1 ]      [ 0 0 0 1 | 6/11 ]

giving the exact solution [x1, x2, x3, x4] = [5/11, 4/11, 7/11, 6/11].

7. Using the equations from Exercise 1, we get (to three decimal places, but keeping four during the
calculations)
    n    0   1      2      3      4
    x1   0   0.857  0.996  1.000  1.000
    x2   0   0.971  0.999  1.000  1.000
The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5.
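The difference from Jacobi is only that each freshly computed component is used immediately; a sketch of the Gauss-Seidel update for this same system:

```python
# Gauss-Seidel iteration for the system of Exercises 1 and 7.
def gauss_seidel(x1, x2, iters):
    history = [(x1, x2)]
    for _ in range(iters):
        x1 = 6/7 + x2/7      # uses the old x2
        x2 = 4/5 + x1/5      # uses the freshly updated x1
        history.append((round(x1, 3), round(x2, 3)))
    return history

hist = gauss_seidel(0.0, 0.0, 4)
print(hist)   # matches the table above
```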
8. Using the equations from Exercise 2, we get (to three decimal places, but keeping four during the
calculations)
    n    0   1     2      3      4      5      6      7      8      9      10     11     12
    x1   0   2.5   1.75   2.125  1.938  2.031  1.984  2.008  1.996  2.002  1.999  2.000  2.000
    x2   0   1.5   0.75   1.125  0.938  1.031  0.984  1.008  0.996  1.002  0.999  1.000  1.000
The Gauss-Seidel method takes 11 iterations, while the Jacobi method takes 24.
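Gauss-Seidel differs from Jacobi only in that each newly computed component is used immediately in the updates that follow it. A short sketch (my own illustration; the update formulas are the ones implied by the table above, i.e. x1 = (5 - x2)/2 and x2 = x1 - 1):

```python
def gauss_seidel_step(x1, x2):
    """One Gauss-Seidel sweep: the new x1 is used at once when updating x2."""
    x1 = (5 - x2) / 2   # solve the first equation for x1
    x2 = x1 - 1         # solve the second equation for x2, using the NEW x1
    return x1, x2

x1 = x2 = 0.0
for n in range(11):
    x1, x2 = gauss_seidel_step(x1, x2)
# after 11 sweeps the iterates agree with the exact solution [2, 1] to three decimals
```

Reusing fresh values is what roughly halves the iteration count compared with Jacobi here.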
9. Using the equations from Exercise 3, we get (to three decimal places, but keeping four during the
calculations)
   n    0   1      2      3      4
   x1   0   0.222  0.261  0.262  0.262
   x2   0   0.349  0.360  0.361  0.361
The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5.
10. Using the equations from Exercise 4, we get (to three decimal places, but keeping four during the
calculations)
   n    0   1       2       3       4
   x1   0   0.850   1.011   1.000   1.000
   x2   0   -1.215  -0.998  -1.000  -1.000
   x3   0   2.006   2.001   2.000   2.000
The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5.
11. Using the equations from Exercise 5, we get (to three decimal places, but keeping four during the
calculations)
   n    0   1      2      3      4      5      6
   x1   0   0.333  0.278  0.296  0.299  0.300  0.300
   x2   0   0.167  0.111  0.102  0.100  0.100  0.100
   x3   0   0.278  0.296  0.299  0.300  0.300  0.300
The Gauss-Seidel method takes 5 iterations, while the Jacobi method takes 8.
12. Using the equations from Exercise 6, we get (to three decimal places, but keeping four during the
calculations)
   n    0   1      2      3      4      5      6      7
   x1   0   0.333  0.370  0.416  0.433  0.451  0.454  0.454
   x2   0   0.111  0.247  0.328  0.353  0.361  0.363  0.363
   x3   0   0.370  0.568  0.617  0.631  0.635  0.636  0.636
   x4   0   0.457  0.523  0.539  0.544  0.545  0.545  0.545
The Gauss-Seidel method takes 6 iterations, while the Jacobi method takes 11.
13. [Figure: graph of the iterates for this exercise, on axes from 0 to 1; not reproduced here.]

14. [Figure: graph of the iterates for this exercise, on axes from 0.5 to 2.5; not reproduced here.]
15. Applying the Gauss-Seidel method to x1 - 2x2 = 3, 3x1 + 2x2 = 1 with an initial estimate of [0, 0] gives

   n    0   1    2    3    4
   x1   0   3    -5   19   -53
   x2   0   -4   8    -28  80

which evidently diverges. If, however, we reverse the two equations to get 3x1 + 2x2 = 1, x1 - 2x2 = 3, the system is strictly diagonally dominant since |3| > |2| and |-2| > |1|. Applying the Gauss-Seidel method now gives

   n    0   1       2       3       4       5       6       7       8       9
   x1   0   0.333   1.222   0.926   1.025   0.992   1.003   0.999   1.000   1.000
   x2   0   -1.333  -0.889  -1.037  -0.988  -1.004  -0.999  -1.000  -1.000  -1.000

The exact solution is [x1, x2] = [1, -1].
16. Applying the Gauss-Seidel method to x1 - 4x2 + 2x3 = 2, 2x2 + 4x3 = 1, 6x1 - x2 - 2x3 = 1 with an initial estimate of [0, 0, 0] gives

   n    0   1     2     3       4
   x1   0   2     -6.5  -8      203.5
   x2   0   0.5   -10   30.5    80
   x3   0   5.25  -15   -39.75  570

which evidently diverges. If, however, we arrange the equations as

   6x1 -  x2 - 2x3 = 1
    x1 - 4x2 + 2x3 = 2
         2x2 + 4x3 = 1,

we get a strictly diagonally dominant system. Applying the Gauss-Seidel method gives

   n    0   1       2       3       4       5       6       7
   x1   0   0.167   0.250   0.250   0.250   0.250   0.250   0.250
   x2   0   -0.458  -0.198  -0.263  -0.247  -0.251  -0.250  -0.250
   x3   0   0.479   0.349   0.382   0.373   0.375   0.375   0.375

The exact solution is [x1, x2, x3] = [1/4, -1/4, 3/8].

17. [Figure: graph of the iterates for this exercise; not reproduced here.]
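The effect of reordering for strict diagonal dominance can be demonstrated directly. This sketch (my own illustration in Python) runs a generic Gauss-Seidel sweep on both orderings of the Exercise 15 system:

```python
def gs_sweep(eqs, x):
    """One Gauss-Seidel sweep. eqs is a list of (coefficient_row, rhs);
    equation i is solved for x[i], using the newest values immediately."""
    for i, (row, b) in enumerate(eqs):
        s = sum(row[j] * x[j] for j in range(len(x)) if j != i)
        x[i] = (b - s) / row[i]
    return x

# Exercise 15: x1 - 2x2 = 3 and 3x1 + 2x2 = 1
bad  = [([1, -2], 3), ([3, 2], 1)]   # not diagonally dominant
good = [([3, 2], 1), ([1, -2], 3)]   # reordered: strictly diagonally dominant

x = [0.0, 0.0]
for n in range(4):
    x = gs_sweep(bad, x)
print(x)   # the iterates blow up, matching the divergent table above

x = [0.0, 0.0]
for n in range(9):
    x = gs_sweep(good, x)
print(x)   # close to the exact solution [1, -1]
```

Same equations, same method; only the order in which the equations are assigned to the unknowns changes.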
18. Applying the Gauss-Seidel method to the equations

   -4x1 + 5x2 = 14
     x1 - 3x2 = -7

gives

   n    0   1     2      3      4      5      6      7      8
   x1   0   -3.5  -2.04  -1.43  -1.18  -1.08  -1.03  -1.01  -1.01
   x2   0   1.17  1.65   1.86   1.94   1.97   1.99   2.00   2.00

This gives an approximate solution (to 0.01) of x1 = -1.01, x2 = 2.00. The exact solution is [x1, x2] = [-1, 2].
19. Applying the Gauss-Seidel method to the equations

    5x1 - 2x2 + 3x3 = -8
     x1 + 4x2 - 4x3 = 102
   -2x1 - 2x2 + 4x3 = -90

gives

   n    0   1       2      3       4       5       6       7       8       9       10      11      12      13      14
   x1   0   -1.6    14.97  8.55    10.74   9.84    10.12   9.99    10.02   10.00   10.00   10.00   10.00   10.00   10.00
   x2   0   25.9    11.41  14.05   11.62   11.72   11.25   11.19   11.08   11.05   11.03   11.01   11.01   11.00   11.00
   x3   0   -10.35  -9.31  -11.2   -11.32  -11.72  -11.82  -11.90  -11.95  -11.97  -11.98  -11.99  -12.00  -12.00  -12.00

This gives an approximate solution (to 0.01) of x1 = 10.00, x2 = 11.00, x3 = -12.00. The exact solution is [x1, x2, x3] = [10, 11, -12].
20. Redoing the computation from Exercise 18, showing three decimal places, and computing to an accuracy of 0.001, gives

   n    0   1      2       3       4       5       6       7       8       9       10      11      12
   x1   0   -3.5   -2.042  -1.434  -1.181  -1.075  -1.031  -1.013  -1.005  -1.002  -1.001  -1.000  -1.000
   x2   0   1.167  1.653   1.855   1.940   1.975   1.990   1.996   1.998   1.999   2.000   2.000   2.000

This gives an approximate solution (to 0.001) of x1 = -1.000, x2 = 2.000. The exact solution is [x1, x2] = [-1, 2].
21. Redoing the computation from Exercise 19, showing three decimal places, and computing to an accuracy of 0.001, gives

   n    0   1       2       3        4        5        6        7        8        9
   x1   0   -1.6    14.97   8.55     10.74    9.839    10.120   9.989    10.022   10.002
   x2   0   25.9    11.408  14.051   11.615   11.718   11.249   11.187   11.082   11.052
   x3   0   -10.35  -9.311  -11.199  -11.322  -11.721  -11.816  -11.912  -11.948  -11.973

   n    10       11       12       13       14       15       16       17       18
   x1   10.005   10.001   10.001   10.000   10.000   10.000   10.000   10.000   10.000
   x2   11.026   11.015   11.008   11.005   11.002   11.001   11.001   11.000   11.000
   x3   -11.985  -11.992  -11.996  -11.998  -11.999  -11.999  -12.000  -12.000  -12.000

This gives an approximate solution (to 0.001) of x1 = 10.000, x2 = 11.000, x3 = -12.000. The exact solution is [x1, x2, x3] = [10, 11, -12].
22. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1 = (1/4)(0 + 40 + 40 + t2) = 20 + (1/4)t2
   t2 = (1/4)(0 + t1 + t3 + 5) = 5/4 + (1/4)t1 + (1/4)t3
   t3 = (1/4)(t2 + 40 + 40 + 5) = 85/4 + (1/4)t2.

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0]:

   n    0   1       2       3       4       5       6       7       8
   t1   0   20.000  21.562  23.086  23.276  23.300  23.303  23.304  23.304
   t2   0   6.250   12.344  13.105  13.201  13.213  13.214  13.214  13.214
   t3   0   22.812  24.336  24.526  24.550  24.553  24.554  24.554  24.554

This gives equilibrium temperatures (to 0.001) of t1 = 23.304, t2 = 13.214, and t3 = 24.554.
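The averaging system can be iterated directly; a minimal Python sketch (my own illustration) of the three Gauss-Seidel updates from this exercise:

```python
t1 = t2 = t3 = 0.0
for n in range(8):
    # each interior temperature is the average of its four neighbours
    t1 = (0 + 40 + 40 + t2) / 4
    t2 = (0 + t1 + t3 + 5) / 4      # uses the new t1 immediately
    t3 = (t2 + 40 + 40 + 5) / 4     # uses the new t2 immediately
# t1, t2, t3 are now 23.304, 13.214, 24.554 to three decimals
```

Eight sweeps reproduce the table above.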
23. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1 = (1/4)(t2 + t3)
   t2 = (1/4)(t1 + t4)
   t3 = (1/4)(t1 + t4 + 200)
   t4 = (1/4)(t2 + t3 + 200).

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0, 0]:

   n    0   1     2       3       4       5       6       7       8       9       10
   t1   0   0     12.5    21.875  24.219  24.805  24.951  24.988  24.997  24.999  25.000
   t2   0   0     18.75   23.438  24.609  24.902  24.976  24.994  24.998  25.000  25.000
   t3   0   50    68.75   73.438  74.609  74.902  74.976  74.994  74.998  75.000  75.000
   t4   0   62.5  71.875  74.219  74.805  74.951  74.988  74.997  74.998  75.000  75.000

This gives equilibrium temperatures (to 0.001) of t1 = t2 = 25 and t3 = t4 = 75.
24. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1 = (1/4)(t2 + t3)
   t2 = (1/4)(t1 + t4 + 40)
   t3 = (1/4)(t1 + t4 + 80)
   t4 = (1/4)(t2 + t3 + 200).

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0, 0]:

   n    0   1     2       3       4       5       6       7       8       9       10
   t1   0   0     7.5     15.625  17.656  18.164  18.291  18.323  18.331  18.333  18.333
   t2   0   10    26.25   30.312  31.328  31.582  31.646  31.661  31.665  31.666  31.667
   t3   0   20    36.25   40.312  41.328  41.582  41.646  41.661  41.665  41.666  41.667
   t4   0   57.5  65.625  67.656  68.164  68.291  68.323  68.331  68.333  68.333  68.333

This gives equilibrium temperatures (to 0.001) of t1 = 18.333, t2 = 31.667, t3 = 41.667, and t4 = 68.333.
25. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1 = (1/4)(t2 + 80)
   t2 = (1/4)(t1 + t3 + t4)
   t3 = (1/4)(t2 + t5 + 80)
   t4 = (1/4)(t2 + t5 + 5)
   t5 = (1/4)(t3 + t4 + t6 + 5)
   t6 = (1/4)(t5 + 85).

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0, 0, 0, 0]:

   n    0   1       2       3       4       5       6
   t1   0   20      21.25   22.812  23.33   23.66   23.773
   t2   0   5       11.25   13.32   14.639  15.093  15.237
   t3   0   21.25   24.609  26.987  27.730  27.963  28.035
   t4   0   2.5     5.859   8.237   8.980   9.213   9.285
   t5   0   7.188   14.629  16.283  16.758  16.904  16.949
   t6   0   23.047  24.097  25.321  25.439  25.476  25.487

   n    7       8       9       10      11      12
   t1   23.809  23.821  23.824  23.825  23.826  23.826
   t2   15.282  15.297  15.301  15.302  15.303  15.303
   t3   28.058  28.065  28.067  28.068  28.068  28.068
   t4   9.308   9.315   9.317   9.318   9.318   9.318
   t5   16.963  16.968  16.969  16.970  16.970  16.970
   t6   25.491  25.492  25.492  25.492  25.492  25.492

This gives equilibrium temperatures (to 0.001) of t1 = 23.826, t2 = 15.303, t3 = 28.068, t4 = 9.318, t5 = 16.970, and t6 = 25.492.
26. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1  = (1/4)(t2 + t5)
   t2  = (1/4)(t1 + t3 + t6)
   t3  = (1/4)(t2 + t4 + t7 + 20)
   t4  = (1/4)(t3 + t8 + 40)
   t5  = (1/4)(t1 + t6 + t9)
   t6  = (1/4)(t2 + t5 + t7 + t10)
   t7  = (1/4)(t3 + t6 + t8 + t11)
   t8  = (1/4)(t4 + t7 + t12 + 20)
   t9  = (1/4)(t5 + t10 + t13 + 40)
   t10 = (1/4)(t6 + t9 + t11 + t14)
   t11 = (1/4)(t7 + t10 + t12 + t15)
   t12 = (1/4)(t8 + t11 + t16 + 100)
   t13 = (1/4)(t9 + t14 + 80)
   t14 = (1/4)(t10 + t13 + t15 + 40)
   t15 = (1/4)(t11 + t14 + t16 + 100)
   t16 = (1/4)(t12 + t15 + 200).

Starting from the zero vector, the Gauss-Seidel method gives the iterates below (each row lists t1 through t16 for one iteration):

   n=0:  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
   n=1:  0, 0, 5, 11.25, 0, 0, 1.25, 8.125, 10, 2.5, 0.938, 27.266, 22.5, 16.25, 29.297, 64.141
   n=2:  0, 1.25, 8.438, 14.141, 2.5, 1.875, 4.844, 16.562, 16.875, 8.984, 17.598, 49.575, 28.281, 26.641, 52.095, 75.417
   n=3:  0.938, 2.812, 10.449, 16.753, 4.922, 5.391, 12.5, 24.707, 20.547, 17.544, 32.928, 58.263, 31.797, 35.359, 60.926, 79.797
   n=4:  1.934, 4.443, 13.424, 19.533, 6.968, 10.364, 20.356, 29.538, 24.077, 25.682, 41.307, 62.661, 34.859, 40.367, 65.368, 82.007
   n=5:  2.853, 6.66, 16.637, 21.544, 9.323, 15.505, 25.747, 32.488, 27.466, 31.161, 46.234, 65.182, 36.958, 43.372, 67.903, 83.271
   n=6:  3.996, 9.035, 19.081, 22.892, 11.742, 19.421, 29.306, 34.345, 29.965, 34.748, 49.285, 66.725, 38.334, 45.246, 69.451, 84.044
   n=7:  5.194, 10.924, 20.781, 23.781, 13.645, 22.156, 31.642, 35.537, 31.682, 37.092, 51.227, 67.702, 39.232, 46.444, 70.429, 84.533
   n=8:  6.142, 12.27, 21.923, 24.365, 14.995, 24.0, 33.172, 36.31, 32.83, 38.625, 52.482, 68.331, 39.818, 47.218, 71.058, 84.847
   n=9:  6.816, 13.185, 22.68, 24.748, 15.911, 25.223, 34.174, 36.813, 33.589, 39.628, 53.298, 68.74, 40.202, 47.722, 71.467, 85.052
   n=10: 7.274, 13.794, 23.179, 24.998, 16.522, 26.029, 34.83, 37.142, 34.088, 40.284, 53.83, 69.006, 40.452, 48.051, 71.733, 85.185
   n=11: 7.579, 14.197, 23.506, 25.162, 16.924, 26.559, 35.259, 37.357, 34.415, 40.714, 54.178, 69.18, 40.617, 48.266, 71.907, 85.272
   n=12: 7.78, 14.461, 23.721, 25.269, 17.189, 26.906, 35.54, 37.497, 34.63, 40.995, 54.406, 69.294, 40.724, 48.406, 72.021, 85.329
   n=13: 7.912, 14.635, 23.861, 25.34, 17.362, 27.133, 35.724, 37.589, 34.77, 41.179, 54.554, 69.368, 40.794, 48.498, 72.095, 85.366
   n=14: 7.999, 14.748, 23.953, 25.386, 17.476, 27.282, 35.845, 37.65, 34.862, 41.299, 54.652, 69.417, 40.84, 48.559, 72.144, 85.39
   n=15: 8.056, 14.823, 24.013, 25.416, 17.55, 27.379, 35.923, 37.689, 34.922, 41.378, 54.716, 69.449, 40.87, 48.598, 72.176, 85.406
   n=16: 8.093, 14.871, 24.053, 25.435, 17.599, 27.443, 35.975, 37.715, 34.962, 41.43, 54.757, 69.47, 40.89, 48.624, 72.197, 85.417
   n=17: 8.118, 14.903, 24.078, 25.448, 17.631, 27.485, 36.009, 37.732, 34.988, 41.463, 54.785, 69.483, 40.903, 48.641, 72.21, 85.423
   n=18: 8.133, 14.924, 24.095, 25.457, 17.651, 27.512, 36.031, 37.743, 35.004, 41.485, 54.803, 69.492, 40.911, 48.652, 72.219, 85.428
   n=19: 8.144, 14.938, 24.106, 25.462, 17.665, 27.53, 36.045, 37.75, 35.015, 41.5, 54.814, 69.498, 40.917, 48.659, 72.225, 85.431
   n=20: 8.151, 14.947, 24.114, 25.466, 17.674, 27.541, 36.055, 37.755, 35.023, 41.509, 54.822, 69.502, 40.92, 48.664, 72.229, 85.433
   n=21: 8.155, 14.953, 24.118, 25.468, 17.68, 27.549, 36.061, 37.758, 35.027, 41.516, 54.827, 69.504, 40.923, 48.667, 72.232, 85.434
   n=22: 8.158, 14.956, 24.121, 25.47, 17.684, 27.554, 36.065, 37.76, 35.03, 41.52, 54.83, 69.506, 40.924, 48.669, 72.233, 85.435
   n=23: 8.16, 14.959, 24.123, 25.471, 17.686, 27.557, 36.068, 37.761, 35.033, 41.522, 54.832, 69.507, 40.925, 48.67, 72.234, 85.435
   n=24: 8.161, 14.961, 24.125, 25.471, 17.688, 27.56, 36.069, 37.762, 35.034, 41.524, 54.834, 69.508, 40.926, 48.671, 72.235, 85.436
   n=25: 8.162, 14.962, 24.126, 25.472, 17.689, 27.561, 36.071, 37.763, 35.035, 41.525, 54.835, 69.508, 40.926, 48.672, 72.235, 85.436
   n=26: 8.163, 14.962, 24.126, 25.472, 17.69, 27.562, 36.071, 37.763, 35.035, 41.526, 54.835, 69.509, 40.927, 48.672, 72.236, 85.436
   n=27: 8.163, 14.963, 24.127, 25.472, 17.69, 27.562, 36.072, 37.763, 35.036, 41.526, 54.836, 69.509, 40.927, 48.672, 72.236, 85.436
   n=28: 8.163, 14.963, 24.127, 25.472, 17.69, 27.563, 36.072, 37.763, 35.036, 41.527, 54.836, 69.509, 40.927, 48.672, 72.236, 85.436
   n=29: 8.163, 14.963, 24.127, 25.473, 17.691, 27.563, 36.072, 37.763, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=30: 8.163, 14.963, 24.127, 25.473, 17.691, 27.563, 36.072, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=31: 8.164, 14.963, 24.127, 25.473, 17.691, 27.563, 36.073, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=32: 8.164, 14.964, 24.127, 25.473, 17.691, 27.563, 36.073, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=33: 8.164, 14.964, 24.127, 25.473, 17.691, 27.564, 36.073, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=34: 8.164, 14.964, 24.127, 25.473, 17.691, 27.564, 36.073, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436

Iteration 34 gives the equilibrium temperatures to an accuracy of 0.001.
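For a mesh this size, the sweep is best written generically. Below is a sketch (my own Python illustration; the helper name `gauss_seidel` is mine) that encodes each of the sixteen averaging equations as a list of interior-neighbour indices plus a summed boundary constant, and sweeps until no component moves by more than a tolerance:

```python
def gauss_seidel(neighbors, const, tol=0.0005, max_iter=100):
    """Sweep t[i] = (sum of listed neighbours + const[i]) / 4,
    Gauss-Seidel style (new values used at once), until converged."""
    t = [0.0] * len(const)
    for _ in range(max_iter):
        delta = 0.0
        for i, nbrs in enumerate(neighbors):
            new = (sum(t[j] for j in nbrs) + const[i]) / 4
            delta = max(delta, abs(new - t[i]))
            t[i] = new
        if delta < tol:
            break
    return t

# The 16 mesh equations above: 0-based indices of interior neighbours,
# and the summed boundary temperatures for each interior point.
neighbors = [[1, 4], [0, 2, 5], [1, 3, 6], [2, 7],
             [0, 5, 8], [1, 4, 6, 9], [2, 5, 7, 10], [3, 6, 11],
             [4, 9, 12], [5, 8, 10, 13], [6, 9, 11, 14], [7, 10, 15],
             [8, 13], [9, 12, 14], [10, 13, 15], [11, 14]]
const = [0, 0, 20, 40, 0, 0, 0, 20, 40, 0, 0, 100, 80, 40, 100, 200]

t = gauss_seidel(neighbors, const)
# t[0] is close to 8.164 and t[15] to 85.436, the limiting values above
```

Tightening `tol` reproduces the table to any desired number of decimals.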
27. (a) Let x1 correspond to the left end of the paper and x2 to the right end, and let n be the number of folds. Then the first six values of [x1, x2] are

   n    0   1    2    3    4     5      6
   x1   0   0    1/4  1/4  5/16  5/16   21/64
   x2   1   1/2  1/2  3/8  3/8   11/32  11/32
The graph below shows these points, together with the two lines determined in part (b):

[Figure: the points above plotted on the unit square, with the two lines from part (b); not reproduced here.]
(b) From the table, we see that x1 = 0 gives x2 = 1/2 in the next iteration, and x1 = 1/4 gives x2 = 3/8 in the next iteration. Similarly, x2 = 1 gives x1 = 0 in the next iteration, and x2 = 1/2 gives x1 = 1/4 in the next iteration. Thus

   x2 - 1/2 = ((3/8 - 1/2)/(1/4 - 0))(x1 - 0),   or   x2 = -(1/2)x1 + 1/2
   x1 - 0   = ((1/4 - 0)/(1/2 - 1))(x2 - 1),     or   x1 = -(1/2)x2 + 1/2.
(c) Switching to decimal representations, we apply Gauss-Seidel to these equations after n = 6, starting with the estimates x1 = 21/64 and x2 = 11/32, to obtain an approximate point of convergence:

   n    0   1    2    3    4     5      6      7      8      9      10
   x1   0   0    1/4  1/4  5/16  5/16   21/64  0.328  0.332  0.333  0.333
   x2   1   1/2  1/2  3/8  3/8   11/32  11/32  0.336  0.334  0.333  0.333
(d) Row-reduce the augmented matrix for this pair of equations, giving

   [  1   1/2 | 1/2 ]      [ 1 0 | 1/3 ]
   [ 1/2   1  | 1/2 ]  ->  [ 0 1 | 1/3 ].

The exact solution to the system of equations is [x1, x2] = [1/3, 1/3]. The ends of the piece of paper converge to 1/3.
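The fixed point can also be found by simply iterating the two lines from part (b); a tiny sketch (my own illustration) of the Gauss-Seidel loop from part (c):

```python
x1, x2 = 21/64, 11/32     # the endpoints after six folds
for n in range(10):
    x1 = -x2/2 + 1/2      # line fitted to the left-endpoint pattern
    x2 = -x1/2 + 1/2      # line fitted to the right-endpoint pattern
# both endpoints approach 1/3; the error shrinks by a factor of 4 per loop
```

The contraction factor 1/4 comes from composing the two slopes of -1/2.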
28. (a) Let x1 record the positions of the left endpoints and x2 the positions of the right endpoints at the end of each walk. Then the first six values of [x1, x2] are

   n    0    1    2    3     4      5      6
   x1   0    1/4  1/4  5/16  5/16   21/64  21/64
   x2   1/2  1/2  5/8  5/8   21/32  21/32  85/128
The graph below shows these points, together with the two lines determined in part (b):

[Figure: the points above plotted on the unit square, with the two lines from part (b); not reproduced here.]

(b)
From the table, we see that x1 = 0 gives x2 = 1/2 in the next iteration, and x1 = 1/4 gives x2 = 5/8 in the next iteration. Similarly, x2 = 1/2 gives x1 = 1/4 in the next iteration, and x2 = 5/8 gives x1 = 5/16 in the next iteration. Thus

   x2 - 1/2 = ((5/8 - 1/2)/(1/4 - 0))(x1 - 0),       or   x2 = (1/2)x1 + 1/2
   x1 - 1/4 = ((5/16 - 1/4)/(5/8 - 1/2))(x2 - 1/2),  or   x1 = (1/2)x2.
(c) Switching to decimal representations, we apply Gauss-Seidel to these equations after n = 6, starting with the estimates x1 = 21/64 and x2 = 85/128, to obtain an approximate point of convergence:

   n    0    1    2    3     4      5      6       7      8      9
   x1   0    1/4  1/4  5/16  5/16   21/64  21/64   0.332  0.333  0.333
   x2   1/2  1/2  5/8  5/8   21/32  21/32  85/128  0.666  0.667  0.667
(d) Row-reduce the augmented matrix for this pair of equations, giving

   [ -1/2   1  | 1/2 ]      [ 1 0 | 1/3 ]
   [   1  -1/2 |  0  ]  ->  [ 0 1 | 2/3 ].

The exact solution to the system of equations is [x1, x2] = [1/3, 2/3]. So the ant eventually oscillates between 1/3 and 2/3.
Chapter Review
1. (a) False. Some linear systems are inconsistent, and have no solutions. See Section 2.1. One example
is x + y = 0 and x + y = 1.
(b) True. See Section 2.2. Since in a homogeneous system, the lines defined by the equations all
pass through the origin, the system has at least one solution; namely, the solution in which all
variables are zero.
(c) False. See Theorem 2.6. This is guaranteed only if the system is homogeneous, but not if it is not (for example, x + y + z = 0, x + y + z = 1 is inconsistent in R3).
(d) False. See the discussion of homogeneous systems in Section 2.2. It can have either a unique
solution, infinitely many solutions, or no solutions.
(e) True. See Theorem 2.4 in Section 2.3.
(f ) False. If u and v are linearly dependent (i.e., one is a multiple of the other), we get a line, not
a plane. And if u and v are both zero, we get only the origin. If the two vectors are linearly
independent, then their span is indeed a plane through the origin.
(g) True. If u and v are parallel, then v = cu, so that −cu + v = 0, so that u and v are linearly
dependent.
(h) True. If they can be drawn that way, then the sum of the vectors is 0, since tracing the vectors
in turn returns you to the starting point.
(i) False. For example, u = [1, 0, 0], v = [0, 1, 0] and w = [1, 1, 0] are linearly dependent since
u + v − w = 0, yet none of these is a multiple of another.
(j) True. See Theorem 2.8 in Section 2.3. m vectors in Rn are linearly dependent if m > n.
2. The rank is the number of nonzero rows in row echelon form:

   [ 1 -2  0  3  2 ]           [ 1 -2  0  3  2 ]                 [ 1 -2  0  3  2 ]
   [ 3 -1  1  3  4 ]  R2<->R4  [ 0 -5 -1  6  2 ]  R3-3R1+2R2     [ 0 -5 -1  6  2 ]
   [ 3  4  2 -3  2 ]  ------>  [ 3  4  2 -3  2 ]  R4-3R1+R2      [ 0  0  0  0  0 ]
   [ 0 -5 -1  6  2 ]           [ 3 -1  1  3  4 ]  ----------->   [ 0  0  0  0  0 ].

This row echelon form of A has two nonzero rows, so that rank(A) = 2.
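The hand computation can be cross-checked with a small exact-arithmetic elimination (my own Python sketch; the helper name `matrix_rank` is mine, and the matrix is the one from this exercise):

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank = number of pivots found by Gaussian elimination,
    done exactly over the rationals to avoid round-off."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = 0
    for col in range(len(m[0])):
        pivot_row = next((r for r in range(pivots, len(m)) if m[r][col] != 0), None)
        if pivot_row is None:
            continue                        # no pivot in this column
        m[pivots], m[pivot_row] = m[pivot_row], m[pivots]
        for r in range(pivots + 1, len(m)):
            f = m[r][col] / m[pivots][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[pivots])]
        pivots += 1
    return pivots

A = [[1, -2, 0, 3, 2],
     [3, -1, 1, 3, 4],
     [3, 4, 2, -3, 2],
     [0, -5, -1, 6, 2]]
print(matrix_rank(A))  # 2
```

Using `Fraction` keeps the elimination exact, so the pivot count is the true rank.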
3. Construct the augmented matrix of the system and row-reduce it:

   [ 1 1 -2 | 4 ]  R2-R1    [ 1  1 -2 |  4 ]  -2R3    [ 1 1 -2 |  4 ]  R3-R2   [ 1 1 -2 |  4 ]
   [ 1 3 -1 | 7 ]  R3-2R1   [ 0  2  1 |  3 ]  ---->   [ 0 2  1 |  3 ]  ---->   [ 0 2  1 |  3 ]
   [ 2 1 -5 | 7 ]  ----->   [ 0 -1 -1 | -1 ]          [ 0 2  2 |  2 ]          [ 0 0  1 | -1 ]

   R1+2R3   [ 1 1 0 |  2 ]  (1/2)R2   [ 1 1 0 |  2 ]  R1-R2   [ 1 0 0 |  0 ]
   R2-R3    [ 0 2 0 |  4 ]  ------>   [ 0 1 0 |  2 ]  ---->   [ 0 1 0 |  2 ]
   ----->   [ 0 0 1 | -1 ]            [ 0 0 1 | -1 ]          [ 0 0 1 | -1 ].

Thus the solution is [x1, x2, x3] = [0, 2, -1].
4. Construct the augmented matrix and row-reduce it (reordering the rows first):

   [ 3 8 -18 1 | 35 ]            [ 1 2  -4 0 | 11 ]          [ 1 2 -4 0 | 11 ]
   [ 1 2  -4 0 | 11 ]  reorder   [ 1 3  -7 1 | 10 ]  R2-R1   [ 0 1 -3 1 | -1 ]
   [ 1 3  -7 1 | 10 ]  ------>   [ 3 8 -18 1 | 35 ]  R3-3R1  [ 0 2 -6 1 |  2 ]
                                                     ---->

   R3-2R2   [ 1 2 -4  0 | 11 ]   -R3   [ 1 2 -4 0 | 11 ]  R2-R3    [ 1 0  2 0 |  5 ]
   ----->   [ 0 1 -3  1 | -1 ]  --->   [ 0 1 -3 1 | -1 ]  R1-2R2   [ 0 1 -3 0 |  3 ]
            [ 0 0  0 -1 |  4 ]         [ 0 0  0 1 | -4 ]  ----->   [ 0 0  0 1 | -4 ].

Then y = t is a free variable, and the general solution is w = 5 - 2t, x = 3 + 3t, y = t, z = -4; that is,

   [w, x, y, z] = [5 - 2t, 3 + 3t, t, -4] = [5, 3, 0, -4] + t[-2, 3, 1, 0].
5. Construct the augmented matrix of the system and row-reduce it over Z7:

   [ 2 3 | 4 ]  4R1   [ 1 5 | 2 ]  R2+6R1   [ 1 5 | 2 ]  R1+4R2   [ 1 0 | 6 ]
   [ 1 2 | 3 ]  --->  [ 1 2 | 3 ]  ------>  [ 0 4 | 1 ]  2R2      [ 0 1 | 2 ].
                                                         ------>

Thus the solution is [x, y] = [6, 2].
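Arithmetic over Z7 mirrors ordinary elimination, with division replaced by multiplication by a modular inverse. A sketch (my own Python illustration; `pow(a, -1, p)` computes modular inverses and requires Python 3.8+):

```python
p = 7
# Solve 2x + 3y = 4 and x + 2y = 3 over Z_7 (the system from this exercise).
inv2 = pow(2, -1, p)              # 4, since 2 * 4 = 8 = 1 (mod 7)
a, b = (3 * inv2) % p, (4 * inv2) % p   # scaled row 1:  x + 5y = 2
c, d = (2 - a) % p, (3 - b) % p         # subtract from row 2:  4y = 1
y = (d * pow(c, -1, p)) % p             # back-substitute mod 7
x = (b - a * y) % p
print(x, y)  # 6 2
```

Because 7 is prime, every nonzero coefficient is invertible, so the elimination never breaks down.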
6. Construct the augmented matrix and row-reduce it over Z5:

   [ 3 2 | 1 ]  R2+3R1   [ 3 2 | 1 ]
   [ 1 4 | 2 ]  ------>  [ 0 0 | 0 ].

So the solutions are all pairs satisfying 3x + 2y = 1. Setting y = 0, 1, 2, 3, 4 in turn, we get

   y = 0  =>  3x = 1               =>  x = 2  =>  [x, y] = [2, 0]
   y = 1  =>  3x + 2 = 1, 3x = 4   =>  x = 3  =>  [x, y] = [3, 1]
   y = 2  =>  3x + 4 = 1, 3x = 2   =>  x = 4  =>  [x, y] = [4, 2]
   y = 3  =>  3x + 1 = 1, 3x = 0   =>  x = 0  =>  [x, y] = [0, 3]
   y = 4  =>  3x + 3 = 1, 3x = 3   =>  x = 1  =>  [x, y] = [1, 4].
7. Construct the augmented matrix and row-reduce it:

   [ k  2 | 1 ]  R2<->R1   [ 1 2k | 1 ]  R2-kR1   [ 1    2k    |   1   ]
   [ 1 2k | 1 ]  ------->  [ k  2 | 1 ]  ------>  [ 0  2-2k^2  | 1 - k ].

The system is inconsistent when 2 - 2k^2 = -2(k + 1)(k - 1) = 0 but 1 - k != 0. The first expression is zero for k = -1 and for k = 1; when k = -1 we have 1 - k = 2 != 0. So the system is inconsistent exactly when k = -1.
8. Form the augmented matrix of the system and row reduce it (see Example 2.14 in Section 2.2):

   [ 1 2 3 | 4 ]  R2-5R1   [ 1  2  3 |   4 ]  -(1/4)R2   [ 1 2 3 | 4 ]  R1-2R2   [ 1 0 -1 | -2 ]
   [ 5 6 7 | 8 ]  ------>  [ 0 -4 -8 | -12 ]  ------->   [ 0 1 2 | 3 ]  ------>  [ 0 1  2 |  3 ].

Thus z = t is a free variable, and x = -2 + t, y = 3 - 2t, so the parametric equation of the line is

   [x, y, z] = [-2 + t, 3 - 2t, t] = [-2, 3, 0] + t[1, -2, 1].
9. As in Example 2.15 in Section 2.2, we wish to find x = [x, y, z] that satisfies both equations simultaneously; that is, x = p + su = q + tv. This means su - tv = q - p. Substituting the given vectors into this equation gives

   s[1, -1, 2] - t[-1, 1, 1] = [5, -2, -4] - [1, 2, 3] = [4, -4, -7],

that is,

    s + t =  4
   -s - t = -4
   2s - t = -7.

Form the augmented matrix of this system and row-reduce it to determine s and t:

   [  1  1 |  4 ]  R2<->R3   [  1  1 |  4 ]  R2-2R1   [ 1  1 |   4 ]  -(1/3)R2   [ 1 1 | 4 ]  R1-R2   [ 1 0 | -1 ]
   [ -1 -1 | -4 ]  ------->  [  2 -1 | -7 ]  R3+R1    [ 0 -3 | -15 ]  ------->   [ 0 1 | 5 ]  ---->   [ 0 1 |  5 ]
   [  2 -1 | -7 ]            [ -1 -1 | -4 ]  ---->    [ 0  0 |   0 ]             [ 0 0 | 0 ]          [ 0 0 |  0 ].

Thus s = -1 and t = 5, so the point of intersection is

   [x, y, z] = [1, 2, 3] - [1, -1, 2] = [0, 3, 1].

(We could equally well have substituted t = 5 into the equation for the other line; check that this gives the same answer.)
10. As in Example 2.18 in Section 2.3, we want to find scalars x and y such that

   x[1, 1, 3] + y[1, 2, -2] = [3, 5, -1],   that is,   x + y = 3,  x + 2y = 5,  3x - 2y = -1.

Form the augmented matrix of this system and row-reduce it to determine x and y:

   [ 1  1 |  3 ]  R2-R1    [ 1  1 |   3 ]  R1-R2    [ 1 0 | 1 ]
   [ 1  2 |  5 ]  R3-3R1   [ 0  1 |   2 ]  R3+5R2   [ 0 1 | 2 ]
   [ 3 -2 | -1 ]  ----->   [ 0 -5 | -10 ]  ----->   [ 0 0 | 0 ].

So this system has a solution, and thus the vector is in the span of the other two vectors:

   [3, 5, -1] = 1[1, 1, 3] + 2[1, 2, -2].
11. As in Example 2.21 in Section 2.3, the equation of the plane we are looking for is

   [x, y, z] = s[1, 1, 1] + t[3, 2, 1],   that is,   s + 3t = x,  s + 2t = y,  s + t = z.

Form the augmented matrix and row-reduce it to find conditions on x, y, and z:

   [ 1 3 | x ]  R2-R1   [ 1  3 | x     ]  -R2    [ 1  3 | x     ]  R3+2R2   [ 1 3 | x          ]
   [ 1 2 | y ]  R3-R1   [ 0 -1 | y - x ]  --->   [ 0  1 | x - y ]  ------>  [ 0 1 | x - y      ]
   [ 1 1 | z ]  ---->   [ 0 -2 | z - x ]         [ 0 -2 | z - x ]           [ 0 0 | x - 2y + z ].

We know the system is consistent, since the two given vectors span some plane. So we must have x - 2y + z = 0, and this is the general equation of the plane we were seeking. (An alternative method would be to take the cross product of the two given vectors, giving [-1, 2, -1], so that the plane is of the form -x + 2y - z = d; plugging in x = y = z = 0 gives d = 0, so -x + 2y - z = 0, or x - 2y + z = 0.)
12. As in Example 2.23 in Section 2.3, we want to find scalars c1, c2, and c3 so that

   c1[2, 1, -3] + c2[1, -1, -2] + c3[3, 9, -2] = [0, 0, 0].

Form the augmented matrix of the resulting system and row-reduce it:

   [  2  1  3 | 0 ]      [ 1 0  4 | 0 ]
   [  1 -1  9 | 0 ]  ->  [ 0 1 -5 | 0 ]
   [ -3 -2 -2 | 0 ]      [ 0 0  0 | 0 ].

Since c1 = -4c3, c2 = 5c3 is a solution for any c3, the vectors are linearly dependent. For example, setting c3 = -1 gives

   4[2, 1, -3] - 5[1, -1, -2] - [3, 9, -2] = [0, 0, 0].
13. By part (a) of Exercise 21 in Section 2.3, it is sufficient to prove that each ei is in span(u, v, w) ⊂ R3, since then we have R3 = span(e1, e2, e3) ⊂ span(u, v, w) ⊂ R3.

(a) Begin with e1. We want to find scalars x, y, and z such that

   x[1, 1, 0] + y[1, 0, 1] + z[0, 1, 1] = e1 = [1, 0, 0].

Form the augmented matrix and row reduce:

   [ 1 1 0 | 1 ]      [ 1 0 0 |  1/2 ]
   [ 1 0 1 | 0 ]  ->  [ 0 1 0 |  1/2 ]   =>   e1 = (1/2)u + (1/2)v - (1/2)w.
   [ 0 1 1 | 0 ]      [ 0 0 1 | -1/2 ]

Similarly, for e2,

   [ 1 1 0 | 0 ]      [ 1 0 0 |  1/2 ]
   [ 1 0 1 | 1 ]  ->  [ 0 1 0 | -1/2 ]   =>   e2 = (1/2)u - (1/2)v + (1/2)w,
   [ 0 1 1 | 0 ]      [ 0 0 1 |  1/2 ]

and for e3,

   [ 1 1 0 | 0 ]      [ 1 0 0 | -1/2 ]
   [ 1 0 1 | 0 ]  ->  [ 0 1 0 |  1/2 ]   =>   e3 = -(1/2)u + (1/2)v + (1/2)w.
   [ 0 1 1 | 1 ]      [ 0 0 1 |  1/2 ]

Thus R3 = span(u, v, w).

(b) Since w = u + v, these vectors are not linearly independent. But any set of vectors that spans R3 must contain 3 linearly independent vectors, so span(u, v, w) != R3.
14. All three statements are true, so (d) is the correct answer. If the vectors are linearly independent, and
b is any vector in R3 , then the augmented matrix [A | b], when reduced to row echelon form, cannot
have a zero row (since otherwise, the portion of the matrix corresponding to A would have a zero row,
so the three vectors would be linearly dependent). Thus [A | b] has a unique solution, which can be
read off from the row-reduced matrix. This proves statement (c). Next, since there is not a zero row,
continuing and putting the system in reduced row-echelon form results in a matrix with 3 leading ones,
so it is the identity matrix. Hence the reduced row echelon form of A is I3 , and thus A has rank 3.
15. Since A is a 3 × 3 matrix, rank(A) ≤ 3. Since the vectors are linearly dependent, rank(A) cannot be
3; since they are not all zero, rank(A) cannot be zero. Thus rank(A) = 1 or rank(A) = 2.
16. By Theorem 2.8 in Section 2.3, the maximum rank of a 5 × 3 matrix is 3: since the rows are vectors in R3, any set of four or more of them is linearly dependent. Thus when we row reduce the matrix, we must introduce at least two zero rows. The minimum rank is zero, achieved when all entries are zero.
17. Suppose that c1 (u + v) + c2 (u − v) = 0. Rearranging gives (c1 + c2 )u + (c1 − c2 )v = 0. Since u and v
are linearly independent, it follows that c1 + c2 = 0 and c1 − c2 = 0. Adding the two equations gives
2c1 = 0, so that c1 = 0; then c2 = 0 as well. It follows that u + v and u − v are linearly independent.
18. Using Exercise 21 in Section 2.3, we show first that u and v are linear combinations of u and u + v.
But this is clear: u = u, and v = (u + v) − u. This shows that span(u, v) ⊆ span(u, u + v).
Next, we show that u and u + v are linear combinations of u and v. But this is also clear. Thus
span(u, u + v) ⊆ span(u, v). Hence the two spans are equal.
19. Note that rank(A) ≤ rank([A | b]), since any zero rows in an echelon form of [A | b] are also zero rows
in a row echelon form of A, so that a row echelon form of A has at least as many zero rows as a row
echelon form of [A | b]. If rank(A) < rank([A | b]), then a row echelon form of [A | b] will contain at least
one row that is zero in the columns corresponding to A but nonzero in the last column on that row,
so that the system will be inconsistent. Hence the system is consistent only if rank(A) = rank([A | b]).
20. Row-reduce both A and B:

   [  1 1  1 ]  R2-2R1   [ 1 1  1 ]  R3-5R2   [ 1 1  1 ]
   [  2 3 -1 ]  R3+R1    [ 0 1 -3 ]  ------>  [ 0 1 -3 ]
   [ -1 4  1 ]  ----->   [ 0 5  2 ]           [ 0 0 17 ]

   [ 1 0 -1 ]  R2-R1   [ 1 0 -1 ]  R3-R2   [ 1 0 -1 ]
   [ 1 1  1 ]  ---->   [ 0 1  2 ]  ---->   [ 0 1  2 ]
   [ 0 1  3 ]          [ 0 1  3 ]          [ 0 0  1 ].

Since in row echelon form each of these matrices has no zero rows, both matrices have rank 3, so that their reduced row echelon forms are both I3. Thus the matrices are row-equivalent. The sequence of row operations required to derive B from A is the sequence of row operations required to reduce A to I3, followed by the sequence of row operations, in reverse, required to reduce B to I3.
Chapter 3
Matrices
3.1 Matrix Operations
1. Since A and D have the same shape, this operation makes sense. (Here and below, matrices are written with rows separated by semicolons.)

   A + 2D = [3 0; -1 5] + 2[0 -3; -2 1] = [3 0; -1 5] + [0 -6; -4 2] = [3+0 0-6; -1-4 5+2] = [3 -6; -5 7].
2. Since D and A have the same shape, this operation makes sense.

   3D - 2A = 3[0 -3; -2 1] - 2[3 0; -1 5] = [0 -9; -6 3] - [6 0; -2 10] = [0-6 -9-0; -6-(-2) 3-10] = [-6 -9; -4 -7].
3. Since B is a 2 × 3 matrix and C is a 3 × 2 matrix, they do not have the same shape, so they cannot
be subtracted.
4. Since both C and B^T are 3 x 2 matrices, this operation makes sense.

   C - B^T = [1 2; 3 4; 5 6] - [4 0; -2 2; 1 3] = [1-4 2-0; 3-(-2) 4-2; 5-1 6-3] = [-3 2; 5 2; 4 3].
5. Since A has two columns and B has two rows, the product makes sense, and

   AB = [3 0; -1 5][4 -2 1; 0 2 3]
      = [3*4+0*0  3*(-2)+0*2  3*1+0*3; -1*4+5*0  -1*(-2)+5*2  -1*1+5*3]
      = [12 -6 3; -4 12 14].
6. Since the number of columns of B (3) does not equal the number of rows of D (2), this matrix product
does not make sense.
7. Since B has three columns and C has three rows, the multiplication BC makes sense, and results in a 2 x 2 matrix (since B has 2 rows and C has 2 columns). Since D is also a 2 x 2 matrix, they can be added, so this formula makes sense. First,

   BC = [4 -2 1; 0 2 3][1 2; 3 4; 5 6] = [4*1-2*3+1*5  4*2-2*4+1*6; 0*1+2*3+3*5  0*2+2*4+3*6] = [3 6; 21 26].

Then

   D + BC = [0 -3; -2 1] + [3 6; 21 26] = [0+3 -3+6; -2+21 1+26] = [3 3; 19 27].
8. Since B has three columns and B^T has three rows, the product makes sense, and

   BB^T = [4 -2 1; 0 2 3][4 0; -2 2; 1 3]
        = [4*4-2*(-2)+1*1  4*0-2*2+1*3; 0*4+2*(-2)+3*1  0*0+2*2+3*3]
        = [21 -1; -1 13].
9. Since A has two columns and F has two rows, AF makes sense; since A has two rows and F has one column, the result is a 2 x 1 matrix. So this matrix has two rows, and E has two columns, so we can multiply by E on the left. Thus this product makes sense. Then

   AF = [3 0; -1 5][-1; 2] = [3*(-1)+0*2; -1*(-1)+5*2] = [-3; 11],

and then

   E(AF) = [4 2][-3; 11] = [4*(-3) + 2*11] = [10].
10. F has shape 2 × 1. D has shape 2 × 2, so DF has shape 2 × 1. Since the number of columns (1) of F
does not equal the number of rows (2) of DF , the product does not make sense.
11. Since F has one column and E has one row, the product makes sense, and

   FE = [-1; 2][4 2] = [-1*4 -1*2; 2*4 2*2] = [-4 -2; 8 4].
12. Since E has two columns and F has two rows, the product makes sense, and

   EF = [4 2][-1; 2] = [4*(-1) + 2*2] = [0].

(Note that FE from Exercise 11 and EF are not the same matrix. In fact, they don't even have the same shape! In general, matrix multiplication is not commutative.)
13. Since B has two rows, B^T has two columns; since C has two columns, C^T has two rows. Thus B^T C^T makes sense, and the result is a 3 x 3 matrix (since B^T has three rows and C^T has three columns). Next, CB is defined (3 x 2 multiplied by 2 x 3) and also yields a 3 x 3 matrix, so its transpose is also a 3 x 3 matrix. Thus B^T C^T - (CB)^T is defined.

   B^T C^T = [4 0; -2 2; 1 3][1 3 5; 2 4 6]
           = [4*1+0*2  4*3+0*4  4*5+0*6; -2*1+2*2  -2*3+2*4  -2*5+2*6; 1*1+3*2  1*3+3*4  1*5+3*6]
           = [4 12 20; 2 2 2; 7 15 23]

   CB = [1 2; 3 4; 5 6][4 -2 1; 0 2 3]
      = [1*4+2*0  1*(-2)+2*2  1*1+2*3; 3*4+4*0  3*(-2)+4*2  3*1+4*3; 5*4+6*0  5*(-2)+6*2  5*1+6*3]
      = [4 2 7; 12 2 15; 20 2 23]

   (CB)^T = [4 2 7; 12 2 15; 20 2 23]^T = [4 12 20; 2 2 2; 7 15 23].

Thus B^T C^T = (CB)^T, so their difference is the zero matrix.
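The reverse-order law illustrated here, (CB)^T = B^T C^T, is easy to confirm numerically. A small self-contained sketch (my own Python illustration, using the B and C of these exercises):

```python
def mat_mul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

B = [[4, -2, 1], [0, 2, 3]]
C = [[1, 2], [3, 4], [5, 6]]

lhs = transpose(mat_mul(C, B))                 # (CB)^T
rhs = mat_mul(transpose(B), transpose(C))      # B^T C^T
print(lhs == rhs)  # True
```

Note the order reversal: transposing a product swaps the factors.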
14. Since A and D are both 2 x 2 matrices, the products in both orders are defined, and each results in a 2 x 2 matrix, so their difference is defined as well.

   DA = [0 -3; -2 1][3 0; -1 5] = [0*3-3*(-1)  0*0-3*5; -2*3+1*(-1)  -2*0+1*5] = [3 -15; -7 5]
   AD = [3 0; -1 5][0 -3; -2 1] = [3*0+0*(-2)  3*(-3)+0*1; -1*0+5*(-2)  -1*(-3)+5*1] = [0 -9; -10 8]

   DA - AD = [3 -15; -7 5] - [0 -9; -10 8] = [3-0  -15-(-9); -7-(-10)  5-8] = [3 -6; 3 -3].
15. Since A is 2 x 2, A^2 is defined and is also a 2 x 2 matrix, so we can multiply by A again to get A^3.

   A^2 = [3 0; -1 5][3 0; -1 5] = [3*3+0*(-1)  3*0+0*5; -1*3+5*(-1)  -1*0+5*5] = [9 0; -8 25]
   A^3 = A A^2 = [3 0; -1 5][9 0; -8 25] = [3*9+0*(-8)  3*0+0*25; -1*9+5*(-8)  -1*0+5*25] = [27 0; -49 125].
16. Since D is a 2 x 2 matrix, I2 - D makes sense and is also a 2 x 2 matrix. Since it has the same number of rows as columns, squaring it (which is multiplying it by itself) also makes sense.

   I2 - D = [1 0; 0 1] - [0 -3; -2 1] = [1 3; 2 0].

Thus

   (I2 - D)^2 = [1 3; 2 0][1 3; 2 0] = [1*1+3*2  1*3+3*0; 2*1+0*2  2*3+0*0] = [7 3; 2 6].

17. Suppose that A = [a b; c d]. Then if A^2 = O,

   A^2 = [a b; c d][a b; c d] = [a^2+bc  ab+bd; ac+cd  bc+d^2] = [0 0; 0 0].

If c = 0, so that the lower left-hand entry is zero, then we must also have a = 0 and d = 0 so that the diagonal entries are zero; then A = [0 b; 0 0] is a solution for any b. Alternatively, if c != 0, then ac = -cd and thus a = -d. This will make the off-diagonal entries zero as well, and the two diagonal entries will both be equal to a^2 + bc. So any values of a, b, and c != 0 with a^2 = -bc will work. For example, a = b = 1, c = -1, d = -1:

   [1 1; -1 -1][1 1; -1 -1] = [1-1  1-1; -1+1  -1+1] = [0 0; 0 0].
18. Note that the first column of A is twice its second column. Thus

   A[1 0; -2 0] = [0 0; 0 0] = A[0 0; 0 0].
19. The cost of shipping one unit of each product is given by $B = \begin{bmatrix} 1.50 & 1.00 & 2.00 \\ 1.75 & 1.50 & 1.00 \end{bmatrix}$, where the first row gives the costs of shipping each product by truck, and the second row the costs when shipping by train. Then BA gives the costs of shipping all the products to each of the warehouses by truck or train:
$$BA = \begin{bmatrix} 1.50 & 1.00 & 2.00 \\ 1.75 & 1.50 & 1.00 \end{bmatrix}\begin{bmatrix} 200 & 75 \\ 150 & 100 \\ 100 & 125 \end{bmatrix} = \begin{bmatrix} 650.00 & 462.50 \\ 675.00 & 406.25 \end{bmatrix}.$$
132
CHAPTER 3. MATRICES
For example, the upper left entry in the product is the sum of 1.50 · 200, the cost of shipping doohickies
to warehouse 1 by truck, 1.00 · 150, the cost of shipping gizmos to warehouse 1 by truck, and 2.00 · 100,
the cost of shipping widgets to warehouse 1 by truck. So 650.00 is the cost of shipping all of the items
to warehouse 1 by truck. Comparing, we see that it is cheaper to ship items to warehouse 1 by truck
($650.00 vs $675.00), but to warehouse 2 by train ($406.25 vs $462.50).
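The product BA above can be reproduced with a short calculation. This is a sketch (not part of the text), using a small hypothetical `matmul` helper:

```python
# Minimal sketch (not from the text): reproduce the shipping-cost
# product BA from Exercise 19 in pure Python.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[1.50, 1.00, 2.00],   # per-unit costs by truck
     [1.75, 1.50, 1.00]]   # per-unit costs by train
A = [[200, 75],            # units of each product bound for
     [150, 100],           # warehouse 1 and warehouse 2
     [100, 125]]

BA = matmul(B, A)
print(BA)  # [[650.0, 462.5], [675.0, 406.25]]
```

The (1,1) entry is exactly the truck-cost sum 1.50·200 + 1.00·150 + 2.00·100 described in the text.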
20. The cost of distributing one unit of the product is given by $C = \begin{bmatrix} 0.75 & 0.75 & 0.75 \\ 1.00 & 1.00 & 1.00 \end{bmatrix}$, where the entry
in row i, column j is the cost of shipping one unit of product j from warehouse i. Then the cost of distribution is
$$CA = \begin{bmatrix} 0.75 & 0.75 & 0.75 \\ 1.00 & 1.00 & 1.00 \end{bmatrix}\begin{bmatrix} 200 & 75 \\ 150 & 100 \\ 100 & 125 \end{bmatrix} = \begin{bmatrix} 337.50 & 225.00 \\ 450.00 & 300.00 \end{bmatrix}.$$
In this product, the entry in row i, column i represents the total cost of shipping all the products from
warehouse i. The off-diagonal entries do not represent anything in particular, since they are a result
of multiplying shipping costs from one warehouse with products from the other warehouse.
21. $\begin{bmatrix} 1 & -2 & 3 \\ 2 & 1 & -5 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 4 \end{bmatrix}$, so that $A\mathbf{x} = \mathbf{b}$, where
$$A = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 1 & -5 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 0 \\ 4 \end{bmatrix}.$$
22. $\begin{bmatrix} -1 & 0 & 2 \\ 1 & -1 & 0 \\ 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix}$, so that $A\mathbf{x} = \mathbf{b}$, where
$$A = \begin{bmatrix} -1 & 0 & 2 \\ 1 & -1 & 0 \\ 0 & 1 & 1 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix}.$$
23. The column vectors of B are
$$\mathbf{b}_1 = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \quad \mathbf{b}_2 = \begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix}, \quad \mathbf{b}_3 = \begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix},$$
so the matrix-column representation is
$$A\mathbf{b}_1 = 2\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} - 1\begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 4 \\ -6 \\ 5 \end{bmatrix}$$
$$A\mathbf{b}_2 = 3\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} - 1\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 6\begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -9 \\ -4 \\ 0 \end{bmatrix}$$
$$A\mathbf{b}_3 = 0\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 4\begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -8 \\ 5 \\ -4 \end{bmatrix}.$$
Thus
$$AB = \begin{bmatrix} 4 & -9 & -8 \\ -6 & -4 & 5 \\ 5 & 0 & -4 \end{bmatrix}.$$
24. The row vectors of A are
$$A_1 = \begin{bmatrix} 1 & 0 & -2 \end{bmatrix}, \quad A_2 = \begin{bmatrix} -3 & 1 & 1 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 2 & 0 & -1 \end{bmatrix},$$
so the row-matrix representation is
$$A_1B = 1\begin{bmatrix} 2 & 3 & 0 \end{bmatrix} + 0\begin{bmatrix} 1 & -1 & 1 \end{bmatrix} - 2\begin{bmatrix} -1 & 6 & 4 \end{bmatrix} = \begin{bmatrix} 4 & -9 & -8 \end{bmatrix}$$
$$A_2B = -3\begin{bmatrix} 2 & 3 & 0 \end{bmatrix} + 1\begin{bmatrix} 1 & -1 & 1 \end{bmatrix} + 1\begin{bmatrix} -1 & 6 & 4 \end{bmatrix} = \begin{bmatrix} -6 & -4 & 5 \end{bmatrix}$$
$$A_3B = 2\begin{bmatrix} 2 & 3 & 0 \end{bmatrix} + 0\begin{bmatrix} 1 & -1 & 1 \end{bmatrix} - 1\begin{bmatrix} -1 & 6 & 4 \end{bmatrix} = \begin{bmatrix} 5 & 0 & -4 \end{bmatrix}.$$
Thus
$$AB = \begin{bmatrix} 4 & -9 & -8 \\ -6 & -4 & 5 \\ 5 & 0 & -4 \end{bmatrix}.$$
25. The outer product expansion is
$$\mathbf{a}_1B_1 + \mathbf{a}_2B_2 + \mathbf{a}_3B_3 = \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix}\begin{bmatrix} 2 & 3 & 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\begin{bmatrix} 1 & -1 & 1 \end{bmatrix} + \begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix}\begin{bmatrix} -1 & 6 & 4 \end{bmatrix}$$
$$= \begin{bmatrix} 2 & 3 & 0 \\ -6 & -9 & 0 \\ 4 & 6 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 1 & -1 & 1 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 2 & -12 & -8 \\ -1 & 6 & 4 \\ 1 & -6 & -4 \end{bmatrix} = \begin{bmatrix} 4 & -9 & -8 \\ -6 & -4 & 5 \\ 5 & 0 & -4 \end{bmatrix}.$$
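The outer-product expansion of Exercises 23-25 can be checked mechanically: summing the outer products of A's columns with B's rows reproduces AB. This is a sketch (not part of the text); the helpers `matmul`, `outer`, and `madd` are introduced only for illustration.

```python
# Minimal sketch (not from the text): check that the sum of outer
# products of A's columns with B's rows equals AB (Exercises 23-25).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def outer(col, row):
    """Outer product of a column vector with a row vector."""
    return [[c * r for r in row] for c in col]

def madd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[1, 0, -2], [-3, 1, 1], [2, 0, -1]]
B = [[2, 3, 0], [1, -1, 1], [-1, 6, 4]]

cols_A = [[row[j] for row in A] for j in range(3)]  # columns of A
expansion = [[0] * 3 for _ in range(3)]
for j in range(3):
    expansion = madd(expansion, outer(cols_A[j], B[j]))

print(expansion == matmul(A, B))  # True
```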
26. We have
$$B\mathbf{a}_1 = 1\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} - 3\begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix} + 2\begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix} = \begin{bmatrix} -7 \\ 6 \\ -11 \end{bmatrix}$$
$$B\mathbf{a}_2 = 0\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} + 1\begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix} + 0\begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix}$$
$$B\mathbf{a}_3 = -2\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} + 1\begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix} - 1\begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ -4 \\ 4 \end{bmatrix}.$$
Thus
$$BA = \begin{bmatrix} -7 & 3 & -1 \\ 6 & -1 & -4 \\ -11 & 6 & 4 \end{bmatrix}.$$
27. We have
$$B_1A = 2\begin{bmatrix} 1 & 0 & -2 \end{bmatrix} + 3\begin{bmatrix} -3 & 1 & 1 \end{bmatrix} + 0\begin{bmatrix} 2 & 0 & -1 \end{bmatrix} = \begin{bmatrix} -7 & 3 & -1 \end{bmatrix}$$
$$B_2A = 1\begin{bmatrix} 1 & 0 & -2 \end{bmatrix} - 1\begin{bmatrix} -3 & 1 & 1 \end{bmatrix} + 1\begin{bmatrix} 2 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 6 & -1 & -4 \end{bmatrix}$$
$$B_3A = -1\begin{bmatrix} 1 & 0 & -2 \end{bmatrix} + 6\begin{bmatrix} -3 & 1 & 1 \end{bmatrix} + 4\begin{bmatrix} 2 & 0 & -1 \end{bmatrix} = \begin{bmatrix} -11 & 6 & 4 \end{bmatrix}.$$
Thus
$$BA = \begin{bmatrix} -7 & 3 & -1 \\ 6 & -1 & -4 \\ -11 & 6 & 4 \end{bmatrix}.$$
28. The outer product expansion is
$$\mathbf{b}_1A_1 + \mathbf{b}_2A_2 + \mathbf{b}_3A_3 = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}\begin{bmatrix} 1 & 0 & -2 \end{bmatrix} + \begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix}\begin{bmatrix} -3 & 1 & 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix}\begin{bmatrix} 2 & 0 & -1 \end{bmatrix}$$
$$= \begin{bmatrix} 2 & 0 & -4 \\ 1 & 0 & -2 \\ -1 & 0 & 2 \end{bmatrix} + \begin{bmatrix} -9 & 3 & 3 \\ 3 & -1 & -1 \\ -18 & 6 & 6 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 2 & 0 & -1 \\ 8 & 0 & -4 \end{bmatrix} = \begin{bmatrix} -7 & 3 & -1 \\ 6 & -1 & -4 \\ -11 & 6 & 4 \end{bmatrix}.$$
29. Assume that the columns of $B = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix}$ are linearly dependent. Then there exists a solution to $x_1\mathbf{b}_1 + x_2\mathbf{b}_2 + \cdots + x_n\mathbf{b}_n = \mathbf{0}$ with at least one $x_i \neq 0$. Now, we have
$$AB = A\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix} = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_n \end{bmatrix},$$
so that $A(x_1\mathbf{b}_1 + x_2\mathbf{b}_2 + \cdots + x_n\mathbf{b}_n) = x_1(A\mathbf{b}_1) + x_2(A\mathbf{b}_2) + \cdots + x_n(A\mathbf{b}_n) = \mathbf{0}$, so that the columns of AB are linearly dependent.
30. Let
$$A = \begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \\ \vdots \\ \mathbf{a}_n \end{bmatrix}, \quad \text{so that} \quad AB = \begin{bmatrix} \mathbf{a}_1B \\ \mathbf{a}_2B \\ \vdots \\ \mathbf{a}_nB \end{bmatrix}.$$
Since the rows of A are linearly dependent, there are $x_i$, not all zero, such that $x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n = \mathbf{0}$. But then $(x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n)B = x_1(\mathbf{a}_1B) + x_2(\mathbf{a}_2B) + \cdots + x_n(\mathbf{a}_nB) = \mathbf{0}$, so that the rows of AB are linearly dependent.
31. We have the block structure
$$AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.$$
Now, the blocks $A_{12}$, $A_{21}$, $B_{12}$, and $B_{21}$ are all zero blocks, so any products involving those blocks are zero. Thus AB reduces to
$$AB = \begin{bmatrix} A_{11}B_{11} & O \\ O & A_{22}B_{22} \end{bmatrix}.$$
But
$$A_{11}B_{11} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 3 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 1\cdot 2-1\cdot(-1) & 1\cdot 3-1\cdot 1 \\ 0\cdot 2+1\cdot(-1) & 0\cdot 3+1\cdot 1 \end{bmatrix} = \begin{bmatrix} 3 & 2 \\ -1 & 1 \end{bmatrix}, \quad A_{22}B_{22} = \begin{bmatrix} 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2\cdot 1+3\cdot 1 \end{bmatrix} = \begin{bmatrix} 5 \end{bmatrix}.$$
So finally,
$$AB = \begin{bmatrix} A_{11}B_{11} & O \\ O & A_{22}B_{22} \end{bmatrix} = \begin{bmatrix} 3 & 2 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 5 \end{bmatrix}.$$
32. We have
$$AB = \begin{bmatrix} A_1 & A_2 \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_1B_{11} + A_2B_{21} & A_1B_{12} + A_2B_{22} \end{bmatrix}.$$
But $B_{11}$ is the zero matrix, and both $A_2$ and $B_{12}$ are the identity matrix $I_2$, so this expression simplifies to
$$AB = \begin{bmatrix} I_2B_{21} & A_1I_2 + I_2B_{22} \end{bmatrix} = \begin{bmatrix} B_{21} & A_1 + B_{22} \end{bmatrix} = \begin{bmatrix} 1 & 7 & 7 \\ -2 & 7 & 7 \end{bmatrix}.$$
33. We have the block structure
$$AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.$$
Now, $B_{21}$ is the zero matrix, so products involving that block are zero. Further, $A_{12}$, $A_{21}$, and $B_{11}$ are all the identity matrix $I_2$. So this product simplifies to
$$AB = \begin{bmatrix} A_{11}I_2 & A_{11}B_{12} + I_2B_{22} \\ I_2I_2 & I_2B_{12} + A_{22}B_{22} \end{bmatrix} = \begin{bmatrix} A_{11} & A_{11}B_{12} + B_{22} \\ I_2 & B_{12} + A_{22}B_{22} \end{bmatrix}.$$
The remaining products are
$$A_{11}B_{12} + B_{22} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix} + \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 5 & 3 \end{bmatrix}$$
$$B_{12} + A_{22}B_{22} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix}.$$
Thus
$$AB = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 3 & 4 & 5 & 3 \\ 1 & 0 & 1 & 2 \\ 0 & 1 & 0 & -1 \end{bmatrix}.$$
34. We have the block structure
$$AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.$$
Now, $A_{11}$ is the identity matrix $I_3$, and $A_{21}$ is the zero matrix. Further, $B_{22}$ is a 1 × 1 matrix, as is $A_{22}$, so they both behave just like scalar multiplication. Thus this product reduces to
$$AB = \begin{bmatrix} I_3B_{11} + A_{12}B_{21} & I_3B_{12} - A_{12} \\ 4B_{21} & 4\cdot(-1) \end{bmatrix} = \begin{bmatrix} B_{11} + A_{12}B_{21} & B_{12} - A_{12} \\ 4B_{21} & -4 \end{bmatrix}.$$
Compute the matrices in the first row:
$$B_{11} + A_{12}B_{21} = B_{11} + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 3 & 3 \end{bmatrix} = \begin{bmatrix} 2 & 3 & 4 \\ 2 & 3 & 6 \\ 3 & 3 & 4 \end{bmatrix}$$
$$B_{12} - A_{12} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \\ -2 \end{bmatrix}.$$
So
$$AB = \begin{bmatrix} 2 & 3 & 4 & 0 \\ 2 & 3 & 6 & -1 \\ 3 & 3 & 4 & -2 \\ 4 & 4 & 4 & -4 \end{bmatrix}.$$
35. (a) Computing the powers of A as required, we get
$$A^2 = \begin{bmatrix} -1 & 1 \\ -1 & 0 \end{bmatrix}, \quad A^3 = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad A^4 = \begin{bmatrix} 0 & -1 \\ 1 & -1 \end{bmatrix},$$
$$A^5 = \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}, \quad A^6 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2, \quad A^7 = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix} = A.$$
Note that $A^6 = I_2$ and $A^7 = A$.
(b) Since $A^6 = I_2$, we know that $A^{6k} = (A^6)^k = I_2^k = I_2$ for any positive integer value of k. Since 2015 = 2010 + 5 = 6 · 335 + 5, we have
$$A^{2015} = A^{6\cdot 335+5} = A^{6\cdot 335}A^5 = I_2A^5 = A^5 = \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}.$$
36. We have
$$B^2 = \begin{bmatrix} -\frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \end{bmatrix}^2 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \quad B^3 = B^2B = \begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix}, \quad B^4 = (B^2)^2 = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad B^8 = (B^4)^2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2.$$
Thus any power of B whose exponent is divisible by 8 is equal to $I_2$. Now, 2015 = 8 · 251 + 7, so that
$$B^{2015} = B^{8\cdot 251}B^7 = B^7 = B^4B^3 = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} = \begin{bmatrix} -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \end{bmatrix}.$$
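The periodicity argument in Exercise 35 can be checked directly by brute force. This is a sketch (not part of the text), assuming $A = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix}$, which reproduces the powers listed in part (a):

```python
# Minimal sketch (not from the text): check the power cycle of
# Exercise 35 -- A^6 = I, hence A^2015 = A^5 -- assuming
# A = [[0, 1], [-1, 1]], consistent with the powers in part (a).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def power(M, n):
    """Repeated multiplication starting from the 2 x 2 identity."""
    R = [[1, 0], [0, 1]]
    for _ in range(n):
        R = matmul(R, M)
    return R

A = [[0, 1], [-1, 1]]
print(power(A, 6))                    # [[1, 0], [0, 1]]
print(power(A, 2015) == power(A, 5))  # True
```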
37. Using induction, we show that $A^n = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}$ for n ≥ 1. The base case, n = 1, is just the definition of $A^1 = A$. Now assume that this equation holds for k. Then by the inductive hypothesis,
$$A^{k+1} = A^1A^k = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & k+1 \\ 0 & 1 \end{bmatrix}.$$
So by induction, the formula holds for all n ≥ 1.
38. (a)
$$A^2 = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta-\sin^2\theta & -2\cos\theta\sin\theta \\ 2\cos\theta\sin\theta & \cos^2\theta-\sin^2\theta \end{bmatrix}.$$
But $\cos^2\theta - \sin^2\theta = \cos 2\theta$ and $2\cos\theta\sin\theta = \sin 2\theta$, so that we get
$$A^2 = \begin{bmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{bmatrix}.$$
(b) We have already proven the cases n = 1 and n = 2, which serve as the basis for the induction. Now assume that the given formula holds for k. Then by the inductive hypothesis,
$$A^{k+1} = A^1A^k = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos k\theta & -\sin k\theta \\ \sin k\theta & \cos k\theta \end{bmatrix} = \begin{bmatrix} \cos\theta\cos k\theta-\sin\theta\sin k\theta & -(\cos\theta\sin k\theta+\sin\theta\cos k\theta) \\ \sin\theta\cos k\theta+\cos\theta\sin k\theta & \cos\theta\cos k\theta-\sin\theta\sin k\theta \end{bmatrix}.$$
But $\cos\theta\cos k\theta - \sin\theta\sin k\theta = \cos(\theta + k\theta) = \cos(k+1)\theta$, and $\sin\theta\cos k\theta + \cos\theta\sin k\theta = \sin(\theta + k\theta) = \sin(k+1)\theta$, so this matrix becomes
$$A^{k+1} = \begin{bmatrix} \cos(k+1)\theta & -\sin(k+1)\theta \\ \sin(k+1)\theta & \cos(k+1)\theta \end{bmatrix}.$$
So by induction, the formula holds for all k ≥ 1.
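The rotation-matrix formula just proved can be spot-checked numerically for a particular angle and exponent. This is a sketch (not part of the text); the angle 0.3 and exponent 7 are arbitrary choices:

```python
# Minimal sketch (not from the text): numerically spot-check the
# formula A^n = [[cos n0, -sin n0], [sin n0, cos n0]] of Exercise 38.
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

theta, n = 0.3, 7
A_n = [[1.0, 0.0], [0.0, 1.0]]       # start from the identity
for _ in range(n):
    A_n = matmul(A_n, rotation(theta))

expected = rotation(n * theta)       # single rotation by n*theta
ok = all(abs(A_n[i][j] - expected[i][j]) < 1e-12
         for i in range(2) for j in range(2))
print(ok)  # True
```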
39. (a) $a_{ij} = (-1)^{i+j}$ means that if i + j is even, then the corresponding entry is 1; otherwise, it is −1.
$$A = \begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix}.$$
(b) Each entry is its row number minus its column number:
$$A = \begin{bmatrix} 0 & -1 & -2 & -3 \\ 1 & 0 & -1 & -2 \\ 2 & 1 & 0 & -1 \\ 3 & 2 & 1 & 0 \end{bmatrix}.$$
(c)
$$A = \begin{bmatrix} (1-1)^1 & (1-1)^2 & (1-1)^3 & (1-1)^4 \\ (2-1)^1 & (2-1)^2 & (2-1)^3 & (2-1)^4 \\ (3-1)^1 & (3-1)^2 & (3-1)^3 & (3-1)^4 \\ (4-1)^1 & (4-1)^2 & (4-1)^3 & (4-1)^4 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 2 & 4 & 8 & 16 \\ 3 & 9 & 27 & 81 \end{bmatrix}.$$
(d) With $a_{ij} = \sin\frac{(i+j-1)\pi}{4}$,
$$A = \begin{bmatrix} \sin\frac{\pi}{4} & \sin\frac{\pi}{2} & \sin\frac{3\pi}{4} & \sin\pi \\ \sin\frac{\pi}{2} & \sin\frac{3\pi}{4} & \sin\pi & \sin\frac{5\pi}{4} \\ \sin\frac{3\pi}{4} & \sin\pi & \sin\frac{5\pi}{4} & \sin\frac{3\pi}{2} \\ \sin\pi & \sin\frac{5\pi}{4} & \sin\frac{3\pi}{2} & \sin\frac{7\pi}{4} \end{bmatrix} = \begin{bmatrix} \frac{\sqrt 2}{2} & 1 & \frac{\sqrt 2}{2} & 0 \\ 1 & \frac{\sqrt 2}{2} & 0 & -\frac{\sqrt 2}{2} \\ \frac{\sqrt 2}{2} & 0 & -\frac{\sqrt 2}{2} & -1 \\ 0 & -\frac{\sqrt 2}{2} & -1 & -\frac{\sqrt 2}{2} \end{bmatrix}.$$
40. (a) This matrix has zeros whenever the row number exceeds the column number, and the nonzero entries are the sum of the row and column numbers:
$$A = \begin{bmatrix} 2 & 3 & 4 & 5 & 6 & 7 \\ 0 & 4 & 5 & 6 & 7 & 8 \\ 0 & 0 & 6 & 7 & 8 & 9 \\ 0 & 0 & 0 & 8 & 9 & 10 \\ 0 & 0 & 0 & 0 & 10 & 11 \\ 0 & 0 & 0 & 0 & 0 & 12 \end{bmatrix}.$$
(b) This matrix has 1's on the main diagonal and one diagonal away from the main diagonal, and zeros elsewhere:
$$A = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}.$$
(c) This matrix is zero except where the row number plus the column number is 6, 7, or 8:
$$A = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
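Matrices defined by an entry formula, as in Exercises 39 and 40, are naturally built with a nested comprehension. This is a sketch (not part of the text); the `build` helper is introduced only for illustration.

```python
# Minimal sketch (not from the text): build the matrices of
# Exercise 39 directly from their entry formulas a_ij.

def build(n, f):
    """n x n matrix with entry f(i, j), using 1-based indices."""
    return [[f(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]

part_a = build(4, lambda i, j: (-1) ** (i + j))
part_b = build(4, lambda i, j: i - j)
part_c = build(4, lambda i, j: (i - 1) ** j)

print(part_b)
# [[0, -1, -2, -3], [1, 0, -1, -2], [2, 1, 0, -1], [3, 2, 1, 0]]
```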
41. Let A be an m × n matrix with rows a1 , a2 , . . . , am . Since ei has a 1 in the ith position and zeros
elsewhere, we get
ei A = 0 · a1 + 0 · a2 + · · · + 1 · ai + · · · + 0 · am = ai ,
which is the ith row of A.
3.2 Matrix Algebra
1. Since X − 2A + 3B = O, we have X = 2A − 3B, so
$$X = 2\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - 3\begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2\cdot 1-3\cdot(-1) & 2\cdot 2-3\cdot 0 \\ 2\cdot 3-3\cdot 1 & 2\cdot 4-3\cdot 1 \end{bmatrix} = \begin{bmatrix} 5 & 4 \\ 3 & 5 \end{bmatrix}.$$
2. Since 2X = A − B, we have $X = \frac{1}{2}(A - B)$, so
$$X = \frac{1}{2}\left(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - \begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix}\right) = \frac{1}{2}\begin{bmatrix} 2 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & \frac{3}{2} \end{bmatrix}.$$
3. Solving for X gives
$$X = \frac{2}{3}(A + 2B) = \frac{2}{3}\left(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + 2\begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix}\right) = \frac{2}{3}\begin{bmatrix} -1 & 2 \\ 5 & 6 \end{bmatrix} = \begin{bmatrix} -\frac{2}{3} & \frac{4}{3} \\ \frac{10}{3} & 4 \end{bmatrix}.$$
4. The given equation is 2A − 2B + 2X = 3X − 3A; solving for X gives X = 5A − 2B, so
$$X = 5\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - 2\begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 7 & 10 \\ 13 & 18 \end{bmatrix}.$$
5. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & 0 & 2 \\ 2 & 1 & 5 \\ -1 & 2 & 0 \\ 1 & 1 & 3 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus
$$\begin{bmatrix} 2 & 5 \\ 0 & 3 \end{bmatrix} = 2\begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ 2 & 1 \end{bmatrix}.$$
6. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 0 & -1 & 1 \\ 0 & 1 & 0 & -4 \\ 1 & 0 & 1 & 2 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & -4 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$
so that
$$\begin{bmatrix} 2 & 1 \\ -4 & 2 \end{bmatrix} = 3\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - 4\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} - \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}.$$
7. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & -1 & 1 & 3 \\ 0 & 2 & 1 & 1 \\ -1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
Since the last column contains a pivot, the system is inconsistent, and B is not a linear combination of A1, A2, and A3.
8. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it. The reduction leads to
$$\begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
(together with additional zero rows). Since the linear system is consistent, we conclude that B = A1 + 2A2 + 3A3 + 4A4.
9. As in Example 3.17, let $\begin{bmatrix} w & x \\ y & z \end{bmatrix}$ be an element in the span of the matrices. Then form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & 0 & w \\ 2 & 1 & x \\ -1 & 2 & y \\ 1 & 1 & z \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & w \\ 0 & 1 & x-2w \\ 0 & 0 & y+5w-2x \\ 0 & 0 & z+w-x \end{bmatrix}.$$
This gives the restrictions y + 5w − 2x = 0, so that y = 2x − 5w, and z + w − x = 0, so that z = x − w. So the span consists of matrices of the form
$$\begin{bmatrix} w & x \\ 2x-5w & x-w \end{bmatrix}.$$
10. As in Example 3.17, let $\begin{bmatrix} w & x \\ y & z \end{bmatrix}$ be an element in the span of the matrices. Then form a matrix whose columns consist of the entries of the matrices in question and row-reduce it. The reduction leaves pivots in the first three columns and a final row $\begin{bmatrix} 0 & 0 & 0 & z-w \end{bmatrix}$. This gives the single restriction z − w = 0, so that z = w. So the span consists of matrices of the form
$$\begin{bmatrix} w & x \\ y & w \end{bmatrix}.$$
11. As in Example 3.17, let $\begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$ be an element in the span of the matrices. Then form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & -1 & 1 & a \\ 0 & 2 & 1 & b \\ -1 & 0 & 1 & c \\ 0 & 0 & 0 & d \\ 1 & 1 & 0 & e \\ 0 & 0 & 0 & f \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & e \\ 0 & 1 & 0 & c+e \\ 0 & 0 & 1 & b-2c-2e \\ 0 & 0 & 0 & a+3b-4c-5e \\ 0 & 0 & 0 & d \\ 0 & 0 & 0 & f \end{bmatrix}.$$
We get the restrictions a + 3b − 4c − 5e = 0, so that a = −3b + 4c + 5e, as well as d = f = 0. Thus the span consists of matrices of the form
$$\begin{bmatrix} -3b+4c+5e & b & c \\ 0 & e & 0 \end{bmatrix}.$$
12. As in Example 3.17, let $\begin{bmatrix} x_1 & x_2 & x_3 \\ x_4 & x_5 & x_6 \\ x_7 & x_8 & x_9 \end{bmatrix}$ be an element in the span of the matrices. Then form a matrix whose columns consist of the entries of the matrices in question and row-reduce it. We get the restrictions
$$x_6 = x_2, \quad x_9 = x_1, \quad x_4 = x_7 = x_8 = 0.$$
Thus the span consists of matrices of the form
$$\begin{bmatrix} x_1 & x_2 & x_3 \\ 0 & x_5 & x_2 \\ 0 & 0 & x_1 \end{bmatrix}.$$
13. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:
$$\begin{bmatrix} 1 & 4 & 0 \\ 2 & 3 & 0 \\ 3 & 2 & 0 \\ 4 & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Since the only solution is the trivial solution, we conclude that these matrices are linearly independent.
14. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:
$$\begin{bmatrix} 1 & 2 & 1 & 0 \\ 2 & 1 & 1 & 0 \\ 4 & -1 & 1 & 0 \\ 3 & 0 & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & \frac{1}{3} & 0 \\ 0 & 1 & \frac{1}{3} & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Since the system has a nontrivial solution, the original matrices are not linearly independent; in fact, from the solutions we get from the row-reduced matrix,
$$\frac{1}{3}\begin{bmatrix} 1 & 2 \\ 4 & 3 \end{bmatrix} + \frac{1}{3}\begin{bmatrix} 2 & 1 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.$$
15. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:
$$\begin{bmatrix} 0 & 1 & -2 & -1 & 0 \\ 1 & 0 & -1 & -3 & 0 \\ 5 & 2 & 0 & 1 & 0 \\ 2 & 3 & 1 & 9 & 0 \\ -1 & 1 & 0 & 4 & 0 \\ 0 & 1 & 2 & 5 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
Since the only solution is the trivial solution, we conclude that these matrices are linearly independent.
16. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it. The row reduction produces a pivot in each of the four coefficient columns, so the only solution is the trivial solution, and the matrices are linearly independent.
17. Let A, B, and C be m × n matrices.
(a) For every i and j, $[A+B]_{ij} = a_{ij} + b_{ij} = b_{ij} + a_{ij} = [B+A]_{ij}$, since addition of real numbers is commutative. Thus A + B = B + A.
(b) For every i and j, $[(A+B)+C]_{ij} = (a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij}) = [A+(B+C)]_{ij}$, since addition of real numbers is associative. Thus (A + B) + C = A + (B + C).
(c) For every i and j, $[A+O]_{ij} = a_{ij} + 0 = a_{ij} = [A]_{ij}$, so A + O = A.
(d) For every i and j, $[A+(-A)]_{ij} = a_{ij} - a_{ij} = 0$, so A + (−A) = O.
18. Let A and B be m × n matrices, and c and d be scalars.
(a) For every i and j, $[c(A+B)]_{ij} = c(a_{ij} + b_{ij}) = ca_{ij} + cb_{ij} = [cA]_{ij} + [cB]_{ij}$, so c(A + B) = cA + cB.
(b) For every i and j, $[(c+d)A]_{ij} = (c+d)a_{ij} = ca_{ij} + da_{ij} = [cA]_{ij} + [dA]_{ij}$, so (c + d)A = cA + dA.
(c) For every i and j, $[c(dA)]_{ij} = c(da_{ij}) = (cd)a_{ij} = [(cd)A]_{ij}$, so c(dA) = (cd)A.
(d) For every i and j, $[1A]_{ij} = 1\cdot a_{ij} = a_{ij} = [A]_{ij}$, so 1A = A.
19. Let A and B be r × n matrices, and C be an n × s matrix, so that the products make sense. Then for every i and j,
$$[(A+B)C]_{ij} = \sum_{k=1}^{n}(a_{ik}+b_{ik})c_{kj} = \sum_{k=1}^{n}a_{ik}c_{kj} + \sum_{k=1}^{n}b_{ik}c_{kj} = [AC]_{ij} + [BC]_{ij},$$
so that (A + B)C = AC + BC.
20. Let A be an r × n matrix and B be an n × s matrix, so that the product makes sense, and let k be a scalar. Then for every i and j,
$$[k(AB)]_{ij} = k\sum_{l=1}^{n}a_{il}b_{lj} = \sum_{l=1}^{n}(ka_{il})b_{lj} = [(kA)B]_{ij},$$
which proves the first equality. Regrouping the middle expression differently, we get
$$[k(AB)]_{ij} = \sum_{l=1}^{n}a_{il}(kb_{lj}) = [A(kB)]_{ij},$$
so that k(AB) = (kA)B = A(kB).
21. If A is m × n, then to prove $I_mA = A$, note that
$$I_m = \begin{bmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \vdots \\ \mathbf{e}_m \end{bmatrix},$$
where $\mathbf{e}_i$ is a standard (row) unit vector; then
$$I_mA = \begin{bmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \vdots \\ \mathbf{e}_m \end{bmatrix}A = \begin{bmatrix} \mathbf{e}_1A \\ \mathbf{e}_2A \\ \vdots \\ \mathbf{e}_mA \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} = A.$$
22. Note that
$$(A-B)(A+B) = A(A+B) - B(A+B) = A^2 + AB - BA - B^2 = (A^2 - B^2) + (AB - BA),$$
so that $(A-B)(A+B) = A^2 - B^2$ if and only if AB − BA = O; that is, if and only if AB = BA.
23. Compute both AB and BA:
$$AB = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a+c & b+d \\ c & d \end{bmatrix}, \quad BA = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & a+b \\ c & c+d \end{bmatrix}.$$
So if AB = BA, then a + c = a, b + d = a + b, c = c, and d = c + d. The first equation gives c = 0 and the second gives d = a. So the matrices that commute with A are matrices of the form
$$\begin{bmatrix} a & b \\ 0 & a \end{bmatrix}.$$
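The conclusion of Exercise 23 can be spot-checked by brute force over small integer entries. This is a sketch (not part of the text); the search range −2..2 is an arbitrary choice:

```python
# Minimal sketch (not from the text): every matrix [[a, b], [0, a]]
# commutes with A = [[1, 1], [0, 1]] (Exercise 23).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 1], [0, 1]]
commuting = []
for a in range(-2, 3):
    for b in range(-2, 3):
        B = [[a, b], [0, a]]
        if matmul(A, B) == matmul(B, A):
            commuting.append((a, b))

# Every matrix of this form commutes with A:
print(len(commuting))  # 25  (all 5 x 5 choices of a and b)
```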
24. Compute both AB and BA:
$$AB = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a-c & b-d \\ -a+c & -b+d \end{bmatrix}, \quad BA = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} a-b & -a+b \\ c-d & -c+d \end{bmatrix}.$$
So if AB = BA, then a − c = a − b, b − d = −a + b, −a + c = c − d, and −b + d = −c + d. The first (or fourth) equation gives b = c and the second (or third) gives d = a. So the matrices that commute with A are matrices of the form
$$\begin{bmatrix} a & b \\ b & a \end{bmatrix}.$$
25. Compute both AB and BA:
$$AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a+2c & b+2d \\ 3a+4c & 3b+4d \end{bmatrix}, \quad BA = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} a+3b & 2a+4b \\ c+3d & 2c+4d \end{bmatrix}.$$
So if AB = BA, then a + 2c = a + 3b, b + 2d = 2a + 4b, 3a + 4c = c + 3d, and 3b + 4d = 2c + 4d. The first (or fourth) equation gives 3b = 2c and the third gives a = d − c. Substituting a = d − c into the second equation gives 3b = 2c again. Thus those are the only two conditions, and the matrices that commute with A are those of the form
$$\begin{bmatrix} d-c & \frac{2}{3}c \\ c & d \end{bmatrix}.$$
26. Let $A_1$ be the first matrix and $A_4$ be the second. Then
$$A_1B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}, \quad BA_1 = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix},$$
$$A_4B = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ c & d \end{bmatrix}, \quad BA_4 = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & b \\ 0 & d \end{bmatrix}.$$
So the conditions imposed by requiring that $A_1B = BA_1$ and $A_4B = BA_4$ are b = c = 0. Thus the matrices that commute with both $A_1$ and $A_4$ are those of the form
$$\begin{bmatrix} a & 0 \\ 0 & d \end{bmatrix}.$$
27. Since we want B to commute with every 2 × 2 matrix, in particular it must commute with $A_1$ and $A_4$ from the previous exercise, so that $B = \begin{bmatrix} a & 0 \\ 0 & d \end{bmatrix}$. In addition, it must commute with
$$A_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad A_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}.$$
Computing the products, we get
$$A_2B = \begin{bmatrix} 0 & d \\ 0 & 0 \end{bmatrix}, \quad BA_2 = \begin{bmatrix} 0 & a \\ 0 & 0 \end{bmatrix}, \quad A_3B = \begin{bmatrix} 0 & 0 \\ a & 0 \end{bmatrix}, \quad BA_3 = \begin{bmatrix} 0 & 0 \\ d & 0 \end{bmatrix}.$$
The condition that arises from these equalities is a = d. Thus
$$B = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix}.$$
Finally, note that for any 2 × 2 matrix M,
$$M = \begin{bmatrix} x & y \\ z & w \end{bmatrix} = xA_1 + yA_2 + zA_3 + wA_4,$$
so that if B commutes with $A_1$, $A_2$, $A_3$, and $A_4$, then it will commute with M. So the matrices B that satisfy this condition are the matrices above: those where the off-diagonal entries are zero and the diagonal entries are the same.
28. Suppose A is an m × n matrix and B is an a × b matrix. Since AB is defined, the number of columns
(n) of A must equal the number of rows (a) of B; since BA is defined, the number of columns (b) of
B must equal the number of rows (m) of A. Since m = b and n = a, we see that A is m × n and B is
n × m. But then AB is m × m and BA is n × n, so both are square.
29. Suppose A = (aij ) and B = (bij ) are both upper triangular n × n matrices. This means that aij =
bij = 0 if i > j. But if C = (cij ) = AB, then
ckl = ak1 b1l + ak2 b2l + · · · + akl bll + ak(l+1) b(l+1)l + · · · + akn bnl .
If k > l, then ak1 b1l + ak2 b2l + · · · + akl bll = 0 since each of the akj is zero. But also ak(l+1) b(l+1)l +
· · · + akn bnl = 0 since each of the bil is zero. Thus ckl = 0 for k > l, so that C is upper triangular.
30. Let A and B be matrices whose sizes permit the indicated operations, and let k be a scalar. Denote the ijth entry of a matrix X by $[X]_{ij}$. Then:
Theorem 3.4(a): For any i and j, $[(A^T)^T]_{ij} = [A^T]_{ji} = [A]_{ij}$, so that $(A^T)^T = A$.
Theorem 3.4(b): For any i and j, $[(A+B)^T]_{ij} = [A+B]_{ji} = [A]_{ji} + [B]_{ji} = [A^T]_{ij} + [B^T]_{ij}$, so that $(A+B)^T = A^T + B^T$.
Theorem 3.4(c): For any i and j, $[(kA)^T]_{ij} = [kA]_{ji} = k[A]_{ji} = k[A^T]_{ij}$, so that $(kA)^T = k(A^T)$.
31. We prove that $(A^r)^T = (A^T)^r$ by induction on r. For r = 1, this says that $(A^1)^T = (A^T)^1$, which is clear. Now assume that $(A^k)^T = (A^T)^k$. Then
$$(A^{k+1})^T = (A\cdot A^k)^T = (A^k)^TA^T = (A^T)^kA^T = (A^T)^{k+1}.$$
(The second equality is from Theorem 3.4(d), and the third is the inductive assumption.)
32. We prove that $(A_1 + A_2 + \cdots + A_n)^T = A_1^T + A_2^T + \cdots + A_n^T$ by induction on n. For n = 1, this says that $A_1^T = A_1^T$, which is clear. Now assume that $(A_1 + A_2 + \cdots + A_k)^T = A_1^T + A_2^T + \cdots + A_k^T$. Then
$$(A_1 + \cdots + A_k + A_{k+1})^T = ((A_1 + \cdots + A_k) + A_{k+1})^T = (A_1 + \cdots + A_k)^T + A_{k+1}^T = A_1^T + \cdots + A_k^T + A_{k+1}^T,$$
using Theorem 3.4(b) and then the inductive assumption.
33. We prove that $(A_1A_2\cdots A_n)^T = A_n^TA_{n-1}^T\cdots A_2^TA_1^T$ by induction on n. For n = 1, this says that $A_1^T = A_1^T$, which is clear. Now assume that $(A_1A_2\cdots A_k)^T = A_k^TA_{k-1}^T\cdots A_2^TA_1^T$. Then
$$(A_1A_2\cdots A_kA_{k+1})^T = ((A_1A_2\cdots A_k)A_{k+1})^T = A_{k+1}^T(A_1A_2\cdots A_k)^T = A_{k+1}^TA_k^T\cdots A_2^TA_1^T,$$
using Theorem 3.4(d) and then the inductive assumption.
34. Using Theorem 3.4, we have (AAT )T = (AT )T AT = AAT and (AT A)T = AT (AT )T = AT A. So each
of AAT and AT A is equal to its own transpose, so both matrices are symmetric.
35. Let A and B be symmetric n × n matrices, and let k be a scalar.
(a) By Theorem 3.4(b), (A + B)T = AT + B T ; since both A and B are symmetric, this is equal to
A + B. Thus A + B is equal to its transpose, so is symmetric.
(b) By Theorem 3.4(c), (kA)T = kAT ; since A is symmetric, this is equal to kA. Thus kA is equal
to its own transpose, so is symmetric
1 1
0 1
36. (a) For example, let n = 2, with A =
and B =
. Then both A and B are symmetric,
1 0
1 0
but
1 1 0 1
1 1
AB =
=
,
1 0 1 0
0 1
which is not symmetric.
(b) Suppose A and B are symmetric. Then if AB is symmetric, we have
AB = (AB)T = B T AT = BA
using Theorem 3.4(d) and the fact that A and B are symmetric. In the other direction, if A and
B are symmetric and AB = BA, then
(AB)T = B T AT = BA = AB,
so that AB is symmetric.
37. (a) $A^T = \begin{bmatrix} 1 & -2 \\ 2 & 3 \end{bmatrix}$ while $-A = \begin{bmatrix} -1 & -2 \\ 2 & -3 \end{bmatrix}$, so $A^T \neq -A$ and A is not skew-symmetric.
(b) $A^T = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} = -A$, so A is skew-symmetric.
(c) $A^T = \begin{bmatrix} 0 & -3 & 1 \\ 3 & 0 & -2 \\ -1 & 2 & 0 \end{bmatrix} = -A$, so A is skew-symmetric.
(d) $A^T = \begin{bmatrix} 0 & -1 & 2 \\ 1 & 0 & 5 \\ 2 & 5 & 0 \end{bmatrix} \neq -A$, so A is not skew-symmetric.
38. A matrix A is skew-symmetric if AT = −A. But this means that for every i and j, (AT )ij = Aji = −Aij .
Thus the components must satisfy aij = −aji for every i and j.
39. By the previous exercise, we must have aij = −aji for every i and j. So when i = j, we get aii = −aii ,
which implies that aii = 0 for all i.
40. If A and B are skew-symmetric, so that AT = −A and B T = −B, then
(A + B)T = AT + B T = −A − B = −(A + B),
so that A + B is also skew-symmetric.
41. Suppose that $A = \begin{bmatrix} 0 & a \\ -a & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & b \\ -b & 0 \end{bmatrix}$ are skew-symmetric. Then
$$AB = \begin{bmatrix} -ab & 0 \\ 0 & -ab \end{bmatrix}, \quad \text{and thus} \quad (AB)^T = \begin{bmatrix} -ab & 0 \\ 0 & -ab \end{bmatrix}.$$
So $AB = -(AB)^T$ only if ab = −ab, which happens only if either a or b is zero. Thus either A or B must be the zero matrix.
42. We must show that A−AT is skew-symmetric, so we must show that (A−AT )T = −(A−AT ) = AT −A.
But
(A − AT )T = AT − (AT )T = AT − A
as desired, proving the result.
43. (a) If A is a square matrix, let $S = A + A^T$ and $S' = A - A^T$. Then clearly $A = \frac{1}{2}S + \frac{1}{2}S'$. S is symmetric by Theorem 3.5, so $\frac{1}{2}S$ is as well. S′ is skew-symmetric by Exercise 42, so $\frac{1}{2}S'$ is as well. Thus A is a sum of a symmetric and a skew-symmetric matrix.
(b) We have
$$S = A + A^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} + \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} = \begin{bmatrix} 2 & 6 & 10 \\ 6 & 10 & 14 \\ 10 & 14 & 18 \end{bmatrix}$$
$$S' = A - A^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} - \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} = \begin{bmatrix} 0 & -2 & -4 \\ 2 & 0 & -2 \\ 4 & 2 & 0 \end{bmatrix}.$$
Thus
$$A = \frac{1}{2}S + \frac{1}{2}S' = \begin{bmatrix} 1 & 3 & 5 \\ 3 & 5 & 7 \\ 5 & 7 & 9 \end{bmatrix} + \begin{bmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \\ 2 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}.$$
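The decomposition of Exercise 43 is easy to verify programmatically. This is a sketch (not part of the text); `transpose`, `combine`, and `scale` are small hypothetical helpers, and `Fraction` keeps the halving exact:

```python
# Minimal sketch (not from the text): the symmetric/skew-symmetric
# decomposition A = (1/2)S + (1/2)S' from Exercise 43.
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def combine(M, N, sign):
    """Entrywise M + sign*N."""
    return [[a + sign * b for a, b in zip(r, s)] for r, s in zip(M, N)]

def scale(k, M):
    return [[k * a for a in row] for row in M]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
S = combine(A, transpose(A), 1)    # A + A^T, symmetric
Sp = combine(A, transpose(A), -1)  # A - A^T, skew-symmetric

half = Fraction(1, 2)
recovered = combine(scale(half, S), scale(half, Sp), 1)
print(recovered == A)  # True
```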
44. Let A and B be n × n matrices, and k be a scalar. Then
(a)
tr(A + B) = (a11 + b11 ) + (a22 + b22 ) + · · · + (ann + bnn )
= (a11 + a22 + · · · + ann ) + (b11 + b22 + · · · + bnn ) = tr(A) + tr(B).
(b) tr(kA) = ka11 + ka22 + · · · + kann = k(a11 + a22 + · · · + ann ) = k tr(A).
45. Let A and B be n × n matrices. Then
$$\operatorname{tr}(AB) = (AB)_{11} + (AB)_{22} + \cdots + (AB)_{nn}$$
$$= (a_{11}b_{11} + a_{12}b_{21} + \cdots + a_{1n}b_{n1}) + (a_{21}b_{12} + a_{22}b_{22} + \cdots + a_{2n}b_{n2}) + \cdots + (a_{n1}b_{1n} + a_{n2}b_{2n} + \cdots + a_{nn}b_{nn})$$
$$= (b_{11}a_{11} + b_{12}a_{21} + \cdots + b_{1n}a_{n1}) + (b_{21}a_{12} + b_{22}a_{22} + \cdots + b_{2n}a_{n2}) + \cdots + (b_{n1}a_{1n} + b_{n2}a_{2n} + \cdots + b_{nn}a_{nn})$$
$$= \operatorname{tr}(BA).$$
46. Note that the ii entry of $AA^T$ is equal to
$$(AA^T)_{ii} = (A)_{i1}(A^T)_{1i} + (A)_{i2}(A^T)_{2i} + \cdots + (A)_{in}(A^T)_{ni} = a_{i1}^2 + a_{i2}^2 + \cdots + a_{in}^2,$$
so that
$$\operatorname{tr}(AA^T) = (a_{11}^2 + a_{12}^2 + \cdots + a_{1n}^2) + (a_{21}^2 + a_{22}^2 + \cdots + a_{2n}^2) + \cdots + (a_{n1}^2 + a_{n2}^2 + \cdots + a_{nn}^2).$$
That is, it is the sum of the squares of the entries of A.
47. Note that by Exercise 45, tr(AB) = tr(BA). By Exercise 44, then,
tr(AB − BA) = tr(AB) − tr(BA) = tr(AB) − tr(AB) = 0.
Since tr(I2 ) = 2, it cannot be the case that AB − BA = I2 .
3.3 The Inverse of a Matrix
1. Using Theorem 3.8, note that
$$\det\begin{bmatrix} 4 & 7 \\ 1 & 2 \end{bmatrix} = ad - bc = 4\cdot 2 - 1\cdot 7 = 1,$$
so that
$$\begin{bmatrix} 4 & 7 \\ 1 & 2 \end{bmatrix}^{-1} = \frac{1}{1}\begin{bmatrix} 2 & -7 \\ -1 & 4 \end{bmatrix} = \begin{bmatrix} 2 & -7 \\ -1 & 4 \end{bmatrix}.$$
As a check,
$$\begin{bmatrix} 4 & 7 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 2 & -7 \\ -1 & 4 \end{bmatrix} = \begin{bmatrix} 4\cdot 2+7\cdot(-1) & 4\cdot(-7)+7\cdot 4 \\ 1\cdot 2+2\cdot(-1) & 1\cdot(-7)+2\cdot 4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
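The 2 × 2 inverse formula of Theorem 3.8 is short enough to implement directly. This is a sketch (not part of the text); the `inv2` helper is introduced only for illustration, and `Fraction` keeps the arithmetic exact:

```python
# Minimal sketch (not from the text): the 2 x 2 inverse formula of
# Theorem 3.8, (1/det) * [[d, -b], [-c, a]], with exact arithmetic.
from fractions import Fraction

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    f = Fraction(1, det)
    return [[f * d, -f * b], [-f * c, f * a]]

A = [[4, 7], [1, 2]]
Ainv = inv2(A)
print(Ainv == [[2, -7], [-1, 4]])  # True
```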
2. Using Theorem 3.8, note that
$$\det\begin{bmatrix} 4 & -2 \\ 2 & 0 \end{bmatrix} = ad - bc = 4\cdot 0 - (-2)\cdot 2 = 4,$$
so that
$$\begin{bmatrix} 4 & -2 \\ 2 & 0 \end{bmatrix}^{-1} = \frac{1}{4}\begin{bmatrix} 0 & 2 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 0 & \frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix}.$$
As a check,
$$\begin{bmatrix} 4 & -2 \\ 2 & 0 \end{bmatrix}\begin{bmatrix} 0 & \frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix} = \begin{bmatrix} 4\cdot 0-2\cdot\left(-\frac{1}{2}\right) & 4\cdot\frac{1}{2}-2\cdot 1 \\ 2\cdot 0+0\cdot\left(-\frac{1}{2}\right) & 2\cdot\frac{1}{2}+0\cdot 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
3. Using Theorem 3.8, note that
$$\det\begin{bmatrix} 3 & 4 \\ 6 & 8 \end{bmatrix} = ad - bc = 3\cdot 8 - 4\cdot 6 = 0.$$
Since the determinant is zero, the matrix is not invertible.
4. Using Theorem 3.8, note that
$$\det\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} = ad - bc = 0\cdot 0 - 1\cdot(-1) = 1,$$
so that
$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}^{-1} = \frac{1}{1}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$
As a check,
$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0\cdot 0+1\cdot 1 & 0\cdot(-1)+1\cdot 0 \\ -1\cdot 0+0\cdot 1 & -1\cdot(-1)+0\cdot 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
5. Using Theorem 3.8, note that
$$\det\begin{bmatrix} \frac{3}{4} & \frac{3}{5} \\ \frac{5}{6} & \frac{2}{3} \end{bmatrix} = ad - bc = \frac{3}{4}\cdot\frac{2}{3} - \frac{3}{5}\cdot\frac{5}{6} = \frac{1}{2} - \frac{1}{2} = 0.$$
Since the determinant is zero, the matrix is not invertible.
6. Using Theorem 3.8, note that
$$\det\begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} = ad - bc = \frac{1}{\sqrt 2}\cdot\frac{1}{\sqrt 2} - \frac{1}{\sqrt 2}\cdot\left(-\frac{1}{\sqrt 2}\right) = \frac{1}{2} + \frac{1}{2} = 1,$$
so that
$$\begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix}^{-1} = \frac{1}{1}\begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix}.$$
As a check,
$$\begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix}\begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} = \begin{bmatrix} \frac{1}{2}+\frac{1}{2} & -\frac{1}{2}+\frac{1}{2} \\ -\frac{1}{2}+\frac{1}{2} & \frac{1}{2}+\frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
7. Using Theorem 3.8, note that
det [−1.5 −4.2; 0.5 2.4] = ad − bc = −1.5 · 2.4 − (−4.2) · 0.5 = −3.6 + 2.1 = −1.5.
Then
[−1.5 −4.2; 0.5 2.4]^{-1} = −(1/1.5) [2.4 4.2; −0.5 −1.5] = [−1.6 −2.8; 1/3 1] ≈ [−1.6 −2.8; 0.333 1].
As a check,
[−1.5 −4.2; 0.5 2.4] [−1.6 −2.8; 0.333 1] = [−1.5 · (−1.6) − 4.2 · 0.333  −1.5 · (−2.8) − 4.2 · 1; 0.5 · (−1.6) + 2.4 · 0.333  0.5 · (−2.8) + 2.4 · 1] ≈ [1 0; 0 1].
8. Using Theorem 3.8, note that
det [3.55 0.25; 8.52 0.50] = ad − bc = 3.55 · 0.50 − 0.25 · 8.52 = 1.775 − 2.13 = −0.355.
Then
[3.55 0.25; 8.52 0.50]^{-1} = −(1/0.355) [0.50 −0.25; −8.52 3.55] ≈ [−1.408 0.704; 24 −10].
As a check,
[3.55 0.25; 8.52 0.50] [−1.408 0.704; 24 −10] = [3.55 · (−1.408) + 0.25 · 24  3.55 · 0.704 + 0.25 · (−10); 8.52 · (−1.408) + 0.50 · 24  8.52 · 0.704 + 0.50 · (−10)] ≈ [1 0; 0 1].
9. Using Theorem 3.8, note that
det [a −b; b a] = a · a − (−b) · b = a² + b².
Then
[a −b; b a]^{-1} = (1/(a² + b²)) [a b; −b a] = [a/(a²+b²)  b/(a²+b²); −b/(a²+b²)  a/(a²+b²)].
As a check,
[a −b; b a] [a/(a²+b²)  b/(a²+b²); −b/(a²+b²)  a/(a²+b²)] = [(a²+b²)/(a²+b²)  (ab−ab)/(a²+b²); (ab−ab)/(a²+b²)  (a²+b²)/(a²+b²)] = [1 0; 0 1].
10. Using Theorem 3.8, note that
det [1/a 1/b; 1/c 1/d] = (1/a)(1/d) − (1/b)(1/c) = 1/(ad) − 1/(bc) = (bc − ad)/(abcd).
Then
[1/a 1/b; 1/c 1/d]^{-1} = (abcd/(bc − ad)) [1/d −1/b; −1/c 1/a] = [abc/(bc−ad)  −acd/(bc−ad); −abd/(bc−ad)  bcd/(bc−ad)].
As a check,
[1/a 1/b; 1/c 1/d] [abc/(bc−ad)  −acd/(bc−ad); −abd/(bc−ad)  bcd/(bc−ad)] = [bc/(bc−ad) − ad/(bc−ad)  −cd/(bc−ad) + cd/(bc−ad); ab/(bc−ad) − ab/(bc−ad)  −ad/(bc−ad) + bc/(bc−ad)] = [1 0; 0 1].
11. Following Example 3.25, we first invert the coefficient matrix
[2 1; 5 3].
The determinant of this matrix is 2 · 3 − 1 · 5 = 1, so its inverse is
(1/1) [3 −1; −5 2] = [3 −1; −5 2].
Thus the solution to the system is given by
[x; y] = [2 1; 5 3]^{-1} [−1; 2] = [3 −1; −5 2] [−1; 2] = [3 · (−1) − 1 · 2; −5 · (−1) + 2 · 2] = [−5; 9],
so the solution is x = −5, y = 9.
12. Following Example 3.25, we first invert the coefficient matrix
[1 −1; 2 1].
The determinant of this matrix is 1 · 1 − (−1) · 2 = 3, so its inverse is
(1/3) [1 1; −2 1] = [1/3 1/3; −2/3 1/3],
and thus the solution to the system is given by
[x; y] = [1 −1; 2 1]^{-1} [1; 2] = [1/3 1/3; −2/3 1/3] [1; 2] = [(1/3) · 1 + (1/3) · 2; (−2/3) · 1 + (1/3) · 2] = [1; 0],
and the solution is x = 1, y = 0.
13. (a) Following Example 3.25, we first invert the coefficient matrix
[1 2; 2 6].
The determinant of this matrix is 1 · 6 − 2 · 2 = 2, so its inverse is
(1/2) [6 −2; −2 1] = [3 −1; −1 1/2].
Then the solutions for the three vectors given are
x_1 = A^{-1} b_1 = [3 −1; −1 1/2] [3; 5] = [4; −1/2]
x_2 = A^{-1} b_2 = [3 −1; −1 1/2] [−1; 2] = [−5; 2]
x_3 = A^{-1} b_3 = [3 −1; −1 1/2] [2; 0] = [6; −2].
(b) To solve all three systems at once, form the augmented matrix [A | b_1 b_2 b_3] and row reduce to solve:
[1 2 | 3 −1 2; 2 6 | 5 2 0] → [1 0 | 4 −5 6; 0 1 | −1/2 2 −2].
(c) The method of constructing A^{-1} requires 7 multiplications, while row reduction requires only 6.
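Part (b)'s idea, handling several right-hand sides in one pass, is also how numerical libraries work: `numpy.linalg.solve` accepts a matrix whose columns are the right-hand sides. A quick check of part (a)'s answers (my own sketch, not part of the original manual):

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 6.0]])
# Stack the three right-hand sides b1, b2, b3 as columns of one matrix.
B = np.array([[3.0, -1.0, 2.0],
              [5.0,  2.0, 0.0]])

# One call solves all three systems at once, analogous to row reducing
# the augmented matrix [A | b1 b2 b3].
X = np.linalg.solve(A, B)
print(X)
# Columns should be [4, -1/2], [-5, 2], [6, -2].
```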
14. To show the result, we must show that (cA) · ((1/c)A^{-1}) = I. But
(cA) · ((1/c)A^{-1}) = (c · (1/c))(AA^{-1}) = AA^{-1} = I.
15. To show the result, we must show that A^T (A^{-1})^T = I. But
A^T (A^{-1})^T = (A^{-1}A)^T = I^T = I,
where the first equality follows from Theorem 3.4(d).
16. First we show that I_n^{-1} = I_n. To see that, we need to prove that I_n · I_n = I_n, which is true. Thus I_n^{-1} = I_n, and it also follows that I_n is invertible, since we found its inverse.
17. (a) There are many possible counterexamples. For instance, let
A = [1 0; 2 1],  B = [1 0; 1 −1].
Then
AB = [1 0; 2 1] [1 0; 1 −1] = [1 0; 3 −1].
Then det(AB) = 1 · (−1) − 0 · 3 = −1, det A = 1 · 1 − 0 · 2 = 1, and det B = 1 · (−1) − 0 · 1 = −1, so that
(AB)^{-1} = (1/(−1)) [−1 0; −3 1] = [1 0; 3 −1]
A^{-1} = (1/1) [1 0; −2 1] = [1 0; −2 1]
B^{-1} = (1/(−1)) [−1 0; −1 1] = [1 0; 1 −1].
Finally,
A^{-1} B^{-1} = [1 0; −2 1] [1 0; 1 −1] = [1 0; −1 −1],
and (AB)^{-1} ≠ A^{-1} B^{-1}.
(b) Claim that (AB)^{-1} = A^{-1}B^{-1} if and only if AB = BA. First, if (AB)^{-1} = A^{-1}B^{-1}, then
AB = ((AB)^{-1})^{-1} = (A^{-1}B^{-1})^{-1} = (B^{-1})^{-1}(A^{-1})^{-1} = BA.
To prove the reverse, suppose that AB = BA. Then
(AB)^{-1} = (BA)^{-1} = A^{-1}B^{-1}.
This proves the claim.
18. Use induction on n. For n = 1, the statement says that (A_1)^{-1} = A_1^{-1}, which is clear. Now assume that the statement holds for n = k. Then
(A_1 A_2 · · · A_k A_{k+1})^{-1} = ((A_1 A_2 · · · A_k) A_{k+1})^{-1} = A_{k+1}^{-1} (A_1 A_2 · · · A_k)^{-1}
by Theorem 3.9(c). But then by the inductive hypothesis, this becomes
A_{k+1}^{-1} (A_1 A_2 · · · A_k)^{-1} = A_{k+1}^{-1} (A_k^{-1} · · · A_2^{-1} A_1^{-1}) = A_{k+1}^{-1} A_k^{-1} · · · A_2^{-1} A_1^{-1}.
So by induction, the statement holds for all n ≥ 1.
19. There are many examples. For instance, let
A = [1 0; 0 1],  B = −A = [−1 0; 0 −1].
Then A^{-1} = A and B^{-1} = B = −A, so that A^{-1} + B^{-1} = O. But A + B = O, so (A + B)^{-1} does not exist.
20. XA² = A^{-1} ⇒ (XA²)A^{-2} = A^{-1}A^{-2} ⇒ X(A²A^{-2}) = A^{-1+(−2)} ⇒ XI = A^{-3} ⇒ X = A^{-3}.
21. AXB = (BA)² ⇒ A^{-1}(AXB)B^{-1} = A^{-1}(BA)²B^{-1} ⇒ (A^{-1}A)X(BB^{-1}) = A^{-1}(BA)²B^{-1}. Thus X = A^{-1}(BA)²B^{-1}.
22. By Theorem 3.9, (A^{-1}X)^{-1} = X^{-1}(A^{-1})^{-1} = X^{-1}A and (B^{-2}A)^{-1} = A^{-1}(B^{-2})^{-1} = A^{-1}B², so the original equation becomes X^{-1}A = AA^{-1}B² = B². Then
X^{-1}A = B² ⇒ XX^{-1}A = XB² ⇒ A = XB² ⇒ AB^{-2} = XB²B^{-2} = X,
so that X = AB^{-2}.
23. ABXA^{-1}B^{-1} = I + A, so that
B^{-1}A^{-1}(ABXA^{-1}B^{-1})BA = B^{-1}A^{-1}(I + A)BA = B^{-1}A^{-1}BA + B^{-1}A^{-1}ABA.
But
B^{-1}A^{-1}(ABXA^{-1}B^{-1})BA = (B^{-1}A^{-1}AB)X(A^{-1}B^{-1}BA) = X,
and
B^{-1}A^{-1}BA + B^{-1}A^{-1}ABA = B^{-1}A^{-1}BA + A.
Thus X = B^{-1}A^{-1}BA + A = (AB)^{-1}BA + A.
24. To get from A to B, interchange the first and third rows. Thus E = [0 0 1; 0 1 0; 1 0 0].
25. To get from B to A, interchange the first and third rows. Thus E = [0 0 1; 0 1 0; 1 0 0].
26. To get from A to C, add the first row to the third row. Thus E = [1 0 0; 0 1 0; 1 0 1].
27. To get from C to A, subtract the first row from the third row. Thus E = [1 0 0; 0 1 0; −1 0 1].
28. To get from C to D, subtract twice the third row from the second row. Thus E = [1 0 0; 0 1 −2; 0 0 1].
29. To get from D to C, add twice the third row to the second row. Thus E = [1 0 0; 0 1 2; 0 0 1].
30. The row operations R_2 − 2R_1 − 2R_3 → R_2 and R_3 + R_1 → R_3 will turn matrix A into matrix D. Thus
E = [1 0 0; −2 1 −2; 1 0 1]
satisfies EA = D. However, E is not an elementary matrix, since it encodes several row operations, not just one.
31. This matrix multiplies the first row by 3, so its inverse multiplies the first row by 1/3:
[3 0; 0 1]^{-1} = [1/3 0; 0 1].
32. This matrix adds twice the second row to the first row, so its inverse subtracts twice the second row from the first row:
[1 2; 0 1]^{-1} = [1 −2; 0 1].
33. This matrix interchanges rows 1 and 2, so its inverse does the same thing. Hence this matrix is its own inverse.
34. This matrix adds −1/2 times the first row to the second row, so its inverse adds 1/2 times the first row to the second row:
[1 0; −1/2 1]^{-1} = [1 0; 1/2 1].
35. This matrix subtracts twice the third row from the second row, so its inverse adds twice the third row to the second row:
[1 0 0; 0 1 −2; 0 0 1]^{-1} = [1 0 0; 0 1 2; 0 0 1].
36. This matrix interchanges rows 1 and 3, so its inverse does the same thing. Hence this matrix is its own inverse.
37. This matrix multiplies row 2 by c. Since c ≠ 0, its inverse multiplies the second row by 1/c:
[1 0 0; 0 c 0; 0 0 1]^{-1} = [1 0 0; 0 1/c 0; 0 0 1].
38. This matrix adds c times the third row to the second row, so its inverse subtracts c times the third row from the second row:
[1 0 0; 0 1 c; 0 0 1]^{-1} = [1 0 0; 0 1 −c; 0 0 1].
39. As in Example 3.29, we first row-reduce A:
A = [1 0; −1 −2] →(R_2 + R_1) [1 0; 0 −2] →(−(1/2)R_2) [1 0; 0 1].
The first row operation adds the first row to the second row, and the second row operation multiplies the second row by −1/2. These correspond to the elementary matrices
E_1 = [1 0; 1 1]  and  E_2 = [1 0; 0 −1/2].
So E_2 E_1 A = I and thus A = E_1^{-1} E_2^{-1}. But E_1^{-1} is R_2 − R_1 → R_2, and E_2^{-1} is −2R_2 → R_2, so
E_1^{-1} = [1 0; −1 1]  and  E_2^{-1} = [1 0; 0 −2],
so that
A = E_1^{-1} E_2^{-1} = [1 0; −1 1] [1 0; 0 −2].
Taking the inverse of both sides of the equation A = E_1^{-1} E_2^{-1} gives
A^{-1} = (E_1^{-1} E_2^{-1})^{-1} = (E_2^{-1})^{-1} (E_1^{-1})^{-1} = E_2 E_1 = [1 0; 0 −1/2] [1 0; 1 1] = [1 0; −1/2 −1/2].
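A quick NumPy check of Exercise 39's factorization (my own sketch, not from the manual): multiplying the elementary matrices back together should reproduce A, and E_2 E_1 should be A^{-1}:

```python
import numpy as np

A = np.array([[1.0, 0.0], [-1.0, -2.0]])
E1 = np.array([[1.0, 0.0], [1.0, 1.0]])        # R2 + R1
E2 = np.array([[1.0, 0.0], [0.0, -0.5]])       # -(1/2) R2

E1_inv = np.array([[1.0, 0.0], [-1.0, 1.0]])   # R2 - R1
E2_inv = np.array([[1.0, 0.0], [0.0, -2.0]])   # -2 R2

assert np.allclose(E2 @ E1 @ A, np.eye(2))     # E2 E1 A = I
assert np.allclose(E1_inv @ E2_inv, A)         # A = E1^{-1} E2^{-1}
assert np.allclose(E2 @ E1, np.linalg.inv(A))  # A^{-1} = E2 E1
```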
40. As in Example 3.29, we first row-reduce A:
A = [2 4; 1 1] →(R_1 ↔ R_2) [1 1; 2 4] →(R_2 − 2R_1) [1 1; 0 2] →((1/2)R_2) [1 1; 0 1] →(R_1 − R_2) [1 0; 0 1].
Converting each of these steps to an elementary matrix gives
E_1 = [0 1; 1 0],  E_2 = [1 0; −2 1],  E_3 = [1 0; 0 1/2],  E_4 = [1 −1; 0 1].
So E_4 E_3 E_2 E_1 A = I and thus A = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1}. We can invert each of the elementary matrices above by considering the row operation which reverses its action, giving
E_1^{-1} = [0 1; 1 0],  E_2^{-1} = [1 0; 2 1],  E_3^{-1} = [1 0; 0 2],  E_4^{-1} = [1 1; 0 1],
so that
A = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1} = [0 1; 1 0] [1 0; 2 1] [1 0; 0 2] [1 1; 0 1].
Taking the inverse of both sides of the equation A = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1} gives A^{-1} = E_4 E_3 E_2 E_1, so that
A^{-1} = E_4 E_3 E_2 E_1 = [1 −1; 0 1] [1 0; 0 1/2] [1 0; −2 1] [0 1; 1 0] = [−1/2 2; 1/2 −1].
41. Suppose AB = I and consider the equation Bx = 0. Left-multiplying by A gives ABx = A0 = 0. But AB = I, so ABx = Ix = x. Thus x = 0. So the system represented by Bx = 0 has the unique solution x = 0. From the equivalence of (c) and (a) in the Fundamental Theorem, we know that B is invertible (i.e., B^{-1} exists and satisfies BB^{-1} = I = B^{-1}B). So multiply both sides of the equation AB = I on the right by B^{-1}, giving
ABB^{-1} = IB^{-1} ⇒ AI = IB^{-1} ⇒ A = B^{-1}.
This proves the theorem.
42. (a) Let A be invertible and suppose that AB = O. Left-multiply both sides of this equation by A^{-1}, giving A^{-1}AB = A^{-1}O. But A^{-1}A = I and A^{-1}O = O, so we get IB = B = O. Thus B is the zero matrix.
(b) For example, let
A = [1 2; 2 4],  B = [2 −2; −1 1].
Then
AB = [1 2; 2 4] [2 −2; −1 1] = [0 0; 0 0], but B ≠ O.
43. (a) Since A is invertible, we can multiply both sides of BA = CA on the right by A^{-1}, giving BAA^{-1} = CAA^{-1}, so that BI = CI and thus B = C.
(b) For example, let
A = [1 2; 2 4],  B = [1 1; 1 1],  C = [3 0; 1 1].
Then
BA = [1 1; 1 1] [1 2; 2 4] = [3 6; 3 6]  and  CA = [3 0; 1 1] [1 2; 2 4] = [3 6; 3 6],
so BA = CA, but B ≠ C.
44. (a) There are many possibilities. For example:
A = [1 0; 0 1] = I,  B = [1 0; 1 0],  C = [1 1; 0 0].
Then
A² = I² = I = A
B² = [1 0; 1 0] [1 0; 1 0] = [1 0; 1 0] = B
C² = [1 1; 0 0] [1 1; 0 0] = [1 1; 0 0] = C,
so that A, B, and C are idempotent.
(b) Suppose that A is an n × n matrix that is both invertible and idempotent. Then A² = A since A is idempotent. Multiply both sides of this equation on the right by A^{-1}: the left side gives A²A^{-1} = A(AA^{-1}) = A, and the right side gives AA^{-1} = I, so A = I. Thus A is the identity matrix.
45. To show that A−1 = 2I − A, it suffices to show that A(2I − A) = I. But A2 − 2A + I = O means that
I = 2A − A2 = A(2I − A). This equation says that A multiplied by 2I − A is the identity, which is
what we wanted to prove.
46. Let A be an invertible symmetric matrix. Then AA^{-1} = I. Take the transpose of both sides of this equation:
(AA^{-1})^T = I^T ⇒ (A^{-1})^T A^T = I ⇒ (A^{-1})^T A = I ⇒ (A^{-1})^T AA^{-1} = IA^{-1} = A^{-1}.
Thus (A^{-1})^T AA^{-1} = A^{-1}. But also clearly (A^{-1})^T AA^{-1} = (A^{-1})^T I = (A^{-1})^T, so that (A^{-1})^T = A^{-1}. This says that A^{-1} is its own transpose, so that A^{-1} is symmetric.
47. Let A and B be square matrices and assume that AB is invertible. Then
AB(AB)^{-1} = A(B(AB)^{-1}) = I,
so that A is invertible with A^{-1} = B(AB)^{-1}. But this means that
B((AB)^{-1}A) = (B(AB)^{-1})A = A^{-1}A = I,
so that B is also invertible, with inverse (AB)^{-1}A.
48. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 5 | 1 0; 1 4 | 0 1] →(R_2 − R_1) [1 5 | 1 0; 0 −1 | −1 1] →(−R_2) [1 5 | 1 0; 0 1 | 1 −1] →(R_1 − 5R_2) [1 0 | −4 5; 0 1 | 1 −1].
Thus the inverse of the matrix is
[−4 5; 1 −1].
49. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[−2 3 | 1 0; 4 −1 | 0 1] →(−(1/2)R_1) [1 −3/2 | −1/2 0; 4 −1 | 0 1] →(R_2 − 4R_1) [1 −3/2 | −1/2 0; 0 5 | 2 1] →((1/5)R_2) [1 −3/2 | −1/2 0; 0 1 | 2/5 1/5] →(R_1 + (3/2)R_2) [1 0 | 1/10 3/10; 0 1 | 2/5 1/5].
Thus the inverse of the matrix is
[1/10 3/10; 2/5 1/5].
50. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[4 −2 | 1 0; 2 0 | 0 1] →((1/4)R_1) [1 −1/2 | 1/4 0; 2 0 | 0 1] →(R_2 − 2R_1) [1 −1/2 | 1/4 0; 0 1 | −1/2 1] →(R_1 + (1/2)R_2) [1 0 | 0 1/2; 0 1 | −1/2 1].
Thus the inverse of the matrix is
[0 1/2; −1/2 1].
51. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 a | 1 0; −a 1 | 0 1] →(R_2 + aR_1) [1 a | 1 0; 0 a²+1 | a 1] →((1/(a²+1))R_2) [1 a | 1 0; 0 1 | a/(a²+1) 1/(a²+1)] →(R_1 − aR_2) [1 0 | 1/(a²+1) −a/(a²+1); 0 1 | a/(a²+1) 1/(a²+1)].
Thus the inverse of the matrix is
[1/(a²+1) −a/(a²+1); a/(a²+1) 1/(a²+1)].
52. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[2 3 0 | 1 0 0; 1 −2 −1 | 0 1 0; 2 0 −1 | 0 0 1]
→(R_1 ↔ R_2) [1 −2 −1 | 0 1 0; 2 3 0 | 1 0 0; 2 0 −1 | 0 0 1]
→(R_2 − 2R_1, R_3 − 2R_1) [1 −2 −1 | 0 1 0; 0 7 2 | 1 −2 0; 0 4 1 | 0 −2 1]
→((1/7)R_2) [1 −2 −1 | 0 1 0; 0 1 2/7 | 1/7 −2/7 0; 0 4 1 | 0 −2 1]
→(R_3 − 4R_2) [1 −2 −1 | 0 1 0; 0 1 2/7 | 1/7 −2/7 0; 0 0 −1/7 | −4/7 −6/7 1]
→(−7R_3) [1 −2 −1 | 0 1 0; 0 1 2/7 | 1/7 −2/7 0; 0 0 1 | 4 6 −7]
→(R_1 + R_3, R_2 − (2/7)R_3) [1 −2 0 | 4 7 −7; 0 1 0 | −1 −2 2; 0 0 1 | 4 6 −7]
→(R_1 + 2R_2) [1 0 0 | 2 3 −3; 0 1 0 | −1 −2 2; 0 0 1 | 4 6 −7].
The inverse of the matrix is
[2 3 −3; −1 −2 2; 4 6 −7].
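The adjoin-and-reduce procedure of Example 3.30 used in Exercises 48-59 is easy to automate. A minimal Gauss-Jordan sketch (my own code, with partial pivoting added for numerical safety; not from the manual):

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by row-reducing [A | I] to [I | A^{-1}] (Example 3.30's method)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])           # the augmented matrix [A | I]
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # partial pivoting
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("matrix is not invertible")
        M[[col, pivot]] = M[[pivot, col]]   # row interchange
        M[col] /= M[col, col]               # scale the pivot row
        for r in range(n):                  # clear the rest of the column
            if r != col:
                M[r] -= M[r, col] * M[col]
    return M[:, n:]

# Exercise 52's matrix; expect [[2, 3, -3], [-1, -2, 2], [4, 6, -7]].
print(gauss_jordan_inverse([[2, 3, 0], [1, -2, -1], [2, 0, -1]]))
```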
53. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 −1 2 | 1 0 0; 3 1 2 | 0 1 0; 2 3 −1 | 0 0 1]
→(R_2 − 3R_1, R_3 − 2R_1) [1 −1 2 | 1 0 0; 0 4 −4 | −3 1 0; 0 5 −5 | −2 0 1]
→(4R_3 − 5R_2) [1 −1 2 | 1 0 0; 0 4 −4 | −3 1 0; 0 0 0 | 7 −5 4].
Since the matrix on the left has a zero row, it cannot be reduced to the identity matrix, so the given matrix is not invertible.
54. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 1 0 | 1 0 0; 1 0 1 | 0 1 0; 0 1 1 | 0 0 1]
→(R_2 − R_1) [1 1 0 | 1 0 0; 0 −1 1 | −1 1 0; 0 1 1 | 0 0 1]
→(−R_2) [1 1 0 | 1 0 0; 0 1 −1 | 1 −1 0; 0 1 1 | 0 0 1]
→(R_3 − R_2) [1 1 0 | 1 0 0; 0 1 −1 | 1 −1 0; 0 0 2 | −1 1 1]
→((1/2)R_3) [1 1 0 | 1 0 0; 0 1 −1 | 1 −1 0; 0 0 1 | −1/2 1/2 1/2]
→(R_2 + R_3, then R_1 − R_2) [1 0 0 | 1/2 1/2 −1/2; 0 1 0 | 1/2 −1/2 1/2; 0 0 1 | −1/2 1/2 1/2].
The inverse of the matrix is
[1/2 1/2 −1/2; 1/2 −1/2 1/2; −1/2 1/2 1/2].
55. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[a 0 0 | 1 0 0; 1 a 0 | 0 1 0; 0 1 a | 0 0 1]
→((1/a)R_1, (1/a)R_2, (1/a)R_3) [1 0 0 | 1/a 0 0; 1/a 1 0 | 0 1/a 0; 0 1/a 1 | 0 0 1/a]
→(R_2 − (1/a)R_1) [1 0 0 | 1/a 0 0; 0 1 0 | −1/a² 1/a 0; 0 1/a 1 | 0 0 1/a]
→(R_3 − (1/a)R_2) [1 0 0 | 1/a 0 0; 0 1 0 | −1/a² 1/a 0; 0 0 1 | 1/a³ −1/a² 1/a].
Thus the inverse of the matrix is
[1/a 0 0; −1/a² 1/a 0; 1/a³ −1/a² 1/a].
Note that these calculations assume a ≠ 0. But if a = 0, then the original matrix is not invertible, since its first row is zero.
56. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]. Note that if a = 0, then the first row is zero, so the matrix is clearly not invertible. So assume a ≠ 0. Then
[0 a 0 | 1 0 0; b 0 c | 0 1 0; 0 d 0 | 0 0 1] →(R_3 − (d/a)R_1) [0 a 0 | 1 0 0; b 0 c | 0 1 0; 0 0 0 | −d/a 0 1].
Since the matrix on the left has a zero row, it cannot be reduced to the identity matrix, so the original matrix is not invertible.
57. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[0 −1 1 0 | 1 0 0 0; 2 1 0 2 | 0 1 0 0; 1 −1 3 0 | 0 0 1 0; 0 1 1 −1 | 0 0 0 1] → [1 0 0 0 | −11 −2 5 −4; 0 1 0 0 | 4 1 −2 2; 0 0 1 0 | 5 1 −2 2; 0 0 0 1 | 9 2 −4 3].
Thus the inverse of the matrix is
[−11 −2 5 −4; 4 1 −2 2; 5 1 −2 2; 9 2 −4 3].
58. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[√2 0 2√2 0 | 1 0 0 0; −4√2 √2 0 0 | 0 1 0 0; 0 0 1 0 | 0 0 1 0; 0 0 3 1 | 0 0 0 1] → [1 0 0 0 | √2/2 0 −2 0; 0 1 0 0 | 2√2 √2/2 −8 0; 0 0 1 0 | 0 0 1 0; 0 0 0 1 | 0 0 −3 1].
Thus the inverse of the matrix is
[√2/2 0 −2 0; 2√2 √2/2 −8 0; 0 0 1 0; 0 0 −3 1].
59. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 0 0 0 | 1 0 0 0; 0 1 0 0 | 0 1 0 0; 0 0 1 0 | 0 0 1 0; a b c d | 0 0 0 1] → [1 0 0 0 | 1 0 0 0; 0 1 0 0 | 0 1 0 0; 0 0 1 0 | 0 0 1 0; 0 0 0 1 | −a/d −b/d −c/d 1/d].
Thus the inverse of the matrix is
[1 0 0 0; 0 1 0 0; 0 0 1 0; −a/d −b/d −c/d 1/d].
60. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] over Z_2 to [I | A^{-1}]:
[0 1 | 1 0; 1 1 | 0 1] →(R_1 + R_2, then R_2 + R_1) [1 0 | 1 1; 0 1 | 1 0].
Thus the inverse of the matrix is
[1 1; 1 0].
61. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] over Z_5 to [I | A^{-1}]:
[4 2 | 1 0; 3 4 | 0 1] →(4R_1) [1 3 | 4 0; 3 4 | 0 1] →(R_2 + 2R_1) [1 3 | 4 0; 0 0 | 3 1].
Since the left-hand matrix has a zero row, it cannot be reduced to the identity matrix, so the original matrix is not invertible.
62. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] over Z_3 to [I | A^{-1}]:
[2 1 0 | 1 0 0; 1 1 2 | 0 1 0; 0 2 1 | 0 0 1]
→(2R_1) [1 2 0 | 2 0 0; 1 1 2 | 0 1 0; 0 2 1 | 0 0 1]
→(R_2 + 2R_1) [1 2 0 | 2 0 0; 0 2 2 | 1 1 0; 0 2 1 | 0 0 1]
→(2R_2) [1 2 0 | 2 0 0; 0 1 1 | 2 2 0; 0 2 1 | 0 0 1]
→(R_3 + R_2) [1 2 0 | 2 0 0; 0 1 1 | 2 2 0; 0 0 2 | 2 2 1]
→(2R_3) [1 2 0 | 2 0 0; 0 1 1 | 2 2 0; 0 0 1 | 1 1 2]
→(R_1 + R_2, R_2 + 2R_3) [1 0 1 | 1 2 0; 0 1 0 | 1 1 1; 0 0 1 | 1 1 2]
→(R_1 + 2R_3) [1 0 0 | 0 1 1; 0 1 0 | 1 1 1; 0 0 1 | 1 1 2].
Thus the inverse of the matrix is
[0 1 1; 1 1 1; 1 1 2].
63. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] over Z_7 to [I | A^{-1}]:
[1 5 0 | 1 0 0; 1 2 4 | 0 1 0; 3 6 1 | 0 0 1] → [1 0 0 | 4 6 4; 0 1 0 | 5 3 2; 0 0 1 | 0 6 5].
Thus the inverse of the matrix is
[4 6 4; 5 3 2; 0 6 5].
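Gauss-Jordan elimination over Z_p works exactly as over the reals, except that "dividing by a pivot" means multiplying by its inverse mod p (computable with Fermat's little theorem, `pow(x, p - 2, p)` for prime p). A small sketch of my own, checked against Exercise 63:

```python
def inverse_mod_p(A, p):
    """Invert a matrix over Z_p (p prime) by row-reducing [A | I], or return None."""
    n = len(A)
    # Build the augmented matrix [A | I] with entries reduced mod p.
    M = [[A[i][j] % p for j in range(n)] + [int(i == j) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] % p != 0), None)
        if pivot is None:
            return None                      # no usable pivot: not invertible
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], p - 2, p)     # pivot^{-1} mod p
        M[col] = [x * inv % p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

# Exercise 63, over Z_7; expect [[4, 6, 4], [5, 3, 2], [0, 6, 5]].
print(inverse_mod_p([[1, 5, 0], [1, 2, 4], [3, 6, 1]], 7))
# Exercise 61's matrix over Z_5 is singular:
print(inverse_mod_p([[4, 2], [3, 4]], 5))   # None
```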
64. Using block multiplication gives
[A B; O D] [A^{-1} −A^{-1}BD^{-1}; O D^{-1}] = [AA^{-1} + BO  −AA^{-1}BD^{-1} + BD^{-1}; OA^{-1} + DO  −OA^{-1}BD^{-1} + DD^{-1}] = [AA^{-1}  −BD^{-1} + BD^{-1}; O  DD^{-1}] = [I O; O I].
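The block formula of Exercise 64 is easy to spot-check numerically. The sketch below (mine, not from the manual) builds a block upper-triangular matrix from random blocks and compares the formula against `numpy.linalg.inv`:

```python
import numpy as np

rng = np.random.default_rng(1)
# Nudge the diagonal blocks toward invertibility (random matrices are
# almost surely invertible anyway; this is just a safety margin).
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)
D = rng.standard_normal((3, 3)) + 2 * np.eye(3)
B = rng.standard_normal((2, 3))

M = np.block([[A, B], [np.zeros((3, 2)), D]])

Ai, Di = np.linalg.inv(A), np.linalg.inv(D)
# Exercise 64's formula for the inverse of [A B; O D]:
M_inv = np.block([[Ai, -Ai @ B @ Di], [np.zeros((3, 2)), Di]])

assert np.allclose(M_inv, np.linalg.inv(M))
assert np.allclose(M @ M_inv, np.eye(5))
```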
65. Using block multiplication gives
[O B; C I] [−(BC)^{-1} (BC)^{-1}B; C(BC)^{-1} I − C(BC)^{-1}B]
= [−O(BC)^{-1} + BC(BC)^{-1}  O(BC)^{-1}B + B(I − C(BC)^{-1}B); −C(BC)^{-1} + IC(BC)^{-1}  C(BC)^{-1}B + I(I − C(BC)^{-1}B)]
= [BC(BC)^{-1}  B − BC(BC)^{-1}B; −C(BC)^{-1} + C(BC)^{-1}  C(BC)^{-1}B + I − C(BC)^{-1}B]
= [I O; O I].
66. Using block multiplication gives
[I B; C I] [(I − BC)^{-1} −(I − BC)^{-1}B; −C(I − BC)^{-1} I + C(I − BC)^{-1}B]
= [I(I − BC)^{-1} − BC(I − BC)^{-1}  −I(I − BC)^{-1}B + B(I + C(I − BC)^{-1}B); C(I − BC)^{-1} − I(C(I − BC)^{-1})  −C(I − BC)^{-1}B + I(I + C(I − BC)^{-1}B)]
= [(I − BC)(I − BC)^{-1}  (BC − I)(I − BC)^{-1}B + B; C(I − BC)^{-1} − C(I − BC)^{-1}  −C(I − BC)^{-1}B + I + C(I − BC)^{-1}B]
= [I  −B + B; O  I] = [I O; O I].
67. Using block multiplication gives
[O B; C D] [−(BD^{-1}C)^{-1}  (BD^{-1}C)^{-1}BD^{-1}; D^{-1}C(BD^{-1}C)^{-1}  D^{-1} − D^{-1}C(BD^{-1}C)^{-1}BD^{-1}]
= [BD^{-1}C(BD^{-1}C)^{-1}  BD^{-1} − BD^{-1}C(BD^{-1}C)^{-1}BD^{-1}; −C(BD^{-1}C)^{-1} + DD^{-1}C(BD^{-1}C)^{-1}  C(BD^{-1}C)^{-1}BD^{-1} + DD^{-1} − DD^{-1}C(BD^{-1}C)^{-1}BD^{-1}]
= [I  BD^{-1} − BD^{-1}; −C(BD^{-1}C)^{-1} + C(BD^{-1}C)^{-1}  C(BD^{-1}C)^{-1}BD^{-1} + I − C(BD^{-1}C)^{-1}BD^{-1}]
= [I O; O I].
68. Using block multiplication gives
[A B; C D] [P Q; R S] = [AP + BR  AQ + BS; CP + DR  CQ + DS]
= [AP + B(−D^{-1}CP)  A(−PBD^{-1}) + B(D^{-1} + D^{-1}CPBD^{-1}); CP + D(−D^{-1}CP)  C(−PBD^{-1}) + D(D^{-1} + D^{-1}CPBD^{-1})]
= [(A − BD^{-1}C)P  −(A − BD^{-1}C)PBD^{-1} + BD^{-1}; CP − CP  −CPBD^{-1} + I + CPBD^{-1}]
= [P^{-1}P  −P^{-1}PBD^{-1} + BD^{-1}; O  I]
= [I  −BD^{-1} + BD^{-1}; O  I] = [I O; O I].
69. Partition the matrix
A = [1 0 0 0; 0 1 0 0; 2 3 1 0; 1 2 0 1] = [I B; C I],  where B = O and C = [2 3; 1 2].
Then using Exercise 66, (I − BC)^{-1} = (I − OC)^{-1} = I^{-1} = I, so that
A^{-1} = [(I − BC)^{-1} −(I − BC)^{-1}B; −C(I − BC)^{-1} I + C(I − BC)^{-1}B] = [I −IB; −CI I + CIB] = [I O; −C I].
Substituting back in for C gives
A^{-1} = [1 0 0 0; 0 1 0 0; −2 −3 1 0; −1 −2 0 1].
70. Partition the matrix
M = [√2 0 2√2 0; −4√2 √2 0 0; 0 0 1 0; 0 0 3 1] = [A B; O D]
and use Exercise 64. Then
A^{-1} = [√2 0; −4√2 √2]^{-1} = [√2/2 0; 2√2 √2/2]
D^{-1} = [1 0; 3 1]^{-1} = [1 0; −3 1]
−A^{-1}BD^{-1} = −[√2/2 0; 2√2 √2/2] [2√2 0; 0 0] [1 0; −3 1] = −[2 0; 8 0] = [−2 0; −8 0].
Then
M^{-1} = [A^{-1} −A^{-1}BD^{-1}; O D^{-1}] = [√2/2 0 −2 0; 2√2 √2/2 −8 0; 0 0 1 0; 0 0 −3 1].
71. Partition the matrix
A = [0 0 1 1; 0 0 1 0; 0 −1 1 0; 1 1 0 1] = [O B; C I],
where B = [1 1; 1 0] and C = [0 −1; 1 1], and use Exercise 65. Then
BC = [1 1; 1 0] [0 −1; 1 1] = [1 0; 0 −1], so
−(BC)^{-1} = −[1 0; 0 −1]^{-1} = [−1 0; 0 1]
(BC)^{-1}B = [1 0; 0 −1] [1 1; 1 0] = [1 1; −1 0]
C(BC)^{-1} = [0 −1; 1 1] [1 0; 0 −1] = [0 1; 1 −1]
I − C(BC)^{-1}B = I − [0 1; 1 −1] [1 1; 1 0] = I − [1 0; 0 1] = [0 0; 0 0].
Thus
A^{-1} = [−(BC)^{-1} (BC)^{-1}B; C(BC)^{-1} I − C(BC)^{-1}B] = [−1 0 1 1; 0 1 −1 0; 0 1 0 0; 1 −1 0 0].
72. Partition the matrix
A = [0 1 1; 1 3 1; −1 5 2] = [O B; C D],
where B = [1 1], C = [1; −1], and D = [3 1; 5 2], and use Exercise 67. Then
D^{-1} = [2 −1; −5 3]
BD^{-1} = [1 1] [2 −1; −5 3] = [−3 2]
−(BD^{-1}C)^{-1} = −([−3 2] [1; −1])^{-1} = −[−5]^{-1} = [1/5]
(BD^{-1}C)^{-1}BD^{-1} = −(1/5) [−3 2] = [3/5 −2/5]
D^{-1}C(BD^{-1}C)^{-1} = −(1/5) [2 −1; −5 3] [1; −1] = −(1/5) [3; −8] = [−3/5; 8/5]
D^{-1} − D^{-1}C(BD^{-1}C)^{-1}BD^{-1} = [2 −1; −5 3] − [9/5 −6/5; −24/5 16/5] = [1/5 1/5; −1/5 −1/5].
Thus
A^{-1} = [−(BD^{-1}C)^{-1}  (BD^{-1}C)^{-1}BD^{-1}; D^{-1}C(BD^{-1}C)^{-1}  D^{-1} − D^{-1}C(BD^{-1}C)^{-1}BD^{-1}] = [1/5 3/5 −2/5; −3/5 1/5 1/5; 8/5 −1/5 −1/5].

3.4 The LU Factorization
1. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [−2 1; 2 5] = [1 0; −1 1] [−2 1; 0 6] = LU,  and  b = [5; 1].
Then
Ly = b ⇒ [1 0; −1 1] [y_1; y_2] = [5; 1] ⇒ {y_1 = 5, −y_1 + y_2 = 1} ⇒ {y_1 = 5, y_2 = y_1 + 1 = 6} ⇒ y = [5; 6].
Likewise,
Ux = y ⇒ [−2 1; 0 6] [x_1; x_2] = [5; 6] ⇒ {−2x_1 + x_2 = 5, 6x_2 = 6} ⇒ {x_2 = 1, −2x_1 = 5 − x_2 = 4} ⇒ x = [−2; 1].
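The forward/backward substitution pattern used throughout Exercises 1-6 can be written in a few lines. A minimal sketch of my own (no pivoting; assumes the triangular matrices have nonzero diagonals):

```python
def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    return y

def backward_sub(U, y):
    """Solve U x = y for upper-triangular U."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

# Exercise 1: A = LU, b = [5, 1]; expect y = [5, 6] and x = [-2, 1].
L = [[1.0, 0.0], [-1.0, 1.0]]
U = [[-2.0, 1.0], [0.0, 6.0]]
y = forward_sub(L, [5.0, 1.0])
x = backward_sub(U, y)
print(x)   # [-2.0, 1.0]
```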
2. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [4 −2; 2 3] = [1 0; 1/2 1] [4 −2; 0 4] = LU,  and  b = [0; 8].
Then
Ly = b ⇒ [1 0; 1/2 1] [y_1; y_2] = [0; 8] ⇒ {y_1 = 0, (1/2)y_1 + y_2 = 8} ⇒ y = [0; 8].
Likewise,
Ux = y ⇒ [4 −2; 0 4] [x_1; x_2] = [0; 8] ⇒ {4x_1 − 2x_2 = 0, 4x_2 = 8} ⇒ {x_2 = 2, 4x_1 = 2x_2 = 4} ⇒ x = [1; 2].
3. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [2 1 −2; −2 3 −4; 4 −3 0] = [1 0 0; −1 1 0; 2 −5/4 1] [2 1 −2; 0 4 −6; 0 0 −7/2] = LU,  and  b = [−3; 1; 0].
Then
Ly = b ⇒ {y_1 = −3, −y_1 + y_2 = 1, 2y_1 − (5/4)y_2 + y_3 = 0} ⇒ {y_1 = −3, y_2 = y_1 + 1 = −2, y_3 = −2y_1 + (5/4)y_2 = 7/2} ⇒ y = [−3; −2; 7/2].
Likewise,
Ux = y ⇒ {2x_1 + x_2 − 2x_3 = −3, 4x_2 − 6x_3 = −2, −(7/2)x_3 = 7/2} ⇒ {x_3 = −1, 4x_2 = 6x_3 − 2 = −8, 2x_1 = −x_2 + 2x_3 − 3 = −3} ⇒ x = [−3/2; −2; −1].
4. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [2 −4 0; 3 −1 4; −1 2 2] = [1 0 0; 3/2 1 0; −1/2 0 1] [2 −4 0; 0 5 4; 0 0 2] = LU,  and  b = [2; 0; −5].
Then
Ly = b ⇒ {y_1 = 2, (3/2)y_1 + y_2 = 0, −(1/2)y_1 + y_3 = −5} ⇒ {y_1 = 2, y_2 = −(3/2)y_1 = −3, y_3 = (1/2)y_1 − 5 = −4} ⇒ y = [2; −3; −4].
Likewise,
Ux = y ⇒ {2x_1 − 4x_2 = 2, 5x_2 + 4x_3 = −3, 2x_3 = −4} ⇒ {x_3 = −2, 5x_2 = −4x_3 − 3 = 5, 2x_1 = 4x_2 + 2 = 6} ⇒ x = [3; 1; −2].
5. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [2 −1 0 0; 6 −4 5 −3; 8 −4 1 0; 4 −1 0 7] = [1 0 0 0; 3 1 0 0; 4 0 1 0; 2 −1 5 1] [2 −1 0 0; 0 −1 5 −3; 0 0 1 0; 0 0 0 4] = LU,  and  b = [1; 2; 2; 1].
Then
Ly = b ⇒ {y_1 = 1, 3y_1 + y_2 = 2, 4y_1 + y_3 = 2, 2y_1 − y_2 + 5y_3 + y_4 = 1} ⇒ {y_1 = 1, y_2 = −3y_1 + 2 = −1, y_3 = −4y_1 + 2 = −2, y_4 = −2y_1 + y_2 − 5y_3 + 1 = 8} ⇒ y = [1; −1; −2; 8].
Likewise,
Ux = y ⇒ {2x_1 − x_2 = 1, −x_2 + 5x_3 − 3x_4 = −1, x_3 = −2, 4x_4 = 8} ⇒ {x_4 = 2, x_3 = −2, x_2 = 5x_3 − 3x_4 + 1 = −15, 2x_1 = x_2 + 1 = −14} ⇒ x = [−7; −15; −2; 2].
6. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [1 4 3 0; −2 −5 −1 2; 3 6 −3 −4; −5 −8 9 9] = [1 0 0 0; −2 1 0 0; 3 −2 1 0; −5 4 −2 1] [1 4 3 0; 0 3 5 2; 0 0 −2 0; 0 0 0 1] = LU,  and  b = [1; −3; −1; 0].
Then
Ly = b ⇒ {y_1 = 1, −2y_1 + y_2 = −3, 3y_1 − 2y_2 + y_3 = −1, −5y_1 + 4y_2 − 2y_3 + y_4 = 0} ⇒ {y_1 = 1, y_2 = 2y_1 − 3 = −1, y_3 = −3y_1 + 2y_2 − 1 = −6, y_4 = 5y_1 − 4y_2 + 2y_3 = −3} ⇒ y = [1; −1; −6; −3].
Likewise,
Ux = y ⇒ {x_1 + 4x_2 + 3x_3 = 1, 3x_2 + 5x_3 + 2x_4 = −1, −2x_3 = −6, x_4 = −3} ⇒ {x_4 = −3, x_3 = 3, 3x_2 = −5x_3 − 2x_4 − 1 = −10, x_1 = −4x_2 − 3x_3 + 1 = 16/3} ⇒ x = [16/3; −10/3; 3; −3].
7. We use the multiplier method from Example 3.35:
A = [1 2; −3 −1] →(R_2 − (−3)R_1) [1 2; 0 5] = U.
The multiplier was −3, so
L = [1 0; −3 1].
8. We use the multiplier method from Example 3.35:
A = [2 −4; 3 1] →(R_2 − (3/2)R_1) [2 −4; 0 7] = U.
The multiplier was 3/2, so
L = [1 0; 3/2 1].
9. We use the multiplier method from Example 3.35:
A = [1 2 3; 4 5 6; 8 7 9] →(R_2 − 4R_1, R_3 − 8R_1) [1 2 3; 0 −3 −6; 0 −9 −15] →(R_3 − 3R_2) [1 2 3; 0 −3 −6; 0 0 3] = U.
The multipliers were ℓ_{21} = 4, ℓ_{31} = 8, and ℓ_{32} = 3, so we get
L = [1 0 0; 4 1 0; 8 3 1].
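The multiplier method of Example 3.35 is the classic Doolittle-style elimination: clear below each pivot and record the multipliers in L. A short sketch of my own (no pivoting, so it assumes no zero pivots are encountered):

```python
def lu_no_pivoting(A):
    """LU factorization by the multiplier method: returns (L, U) with A = LU."""
    n = len(A)
    U = [row[:] for row in A]                 # working copy; becomes U
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]             # multiplier l_ik
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]        # the operation R_i - m R_k
    return L, U

# Exercise 9's matrix: the multipliers 4, 8, 3 should appear in L.
L, U = lu_no_pivoting([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [8.0, 7.0, 9.0]])
print(L)
print(U)
```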
10. We use the multiplier method from Example 3.35:
A = [2 2 −1; 4 0 4; 3 4 4] →(R_2 − 2R_1, R_3 − (3/2)R_1) [2 2 −1; 0 −4 6; 0 1 11/2] →(R_3 − (−1/4)R_2) [2 2 −1; 0 −4 6; 0 0 7] = U.
The multipliers were ℓ_{21} = 2, ℓ_{31} = 3/2, and ℓ_{32} = −1/4, so we get
L = [1 0 0; 2 1 0; 3/2 −1/4 1].
11. We use the multiplier method from Example 3.35:
A = [1 2 3 −1; 2 6 3 0; 0 6 −6 7; −1 −2 −9 0]
→(R_2 − 2R_1, R_3 − 0R_1, R_4 − (−1)R_1) [1 2 3 −1; 0 2 −3 2; 0 6 −6 7; 0 0 −6 −1]
→(R_3 − 3R_2, R_4 − 0R_2) [1 2 3 −1; 0 2 −3 2; 0 0 3 1; 0 0 −6 −1]
→(R_4 − (−2)R_3) [1 2 3 −1; 0 2 −3 2; 0 0 3 1; 0 0 0 1] = U.
The multipliers were ℓ_{21} = 2, ℓ_{31} = 0, ℓ_{41} = −1, ℓ_{32} = 3, ℓ_{42} = 0, and ℓ_{43} = −2, so
L = [1 0 0 0; 2 1 0 0; 0 3 1 0; −1 0 −2 1].
12. We use the multiplier method from Example 3.35:
A = [2 2 2 1; −2 4 −1 2; 4 4 7 3; 6 9 5 8]
→(R_2 − (−1)R_1, R_3 − 2R_1, R_4 − 3R_1) [2 2 2 1; 0 6 1 3; 0 0 3 1; 0 3 −1 5]
→(R_3 − 0R_2, R_4 − (1/2)R_2) [2 2 2 1; 0 6 1 3; 0 0 3 1; 0 0 −3/2 7/2]
→(R_4 − (−1/2)R_3) [2 2 2 1; 0 6 1 3; 0 0 3 1; 0 0 0 4] = U.
The multipliers were ℓ_{21} = −1, ℓ_{31} = 2, ℓ_{41} = 3, ℓ_{32} = 0, ℓ_{42} = 1/2, and ℓ_{43} = −1/2, so
L = [1 0 0 0; −1 1 0 0; 2 0 1 0; 3 1/2 −1/2 1].
13. We can use the multiplier method as before to put the matrix into row echelon form. In this case, however, the matrix is already in row echelon form, so it is equal to U, and L = I_3:
[1 0 1 −2; 0 3 3 1; 0 0 0 5] = [1 0 0; 0 1 0; 0 0 1] [1 0 1 −2; 0 3 3 1; 0 0 0 5].
14. We can use the multiplier method as before to put the matrix into row echelon form:
A = [2 0 −1 1; −4 −3 5 4; 2 −1 2 7; 0 3 −3 −6]
→(R_2 − (−2)R_1, R_3 − 1R_1, R_4 − 0R_1) [2 0 −1 1; 0 −3 3 6; 0 −1 3 6; 0 3 −3 −6]
→(R_3 − (1/3)R_2, R_4 − (−1)R_2) [2 0 −1 1; 0 −3 3 6; 0 0 2 4; 0 0 0 0] = U.
The multipliers were ℓ_{21} = −2, ℓ_{31} = 1, ℓ_{41} = 0, ℓ_{32} = 1/3, and ℓ_{42} = −1, so
L = [1 0 0 0; −2 1 0 0; 1 1/3 1 0; 0 −1 0 1].
15. In Exercise 1, we have
A = [−2 1; 2 5] = LU = [1 0; −1 1] [−2 1; 0 6].
Then
L^{-1} = [1 0; 1 1],  U^{-1} = −(1/12) [6 −1; 0 −2] = [−1/2 1/12; 0 1/6],
so that
A^{-1} = (LU)^{-1} = U^{-1}L^{-1} = [−1/2 1/12; 0 1/6] [1 0; 1 1] = [−5/12 1/12; 1/6 1/6].
16. In Exercise 4, we have
A = [2 −4 0; 3 −1 4; −1 2 2] = LU = [1 0 0; 3/2 1 0; −1/2 0 1] [2 −4 0; 0 5 4; 0 0 2].
To compute L^{-1}, we adjoin the identity matrix to L and row-reduce:
[1 0 0 | 1 0 0; 3/2 1 0 | 0 1 0; −1/2 0 1 | 0 0 1] → [1 0 0 | 1 0 0; 0 1 0 | −3/2 1 0; 0 0 1 | 1/2 0 1]  ⇒  L^{-1} = [1 0 0; −3/2 1 0; 1/2 0 1].
To compute U^{-1}, we adjoin the identity matrix to U and row-reduce:
[2 −4 0 | 1 0 0; 0 5 4 | 0 1 0; 0 0 2 | 0 0 1] → [1 0 0 | 1/2 2/5 −4/5; 0 1 0 | 0 1/5 −2/5; 0 0 1 | 0 0 1/2]  ⇒  U^{-1} = [1/2 2/5 −4/5; 0 1/5 −2/5; 0 0 1/2].
Then
A^{-1} = (LU)^{-1} = U^{-1}L^{-1} = [1/2 2/5 −4/5; 0 1/5 −2/5; 0 0 1/2] [1 0 0; −3/2 1 0; 1/2 0 1] = [−1/2 2/5 −4/5; −1/2 1/5 −2/5; 1/4 0 1/2].
17. Using the LU factorization from Exercise 1, we compute A^{-1} one column at a time using the method outlined in the text. That is, we first solve Ly_i = e_i, and then solve Ux_i = y_i. Column 1 of A^{-1} is found as follows:
Ly_1 = e_1 ⇒ [1 0; −1 1] [y_{11}; y_{21}] = [1; 0] ⇒ {y_{11} = 1, −y_{11} + y_{21} = 0} ⇒ y_1 = [1; 1]
Ux_1 = y_1 ⇒ [−2 1; 0 6] [x_{11}; x_{21}] = [1; 1] ⇒ {−2x_{11} + x_{21} = 1, 6x_{21} = 1} ⇒ x_1 = [−5/12; 1/6].
Use the same procedure to find the second column:
Ly_2 = e_2 ⇒ [1 0; −1 1] [y_{12}; y_{22}] = [0; 1] ⇒ {y_{12} = 0, −y_{12} + y_{22} = 1} ⇒ y_2 = [0; 1]
Ux_2 = y_2 ⇒ [−2 1; 0 6] [x_{12}; x_{22}] = [0; 1] ⇒ {−2x_{12} + x_{22} = 0, 6x_{22} = 1} ⇒ x_2 = [1/12; 1/6].
So
A^{-1} = [x_1 x_2] = [−5/12 1/12; 1/6 1/6].
This matches what we found in Exercise 15.
18. Using the LU factorization from Exercise 4, we compute A^{-1} one column at a time using the method outlined in the text. That is, we first solve Ly_i = e_i, and then solve Ux_i = y_i. Column 1 of A^{-1} is found as follows:
Ly_1 = e_1 ⇒ {y_{11} = 1, (3/2)y_{11} + y_{21} = 0, −(1/2)y_{11} + y_{31} = 0} ⇒ y_1 = [1; −3/2; 1/2]
Ux_1 = y_1 ⇒ {2x_{11} − 4x_{21} = 1, 5x_{21} + 4x_{31} = −3/2, 2x_{31} = 1/2} ⇒ x_1 = [−1/2; −1/2; 1/4].
Repeat this procedure to find the second column:
Ly_2 = e_2 ⇒ y_2 = [0; 1; 0]
Ux_2 = y_2 ⇒ {2x_{12} − 4x_{22} = 0, 5x_{22} + 4x_{32} = 1, 2x_{32} = 0} ⇒ x_2 = [2/5; 1/5; 0].
Do this again to find the third column:
Ly_3 = e_3 ⇒ y_3 = [0; 0; 1]
Ux_3 = y_3 ⇒ {2x_{13} − 4x_{23} = 0, 5x_{23} + 4x_{33} = 0, 2x_{33} = 1} ⇒ x_3 = [−4/5; −2/5; 1/2].
Thus A^{-1} = [x_1 x_2 x_3] = [−1/2 2/5 −4/5; −1/2 1/5 −2/5; 1/4 0 1/2]; this matches what we found in Exercise 16.
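Exercises 17-18's column-by-column strategy reuses one factorization for n triangular solves, which is exactly why numerical libraries factor once and then solve many times. A self-contained sketch of my own using Exercise 4's factorization:

```python
import numpy as np

# Build A^{-1} one column at a time from A = LU: for each standard basis
# vector e_i, solve L y = e_i (forward) then U x = y (backward); the x's
# are the columns of A^{-1}.
L = np.array([[1.0, 0.0, 0.0], [1.5, 1.0, 0.0], [-0.5, 0.0, 1.0]])
U = np.array([[2.0, -4.0, 0.0], [0.0, 5.0, 4.0], [0.0, 0.0, 2.0]])

n = L.shape[0]
cols = []
for i in range(n):
    e = np.zeros(n)
    e[i] = 1.0
    y = np.linalg.solve(L, e)     # forward substitution (L is triangular)
    x = np.linalg.solve(U, y)     # backward substitution
    cols.append(x)
A_inv = np.column_stack(cols)

print(A_inv)   # should match the answer in Exercises 16 and 18
```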
19. This matrix results from the identity matrix by first interchanging rows 1 and 2, and then interchanging
the (new) row 1 with row 3, so it is
E_{13}E_{12} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.
Other products of elementary matrices are possible.
20. This matrix results from the identity matrix by first interchanging rows 1 and 4, and then interchanging
rows 2 and 3, so it is
E_{14}E_{23} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}

Other products of elementary matrices are possible; for example, since E_{14} and E_{23} swap disjoint pairs of rows, also P = E_{23}E_{14}.
21. To get from the given matrix to the identity matrix, perform the following exchanges:
\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_3} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_2 \leftrightarrow R_3} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_3 \leftrightarrow R_4} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
Thus we can construct the given matrix P by multiplying these elementary matrices together in the
reverse order:
P = E13 E23 E34 .
Other combinations are possible.
22. To get from the given matrix to the identity matrix, perform the following exchanges:
\begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_2 \leftrightarrow R_5} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} \xrightarrow{R_3 \leftrightarrow R_5} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_4 \leftrightarrow R_5} I_5
Thus we can construct the given matrix P by multiplying these elementary matrices together in the
reverse order:
P = E12 E25 E35 E45 .
Other combinations are possible.
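The bookkeeping in Exercises 19–22 is easy to check mechanically: an elementary matrix E_ij is the identity with rows i and j swapped, and the claimed product must reproduce P. A sketch (the helper E is ours; it takes 1-based row indices to match the text's notation):

```python
import numpy as np

def E(i, j, n):
    """Elementary permutation matrix: I_n with rows i and j (1-based) swapped."""
    M = np.eye(n, dtype=int)
    M[[i - 1, j - 1]] = M[[j - 1, i - 1]]
    return M

# Exercise 22: the claimed factorization P = E12 E25 E35 E45.
P = E(1, 2, 5) @ E(2, 5, 5) @ E(3, 5, 5) @ E(4, 5, 5)
```

Since every permutation matrix satisfies P Pᵀ = I, that identity makes a convenient extra sanity check.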
23. To find a P T LU factorization of A, we begin by permuting the rows of A to get it into a form that
will result in a small number of operations to reduce to row echelon form. Since
A = \begin{bmatrix} 0 & 1 & 4 \\ -1 & 2 & 1 \\ 1 & 3 & 3 \end{bmatrix}, \quad \text{we have} \quad PA = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 & 4 \\ -1 & 2 & 1 \\ 1 & 3 & 3 \end{bmatrix} = \begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 1 & 3 & 3 \end{bmatrix}.
172
CHAPTER 3. MATRICES
Now follow the multiplier method from Example 3.35 to find the LU factorization of this matrix:
PA = \begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 1 & 3 & 3 \end{bmatrix} \xrightarrow{R_3 - (-R_1)} \begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 0 & 5 & 4 \end{bmatrix} \xrightarrow{R_3 - 5R_2} \begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 0 & 0 & -16 \end{bmatrix} = U.
Then `21 = 0, `31 = −1, and `32 = 5, so that
L = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 5 & 1 \end{bmatrix}.
Thus
A = P^T LU = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 5 & 1 \end{bmatrix}\begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 0 & 0 & -16 \end{bmatrix}.
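A factorization like this is cheap to verify by multiplying the factors back together; the check below uses the matrices of Exercise 23. (scipy.linalg.lu computes an equivalent factorization A = PLU, but its pivoting choices need not reproduce the one done by hand here, so we check ours directly.)

```python
import numpy as np

A = np.array([[0, 1, 4], [-1, 2, 1], [1, 3, 3]])
P = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])   # swaps rows 1 and 2
L = np.array([[1, 0, 0], [0, 1, 0], [-1, 5, 1]])
U = np.array([[-1, 2, 1], [0, 1, 4], [0, 0, -16]])

assert (P @ A == L @ U).all()      # PA = LU
assert (A == P.T @ L @ U).all()    # A = P^T L U
```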
24. To find a P T LU factorization of A, we begin by permuting the rows of A to get it into a form that
will result in a small number of operations to reduce to row echelon form. Since
A = \begin{bmatrix} 0 & 0 & 1 & 2 \\ -1 & 1 & 3 & 2 \\ 0 & 2 & 1 & 1 \\ 1 & 1 & -1 & 0 \end{bmatrix}, \quad \text{we have} \quad PA = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} A = \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ -1 & 1 & 3 & 2 \end{bmatrix}.
Now follow the multiplier method from Example 3.35 to find the LU factorization of this matrix:
PA = \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ -1 & 1 & 3 & 2 \end{bmatrix} \xrightarrow{R_4 - (-R_1)} \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 2 & 2 & 2 \end{bmatrix} \xrightarrow{R_4 - R_2} \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix} \xrightarrow{R_4 - R_3} \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & -1 \end{bmatrix} = U.
Then ℓ_{41} = -1, ℓ_{42} = 1, ℓ_{43} = 1, and the other ℓ_{ij} are zero, so that
L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 1 & 1 & 1 \end{bmatrix}.
Thus
A = P^T LU = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 1 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & -1 \end{bmatrix}.
25. To find a P T LU factorization of A, we begin by permuting the rows of A to get it into a form that
will result in a small number of operations to reduce to row echelon form. Since
A = \begin{bmatrix} 0 & -1 & 1 & 3 \\ -1 & 1 & 1 & 2 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad \text{we have} \quad PA = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} A = \begin{bmatrix} -1 & 1 & 1 & 2 \\ 0 & -1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & -1 & 1 \end{bmatrix}.
Now follow the multiplier method from Example 3.35 to find the LU factorization of this matrix:
PA = \begin{bmatrix} -1 & 1 & 1 & 2 \\ 0 & -1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & -1 & 1 \end{bmatrix} \xrightarrow{R_4 - (-R_2)} \begin{bmatrix} -1 & 1 & 1 & 2 \\ 0 & -1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 4 \end{bmatrix} = U.
Then `42 = −1 and the other `ij are zero, so that
L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}.
Thus
A = P^T LU = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}\begin{bmatrix} -1 & 1 & 1 & 2 \\ 0 & -1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 4 \end{bmatrix}.
26. From the text, a permutation matrix consists of the rows of the identity matrix written in some order.
So each row and each column of a permutation matrix has exactly one 1 in it, and the remaining entries
in that row or column are zero. Thus there are n choices for the column containing the 1 in the first
row; once that is chosen, there are n − 1 choices for the column to contain the 1 in the second row.
Next, there are n − 2 choices in the third row. Continuing, we decrease the number of choices by one
each time, until there is just one choice in the last row. Multiplying all these together gives the total
number of choices, i.e., the number of n×n permutation matrices, which is n·(n−1)·(n−2) · · · 2·1 = n!.
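The counting argument can be confirmed by brute force for small n: every n × n permutation matrix is the rows of I_n in some order, so enumerating arrangements of the rows must produce exactly n! matrices.

```python
import numpy as np
from itertools import permutations
from math import factorial

def permutation_matrices(n):
    """All n x n permutation matrices: the rows of I_n in every possible order."""
    I = np.eye(n, dtype=int)
    return [I[list(p)] for p in permutations(range(n))]

for n in range(1, 6):
    assert len(permutation_matrices(n)) == factorial(n)
```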
27. To solve Ax = P T LU x = b, multiply both sides by P to get LU x = P b. Now compute P b, which is a
permutation of the rows of b, and solve the resulting equation by the method outlined after Theorem
3.15:
P^T = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} ⇒ P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} ⇒ Pb = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix}.
Then
Ly = Pb ⇒ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1/2 & -1/2 & 1 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} ⇒ \begin{cases} y_1 = 1 \\ y_2 = 1 \\ \frac{1}{2}y_1 - \frac{1}{2}y_2 + y_3 = 5 \end{cases} ⇒ \begin{cases} y_1 = 1 \\ y_2 = 1 \\ y_3 = 5 - \frac{1}{2}y_1 + \frac{1}{2}y_2 \end{cases} ⇒ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix}.
Likewise,
Ux = y ⇒ \begin{bmatrix} 2 & 3 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & -5/2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} ⇒ \begin{cases} 2x_1 + 3x_2 + 2x_3 = 1 \\ x_2 - x_3 = 1 \\ -\frac{5}{2}x_3 = 5 \end{cases} ⇒ \begin{cases} x_3 = -2 \\ x_2 = x_3 + 1 = -1 \\ 2x_1 = 1 - 3x_2 - 2x_3 = 8 \end{cases} ⇒ x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 \\ -1 \\ -2 \end{bmatrix}.
28. To solve Ax = P T LU x = b, multiply both sides by P to get LU x = P b. Now compute P b, which is a
permutation of the rows of b, and solve the resulting equation by the method outlined after Theorem
3.15:
P^T = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} ⇒ P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} ⇒ Pb = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} 16 \\ -4 \\ 4 \end{bmatrix} = \begin{bmatrix} -4 \\ 4 \\ 16 \end{bmatrix}.
Then
Ly = Pb ⇒ \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} -4 \\ 4 \\ 16 \end{bmatrix} ⇒ \begin{cases} y_1 = -4 \\ y_1 + y_2 = 4 \\ 2y_1 - y_2 + y_3 = 16 \end{cases} ⇒ \begin{cases} y_1 = -4 \\ y_2 = 4 - y_1 = 8 \\ y_3 = 16 - 2y_1 + y_2 = 32 \end{cases} ⇒ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} -4 \\ 8 \\ 32 \end{bmatrix}.
Likewise,
Ux = y ⇒ \begin{bmatrix} 4 & 1 & 2 \\ 0 & -1 & 1 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -4 \\ 8 \\ 32 \end{bmatrix} ⇒ \begin{cases} 4x_1 + x_2 + 2x_3 = -4 \\ -x_2 + x_3 = 8 \\ 2x_3 = 32 \end{cases} ⇒ \begin{cases} x_3 = 16 \\ x_2 = x_3 - 8 = 8 \\ 4x_1 = -x_2 - 2x_3 - 4 = -44 \end{cases} ⇒ x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -11 \\ 8 \\ 16 \end{bmatrix}.
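The two-stage solve in Exercises 27 and 28 (forward substitution for Ly = Pb, then back substitution for Ux = y) can be sketched numerically with the data of Exercise 28. np.linalg.solve is used as a stand-in for the triangular solves; since L and U are triangular, it produces the same y and x as substitution does.

```python
import numpy as np

Pt = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])   # P^T from Exercise 28
L = np.array([[1, 0, 0], [1, 1, 0], [2, -1, 1]])
U = np.array([[4, 1, 2], [0, -1, 1], [0, 0, 2]])
b = np.array([16, -4, 4])

P = Pt.T
Pb = P @ b                    # permute b first: (-4, 4, 16)
y = np.linalg.solve(L, Pb)    # the forward-substitution stage
x = np.linalg.solve(U, y)     # the back-substitution stage
```

The result x = (-11, 8, 16) agrees with the hand computation, and Pᵀ L U x reproduces b.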
29. Let A = (aij ) and B = (bij ) be unit lower triangular n × n matrices, so that
a_{ij} = b_{ij} = \begin{cases} 0 & i < j \\ 1 & i = j \\ * & i > j \end{cases}
(where of course the ∗'s may be different for A and B). Then if C = (c_{ij}) = AB, we have
cij = ai1 b1j + ai2 b2j + · · · + ai(j−1) b(j−1)j + aij bjj + · · · + ain bnj .
If i < j, then all the terms up through a_{i(j-1)}b_{(j-1)j} are zero since the b's are zero, and all terms from a_{ij}b_{jj} to the end are zero since the a's are zero.
If i = j, then all the terms up through a_{i(j-1)}b_{(j-1)j} are zero since the b's are zero, the term a_{ij}b_{jj} = a_{jj}b_{jj} is 1, and all terms after it are zero since the a's are zero.
Finally, if i > j, multiple terms in the sum may be nonzero.
Thus we get
c_{ij} = \begin{cases} 0 & i < j \\ 1 & i = j \\ * & i > j \end{cases}
so that C is unit lower triangular as well.
30. To invert L, we row-reduce [L | I]:
\left[\begin{array}{cccc|cccc} 1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\ * & 1 & \cdots & 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ * & * & \cdots & 1 & 0 & 0 & \cdots & 1 \end{array}\right].
To do this, we first add multiples of the first row to each of the lower rows to make the first column
zero except for the 1 in row 1. Note that the result of this is a unit lower triangular matrix on the right
(since we added multiples of the leading 1 in the first row to the lower rows). Now do the same with
row 2: add multiples of that row to each of the lower rows, clearing column 2 on the left except for the
1 in row 2. Again, for the same reason, the result on the right remains unit lower triangular. Continue
this process for each row. The result is the identity matrix on the left, and a unit lower triangular
matrix on the right. Thus L is invertible and its inverse is unit lower triangular.
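Both closure facts — products (Exercise 29) and inverses (Exercise 30) of unit lower triangular matrices are again unit lower triangular — are easy to spot-check numerically on random examples; the helpers below are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit_lower(n):
    """Unit lower triangular: 1s on the diagonal, random entries strictly below."""
    return np.tril(rng.standard_normal((n, n)), k=-1) + np.eye(n)

def is_unit_lower(M, tol=1e-9):
    """True if M has zeros strictly above the diagonal and 1s on it."""
    return (np.allclose(np.triu(M, k=1), 0, atol=tol)
            and np.allclose(np.diag(M), 1, atol=tol))

A, B = random_unit_lower(5), random_unit_lower(5)
assert is_unit_lower(A @ B)              # Exercise 29: closed under products
assert is_unit_lower(np.linalg.inv(A))   # Exercise 30: closed under inverses
```

A numerical check is of course no proof, but it is a quick guard against a slip in the index argument above.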
31. We start by finding D. First, suppose D is a diagonal 2 × 2 matrix and A is any 2 × 2 matrix with 1’s
on the diagonal. Then
DA = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} A
has the effect of multiplying the first row of A by a and the second row of A by b, so results in a matrix
with a and b in the diagonal entries. In our example, we want to write
U = \begin{bmatrix} -2 & 1 \\ 0 & 6 \end{bmatrix}
as unit upper triangular, say U = DU1 . So we must turn the −2 and 6 into 1’s, and thus
U = \begin{bmatrix} -2 & 0 \\ 0 & 6 \end{bmatrix}\begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix} = DU_1.
Thus
A = LU = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} -2 & 1 \\ 0 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} -2 & 0 \\ 0 & 6 \end{bmatrix}\begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix} = LDU_1.
32. We start by finding D. First, suppose D is a diagonal 3 × 3 matrix and A is any 3 × 3 matrix with 1’s
on the diagonal. Then
DA = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix} A
has the effect of multiplying the first row of A by a, the second row of A by b, and the third row of A
by c, so the result is a matrix with a, b, and c on the diagonal. In our example, we want to write
U = \begin{bmatrix} 2 & -4 & 0 \\ 0 & 5 & 4 \\ 0 & 0 & 2 \end{bmatrix}
as unit upper triangular, say U = DU1 . So we must turn the 2, 5, and 2 into 1’s, and thus
U = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 4/5 \\ 0 & 0 & 1 \end{bmatrix} = DU_1.
Thus
A = LU = \begin{bmatrix} 1 & 0 & 0 \\ 3/2 & 1 & 0 \\ -1/2 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & -4 & 0 \\ 0 & 5 & 4 \\ 0 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 3/2 & 1 & 0 \\ -1/2 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 4/5 \\ 0 & 0 & 1 \end{bmatrix} = LDU_1.
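Extracting D from U, as done by hand in Exercises 31 and 32, amounts to reading off the pivots and dividing each row of U by its pivot; a numpy sketch with the matrices of Exercise 32:

```python
import numpy as np

L = np.array([[1.0, 0, 0], [1.5, 1, 0], [-0.5, 0, 1]])
U = np.array([[2.0, -4, 0], [0, 5, 4], [0, 0, 2]])

d = np.diag(U)          # the pivots: 2, 5, 2
D = np.diag(d)
U1 = U / d[:, None]     # divide row i of U by its pivot, giving unit upper triangular U1
```

By construction U = D U1 and A = L D U1, with U1 having 1s on its diagonal.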
33. We must show that if A = LDU is both symmetric and invertible, then U = LT . Since A is symmetric,
L(DU) = LDU = A = A^T = (LDU)^T = U^T D^T L^T = U^T(D^T L^T) = U^T(DL^T)
since the diagonal matrix D is symmetric. Since L is lower triangular, it follows that LT is upper
triangular. Since any diagonal matrix is also upper triangular, Exercise 29 in Section 3.2 tells us that
176
CHAPTER 3. MATRICES
DLT and DU are both upper triangular. Finally, since U is upper triangular, it follows that U T is
lower triangular. Thus we have A = U T (DLT ) = L(DU ), giving two LU -decompositions of A. But
since A is invertible, Theorem 3.16 tells us that the LU -decomposition is unique, so that U T = L.
Taking transposes gives U = LT .
34. Suppose A = L1 D1 LT1 = L2 D2 LT2 where L1 and L2 are unit lower triangular, D1 and D2 are diagonal,
and A is symmetric and invertible. We want to show that L1 = L2 and D1 = D2 . First, since
L2 (D2 LT2 ) = A = L1 (D1 LT1 )
gives two LU -factorizations of an invertible matrix A, it follows from Theorem 3.16 that these two
factorizations are the same, so that L1 = L2 and D2 LT2 = D1 LT1 . Since L1 = L2 , this equation
becomes D_2L_2^T = D_1L_2^T. From Exercise 47 in Section 3.3, since A = L_2(D_2L_2^T) and A is invertible, we see that both L_2 and D_2L_2^T are invertible; thus L_2^T is invertible as well, with (L_2^T)^{-1} = (L_2^{-1})^T. So multiply D_2L_2^T = D_1L_2^T on the right by (L_2^T)^{-1}, giving D_2 = D_1, and we are done.
3.5 Subspaces, Basis, Dimension, and Rank
1. Geometrically, x = 0 is a line through the origin in R^2, so you might conclude from Example 3.37 that this is a subspace. And in fact, the set of vectors \begin{bmatrix} x \\ y \end{bmatrix} with x = 0 is

\begin{bmatrix} 0 \\ y \end{bmatrix} = y \begin{bmatrix} 0 \\ 1 \end{bmatrix}.

Since y is arbitrary, S = span\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right), and this is a subspace of R^2 by Theorem 3.19.
2. This is not a subspace. For example, \begin{bmatrix} 1 \\ 0 \end{bmatrix} is in S, but -\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \end{bmatrix} is not in S. This violates property 3 of the definition of a subspace.
3. Geometrically, y = 2x is a line through the origin in R^2, so you might conclude from Example 3.37 that this is a subspace. And in fact, the set of vectors \begin{bmatrix} x \\ y \end{bmatrix} with y = 2x is

\begin{bmatrix} x \\ 2x \end{bmatrix} = x \begin{bmatrix} 1 \\ 2 \end{bmatrix}.

Since x is arbitrary, S = span\left(\begin{bmatrix} 1 \\ 2 \end{bmatrix}\right), and this is a subspace of R^2 by Theorem 3.19.
4. This is not a subspace. For example, both \begin{bmatrix} -3 \\ -1 \end{bmatrix} and \begin{bmatrix} 2 \\ 2 \end{bmatrix} are in S, since (-3)·(-1) and 2·2 are both nonnegative. But their sum,

\begin{bmatrix} -3 \\ -1 \end{bmatrix} + \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix},

is not in S since (-1)·1 < 0. So S is not a subspace (violates property 2 of the definition).
5. Geometrically, x = y = z is a line through the origin in R^3, so you could conclude from Exercise 9 in this section that this is a subspace. And in fact, the set of vectors \begin{bmatrix} x \\ y \\ z \end{bmatrix} with x = y = z is

\begin{bmatrix} x \\ x \\ x \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.

Since x is arbitrary, S = span\left(\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right), and this is a subspace of R^3 by Theorem 3.19.
6. Geometrically, this is a line in R^3 lying in the xz-plane and passing through the origin, so Exercise 9 in this section implies that it is a subspace. The set of vectors \begin{bmatrix} x \\ y \\ z \end{bmatrix} satisfying these conditions is

\begin{bmatrix} x \\ 0 \\ 2x \end{bmatrix} = x \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}.

Since x is arbitrary, S = span\left(\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}\right), and this is a subspace of R^3 by Theorem 3.19.
7. This is a plane in R3 , but it does not pass through the origin since (0, 0, 0) does not satisfy the equation.
So condition 1 in the definition is violated, and this is not a subspace.
8. This is not a subspace. For example,

\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} ∈ S \quad \text{and} \quad \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} ∈ S,

since for the first vector |1 - 0| = |0 - 1| and for the second |0 - 1| = |1 - 2|. But their sum,

\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix} ∉ S,

since |1 - 1| ≠ |1 - 3|.
9. If a line ` passes through the origin, then it has the equation x = 0 + td = td, so that the set of points
on ` is just span(d). Then Theorem 3.19 implies that ` is a subspace of R3 . (Note that we did not use
the assumption that the vectors lay in R3 ; in fact this conclusion holds in any Rn for n ≥ 1.)
10. This is not a subspace of R2 . For example, (1, 0) ∈ S and (0, 1) ∈ S, but their sum, (1, 1), is not in S
since it does not lie on either axis.
11. As in Example 3.41, b is in col(A) if and only if the augmented matrix
\left[\begin{array}{c|c} A & b \end{array}\right] = \left[\begin{array}{ccc|c} 1 & 0 & -1 & 3 \\ 1 & 1 & 1 & 2 \end{array}\right]
is consistent as a linear system. So we row-reduce that augmented matrix:
\left[\begin{array}{ccc|c} 1 & 0 & -1 & 3 \\ 1 & 1 & 1 & 2 \end{array}\right] \xrightarrow{R_2 - R_1} \left[\begin{array}{ccc|c} 1 & 0 & -1 & 3 \\ 0 & 1 & 2 & -1 \end{array}\right].
This matrix is in row echelon form and has no zero rows, so the system is consistent and thus b ∈ col(A).
Using Example 3.41, w is in row(A) if and only if the matrix
\begin{bmatrix} A \\ w \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ -1 & 1 & 1 \end{bmatrix}
can be row-reduced to a matrix whose last row is zero by operations excluding row interchanges
involving the last row. Thus
\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ -1 & 1 & 1 \end{bmatrix} \xrightarrow{R_2 - R_1,\; R_3 + R_1} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_3 - R_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & -2 \end{bmatrix}

This matrix is in row echelon form and has no zero rows, so we cannot make the last row zero. Thus w ∉ row(A).
12. Using Example 3.41, b is in col(A) if and only if the augmented matrix
\left[\begin{array}{c|c} A & b \end{array}\right] = \left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 2 & 1 & 1 \\ 1 & -1 & -4 & 0 \end{array}\right]
is consistent as a linear system. So we row-reduce that augmented matrix:
\left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 2 & 1 & 1 \\ 1 & -1 & -4 & 0 \end{array}\right] \xrightarrow{R_3 - R_1} \left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 2 & 1 & 1 \\ 0 & -2 & -1 & -1 \end{array}\right] \xrightarrow{R_3 + R_2} \left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right] \xrightarrow{\frac{1}{2}R_2} \left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 1 & 1/2 & 1/2 \\ 0 & 0 & 0 & 0 \end{array}\right]
The linear system has a solution (in fact, an infinite number of them), so is consistent. Thus b is in
col(A).
Using Example 3.41, w is in row(A) if and only if the matrix
\begin{bmatrix} A \\ w \end{bmatrix} = \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 1 & -1 & -4 \\ 2 & 4 & -5 \end{bmatrix}
can be row-reduced to a matrix whose last row is zero by operations excluding row interchanges
involving the last row. Thus
\begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 1 & -1 & -4 \\ 2 & 4 & -5 \end{bmatrix} \xrightarrow{R_3 - R_1,\; R_4 - 2R_1} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 0 & -2 & -1 \\ 0 & 2 & 1 \end{bmatrix} \xrightarrow{R_4 + R_3} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 0 & -2 & -1 \\ 0 & 0 & 0 \end{bmatrix}
Thus w is in row(A).
13. As in the remarks following Example 3.41, we determine whether w is in row(A) using col(A^T). To say w is in row(A) means that w^T is in col(A^T), so we form the augmented matrix and row-reduce to see if the system has a solution:
\left[\begin{array}{c|c} A^T & w^T \end{array}\right] = \left[\begin{array}{cc|c} 1 & 1 & -1 \\ 0 & 1 & 1 \\ -1 & 1 & 1 \end{array}\right] \xrightarrow{R_3 + R_1} \left[\begin{array}{cc|c} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 0 & 2 & 0 \end{array}\right] \xrightarrow{R_2 - \frac{1}{2}R_3} \left[\begin{array}{cc|c} 1 & 1 & -1 \\ 0 & 0 & 1 \\ 0 & 2 & 0 \end{array}\right] \xrightarrow{R_2 \leftrightarrow R_3} \left[\begin{array}{cc|c} 1 & 1 & -1 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{array}\right]

This matrix has a row that is all zeros except for the final column, so the system is inconsistent. Thus w^T ∉ col(A^T), and thus w ∉ row(A).
14. As in the remarks following Example 3.41, we determine whether w is in row(A) using col(A^T). To say w is in row(A) means that w^T is in col(A^T), so we form the augmented matrix and row-reduce to see if the system has a solution:
\left[\begin{array}{c|c} A^T & w^T \end{array}\right] = \left[\begin{array}{ccc|c} 1 & 0 & 1 & 2 \\ 1 & 2 & -1 & 4 \\ -3 & 1 & -4 & -5 \end{array}\right] \xrightarrow{R_2 - R_1,\; R_3 + 3R_1} \left[\begin{array}{ccc|c} 1 & 0 & 1 & 2 \\ 0 & 2 & -2 & 2 \\ 0 & 1 & -1 & 1 \end{array}\right] \xrightarrow{\frac{1}{2}R_2} \left[\begin{array}{ccc|c} 1 & 0 & 1 & 2 \\ 0 & 1 & -1 & 1 \\ 0 & 1 & -1 & 1 \end{array}\right] \xrightarrow{R_3 - R_2} \left[\begin{array}{ccc|c} 1 & 0 & 1 & 2 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right].
This is in row echelon form, and the final row consists entirely of zeros, so the system is consistent. Thus w^T ∈ col(A^T), and thus w ∈ row(A).
15. To check if v is in null(A), we need only multiply A by v:
 
−1
1 0 −1  
0
3 =
Av =
6= 0,
1 1
1
1
−1
so that v ∈
/ null(A).
16. To check if v is in null(A), we need only multiply A by v:
Av = \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 1 & -1 & -4 \end{bmatrix}\begin{bmatrix} 7 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = 0,
so that v ∈ null(A).
17. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:
\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \end{bmatrix} \xrightarrow{R_2 - R_1} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}

Thus a basis for row(A) is given by \{[1\ 0\ -1], [0\ 1\ 2]\}.
A basis for col(A) is given by the column vectors of the original matrix A corresponding to the leading
1s in the row-reduced matrix, so a basis for col(A) is
\left\{\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}.
Finally, from the row-reduced form, the solutions to Ax = 0 where
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
can be found by noting that x3 is a free variable; setting it to t and then solving for the other two
variables gives x1 = t and x2 = −2t. Thus a basis for the nullspace of A is given by
x = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}.
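The three bases just found can be compared against a CAS: sympy's rref and nullspace reproduce the computation of Exercise 17 exactly (rational arithmetic, no rounding).

```python
from sympy import Matrix

A = Matrix([[1, 0, -1], [1, 1, 1]])

R, pivots = A.rref()      # reduced row echelon form and pivot columns
ns = A.nullspace()        # basis for null(A), free variable set to 1
```

The pivot columns (0 and 1, i.e., the first and second columns) are the ones whose original columns form the basis for col(A) above, and the single nullspace vector matches (1, -2, 1).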
180
CHAPTER 3. MATRICES
18. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:
\begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 1 & -1 & -4 \end{bmatrix} \xrightarrow{R_3 - R_1} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 0 & -2 & -1 \end{bmatrix} \xrightarrow{R_3 + R_2} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix} \xrightarrow{\frac{1}{2}R_2} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 1 & 1/2 \\ 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_1 - R_2} \begin{bmatrix} 1 & 0 & -7/2 \\ 0 & 1 & 1/2 \\ 0 & 0 & 0 \end{bmatrix}

Thus a basis for row(A) is given by \{[1\ 0\ -7/2], [0\ 1\ 1/2]\} or, clearing fractions, \{[2\ 0\ -7], [0\ 2\ 1]\}.
Since the row-reduction did not involve any row exchanges, a basis for row(A) is also given by the rows
of A corresponding to nonzero rows in the reduced matrix:
\{[1\ 1\ -3], [0\ 2\ 1]\}.
A basis for col(A) is then given by the original column vectors of A corresponding to the columns
containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by
\left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}\right\}.
Finally, from the row-reduced form, the solutions of Ax = 0 where
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
can be found by noting that x_3 is a free variable; we set it to t and then solve for the other two variables in terms of t to get x_1 = (7/2)t and x_2 = -(1/2)t. Thus a basis for the nullspace of A is given by
x = \begin{bmatrix} 7/2 \\ -1/2 \\ 1 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 7 \\ -1 \\ 2 \end{bmatrix}.
19. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:
\begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & -1 & 1 \\ 0 & 1 & -1 & -1 \end{bmatrix} \xrightarrow{R_3 - R_2} \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 0 & -2 \end{bmatrix} \xrightarrow{-\frac{1}{2}R_3} \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_1 - R_3,\; R_2 - R_3} \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_1 - R_2} \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

Thus a basis for row(A) is given by \{[1\ 0\ 1\ 0], [0\ 1\ -1\ 0], [0\ 0\ 0\ 1]\}.
Since the row-reduction did not involve any row exchanges, a basis for row(A) is also given by the rows
of A corresponding to nonzero rows in the reduced matrix; since all the rows are nonzero, the rows of
A form a basis for row(A).
A basis for col(A) is then given by the original column vectors of A corresponding to the columns
containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by
\left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}\right\}.
Finally, from the row-reduced form, the solutions of Ax = 0 where
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
can be found by noting that x3 is a free variable; we set it to t. Also, x4 = 0; solving for the other two
variables in terms of t gives x1 = −t and x2 = t. Thus a basis for the nullspace of A is given by
x = \begin{bmatrix} -1 \\ 1 \\ 1 \\ 0 \end{bmatrix}.
20. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:
\begin{bmatrix} 2 & -4 & 0 & 2 & 1 \\ -1 & 2 & 1 & 2 & 3 \\ 1 & -2 & 1 & 4 & 4 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_3} \begin{bmatrix} 1 & -2 & 1 & 4 & 4 \\ -1 & 2 & 1 & 2 & 3 \\ 2 & -4 & 0 & 2 & 1 \end{bmatrix} \xrightarrow{R_2 + R_1,\; R_3 - 2R_1} \begin{bmatrix} 1 & -2 & 1 & 4 & 4 \\ 0 & 0 & 2 & 6 & 7 \\ 0 & 0 & -2 & -6 & -7 \end{bmatrix} \xrightarrow{R_3 + R_2} \begin{bmatrix} 1 & -2 & 1 & 4 & 4 \\ 0 & 0 & 2 & 6 & 7 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{\frac{1}{2}R_2} \begin{bmatrix} 1 & -2 & 1 & 4 & 4 \\ 0 & 0 & 1 & 3 & 7/2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_1 - R_2} \begin{bmatrix} 1 & -2 & 0 & 1 & 1/2 \\ 0 & 0 & 1 & 3 & 7/2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

Thus a basis for row(A) is given by the nonzero rows of the reduced matrix, which after clearing fractions are \{[2\ -4\ 0\ 2\ 1], [0\ 0\ 2\ 6\ 7]\}.
A basis for col(A) is then given by the original column vectors of A corresponding to the columns
containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by
\left\{\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\right\}.
Finally, from the row-reduced form, the solutions of Ax = 0 where
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
can be found by noting that x_2 = s, x_4 = t, and x_5 = u are free variables. Solving for the other two variables gives x_1 = 2s - t - (1/2)u and x_3 = -3t - (7/2)u. Thus the nullspace of A is given by
\begin{bmatrix} 2s - t - \frac{1}{2}u \\ s \\ -3t - \frac{7}{2}u \\ t \\ u \end{bmatrix} = s\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + t\begin{bmatrix} -1 \\ 0 \\ -3 \\ 1 \\ 0 \end{bmatrix} + u\begin{bmatrix} -1/2 \\ 0 \\ -7/2 \\ 0 \\ 1 \end{bmatrix} = span\left(\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ -3 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1/2 \\ 0 \\ -7/2 \\ 0 \\ 1 \end{bmatrix}\right).
Clearing fractions, we get for a basis for the nullspace of A
\left\{\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ -3 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ -7 \\ 0 \\ 2 \end{bmatrix}\right\}.
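The fractions in Exercise 20 invite arithmetic slips, so a CAS check is worthwhile: sympy confirms the rank, the nullity predicted by the Rank Theorem, and that each basis vector really solves Ax = 0.

```python
from sympy import Matrix

A = Matrix([[2, -4, 0, 2, 1], [-1, 2, 1, 2, 3], [1, -2, 1, 4, 4]])

ns = A.nullspace()
assert A.rank() == 2
assert len(ns) == 5 - A.rank()            # Rank Theorem: nullity = n - rank
for v in ns:
    assert A * v == Matrix([0, 0, 0])     # each basis vector lies in null(A)
```

sympy's particular basis vectors may differ from the hand-cleared ones by scaling, but they span the same nullspace.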
21. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT .
Then
A^T = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ -1 & 1 \end{bmatrix} → \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix},
so that the column space of AT is given by the columns of AT corresponding to columns of this matrix
containing a leading 1, so
col(A^T) = \left\{\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right\} ⇒ row(A) = \{[1\ 0\ -1], [1\ 1\ 1]\}.
The row space of AT is given by the nonzero rows of the reduced matrix, so
row(A^T) = \{[1\ 0], [0\ 1]\} ⇒ col(A) = \left\{\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}.
22. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT .
Then
A^T = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & -1 \\ -3 & 1 & -4 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix},
so that the column space of AT is given by the columns of AT corresponding to columns of the reduced
matrix containing a leading 1, so
col(A^T) = \left\{\begin{bmatrix} 1 \\ 1 \\ -3 \end{bmatrix}, \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}\right\} ⇒ row(A) = \{[1\ 1\ -3], [0\ 2\ 1]\}.
The row space of AT is given by the nonzero rows of the reduced matrix, so
row(A^T) = \{[1\ 0\ 1], [0\ 1\ -1]\} ⇒ col(A) = \left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\right\}.
23. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT .
Then
A^T = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & -1 & -1 \\ 1 & 1 & -1 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},
so that the column space of AT is given by the columns of AT corresponding to columns of the reduced
matrix containing a leading 1, so
col(A^T) = \left\{\begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \\ -1 \end{bmatrix}\right\} ⇒ row(A) = \{[1\ 1\ 0\ 1], [0\ 1\ -1\ 1], [0\ 1\ -1\ -1]\}.
The row space of AT is given by the nonzero rows of the reduced matrix, so
row(A^T) = \{[1\ 0\ 0], [0\ 1\ 0], [0\ 0\ 1]\} ⇒ col(A) = \left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\}.
24. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT .
Then
A^T = \begin{bmatrix} 2 & -1 & 1 \\ -4 & 2 & -2 \\ 0 & 1 & 1 \\ 2 & 2 & 4 \\ 1 & 3 & 4 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
so that the column space of AT is given by the columns of AT corresponding to columns of the reduced
matrix containing a leading 1, so
col(A^T) = \left\{\begin{bmatrix} 2 \\ -4 \\ 0 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 2 \\ 1 \\ 2 \\ 3 \end{bmatrix}\right\} ⇒ row(A) = \{[2\ -4\ 0\ 2\ 1], [-1\ 2\ 1\ 2\ 3]\}.
The row space of AT is given by the nonzero rows of the reduced matrix, so
row(A^T) = \{[1\ 0\ 1], [0\ 1\ 1]\} ⇒ col(A) = \left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\right\}.
25. There is no reason to expect the bases for the row and column spaces found in Exercises 17 and 21 to be the same, since we used different methods to find them. What is important is that they should span the same subspaces. The basis for the row space found in Exercise 17 was \{[1\ 0\ -1], [0\ 1\ 2]\}, and Exercise 21 gave \{[1\ 0\ -1], [1\ 1\ 1]\}. Since the first basis vector is the same, in order to show the subspaces are the same it suffices to see that [0\ 1\ 2] is in the span of the second basis, and that [1\ 1\ 1] is in the span of the first basis. But

[0\ 1\ 2] = -[1\ 0\ -1] + [1\ 1\ 1], \qquad [1\ 1\ 1] = [1\ 0\ -1] + [0\ 1\ 2].

So the two bases span the same row space. Similarly, the bases for the two column spaces were

Exercise 17: \left\{\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}, \qquad Exercise 21: \left\{\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}.

Here the second basis vectors are the same, and since

\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 \end{bmatrix},

each basis vector is a linear combination of the vectors in the other basis, so they span the same column space.
26. There is no reason to expect the bases for the row and column spaces found in Exercises 18 and 22 to be the same, since we used different methods to find them. What is important is that they should span the same subspaces. The basis for the row space found in Exercise 18 was \{[2\ 0\ -7], [0\ 2\ 1]\}, and Exercise 22 gave \{[1\ 1\ -3], [0\ 2\ 1]\}. Since the second basis vector is the same, in order to show the subspaces are the same it suffices to see that [2\ 0\ -7] is in the span of the second basis, and that [1\ 1\ -3] is in the span of the first basis. But

[2\ 0\ -7] = 2[1\ 1\ -3] - [0\ 2\ 1], \qquad [1\ 1\ -3] = \frac{1}{2}[2\ 0\ -7] + \frac{1}{2}[0\ 2\ 1].

So the two bases span the same row space. Similarly, the bases for the two column spaces were

Exercise 18: \left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}\right\}, \qquad Exercise 22: \left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\right\}.

Here the first basis vectors are the same, and since

\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + 2\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \qquad \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix},

each basis vector is a linear combination of the vectors in the other basis, so they span the same column space.
27. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the
row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are [1\ -1\ 0], [-1\ 0\ 1], and [0\ 1\ -1] and row-reduce it:
\begin{bmatrix} 1 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & -1 \end{bmatrix} \xrightarrow{R_2 + R_1} \begin{bmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & -1 \end{bmatrix} \xrightarrow{-R_2} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{bmatrix} \xrightarrow{R_1 + R_2,\; R_3 - R_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}
A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore
given by the nonzero rows of the reduced matrix:
\left\{\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\right\}.
Note that since no row interchanges were involved, the rows of the original matrix corresponding to
nonzero rows in the reduced matrix also form a basis for the span of the given vectors.
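Stacking the vectors as rows and row-reducing, as in Exercises 27–30, is exactly what a CAS does; the nonzero rows of the reduced row echelon form are a basis for the span. A sympy check for Exercise 27:

```python
from sympy import Matrix

M = Matrix([[1, -1, 0], [-1, 0, 1], [0, 1, -1]])

R, pivots = M.rref()
basis = [R.row(i) for i in range(len(pivots))]   # nonzero rows of the RREF
```

Here rank is 2, and the two basis rows agree with the hand computation above (sympy reduces all the way to RREF, so any echelon-form basis differing by row operations spans the same space).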
28. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the
row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are [1\ -1\ 1], [1\ 2\ 0], [0\ 1\ 1], and [2\ 1\ 2] and row-reduce it:
\begin{bmatrix} 1 & -1 & 1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \\ 2 & 1 & 2 \end{bmatrix} \xrightarrow{R_2 - R_1,\; R_4 - 2R_1} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & 3 & 0 \end{bmatrix} \xrightarrow{R_2 \leftrightarrow R_4} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 3 & 0 \\ 0 & 1 & 1 \\ 0 & 3 & -1 \end{bmatrix} \xrightarrow{\frac{1}{3}R_2} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 3 & -1 \end{bmatrix} \xrightarrow{R_3 - R_2,\; R_4 - 3R_2} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & -1 \end{bmatrix} \xrightarrow{R_4 + R_3} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}
A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore
given by
\left\{\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\}.
(Note that this matrix could be further reduced, and we would get a different basis).
29. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the
row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are [2\ -3\ 1], [1\ -1\ 0], and [4\ -4\ 1] and row-reduce it:
\begin{bmatrix} 2 & -3 & 1 \\ 1 & -1 & 0 \\ 4 & -4 & 1 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix} 1 & -1 & 0 \\ 2 & -3 & 1 \\ 4 & -4 & 1 \end{bmatrix} \xrightarrow{R_2 - 2R_1,\; R_3 - 4R_1} \begin{bmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{-R_2} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_2 + R_3} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_1 + R_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore
given by
\{[1\ 0\ 0], [0\ 1\ 0], [0\ 0\ 1]\}.
(Note that as soon as we got to the first matrix on the second line above, we could have concluded that
the reduced matrix would have no zero rows; since it was 3 × 3, it would thus be the identity matrix,
so that we could immediately write down the basis above, or simply use the basis given by the rows of
that matrix).
30. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the
row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are [0\ 1\ -2\ 1], [3\ 1\ -1\ 0], and [2\ 1\ 5\ 1] and row-reduce it:
 1





1
1
3 R1
3 1 −1 0 −−
1
0 1 −2 1
−
0
3
3
 −→ 
 R1 ↔R2 


0 1 −2 1
0 1 −2 1
3 1 −1 0 −
−
−
−
−
→






2
1
5
1
2 1
5 1
2 1



1 31 − 31 0
1



0 1 −2 1
0



R3 − 31 R2
R −2R1
17
0 13
1
−−3−−−→
−−−−−−→ 0
3


1 13 − 31
0


0 1 −2
1


3
R3
2
1 19
−19
−−→ 0 0
5
1
1
3
− 13
1
−2
0
19
3

0

1

2
3
Although this matrix could be reduced further, we already have a basis for the row space:
\{[1\ 1/3\ -1/3\ 0], [0\ 1\ -2\ 1], [0\ 0\ 1\ 2/19]\}, \quad \text{or} \quad \{[3\ 1\ -1\ 0], [0\ 1\ -2\ 1], [0\ 0\ 19\ 2]\}.
31. If, as in Exercise 29, we form the matrix whose rows are those vectors and row-reduce it, we get the identity matrix (see Exercise 29). Thus the rank of the matrix is 3, so the three rows are linearly independent, and a basis consists of all three given vectors: \{[2\ -3\ 1], [1\ -1\ 0], [4\ -4\ 1]\}.
32. If, as in Exercise 30, we form the matrix whose rows are those vectors and row-reduce it, we get a
matrix with no zero rows. Thus the rank of the matrix is 3, so the rows are linearly independent, so a
basis consists of all three row vectors:
\{[0\ 1\ -2\ 1], [3\ 1\ -1\ 0], [2\ 1\ 5\ 1]\}.
33. If R is a matrix in echelon form, then row(R) is the span of the rows of R, by definition. But nonzero
rows of R are linearly independent, since their first entries are in different columns, so those nonzero
rows form a basis for row(R) (they span, and are linearly independent).
34. col(A) is the span of the columns of A by definition. If the columns of A are also linearly independent,
then they form a basis for col(A) by definition of “basis”.
35. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From
Exercise 17, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of
columns minus the rank, or 3 − 2 = 1.
36. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From
Exercise 18, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of
columns minus the rank, or 3 − 2 = 1.
37. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From
Exercise 19, row(A) has three vectors in its basis, so rank(A) = 3. Then nullity(A) is the number of
columns minus the rank, or 4 − 3 = 1.
38. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From
Exercise 20, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of
columns minus the rank, or 5 − 2 = 3.
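The bookkeeping in Exercises 35–38 is just the Rank Theorem, nullity(A) = n - rank(A) where n is the number of columns; a quick numeric check for the matrix of Exercises 17 and 35:

```python
import numpy as np

A = np.array([[1, 0, -1], [1, 1, 1]])   # matrix of Exercises 17 and 35

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank             # Rank Theorem: nullity = n - rank
```

Here rank is 2 and nullity is 1, matching Exercise 35. (matrix_rank works in floating point; for matrices with small entries like these it agrees with the exact rank.)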
39. Since nullity(A) > 0, the equation Ax = 0 has a nontrivial solution x = \begin{bmatrix} c_1 & c_2 & \cdots & c_n \end{bmatrix}^T, where at least one c_i ≠ 0. Then Ax = c_1a_1 + c_2a_2 + \cdots + c_na_n = 0. Since at least one c_i ≠ 0, this shows that the columns of A are linearly dependent.
40. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in
this case rank(A) ≤ 2. Thus any row-reduced form of A will have at least 4 − 2 = 2 zero rows, which
means that some row of A is a linear combination of the other rows, so that the rows of A are linearly
dependent.
41. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in
this case rank(A) ≤ 5 and rank(A) ≤ 3, so that rank(A) ≤ 3 and its possible values are 0, 1, 2, and 3.
Using the Rank Theorem, we know that
nullity(A) = n − rank(A) = 5 − rank(A), so that nullity(A) = 5, 4, 3, or 2.
42. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in
this case rank(A) ≤ 2 and rank(A) ≤ 4, so that rank(A) ≤ 2 and its possible values are 0, 1, and 2.
Using the Rank Theorem, we know that
nullity(A) = n − rank(A) = 2 − rank(A), so that nullity(A) = 2, 1, or 0.
43. First row-reduce A:

    [1 2 a; −2 4a 2; a −2 1] → (R2 + 2R1, R3 − aR1) → [1 2 a; 0 4a+4 2a+2; 0 −2a−2 1−a²]
    → (R3 + (1/2)R2) → [1 2 a; 0 4a+4 2a+2; 0 0 −a²+a+2].

If a = −1, then we have [1 2 −1; 0 0 0; 0 0 0], so that rank(A) = 1.

If a = 2, then we have [1 2 2; 0 12 6; 0 0 0], so that rank(A) = 2.

Otherwise, the row-reduced matrix has all rows nonzero, since −a² + a + 2 = −(a − 2)(a + 1) ≠ 0, so that rank(A) = 3.
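The case analysis above can be spot-checked numerically. A minimal sketch (assuming NumPy is available) computing the rank of the matrix for several values of the parameter a:

```python
import numpy as np

def rank_of_A(a):
    """Rank of the matrix from Exercise 43 for a given parameter a."""
    A = np.array([[1, 2, a],
                  [-2, 4 * a, 2],
                  [a, -2, 1]], dtype=float)
    return np.linalg.matrix_rank(A)

# a = -1 and a = 2 are the zeros of -a^2 + a + 2 = -(a - 2)(a + 1);
# any other value (e.g. a = 0) gives full rank.
print(rank_of_A(-1), rank_of_A(2), rank_of_A(0))  # 1 2 3
```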
3.5. SUBSPACES, BASIS, DIMENSION, AND RANK
44. First row-reduce A:

    [a 2 −1; 3 3 −2; −2 −1 a] → (R1 ↔ R3, then R1 + R2) → [1 2 a−2; 3 3 −2; a 2 −1]
    → (R2 − 3R1, R3 − aR1) → [1 2 a−2; 0 −3 4−3a; 0 2−2a −a²+2a−1]
    → (−(1/3)R2) → [1 2 a−2; 0 1 a−4/3; 0 2−2a −a²+2a−1]
    → (R3 − (2−2a)R2) → [1 2 a−2; 0 1 a−4/3; 0 0 (a−1)(a−5/3)].

The first two rows are always nonzero. If a = 1 or a = 5/3, then the bottom row is zero, so that rank(A) = 2. Otherwise, all three rows are nonzero and rank(A) = 3.
45. As in Example 3.52, [1; 1; 0], [1; 0; 1], [0; 1; 1] form a basis for R³ if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix:

    [1 1 0; 1 0 1; 0 1 1] → (R2 − R1) → [1 1 0; 0 −1 1; 0 1 1] → (R3 + R2) → [1 1 0; 0 −1 1; 0 0 2].

Since the matrix has rank 3 (it is in echelon form and has no zero rows), the vectors are linearly independent; since there are three of them, they form a basis for R³.
46. As in Example 3.52, [1; −1; 3], [−1; 5; 1], [1; −3; 1] form a basis for R³ if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix:

    [1 −1 3; −1 5 1; 1 −3 1] → (R2 + R1, R3 − R1) → [1 −1 3; 0 4 4; 0 −2 −2] → (R2 + 2R3) → [1 −1 3; 0 0 0; 0 −2 −2].

Since the matrix does not have rank 3, it follows that these vectors do not form a basis for R³. (In fact,

    −[1; −1; 3] + [−1; 5; 1] + 2[1; −3; 1] = [0; 0; 0],

so the vectors are linearly dependent.)
47. As in Example 3.52, [1; 1; 1; 0], [1; 1; 0; 1], [1; 0; 1; 1], [0; 1; 1; 1] form a basis for R⁴ if and only if the matrix with those vectors as its rows has rank 4. Row-reduce that matrix:

    [1 1 1 0; 1 1 0 1; 1 0 1 1; 0 1 1 1] → (R2 − R1, R3 − R1) → [1 1 1 0; 0 0 −1 1; 0 −1 0 1; 0 1 1 1]
    → (R2 ↔ R4) → [1 1 1 0; 0 1 1 1; 0 −1 0 1; 0 0 −1 1]
    → (R3 + R2) → [1 1 1 0; 0 1 1 1; 0 0 1 2; 0 0 −1 1]
    → (R4 + R3) → [1 1 1 0; 0 1 1 1; 0 0 1 2; 0 0 0 3].
This matrix is in echelon form and has no zero rows, so it has rank 4 and thus these vectors form a
basis for R4 .
48. As in Example 3.52, [1; −1; 0; 0], [0; 1; 0; −1], [0; 0; −1; 1], [−1; 0; 1; 0] form a basis for R⁴ if and only if the matrix with those vectors as its rows has rank 4. Row-reduce that matrix:

    [1 −1 0 0; 0 1 0 −1; 0 0 −1 1; −1 0 1 0] → (R4 + R1) → [1 −1 0 0; 0 1 0 −1; 0 0 −1 1; 0 −1 1 0]
    → (R4 + R2) → [1 −1 0 0; 0 1 0 −1; 0 0 −1 1; 0 0 1 −1]
    → (R4 + R3) → [1 −1 0 0; 0 1 0 −1; 0 0 −1 1; 0 0 0 0].
Since the matrix does not have rank 4, these vectors do not form a basis for R4 . (In fact, the sum of
the four vectors is zero, so they are linearly dependent.)
49. As in Example 3.52, [1; 1; 0], [0; 1; 1], [1; 0; 1] form a basis for Z₂³ if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix over Z₂:

    [1 1 0; 0 1 1; 1 0 1] → (R3 + R1) → [1 1 0; 0 1 1; 0 1 1] → (R3 + R2) → [1 1 0; 0 1 1; 0 0 0].

Since the matrix does not have rank 3, these vectors do not form a basis for Z₂³.
50. As in Example 3.52, [1; 1; 0], [0; 1; 1], [1; 0; 1] form a basis for Z₃³ if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix over Z₃:

    [1 1 0; 0 1 1; 1 0 1] → (R3 + 2R1) → [1 1 0; 0 1 1; 0 2 1] → (R3 + R2) → [1 1 0; 0 1 1; 0 0 2].

Since the matrix has rank 3, these vectors form a basis for Z₃³.
51. We want to solve

    c1[1; 2; 0] + c2[1; 0; −1] = [1; 6; 2],

so set up the augmented matrix of the linear system and row-reduce to solve it:

    [1 1 | 1; 2 0 | 6; 0 −1 | 2] → (R2 − 2R1) → [1 1 | 1; 0 −2 | 4; 0 −1 | 2]
    → (−(1/2)R2) → [1 1 | 1; 0 1 | −2; 0 −1 | 2] → (R3 + R2, R1 − R2) → [1 0 | 3; 0 1 | −2; 0 0 | 0].

Thus c1 = 3 and c2 = −2, so that

    3[1; 2; 0] − 2[1; 0; −1] = [1; 6; 2].

Then the coordinate vector is [w]B = [3; −2].
52. We want to solve

    c1[3; 1; 4] + c2[5; 1; 6] = [1; 3; 4],

so set up the augmented matrix of the linear system and row-reduce to solve it:

    [3 5 | 1; 1 1 | 3; 4 6 | 4] → (R1 ↔ R2) → [1 1 | 3; 3 5 | 1; 4 6 | 4]
    → (R2 − 3R1, R3 − 4R1) → [1 1 | 3; 0 2 | −8; 0 2 | −8]
    → ((1/2)R2) → [1 1 | 3; 0 1 | −4; 0 2 | −8] → (R3 − 2R2, R1 − R2) → [1 0 | 7; 0 1 | −4; 0 0 | 0].

Thus c1 = 7 and c2 = −4, so that

    7[3; 1; 4] − 4[5; 1; 6] = [1; 3; 4].

Then the coordinate vector is [w]B = [7; −4].
53. Row-reduce A over Z₂:

    A = [1 1 0; 0 1 1; 1 0 1] → (R3 + R1) → [1 1 0; 0 1 1; 0 1 1] → (R3 + R2) → [1 1 0; 0 1 1; 0 0 0].
This matrix is in echelon form with one zero row, so it has rank 2. Thus nullity(A) = 3 − rank(A) =
3 − 2 = 1.
54. Row-reduce A over Z₃:

    A = [1 1 2; 2 1 2; 2 0 0] → (R2 + R1, R3 + R1) → [1 1 2; 0 2 1; 0 1 2] → (R3 + R2) → [1 1 2; 0 2 1; 0 0 0].
This matrix is in echelon form with one zero row, so it has rank 2. Thus nullity(A) = 3 − rank(A) =
3 − 2 = 1.
55. Row-reduce A over Z₅:

    A = [1 3 1 4; 2 3 0 1; 1 0 4 0] → (R2 + 3R1, R3 + 4R1) → [1 3 1 4; 0 2 3 3; 0 2 3 1]
    → (R3 + 4R2) → [1 3 1 4; 0 2 3 3; 0 0 0 3].
This matrix is in echelon form with no zero rows, so it has rank 3. Thus nullity(A) = 4 − rank(A) =
4 − 3 = 1.
56. Row-reduce A over Z₇:

    A = [2 4 0 0 1; 6 3 5 1 0; 1 0 2 2 5; 1 1 1 1 1]
    → (R2 + 4R1, R3 + 3R1, R4 + 3R1) → [2 4 0 0 1; 0 5 5 1 4; 0 5 2 2 1; 0 6 1 1 4]
    → (R3 + 6R2, R4 + 3R2) → [2 4 0 0 1; 0 5 5 1 4; 0 0 4 1 4; 0 0 2 4 2]
    → (R4 + 3R3) → [2 4 0 0 1; 0 5 5 1 4; 0 0 4 1 4; 0 0 0 0 0].
This matrix is in echelon form with one zero row, so it has rank 3. Thus nullity(A) = 5 − rank(A) =
5 − 3 = 2.
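The ranks over Zₚ in Exercises 53–56 can be computed with a small Gaussian-elimination routine in modular arithmetic. A minimal sketch in pure Python (no libraries; `pow(x, -1, p)` gives the modular inverse for prime p):

```python
def rank_mod_p(rows, p):
    """Rank of a matrix over Z_p via Gaussian elimination mod p (p prime)."""
    A = [[x % p for x in row] for row in rows]
    m, n = len(A), len(A[0])
    rank, col = 0, 0
    while rank < m and col < n:
        # Find a row with a nonzero pivot in this column.
        pivot = next((r for r in range(rank, m) if A[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        inv = pow(A[rank][col], -1, p)          # modular inverse of the pivot
        A[rank] = [(x * inv) % p for x in A[rank]]
        for r in range(m):                       # eliminate the column elsewhere
            if r != rank and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % p for a, b in zip(A[r], A[rank])]
        rank += 1
        col += 1
    return rank

# Exercise 55: rank 3 over Z_5, so nullity = 4 - 3 = 1.
print(rank_mod_p([[1, 3, 1, 4], [2, 3, 0, 1], [1, 0, 4, 0]], 5))  # 3
```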
57. Since x is in null(A), we know that Ax = 0. But if Ai is the ith row of A, then

    Ax = [a11 x1 + a12 x2 + · · · + a1n xn; a21 x1 + a22 x2 + · · · + a2n xn; . . . ; am1 x1 + am2 x2 + · · · + amn xn]
       = [A1 · x; A2 · x; . . . ; Am · x] = [0; 0; . . . ; 0].

Thus Ai · x = 0 for all i, so that every row of A is orthogonal to x.
58. By the Fundamental Theorem, if A and B have rank n, then they are both invertible. Therefore, by
Theorem 3.9(c), AB is invertible as well. Applying the Fundamental Theorem again shows that AB
also has rank n.
59. (a) Let bk be the k th column of B. Then Abk is the k th column of AB. Now suppose that some set
of columns of AB, say Abk1 , Abk2 , . . . , Abkr , are linearly independent. Then
    ck1 bk1 + ck2 bk2 + · · · + ckr bkr = 0 ⇒
    A(ck1 bk1 + ck2 bk2 + · · · + ckr bkr) = 0 ⇒
    ck1 (Abk1) + ck2 (Abk2) + · · · + ckr (Abkr) = 0.
But the Abki are linearly independent, so all the cki are zero and thus the bki are linearly
independent. So the number of linearly independent columns in B is at least as large as the
number of linearly independent columns of AB; it follows that rank(AB) ≤ rank(B).
(b) There are many examples. For instance, let

    A = [1 1; 0 0],  B = [1 0; 1 1].

Then B has rank 2, and AB = [2 1; 0 0] has rank 1.
60. (a) Recall that by Theorem 3.25, if C is any matrix, then rank(C T ) = rank(C). Thus
rank(AB) = rank((AB)T ) = rank(B T AT ) ≤ rank(AT ) = rank(A),
using Exercise 59(a) with AT in place of B and B T in place of A.
(b) For example, we can use the transposes of the matrices in part (b) of Exercise 59:

    A = [1 1; 0 1],  B = [1 0; 1 0].

Then A has rank 2, and AB = [2 0; 1 0] has rank 1.
61. (a) Since U is invertible, we have A = U⁻¹(UA), so that using Exercise 59(a) twice,

    rank(A) = rank(U⁻¹(UA)) ≤ rank(UA) ≤ rank(A).

Thus rank(A) = rank(UA).

(b) Since V is invertible, we have A = (AV)V⁻¹, so that using Exercise 60(a) twice,

    rank(A) = rank((AV)V⁻¹) ≤ rank(AV) ≤ rank(A).

Thus rank(A) = rank(AV).
62. Suppose first that A has rank 1. Then col(A) = span(u) for some vector u. It follows that if ai is the ith column of A, then for each i we have ai = ci u. Let vᵀ = [c1 c2 · · · cn]. Then A = uvᵀ. For the converse, suppose that A = uvᵀ, where vᵀ = [c1 c2 · · · cn]. Then

    A = uvᵀ = [c1 u  c2 u  · · ·  cn u].

Thus every column of A is a scalar multiple of u, so that col(A) = span(u) and thus rank(A) = 1.
63. If rank(A) = r, then a basis for col(A) consists of r vectors, say u1, u2, . . . , ur ∈ Rᵐ. So if ai is the ith column of A, then for each i we have

    ai = ci1 u1 + ci2 u2 + · · · + cir ur,  i = 1, 2, . . . , n.

Now let vjᵀ = [c1j c2j · · · cnj] for j = 1, 2, . . . , r. Then

    A = [a1 a2 · · · an] = [c11 u1 + · · · + c1r ur  · · ·  cn1 u1 + · · · + cnr ur]
      = [c11 u1 · · · cn1 u1] + [c12 u2 · · · cn2 u2] + · · · + [c1r ur · · · cnr ur]
      = u1 v1ᵀ + u2 v2ᵀ + · · · + ur vrᵀ.

By Exercise 62, each uj vjᵀ is a matrix of rank 1, so A is a sum of r matrices of rank 1.
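One concrete way to produce such a rank-one decomposition is the singular value decomposition, which writes A as a sum of rank(A) rank-one outer products. A minimal sketch (assuming NumPy is available); the specific matrix is an illustrative choice, not one from the text:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],      # twice row 1, so A has rank 2
              [1.0, 1.0, 1.0]])
U, s, Vt = np.linalg.svd(A)
r = np.linalg.matrix_rank(A)

# A equals the sum of r rank-one matrices  s_i * u_i v_i^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))
print(r, np.allclose(A, A_rebuilt))  # 2 True
```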
64. Let A = {u1 , . . . , ur } be a basis for col(A) and B = {v1 , . . . , vs } be a basis for col(B). Then rank(A) =
r and rank(B) = s. Let ai and bi be the columns of A and B respectively, for i = 1, 2, . . . , n. Then if
x ∈ col(A + B),


!
!
n
n
r
s
r
n
s
n
X
X
X
X
X
X
X
X

x=
c1 (ai + bi ) =
αij uj +
βik vk  =
ci αij uj +
ci βik vk ,
i=1
i=1
j=1
k=1
j=1
i=1
k=1
i=1
so that x is a linear combination of A and B. Thus A ∪ B spans col(A + B). But this implies that
col(A + B) ⊆ span(A ∪ B) = span(A) + span(B) = col(A) + col(B).
Since rank(A + B) is the dimension of col(A + B) and rank(A) + rank(B) is the dimension of col(A)
plus the dimension of col(B), we are done.
65. We show that col(A) ⊆ null(A) and then use the Rank Theorem. Suppose x = Σ ci ai ∈ col(A), where ai is the ith column of A; then ai = Aei, where the ei are the standard basis vectors for Rⁿ. Then, since A² = O,

    Ax = A Σ ci ai = Σ ci (Aai) = Σ ci (A(Aei)) = Σ ci A² ei = Σ ci (Oei) = 0.

Thus any vector in the column space of A is taken to zero by A, so it is in the null space of A. Thus col(A) ⊆ null(A). But then

    rank(A) = dim(col(A)) ≤ dim(null(A)) = nullity(A),

so by the Rank Theorem,

    rank(A) + rank(A) ≤ rank(A) + nullity(A) = n  ⇒  2 rank(A) ≤ n  ⇒  rank(A) ≤ n/2.
66. (a) Note first that since x is n × 1, xᵀ is 1 × n and thus xᵀAx is 1 × 1, so that (xᵀAx)ᵀ = xᵀAx. Now since A is skew-symmetric, Aᵀ = −A, so

    xᵀAx = (xᵀAx)ᵀ = xᵀAᵀ(xᵀ)ᵀ = xᵀAᵀx = xᵀ(−A)x = −xᵀAx.

So xᵀAx is equal to its own negation, so it must be zero.
(b) By Theorem 3.27, to show I + A is invertible it suffices to show that null(I + A) = {0}. Choose x with (I + A)x = 0. Then xᵀ(I + A)x = xᵀ0 = 0, so that

    0 = xᵀ(I + A)x = xᵀIx + xᵀAx = xᵀx + 0 = x · x,

using part (a). But x · x = 0 implies that x = 0. So any vector in null(I + A) is the zero vector and thus I + A has a zero null space, so it is invertible.
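Both parts of Exercise 66 can be sanity-checked numerically. A minimal sketch (assuming NumPy is available) using a randomly generated skew-symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M - M.T                          # A is skew-symmetric: A^T = -A

x = rng.standard_normal(4)
print(abs(x @ A @ x) < 1e-9)         # part (a): x^T A x = 0 up to rounding

# Part (b): I + A is invertible, so its determinant is nonzero.
print(abs(np.linalg.det(np.eye(4) + A)) > 0.5)
```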
3.6 Introduction to Linear Transformations
1. We must compute T(u) and T(v), so we take the product of the matrix A of T with each of these two vectors:

    T(u) = Au = [2 −1; 3 4][1; 2] = [0; 11],
    T(v) = Av = [2 −1; 3 4][3; −2] = [8; 1].
2. We must compute T(u) and T(v), so we take the product of the matrix A of T with each of these two vectors:

    T(u) = Au = [3 −1; 1 2; 1 4][1; 2] = [3·1 − 1·2; 1·1 + 2·2; 1·1 + 4·2] = [1; 5; 9],
    T(v) = Av = [3 −1; 1 2; 1 4][3; −2] = [3·3 − 1·(−2); 1·3 + 2·(−2); 1·3 + 4·(−2)] = [11; −1; −5].
3. Use the remark following Example 3.55: T is a linear transformation if and only if T(c1 v1 + c2 v2) = c1 T(v1) + c2 T(v2), where v1 = [x1; y1] and v2 = [x2; y2]. Then

    T(c1 v1 + c2 v2) = T([c1 x1 + c2 x2; c1 y1 + c2 y2])
      = [(c1 x1 + c2 x2) + (c1 y1 + c2 y2); (c1 x1 + c2 x2) − (c1 y1 + c2 y2)]
      = [(c1 x1 + c1 y1) + (c2 x2 + c2 y2); (c1 x1 − c1 y1) + (c2 x2 − c2 y2)]
      = [c1 x1 + c1 y1; c1 x1 − c1 y1] + [c2 x2 + c2 y2; c2 x2 − c2 y2]
      = c1 [x1 + y1; x1 − y1] + c2 [x2 + y2; x2 − y2]
      = c1 T(v1) + c2 T(v2),

and thus T is linear.
4. Use the remark following Example 3.55: T is a linear transformation if and only if T(c1 v1 + c2 v2) = c1 T(v1) + c2 T(v2), where v1 = [x1; y1] and v2 = [x2; y2]. Then

    T(c1 v1 + c2 v2) = T([c1 x1 + c2 x2; c1 y1 + c2 y2])
      = [−(c1 y1 + c2 y2); (c1 x1 + c2 x2) + 2(c1 y1 + c2 y2); 3(c1 x1 + c2 x2) − 4(c1 y1 + c2 y2)]
      = [−c1 y1; c1 x1 + 2c1 y1; 3c1 x1 − 4c1 y1] + [−c2 y2; c2 x2 + 2c2 y2; 3c2 x2 − 4c2 y2]
      = c1 [−y1; x1 + 2y1; 3x1 − 4y1] + c2 [−y2; x2 + 2y2; 3x2 − 4y2]
      = c1 T(v1) + c2 T(v2),

and thus T is linear.
5. Use the remark following Example 3.55: T is a linear transformation if and only if T(c1 v1 + c2 v2) = c1 T(v1) + c2 T(v2), where v1 = [x1; y1; z1] and v2 = [x2; y2; z2]. Then

    T(c1 v1 + c2 v2) = T([c1 x1 + c2 x2; c1 y1 + c2 y2; c1 z1 + c2 z2])
      = [(c1 x1 + c2 x2) − (c1 y1 + c2 y2) + (c1 z1 + c2 z2); 2(c1 x1 + c2 x2) + (c1 y1 + c2 y2) − 3(c1 z1 + c2 z2)]
      = [c1 x1 − c1 y1 + c1 z1; 2c1 x1 + c1 y1 − 3c1 z1] + [c2 x2 − c2 y2 + c2 z2; 2c2 x2 + c2 y2 − 3c2 z2]
      = c1 [x1 − y1 + z1; 2x1 + y1 − 3z1] + c2 [x2 − y2 + z2; 2x2 + y2 − 3z2]
      = c1 T(v1) + c2 T(v2),

and thus T is linear.
6. Use the remark following Example 3.55: T is a linear transformation if and only if T(c1 v1 + c2 v2) = c1 T(v1) + c2 T(v2), where v1 = [x1; y1; z1] and v2 = [x2; y2; z2]. Then

    T(c1 v1 + c2 v2) = T([c1 x1 + c2 x2; c1 y1 + c2 y2; c1 z1 + c2 z2])
      = [(c1 x1 + c2 x2) + (c1 z1 + c2 z2); (c1 y1 + c2 y2) + (c1 z1 + c2 z2); (c1 x1 + c2 x2) + (c1 y1 + c2 y2)]
      = [c1 x1 + c1 z1; c1 y1 + c1 z1; c1 x1 + c1 y1] + [c2 x2 + c2 z2; c2 y2 + c2 z2; c2 x2 + c2 y2]
      = c1 [x1 + z1; y1 + z1; x1 + y1] + c2 [x2 + z2; y2 + z2; x2 + y2]
      = c1 T(v1) + c2 T(v2),

and thus T is linear.
7. For example,

    T([1; 0]) = [0; 1²] = [0; 1],  T([2; 0]) = [0; 2²] = [0; 4],

but

    T([1; 0] + [2; 0]) = T([3; 0]) = [0; 3²] = [0; 9] ≠ [0; 1] + [0; 4],

so T is not linear.
8. For example,

    T([−1; 0]) = [|−1|; 0] = [1; 0],  T([1; 0]) = [|1|; 0] = [1; 0],

but

    T([−1; 0] + [1; 0]) = T([0; 0]) = [|0|; 0] = [0; 0] ≠ [1; 0] + [1; 0],

so T is not linear.
9. For example,

    T([2; 2]) = [2 · 2; 2 + 2] = [4; 4],

but

    T([2; 2] + [2; 2]) = T([4; 4]) = [4 · 4; 4 + 4] = [16; 8] ≠ [4; 4] + [4; 4],

so T is not linear.
10. For example,

    T([1; 0]) = [1 + 1; 0 − 1] = [2; −1],

but

    T([1; 0] + [1; 0]) = T([2; 0]) = [2 + 1; 0 − 1] = [3; −1] ≠ [2; −1] + [2; −1],

so T is not linear.
11. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T(ei). Since

    T(e1) = T([1; 0]) = [1 + 0; 1 − 0] = [1; 1],  T(e2) = T([0; 1]) = [0 + 1; 0 − 1] = [1; −1],

the matrix of T is A = [1 1; 1 −1].
12. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T(ei). Since

    T(e1) = T([1; 0]) = [−0; 1 + 2 · 0; 3 · 1 − 4 · 0] = [0; 1; 3],
    T(e2) = T([0; 1]) = [−1; 0 + 2 · 1; 3 · 0 − 4 · 1] = [−1; 2; −4],

the matrix of T is A = [0 −1; 1 2; 3 −4].
13. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T(ei). Since

    T(e1) = T([1; 0; 0]) = [1 − 0 + 0; 2 · 1 + 0 − 3 · 0] = [1; 2],
    T(e2) = T([0; 1; 0]) = [0 − 1 + 0; 2 · 0 + 1 − 3 · 0] = [−1; 1],
    T(e3) = T([0; 0; 1]) = [0 − 0 + 1; 2 · 0 + 0 − 3 · 1] = [1; −3],

the matrix of T is A = [1 −1 1; 2 1 −3].
14. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T(ei). Since

    T(e1) = T([1; 0; 0]) = [1 + 0; 0 + 0; 1 + 0] = [1; 0; 1],
    T(e2) = T([0; 1; 0]) = [0 + 0; 1 + 0; 0 + 1] = [0; 1; 1],
    T(e3) = T([0; 0; 1]) = [0 + 1; 0 + 1; 0 + 0] = [1; 1; 0],

the matrix of T is A = [1 0 1; 0 1 1; 1 1 0].
0
1
1
15. Reflecting a vector in the x-axis means negating the x-coordinate. So using the method of Example
3.56,
x
−x
−1
0
,
F
=
=x
+y
y
y
0
1
and thus F is a matrix transformation with matrix
−1
F =
0
0
.
1
It follows that F is a linear transformation.
16. Example 3.58 shows that a rotation of 45° about the origin sends

    [1; 0] to [cos 45°; sin 45°] = [√2/2; √2/2]  and  [0; 1] to [cos 135°; sin 135°] = [−√2/2; √2/2],

so that R is a matrix transformation with matrix R = [√2/2 −√2/2; √2/2 √2/2]. It follows that R is a linear transformation.
17. Since D stretches by a factor of 2 in the x-direction and a factor of 3 in the y-direction, we see that

    D([x; y]) = [2x; 3y] = [2 0; 0 3][x; y].

Thus D is a matrix transformation with matrix [2 0; 0 3]. It follows that D is a linear transformation.
18. Since P = (x, y) projects onto the line whose direction vector is (1, 1), the projection of P onto that line is

    proj₍₁,₁₎(x, y) = ((x, y) · (1, 1))/((1, 1) · (1, 1)) (1, 1) = ((x + y)/2, (x + y)/2).

Thus

    T([x; y]) = [(x + y)/2; (x + y)/2] = [1/2 1/2; 1/2 1/2][x; y].

Thus T is a matrix transformation with the matrix above, and it follows that T is a linear transformation.
19. Let

    A1 = [k 0; 0 1],  A2 = [1 0; 0 k],  B = [0 1; 1 0],  C1 = [1 k; 0 1],  C2 = [1 0; k 1].

Then

    A1[x; y] = [kx; y]:    A1 stretches vectors horizontally by a factor of k.
    A2[x; y] = [x; ky]:    A2 stretches vectors vertically by a factor of k.
    B[x; y]  = [y; x]:     B reflects vectors in the line y = x.
    C1[x; y] = [x + ky; y]: C1 extends vectors horizontally by ky.
    C2[x; y] = [x; y + kx]: C2 extends vectors vertically by kx.

(C1 and C2 are called shears.) [Figures: the images of the unit square under each of these transformations with k = 2.] Note that reflection across the line y = x leaves the square unchanged, since it is symmetric about that line.
20. Using the formula from Example 3.58, we compute the matrix for a counterclockwise rotation of 120°:

    R120°(e1) = [cos 120°; sin 120°] = [−1/2; √3/2]  and  R120°(e2) = [−sin 120°; cos 120°] = [−√3/2; −1/2].

So by Theorem 3.31, we have

    A = [R120°(e1)  R120°(e2)] = [−1/2 −√3/2; √3/2 −1/2].
21. A clockwise rotation of 30° is a rotation of −30°. Then using the formula from Example 3.58, we compute the matrix for that rotation:

    R−30°(e1) = [cos(−30°); sin(−30°)] = [√3/2; −1/2]  and  R−30°(e2) = [−sin(−30°); cos(−30°)] = [1/2; √3/2].

So by Theorem 3.31, we have

    A = [R−30°(e1)  R−30°(e2)] = [√3/2 1/2; −1/2 √3/2].

22. A direction vector of the line y = 2x is d = [1; 2], so

    P`(e1) = ((d · e1)/(d · d)) d = (1/5)[1; 2] = [1/5; 2/5]  and  P`(e2) = ((d · e2)/(d · d)) d = (2/5)[1; 2] = [2/5; 4/5].

Thus by Theorem 3.31, we have

    A = [P`(e1)  P`(e2)] = [1/5 2/5; 2/5 4/5].
23. A direction vector of the line y = −x is d = [1; −1], so

    P`(e1) = ((d · e1)/(d · d)) d = (1/2)[1; −1] = [1/2; −1/2]  and  P`(e2) = ((d · e2)/(d · d)) d = (−1/2)[1; −1] = [−1/2; 1/2].

Thus by Theorem 3.31, we have

    A = [P`(e1)  P`(e2)] = [1/2 −1/2; −1/2 1/2].
24. Reflection in the line y = x is given by

    R([x; y]) = [y; x] = x[0; 1] + y[1; 0] = [0 1; 1 0][x; y],

so that the matrix of R is [0 1; 1 0].
25. Reflection in the line y = −x is given by

    R([x; y]) = [−y; −x] = x[0; −1] + y[−1; 0] = [0 −1; −1 0][x; y],

so that the matrix of R is [0 −1; −1 0].
26. (a) Consider the following diagrams [figures: scalar multiples a, cb and sums a + b together with their reflections in `]. From the left-hand graph, we see that F`(cb) = cF`(b), and the right-hand graph shows that F`(a + b) = F`(a) + F`(b).
(b) Let P`(x) be the projection of x on the line `; this is the image of x under the linear transformation P`. Then from the figure in the text, since the diagonals of the parallelogram shown bisect each other, and the point furthest from the origin on ` is x + F`(x), we have

    x + F`(x) = 2P`(x), so that F`(x) = 2P`(x) − x.

Now, Example 3.59 gives us the matrix of P`, where ` has direction vector d = [d1; d2], so we have

    F`(x) = 2P`(x) − x
      = ( (2/(d1² + d2²)) [d1² d1d2; d1d2 d2²] − [1 0; 0 1] ) x
      = (1/(d1² + d2²)) [2d1² − (d1² + d2²), 2d1d2; 2d1d2, 2d2² − (d1² + d2²)] x
      = (1/(d1² + d2²)) [d1² − d2², 2d1d2; 2d1d2, −d1² + d2²] x,

which shows that the matrix of F` is as claimed in the exercise statement.
(c) If θ is the angle that ` forms with the positive x-axis, we may take

    d = [d1; d2] = [cos θ; sin θ]

as the direction vector. Substituting those values into the formula for F` from part (b) gives

    F` = (1/(d1² + d2²)) [d1² − d2², 2d1d2; 2d1d2, −d1² + d2²]
       = [cos²θ − sin²θ, 2 cos θ sin θ; 2 cos θ sin θ, −cos²θ + sin²θ],

since cos²θ + sin²θ = 1. Now, cos²θ − sin²θ = cos 2θ and 2 cos θ sin θ = sin 2θ, so the matrix above simplifies to

    F` = [cos 2θ sin 2θ; sin 2θ −cos 2θ].
27. By part (b) of the previous exercise, since the line y = 2x has direction vector d = [1; 2], the standard matrix of this reflection is

    F` = (1/(1 + 4)) [1 − 4, 2 · 2; 2 · 2, −1 + 4] = [−3/5 4/5; 4/5 3/5].
28. The direction vector of this line is [1; √3], so that tan θ = √3 and thus θ = π/3 = 60°. So by part (c) of the previous exercise, the standard matrix of this reflection is

    F` = [cos 2θ sin 2θ; sin 2θ −cos 2θ] = [cos 120° sin 120°; sin 120° −cos 120°] = [−1/2 √3/2; √3/2 1/2].
29. Since

    T([x1; x2]) = [x1; 2x1 − x2; 3x1 + 4x2]  and  S([y1; y2; y3]) = [2y1 + y3; 3y2 − y3; y1 − y2; y1 + y2 + y3],

we substitute the result of T into the equation for S:

    (S ∘ T)([x1; x2]) = S([x1; 2x1 − x2; 3x1 + 4x2])
      = [2x1 + (3x1 + 4x2); 3(2x1 − x2) − (3x1 + 4x2); x1 − (2x1 − x2); x1 + (2x1 − x2) + (3x1 + 4x2)]
      = [5x1 + 4x2; 3x1 − 7x2; −x1 + x2; 6x1 + 3x2]
      = [5 4; 3 −7; −1 1; 6 3][x1; x2].

Thus the standard matrix of S ∘ T is A = [5 4; 3 −7; −1 1; 6 3].
30. (a) By direct substitution, we have

    (S ∘ T)([x1; x2]) = S(T([x1; x2])) = S([x1 − x2; x1 + x2]) = [2(x1 − x2); −(x1 + x2)] = [2x1 − 2x2; −x1 − x2].

Thus [S ∘ T] = [2 −2; −1 −1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [2 0; 0 −1] and [T] = [1 −1; 1 1]. Then

    [S][T] = [2 0; 0 −1][1 −1; 1 1] = [2 −2; −1 −1].

Thus [S ∘ T] = [S][T].
31. (a) By direct substitution, we have

    (S ∘ T)([x1; x2]) = S(T([x1; x2])) = S([x1 + 2x2; −3x1 + x2])
      = [(x1 + 2x2) + 3(−3x1 + x2); (x1 + 2x2) − (−3x1 + x2)] = [−8x1 + 5x2; 4x1 + x2].

Thus [S ∘ T] = [−8 5; 4 1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [1 3; 1 −1] and [T] = [1 2; −3 1]. Then

    [S][T] = [1 3; 1 −1][1 2; −3 1] = [−8 5; 4 1].

Thus [S ∘ T] = [S][T].
32. (a) By direct substitution, we have

    (S ∘ T)([x1; x2]) = S(T([x1; x2])) = S([x2; −x1])
      = [x2 + 3(−x1); 2x2 + (−x1); x2 − (−x1)] = [−3x1 + x2; −x1 + 2x2; x1 + x2].

Thus [S ∘ T] = [−3 1; −1 2; 1 1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [1 3; 2 1; 1 −1] and [T] = [0 1; −1 0]. Then

    [S][T] = [1 3; 2 1; 1 −1][0 1; −1 0] = [−3 1; −1 2; 1 1].

Thus [S ∘ T] = [S][T].
33. (a) By direct substitution, we have

    (S ∘ T)([x1; x2; x3]) = S(T([x1; x2; x3])) = S([x1 + x2 − x3; 2x1 − x2 + x3])
      = [4(x1 + x2 − x3) − 2(2x1 − x2 + x3); −(x1 + x2 − x3) + (2x1 − x2 + x3)]
      = [6x2 − 6x3; x1 − 2x2 + 2x3].

Thus [S ∘ T] = [0 6 −6; 1 −2 2].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [4 −2; −1 1] and [T] = [1 1 −1; 2 −1 1]. Then

    [S][T] = [4 −2; −1 1][1 1 −1; 2 −1 1] = [0 6 −6; 1 −2 2].

Thus [S ∘ T] = [S][T].
34. (a) By direct substitution, we have

    (S ∘ T)([x1; x2; x3]) = S(T([x1; x2; x3])) = S([x1 + 2x2; 2x2 − x3])
      = [(x1 + 2x2) − (2x2 − x3); (x1 + 2x2) + (2x2 − x3); −(x1 + 2x2) + (2x2 − x3)]
      = [x1 + x3; x1 + 4x2 − x3; −x1 − x3].

Thus [S ∘ T] = [1 0 1; 1 4 −1; −1 0 −1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [1 −1; 1 1; −1 1] and [T] = [1 2 0; 0 2 −1]. Then

    [S][T] = [1 −1; 1 1; −1 1][1 2 0; 0 2 −1] = [1 0 1; 1 4 −1; −1 0 −1].

Thus [S ∘ T] = [S][T].
35. (a) By direct substitution, we have

    (S ∘ T)([x1; x2; x3]) = S(T([x1; x2; x3])) = S([x1 + x2; x2 + x3; x1 + x3])
      = [(x1 + x2) − (x2 + x3); (x2 + x3) − (x1 + x3); −(x1 + x2) + (x1 + x3)]
      = [x1 − x3; −x1 + x2; −x2 + x3].

Thus [S ∘ T] = [1 0 −1; −1 1 0; 0 −1 1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [1 −1 0; 0 1 −1; −1 0 1] and [T] = [1 1 0; 0 1 1; 1 0 1]. Then

    [S][T] = [1 −1 0; 0 1 −1; −1 0 1][1 1 0; 0 1 1; 1 0 1] = [1 0 −1; −1 1 0; 0 −1 1].

Thus [S ∘ T] = [S][T].
36. A counterclockwise rotation T through 60° is given by (see Example 3.58)

    [T] = [cos 60° −sin 60°; sin 60° cos 60°] = [1/2 −√3/2; √3/2 1/2],

and reflection S in y = x is reflection in a line with direction vector d = [1; 1], which has the matrix (see Exercise 26(b))

    [S] = (1/2)[0 2; 2 0] = [0 1; 1 0].

Thus by Theorem 3.32, the composite transformation is given by

    [S ∘ T] = [S][T] = [0 1; 1 0][1/2 −√3/2; √3/2 1/2] = [√3/2 1/2; 1/2 −√3/2].
37. Reflection T in the y-axis negates the x-coordinate, so it has matrix

    [T] = [−1 0; 0 1],

while clockwise rotation S through 30° has matrix (see Example 3.58)

    [S] = [cos(−30°) −sin(−30°); sin(−30°) cos(−30°)] = [√3/2 1/2; −1/2 √3/2].

Thus by Theorem 3.32, the composite transformation is given by

    [S ∘ T] = [S][T] = [√3/2 1/2; −1/2 √3/2][−1 0; 0 1] = [−√3/2 1/2; 1/2 √3/2].
38. Clockwise rotation T through 45° has matrix (see Example 3.58)

    [T] = [cos 45° sin 45°; −sin 45° cos 45°] = [√2/2 √2/2; −√2/2 √2/2],

while projection S onto the y-axis has matrix (see Example 3.59, or note that this transformation simply sets the x-coordinate to zero)

    [S] = [0 0; 0 1].

Thus by Theorem 3.32, the composite transformation is given by

    [T ∘ S ∘ T] = [T][S][T] = [√2/2 √2/2; −√2/2 √2/2][0 0; 0 1][√2/2 √2/2; −√2/2 √2/2] = [−1/2 1/2; −1/2 1/2].

39. Reflection T in the line y = x, with direction vector d = [1; 1], is given by (see Exercise 26(b))

    [T] = (1/2)[0 2; 2 0] = [0 1; 1 0].

Clockwise rotation S through 30° has matrix (see Example 3.58)

    [S] = [cos 30° sin 30°; −sin 30° cos 30°] = [√3/2 1/2; −1/2 √3/2].

Finally, reflection R in the line y = −x, with direction vector d = [1; −1], is given by (see Exercise 26(b))

    [R] = (1/2)[0 −2; −2 0] = [0 −1; −1 0].

Then by Theorem 3.32, the composite transformation is given by

    [R ∘ S ∘ T] = [R][S][T] = [0 −1; −1 0][√3/2 1/2; −1/2 √3/2][0 1; 1 0] = [−√3/2 1/2; −1/2 −√3/2].
40. Using the formula for rotation through an angle θ from Example 3.58, we have

    [Rα ∘ Rβ] = [Rα][Rβ] = [cos α −sin α; sin α cos α][cos β −sin β; sin β cos β]
      = [cos α cos β − sin α sin β, −cos α sin β − sin α cos β; sin α cos β + cos α sin β, −sin α sin β + cos α cos β]
      = [cos(α + β) −sin(α + β); sin(α + β) cos(α + β)]
      = [Rα+β].

41. Assume line ` has direction vector [cos α; sin α] and line m has direction vector [cos β; sin β]. Let θ = β − α be the angle from ` to m (note that if α > β, then θ is negative). Then by Exercise 26(c),

    Fm ∘ F` = [cos 2β sin 2β; sin 2β −cos 2β][cos 2α sin 2α; sin 2α −cos 2α]
      = [cos 2β cos 2α + sin 2β sin 2α, cos 2β sin 2α − sin 2β cos 2α; sin 2β cos 2α − cos 2β sin 2α, sin 2β sin 2α + cos 2β cos 2α]
      = [cos(2(β − α)) −sin(2(β − α)); sin(2(β − α)) cos(2(β − α))]
      = [cos 2θ −sin 2θ; sin 2θ cos 2θ]
      = R2θ.
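Exercise 41's identity — two reflections compose to a rotation through twice the angle between the lines — can be checked numerically for particular angles. A minimal sketch (assuming NumPy is available; the angles are arbitrary illustrative choices):

```python
import numpy as np

def refl(phi):
    """Reflection in the line through the origin at angle phi (Exercise 26(c))."""
    return np.array([[np.cos(2 * phi), np.sin(2 * phi)],
                     [np.sin(2 * phi), -np.cos(2 * phi)]])

def rot(phi):
    """Counterclockwise rotation through angle phi (Example 3.58)."""
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi), np.cos(phi)]])

alpha, beta = np.pi / 7, np.pi / 3       # angles of lines l and m
theta = beta - alpha
print(np.allclose(refl(beta) @ refl(alpha), rot(2 * theta)))  # True
```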
42. (a) Let P be the projection onto a line with direction vector d = [d1; d2]. Then the standard matrix of P is

    P = (1/(d1² + d2²)) [d1² d1d2; d1d2 d2²],

and thus

    P ∘ P = P² = (1/(d1² + d2²))² [d1⁴ + d1²d2², d1³d2 + d1d2³; d1³d2 + d1d2³, d1²d2² + d2⁴]
      = (1/(d1² + d2²))² [d1²(d1² + d2²), d1d2(d1² + d2²); d1d2(d1² + d2²), d2²(d1² + d2²)]
      = (1/(d1² + d2²)) [d1² d1d2; d1d2 d2²]
      = P.

Note that we have proven this fact before, in a different form. Since P(v) = proj_d v, then

    (P ∘ P)(v) = proj_d(proj_d v) = proj_d v = P(v)

by Exercise 70(a) in Section 1.2.
(b) Let P be any projection matrix. Using the form of P from part (a), we first prove
that P 6= I. Suppose it were. Then d1 d2 = 0, so that either d1 = 0 or d2 = 0. But then either d21
or d22 would be zero, so that P could not be the identity matrix. So P is a non-identity matrix
such that P 2 = P . Then P 2 − P = P (P − I) = O. Since P 6= I, we see that P − I 6= O, so
there is some k such that (P − I)ek 6= 0 (otherwise, every row of P − I would be orthogonal
to every standard basis vector, so would be zero, so that P − I would be the zero matrix). Let
x = (P − I)ek . But then
P x = P (P − I)ek = Oek = 0,
so that x ∈ null(P ). Since the null space of P contains a nonzero vector, P is not invertible by
Theorem 3.27.
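Both facts about projection matrices — idempotence from part (a) and non-invertibility from part (b) — show up directly in a numerical check. A minimal sketch (assuming NumPy is available), with the direction d = (1, 2) used in earlier exercises:

```python
import numpy as np

d = np.array([1.0, 2.0])
P = np.outer(d, d) / (d @ d)      # projection onto span(d): [[1/5, 2/5], [2/5, 4/5]]

print(np.allclose(P @ P, P))              # P is idempotent: P^2 = P
print(abs(np.linalg.det(P)) < 1e-12)      # det(P) = 0, so P is not invertible
```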
43. Let `, m, and n be lines through the origin with angles from the x-axis of θ, α, and β respectively. Then, using Exercise 26(c) and the computation in Exercise 41,

    Fn ∘ Fm ∘ F` = [cos 2β sin 2β; sin 2β −cos 2β][cos 2α sin 2α; sin 2α −cos 2α][cos 2θ sin 2θ; sin 2θ −cos 2θ]
      = [cos 2(β − α) −sin 2(β − α); sin 2(β − α) cos 2(β − α)][cos 2θ sin 2θ; sin 2θ −cos 2θ]
      = [cos 2(θ − α + β) sin 2(θ − α + β); sin 2(θ − α + β) −cos 2(θ − α + β)].

This is a reflection through a line making an angle of θ − α + β with the x-axis.
44. If ` is a line with direction vector d through the point P, then it has equation x = p + td. Since T is linear, we have

    T(p + td) = T(p) + tT(d),

so the image of ` consists of the points x = T(p) + tT(d). If T(d) = 0, then the image of ` is the single point T(p). Otherwise, the image of ` is the line `′ with direction vector T(d) passing through T(p).
45. Suppose that we start with two parallel lines
` : x = p + td,
m : x = q + sd.
Then the images of these lines have equations
`0 : T (p) + tT (d),
m0 : T (q) + sT (d).
Then
• If T(d) ≠ 0 and T(p) ≠ T(q), then `′ and m′ are distinct parallel lines.
• If T(d) ≠ 0 and T(p) = T(q), then `′ and m′ are the same line.
• If T(d) = 0 and T(p) ≠ T(q), then `′ and m′ are the distinct points T(p) and T(q).
• If T(d) = 0 and T(p) = T(q), then `′ and m′ are the single point T(p).
46. Using the linear transformation T from Exercise 3, we have

    T([−1; 1]) = [0; −2],  T([1; 1]) = [2; 0],  T([1; −1]) = [0; 2],  T([−1; −1]) = [−2; 0].

The image is the square with vertices (0, −2), (2, 0), (0, 2), and (−2, 0).
47. Using the linear transformation D from Exercise 17, we have

    D([−1; 1]) = [−2; 3],  D([1; 1]) = [2; 3],  D([1; −1]) = [2; −3],  D([−1; −1]) = [−2; −3].

The image is the rectangle with vertices (±2, ±3).
48. Using the linear transformation P from Exercise 18, we have
−1
0
P
=
,
1
0
The image is the line below:
1
1
P
=
,
1
1
1
0
P
=
,
−1
0
−1
−1
P
=
.
−1
−1
3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS
[Figure: the image is the segment of the line y = x from D(−1, −1) to B(1, 1); A and C both map to the origin.]
49. Using the projection P from Exercise 22, we have
$$P\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}3/5\\6/5\end{bmatrix},\quad P\begin{bmatrix}-1\\1\end{bmatrix} = \begin{bmatrix}1/5\\2/5\end{bmatrix},\quad P\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}-1/5\\-2/5\end{bmatrix},\quad P\begin{bmatrix}-1\\-1\end{bmatrix} = \begin{bmatrix}-3/5\\-6/5\end{bmatrix}.$$
The image is the line below:
[Figure: the image is a segment of the line through the origin with direction (1, 2), running from D(−3/5, −6/5) up through C, A, and B(3/5, 6/5).]
50. Using the transformation T from Exercise 31, with matrix $\begin{bmatrix} 1 & 2 \\ -3 & 1 \end{bmatrix}$, we have
$$T\begin{bmatrix}-1\\1\end{bmatrix} = \begin{bmatrix}1\\4\end{bmatrix},\quad T\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}3\\-2\end{bmatrix},\quad T\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}-1\\-4\end{bmatrix},\quad T\begin{bmatrix}-1\\-1\end{bmatrix} = \begin{bmatrix}-3\\2\end{bmatrix}.$$
The image is the parallelogram with vertices A(1, 4), B(3, −2), C(−1, −4), and D(−3, 2).
[Figure: the image parallelogram, with vertices labeled A, B, C, D.]
51. Using the transformation resulting from Exercise 37, which we call Q, we have
$$Q\begin{bmatrix}-1\\1\end{bmatrix} = \begin{bmatrix}\frac{\sqrt{3}+1}{2}\\ \frac{\sqrt{3}-1}{2}\end{bmatrix},\quad Q\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}\frac{1-\sqrt{3}}{2}\\ \frac{1+\sqrt{3}}{2}\end{bmatrix},\quad Q\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}-\frac{\sqrt{3}+1}{2}\\ \frac{1-\sqrt{3}}{2}\end{bmatrix},\quad Q\begin{bmatrix}-1\\-1\end{bmatrix} = \begin{bmatrix}\frac{\sqrt{3}-1}{2}\\ -\frac{1+\sqrt{3}}{2}\end{bmatrix}.$$
The image is the parallelogram with vertices at these four points.
[Figure: the image parallelogram, with vertices labeled A, B, C, D.]
52. If ℓ has direction vector d, then using standard properties of the dot product and of scalar multiplication, we get
$$P_\ell(c\mathbf{v}) = \operatorname{proj}_{\mathbf{d}}(c\mathbf{v}) = \frac{\mathbf{d}\cdot c\mathbf{v}}{\mathbf{d}\cdot\mathbf{d}}\,\mathbf{d} = \frac{c(\mathbf{d}\cdot\mathbf{v})}{\mathbf{d}\cdot\mathbf{d}}\,\mathbf{d} = c\,\frac{\mathbf{d}\cdot\mathbf{v}}{\mathbf{d}\cdot\mathbf{d}}\,\mathbf{d} = cP_\ell(\mathbf{v}).$$
53. First suppose that T is linear. Then applying property 1 first, followed by property 2, we have
T (c1 v1 + c2 v2 ) = T (c1 v1 ) + T (c2 v2 ) = c1 T (v1 ) + c2 T (v2 ).
For the reverse direction, assume that T (c1 v1 + c2 v2 ) = c1 T (v1 ) + c2 T (v2 ). To prove property 1, set
c1 = c2 = 1; this gives
T (v1 + v2 ) = T (v1 ) + T (v2 ).
For property 2, set c1 = c, v1 = v, and c2 = 0; this gives
T (cv) = T (cv + 0v2 ) = cT (v) + 0T (v2 ) = cT (v).
Thus T is linear.
54. Let range(T) be the range of T and col([T]) be the column space of [T]. We show that range(T) = col([T]) by showing that they are both equal to span({T(eᵢ)}).
To see that range(T) = span({T(eᵢ)}), note that x ∈ range(T) if and only if x = T(v) for some v = Σᵢ cᵢeᵢ, so that
$$\mathbf{x} = T(\mathbf{v}) = T\Big(\sum_i c_i\mathbf{e}_i\Big) = \sum_i c_iT(\mathbf{e}_i),$$
which is the case if and only if x ∈ span({T(eᵢ)}). Thus range(T) = span({T(eᵢ)}).
Next, we show that col([T]) = span({T(eᵢ)}). By Theorem 2, [T] = [T(e₁) T(e₂) ··· T(eₙ)]. Thus if vᵀ = [v₁ v₂ ··· vₙ], then
$$[T]\mathbf{v} = v_1T(\mathbf{e}_1) + v_2T(\mathbf{e}_2) + \cdots + v_nT(\mathbf{e}_n).$$
Then x ∈ col([T]) if and only if x = [T]v for some v, which holds if and only if x = Σᵢ vᵢT(eᵢ) for some scalars vᵢ, so that col([T]) = span({T(eᵢ)}).
55. The Fundamental Theorem of Invertible Matrices implies that any invertible 2 × 2 matrix is a product
of elementary matrices. The matrices in Exercise 19 represent the three types of elementary 2 × 2
matrices, so TA must be a composition of the three types of transformations represented in Exercise
19.
3.7 Applications
1. $x_1 = Px_0 = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}\begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} = \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}$, $\quad x_2 = Px_1 = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}\begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix} = \begin{bmatrix} 0.38 \\ 0.62 \end{bmatrix}$.
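These two iterations are easy to check numerically. A minimal sketch (not part of the original solution), assuming NumPy is available:

```python
import numpy as np

# Column-stochastic transition matrix and initial state vector from Exercise 1
P = np.array([[0.5, 0.3],
              [0.5, 0.7]])
x0 = np.array([0.5, 0.5])

x1 = P @ x0   # state after one step
x2 = P @ x1   # state after two steps
```

Each product keeps the entries summing to 1, as it must for a probability vector.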
2. Since there are two steps involved, the proportion we seek will be [P²]₂₁ — the probability of a state 1 population member being in state 2 after two steps:
$$P^2 = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}^2 = \begin{bmatrix} 0.40 & 0.36 \\ 0.60 & 0.64 \end{bmatrix}.$$
The proportion of the state 1 population in state 2 after two steps is 0.60.
3. Since there are two steps involved, the proportion we seek will be [P²]₂₂ — the probability of a state 2 population member remaining in state 2 after two steps:
$$P^2 = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}^2 = \begin{bmatrix} 0.40 & 0.36 \\ 0.60 & 0.64 \end{bmatrix}.$$
The proportion of the state 2 population in state 2 after two steps is 0.64.
4. The steady-state vector $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ has the property that Px = x, or (I − P)x = 0. So we form the augmented matrix of that system and row-reduce it:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 0.5 & -0.3 & 0 \\ -0.5 & 0.3 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & -\frac{3}{5} & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus x₂ = t is free and x₁ = (3/5)t. Since we also want x to be a probability vector, we also have x₁ + x₂ = (3/5)t + t = (8/5)t = 1, so that t = 5/8. Thus the steady-state vector is
$$\mathbf{x} = \begin{bmatrix} 3/8 \\ 5/8 \end{bmatrix}.$$
5. $x_1 = Px_0 = \begin{bmatrix} 1/2 & 1/3 & 1/3 \\ 0 & 1/3 & 2/3 \\ 1/2 & 1/3 & 0 \end{bmatrix}\begin{bmatrix} 120 \\ 180 \\ 90 \end{bmatrix} = \begin{bmatrix} 150 \\ 120 \\ 120 \end{bmatrix}$, $\quad x_2 = Px_1 = \begin{bmatrix} 1/2 & 1/3 & 1/3 \\ 0 & 1/3 & 2/3 \\ 1/2 & 1/3 & 0 \end{bmatrix}\begin{bmatrix} 150 \\ 120 \\ 120 \end{bmatrix} = \begin{bmatrix} 155 \\ 120 \\ 115 \end{bmatrix}$.
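The steady-state computation in Exercise 4 can also be done as an eigenvector problem, since Px = x says x is an eigenvector of P for eigenvalue 1. A sketch (an addition, not from the text), assuming NumPy:

```python
import numpy as np

# Transition matrix of Exercise 4
P = np.array([[0.5, 0.3],
              [0.5, 0.7]])

# Eigenvector for the eigenvalue closest to 1, scaled to a probability vector
w, V = np.linalg.eig(P)
v = V[:, np.argmin(np.abs(w - 1))].real
x = v / v.sum()
```

Normalizing the eigenvector so its entries sum to 1 reproduces the row-reduction answer [3/8, 5/8].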
6. Since there are two steps involved, the proportion we seek will be [P²]₁₁ — the probability of a state 1 population member being in state 1 after two steps:
$$P^2 = \begin{bmatrix} 1/2 & 1/3 & 1/3 \\ 0 & 1/3 & 2/3 \\ 1/2 & 1/3 & 0 \end{bmatrix}^2 = \begin{bmatrix} 5/12 & 7/18 & 7/18 \\ 1/3 & 1/3 & 2/9 \\ 1/4 & 5/18 & 7/18 \end{bmatrix}.$$
The proportion of the state 1 population in state 1 after two steps is 5/12.
7. Since there are two steps involved, the proportion we seek will be [P²]₃₂ — the probability of a state 2 population member being in state 3 after two steps:
$$P^2 = \begin{bmatrix} 1/2 & 1/3 & 1/3 \\ 0 & 1/3 & 2/3 \\ 1/2 & 1/3 & 0 \end{bmatrix}^2 = \begin{bmatrix} 5/12 & 7/18 & 7/18 \\ 1/3 & 1/3 & 2/9 \\ 1/4 & 5/18 & 7/18 \end{bmatrix}.$$
The proportion of the state 2 population in state 3 after two steps is 5/18.
 
8. The steady-state vector $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ has the property that Px = x, or (I − P)x = 0. So we form the augmented matrix of that system and row-reduce it:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 1/2 & -1/3 & -1/3 & 0 \\ 0 & 2/3 & -2/3 & 0 \\ -1/2 & -1/3 & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -4/3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Thus x₃ = t is free and x₁ = (4/3)t, x₂ = t. Since we want x to be a probability vector, we also have
$$x_1 + x_2 + x_3 = \tfrac{4}{3}t + t + t = \tfrac{10}{3}t = 1, \quad\text{so that } t = \tfrac{3}{10}.$$
Thus the steady-state vector is
$$\mathbf{x} = \begin{bmatrix} 2/5 \\ 3/10 \\ 3/10 \end{bmatrix}.$$
9. Let the first column and row represent dry days and the second column and row represent wet days.
(a) Given the data in the exercise, the transition matrix is
$$P = \begin{bmatrix} 0.750 & 0.338 \\ 0.250 & 0.662 \end{bmatrix}, \quad\text{with } \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},$$
where x₁ and x₂ denote the probabilities of dry and wet days respectively.
(b) Since Monday is dry, we have $x_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Since Wednesday is two days later, the probability vector for Wednesday is given by
$$\mathbf{x}_2 = P(P\mathbf{x}_0) = \begin{bmatrix} 0.750 & 0.338 \\ 0.250 & 0.662 \end{bmatrix}\begin{bmatrix} 0.750 & 0.338 \\ 0.250 & 0.662 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0.647 \\ 0.353 \end{bmatrix}.$$
Thus the probability that Wednesday is wet is 0.353.
(c) We wish to find the steady state probability vector; that is, over the long run, the probability that a given day will be dry or wet. Thus we need to solve Px = x, or (I − P)x = 0. So form the augmented matrix and row-reduce:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 0.250 & -0.338 & 0 \\ -0.250 & 0.338 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & -1.352 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Let x₂ = t, so that x₁ = 1.352t; then we also want x₁ + x₂ = 1, so that
$$x_1 + x_2 = 2.352t = 1, \quad\text{so that } t = \frac{1}{2.352} \approx 0.425.$$
Thus the steady state probability vector is
$$\mathbf{x} = \begin{bmatrix} 1.352 \cdot 0.425 \\ 0.425 \end{bmatrix} \approx \begin{bmatrix} 0.575 \\ 0.425 \end{bmatrix}.$$
So in the long run, about 57.5% of the days will be dry and 42.5% will be wet.
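Both parts of this weather computation can be verified in a few lines. A sketch (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

# Dry/wet transition matrix from Exercise 9
P = np.array([[0.750, 0.338],
              [0.250, 0.662]])
x0 = np.array([1.0, 0.0])   # Monday is dry

x2 = P @ (P @ x0)           # Wednesday's distribution

# Long-run distribution: eigenvector for eigenvalue 1, normalized
w, V = np.linalg.eig(P)
v = V[:, np.argmin(np.abs(w - 1))].real
steady = v / v.sum()
```

The second entry of `x2` is the probability that Wednesday is wet, and `steady` matches the 57.5%/42.5% split.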
10. Let the three rows (and columns) correspond to, in order, tall, medium, or short.
(a) From the given data, the transition matrix is
$$P = \begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}.$$
(b) Starting with a short person, we have $x_0 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$. That person's grandchild is two generations later, so that
$$\mathbf{x}_2 = P(P\mathbf{x}_0) = \begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}\begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.24 \\ 0.48 \\ 0.28 \end{bmatrix}.$$
So the probability that a grandchild will be tall is 0.24.
(c) The current distribution of heights is $x_0 = \begin{bmatrix} 0.2 \\ 0.5 \\ 0.3 \end{bmatrix}$, and we want to find x₃:
$$\mathbf{x}_3 = P^3\mathbf{x}_0 = \begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}^3\begin{bmatrix} 0.2 \\ 0.5 \\ 0.3 \end{bmatrix} = \begin{bmatrix} 0.2457 \\ 0.5039 \\ 0.2504 \end{bmatrix}.$$
So roughly 24.6% of the population is tall, 50.4% medium, and 25.0% short.
(d) We want to solve (I − P)x = 0, so form the augmented matrix and row-reduce:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 0.4 & -0.1 & -0.2 & 0 \\ -0.2 & 0.3 & -0.4 & 0 \\ -0.2 & -0.2 & 0.6 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
So x₃ = t is free, and x₁ = t, x₂ = 2t. Since
$$x_1 + x_2 + x_3 = t + 2t + t = 4t = 1, \text{ we get } t = \tfrac{1}{4},$$
so that the steady state vector is
$$\mathbf{x} = \begin{bmatrix} 1/4 \\ 1/2 \\ 1/4 \end{bmatrix}.$$
So in the long run, half the population is medium and one quarter each are tall and short.
11. Let the rows (and columns) represent good, fair, and poor crops, in order.
(a) From the data given, the transition matrix is
$$P = \begin{bmatrix} 0.08 & 0.09 & 0.11 \\ 0.07 & 0.11 & 0.05 \\ 0.85 & 0.80 & 0.84 \end{bmatrix}.$$
(b) We want to find the probability that starting from a good crop in 1940 we end up with a good crop in each of the five succeeding years, so we want (Pⁱ)₁₁ for i = 1, 2, 3, 4, 5. Computing Pⁱ and looking at the upper left entries, we get
1941: P₁₁ = 0.0800; 1942: (P²)₁₁ = 0.1062; 1943: (P³)₁₁ ≈ 0.1057; 1944: (P⁴)₁₁ ≈ 0.1057; 1945: (P⁵)₁₁ ≈ 0.1057.
(c) To find the long-run proportion, we want to solve Px = x. So row-reduce [I − P | 0]:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 0.92 & -0.09 & -0.11 & 0 \\ -0.07 & 0.89 & -0.05 & 0 \\ -0.85 & -0.80 & 0.16 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -0.126 & 0 \\ 0 & 1 & -0.066 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
So x₃ = t is free, and x₁ = 0.126t, x₂ = 0.066t. Since x₁ + x₂ + x₃ = 1, we have
$$x_1 + x_2 + x_3 = 0.126t + 0.066t + t = 1.192t = 1, \quad\text{so that } t = \frac{1}{1.192} \approx 0.839.$$
So the steady state vector is
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \approx \begin{bmatrix} 0.126 \cdot 0.839 \\ 0.066 \cdot 0.839 \\ 0.839 \end{bmatrix} \approx \begin{bmatrix} 0.106 \\ 0.055 \\ 0.839 \end{bmatrix}.$$
In the long run, about 10.6% of the crops will be good, 5.5% will be fair, and 83.9% will be poor.
12. (a) Looking at the connectivity of the graph, we get for the transition matrix
$$P = \begin{bmatrix} 0 & 1/3 & 1/4 & 0 \\ 1/2 & 0 & 1/4 & 1/3 \\ 1/2 & 1/3 & 0 & 2/3 \\ 0 & 1/3 & 1/2 & 0 \end{bmatrix}.$$
Thus for example from state 2, there is a 1/3 probability of getting to state 3, since there are three paths out of state 2, one of which leads to state 3. Thus P₃₂ = 1/3.
(b) First find the steady-state probability distribution by solving Px = x:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 1 & -1/3 & -1/4 & 0 & 0 \\ -1/2 & 1 & -1/4 & -1/3 & 0 \\ -1/2 & -1/3 & 1 & -2/3 & 0 \\ 0 & -1/3 & -1/2 & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & -2/3 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & -4/3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
So x₄ is free; setting x₄ = 3t to avoid fractions gives x₁ = 2t, x₂ = 3t, x₃ = 4t, and x₄ = 3t. Since this is a probability vector, we also have
$$x_1 + x_2 + x_3 + x_4 = 2t + 3t + 4t + 3t = 12t = 1, \quad\text{so that } t = \tfrac{1}{12}.$$
So the long run probability vector is
$$\mathbf{x} = \begin{bmatrix} 1/6 \\ 1/4 \\ 1/3 \\ 1/4 \end{bmatrix}.$$
Since we initially started with 60 total robots, the steady-state distribution will be
$$60\mathbf{x} = \begin{bmatrix} 10 \\ 15 \\ 20 \\ 15 \end{bmatrix}.$$
13. First suppose that P = [p₁ ··· pₙ] is a stochastic matrix (so that the entries of each pᵢ sum to 1). Let j be a row vector consisting entirely of 1's. Then
$$\mathbf{j}P = \begin{bmatrix} \mathbf{j}\cdot\mathbf{p}_1 & \cdots & \mathbf{j}\cdot\mathbf{p}_n \end{bmatrix} = \begin{bmatrix} \sum_i(\mathbf{p}_1)_i & \cdots & \sum_i(\mathbf{p}_n)_i \end{bmatrix} = \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} = \mathbf{j}.$$
Conversely, suppose that jP = j. Then j · pᵢ = 1 for each i, so that Σⱼ(pᵢ)ⱼ = 1 for all i, showing that the column sums of P are 1 so that P is stochastic.
14. Let $P = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \end{bmatrix}$ and $Q = \begin{bmatrix} w_1 & z_1 \\ w_2 & z_2 \end{bmatrix}$ be stochastic matrices. Then
$$PQ = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \end{bmatrix}\begin{bmatrix} w_1 & z_1 \\ w_2 & z_2 \end{bmatrix} = \begin{bmatrix} x_1w_1 + y_1w_2 & x_1z_1 + y_1z_2 \\ x_2w_1 + y_2w_2 & x_2z_1 + y_2z_2 \end{bmatrix}.$$
Now, both P and Q are stochastic, so that x1 + x2 = y1 + y2 = z1 + z2 = w1 + w2 = 1. Then summing
each column of P Q gives
x1 w1 + y1 w2 + x2 w1 + y2 w2 = (x1 + x2 )w1 + (y1 + y2 )w2 = w1 + w2 = 1
x1 z1 + y1 z2 + x2 z1 + y2 z2 = (x1 + x2 )z1 + (y1 + y2 )z2 = z1 + z2 = 1.
Thus each column of P Q sums to 1 so that P Q is also stochastic.
15. The transition matrix from Exercise 9 is
$$P = \begin{bmatrix} 0.750 & 0.338 \\ 0.250 & 0.662 \end{bmatrix}.$$
To find the expected number of days until a wet day, we delete the second row and column of P (corresponding to wet days), giving Q = [0.750]. Then the expected number of days until a wet day is
$$(I - Q)^{-1} = (1 - 0.750)^{-1} = 4.$$
We expect four days until the next wet day.
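This "delete the absorbing state, invert I − Q, sum the column" recipe is mechanical enough to script. A sketch (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

def expected_steps(P, absorb):
    """Expected number of steps to first reach state `absorb`,
    starting from each of the remaining states of chain P."""
    keep = [i for i in range(P.shape[0]) if i != absorb]
    Q = P[np.ix_(keep, keep)]                # transitions among non-target states
    N = np.linalg.inv(np.eye(len(keep)) - Q) # fundamental matrix
    return N.sum(axis=0)                     # column sums = expected step counts

P = np.array([[0.750, 0.338],
              [0.250, 0.662]])
days_until_wet = expected_steps(P, absorb=1)[0]
```

For the dry/wet chain this gives the four days computed above.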
16. The transition matrix from Exercise 10 is
$$P = \begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}.$$
We delete the row and column of this matrix corresponding to "tall", which is the first row and column, giving
$$Q = \begin{bmatrix} 0.7 & 0.4 \\ 0.2 & 0.4 \end{bmatrix}.$$
Then
$$(I - Q)^{-1} = \begin{bmatrix} 0.3 & -0.4 \\ -0.2 & 0.6 \end{bmatrix}^{-1} = \begin{bmatrix} 6.0 & 4.0 \\ 2.0 & 3.0 \end{bmatrix}.$$
The column of this matrix corresponding to "short" is the second column, since this was the third column in the original matrix. Summing the values in the second column shows that the expected number of generations before a short person has a tall descendant is 7.
17. The transition matrix from Exercise 11 is
$$P = \begin{bmatrix} 0.08 & 0.09 & 0.11 \\ 0.07 & 0.11 & 0.05 \\ 0.85 & 0.80 & 0.84 \end{bmatrix}.$$
Since we want to know how long we expect it to be before a good crop, we remove the first row and column, leaving
$$Q = \begin{bmatrix} 0.11 & 0.05 \\ 0.80 & 0.84 \end{bmatrix}.$$
Then
$$(I - Q)^{-1} = \begin{bmatrix} 0.89 & -0.05 \\ -0.80 & 0.16 \end{bmatrix}^{-1} = \begin{bmatrix} 1.562 & 0.488 \\ 7.812 & 8.691 \end{bmatrix}.$$
Since we are starting from a fair crop, we sum the entries in the first column (which was the "fair" column in the original matrix), giving 1.562 + 7.812 ≈ 9.4, or 9 to 10 years expected until the next good crop.
18. The transition matrix from Exercise 12 is
$$P = \begin{bmatrix} 0 & 1/3 & 1/4 & 0 \\ 1/2 & 0 & 1/4 & 1/3 \\ 1/2 & 1/3 & 0 & 2/3 \\ 0 & 1/3 & 1/2 & 0 \end{bmatrix}.$$
The expected number of moves from any junction other than 4 to reach junction 4 is found by removing the fourth row and column of P, giving
$$Q = \begin{bmatrix} 0 & 1/3 & 1/4 \\ 1/2 & 0 & 1/4 \\ 1/2 & 1/3 & 0 \end{bmatrix}.$$
Then
$$(I - Q)^{-1} = \begin{bmatrix} 1 & -1/3 & -1/4 \\ -1/2 & 1 & -1/4 \\ -1/2 & -1/3 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 22/13 & 10/13 & 8/13 \\ 15/13 & 21/13 & 9/13 \\ 16/13 & 12/13 & 20/13 \end{bmatrix},$$
so that a robot is expected to reach junction 4 starting from junction 1 in 22/13 + 15/13 + 16/13 = 53/13 ≈ 4.08 moves, from junction 2 in 10/13 + 21/13 + 12/13 = 43/13 ≈ 3.31 moves, and from junction 3 in 8/13 + 9/13 + 20/13 = 37/13 ≈ 2.85 moves.
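The same fundamental-matrix calculation for the robot maze, as a quick numerical check (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

# Transitions among junctions 1-3 after deleting the target junction 4
Q = np.array([[0.0, 1/3, 1/4],
              [1/2, 0.0, 1/4],
              [1/2, 1/3, 0.0]])

N = np.linalg.inv(np.eye(3) - Q)   # fundamental matrix (I - Q)^(-1)
expected_moves = N.sum(axis=0)     # column j = expected moves from junction j+1
```

The three column sums agree with 53/13, 43/13, and 37/13.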
19. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -1/2 & 1/4 & 0 \\ 1/2 & -1/4 & 0 \end{bmatrix} \xrightarrow[R_2 + R_1]{-2R_1} \begin{bmatrix} 1 & -1/2 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus the solutions are of the form x₁ = (1/2)t, x₂ = t. Set t = 2 to avoid fractions, giving a price vector $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$. Of course, other values of t lead to equally valid price vectors.
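A price vector is just a vector in the null space of E − I, so it can be found numerically too. A sketch (an addition; the entries of E are reconstructed from the E − I shown in the solution, so treat them as an assumption), using SVD for the null space:

```python
import numpy as np

# Exchange matrix with E - I = [[-1/2, 1/4], [1/2, -1/4]]
E = np.array([[0.5, 0.25],
              [0.5, 0.75]])

# Null space of E - I: the right-singular vector for the zero singular value
_, _, Vt = np.linalg.svd(E - np.eye(2))
p = Vt[-1]
p = p / p[0]   # scale so the first price is 1
```

Scaling the null vector reproduces the price vector (1, 2) up to the usual free multiple.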
20. Since neither column sum is 1, this is not an exchange matrix.
21. While the first column sums to 1, the second does not, so this is not an exchange matrix.
22. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -0.9 & 0.6 & 0 \\ 0.9 & -0.6 & 0 \end{bmatrix} \xrightarrow[R_2 + R_1]{-\frac{10}{9}R_1} \begin{bmatrix} 1 & -2/3 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus the solutions are of the form x₁ = (2/3)t, x₂ = t. Set t = 3 to avoid fractions, giving a price vector $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$. Of course, other values of t lead to equally valid price vectors.
23. Since the matrix contains a negative entry, it is not an exchange matrix.
24. Since the sums of the columns of the matrix E are all 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -1/2 & 1 & 0 & 0 \\ 0 & -1 & 1/3 & 0 \\ 1/2 & 0 & -1/3 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -2/3 & 0 \\ 0 & 1 & -1/3 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Thus the solutions are of the form x₁ = (2/3)t, x₂ = (1/3)t, x₃ = t. Set t = 3 to avoid fractions, giving a price vector $\begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$. Of course, other values of t lead to equally valid price vectors.
25. Since the sums of the columns of the matrix E are all 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -0.7 & 0 & 0.2 & 0 \\ 0.3 & -0.5 & 0.3 & 0 \\ 0.4 & 0.5 & -0.5 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -2/7 & 0 \\ 0 & 1 & -27/35 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Thus the solutions are of the form x₁ = (2/7)t, x₂ = (27/35)t, x₃ = t. Set t = 35 to avoid fractions, giving a price vector $\begin{bmatrix} 10 \\ 27 \\ 35 \end{bmatrix}$. Of course, other values of t lead to equally valid price vectors.
26. Since the sums of the columns of the matrix E are all 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -0.50 & 0.70 & 0.35 & 0 \\ 0.25 & -0.70 & 0.25 & 0 \\ 0.25 & 0 & -0.60 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -2.40 & 0 \\ 0 & 1 & -1.214 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
So one possible solution is $\begin{bmatrix} 2.4 \\ 1.214 \\ 1 \end{bmatrix}$. Multiplying through by 70 will roughly clear fractions, giving $\approx \begin{bmatrix} 168 \\ 85 \\ 70 \end{bmatrix}$ if an integral solution is desired.
27. This matrix is a consumption matrix since all of its entries are nonnegative. By Corollary 3.36, it is
productive since the sum of the entries in each column is strictly less than 1. (Note that Corollary 3.35
does not apply since the sum of the entries in the second row exceeds 1).
28. This matrix is a consumption matrix since all of its entries are nonnegative. By Corollary 3.35, it is
productive since the sum of the entries in each row is strictly less than 1. (Note that Corollary 3.36
does not apply since the sum of the entries in the third column exceeds 1).
29. This matrix is a consumption matrix since all of its entries are nonnegative. Since the entries in the second row as well as those in the second column sum to a number greater than 1, neither Corollary 3.35 nor 3.36 applies. So to determine if C is productive, we must use the definition, which involves seeing if I − C is invertible and if its inverse has all positive entries.
$$I - C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.35 & 0.25 & 0 \\ 0.15 & 0.55 & 0.35 \\ 0.45 & 0.30 & 0.60 \end{bmatrix} = \begin{bmatrix} 0.65 & -0.25 & 0 \\ -0.15 & 0.45 & -0.35 \\ -0.45 & -0.30 & 0.40 \end{bmatrix}.$$
Using technology,
$$(I - C)^{-1} \approx \begin{bmatrix} -13.33 & -17.78 & -15.56 \\ -38.67 & -46.22 & -40.44 \\ -44.00 & -54.67 & -45.33 \end{bmatrix}.$$
Since this is not a positive matrix, it follows that C is not productive.
30. This matrix is a consumption matrix since all of its entries are nonnegative. Since the entries in the first row, as well as those in each column, sum to a number at least equal to 1, neither Corollary 3.35 nor 3.36 applies. So to determine if C is productive, we must use the definition, which involves seeing if I − C is invertible and if its inverse has all positive entries.
$$I - C = I_4 - \begin{bmatrix} 0.2 & 0.4 & 0.1 & 0.4 \\ 0.3 & 0.2 & 0.2 & 0.1 \\ 0 & 0.4 & 0.5 & 0.3 \\ 0.5 & 0 & 0.2 & 0.2 \end{bmatrix} = \begin{bmatrix} 0.8 & -0.4 & -0.1 & -0.4 \\ -0.3 & 0.8 & -0.2 & -0.1 \\ 0 & -0.4 & 0.5 & -0.3 \\ -0.5 & 0 & -0.2 & 0.8 \end{bmatrix}.$$
Using technology reveals that I − C is not invertible, so that C is not productive.
31. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)⁻¹d. Here
$$I - C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 1/2 & 1/4 \\ 1/2 & 1/2 \end{bmatrix} = \begin{bmatrix} 1/2 & -1/4 \\ -1/2 & 1/2 \end{bmatrix}.$$
Then det(I − C) = (1/2)(1/2) − (−1/4)(−1/2) = 1/8, so that
$$(I - C)^{-1} = \frac{1}{1/8}\begin{bmatrix} 1/2 & 1/4 \\ 1/2 & 1/2 \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ 4 & 4 \end{bmatrix}.$$
Thus a feasible production vector is
$$\mathbf{x} = (I - C)^{-1}\mathbf{d} = \begin{bmatrix} 4 & 2 \\ 4 & 4 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 10 \\ 16 \end{bmatrix}.$$
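In practice there is no need to invert: (I − C)x = d is a linear system. A sketch of the same Leontief calculation (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

# Consumption matrix and demand vector from Exercise 31
C = np.array([[0.5, 0.25],
              [0.5, 0.5]])
d = np.array([1.0, 3.0])

# Solve (I - C) x = d directly instead of forming the inverse
x = np.linalg.solve(np.eye(2) - C, d)
```

`solve` is both faster and numerically safer than computing `inv(I - C) @ d`, and it returns the production vector (10, 16) found above.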
32. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)⁻¹d. Here
$$I - C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.1 & 0.4 \\ 0.3 & 0.2 \end{bmatrix} = \begin{bmatrix} 0.9 & -0.4 \\ -0.3 & 0.8 \end{bmatrix}.$$
Then det(I − C) = 0.9 · 0.8 − (−0.4)(−0.3) = 0.60, so that
$$(I - C)^{-1} = \frac{1}{0.60}\begin{bmatrix} 0.8 & 0.4 \\ 0.3 & 0.9 \end{bmatrix} = \begin{bmatrix} 4/3 & 2/3 \\ 1/2 & 3/2 \end{bmatrix}.$$
Thus a feasible production vector is
$$\mathbf{x} = (I - C)^{-1}\mathbf{d} = \begin{bmatrix} 4/3 & 2/3 \\ 1/2 & 3/2 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 10/3 \\ 5/2 \end{bmatrix}.$$
33. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)⁻¹d. Here
$$I - C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.5 & 0.2 & 0.1 \\ 0 & 0.4 & 0.2 \\ 0 & 0 & 0.5 \end{bmatrix} = \begin{bmatrix} 0.5 & -0.2 & -0.1 \\ 0 & 0.6 & -0.2 \\ 0 & 0 & 0.5 \end{bmatrix}.$$
Then row-reduce [I − C | I] to compute (I − C)⁻¹:
$$\left[\,I-C \mid I\,\right] \longrightarrow \left[\,I \;\middle|\; \begin{matrix} 2 & 2/3 & 2/3 \\ 0 & 5/3 & 2/3 \\ 0 & 0 & 2 \end{matrix}\,\right].$$
Thus a feasible production vector is
$$\mathbf{x} = (I - C)^{-1}\mathbf{d} = \begin{bmatrix} 2 & 2/3 & 2/3 \\ 0 & 5/3 & 2/3 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} 3 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 10 \\ 6 \\ 8 \end{bmatrix}.$$
34. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)⁻¹d. Here
$$I - C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.1 & 0.4 & 0.1 \\ 0 & 0.2 & 0.2 \\ 0.3 & 0.2 & 0.3 \end{bmatrix} = \begin{bmatrix} 0.9 & -0.4 & -0.1 \\ 0 & 0.8 & -0.2 \\ -0.3 & -0.2 & 0.7 \end{bmatrix}.$$
Compute (I − C)⁻¹ using technology to get
$$(I - C)^{-1} \approx \begin{bmatrix} 1.238 & 0.714 & 0.381 \\ 0.143 & 1.429 & 0.429 \\ 0.571 & 0.714 & 1.714 \end{bmatrix}.$$
Thus a feasible production vector is
$$\mathbf{x} = (I - C)^{-1}\mathbf{d} \approx \begin{bmatrix} 1.238 & 0.714 & 0.381 \\ 0.143 & 1.429 & 0.429 \\ 0.571 & 0.714 & 1.714 \end{bmatrix}\begin{bmatrix} 1.1 \\ 3.5 \\ 2.0 \end{bmatrix} \approx \begin{bmatrix} 4.624 \\ 6.014 \\ 6.557 \end{bmatrix}.$$
35. Since x ≥ 0 and A ≥ O, then Ax has all nonnegative entries as well, so that Ax ≥ Ox = 0. But then
x > Ax ≥ Ox = 0,
so that x > 0.
36. (a) Since all entries of all matrices are positive, it is clear that AC ≥ O and BD ≥ O. To see that AC ≥ BD, let A = (aᵢⱼ), B = (bᵢⱼ), C = (cᵢⱼ), and D = (dᵢⱼ). Then each aᵢⱼ ≥ bᵢⱼ and each cᵢⱼ ≥ dᵢⱼ. So
$$(AC)_{ij} = \sum_{k=1}^n a_{ik}c_{kj} \ge \sum_{k=1}^n b_{ik}c_{kj} \ge \sum_{k=1}^n b_{ik}d_{kj} = (BD)_{ij}.$$
So every entry of AC is greater than or equal to the corresponding entry of BD, and thus AC ≥ BD.
(b) Let xᵀ = [x₁ x₂ ··· xₙ]. Then for any i and j, we know that aᵢⱼxⱼ ≥ bᵢⱼxⱼ since aᵢⱼ > bᵢⱼ and xⱼ ≥ 0. Since x ≠ 0, we know that some component of x, say xₖ, is strictly positive. Then aᵢₖxₖ > bᵢₖxₖ since xₖ > 0. But then
$$(A\mathbf{x})_i = a_{i1}x_1 + \cdots + a_{ik}x_k + \cdots + a_{in}x_n > b_{i1}x_1 + \cdots + b_{ik}x_k + \cdots + b_{in}x_n = (B\mathbf{x})_i.$$
37. Multiplying, we get
$$\mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 2 & 5 \\ 0.6 & 0 \end{bmatrix}\begin{bmatrix} 10 \\ 5 \end{bmatrix} = \begin{bmatrix} 45 \\ 6 \end{bmatrix}, \quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 2 & 5 \\ 0.6 & 0 \end{bmatrix}\begin{bmatrix} 45 \\ 6 \end{bmatrix} = \begin{bmatrix} 120 \\ 27 \end{bmatrix}, \quad \mathbf{x}_3 = L\mathbf{x}_2 = \begin{bmatrix} 2 & 5 \\ 0.6 & 0 \end{bmatrix}\begin{bmatrix} 120 \\ 27 \end{bmatrix} = \begin{bmatrix} 375 \\ 72 \end{bmatrix}.$$
38. Multiplying, we get
$$\mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 0 & 1 & 2 \\ 0.2 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 10 \\ 4 \\ 3 \end{bmatrix} = \begin{bmatrix} 10 \\ 2 \\ 2 \end{bmatrix}, \quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 0 & 1 & 2 \\ 0.2 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 10 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 6 \\ 2 \\ 1 \end{bmatrix}, \quad \mathbf{x}_3 = L\mathbf{x}_2 = \begin{bmatrix} 0 & 1 & 2 \\ 0.2 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 6 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 1.2 \\ 1 \end{bmatrix}.$$
39. Multiplying, we get
$$\mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 1 & 1 & 3 \\ 0.7 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 100 \\ 100 \\ 100 \end{bmatrix} = \begin{bmatrix} 500 \\ 70 \\ 50 \end{bmatrix}, \quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 1 & 1 & 3 \\ 0.7 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 500 \\ 70 \\ 50 \end{bmatrix} = \begin{bmatrix} 720 \\ 350 \\ 35 \end{bmatrix}, \quad \mathbf{x}_3 = L\mathbf{x}_2 = \begin{bmatrix} 1 & 1 & 3 \\ 0.7 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 720 \\ 350 \\ 35 \end{bmatrix} = \begin{bmatrix} 1175 \\ 504 \\ 175 \end{bmatrix}.$$
40. Multiplying, we get
$$\mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 0 & 1 & 2 & 5 \\ 0.5 & 0 & 0 & 0 \\ 0 & 0.7 & 0 & 0 \\ 0 & 0 & 0.3 & 0 \end{bmatrix}\begin{bmatrix} 10 \\ 10 \\ 10 \\ 10 \end{bmatrix} = \begin{bmatrix} 80 \\ 5 \\ 7 \\ 3 \end{bmatrix}, \quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 34 \\ 40 \\ 3.5 \\ 2.1 \end{bmatrix}, \quad \mathbf{x}_3 = L\mathbf{x}_2 = \begin{bmatrix} 57.5 \\ 17 \\ 28 \\ 1.05 \end{bmatrix}.$$
41. (a) The first ten generations of the species using each of the two models are as follows (entries are class 1, class 2):

            Model L1           Model L2
    x0      10, 10             10, 10
    x1      50, 8              50, 8
    x2      40, 40             208, 40
    x3      200, 32            872, 166.4
    x4      160, 160           3654.4, 697.6
    x5      800, 128           15315.2, 2923.52
    x6      640, 640           64184.3, 12252.2
    x7      3200, 512          268989, 51347.5
    x8      2560, 2560         1.127 × 10⁶, 215192
    x9      12800, 2048        4.724 × 10⁶, 901844
    x10     10240, 10240       1.980 × 10⁷, 3.780 × 10⁶

(b) [Figure: two plots of population percentage versus time (years 0–10), one per model, each showing the Class 1 and Class 2 percentages. Under model L1 the percentages oscillate; under model L2 they quickly level off.]
The graphs suggest that the first model does not have a steady state, while the second Leslie matrix quickly approaches a steady state ratio of class 1 to class 2 population.
 
42. Start with $x_0 = \begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix}$. Then
$$\mathbf{x}_1 = \begin{bmatrix} 0 & 0 & 20 \\ 0.1 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix} = \begin{bmatrix} 400 \\ 2 \\ 10 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} 0 & 0 & 20 \\ 0.1 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 400 \\ 2 \\ 10 \end{bmatrix} = \begin{bmatrix} 200 \\ 40 \\ 1 \end{bmatrix}, \quad \mathbf{x}_3 = \begin{bmatrix} 0 & 0 & 20 \\ 0.1 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 200 \\ 40 \\ 1 \end{bmatrix} = \begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix}.$$
Thus after three generations the population is back to where it started, so that the population is cyclic.
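The three-year cycle is equivalent to saying L³ = I for this Leslie matrix, which is easy to confirm (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

L = np.array([[0.0, 0.0, 20.0],
              [0.1, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
x0 = np.array([20.0, 20.0, 20.0])

L3 = np.linalg.matrix_power(L, 3)   # cube of the Leslie matrix
x3 = L3 @ x0                        # population after three generations
```

Here the product of the survival and fecundity terms around the cycle is 20 · 0.1 · 0.5 = 1, which is exactly why L³ is the identity.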
 
43. Start with $x_0 = \begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix}$. Then
$$\mathbf{x}_1 = \begin{bmatrix} 0 & 0 & 20 \\ s & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix} = \begin{bmatrix} 400 \\ 20s \\ 10 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} 0 & 0 & 20 \\ s & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 400 \\ 20s \\ 10 \end{bmatrix} = \begin{bmatrix} 200 \\ 400s \\ 10s \end{bmatrix}, \quad \mathbf{x}_3 = \begin{bmatrix} 0 & 0 & 20 \\ s & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 200 \\ 400s \\ 10s \end{bmatrix} = \begin{bmatrix} 200s \\ 200s \\ 200s \end{bmatrix}.$$
Thus after three generations the population is still evenly distributed among the three classes, but the total population has been multiplied by 10s. So
• If s < 0.1, then the overall population declines after three years;
• If s = 0.1, then the overall population is cyclic every three years;
• If s > 0.1, then the overall population increases after three years.
44. The table gives us seven separate age classes, each of duration two years; from the table, the Leslie matrix of the system is
$$L = \begin{bmatrix} 0 & 0.4 & 1.8 & 1.8 & 1.8 & 1.6 & 0.6 \\ 0.3 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0.7 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.9 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.9 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0.9 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0.6 & 0 \end{bmatrix}.$$
Since in 1990,
$$\mathbf{x}_0 = \begin{bmatrix} 10 \\ 2 \\ 8 \\ 5 \\ 12 \\ 0 \\ 1 \end{bmatrix}, \quad\text{so that}\quad \mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 46.4 \\ 3 \\ 1.4 \\ 7.2 \\ 4.5 \\ 10.8 \\ 0 \end{bmatrix}, \quad\text{and}\quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 42.06 \\ 13.92 \\ 2.1 \\ 1.26 \\ 6.48 \\ 4.05 \\ 6.48 \end{bmatrix}.$$
Then x₁ and x₂ represent the population distribution in 1991 and 1992. To project the population for the years 2000 and 2010, note that xₙ = Lⁿx₀. Thus
$$\mathbf{x}_{10} = L^{10}\mathbf{x}_0 \approx \begin{bmatrix} 167.80 \\ 46.08 \\ 29.54 \\ 24.35 \\ 20.06 \\ 16.48 \\ 9.05 \end{bmatrix}, \quad \mathbf{x}_{20} = L^{20}\mathbf{x}_0 \approx \begin{bmatrix} 69.12 \\ 18.61 \\ 12.24 \\ 10.59 \\ 8.29 \\ 6.51 \\ 3.57 \end{bmatrix}.$$
45. The adjacency matrix contains a 1 in row i, column j if there is an edge between vᵢ and vⱼ:
$$\begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}.$$
46. The adjacency matrix contains a 1 in row i, column j if there is an edge between vᵢ and vⱼ:
$$\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$
47. The adjacency matrix contains a 1 in row i, column j if there is an edge between vᵢ and vⱼ:
$$\begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 \end{bmatrix}.$$
48. The adjacency matrix contains a 1 in row i, column j if there is an edge between vᵢ and vⱼ:
$$\begin{bmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
49. [Figure: the graph on vertices v1, v2, v3, v4 drawn from the given adjacency matrix.]
50. [Figure: the graph on vertices v1, v2, v3, v4 drawn from the given adjacency matrix.]
51. [Figure: the graph on vertices v1, v2, v3, v4, v5 drawn from the given adjacency matrix.]
52. [Figure: the graph on vertices v1, v2, v3, v4, v5 drawn from the given adjacency matrix.]
53. The adjacency matrix contains a 1 in row i, column j if there is an edge from vᵢ to vⱼ:
$$\begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$
54. The adjacency matrix contains a 1 in row i, column j if there is an edge from vᵢ to vⱼ:
$$\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{bmatrix}.$$
55. The adjacency matrix contains a 1 in row i, column j if there is an edge from vᵢ to vⱼ:
$$\begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
56. The adjacency matrix contains a 1 in row i, column j if there is an edge from vᵢ to vⱼ:
$$\begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}.$$
57. [Figure: the digraph on vertices v1, v2, v3, v4 drawn from the given adjacency matrix.]
58. [Figure: the digraph on vertices v1, v2, v3, v4 drawn from the given adjacency matrix.]
59. [Figure: the digraph on vertices v1, v2, v3, v4, v5 drawn from the given adjacency matrix.]
60. [Figure: the digraph on vertices v1, v2, v3, v4, v5 drawn from the given adjacency matrix.]
61. In Exercise 50,
$$A = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}, \quad\text{so}\quad A^2 = \begin{bmatrix} 2 & 2 & 2 & 1 \\ 2 & 4 & 2 & 3 \\ 2 & 2 & 2 & 1 \\ 1 & 3 & 1 & 3 \end{bmatrix}.$$
The number of paths of length 2 between v1 and v2 is given by (A²)₁₂ = 2. So there are two paths of length 2 between v1 and v2.
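The path-counting rule behind these exercises — (Aᵏ)ᵢⱼ counts walks of length k from vᵢ to vⱼ — is easy to check numerically. A sketch (an addition; the adjacency matrix is as read from this solution), assuming NumPy:

```python
import numpy as np

# Adjacency matrix of Exercise 50's graph, as read from the solution
A = np.array([[0, 1, 0, 1],
              [1, 1, 1, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 0]])

A2 = A @ A
paths_v1_v2 = A2[0, 1]   # walks of length 2 from v1 to v2 (0-based indices)
```

The (1, 2) entry of A² gives the two length-2 paths found above.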
62. In Exercise 52,
$$A = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}, \quad\text{so}\quad A^2 = \begin{bmatrix} 2 & 2 & 2 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 3 & 3 \end{bmatrix}.$$
The number of paths of length 2 between v1 and v2 is given by (A²)₁₂ = 2. So there are two paths of length 2 between v1 and v2.
63. In Exercise 50,
$$A = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}, \quad\text{so}\quad A^3 = \begin{bmatrix} 3 & 7 & 3 & 6 \\ 7 & 11 & 7 & 8 \\ 3 & 7 & 3 & 6 \\ 6 & 8 & 6 & 5 \end{bmatrix}.$$
The number of paths of length 3 between v1 and v3 is given by (A³)₁₃ = 3. So there are three paths of length 3 between v1 and v3.
64. In Exercise 52,
$$A = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}, \quad\text{so}\quad A^4 = \begin{bmatrix} 12 & 12 & 12 & 0 & 0 \\ 12 & 12 & 12 & 0 & 0 \\ 12 & 12 & 12 & 0 & 0 \\ 0 & 0 & 0 & 18 & 18 \\ 0 & 0 & 0 & 18 & 18 \end{bmatrix}.$$
The number of paths of length 4 between v2 and v2 is given by (A⁴)₂₂ = 12. So there are twelve paths of length 4 between v2 and itself.
65. In Exercise 57,
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \end{bmatrix}, \quad\text{so}\quad A^2 = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 1 & 2 & 1 & 1 \end{bmatrix}.$$
The number of paths of length 2 from v1 to v3 is (A²)₁₃ = 0. There are no paths of length 2 from v1 to v3.
66. In Exercise 57,
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \end{bmatrix}, \quad\text{so}\quad A^3 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \\ 3 & 2 & 1 & 3 \end{bmatrix}.$$
The number of paths of length 3 from v4 to v1 is (A³)₄₁ = 3. There are three paths of length 3 from v4 to v1.
67. In Exercise 60,
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \end{bmatrix}, \quad\text{so}\quad A^3 = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 2 \\ 2 & 3 & 0 & 3 & 3 \\ 3 & 3 & 1 & 1 & 1 \\ 2 & 1 & 1 & 1 & 0 \end{bmatrix}.$$
The number of paths of length 3 from v4 to v1 is (A³)₄₁ = 3. There are three paths of length 3 from v4 to v1.
68. In Exercise 60,
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \end{bmatrix}, \quad\text{so}\quad A^4 = \begin{bmatrix} 3 & 2 & 1 & 2 & 2 \\ 3 & 3 & 1 & 1 & 1 \\ 6 & 5 & 3 & 3 & 2 \\ 3 & 4 & 1 & 4 & 4 \\ 2 & 2 & 1 & 2 & 3 \end{bmatrix}.$$
The number of paths of length 4 from v1 to v4 is (A⁴)₁₄ = 2. There are two paths of length 4 from v1 to v4.
69. (a) If there are no ones in row i, then no edges join vertex i to any of the other vertices. Thus G is a
disconnected graph, and vertex i is an isolated vertex.
(b) If there are no ones in column j, then no edges join vertex j to any of the other vertices. Thus G
is a disconnected graph, and vertex j is an isolated vertex.
70. (a) If row i of A2 is all zeros, then there are no paths of length 2 from vertex i to any other vertex.
(b) If column j of A2 is all zeros, then there are no paths of length 2 from any vertex to vertex j.
71. The adjacency matrix for the digraph is
$$A = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 & 0 \end{bmatrix}.$$
The number of wins that each player had is given by
$$A\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 4 \\ 3 \\ 3 \\ 1 \\ 3 \end{bmatrix},$$
so the ranking is: first, P2; second, P3, P4, P6 (tie); third, P1, P5 (tie). Ranking instead by combined wins and indirect wins, we compute
$$(A + A^2)\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 \\ 3 & 0 & 2 & 2 & 3 & 2 \\ 2 & 1 & 0 & 1 & 3 & 1 \\ 2 & 1 & 1 & 0 & 3 & 2 \\ 1 & 0 & 1 & 1 & 0 & 1 \\ 3 & 1 & 1 & 2 & 3 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 12 \\ 8 \\ 9 \\ 4 \\ 10 \end{bmatrix},$$
so the ranking is: first, P2; second, P6; third, P4; fourth, P3; fifth, P5; sixth, P1.
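This win/indirect-win ranking is a tiny matrix computation. A sketch (an addition; the adjacency matrix is as read from this solution), assuming NumPy:

```python
import numpy as np

# Tournament adjacency matrix: A[i, j] = 1 if player i+1 beat player j+1
A = np.array([[0, 0, 0, 0, 1, 0],
              [1, 0, 1, 0, 1, 1],
              [1, 0, 0, 1, 1, 0],
              [1, 1, 0, 0, 1, 0],
              [0, 0, 0, 0, 0, 1],
              [1, 0, 1, 1, 0, 0]])

wins = A.sum(axis=1)                    # direct wins per player
combined = (A + A @ A).sum(axis=1)      # direct plus indirect wins
ranking = np.argsort(-combined) + 1     # player numbers, best first
```

Sorting by the combined scores breaks the three-way tie among P3, P4, and P6.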
72. Let vertices 1 through 7 correspond to, in order, rodent, plant, insect, bird, fish, fox, and bear. Then
the adjacency matrix is


0 1 0 0 0 0 0
0 0 0 0 0 0 0


0 1 0 0 0 0 0



A=
0 1 1 0 1 0 0 .
0 1 1 0 0 0 0


1 0 0 1 0 0 0
1 0 0 0 1 1 0
(a) The number of direct sources of food is the number of ones in each row, so it is
   
1
1
1 0
   
1 1
   
  
A
1 = 3 .
1 2
   
1 2
3
1
Thus birds and bears have the most direct sources of food, 3.
(b) The number of species for which a given species is a direct source of food is the number of ones in
its column. Clearly plants, with a value of 4, are a direct source of food for the greatest number
of species. Note the analogy of parts (a) and (b) with the previous exercise, where eating a given
species is equivalent to winning that head-to-head match, and being eaten is equivalent to losing
it.
(c) Since A2 ij gives the number of paths of length 2 from vertex i to vertex j, it follows that A2 ij
is the number of ways in which species j is an indirect food source for species i. So summing the
entries of the ith row of A2 gives the total number of indirect food sources:
   
1
0
1 0
   
1 0
   
2 

A 1 = 
3 .
1 1
   
1 4
1
5
3.7. APPLICATIONS
227
Thus bears have the most indirect food sources. Combined direct and indirect food sources are
given by
   
1
1
1 0
   
1 1
   
2  

(A + A ) 1 = 
6 .
1 3
   
1 6
8
1
(d) The matrix A∗ is the matrix A after removing row 2 and column 2:


0 0 0 0 0 0
0 0 0 0 0 0


0 1 0 1 0 0

A∗ = 
0 1 0 0 0 0 .


1 0 1 0 0 0
1 0 0 1 1 0
Then we have
   
0
1
1 0
   
  2
∗ 1

A  =
 ,
1 1
1 2
3
1
so the species with the most direct food sources is the bear, with three. The species that is a
direct food source for the most species is the species for which the column sum is the largest;
three species, rodents, insects, and birds, each are prey for two species. Note that the column
sums can be computed as
 
1
1
 
1

1 1 1 1 1 1 A∗ or (A∗ )T 
1 .
 
1
1
The number of indirect food sources each species has is
   
1
0
1 0
   
 
2 1
 1
(A∗ ) 
1 = 0 ,
   
1 2
1
3
and the number of combined direct and indirect food sources is
   
1
0
1 0

  
 
2 1
 = 3 .
A∗ + (A∗ ) 
1 1
   
1 4
1
6
Thus birds have lost the most combined direct and indirect sources of food (3). The bear, fox,
and fish species have each lost two combined sources of food, and rodents and insects have each
lost one combined food source.
228
CHAPTER 3. MATRICES
(e) Since (A^*)^5 is the zero matrix, after a while no species will have a direct food source, so that the
ecosystem will be wiped out.
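These counts can be verified mechanically. The following is a minimal pure-Python check (our own helper functions, no libraries), using the matrix A^* given in part (d):

```python
# Numerical check of the food-web computations: direct, indirect, and
# combined food-source counts for A*, plus the nilpotency (A*)^5 = O.

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def row_sums(X):
    return [sum(row) for row in X]

A_star = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
]

A2 = matmul(A_star, A_star)
direct = row_sums(A_star)    # direct food sources per species
indirect = row_sums(A2)      # indirect food sources per species
combined = [d + i for d, i in zip(direct, indirect)]
print(direct)    # [0, 0, 2, 1, 2, 3]
print(indirect)  # [0, 0, 1, 0, 2, 3]
print(combined)  # [0, 0, 3, 1, 4, 6]

# (A*)^5 is the zero matrix, as claimed in part (e)
A5 = matmul(A2, matmul(A2, A_star))
print(all(v == 0 for row in A5 for v in row))  # True
```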
73. (a) Using the relationships given, we get the digraph with vertices Ann, Bert, Carla, Dana, and Ehaz. (Figure omitted.)
Letting vertices 1 through 5 correspond to the people in alphabetical order, the adjacency matrix
is
\[
A = \begin{bmatrix}
0 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}.
\]
(b) The element [A^k]_{ij} is positive if and only if there is a path of length k from i to j, so there is a path of length at
most k from i to j if [A + A^2 + \cdots + A^k]_{ij} ≥ 1. If Bert hears a rumor and we want to know how
many steps it takes before everyone else has heard the rumor, then we must calculate A, A + A^2,
A + A^2 + A^3, and so forth, until every entry in row 2 (except for the one in column 2, which
corresponds to Bert himself) is nonzero. This is not true for A, since for example [A]_{21} = 0. Next,


\[
A + A^2 = \begin{bmatrix}
0 & 1 & 1 & 0 & 2 \\
1 & 0 & 2 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 \\
1 & 0 & 2 & 0 & 2 \\
0 & 1 & 1 & 1 & 0
\end{bmatrix},
\]
so everyone else will have heard the rumor, either directly or indirectly, from Bert after two steps
(Carla will have heard it twice).
(c) Using the same logic as in part (b), we must compute successive sums of powers of A until every
entry in the first row, except for the entry in the first column corresponding to Ann herself, is
nonzero. From part (b), we see that this is not the case for A + A^2, since [A + A^2]_{14} = 0. Next,


\[
A + A^2 + A^3 = \begin{bmatrix}
0 & 2 & 2 & 1 & 2 \\
1 & 1 & 3 & 1 & 3 \\
0 & 1 & 1 & 1 & 1 \\
1 & 2 & 2 & 0 & 3 \\
1 & 1 & 2 & 1 & 1
\end{bmatrix},
\]
so everyone else will have heard the rumor from Ann after three steps. In fact, everyone but Dana
will have heard it twice.
(d) Vertex j is reachable from vertex i by a path of length k if [A^k]_{ij} is nonzero. Now, if j is reachable from i at
all, it is reachable by a path of length at most n, where n is the number of vertices in the digraph,
so we must compute
\[
[A + A^2 + A^3 + \cdots + A^n]_{ij}
\]
and see if it is nonzero. If it is, the two vertices are connected.
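The step-counting logic of parts (b) and (c) can be sketched in a few lines of Python (our own helpers; vertices are 0-indexed, so Ann is vertex 0 and Bert is vertex 1):

```python
# For each person, find the smallest k such that every other entry of
# their row of A + A^2 + ... + A^k is positive.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
]

S = A   # running sum A + A^2 + ... + A^k
P = A   # running power A^k
steps = {}
for k in range(2, 6):
    P = matmul(P, A)
    S = matadd(S, P)
    for person in (0, 1):   # Ann = vertex 0, Bert = vertex 1
        if person not in steps and all(
                S[person][j] > 0 for j in range(5) if j != person):
            steps[person] = k
print(steps)  # Bert's row fills in at k = 2, Ann's at k = 3
```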
74. Let G be a graph with m vertices, and A = (aij ) its adjacency matrix.
(a) The basis for the induction is n = 1, where the statement says that aij is the number of direct
paths between vertices i and j. But this is just the definition of the adjacency matrix, since
aij = 1 if i and j have an edge between them and is zero otherwise.
Next, assume [A^k]_{ij} is the number of paths of length k between i and j. Then
\[
[A^{k+1}]_{ij} = \sum_{l=1}^{m} [A^k]_{il}\,[A]_{lj}.
\]
For a given vertex l, [A^k]_{il} gives the number of paths of length k from i to l, and [A]_{lj} gives the number of paths
of length 1 from l to j (this is either zero or one). So the product gives the number of paths of
length k + 1 from i to j passing through l. Summing over all vertices l in the graph gives all paths
of length k + 1 from i to j, proving the inductive step.
(b) If G is a digraph, the statement of the result must be modified to say that the (i, j) entry of An
is equal to the number of n-paths from i to j; the proof is valid since the adjacency matrix of a
digraph takes arrow directions into account.
75. [AA^T]_{ij} = \sum_{k=1}^{m} a_{ik} a_{jk}. So this matrix entry is the number of vertices k that have edges from both
vertex i and vertex j.
76. From the diagram in Exercise 49, this graph is bipartite, with U = {v1 } and V = {v2 , v3 , v4 }.
77. From the diagram in Exercise 52, this graph is bipartite, with U = {v1 , v2 , v3 } and V = {v4 , v5 }.
78. The graph in Exercise 51 is not bipartite. For suppose v1 ∈ U . Then since there are edges from v1 to
both v3 and v4 , we must have v3 , v4 ∈ V . Next, there are edges from v3 to v5 , and from v4 to v2 , so
that v2 , v5 ∈ U . But there is an edge from v2 to v5 . So this graph cannot be bipartite.
79. Examining the matrix, we see that vertices 1, 2, and 4 are connected to vertices 3, 5, and 6 only, so
that if we define U = {v1 , v2 , v4 } and V = {v3 , v5 , v6 }, we have shown that this graph is bipartite.
80. Suppose that G has n vertices.
(a) First suppose that the adjacency matrix of G is of the form shown. Suppose that the zero blocks
have p and q = n − p rows respectively. Then any edges at vertices v1 , v2 , . . . , vp have their other
vertex in vertices vp+1 , vp+2 , . . . , vn , since the upper left p × p matrix is O. Similarly, any edges
at vertices vp+1 , vp+2 , . . . , vn have their other vertex in vertices v1 , v2 , . . . , vp , since the lower right
q × q matrix is O. So G is bipartite, with U = {v1 , v2 , . . . , vp } and V = {vp+1 , vp+2 , . . . , vn }.
Next suppose that G is bipartite. Label the vertices so that U = {v1 , v2 , . . . , vp } and V =
{vp+1 , vp+2 , . . . , vn }. Now consider the adjacency matrix A of G. aij = 0 if i, j ≤ p, since then
both vi and vj are in U ; similarly, aij = 0 if i, j > p, since then both vi and vj are in V . So the
upper left p × p submatrix as well as the bottom right q × q submatrix are the zero matrix. The
remaining two submatrices are transposes of each other since this is an undirected graph, so that
aij = 1 if and only if aji = 1. Thus A is of the form shown.
(b) Suppose G is bipartite and A its adjacency matrix, of the form given in part (a). It suffices to
show that odd powers of A have zero blocks at upper left of size p × p and at lower right of size
q × q, since then there are no paths of odd length between any vertex and itself. We do this by
induction. Clearly A^1 = A is of that form. Now,
\[
A^2 = \begin{bmatrix} O & B \\ B^T & O \end{bmatrix}^2
= \begin{bmatrix} OO + BB^T & OB + BO \\ B^T O + OB^T & B^T B + OO \end{bmatrix}
= \begin{bmatrix} BB^T & O \\ O & B^T B \end{bmatrix}
= \begin{bmatrix} D_1 & O \\ O & D_2 \end{bmatrix},
\]
where D_1 = BB^T and D_2 = B^T B. Suppose that A^{2k-1} has the required zero blocks, say
\[
A^{2k-1} = \begin{bmatrix} O & C_1 \\ C_2 & O \end{bmatrix}.
\]
Then
\[
A^{2k+1} = A^{2k-1} A^2
= \begin{bmatrix} O & C_1 \\ C_2 & O \end{bmatrix} \begin{bmatrix} D_1 & O \\ O & D_2 \end{bmatrix}
= \begin{bmatrix} OD_1 + C_1 O & OO + C_1 D_2 \\ C_2 D_1 + OO & C_2 O + OD_2 \end{bmatrix}
= \begin{bmatrix} O & C_1 D_2 \\ C_2 D_1 & O \end{bmatrix}.
\]
Thus A2k+1 is of this form as well. So no odd power of A has any entries in the upper left or
lower right block, so there are no odd paths beginning and ending at the same vertex. Thus all
circuits must be of even length. (Note that in the matrix A2k+1 above, we must in fact have
(C1 D2 )T = C2 D1 since powers of a symmetric matrix are symmetric, but we do not need this
fact.)
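A small numerical illustration of this block argument (the 2 × 3 matrix B below is an arbitrary choice of ours, not from the exercise):

```python
# For A in the block form [[O, B], [B^T, O]], every odd power of A has
# zero diagonal blocks, so no circuit has odd length; even powers need not.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[1, 0, 1],
     [1, 1, 0]]
p, q = 2, 3
n = p + q
# Assemble A = [[O, B], [B^T, O]] as a list of rows.
A = [[0] * p + B[i] for i in range(p)] + \
    [[B[i][j] for i in range(p)] + [0] * q for j in range(q)]

def diag_blocks_zero(M):
    return (all(M[i][j] == 0 for i in range(p) for j in range(p)) and
            all(M[i][j] == 0 for i in range(p, n) for j in range(p, n)))

P = A
powers = {1: A}
for k in range(2, 6):
    P = matmul(P, A)
    powers[k] = P

print([diag_blocks_zero(powers[k]) for k in (1, 3, 5)])  # [True, True, True]
print(diag_blocks_zero(powers[2]))                       # False
```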
Chapter Review
1. (a) True. See Exercise 34 in Section 3.2. If A is an m × n matrix, then AT is an n × m matrix. Then
AAT is defined and is an m × m matrix, and AT A is also defined and is an n × n matrix.
(b) False. This is true only when A is invertible. For example, if
\[
A = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad\text{then}\quad
AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},
\]
but A ≠ O.
(c) False. What is true is that XAA−1 = BA−1 so that X = BA−1 . Matrix multiplication is not
commutative, so that BA−1 6= A−1 B in general.
(d) True. This is Theorem 3.11 in Section 3.3.
(e) True. If E performs kRi , then E is a matrix with all zero entries off the diagonal, so it is its
own transpose and thus E T is also the elementary matrix that performs kRi . If E performs
Ri + kRj , then the only off-diagonal entry in E is [E]ij = k, so the only off-diagonal entry of E T
is [E T ]ji = k and thus E T is the elementary matrix performing Rj + kRi . Finally, if E performs
Ri ↔ Rj , then [E]ii = [E]jj = 0 and [E]ij = [E]ji = 1, so that E T = E and thus E T is also an
elementary matrix exchanging Ri and Rj .
(f ) False. Elementary matrices perform a single row operation, while their products can perform multiple row operations. There are many examples. For instance, consider the elementary matrices
corresponding to R1 ↔ R2 and 2R1 :
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 2 & 0 \end{bmatrix},
\]
which is not an elementary matrix.
(g) True. See Theorem 3.21 in Section 3.5.
(h) False. For this to be true, the plane must pass through the origin.
(i) True, since
T (c1 u + c2 v) = −(c1 u + c2 v) = −c1 u − c2 v = c1 (−u) + c2 (−v) = c1 T (u) + c2 T (v).
(j) False; the matrix is 5 × 4, not 4 × 5. See Theorems 3.30 and 3.31 in Section 3.6.
2. Since A is 2 × 2, so is A^2. Since B is 2 × 3, the product A^2 B makes sense. Then
\[
A^2 = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}
= \begin{bmatrix} 7 & 12 \\ 18 & 31 \end{bmatrix},
\quad\text{so that}\quad
A^2 B = \begin{bmatrix} 7 & 12 \\ 18 & 31 \end{bmatrix}
\begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} 50 & -36 & 41 \\ 129 & -93 & 106 \end{bmatrix}.
\]
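A quick arithmetic check of this product with plain nested-list matrices:

```python
# Verify A^2 and A^2 B for Exercise 2.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2], [3, 5]]
B = [[2, 0, -1], [3, -3, 4]]
A2 = matmul(A, A)
A2B = matmul(A2, B)
print(A2)   # [[7, 12], [18, 31]]
print(A2B)  # [[50, -36, 41], [129, -93, 106]]
```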
3. B^2 is not defined, since the number of columns of B is not equal to the number of rows of B. So this
matrix product does not make sense.
4. Since B^T is 3 × 2, A^{-1} is 2 × 2, and B is 2 × 3, this matrix product makes sense as long as A is
invertible. Since det A = 1 · 5 − 2 · 3 = −1, we have
\[
A^{-1} = \frac{1}{-1} \begin{bmatrix} 5 & -2 \\ -3 & 1 \end{bmatrix}
= \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix},
\]
so that
\[
B^T A^{-1} B
= \begin{bmatrix} 2 & 3 \\ 0 & -3 \\ -1 & 4 \end{bmatrix}
\begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}
\begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} -1 & 1 \\ -9 & 3 \\ 17 & -6 \end{bmatrix}
\begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} 1 & -3 & 5 \\ -9 & -9 & 21 \\ 16 & 18 & -41 \end{bmatrix}.
\]
5. Since BB^T is always defined (see Exercise 34 in Section 3.2) and is square, (BB^T)^{-1} makes sense if
BB^T is invertible. We have
\[
BB^T = \begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
\begin{bmatrix} 2 & 3 \\ 0 & -3 \\ -1 & 4 \end{bmatrix}
= \begin{bmatrix} 5 & 2 \\ 2 & 34 \end{bmatrix}.
\]
Then det(BB^T) = 5 · 34 − 2^2 = 166, so that
\[
(BB^T)^{-1} = \frac{1}{166} \begin{bmatrix} 34 & -2 \\ -2 & 5 \end{bmatrix}.
\]
6. Since B^T B is always defined (see Exercise 34 in Section 3.2) and is square, (B^T B)^{-1} makes sense if
B^T B is invertible. We have
\[
B^T B = \begin{bmatrix} 2 & 3 \\ 0 & -3 \\ -1 & 4 \end{bmatrix}
\begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} 13 & -9 & 10 \\ -9 & 9 & -12 \\ 10 & -12 & 17 \end{bmatrix}.
\]
To compute the inverse, we row-reduce [B^T B | I]:
\[
\left[\begin{array}{rrr|rrr} 13 & -9 & 10 & 1 & 0 & 0 \\ -9 & 9 & -12 & 0 & 1 & 0 \\ 10 & -12 & 17 & 0 & 0 & 1 \end{array}\right]
\xrightarrow{\substack{R_2 + \frac{9}{13}R_1 \\ R_3 - \frac{10}{13}R_1}}
\left[\begin{array}{rrr|rrr} 13 & -9 & 10 & 1 & 0 & 0 \\ 0 & \frac{36}{13} & -\frac{66}{13} & \frac{9}{13} & 1 & 0 \\ 0 & -\frac{66}{13} & \frac{121}{13} & -\frac{10}{13} & 0 & 1 \end{array}\right]
\xrightarrow{R_3 + \frac{66}{36}R_2}
\left[\begin{array}{rrr|rrr} 13 & -9 & 10 & 1 & 0 & 0 \\ 0 & \frac{36}{13} & -\frac{66}{13} & \frac{9}{13} & 1 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & \frac{11}{6} & 1 \end{array}\right].
\]
Since the matrix on the left has a zero row, B^T B is not invertible, so (B^T B)^{-1} does not exist.
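The two determinant computations in Exercises 5 and 6 can be checked together (pure Python; `det3` is our own cofactor-expansion helper):

```python
# BB^T is invertible (det 166), while B^T B, a 3x3 matrix of rank at
# most 2, has determinant 0.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

B = [[2, 0, -1], [3, -3, 4]]
BBt = matmul(B, transpose(B))
BtB = matmul(transpose(B), B)
det2 = BBt[0][0]*BBt[1][1] - BBt[0][1]*BBt[1][0]
print(BBt)       # [[5, 2], [2, 34]]
print(det2)      # 166
print(BtB)       # [[13, -9, 10], [-9, 9, -12], [10, -12, 17]]
print(det3(BtB)) # 0
```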
7. The outer product expansion of AA^T is
\[
AA^T = \mathbf{a}_1 A_1 + \mathbf{a}_2 A_2
= \begin{bmatrix} 1 \\ 3 \end{bmatrix} \begin{bmatrix} 1 & 3 \end{bmatrix}
+ \begin{bmatrix} 2 \\ 5 \end{bmatrix} \begin{bmatrix} 2 & 5 \end{bmatrix}
= \begin{bmatrix} 1 & 3 \\ 3 & 9 \end{bmatrix}
+ \begin{bmatrix} 4 & 10 \\ 10 & 25 \end{bmatrix}
= \begin{bmatrix} 5 & 13 \\ 13 & 34 \end{bmatrix}.
\]
8. By Theorem 3.9 in Section 3.3, we know that (A^{-1})^{-1} = A, so we must find the inverse of the given
matrix. Now, det A^{-1} = \frac12 \cdot 4 - (-1)\left(-\frac32\right) = \frac12, so that
\[
A = (A^{-1})^{-1} = \frac{1}{1/2} \begin{bmatrix} 4 & 1 \\ \frac32 & \frac12 \end{bmatrix}
= \begin{bmatrix} 8 & 2 \\ 3 & 1 \end{bmatrix}.
\]

9. Since
\[
AX = B = \begin{bmatrix} -1 & -3 \\ 5 & 0 \\ 3 & -2 \end{bmatrix},
\]
we have X = A^{-1}B. To find A^{-1}, we compute
\[
\left[\begin{array}{rrr|rrr} 1 & 0 & -1 & 1 & 0 & 0 \\ 2 & 3 & -1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{array}\right]
\to
\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 2 & -\frac12 & \frac32 \\ 0 & 1 & 0 & -1 & \frac12 & -\frac12 \\ 0 & 0 & 1 & 1 & -\frac12 & \frac32 \end{array}\right].
\]
Then
\[
X = A^{-1}B
= \begin{bmatrix} 2 & -\frac12 & \frac32 \\ -1 & \frac12 & -\frac12 \\ 1 & -\frac12 & \frac32 \end{bmatrix}
\begin{bmatrix} -1 & -3 \\ 5 & 0 \\ 3 & -2 \end{bmatrix}
= \begin{bmatrix} 0 & -9 \\ 2 & 4 \\ 1 & -6 \end{bmatrix}.
\]
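As a check of this solution (restating A, X, and B as reconstructed above), multiplying A by X should reproduce B:

```python
# Check of Exercise 9: AX = B for the computed X.
A = [[1, 0, -1], [2, 3, -1], [0, 1, 1]]
X = [[0, -9], [2, 4], [1, -6]]
B = [[-1, -3], [5, 0], [3, -2]]
AX = [[sum(A[i][k] * X[k][j] for k in range(3)) for j in range(2)]
      for i in range(3)]
print(AX == B)  # True
```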
10. Since det A = 1 · 6 − 2 · 4 = −2 ≠ 0, A is invertible, so by Theorem 3.12 in Section 3.3, it can be written
as a product of elementary matrices. As in Example 3.29 in Section 3.3, we start by row-reducing A:
\[
A = \begin{bmatrix} 1 & 2 \\ 4 & 6 \end{bmatrix}
\xrightarrow{R_2 - 4R_1} \begin{bmatrix} 1 & 2 \\ 0 & -2 \end{bmatrix}
\xrightarrow{-\frac12 R_2} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}
\xrightarrow{R_1 - 2R_2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]
So by applying
\[
E_1 = \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix}, \quad
E_2 = \begin{bmatrix} 1 & 0 \\ 0 & -\frac12 \end{bmatrix}, \quad
E_3 = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}
\]
in that order to A we get the identity matrix, so that E_3 E_2 E_1 A = I. Since elementary matrices are
invertible, this means that A = (E_3 E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} E_3^{-1}. Since E_1 is R_2 − 4R_1, it follows that
E_1^{-1} is R_2 + 4R_1. Since E_2 is −\frac12 R_2, it follows that E_2^{-1} is −2R_2. Since E_3 is R_1 − 2R_2, it follows that
E_3^{-1} is R_1 + 2R_2. Thus
\[
A = E_1^{-1} E_2^{-1} E_3^{-1}
= \begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}.
\]
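A short check that the product of the inverse elementary matrices recovers A:

```python
# Check of Exercise 10: E1^{-1} E2^{-1} E3^{-1} = A = [[1, 2], [4, 6]].

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

E1_inv = [[1, 0], [4, 1]]   # undoes R2 - 4R1
E2_inv = [[1, 0], [0, -2]]  # undoes -(1/2)R2
E3_inv = [[1, 2], [0, 1]]   # undoes R1 - 2R2
A = matmul(E1_inv, matmul(E2_inv, E3_inv))
print(A)  # [[1, 2], [4, 6]]
```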
11. To show that (I − A)^{-1} = I + A + A^2, it suffices to show that (I + A + A^2)(I − A) = I. But
(I + A + A^2)(I − A) = II − IA + AI − A^2 + A^2 I − A^3 = I − A + A − A^2 + A^2 − A^3 = I − A^3 = I − O = I.
12. Use the multiplier method. We want to reduce A to an upper triangular matrix:
\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 3 & 1 & 1 \\ 2 & -1 & 1 \end{bmatrix}
\xrightarrow{\substack{R_2 - 3R_1 \\ R_3 - 2R_1}}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -2 & -2 \\ 0 & -3 & -1 \end{bmatrix}
\xrightarrow{R_3 - \frac32 R_2}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -2 & -2 \\ 0 & 0 & 2 \end{bmatrix} = U.
\]
Then \ell_{21} = 3, \ell_{31} = 2, and \ell_{32} = \frac32, so that
\[
L = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & \frac32 & 1 \end{bmatrix}.
\]
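A quick check that LU = A with these multipliers (using `fractions.Fraction` so the 3/2 stays exact):

```python
# Check of Exercise 12: the LU factorization reproduces A.
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

L = [[1, 0, 0], [3, 1, 0], [2, Fraction(3, 2), 1]]
U = [[1, 1, 1], [0, -2, -2], [0, 0, 2]]
A = [[1, 1, 1], [3, 1, 1], [2, -1, 1]]
print(matmul(L, U) == A)  # True
```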
13. We follow the comment after Example 3.48 in Section 3.5. Start by finding the reduced row echelon
form of A:
\[
\begin{bmatrix} 2 & -4 & 5 & 8 & 5 \\ 1 & -2 & 2 & 3 & 1 \\ 4 & -8 & 3 & 2 & 6 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_2}
\begin{bmatrix} 1 & -2 & 2 & 3 & 1 \\ 2 & -4 & 5 & 8 & 5 \\ 4 & -8 & 3 & 2 & 6 \end{bmatrix}
\xrightarrow{\substack{R_2 - 2R_1 \\ R_3 - 4R_1}}
\begin{bmatrix} 1 & -2 & 2 & 3 & 1 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & -5 & -10 & 2 \end{bmatrix}
\xrightarrow{R_3 + 5R_2}
\begin{bmatrix} 1 & -2 & 2 & 3 & 1 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 17 \end{bmatrix}
\]
\[
\xrightarrow{\frac{1}{17} R_3}
\begin{bmatrix} 1 & -2 & 2 & 3 & 1 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\xrightarrow{\substack{R_1 - R_3 \\ R_2 - 3R_3}}
\begin{bmatrix} 1 & -2 & 2 & 3 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\xrightarrow{R_1 - 2R_2}
\begin{bmatrix} 1 & -2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.
\]
Thus a basis for row(A) is given by the nonzero rows of the reduced matrix, or
\[
\left\{ \begin{bmatrix} 1 & -2 & 0 & -1 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 1 & 2 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 & 0 & 1 \end{bmatrix} \right\}.
\]
Note that since the matrix has rank 3, all three rows of A are linearly independent, so the rows of the original matrix also
provide a basis for row(A).
A basis for col(A) is given by the column vectors of the original matrix A corresponding to the leading
1s in the row-reduced matrix, so a basis for col(A) is
\[
\left\{ \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix},
\begin{bmatrix} 5 \\ 2 \\ 3 \end{bmatrix},
\begin{bmatrix} 5 \\ 1 \\ 6 \end{bmatrix} \right\}.
\]
Finally, from the row-reduced form, the solutions to Ax = 0, where
\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix},
\]
can be found by noting that x_2 = s and x_4 = t are free variables and that x_1 = 2s + t, x_3 = −2t, and
x_5 = 0. Thus the null space of A consists of the vectors
\[
\begin{bmatrix} 2s + t \\ s \\ -2t \\ t \\ 0 \end{bmatrix}
= s \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ t \begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \\ 0 \end{bmatrix},
\]
so that a basis for the null space is given by
\[
\left\{ \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \\ 0 \end{bmatrix} \right\}.
\]
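A quick check that the null-space basis vectors really satisfy Ax = 0:

```python
# Check of Exercise 13: both basis vectors of null(A) are killed by A.
A = [[2, -4, 5, 8, 5],
     [1, -2, 2, 3, 1],
     [4, -8, 3, 2, 6]]
basis = [[2, 1, 0, 0, 0], [1, 0, -2, 1, 0]]
for v in basis:
    print([sum(a * x for a, x in zip(row, v)) for row in A])  # [0, 0, 0]
```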
14. The row spaces will be the same since by Theorem 3.20 in Section 3.5, if A and B are row-equivalent,
they have the same row spaces. However, the column spaces need not be the same. For example, let
\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\xrightarrow{R_3 - R_1}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} = B.
\]
Then A and B are row-equivalent, but
\[
\operatorname{col}(A) = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right)
\quad\text{while}\quad
\operatorname{col}(B) = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right).
\]
15. If A is invertible, then so is AT . But by Theorem 3.27 in Section 3.5, invertible matrices have trivial
null spaces, so that null(A) = null(AT ) = 0. If A is not invertible, then the null spaces need not be
the same. For example, let
\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\quad\text{so that}\quad
A^T = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}.
\]
Then A^T row reduces to
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\quad\text{so that}\quad
\operatorname{null}(A^T) = \operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right).
\]
But
\[
\operatorname{null}(A) = \operatorname{span}\left( \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right).
\]
Since every vector in null(A^T) has a zero first component, neither of the basis vectors for null(A) is in
null(A^T), so the spaces cannot be equal.
16. If the rows of A add up to zero, then the rows are linearly dependent, so by Theorem 3.27 in Section
3.5, A is not invertible.
17. Since A has n linearly independent columns, rank(A) = n. Theorem 3.28 of Section 3.5 tells us that
rank(AT A) = rank(A) = n, so AT A is invertible since it is an n × n matrix. For the case of AAT ,
since m may be greater than n, there is no reason to believe that this matrix must be invertible. One
counterexample is
\[
A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
A^T = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad
AA^T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\to \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
Thus AA^T is not invertible, since it row-reduces to a matrix with a zero row.
18. The matrix of T is [T] = [T(e_1) \ T(e_2)]. From the given data, we compute
\[
T(\mathbf{e}_1) = T\left( \tfrac12 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \tfrac12 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right)
= \tfrac12 \begin{bmatrix} 2 \\ 3 \end{bmatrix} + \tfrac12 \begin{bmatrix} 0 \\ 5 \end{bmatrix}
= \begin{bmatrix} 1 \\ 4 \end{bmatrix},
\]
\[
T(\mathbf{e}_2) = T\left( \tfrac12 \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \tfrac12 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right)
= \tfrac12 \begin{bmatrix} 2 \\ 3 \end{bmatrix} - \tfrac12 \begin{bmatrix} 0 \\ 5 \end{bmatrix}
= \begin{bmatrix} 1 \\ -1 \end{bmatrix}.
\]
Thus
\[
[T] = \begin{bmatrix} 1 & 1 \\ 4 & -1 \end{bmatrix}.
\]
19. Let R be the rotation and P the projection. The given line has direction vector \begin{bmatrix} 1 \\ -2 \end{bmatrix}. Then from
Examples 3.58 and 3.59 in Section 3.6,
\[
[R] = \begin{bmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{bmatrix}
= \begin{bmatrix} \frac{\sqrt2}{2} & -\frac{\sqrt2}{2} \\ \frac{\sqrt2}{2} & \frac{\sqrt2}{2} \end{bmatrix},
\]
\[
[P] = \frac{1}{1^2 + (-2)^2} \begin{bmatrix} 1 \cdot 1 & 1 \cdot (-2) \\ 1 \cdot (-2) & (-2) \cdot (-2) \end{bmatrix}
= \begin{bmatrix} \frac15 & -\frac25 \\ -\frac25 & \frac45 \end{bmatrix}.
\]
So R followed by P is
\[
[P \circ R] = [P][R]
= \begin{bmatrix} \frac15 & -\frac25 \\ -\frac25 & \frac45 \end{bmatrix}
\begin{bmatrix} \frac{\sqrt2}{2} & -\frac{\sqrt2}{2} \\ \frac{\sqrt2}{2} & \frac{\sqrt2}{2} \end{bmatrix}
= \begin{bmatrix} -\frac{\sqrt2}{10} & -\frac{3\sqrt2}{10} \\ \frac{\sqrt2}{5} & \frac{3\sqrt2}{5} \end{bmatrix}.
\]
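A numerical check of this composition (floating point, so we compare within a tolerance):

```python
# Check of Exercise 19: [P][R] for a 45-degree rotation followed by
# projection onto the line with direction (1, -2).
import math

c = s = math.sqrt(2) / 2
R = [[c, -s], [s, c]]
d = (1, -2)
dd = d[0] ** 2 + d[1] ** 2
P = [[d[i] * d[j] / dd for j in range(2)] for i in range(2)]

PR = [[sum(P[i][k] * R[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
expected = [[-math.sqrt(2) / 10, -3 * math.sqrt(2) / 10],
            [math.sqrt(2) / 5, 3 * math.sqrt(2) / 5]]
print(all(abs(PR[i][j] - expected[i][j]) < 1e-12
          for i in range(2) for j in range(2)))  # True
```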
20. Suppose that cv + dT (v) = 0. We want to show that c = d = 0. Since T is linear and T 2 (v) = 0,
applying T to both sides of this equation gives
T (cv + dT (v)) = cT (v) + dT (T (v)) = cT (v) + dT 2 (v) = cT (v) = 0.
Since T (v) 6= 0, it follows that c = 0. But then cv + dT (v) = dT (v) = 0. Again since T (v) 6= 0, it
follows that d = 0. Thus c = d = 0 and v and T (v) are linearly independent.
Chapter 4
Eigenvalues and Eigenvectors
4.1
Introduction to Eigenvalues and Eigenvectors
1. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 0 & 3 \\ 3 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \cdot 1 + 3 \cdot 1 \\ 3 \cdot 1 + 0 \cdot 1 \end{bmatrix}
= \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = 3\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue 3.
2. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ -3 \end{bmatrix}
= \begin{bmatrix} 1 \cdot 3 + 2 \cdot (-3) \\ 2 \cdot 3 + 1 \cdot (-3) \end{bmatrix}
= \begin{bmatrix} -3 \\ 3 \end{bmatrix} = -\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue −1.
3. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} -1 & 1 \\ 6 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix}
= \begin{bmatrix} -1 \cdot 1 + 1 \cdot (-2) \\ 6 \cdot 1 + 0 \cdot (-2) \end{bmatrix}
= \begin{bmatrix} -3 \\ 6 \end{bmatrix} = -3 \begin{bmatrix} 1 \\ -2 \end{bmatrix} = -3\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue −3.
4. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 4 & -2 \\ 5 & -7 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \end{bmatrix}
= \begin{bmatrix} 4 \cdot 4 - 2 \cdot 2 \\ 5 \cdot 4 - 7 \cdot 2 \end{bmatrix}
= \begin{bmatrix} 12 \\ 6 \end{bmatrix} = 3\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue 3.
5. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & -2 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 3 \cdot 2 + 0 \cdot (-1) + 0 \cdot 1 \\ 0 \cdot 2 + 1 \cdot (-1) - 2 \cdot 1 \\ 1 \cdot 2 + 0 \cdot (-1) + 1 \cdot 1 \end{bmatrix}
= \begin{bmatrix} 6 \\ -3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} = 3\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue 3.
6. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 0 & 1 & -1 \\ 1 & 1 & 1 \\ 1 & 2 & 0 \end{bmatrix} \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \cdot (-2) + 1 \cdot 1 - 1 \cdot 1 \\ 1 \cdot (-2) + 1 \cdot 1 + 1 \cdot 1 \\ 1 \cdot (-2) + 2 \cdot 1 + 0 \cdot 1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = 0\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue 0.
7. We want to show that λ = 3 is an eigenvalue of
\[
A = \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = 3v, i.e., with (A − 3I)v = 0.
So we show that (A − 3I)v = 0 has nontrivial solutions.
\[
A - 3I = \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix} - \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}
= \begin{bmatrix} -1 & 2 \\ 2 & -4 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} -1 & 2 \\ 2 & -4 \end{bmatrix}
\xrightarrow{-R_1} \begin{bmatrix} 1 & -2 \\ 2 & -4 \end{bmatrix}
\xrightarrow{R_2 - 2R_1} \begin{bmatrix} 1 & -2 \\ 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \end{bmatrix} with x = 2y is an eigenvector, so for example one eigenvector corresponding to λ = 3 is \begin{bmatrix} 2 \\ 1 \end{bmatrix}.
8. We want to show that λ = −1 is an eigenvalue of
\[
A = \begin{bmatrix} 2 & 3 \\ 3 & 2 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = −v, i.e., with (A − (−I))v =
(A + I)v = 0. So we show that (A + I)v = 0 has nontrivial solutions.
\[
A + I = \begin{bmatrix} 2 & 3 \\ 3 & 2 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix}
\xrightarrow{R_2 - R_1} \begin{bmatrix} 3 & 3 \\ 0 & 0 \end{bmatrix}
\xrightarrow{\frac13 R_1} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \end{bmatrix} with x = −y is an eigenvector, so for example one eigenvector corresponding to λ = −1 is \begin{bmatrix} 1 \\ -1 \end{bmatrix}.
9. We want to show that λ = 1 is an eigenvalue of
\[
A = \begin{bmatrix} 0 & 4 \\ -1 & 5 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = 1v, i.e., with (A − I)v = 0.
So we show that (A − I)v = 0 has nontrivial solutions.
\[
A - I = \begin{bmatrix} 0 & 4 \\ -1 & 5 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} -1 & 4 \\ -1 & 4 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} -1 & 4 \\ -1 & 4 \end{bmatrix}
\xrightarrow{-R_1} \begin{bmatrix} 1 & -4 \\ -1 & 4 \end{bmatrix}
\xrightarrow{R_2 + R_1} \begin{bmatrix} 1 & -4 \\ 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \end{bmatrix} with x = 4y is an eigenvector, so for example one eigenvector corresponding to λ = 1 is \begin{bmatrix} 4 \\ 1 \end{bmatrix}.
10. We want to show that λ = −6 is an eigenvalue of
\[
A = \begin{bmatrix} 4 & -2 \\ 5 & -7 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = −6v, i.e., with (A − (−6I))v =
(A + 6I)v = 0. So we show that (A + 6I)v = 0 has nontrivial solutions.
\[
A + 6I = \begin{bmatrix} 4 & -2 \\ 5 & -7 \end{bmatrix} + \begin{bmatrix} 6 & 0 \\ 0 & 6 \end{bmatrix}
= \begin{bmatrix} 10 & -2 \\ 5 & -1 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} 10 & -2 \\ 5 & -1 \end{bmatrix}
\xrightarrow{\frac{1}{10} R_1} \begin{bmatrix} 1 & -\frac15 \\ 5 & -1 \end{bmatrix}
\xrightarrow{R_2 - 5R_1} \begin{bmatrix} 1 & -\frac15 \\ 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \end{bmatrix} with x = \frac15 y is an eigenvector, so for example one eigenvector corresponding to λ = −6 is \begin{bmatrix} 1 \\ 5 \end{bmatrix}.
11. We want to show that λ = −1 is an eigenvalue of
\[
A = \begin{bmatrix} 1 & 0 & 2 \\ -1 & 1 & 1 \\ 2 & 0 & 1 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = −1v, i.e., with (A + I)v = 0.
So we show that (A + I)v = 0 has nontrivial solutions.
\[
A + I = \begin{bmatrix} 1 & 0 & 2 \\ -1 & 1 & 1 \\ 2 & 0 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 2 \\ -1 & 2 & 1 \\ 2 & 0 & 2 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} 2 & 0 & 2 \\ -1 & 2 & 1 \\ 2 & 0 & 2 \end{bmatrix}
\xrightarrow{R_3 - R_1} \begin{bmatrix} 2 & 0 & 2 \\ -1 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix} -1 & 2 & 1 \\ 2 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{-R_1} \begin{bmatrix} 1 & -2 & -1 \\ 2 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{R_2 - 2R_1} \begin{bmatrix} 1 & -2 & -1 \\ 0 & 4 & 4 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{\frac14 R_2} \begin{bmatrix} 1 & -2 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{R_1 + 2R_2} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \\ z \end{bmatrix} with x = y = −z is an eigenvector, so for example one eigenvector
corresponding to λ = −1 is \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}.
12. We want to show that λ = 2 is an eigenvalue of
\[
A = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 1 & 1 \\ 4 & 2 & 0 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = 2v, i.e., with (A − 2I)v = 0.
So we show that (A − 2I)v = 0 has nontrivial solutions.
\[
A - 2I = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 1 & 1 \\ 4 & 2 & 0 \end{bmatrix} - \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 1 & -1 \\ 1 & -1 & 1 \\ 4 & 2 & -2 \end{bmatrix}.
\]
Row-reducing, we have
\[
\begin{bmatrix} 1 & 1 & -1 \\ 1 & -1 & 1 \\ 4 & 2 & -2 \end{bmatrix}
\xrightarrow{\substack{R_2 - R_1 \\ R_3 - 4R_1}} \begin{bmatrix} 1 & 1 & -1 \\ 0 & -2 & 2 \\ 0 & -2 & 2 \end{bmatrix}
\xrightarrow{R_3 - R_2} \begin{bmatrix} 1 & 1 & -1 \\ 0 & -2 & 2 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{-\frac12 R_2} \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{R_1 - R_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \\ z \end{bmatrix} with x = 0 and y = z is an eigenvector. Thus for example \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} is an
eigenvector for λ = 2.
13. Since A is the reflection F in the y-axis, the only vectors that F maps parallel to themselves are vectors
parallel to the x-axis (multiples of \begin{bmatrix} 1 \\ 0 \end{bmatrix}), which are reversed (eigenvalue λ = −1), and vectors parallel
to the y-axis (multiples of \begin{bmatrix} 0 \\ 1 \end{bmatrix}), which are sent to themselves (eigenvalue λ = 1). So λ = ±1 are the
eigenvalues of A, with eigenspaces
\[
E_{-1} = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right),
\qquad
E_1 = \operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right).
\]
(Figure omitted.)
14. The only vectors that the transformation F corresponding to A maps parallel to themselves are vectors perpendicular
to y = x, which are reversed (so correspond to the eigenvalue λ = −1), and vectors parallel
to y = x, which are sent to themselves (so correspond to the eigenvalue λ = 1). Vectors perpendicular
to y = x are multiples of \begin{bmatrix} 1 \\ -1 \end{bmatrix}; vectors parallel to y = x are multiples of \begin{bmatrix} 1 \\ 1 \end{bmatrix}, so the eigenspaces are
\[
E_{-1} = \operatorname{span}\left( \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right),
\qquad
E_1 = \operatorname{span}\left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right).
\]
(Figure omitted.)
15. If x is an eigenvector of A, then Ax is a multiple of x:
\[
A\mathbf{x} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ 0 \end{bmatrix}.
\]
Then clearly any vector parallel to the x-axis, i.e., of the form \begin{bmatrix} x \\ 0 \end{bmatrix}, is an eigenvector for A with
eigenvalue 1. But this projection also transforms vectors parallel to the y-axis to the zero vector, since
the projection of such a vector on the x-axis is 0. So vectors of the form \begin{bmatrix} 0 \\ y \end{bmatrix} are eigenvectors for A
with eigenvalue 0. Finally, a vector of the form \begin{bmatrix} x \\ y \end{bmatrix} where neither x nor y is zero is not an eigenvector:
it is transformed into \begin{bmatrix} x \\ 0 \end{bmatrix}, and comparing the x-coordinates shows that λ would have to be 1, while
comparing y-coordinates shows that it would have to be zero. Thus there is no such λ. So in summary,
λ = 1 is an eigenvalue with eigenspace E_1 = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right), and
λ = 0 is an eigenvalue with eigenspace E_0 = \operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right).
16. This projection sends a vector to its projection on the given line. A vector parallel to the direction
vector of that line will be sent to itself, since its projection on the line is the vector itself. A vector
orthogonal to the direction vector of the line will be sent to the zero vector, since its projection on the
line is the zero vector. Since the direction vector is \begin{bmatrix} \frac45 \\ \frac35 \end{bmatrix}, or \begin{bmatrix} 4 \\ 3 \end{bmatrix}, there are two eigenspaces:
\[
\operatorname{span}\left( \begin{bmatrix} 4 \\ 3 \end{bmatrix} \right),\ \lambda = 1;
\qquad
\operatorname{span}\left( \begin{bmatrix} -3 \\ 4 \end{bmatrix} \right),\ \lambda = 0.
\]
If a vector v is neither parallel to nor orthogonal to the direction vector of the line, then the projection
transforms v to a vector parallel to the direction vector, thus not parallel to v. Thus the image of v
cannot be a multiple of v. Hence the eigenvalues and eigenspaces above are the only ones.
17. Consider the action of A on a vector \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}. If y = 0, then A multiplies x by 2, so this gives an
eigenspace for λ = 2. If x = 0, then A multiplies x by 3, so this gives an eigenspace for λ = 3. If neither x
nor y is zero, then \begin{bmatrix} x \\ y \end{bmatrix} is mapped to \begin{bmatrix} 2x \\ 3y \end{bmatrix}, which is not a multiple of \begin{bmatrix} x \\ y \end{bmatrix}. So the only eigenspaces are
\[
\operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right),\ \lambda = 2;
\qquad
\operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right),\ \lambda = 3.
\]
18. Since A rotates the plane 90◦ counterclockwise around the origin, any nonzero vector is taken to
a vector orthogonal to itself. So there are no nonzero vectors that are taken to vectors parallel to
themselves (that is, to multiples of themselves). The only vector taken to a multiple of itself is the
zero vector; this is not an eigenvector since eigenvectors are by definition nonzero. So this matrix has
no eigenvectors and no eigenvalues. Analytically, the matrix of A is
\[
A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
= \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},
\]
so that
\[
A\mathbf{x} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} -y \\ x \end{bmatrix}.
\]
If this is a multiple a of the original vector, then −y = ax and x = ay. Substituting ay for x in the
first equation gives −y = a^2 y, which is impossible unless y = 0. But then the second equation forces
x = 0. So again we see that there are no eigenvectors.
19. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves,
by A, we look for unit vectors whose image under A is in the same direction. It appears that these vectors
lie along the x- and y-axes, so the eigenspaces are E_1 = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) and E_2 = \operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right).
The resultant vectors in the first eigenspace appear to have the same length as the original vector, so that λ_1 = 1. The resultant vectors in the second eigenspace are twice the original
length, so that λ_2 = 2.
20. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves,
by A, we look for unit vectors whose image under A is in the same direction. The only vectors satisfying
this criterion are those parallel to the line y = x, i.e., \operatorname{span}\left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right), so this is the only eigenspace. The
terminal point of the line in the direction \begin{bmatrix} 1 \\ 1 \end{bmatrix} appears to be the point (2, 5), so that it has a length of
\sqrt{29}. Since the original vector is a unit vector, the eigenvalue is \sqrt{29}.
21. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves,
by A, we look for unit vectors whose image under A is in the same direction. The vectors parallel to
the line y = x, i.e., \operatorname{span}\left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right), satisfy this criterion, so this is an eigenspace E_1. The vectors parallel
to the line y = −x, i.e., \operatorname{span}\left( \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right), also satisfy this criterion, so this is another eigenspace E_2. For
E_1, the terminal point of the image along y = x appears to be (2, 2), so its length is 2\sqrt{2}. Since the
original vector is a unit vector, the corresponding eigenvalue is λ_1 = 2\sqrt{2}. For E_2, the resultant vectors
have zero length, so λ_2 = 0.
22. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves,
by A, we look for unit vectors whose image under A is in the same direction. There are none — every
vector is “bent” to the left by the transformation. So this transformation has no eigenvectors.
23. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det\left( \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \right)
= \det \begin{bmatrix} 4-\lambda & -1 \\ 2 & 1-\lambda \end{bmatrix}
= (4-\lambda)(1-\lambda) - 2\cdot(-1) = \lambda^2 - 5\lambda + 6 = (\lambda-2)(\lambda-3).
\]
Thus the eigenvalues are 2 and 3.
To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I:
\[
\bigl[\,A - 2I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 2 & -1 & 0 \\ 2 & -1 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & -\frac12 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form
\[
\begin{bmatrix} \frac12 t \\ t \end{bmatrix}, \quad\text{or}\quad \begin{bmatrix} t \\ 2t \end{bmatrix}.
\]
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 2 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1 \\ 2 \end{bmatrix}
= \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix}
= \begin{bmatrix} 4\cdot1 - 1\cdot2 \\ 2\cdot1 + 1\cdot2 \end{bmatrix}
= \begin{bmatrix} 2 \\ 4 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
\]
To find the eigenspace corresponding to λ = 3, we must find the null space of A − 3I:
\[
\bigl[\,A - 3I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 2 & -2 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 1 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 4\cdot1 - 1\cdot1 \\ 2\cdot1 + 1\cdot1 \end{bmatrix}
= \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The plot (omitted) shows the effect of A on the eigenvectors v = \begin{bmatrix} 1 \\ 2 \end{bmatrix} and w = \begin{bmatrix} 1 \\ 1 \end{bmatrix}: v is taken to 2v and w to 3w.
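Since A is 2 × 2, its characteristic polynomial is λ² − (tr A)λ + det A; a short check of this and of the two eigenvector equations:

```python
# Check of Exercise 23: coefficients of the characteristic polynomial
# and the eigenvector equations Av = 2v and Aw = 3w.
A = [[4, -1], [2, 1]]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(trace, det)  # 5 6, i.e. lambda^2 - 5 lambda + 6

def apply(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

print(apply(A, [1, 2]))  # [2, 4] = 2 * [1, 2]
print(apply(A, [1, 1]))  # [3, 3] = 3 * [1, 1]
```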
242
CHAPTER 4. EIGENVALUES AND EIGENVECTORS
24. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 2-\lambda & 4 \\ 6 & -\lambda \end{bmatrix}
= (2-\lambda)(-\lambda) - 4 \cdot 6 = \lambda^2 - 2\lambda - 24 = (\lambda - 6)(\lambda + 4).
\]
Thus the eigenvalues are −4 and 6.
To find the eigenspace corresponding to λ = −4, we must find the null space of A + 4I:
\[
\bigl[\,A + 4I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 6 & 4 & 0 \\ 6 & 4 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & \frac23 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form
\[
\begin{bmatrix} -\frac23 t \\ t \end{bmatrix}, \quad\text{or}\quad \begin{bmatrix} -2t \\ 3t \end{bmatrix}.
\]
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} -2 \\ 3 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} -2 \\ 3 \end{bmatrix}
= \begin{bmatrix} 2 & 4 \\ 6 & 0 \end{bmatrix} \begin{bmatrix} -2 \\ 3 \end{bmatrix}
= \begin{bmatrix} 2\cdot(-2) + 4\cdot3 \\ 6\cdot(-2) + 0\cdot3 \end{bmatrix}
= \begin{bmatrix} 8 \\ -12 \end{bmatrix} = -4 \begin{bmatrix} -2 \\ 3 \end{bmatrix}.
\]
To find the eigenspace corresponding to λ = 6, we must find the null space of A − 6I:
\[
\bigl[\,A - 6I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} -4 & 4 & 0 \\ 6 & -6 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 1 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2\cdot1 + 4\cdot1 \\ 6\cdot1 + 0\cdot1 \end{bmatrix}
= \begin{bmatrix} 6 \\ 6 \end{bmatrix} = 6 \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The plot (omitted) shows the effect of A on the eigenvectors v = \begin{bmatrix} -2 \\ 3 \end{bmatrix} and w = \begin{bmatrix} 1 \\ 1 \end{bmatrix}: v is taken to −4v and w to 6w.
25. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 2-\lambda & 5 \\ 0 & 2-\lambda \end{bmatrix}
= (2-\lambda)(2-\lambda) - 5 \cdot 0 = (\lambda-2)^2.
\]
Thus the only eigenvalue is λ = 2.
To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I:
\[
\bigl[\,A - 2I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 0 & 5 & 0 \\ 0 & 0 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} t \\ 0 \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 0 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 2 & 5 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 2\cdot1 + 5\cdot0 \\ 0\cdot1 + 2\cdot0 \end{bmatrix}
= \begin{bmatrix} 2 \\ 0 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
\]
The plot (omitted) shows the effect of A on the eigenvector v = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, which is taken to 2v.
26. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 1-\lambda & 2 \\ -2 & 3-\lambda \end{bmatrix}
= (1-\lambda)(3-\lambda) + 4 = \lambda^2 - 4\lambda + 7.
\]
Since this quadratic has no (real) roots, A has no (real) eigenvalues.
27. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 1-\lambda & 1 \\ -1 & 1-\lambda \end{bmatrix}
= (1-\lambda)(1-\lambda) - 1\cdot(-1) = \lambda^2 - 2\lambda + 2.
\]
Using the quadratic formula, this quadratic has roots
\[
\lambda = \frac{2 \pm \sqrt{2^2 - 4 \cdot 2}}{2} = \frac{2 \pm 2i}{2} = 1 \pm i.
\]
Thus the eigenvalues are 1 + i and 1 − i.
To find the eigenspace corresponding to λ = 1 + i, we must find the null space of A − (1 + i)I:
\[
\bigl[\,A - (1+i)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} -i & 1 & 0 \\ -1 & -i & 0 \end{array}\right]
\xrightarrow{iR_1} \left[\begin{array}{rr|r} 1 & i & 0 \\ -1 & -i & 0 \end{array}\right]
\xrightarrow{R_2 + R_1} \left[\begin{array}{rr|r} 1 & i & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} -it \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} -i \\ 1 \end{bmatrix}.
To find the eigenspace corresponding to λ = 1 − i, we must find the null space of A − (1 − i)I:
\[
\bigl[\,A - (1-i)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} i & 1 & 0 \\ -1 & i & 0 \end{array}\right]
\xrightarrow{-iR_1} \left[\begin{array}{rr|r} 1 & -i & 0 \\ -1 & i & 0 \end{array}\right]
\xrightarrow{R_2 + R_1} \left[\begin{array}{rr|r} 1 & -i & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} it \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} i \\ 1 \end{bmatrix}.
28. We wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 2-\lambda & -3 \\ 1 & -\lambda \end{bmatrix}
= (2-\lambda)(-\lambda) - (-3) \cdot 1 = \lambda^2 - 2\lambda + 3.
\]
Using the quadratic formula, this has roots \lambda = \frac{2 \pm \sqrt{4 - 4 \cdot 3}}{2} = 1 \pm i\sqrt{2}, so these are the eigenvalues.
To find the eigenspace corresponding to λ = 1 + i\sqrt{2}, we must find the null space of A − (1 + i\sqrt{2})I:
\[
\bigl[\,A - (1+i\sqrt2)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 1-i\sqrt2 & -3 & 0 \\ 1 & -1-i\sqrt2 & 0 \end{array}\right]
\xrightarrow{R_1 \leftrightarrow R_2}
\left[\begin{array}{rr|r} 1 & -1-i\sqrt2 & 0 \\ 1-i\sqrt2 & -3 & 0 \end{array}\right]
\xrightarrow{R_2 - (1-i\sqrt2)R_1}
\left[\begin{array}{rr|r} 1 & -1-i\sqrt2 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} (1+i\sqrt2)t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1+i\sqrt2 \\ 1 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1+i\sqrt2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2(1+i\sqrt2) - 3 \cdot 1 \\ (1+i\sqrt2) + 0 \cdot 1 \end{bmatrix}
= \begin{bmatrix} -1+2i\sqrt2 \\ 1+i\sqrt2 \end{bmatrix}
= (1+i\sqrt2) \begin{bmatrix} 1+i\sqrt2 \\ 1 \end{bmatrix}.
\]
To find the eigenspace corresponding to λ = 1 − i\sqrt{2}, we must find the null space of A − (1 − i\sqrt{2})I:
\[
\bigl[\,A - (1-i\sqrt2)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 1+i\sqrt2 & -3 & 0 \\ 1 & -1+i\sqrt2 & 0 \end{array}\right]
\xrightarrow{R_1 \leftrightarrow R_2}
\left[\begin{array}{rr|r} 1 & -1+i\sqrt2 & 0 \\ 1+i\sqrt2 & -3 & 0 \end{array}\right]
\xrightarrow{R_2 - (1+i\sqrt2)R_1}
\left[\begin{array}{rr|r} 1 & -1+i\sqrt2 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} (1-i\sqrt2)t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1-i\sqrt2 \\ 1 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1-i\sqrt2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2(1-i\sqrt2) - 3 \cdot 1 \\ (1-i\sqrt2) + 0 \cdot 1 \end{bmatrix}
= \begin{bmatrix} -1-2i\sqrt2 \\ 1-i\sqrt2 \end{bmatrix}
= (1-i\sqrt2) \begin{bmatrix} 1-i\sqrt2 \\ 1 \end{bmatrix}.
\]
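The complex eigenpair can be checked directly with Python's built-in complex arithmetic:

```python
# Check of Exercise 28: for lambda = 1 + i*sqrt(2) and
# v = (1 + i*sqrt(2), 1), we verify Av = lambda * v numerically.
import cmath

A = [[2, -3], [1, 0]]
lam = 1 + cmath.sqrt(2) * 1j
v = (1 + cmath.sqrt(2) * 1j, 1)
Av = (A[0][0] * v[0] + A[0][1] * v[1],
      A[1][0] * v[0] + A[1][1] * v[1])
print(all(abs(Av[i] - lam * v[i]) < 1e-12 for i in range(2)))  # True
```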
29. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 1-\lambda & i \\ i & 1-\lambda \end{bmatrix}
= (1-\lambda)(1-\lambda) - i \cdot i = \lambda^2 - 2\lambda + 2.
\]
Using the quadratic formula, this quadratic has roots
\[
\lambda = \frac{2 \pm \sqrt{2^2 - 4 \cdot 2}}{2} = \frac{2 \pm 2i}{2} = 1 \pm i.
\]
Thus the eigenvalues are 1 + i and 1 − i.
To find the eigenspace corresponding to λ = 1 + i, we must find the null space of A − (1 + i)I:
\[
\bigl[\,A - (1+i)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} -i & i & 0 \\ i & -i & 0 \end{array}\right]
\xrightarrow{iR_1} \left[\begin{array}{rr|r} 1 & -1 & 0 \\ i & -i & 0 \end{array}\right]
\xrightarrow{R_2 - iR_1} \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
To find the eigenspace corresponding to λ = 1 − i, we must find the null space of A − (1 − i)I:
\[
\bigl[\,A - (1-i)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} i & i & 0 \\ i & i & 0 \end{array}\right]
\xrightarrow{-iR_1} \left[\begin{array}{rr|r} 1 & 1 & 0 \\ i & i & 0 \end{array}\right]
\xrightarrow{R_2 - iR_1} \left[\begin{array}{rr|r} 1 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} -t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
246
CHAPTER 4. EIGENVALUES AND EIGENVECTORS
30. Using the method of Example 4.5, we wish to find solutions to the equation

0 = det(A − λI) = det([0 1+i; 1−i 1] − λ[1 0; 0 1]) = det [−λ 1+i; 1−i 1−λ] = (−λ)(1 − λ) − (1 + i)(1 − i) = λ² − λ − 2 = (λ − 2)(λ + 1).

Thus the eigenvalues are 2 and −1.

To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I:

[A − 2I | 0] = [−2 1+i | 0; 1−i −1 | 0] →(−(1/2)R1) [1 −(1+i)/2 | 0; 1−i −1 | 0] →(R2 − (1 − i)R1) [1 −(1+i)/2 | 0; 0 0 | 0],

so that eigenvectors corresponding to this eigenvalue have the form

[(1/2)(1 + i)t; t], or, clearing fractions, [(1 + i)t; 2t].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [1 + i; 2].

To find the eigenspace corresponding to λ = −1, we must find the null space of A − (−I) = A + I:

[A + I | 0] = [1 1+i | 0; 1−i 2 | 0] →(R2 − (1 − i)R1) [1 1+i | 0; 0 0 | 0],

so that eigenvectors corresponding to this eigenvalue have the form

[−(1 + i)t; t].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [−1 − i; 1].
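As a quick numerical check (not part of the printed solution; the entries of A are those inferred from the exercise's arithmetic), the trace, determinant, and the eigenpair for λ = 2 can be verified directly:

```python
# A from Exercise 30.
A = [[0, 1 + 1j], [1 - 1j, 1]]
tr = A[0][0] + A[1][1]                        # should equal lam1 + lam2 = 2 + (-1) = 1
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # should equal lam1 * lam2 = -2
v = (1 + 1j, 2)                               # claimed eigenvector for lam = 2
Av = (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])
err = max(abs(Av[0] - 2 * v[0]), abs(Av[1] - 2 * v[1]))
```

Both invariants match the factorization (λ − 2)(λ + 1), and `err` is zero.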
31. Using the method of Example 4.5, we wish to find solutions in Z3 to the equation

0 = det(A − λI) = det([1 0; 1 2] − λ[1 0; 0 1]) = det [1−λ 0; 1 2−λ] = (1 − λ)(2 − λ) − 0 · 1 = (λ − 1)(λ − 2).

Thus the eigenvalues are 1 and 2.
32. Using the method of Example 4.5, we wish to find solutions in Z3 to the equation

0 = det(A − λI) = det([2 1; 1 2] − λ[1 0; 0 1]) = det [2−λ 1; 1 2−λ] = (2 − λ)(2 − λ) − 1 · 1 = λ² − 4λ + 4 − 1 = λ² + 2λ = λ(λ + 2) = λ(λ − 1).

Thus the eigenvalues are 0 and 1.
33. Using the method of Example 4.5, we wish to find solutions in Z5 to the equation

0 = det(A − λI) = det([3 1; 4 0] − λ[1 0; 0 1]) = det [3−λ 1; 4 −λ] = (3 − λ)(−λ) − 1 · 4 = λ² − 3λ − 4 = λ² + 2λ + 1 = (λ + 1)² = (λ − 4)².

Thus the only eigenvalue is 4.
34. Using the method of Example 4.5, we wish to find solutions in Z5 to the equation

0 = det(A − λI) = det([1 4; 4 0] − λ[1 0; 0 1]) = det [1−λ 4; 4 −λ] = (1 − λ)(−λ) − 4 · 4 = λ² − λ − 16 = λ² + 4λ + 4 = (λ + 2)² = (λ − 3)².
Thus the only eigenvalue is 3.
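Since the fields in Exercises 31–34 are finite, the eigenvalue claims can be spot-checked by brute force. The sketch below (an optional check, not part of the printed solution) tests every λ in Z_p against the characteristic polynomial:

```python
def eigenvalues_mod_p(A, p):
    """Eigenvalues of a 2x2 matrix over Z_p, found by testing every lambda in Z_p."""
    (a, b), (c, d) = A
    return [lam for lam in range(p)
            if ((a - lam) * (d - lam) - b * c) % p == 0]

evs33 = eigenvalues_mod_p([[3, 1], [4, 0]], 5)   # Exercise 33: expect only 4
evs34 = eigenvalues_mod_p([[1, 4], [4, 0]], 5)   # Exercise 34: expect only 3
```

Exhaustive search is a reasonable design here because Z_p has only p candidate eigenvalues.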
35. (a) To find the eigenvalues of A, find solutions to the equation

0 = det(A − λI) = det([a b; c d] − λ[1 0; 0 1]) = det [a−λ b; c d−λ] = (a − λ)(d − λ) − bc = λ² − (a + d)λ + (ad − bc) = λ² − tr(A)λ + det A.

(b) Using the quadratic formula to find the roots of this quadratic gives

λ = (tr(A) ± √((tr(A))² − 4 det A))/2
  = (a + d ± √(a² + 2ad + d² − 4ad + 4bc))/2
  = (1/2)(a + d ± √(a² − 2ad + d² + 4bc))
  = (1/2)(a + d ± √((a − d)² + 4bc)).

(c) Let

λ1 = (1/2)(a + d + √((a − d)² + 4bc)),   λ2 = (1/2)(a + d − √((a − d)² + 4bc))

be the two eigenvalues. Then

λ1 + λ2 = (1/2)(a + d + √((a − d)² + 4bc)) + (1/2)(a + d − √((a − d)² + 4bc)) = (1/2)(a + d + a + d) = a + d = tr(A),

λ1 λ2 = (1/2)(a + d + √((a − d)² + 4bc)) · (1/2)(a + d − √((a − d)² + 4bc))
      = (1/4)((a + d)² − ((a − d)² + 4bc))
      = (1/4)(a² + 2ad + d² − a² + 2ad − d² − 4bc)
      = (1/4)(4ad − 4bc) = det A.
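The formula in part (b) and the identities in part (c) can be sanity-checked numerically. The sketch below (an optional check with arbitrary sample entries, not part of the printed solution) computes both eigenvalues and confirms that their sum is the trace and their product the determinant:

```python
import cmath

def eigs_2x2(a, b, c, d):
    """Roots of lambda^2 - (a+d)*lambda + (ad - bc), per part (b) of Exercise 35."""
    s = cmath.sqrt((a - d) ** 2 + 4 * b * c)
    return (a + d + s) / 2, (a + d - s) / 2

l1, l2 = eigs_2x2(3.0, 2.0, -1.0, 5.0)   # sample matrix [[3, 2], [-1, 5]]
trace_err = abs((l1 + l2) - (3.0 + 5.0))
det_err = abs(l1 * l2 - (3.0 * 5.0 - 2.0 * (-1.0)))
```

Using `cmath.sqrt` keeps the formula valid even when the discriminant (a − d)² + 4bc is negative.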
λ1 + λ2 =
36. A quadratic equation will have two distinct real roots when its discriminant (the quantity under the square root sign in the quadratic formula) is positive, one real root when it is zero, and no real roots when it is negative. In Exercise 35, the discriminant is (a − d)² + 4bc.

(a) For A to have two distinct real eigenvalues, we must have (a − d)² + 4bc > 0.

(b) For A to have a single real eigenvalue, it must be the case that (a − d)² + 4bc = 0, so that (a − d)² = −4bc. Note that this implies that either a = d and at least one of b or c is zero, or that a ≠ d and either b or c, but not both, is negative.

(c) For A to have no real eigenvalues, we must have (a − d)² + 4bc < 0.
37. From Exercise 35, since c = 0, the eigenvalues are

(1/2)(a + d ± √((a − d)² + 4b · 0)) = (1/2)(a + d ± |a − d|),

and (1/2)(a + d + (a − d)) = a while (1/2)(a + d − (a − d)) = d, so the eigenvalues are a and d.

To find the eigenspace corresponding to λ = a, we must find the null space of A − aI:

[A − aI | 0] = [0 b | 0; 0 d−a | 0].

If b = 0 and a = d, then this is the zero matrix, so that any vector is an eigenvector; a basis for the eigenspace is

{[1; 0], [0; 1]}.

The reason for this is that under these conditions, the matrix A is a multiple of the identity matrix, so it simply stretches any vector by a factor of a = d. Otherwise, either b ≠ 0 or d − a ≠ 0, so the above matrix row-reduces to

[0 1 | 0; 0 0 | 0],

so that eigenvectors corresponding to this eigenvalue have the form

[t; 0].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [1; 0]. As a check,

A[1; 0] = [a · 1 + b · 0; 0 · 1 + d · 0] = [a; 0] = a[1; 0].

To find the eigenspace corresponding to λ = d, we must find the null space of A − dI:

[A − dI | 0] = [a−d b | 0; 0 0 | 0].

If b = 0 and a = d, then this is again the zero matrix, which we dealt with above. Otherwise, either b ≠ 0 or d − a ≠ 0, so that eigenvectors corresponding to this eigenvalue have the form

[−bt; (a − d)t], or [bt; (d − a)t].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [b; d−a]. As a check,

A[b; d−a] = [a · b + b · (d − a); 0 · b + d · (d − a)] = [bd; d(d − a)] = d[b; d−a].
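The two eigenpairs of the triangular matrix [a b; 0 d] found above can be checked mechanically. The sketch below (an optional check with sample values b ≠ 0 and a ≠ d, not part of the printed solution) verifies both claimed eigenvectors:

```python
# Sample upper-triangular matrix [[a, b], [0, d]] with b != 0 and a != d.
a, b, d = 2.0, 3.0, 7.0
# lambda = a with eigenvector (1, 0):
e1 = (a * 1 + b * 0, 0 * 1 + d * 0)   # A(1, 0) = (a, 0) = a*(1, 0)
# lambda = d with eigenvector (b, d - a):
v = (b, d - a)
e2 = (a * v[0] + b * v[1], 0 * v[0] + d * v[1])   # should equal d*v
```

Both products come out as exact scalar multiples of the claimed eigenvectors.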
38. Using the method of Example 4.5, we wish to find solutions to the equation

0 = det(A − λI) = det([a b; −b a] − λ[1 0; 0 1]) = det [a−λ b; −b a−λ] = (a − λ)(a − λ) − b · (−b) = λ² − 2aλ + (a² + b²).

Using Exercise 35, the eigenvalues are

(1/2)(a + a ± √((a − a)² + 4b · (−b))) = a ± (1/2)√(−4b²) = a ± bi.

To find the eigenspace corresponding to λ = a + bi, we must find the null space of A − (a + bi)I:

[A − (a + bi)I | 0] = [−bi b | 0; −b −bi | 0].

If b = 0 then this is the zero matrix, so that any vector is an eigenvector; a basis for the eigenspace is

{[1; 0], [0; 1]}.

The reason for this is that under these conditions, the matrix A is a times the identity matrix, so it simply stretches any vector by a factor of a. Otherwise, b ≠ 0, so we can row-reduce the matrix:

[−bi b | 0; −b −bi | 0] →((i/b)R1) [1 i | 0; −b −bi | 0] →(R2 + bR1) [1 i | 0; 0 0 | 0].

Thus eigenvectors corresponding to this eigenvalue have the form

[−it; t], or, multiplying by i, [t; it].

A basis for this eigenspace is given by choosing t = 1 in the second form, to get [1; i]. As a check,

A[1; i] = [a + bi; −b + ai] = (a + bi)[1; i].

To find the eigenspace corresponding to λ = a − bi, we must find the null space of A − (a − bi)I:

[A − (a − bi)I | 0] = [bi b | 0; −b bi | 0].

If b = 0 then this is the zero matrix, which we dealt with above. Otherwise, b ≠ 0, so we can row-reduce the matrix:

[bi b | 0; −b bi | 0] →(−(i/b)R1) [1 −i | 0; −b bi | 0] →(R2 + bR1) [1 −i | 0; 0 0 | 0].

Thus eigenvectors corresponding to this eigenvalue have the form

[it; t].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [i; 1]. As a check,

A[i; 1] = [ai + b; −bi + a] = [b + ai; a − bi] = (a − bi)[i; 1].
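The eigenpair (a + bi, [1; i]) of the rotation-scaling matrix [a b; −b a] can be checked numerically. The sketch below (an optional check with sample values of a and b, not part of the printed solution) verifies the eigenvector equation over the complex numbers:

```python
# A = [[a, b], [-b, a]] with sample values; claimed eigenpair (a + bi, (1, i)).
a, b = 2.0, 3.0
lam = complex(a, b)
v = (1, 1j)
Av = (a * v[0] + b * v[1], -b * v[0] + a * v[1])
err = max(abs(Av[0] - lam * v[0]), abs(Av[1] - lam * v[1]))
```

`err` is exactly zero here, since every product involved is exact in floating point for these sample values.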
4.2 Determinants
1. Expanding along the first row gives

det [1 0 3; 5 1 1; 0 1 2] = 1 · det [1 1; 1 2] − 0 · det [5 1; 0 2] + 3 · det [5 1; 0 1] = 1(1 · 2 − 1 · 1) − 0 + 3(5 · 1 − 1 · 0) = 16.

Expanding along the first column gives

det [1 0 3; 5 1 1; 0 1 2] = 1 · det [1 1; 1 2] − 5 · det [0 3; 1 2] + 0 · det [0 3; 1 1] = 1(1 · 2 − 1 · 1) − 5(0 · 2 − 3 · 1) + 0 = 16.
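Cofactor expansion along the first row, as used throughout Exercises 1–6, is easy to mechanize. The short recursive sketch below (an optional check, not part of the printed solution) implements it and recomputes the determinant of the matrix in Exercise 1:

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 1 and column j + 1.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

d1 = det([[1, 0, 3], [5, 1, 1], [0, 1, 2]])   # the matrix of Exercise 1
```

Recursion mirrors the inductive definition of the determinant, though it is exponential in n and only practical for small matrices.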
2. Expanding along the first row gives

det [0 1 −1; 2 3 −2; −1 3 0] = 0 · det [3 −2; 3 0] − 1 · det [2 −2; −1 0] + (−1) · det [2 3; −1 3] = 0 − 1(2 · 0 − (−2) · (−1)) − 1(2 · 3 − 3 · (−1)) = −7.

Expanding along the first column gives

det [0 1 −1; 2 3 −2; −1 3 0] = 0 · det [3 −2; 3 0] − 2 · det [1 −1; 3 0] + (−1) · det [1 −1; 3 −2] = 0 − 2(1 · 0 − (−1) · 3) − 1(1 · (−2) − (−1) · 3) = −7.
3. Expanding along the first row gives

det [1 −1 0; −1 0 1; 0 1 −1] = 1 · det [0 1; 1 −1] − (−1) · det [−1 1; 0 −1] + 0 · det [−1 0; 0 1] = 1(0 · (−1) − 1 · 1) + 1((−1) · (−1) − 1 · 0) + 0 = 0.

Expanding along the first column gives

det [1 −1 0; −1 0 1; 0 1 −1] = 1 · det [0 1; 1 −1] − (−1) · det [−1 0; 1 −1] + 0 · det [−1 0; 0 1] = 1(0 · (−1) − 1 · 1) + 1((−1) · (−1) − 0 · 1) + 0 = 0.
4. Expanding along the first row gives

det [1 1 0; 1 0 1; 0 1 1] = 1 · det [0 1; 1 1] − 1 · det [1 1; 0 1] + 0 · det [1 0; 0 1] = 1(0 · 1 − 1 · 1) − 1(1 · 1 − 1 · 0) + 0 = −2.

Expanding along the first column gives

det [1 1 0; 1 0 1; 0 1 1] = 1 · det [0 1; 1 1] − 1 · det [1 0; 1 1] + 0 · det [1 0; 0 1] = 1(0 · 1 − 1 · 1) − 1(1 · 1 − 0 · 1) + 0 = −2.
5. Expanding along the first row gives

det [1 2 3; 2 3 1; 3 1 2] = 1 · det [3 1; 1 2] − 2 · det [2 1; 3 2] + 3 · det [2 3; 3 1] = 1(3 · 2 − 1 · 1) − 2(2 · 2 − 1 · 3) + 3(2 · 1 − 3 · 3) = −18.

Expanding along the first column gives

det [1 2 3; 2 3 1; 3 1 2] = 1 · det [3 1; 1 2] − 2 · det [2 3; 1 2] + 3 · det [2 3; 3 1] = 1(3 · 2 − 1 · 1) − 2(2 · 2 − 3 · 1) + 3(2 · 1 − 3 · 3) = −18.
6. Expanding along the first row gives

det [1 2 3; 4 5 6; 7 8 9] = 1 · det [5 6; 8 9] − 2 · det [4 6; 7 9] + 3 · det [4 5; 7 8] = 1(5 · 9 − 6 · 8) − 2(4 · 9 − 6 · 7) + 3(4 · 8 − 5 · 7) = 0.

Expanding along the first column gives

det [1 2 3; 4 5 6; 7 8 9] = 1 · det [5 6; 8 9] − 4 · det [2 3; 8 9] + 7 · det [2 3; 5 6] = 1(5 · 9 − 6 · 8) − 4(2 · 9 − 3 · 8) + 7(2 · 6 − 3 · 5) = 0.
7. Noting the pair of zeros in the third row, use cofactor expansion along the third row to get

det [5 2 2; −1 1 2; 3 0 0] = 3 · det [2 2; 1 2] − 0 · det [5 2; −1 2] + 0 · det [5 2; −1 1] = 3(2 · 2 − 2 · 1) − 0 + 0 = 6.
8. Noting the zero in the second row, use cofactor expansion along the second row to get

det [1 1 −1; 2 0 1; 3 −2 1] = −2 · det [1 −1; −2 1] + 0 · det [1 −1; 3 1] − 1 · det [1 1; 3 −2] = −2(1 · 1 − (−1) · (−2)) − 1(1 · (−2) − 1 · 3) = 7.
9. Noting the zero in the bottom right, use the third row, since the multipliers there are smaller:

det [−4 1 3; 2 −2 4; 1 −1 0] = 1 · det [1 3; −2 4] − (−1) · det [−4 3; 2 4] + 0 = 1(1 · 4 − 3 · (−2)) + 1((−4) · 4 − 3 · 2) = −12.
10. Noting that the first column has only one nonzero entry, we expand along the first column:

det [cos θ sin θ tan θ; 0 cos θ −sin θ; 0 sin θ cos θ] = cos θ(cos θ · cos θ − (−sin θ) · sin θ) = cos θ(cos² θ + sin² θ) = cos θ.
11. Use the first column:

det [a b 0; 0 a b; a 0 b] = a · det [a b; 0 b] − 0 · det [b 0; 0 b] + a · det [b 0; a b] = a(a · b − b · 0) + a(b · b − 0 · a) = a²b + ab².
12. Use the first row (other choices are equally easy):

det [0 a 0; b c d; 0 e 0] = −a · det [b d; 0 0] = −a(b · 0 − d · 0) = 0.
13. Use the third row, since it contains only one nonzero entry:

det [1 −1 0 3; 2 5 2 6; 0 1 0 0; 1 4 2 1] = (−1)^(3+2) · 1 · det [1 0 3; 2 2 6; 1 2 1] = −det [1 0 3; 2 2 6; 1 2 1].

To compute this determinant, use cofactor expansion along the first row:

−det [1 0 3; 2 2 6; 1 2 1] = −(1 · det [2 6; 2 1] − 0 + 3 · det [2 2; 1 2]) = −1(2 · 1 − 6 · 2) − 3(2 · 2 − 2 · 1) = 4.
14. Noting that the second column has only one nonzero entry, we expand along that column. The nonzero entry is in row 3, column 2, so its cofactor carries a minus sign, since (−1)^(3+2) = −1:

det [2 0 3 −1; 1 0 2 2; 0 −1 1 4; 2 0 1 −3] = −(−1) · det [2 3 −1; 1 2 2; 2 1 −3] = det [2 3 −1; 1 2 2; 2 1 −3].

To evaluate the remaining determinant, we use expansion along the first row:

det [2 3 −1; 1 2 2; 2 1 −3] = 2 · det [2 2; 1 −3] − 3 · det [1 2; 2 −3] + (−1) · det [1 2; 2 1] = 2(2 · (−3) − 2 · 1) − 3(1 · (−3) − 2 · 2) − 1(1 · 1 − 2 · 2) = 8.
15. Expand along the first row:

det [0 0 0 a; 0 0 b c; 0 d e f; g h i j] = −a · det [0 0 b; 0 d e; g h i].

Now expand along the first row again:

−a · det [0 0 b; 0 d e; g h i] = −a(b · det [0 d; g h]) = −ab(0 · h − dg) = abdg.
16. Using the method of Example 4.9, we copy the first two columns to the right of the matrix:

1 2 3 | 1 2
4 5 6 | 4 5
7 8 9 | 7 8

The upward diagonals have products 7 · 5 · 3 = 105, 8 · 6 · 1 = 48, and 9 · 4 · 2 = 72, while the downward diagonals have products 1 · 5 · 9 = 45, 2 · 6 · 7 = 84, and 3 · 4 · 8 = 96, so that the determinant is −(105 + 48 + 72) + (45 + 84 + 96) = 0.
17. Using the method of Example 4.9,

1 1 −1 | 1 1
2 0 1 | 2 0
3 −2 1 | 3 −2

The upward diagonals have products 3 · 0 · (−1) = 0, (−2) · 1 · 1 = −2, and 1 · 2 · 1 = 2, while the downward diagonals have products 1 · 0 · 1 = 0, 1 · 1 · 3 = 3, and (−1) · 2 · (−2) = 4, so that the determinant is −(0 + (−2) + 2) + (0 + 3 + 4) = 7.
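The diagonal-products rule used in Exercises 16–18 (the rule of Sarrus) is also easy to automate. The sketch below (an optional check, not part of the printed solution) recomputes both determinants:

```python
def sarrus(M):
    """3x3 determinant by the rule of Sarrus (the method of Example 4.9)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    down = a * e * i + b * f * g + c * d * h   # downward-pointing diagonals
    up = g * e * c + h * f * a + i * d * b     # upward-pointing diagonals
    return down - up

d16 = sarrus([[1, 2, 3], [4, 5, 6], [7, 8, 9]])     # Exercise 16
d17 = sarrus([[1, 1, -1], [2, 0, 1], [3, -2, 1]])   # Exercise 17
```

Note that this shortcut is specific to 3 × 3 matrices; it does not generalize to larger sizes.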
18. Using the method of Example 4.9, we get

a b 0 | a b
0 a b | 0 a
a 0 b | a 0

The upward diagonals have products a · a · 0 = 0, 0 · b · a = 0, and b · 0 · b = 0, while the downward diagonals have products a · a · b = a²b, b · b · a = ab², and 0 · 0 · 0 = 0, so that the determinant is −(0 + 0 + 0) + (a²b + ab² + 0) = a²b + ab².
19. Consider (2) in the text. The three upward-pointing diagonals have product a31 a22 a13 , a32 a23 a11 , and
a33 a21 a12 , while the downward-pointing diagonals have product a11 a22 a33 , a12 a23 a31 , and a13 a21 a32 ,
so the determinant by that method is
det A = −(a31 a22 a13 + a32 a23 a11 + a33 a21 a12 ) + (a11 a22 a33 + a12 a23 a31 + a13 a21 a32 ).
Expanding along the first row gives

det A = a11 · det [a22 a23; a32 a33] − a12 · det [a21 a23; a31 a33] + a13 · det [a21 a22; a31 a32]
      = a11 a22 a33 − a11 a32 a23 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 ,
which, after rearranging, is the same as the first answer.
20. Consider the 2 × 2 matrix A = [a11 a12; a21 a22]. Then

C11 = (−1)^(1+1) det[a22] = a22 ,   C12 = (−1)^(1+2) det[a21] = −a21 ,

so that from definition (4)

det A = Σ_{j=1}^{2} a1j C1j = a11 C11 + a12 C12 = a11 a22 + a12 · (−a21) = a11 a22 − a12 a21 ,

which agrees with the definition of a 2 × 2 determinant.
21. If A is lower triangular, then AT is upper triangular. Since by Theorem 4.10 det A = det AT , if we
prove the result for upper triangular matrices, we will have proven it for lower triangular matrices as
well. So assume A is upper triangular. The basis for the induction is the case n = 1, where A = [a11 ]
and det A = a11 , which is clearly the “product” of the entries on the diagonal. Now assume that the
statement holds for n × n upper triangular matrices, and let A be an (n + 1) × (n + 1) upper triangular
matrix. Using cofactor expansion along the last row, we get
det A = (−1)^((n+1)+(n+1)) an+1,n+1 det An+1,n+1 = an+1,n+1 det An+1,n+1 ,

where An+1,n+1 is the submatrix obtained by deleting the last row and column. But An+1,n+1 is an n × n upper triangular matrix with diagonal entries a11 , a22 , . . . , ann , so applying the inductive hypothesis we get

det A = an+1,n+1 det An+1,n+1 = an+1,n+1 (a11 a22 · · · ann ) = a11 a22 · · · an+1,n+1 ,
proving the result for (n + 1) × (n + 1) matrices. So by induction, the statement is true for all n ≥ 1.
22. Row-reducing the matrix in Exercise 1, and modifying the determinant as required by Theorem 4.3, gives

det [1 0 3; 5 1 1; 0 1 2] = det [1 0 3; 0 1 −14; 0 1 2]   (R2 − 5R1)
                          = det [1 0 3; 0 1 −14; 0 0 16]   (R3 − R2).

Note that we did not introduce any factors in front of the determinant, since all of the row operations were of the form Ri + kRj, which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · 1 · 16 = 16.
23. Row-reducing the matrix in Exercise 9, and modifying the determinant as required by Theorem 4.3, gives

det [−4 1 3; 2 −2 4; 1 −1 0] = −det [1 −1 0; 2 −2 4; −4 1 3]   (R1 ↔ R3)
                             = −det [1 −1 0; 0 0 4; 0 −3 3]    (R2 − 2R1, R3 + 4R1)
                             = det [1 −1 0; 0 −3 3; 0 0 4]     (R2 ↔ R3).

There were two row interchanges, each of which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj, which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · (−3) · 4 = −12.
24. Row-reducing the matrix in Exercise 13, and modifying the determinant as required by Theorem 4.3, gives

det [1 −1 0 3; 2 5 2 6; 0 1 0 0; 1 4 2 1] = det [1 −1 0 3; 0 7 2 0; 0 1 0 0; 0 5 2 −2]    (R2 − 2R1, R4 − R1)
                                          = −det [1 −1 0 3; 0 1 0 0; 0 7 2 0; 0 5 2 −2]   (R2 ↔ R3)
                                          = −det [1 −1 0 3; 0 1 0 0; 0 0 2 0; 0 0 2 −2]   (R3 − 7R2, R4 − 5R2)
                                          = −det [1 −1 0 3; 0 1 0 0; 0 0 2 0; 0 0 0 −2]   (R4 − R3).

There was one row interchange, which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj, which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is −(1 · 1 · 2 · (−2)) = 4.
25. Row-reducing the matrix in Exercise 14, and modifying the determinant as required by Theorem 4.3, gives

det [2 0 3 −1; 1 0 2 2; 0 −1 1 4; 2 0 1 −3] = −det [1 0 2 2; 2 0 3 −1; 0 −1 1 4; 2 0 1 −3]    (R1 ↔ R2)
                                            = −det [1 0 2 2; 0 0 −1 −5; 0 −1 1 4; 0 0 −3 −7]  (R2 − 2R1, R4 − 2R1)
                                            = det [1 0 2 2; 0 −1 1 4; 0 0 −1 −5; 0 0 −3 −7]   (R2 ↔ R3)
                                            = det [1 0 2 2; 0 −1 1 4; 0 0 −1 −5; 0 0 0 8]     (R4 − 3R3).

There were two row interchanges, each of which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj, which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · (−1) · (−1) · 8 = 8.
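The row-reduction method of Exercises 22–25 translates directly into code. The sketch below (an optional check, not part of the printed solution; it uses exact rational arithmetic so no rounding occurs) reduces to upper triangular form, flips the sign on each swap, and multiplies the pivots:

```python
from fractions import Fraction

def det_by_elimination(M):
    """Determinant via row reduction, tracking sign changes from row swaps."""
    A = [[Fraction(x) for x in row] for row in M]
    n, sign, d = len(A), 1, Fraction(1)
    for k in range(n):
        piv = next((i for i in range(k, n) if A[i][k] != 0), None)
        if piv is None:
            return Fraction(0)            # zero column below: determinant is 0
        if piv != k:
            A[k], A[piv] = A[piv], A[k]   # each swap multiplies det by -1
            sign = -sign
        d *= A[k][k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            A[i] = [A[i][j] - m * A[k][j] for j in range(n)]   # Ri - m*Rk
    return sign * d

d24 = det_by_elimination([[1, -1, 0, 3], [2, 5, 2, 6], [0, 1, 0, 0], [1, 4, 2, 1]])     # Ex. 24
d25 = det_by_elimination([[2, 0, 3, -1], [1, 0, 2, 2], [0, -1, 1, 4], [2, 0, 1, -3]])   # Ex. 25
```

Both values agree with the hand computations above.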
26. Since the third row is a multiple of the first row, the determinant is zero since we can use the row
operation R3 − 2R1 to zero the third row, and doing so does not change the determinant.
27. This matrix is upper triangular, so its determinant is the product of the diagonal entries, which is
3 · (−2) · 4 = −24.
28. If we interchange the first and third rows, the result is an upper triangular matrix with diagonal
entries 3, 5, and 1. Since interchanging the rows negates the determinant, the determinant of the
original matrix is −(3 · 5 · 1) = −15.
29. The third column is −2 times the first column, so the columns are not linearly independent and thus
the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its determinant is zero.
30. Since the third row is the sum of the first and second rows, the rows are not linearly independent and
thus the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its determinant is zero.
31. Since the first column is the sum of the second and third columns, the columns are not linearly
independent and thus the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its
determinant is zero.
32. Interchanging the second and third rows gives the identity matrix, which has determinant 1. Since
interchanging the rows introduces a minus sign, the determinant of the original matrix is −1.
33. Interchange the first and second rows, and also the third and fourth rows, to give a diagonal matrix with
diagonal entries −3, 2, 1, and 4. Performing two row interchanges does not change the determinant,
so the determinant of the original matrix is −3 · 2 · 1 · 4 = −24.
34. The sum of the first and second rows is equal to the sum of the third and fourth rows, so the rows are
not linearly independent and thus the matrix does not have rank 4, so is not invertible. By Theorem
4.6, its determinant is zero.
35. This matrix results from the original matrix by multiplying the first row by 2, so its determinant is
2 · 4 = 8.
36. This matrix results from the original matrix by multiplying the first column by 2, the second column by 1/3, and the third column by −1. Thus the determinant of this matrix is 2 · (1/3) · (−1) = −2/3 times the original determinant, so its value is (−2/3) · 4 = −8/3.
37. This matrix results from the original matrix by interchanging the first and second rows. This multiplies
the determinant by −1, so the determinant of this matrix is −4.
38. This matrix results from the original matrix by subtracting the third column from the first. This
operation does not change the determinant, so the determinant of this matrix is also 4.
39. This matrix results from the original matrix by exchanging the first and third columns (which multiplies
the determinant by −1) and then multiplying the first column by 2 (which multiplies the determinant
by 2). So the determinant of this matrix is 2 · (−1) · 4 = −8.
40. To get from the original matrix to this one, row 2 was multiplied by 3, and then twice row 3 was
added to each of rows 1 and 2. The first of these operations multiplies the determinant by 3, while the
additions do not change the determinant. Thus the determinant of this matrix is 3 · 4 = 12.
41. Suppose first that A has a zero row, say the ith row is zero. Using cofactor expansion along that row gives

det A = Σ_{j=1}^{n} aij Cij = Σ_{j=1}^{n} 0 · Cij = 0,

so the determinant is zero. Similarly, if A has a zero column, say the jth column, then using cofactor expansion along that column gives

det A = Σ_{i=1}^{n} aij Cij = Σ_{i=1}^{n} 0 · Cij = 0,

and again the determinant is zero. Alternatively, the column case can be proved from the row case by noting that if A has a zero column, then Aᵀ has a zero row, and det A = det Aᵀ by Theorem 4.10.
42. Dealing first with rows, we want to show that if A →(Ri + kRj) B, where i ≠ j, then det B = det A. Let Ai be the ith row of A. Then

B = [A1; A2; . . . ; Ai + kAj; . . . ; An],   C = [A1; A2; . . . ; kAj; . . . ; An],

where in each matrix the displayed row is row i. Then A, B, and C are identical except that the ith row of B is the sum of the ith rows of A and C, so that by part (a), det B = det A + det C. However, since i ≠ j, the ith row of C, which is kAj, is a multiple of the jth row, which is Aj. By Theorem 4.3(d), we may factor k out of the ith row, multiplying the determinant by k; the result is a matrix whose ith and jth rows are both Aj. So det C = 0 by Theorem 4.3(c). Hence det B = det A, as desired.

If instead A →(Ci + kCj) B, where i ≠ j, then Aᵀ →(Ri + kRj) Bᵀ, so that by the first part and Theorem 4.10,

det A = det Aᵀ = det Bᵀ = det B.
43. If E interchanges two rows, then det E = −1 by Theorem 4.4. Also, EB is the same as B but with two
rows interchanged, so by Theorem 4.3(b), det(EB) = − det B = det(E) det(B). Next, if E results from
multiplying a row by a constant k, then det E = k by Theorem 4.4. But EB is the same as B except that
one row has been multiplied by k, so that by Theorem 4.3(d), det(EB) = k det(B) = det(E) det(B).
Finally, if E results from adding a multiple of one row to another row, then det E = 1 by Theorem 4.4.
But in this case EB is obtained from B by adding a multiple of one row to another, so by Theorem
4.3(f), det(EB) = det(B) = det(E) det(B).
44. Let the row vectors of A be A1, A2, . . . , An. Then the row vectors of kA are kA1, kA2, . . . , kAn. Using Theorem 4.3(d) n times,

det(kA) = det [kA1; kA2; . . . ; kAn] = k det [A1; kA2; . . . ; kAn] = k² det [A1; A2; kA3; . . . ; kAn] = · · · = kⁿ det [A1; A2; . . . ; An] = kⁿ det A.
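The identity det(kA) = kⁿ det A can be confirmed on a concrete instance. The sketch below (an optional check with an arbitrary sample 3 × 3 matrix, not part of the printed solution) compares the two sides:

```python
def det3(M):
    """3x3 determinant by first-row cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

M = [[1, 2, 0], [3, 1, 4], [0, 2, 5]]
k, n = 3, 3
lhs = det3([[k * x for x in row] for row in M])   # det(kA)
rhs = k ** n * det3(M)                            # k^n * det(A)
```

For this sample, det M = −33, and both sides equal 27 · (−33) = −891.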
45. By Theorem 4.6, A is invertible if and only if det A ≠ 0. To find det A, expand along the first column:

det A = k · det [k+1 1; −8 k−1] − 0 + k · det [−k 3; k+1 1]
      = k((k + 1)(k − 1) − 1 · (−8)) + k(−k · 1 − 3(k + 1)) = k³ − 4k² + 4k = k(k − 2)².

Thus det A = 0 when k = 0 or k = 2, so A is invertible for all k except k = 0 and k = 2.
46. By Theorem 4.6, A is invertible if and only if det A ≠ 0. To find det A, expand along the first row:

det [k k 0; k² 2 k; 0 k k] = k · det [2 k; k k] − k · det [k² k; 0 k] + 0
                           = k(2k − k²) − k(k³ − 0) = k(2k − k² − k³)... 

wait, grouping instead as a polynomial: = 2k² − k³ − k⁴ = k²(2 − k − k²) = −k²(k + 2)(k − 1).

Since the matrix is invertible precisely when its determinant is nonzero, and since the determinant is zero when k = 0, k = 1, or k = −2, we see that the matrix is invertible for all values of k except 0, 1, and −2.
47. By Theorem 4.8, det(AB) = det(A) det(B) = 3 · (−2) = −6.
48. By Theorem 4.8, det(A2 ) = det(A) det(A) = 3 · 3 = 9.
49. By Theorem 4.9, det(B⁻¹) = 1/det(B), so that by Theorem 4.8

det(B⁻¹A) = det(B⁻¹) det(A) = det(A)/det(B) = −3/2.
50. By Theorem 4.7, det(2A) = 2ⁿ det(A) = 3 · 2ⁿ.

51. By Theorem 4.10, det(Bᵀ) = det(B), so that by Theorem 4.7, det(3Bᵀ) = 3ⁿ det(Bᵀ) = −2 · 3ⁿ.

52. By Theorem 4.10, det(Aᵀ) = det(A), so that by Theorem 4.8, det(AAᵀ) = det(A) det(Aᵀ) = det(A)² = 9.
53. By Theorem 4.8, det(AB) = det(A) det(B) = det(B) det(A) = det(BA).
54. If B is invertible, then det(B) ≠ 0 and det(B⁻¹) = 1/det(B), so that using Theorem 4.8

det(B⁻¹AB) = det(B⁻¹(AB)) = det(B⁻¹) det(AB) = det(B⁻¹) det(A) det(B) = (1/det(B)) det(A) det(B) = det(A).
55. If A² = A, then det(A²) = det(A). But det(A²) = det(A) det(A) = det(A)², so that det(A) = det(A)². Thus det(A) is either 0 or 1.

56. We first show that Theorem 4.8 extends to prove that if m ≥ 1, then det(A^m) = det(A)^m. The case m = 1 is obvious, and m = 2 is Theorem 4.8. Now assume the statement is true for m = k. Then using Theorem 4.8 and the truth of the statement for m = k,

det(A^(k+1)) = det(A · A^k) = det(A) det(A^k) = det(A) det(A)^k = det(A)^(k+1).

So the statement holds for all m ≥ 1. Now, if A^m = O, then det(A)^m = det(A^m) = det(O) = 0, so that det(A) = 0.
57. The coefficient matrix and constant vector are

A = [1 1; 1 −1],   b = [1; 2].

To apply Cramer's Rule, we compute

det A = det [1 1; 1 −1] = −2,   det A1(b) = det [1 1; 2 −1] = −3,   det A2(b) = det [1 1; 1 2] = 1.

Then by Cramer's Rule,

x = det A1(b)/det A = 3/2,   y = det A2(b)/det A = −1/2.

58. The coefficient matrix and constant vector are

A = [2 −1; 1 3],   b = [5; −1].

To apply Cramer's Rule, we compute

det A = det [2 −1; 1 3] = 7,   det A1(b) = det [5 −1; −1 3] = 14,   det A2(b) = det [2 5; 1 −1] = −7.

Then by Cramer's Rule,

x = det A1(b)/det A = 2,   y = det A2(b)/det A = −1.
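Cramer's Rule for 2 × 2 systems, as used in Exercises 57 and 58, is a few lines of code. The sketch below (an optional check, not part of the printed solution; exact rationals avoid rounding) solves both systems:

```python
from fractions import Fraction

def cramer_2x2(A, b):
    """Solve the 2x2 system Ax = b by Cramer's Rule."""
    det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    det1 = b[0] * A[1][1] - A[0][1] * b[1]   # det A1(b): column 1 replaced by b
    det2 = A[0][0] * b[1] - b[0] * A[1][0]   # det A2(b): column 2 replaced by b
    return Fraction(det1, det_a), Fraction(det2, det_a)

sol57 = cramer_2x2([[1, 1], [1, -1]], [1, 2])    # Exercise 57
sol58 = cramer_2x2([[2, -1], [1, 3]], [5, -1])   # Exercise 58
```

The results match the hand computations: (3/2, −1/2) and (2, −1).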
59. The coefficient matrix and constant vector are

A = [2 1 3; 0 1 1; 0 0 1],   b = [1; 1; 1].

To apply Cramer's Rule, we compute

det A = det [2 1 3; 0 1 1; 0 0 1] = 2 · 1 · 1 = 2,

det A1(b) = det [1 1 3; 1 1 1; 1 0 1] = 1 · det [1 1; 0 1] − 1 · det [1 3; 0 1] + 1 · det [1 3; 1 1] = 1 − 1 − 2 = −2,

det A2(b) = det [2 1 3; 0 1 1; 0 1 1] = 2 · det [1 1; 1 1] = 0,

det A3(b) = det [2 1 1; 0 1 1; 0 0 1] = 2 · det [1 1; 0 1] = 2.

Then by Cramer's Rule,

x = det A1(b)/det A = −1,   y = det A2(b)/det A = 0,
z = det A3(b)/det A = 1.

60. The coefficient matrix and constant vector are

A = [1 1 −1; 1 1 1; 1 −1 0],   b = [1; 2; 3].

To apply Cramer's Rule, we compute

det A = det [1 1 −1; 1 1 1; 1 −1 0] = 1 · det [1 1; −1 0] − 1 · det [1 1; 1 0] + (−1) · det [1 1; 1 −1] = 1 + 1 + 2 = 4,

det A1(b) = det [1 1 −1; 2 1 1; 3 −1 0] = 1 · det [1 1; −1 0] − 1 · det [2 1; 3 0] + (−1) · det [2 1; 3 −1] = 1 + 3 + 5 = 9,

det A2(b) = det [1 1 −1; 1 2 1; 1 3 0] = 1 · det [2 1; 3 0] − 1 · det [1 1; 1 0] + (−1) · det [1 2; 1 3] = −3 + 1 − 1 = −3,

det A3(b) = det [1 1 1; 1 1 2; 1 −1 3] = 1 · det [1 2; −1 3] − 1 · det [1 2; 1 3] + 1 · det [1 1; 1 −1] = 5 − 1 − 2 = 2.
Then by Cramer’s Rule,
61. We have det A =
x=
det A1 (b)
9
= ,
det A
4
1
1
1
= −2. The four cofactors are
−1
y=
det A2 (b)
3
=− ,
det A
4
C11 = (−1)1+1 · (−1) = −1,
C21 = (−1)2+1 · 1 = −1,
z=
det A3 (b)
1
= .
det A
2
C12 = (−1)1+2 · 1 = −1,
C22 = (−1)2+2 · 1 = 1.
Then
"
−1
adj A =
−1
62. We have det A =
2
1
#T "
−1
−1
=
1
−1
#
"
−1
1
1 −1
−1
, so A =
adj A = −
det A
2 −1
1
# "
1
−1
= 21
1
2
1
2
− 12
−1
= 7. The four cofactors are
3
C11 = (−1)1+1 · 3 = 3,
C21 = (−1)2+1 · (−1) = 1,
C12 = (−1)1+2 · 1 = −1,
C22 = (−1)2+2 · 2 = 2.
Then
"
3
adj A =
1
#T "
−1
3
=
2
−1
#
"
1
3
1
1
−1
, so A =
adj A =
det A
7 −1
2
# "
3
1
7
=
1
2
−7
1
7
2
7
#
63. We have det A = 2 from Exercise 59. The nine cofactors are

C11 = (−1)^(1+1) · 1 = 1,        C12 = (−1)^(1+2) · 0 = 0,    C13 = (−1)^(1+3) · 0 = 0,
C21 = (−1)^(2+1) · 1 = −1,       C22 = (−1)^(2+2) · 2 = 2,    C23 = (−1)^(2+3) · 0 = 0,
C31 = (−1)^(3+1) · (−2) = −2,    C32 = (−1)^(3+2) · 2 = −2,   C33 = (−1)^(3+3) · 2 = 2.

Then

adj A = [1 0 0; −1 2 0; −2 −2 2]ᵀ = [1 −1 −2; 0 2 −2; 0 0 2],

so

A⁻¹ = (1/det A) adj A = (1/2)[1 −1 −2; 0 2 −2; 0 0 2] = [1/2 −1/2 −1; 0 1 −1; 0 0 1].
64. We have det A = 4 from Exercise 60. The nine cofactors are

C11 = (−1)^(1+1) · 1 = 1,       C12 = (−1)^(1+2) · (−1) = 1,   C13 = (−1)^(1+3) · (−2) = −2,
C21 = (−1)^(2+1) · (−1) = 1,    C22 = (−1)^(2+2) · 1 = 1,      C23 = (−1)^(2+3) · (−2) = 2,
C31 = (−1)^(3+1) · 2 = 2,       C32 = (−1)^(3+2) · 2 = −2,     C33 = (−1)^(3+3) · 0 = 0.

Then

adj A = [1 1 −2; 1 1 2; 2 −2 0]ᵀ = [1 1 2; 1 1 −2; −2 2 0],

so

A⁻¹ = (1/det A) adj A = (1/4)[1 1 2; 1 1 −2; −2 2 0] = [1/4 1/4 1/2; 1/4 1/4 −1/2; −1/2 1/2 0].
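The adjugate construction of Exercises 61–64 generalizes mechanically: build the cofactor matrix, transpose it, and divide by the determinant. The sketch below (an optional check for the 3 × 3 case, not part of the printed solution) recomputes the inverse from Exercise 63:

```python
from fractions import Fraction

def adj_inverse_3x3(A):
    """Inverse via A^{-1} = (1/det A) * adj A, adj A being the transposed cofactor matrix."""
    def minor(i, j):
        # 2x2 determinant of A with row i and column j deleted.
        rows = [A[r] for r in range(3) if r != i]
        m = [[row[c] for c in range(3) if c != j] for row in rows]
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    C = [[(-1) ** (i + j) * minor(i, j) for j in range(3)] for i in range(3)]
    d = sum(A[0][j] * C[0][j] for j in range(3))   # det A by first-row expansion
    # Entry (i, j) of the inverse is C[j][i] / d (note the transpose).
    return [[Fraction(C[j][i], d) for j in range(3)] for i in range(3)]

Ainv = adj_inverse_3x3([[2, 1, 3], [0, 1, 1], [0, 0, 1]])   # Exercise 63
```

The result matches the matrix [1/2 −1/2 −1; 0 1 −1; 0 0 1] computed above.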
65. First, since A⁻¹ = (1/det A) adj A, we have adj A = (det A)A⁻¹. Then

(adj A) · (1/det A)A = (det A)A⁻¹ · (1/det A)A = (det A · 1/det A)A⁻¹A = I.

Thus adj A times (1/det A)A is the identity, so that adj A is invertible and

(adj A)⁻¹ = (1/det A)A.

This proves the first equality. Finally, substitute A⁻¹ for A in the equation adj A = (det A)A⁻¹; this gives adj(A⁻¹) = (det A⁻¹)A = (1/det A)A, proving the second equality.

66. By Exercise 65, det(adj A) = det((det A)A⁻¹); by Theorem 4.7, this is equal to (det A)ⁿ det(A⁻¹). But det(A⁻¹) = 1/det A, so that det(adj A) = (det A)ⁿ · (1/det A) = (det A)ⁿ⁻¹.
67. Note that exchanging Ri−1 and Ri results in a matrix in which row i has been moved “up” one row
and all other rows remain the same. Refer to the diagram at the end of the solution when reading
what follows. Starting with Rs ↔ Rs−1 , perform s − r adjacent interchanges, each one moving the row
that started as row s up one row. At the end of these interchanges, row s will have been moved “up”
to row r, and all other rows will remain in the same order. The original row r is therefore at row r + 1.
We wish to move that row down to row s. Note that exchanging Ri and Ri+1 results in a matrix in
which row i has been moved “down” one row and all other rows remain the same. So starting with
Rr+1 ↔ Rr+2 , perform s − r − 1 adjacent interchanges, each one moving the row that started as row
r + 1 “down” one row. At the end of these interchanges, row r + 1 will have been moved down s − r − 1
rows, so it will be at row s − r − 1 + r + 1 = s, and all other rows will be as they were. Now, row r + 1
originally started life as row r, so the end result is that row s has been moved to row r, and row r to
row s. We used s − r adjacent interchanges to move row s, and another s − r − 1 adjacent interchanges
to move row r, for a total of 2(s − r) − 1 adjacent interchanges.


R1
 R2 


 .. 
 . 


Rr−1 


 Rr 


Rr+1 


Original:  . 
 .. 


Rs−1 


 Rs 


Rs+1 


 . 
 .. 
Rn




R1
R1
 R2 
 R2 




 .. 
 .. 
 . 
 . 




Rr−1 
Rr−1 




 Rs 
 Rs 




 Rr 
Rr+1 




After initial s − r exchanges: Rr+1  After next s − r − 1 exchanges:  .  .

 .. 

 .. 


 . 
Rs−1 




Rs−1 
 Rr 




Rs+1 
Rs+1 




 . 
 . 
 .. 
 .. 
Rn
Rn
68. The proof is very similar to the proof given in the text for the case of expansion along a row. Let B be the matrix obtained by moving column j of the matrix A to column 1:

A = [a11 a12 · · · a1,j−1 a1j a1,j+1 · · · a1n; a21 a22 · · · a2,j−1 a2j a2,j+1 · · · a2n; . . . ; an1 an2 · · · an,j−1 anj an,j+1 · · · ann],

B = [a1j a11 a12 · · · a1,j−1 a1,j+1 · · · a1n; a2j a21 a22 · · · a2,j−1 a2,j+1 · · · a2n; . . . ; anj an1 an2 · · · an,j−1 an,j+1 · · · ann].

By the discussion in Exercise 67, it requires j − 1 adjacent column interchanges to move column j to column 1, so by Lemma 4.14, det B = (−1)^(j−1) det A, so that det A = (−1)^(j−1) det B. Then by Lemma 4.13, evaluating det B using cofactor expansion along the first column gives

det A = (−1)^(j−1) det B = (−1)^(j−1) Σ_{i=1}^{n} bi1 Bi1 = (−1)^(j−1) Σ_{i=1}^{n} aij Bi1,

where Aij and Bij are the cofactors of A and B. It remains only to understand how the cofactors of A and B are related. Clearly the matrix formed by removing row i and column j of A is the same matrix as is formed by removing row i and column 1 of B, so the only difference is the sign that we put on the determinant of that matrix. In A, the sign is (−1)^(i+j); in B, it is (−1)^(i+1). Thus

Bi1 = (−1)^(j−1) Aij.

Substituting that in the equation for det A above gives

det A = (−1)^(j−1) Σ_{i=1}^{n} aij Bi1 = (−1)^(j−1) Σ_{i=1}^{n} (−1)^(j−1) aij Aij = (−1)^(2j−2) Σ_{i=1}^{n} aij Aij = Σ_{i=1}^{n} aij Aij,

proving Laplace expansion along columns.
69. Since P and S are both square, suppose P is n × n and S is m × m. Then O is m × n and Q is n × m. Proceed by induction on n. If n = 1, then

A = [p11 Q; O S],

and Q is 1 × m while O is m × 1. Compute the determinant by cofactor expansion along the first column. The only nonzero entry in the first column is p11, so that det A = p11 A11. Since O has one column and Q one row, removing the first row and column of A leaves only S, so that A11 = det S and thus det A = p11 det S = (det P)(det S). This proves the case n = 1 and provides the basis for the induction.

Now suppose that the statement is true when P is k × k; we want to prove it when P is (k + 1) × (k + 1). So let P be a (k + 1) × (k + 1) matrix. Again compute det A using column 1:

det A = Σ_{i=1}^{k+1+m} ai1 (−1)^(i+1) det Ai1.

Now, ai1 = pi1 for i ≤ k + 1, and ai1 = 0 for i > k + 1, so that

det A = Σ_{i=1}^{k+1} pi1 (−1)^(i+1) det Ai1.
Ai1 is the submatrix formed by removing row i and column 1 of A; since i ranges from 1 to k + 1, we
see that this process removes row i and column 1 of P , but does not change S. So we are left with a
matrix
A_{i1} = [P_{i1} Q_i; O S].
Now Pi1 is k × k, so we can apply the inductive hypothesis to get
det Ai1 = (det Pi1 )(det S).
Substitute that into the equation for det A above to get
det A = Σ_{i=1}^{k+1} p_{i1} (−1)^(i+1) det A_{i1}
      = Σ_{i=1}^{k+1} p_{i1} (−1)^(i+1) (det P_{i1})(det S)
      = (det S) (Σ_{i=1}^{k+1} p_{i1} (−1)^(i+1) det P_{i1}).
But the sum is just cofactor expansion of P along the first column, so its value is det P , and finally we
get det A = (det S)(det P ) = (det P )(det S) as desired.
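This block-determinant identity is easy to sanity-check numerically. The sketch below (the block sizes and random entries are arbitrary choices, not from the text) assembles a block upper triangular matrix with NumPy and compares determinants:

```python
import numpy as np

# Illustrative sizes only: P is 2x2, S is 3x3, Q is 2x3, O is the 3x2 zero block.
rng = np.random.default_rng(0)
P = rng.standard_normal((2, 2))
Q = rng.standard_normal((2, 3))
S = rng.standard_normal((3, 3))
O = np.zeros((3, 2))

A = np.block([[P, Q], [O, S]])  # block upper triangular

# det A = (det P)(det S), as proved in Exercise 69.
assert np.isclose(np.linalg.det(A), np.linalg.det(P) * np.linalg.det(S))
```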
70. (a) There are many examples. For instance, let
P = S = [1 0; 0 1],   Q = [1 0; 1 0],   R = [0 1; 0 1].
Then
A = [P Q; R S] = [1 0 1 0; 0 1 1 0; 0 1 1 0; 0 1 0 1].
Since the second and third rows of A are identical, det A = 0, but
(det P )(det S) − (det Q)(det R) = 1 · 1 − 0 · 0 = 1,
and the two are unequal.
(b) We have
BA = [P⁻¹ O; −RP⁻¹ I] [P Q; R S]
= [P⁻¹P + OR   P⁻¹Q + OS; −RP⁻¹P + IR   −RP⁻¹Q + IS]
= [I   P⁻¹Q; −R + R   −RP⁻¹Q + S]
= [I   P⁻¹Q; O   S − RP⁻¹Q].
It is clear that the proof in Exercise 69 can easily be modified to prove the same result for a block
lower triangular matrix. That is, if A, P , and S are square and
A = [P O; Q S],
then det A = (det P )(det S). Applying this to the matrix B gives
det B = (det P⁻¹)(det I) = det P⁻¹ = 1/(det P).
Further, applying Exercise 69 to BA gives
det(BA) = (det I)(det(S − RP⁻¹Q)) = det(S − RP⁻¹Q).
So finally
det A = (1/(det B)) det(BA) = (det P)(det(S − RP⁻¹Q)).
(c) If P R = RP , then
det A = (det P)(det(S − RP⁻¹Q)) = det(P(S − RP⁻¹Q))
= det(PS − (PR)P⁻¹Q) = det(PS − (RP)P⁻¹Q) = det(PS − RQ).
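Parts (b) and (c) can also be spot-checked numerically; the random 2 × 2 blocks below are illustrative assumptions, and S − RP⁻¹Q is the Schur complement of P in A:

```python
import numpy as np

rng = np.random.default_rng(1)
P, Q, R, S = (rng.standard_normal((2, 2)) for _ in range(4))
A = np.block([[P, Q], [R, S]])

# Part (b): det A = det P * det(S - R P^{-1} Q), for invertible P.
schur = S - R @ np.linalg.inv(P) @ Q
assert np.isclose(np.linalg.det(A), np.linalg.det(P) * np.linalg.det(schur))

# Part (c): when PR = RP, this collapses to det(PS - RQ); choosing R = P commutes.
R2 = P.copy()
A2 = np.block([[P, Q], [R2, S]])
assert np.isclose(np.linalg.det(A2), np.linalg.det(P @ S - R2 @ Q))
```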
Exploration: Geometric Applications of Determinants
1. (a) We have
u × v = det[i j k; 0 1 1; 3 −1 2] = i det[1 1; −1 2] − j det[0 1; 3 2] + k det[0 1; 3 −1] = 3i + 3j − 3k.
(b) We have
u × v = det[i j k; 3 −1 2; 0 1 1] = i det[−1 2; 1 1] − j det[3 2; 0 1] + k det[3 −1; 0 1] = −3i − 3j + 3k.
(c) We have
u × v = det[i j k; −1 2 3; 2 −4 −6] = i det[2 3; −4 −6] − j det[−1 3; 2 −6] + k det[−1 2; 2 −4] = 0.
(d) We have
i
u×v = 1
1
j
1
2
k
1
1 =i
2
3
1
1
−j
1
3
1
1
+k
1
3
1
= i − 2j + k.
2
2. First,
u · (v × w) = u · det[i j k; v1 v2 v3; w1 w2 w3]
= (u1 i + u2 j + u3 k) · (i(v2 w3 − v3 w2) − j(v1 w3 − v3 w1) + k(v1 w2 − v2 w1))
= (u1 v2 w3 − u1 v3 w2) − (u2 v1 w3 − u2 v3 w1) + (u3 v1 w2 − u3 v2 w1).
Computing the determinant, we get
det[u1 u2 u3; v1 v2 v3; w1 w2 w3] = u1 det[v2 v3; w2 w3] − u2 det[v1 v3; w1 w3] + u3 det[v1 v2; w1 w2]
= u1(v2 w3 − v3 w2) − u2(v1 w3 − v3 w1) + u3(v1 w2 − v2 w1)
= (u1 v2 w3 − u1 v3 w2) − (u2 v1 w3 − u2 v3 w1) + (u3 v1 w2 − u3 v2 w1),
and the two expressions are equal.
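The identity u · (v × w) = det of the matrix with rows u, v, w can be spot-checked with NumPy; the three vectors below are arbitrary:

```python
import numpy as np

u = np.array([2.0, -1.0, 3.0])
v = np.array([1.0, 4.0, 0.0])
w = np.array([-2.0, 1.0, 5.0])

triple = u @ np.cross(v, w)               # u . (v x w)
det = np.linalg.det(np.array([u, v, w]))  # determinant with rows u, v, w
assert np.isclose(triple, det)
```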
3. With u, v, and w as in the previous exercise,
(a) We have
v × u = det[i j k; v1 v2 v3; u1 u2 u3] = −det[i j k; u1 u2 u3; v1 v2 v3] = −(u × v),
since interchanging rows 2 and 3 multiplies the determinant by −1.
(b) We have
u × 0 = det[i j k; u1 u2 u3; 0 0 0].
This matrix has a zero row, so its determinant is zero by Theorem 4.3. Thus u × 0 = 0.
(c) We have
u × u = det[i j k; u1 u2 u3; u1 u2 u3].
This matrix has two equal rows, so its determinant is zero by Theorem 4.3. Thus u × u = 0.
(d) We have by Theorem 4.3
u × kv = det[i j k; u1 u2 u3; kv1 kv2 kv3] = k det[i j k; u1 u2 u3; v1 v2 v3] = k(u × v).
(e) We have by Theorem 4.3
u × (v + w) = det[i j k; u1 u2 u3; v1 + w1 v2 + w2 v3 + w3]
= det[i j k; u1 u2 u3; v1 v2 v3] + det[i j k; u1 u2 u3; w1 w2 w3] = u × v + u × w.
(f ) From Exercise 2 above,
u · (u × v) = det[u1 u2 u3; u1 u2 u3; v1 v2 v3],   v · (u × v) = det[v1 v2 v3; u1 u2 u3; v1 v2 v3].
But each of these matrices has two equal rows, so each has zero determinant. Thus
u · (u × v) = v · (u × v) = 0.
(g) From Exercise 2 above,
u · (v × w) = det[u1 u2 u3; v1 v2 v3; w1 w2 w3],   (u × v) · w = w · (u × v) = det[w1 w2 w3; u1 u2 u3; v1 v2 v3].
The second matrix is obtained from the first by exchanging rows 1 and 2 and then exchanging
rows 1 and 3. Since each row exchange multiplies the determinant by −1, it remains unchanged
so that the two products are equal.
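All seven properties in this exercise can be verified numerically at sample inputs; the vectors u, v, w and the scalar k below are arbitrary choices:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([0.0, 1.0, 4.0])
w = np.array([2.0, -1.0, 1.0])
k = 5.0

assert np.allclose(np.cross(v, u), -np.cross(u, v))                      # (a)
assert np.allclose(np.cross(u, np.zeros(3)), np.zeros(3))                # (b)
assert np.allclose(np.cross(u, u), np.zeros(3))                          # (c)
assert np.allclose(np.cross(u, k * v), k * np.cross(u, v))               # (d)
assert np.allclose(np.cross(u, v + w), np.cross(u, v) + np.cross(u, w))  # (e)
assert np.isclose(u @ np.cross(u, v), 0)                                 # (f)
assert np.isclose(v @ np.cross(u, v), 0)                                 # (f)
assert np.isclose(np.cross(u, v) @ w, u @ np.cross(v, w))                # (g)
```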
4. The vectors u and v, considered as vectors in R3 , have their third coordinates equal to zero, so the
area of the parallelogram with these edges is
A = ‖u × v‖ = ‖det[i j k; u1 u2 0; v1 v2 0]‖ = ‖k det[u1 u2; v1 v2]‖ = |det[u1 u2; v1 v2]| = |det[u1 v1; u2 v2]|.
5. The area of the rectangle is (a + c)(b + d). The small rectangles at top left and bottom right each
have area bc. The triangles at top and bottom each have base a and height b, so the area of each is
(1/2)ab; altogether they have area ab. Finally, the smaller triangles at left and right each have base d
and height c, so the area of each is (1/2)cd; altogether they have area cd. So in total, the area in the big
rectangle that is not in the parallelogram is 2bc + ab + cd, so the area of the parallelogram is
(a + c)(b + d) − 2bc − ab − cd = ab + ad + bc + cd − 2bc − ab − cd = ad − bc.
For this parallelogram, u = [a; b] and v = [c; d], and indeed
det[u v] = ad − bc.
As for the absolute value sign, note that the formula A = ku × vk does not depend on which of the
two sides we choose for u and which for v, but the sign of the determinant does. In our particular
case, when we choose u and v as above, we don’t need the absolute value sign, as the area works out
properly. It is present in the norm formula to represent the fact that we need not make particular
choices for u and v.
6. (a) By Exercise 5, the area is
A = |det[2 −1; 3 4]| = |2 · 4 − (−1) · 3| = 11.
(b) By Exercise 5, the area is
A = |det[3 5; 4 5]| = |3 · 5 − 4 · 5| = 5.
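A short numerical check of both areas, using the side vectors read off above:

```python
import numpy as np

# Columns are the side vectors: (a) u = (2, 3), v = (-1, 4); (b) u = (3, 4), v = (5, 5).
area_a = abs(np.linalg.det(np.array([[2.0, -1.0], [3.0, 4.0]])))
area_b = abs(np.linalg.det(np.array([[3.0, 5.0], [4.0, 5.0]])))
assert np.isclose(area_a, 11) and np.isclose(area_b, 5)
```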
7. A parallelepiped of height h with base area a has volume ha. This can be seen using calculus; the
volume is ∫_0^h a dz = [az]_0^h = ah. Alternatively, it can be seen intuitively, since the parallelepiped
consists of “slices” each of area a, and the height of those slices is h. Now, for the parallelepiped in
Figure 4.11, the height is clearly ‖u‖ cos θ. The base is a parallelogram with sides v and w, so its area
is ‖v × w‖. Thus the volume of the parallelepiped is ‖u‖ ‖v × w‖ cos θ. But
u · (v × w) = ‖u‖ ‖v × w‖ cos θ,
so that the volume is |u · (v × w)|, which, by Exercise 2, equals the absolute value of the determinant
of the matrix [u; v; w] whose rows are u, v, and w; this in turn equals the absolute value of the
determinant of its transpose [u v w].
8. Compare Figure 4.12 to Figure 4.11. The base of the tetrahedron is a triangle whose area is one half
the area of the parallelogram in Figure 4.11. So from the hint, we have
V = (1/3)(area of the base)(height)
  = (1/6)(area of the parallelogram)(height)
  = (1/6)(volume of the parallelepiped)
  = (1/6)|u · (v × w)|.
9. By Problem 4, the area of the parallelogram determined by Au and Av is
TA(P) = |det[Au Av]| = |det(A[u v])| = |det A| |det[u v]| = |det A|(area of P).
10. By Problem 7, the volume of the parallelepiped determined by Au, Av, and Aw is
TA(P) = |det[Au Av Aw]| = |det(A[u v w])| = |det A| |det[u v w]| = |det A|(volume of P).
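Problems 9 and 10 say that TA scales area and volume by |det A|; here is a numerical spot-check of the volume version, with random edge vectors and a random matrix as illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)
u, v, w = rng.standard_normal((3, 3))  # three random edge vectors (rows)
A = rng.standard_normal((3, 3))        # matrix of the transformation T_A

vol = abs(np.linalg.det(np.column_stack([u, v, w])))
vol_image = abs(np.linalg.det(np.column_stack([A @ u, A @ v, A @ w])))
assert np.isclose(vol_image, abs(np.linalg.det(A)) * vol)
```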
11. (a) The line through (2, 3) and (−1, 0) is given by
det[x y 1; 2 3 1; −1 0 1] = 0.
Evaluate by cofactor expansion along the third row:
−1 · det[y 1; 3 1] + 1 · det[x y; 2 3] = −1(y − 3) + (3x − 2y) = 3x − 3y + 3.
The equation of the line is 3x − 3y + 3 = 0, or x − y + 1 = 0.
(b) The line through (1, 2) and (4, 3) is given by
det[x y 1; 1 2 1; 4 3 1] = 0.
Evaluate by cofactor expansion along the first row:
x det[2 1; 3 1] − y det[1 1; 4 1] + 1 · det[1 2; 4 3] = −x + 3y − 5.
The equation of the line is −x + 3y − 5 = 0, or x − 3y + 5 = 0.
12. Suppose the three points satisfy
ax1 + by1 + c = 0
ax2 + by2 + c = 0
ax3 + by3 + c = 0,
which can be written as the matrix equation
XA = [x1 y1 1; x2 y2 1; x3 y3 1][a; b; c] = O.
If det X ≠ 0, then X is invertible, so that A = X⁻¹XA = X⁻¹O = O. Thus a nontrivial solution
A ≠ O exists if and only if det X = 0; this says precisely that the three points lie on the line
ax + by + c = 0 with a and b not both zero if and only if det X = 0.
13. Following the discussion at the beginning of the section, since the three points are noncollinear, they
determine a unique plane, say ax + by + cz + d = 0. Since all three points lie on the plane, their
coordinates satisfy this equation, so that
ax1 + by1 + cz1 + d = 0
ax2 + by2 + cz2 + d = 0
ax3 + by3 + cz3 + d = 0.
These three equations together with the equation of the plane then form a system of linear equations
in the variables a, b, c, and d. Since we know that the plane exists, it follows that there are values
of a, b, c, and d, not all zero, satisfying these equations. So the system has a nontrivial solution, and
therefore its coefficient matrix, which is
[x y z 1; x1 y1 z1 1; x2 y2 z2 1; x3 y3 z3 1],
cannot be invertible, by the Fundamental Theorem of Invertible Matrices. Thus its determinant must
be zero by Theorem 4.6.
To see what happens if the three points are collinear, try row reducing the above coefficient matrix:
[x y z 1; x1 y1 z1 1; x2 y2 z2 1; x3 y3 z3 1]
→ (R3 − R2, R4 − R2) →
[x y z 1; x1 y1 z1 1; x2 − x1 y2 − y1 z2 − z1 0; x3 − x1 y3 − y1 z3 − z1 0].
If the points are collinear, then
[x3 − x1; y3 − y1; z3 − z1] = k[x2 − x1; y2 − y1; z2 − z1]
for some k ≠ 0. Then performing R4 − kR3 gives a matrix with a zero row. So its determinant is
trivially zero and the system has an infinite number of solutions. This corresponds to the fact that
there are an infinite number of planes passing through the line determined by the three collinear points.
14. Suppose the four points satisfy
ax1 + by1 + cz1 + d = 0
ax2 + by2 + cz2 + d = 0
ax3 + by3 + cz3 + d = 0
ax4 + by4 + cz4 + d = 0,
which can be written as the matrix equation
XA = [x1 y1 z1 1; x2 y2 z2 1; x3 y3 z3 1; x4 y4 z4 1][a; b; c; d] = O.
If det X ≠ 0, then X is invertible, so that A = X⁻¹XA = X⁻¹O = O. Thus a nontrivial solution
A ≠ O exists if and only if det X = 0; this says precisely that the four points lie on the plane
ax + by + cz + d = 0 with a, b, and c not all zero if and only if det X = 0.
15. Substituting the values (x, y) = (−1, 10), (0, 5) and (3, 2) in turn into the equation gives the linear
system in a, b, and c
a − b + c = 10
a = 5
a + 3b + 9c = 2.
The determinant of the coefficient matrix of this system is
det X = det[1 −1 1; 1 0 0; 1 3 9] = −det[−1 1; 3 9] = −(−9 − 3) = 12 ≠ 0,
expanding along the second row.
Since X has nonzero determinant, it is invertible by Theorem 4.6, so we have
X[a; b; c] = [10; 5; 2]  ⇒  [a; b; c] = X⁻¹[10; 5; 2],
so that the system has a unique solution. To find that solution, we can either invert X or we can
row-reduce the augmented matrix. Using the second method, and writing the second equation first in
the augmented matrix, we have
[1 0 0 | 5; 1 −1 1 | 10; 1 3 9 | 2] → (R2 − R1, R3 − R1) → [1 0 0 | 5; 0 −1 1 | 5; 0 3 9 | −3]
→ (−R2, then R3 − 3R2) → [1 0 0 | 5; 0 1 −1 | −5; 0 0 12 | 12]
→ ((1/12)R3, then R2 + R3) → [1 0 0 | 5; 0 1 0 | −4; 0 0 1 | 1].
So the system has the unique solution a = 5, b = −4, c = 1 and the parabola through the three points
is y = 5 − 4x + x^2.
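The same solution drops out of a linear solver; X below is the coefficient matrix built above, with rows (1, x, x^2):

```python
import numpy as np

X = np.array([[1.0, -1.0, 1.0],
              [1.0,  0.0, 0.0],
              [1.0,  3.0, 9.0]])
y = np.array([10.0, 5.0, 2.0])

a, b, c = np.linalg.solve(X, y)
assert np.allclose([a, b, c], [5, -4, 1])  # y = 5 - 4x + x^2
```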
16. (a) The linear system in a, b, and c is
a + b + c = −1
a + 2b + 4c = 4
a + 3b + 9c = 3,
so row-reducing the augmented matrix gives
[1 1 1 | −1; 1 2 4 | 4; 1 3 9 | 3] → (R2 − R1, R3 − R1) → [1 1 1 | −1; 0 1 3 | 5; 0 2 8 | 4]
→ (R3 − 2R2, then (1/2)R3) → [1 1 1 | −1; 0 1 3 | 5; 0 0 1 | −3]
→ (R1 − R3, R2 − 3R3) → [1 1 0 | 2; 0 1 0 | 14; 0 0 1 | −3]
→ (R1 − R2) → [1 0 0 | −12; 0 1 0 | 14; 0 0 1 | −3].
So the system has the unique solution a = −12, b = 14, c = −3 and the parabola through the
three points is y = −12 + 14x − 3x^2.
(b) The linear system in a, b, and c is
a − b + c = −3
a + b + c = −1
a + 3b + 9c = 1,
so row-reducing the augmented matrix gives
[1 −1 1 | −3; 1 1 1 | −1; 1 3 9 | 1] → (R2 − R1, R3 − R1) → [1 −1 1 | −3; 0 2 0 | 2; 0 4 8 | 4]
→ ((1/2)R2, (1/4)R3) → [1 −1 1 | −3; 0 1 0 | 1; 0 1 2 | 1]
→ (R3 − R2, then (1/2)R3) → [1 −1 1 | −3; 0 1 0 | 1; 0 0 1 | 0]
→ (R1 + R2 − R3) → [1 0 0 | −2; 0 1 0 | 1; 0 0 1 | 0].
So the system has the unique solution a = −2, b = 1, c = 0. Thus the equation of the curve
passing through these points is in fact a line, since c = 0, and its equation is y = −2 + x.
17. Computing the determinant, we get
det[1 a1 a1^2; 1 a2 a2^2; 1 a3 a3^2] = det[a2 a2^2; a3 a3^2] − det[a1 a1^2; a3 a3^2] + det[a1 a1^2; a2 a2^2]
= a2 a3^2 − a2^2 a3 − a1 a3^2 + a1^2 a3 + a1 a2^2 − a1^2 a2.
Now compute the product on the right:
(a2 − a1)(a3 − a1)(a3 − a2) = (a2 a3 − a1 a3 − a1 a2 + a1^2)(a3 − a2)
= a2 a3^2 − a2^2 a3 − a1 a3^2 + a1 a2 a3 − a1 a2 a3 + a1 a2^2 + a1^2 a3 − a1^2 a2
= a2 a3^2 − a2^2 a3 − a1 a3^2 + a1 a2^2 + a1^2 a3 − a1^2 a2.
The two expressions are equal. Furthermore, the determinant is nonzero since it is the product of the
three differences a2 − a1 , a3 − a1 , and a3 − a2 , which are all nonzero because we are assuming that the
ai are distinct real numbers.
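A numerical spot-check of this 3 × 3 Vandermonde formula at arbitrary distinct values:

```python
import numpy as np

a1, a2, a3 = 1.0, 2.0, 4.0  # any three distinct values
V = np.array([[1, a1, a1**2],
              [1, a2, a2**2],
              [1, a3, a3**2]])
assert np.isclose(np.linalg.det(V), (a2 - a1) * (a3 - a1) * (a3 - a2))
```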
18. Recall that adding a multiple of one column to another does not change the determinant (see Theorem
4.3). So given the matrix
det[1 a1 a1^2 a1^3; 1 a2 a2^2 a2^3; 1 a3 a3^2 a3^3; 1 a4 a4^2 a4^3],
apply in order the column operations C4 − a1 C3 , C3 − a1 C2 , C2 − a1 C1 to get
det[1 0 0 0; 1 a2 − a1 (a2 − a1)a2 (a2 − a1)a2^2; 1 a3 − a1 (a3 − a1)a3 (a3 − a1)a3^2; 1 a4 − a1 (a4 − a1)a4 (a4 − a1)a4^2].
Then expand along the first row, giving
(a2 − a1)(a3 − a1)(a4 − a1) det[1 a2 a2^2; 1 a3 a3^2; 1 a4 a4^2],
after factoring (a_i − a1) out of each of the three remaining rows.
But we know how to evaluate this determinant from Exercise 17; doing so, we get for the original
determinant
(a4 − a1 )(a3 − a1 )(a2 − a1 )(a3 − a2 )(a4 − a2 )(a4 − a3 )
which is the same as the expression given in the problem statement.
For the second part, we can interpret the matrix above as the coefficient matrix formed by starting
with the cubic y = a + bx + cx2 + dx3 , substituting the four given points for x, and regarding the
resulting equations as a linear system in a, b, c, and d, just as in Exercise 15. Since the determinant
is nonzero if the ai are all distinct, it follows that the system has a unique solution, so that there is a
unique polynomial of degree at most 3 (it may be less than a cubic; for example, see Exercise 16(b))
passing through the four points.
19. For the n × n determinant given, use induction on n (note that Exercise 18 was essentially an inductive
computation, since it invoked Exercise 17 at the end). We have already proven the cases n = 1, n = 2,
and n = 3. So assume that the statement holds for n = k, and consider the determinant
det[1 a_1 ⋯ a_1^(k−1) a_1^k; 1 a_2 ⋯ a_2^(k−1) a_2^k; ⋮; 1 a_k ⋯ a_k^(k−1) a_k^k; 1 a_{k+1} ⋯ a_{k+1}^(k−1) a_{k+1}^k];
apply in order the column operations Ck+1 − a1 Ck , Ck − a1 Ck−1 , . . . , C2 − a1 C1 to get
det[1 0 0 ⋯ 0; 1 a_2 − a_1 (a_2 − a_1)a_2 ⋯ (a_2 − a_1)a_2^(k−1); ⋮; 1 a_{k+1} − a_1 (a_{k+1} − a_1)a_{k+1} ⋯ (a_{k+1} − a_1)a_{k+1}^(k−1)].
Now expand along the first row, giving
∏_{j=2}^{k+1} (a_j − a_1) · det[1 a_2 ⋯ a_2^(k−1); 1 a_3 ⋯ a_3^(k−1); ⋮; 1 a_{k+1} ⋯ a_{k+1}^(k−1)],
where each factor (a_j − a_1) has been pulled out of the row corresponding to a_j.
.
But the remaining determinant is again a Vandermonde determinant in the k variables a2 , a3 , . . . , ak+1 ,
so we know its value from the inductive hypothesis. Substituting that value, we get for the determinant
of the original matrix
k+1
Y
Y
j=2
2≤i<j≤k+1
(aj − a1 )
(aj − ai ) =
Y
(aj − ai )
1≤i<j≤k+1
as desired.
For the second part, we can interpret the matrix above as the coefficient matrix formed by starting
with the (n − 1)st degree polynomial y = c0 + c1 x + · · · + cn−1 xn−1 , substituting the n given points for
x, and regarding the resulting equations as a linear system in the ci , just as in Exercises 15 and 18.
Since the determinant is nonzero if the x-coordinates of the points are all distinct (being a product of
their differences), it follows that the system has a unique solution, so that there is a unique polynomial
of degree at most n − 1 (it may be less; for example, see Exercise 16(b)) passing through the n points.
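The general formula can be spot-checked with NumPy's built-in Vandermonde constructor; the nodes below are arbitrary distinct values:

```python
import numpy as np
from itertools import combinations

a = np.array([1.0, 2.0, 3.5, 5.0, 7.0])  # distinct a_i
V = np.vander(a, increasing=True)        # row i is (1, a_i, ..., a_i^{n-1})
product = np.prod([a[j] - a[i] for i, j in combinations(range(len(a)), 2)])
assert np.isclose(np.linalg.det(V), product)
```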
4.3
Eigenvalues and Eigenvectors of n × n Matrices
1. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 3; −2 6−λ] = (1 − λ)(6 − λ) − (−2) · 3 = λ^2 − 7λ + 12 = (λ − 3)(λ − 4).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ = 3 and λ = 4.
(c) To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of
A − 3I = [−2 3; −2 3].
Row reduce this matrix:
[A − 3I | 0] = [−2 3 0; −2 3 0] → [2 −3 0; 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form [3t; 2t], so that E3 = span([3; 2]).
To find the eigenspace corresponding to the eigenvalue λ = 4 we must find the null space of
A − 4I = [−3 3; −2 2].
Row reduce this matrix:
[A − 4I | 0] = [−3 3 0; −2 2 0] → [1 −1 0; 0 0 0].
Thus eigenvectors corresponding to λ = 4 are of the form [t; t], so that E4 = span([1; 1]).
(d) Since λ = 3 and λ = 4 are single roots of the characteristic polynomial, the algebraic multiplicity
of each is 1. Since each eigenspace is one-dimensional, the geometric multiplicity of each eigenvalue
is also 1.
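The hand computation can be cross-checked with NumPy (A = [1 3; −2 6] is the matrix recovered from the work above):

```python
import numpy as np

A = np.array([[1.0, 3.0], [-2.0, 6.0]])
evals, evecs = np.linalg.eig(A)
assert np.allclose(sorted(evals), [3, 4])

# Every returned column satisfies A v = lambda v.
for lam, vec in zip(evals, evecs.T):
    assert np.allclose(A @ vec, lam * vec)
```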
2. (a) The characteristic polynomial of A is
det(A − λI) = det[2−λ 1; −1 −λ] = (2 − λ)(−λ) − 1 · (−1) = λ^2 − 2λ + 1 = (λ − 1)^2.
(b) The eigenvalues of A are the roots of the characteristic polynomial; this polynomial has as its
only root λ = 1.
(c) To find the eigenspace corresponding to the eigenvalue λ = 1 we must find the null space of
A − I = [1 1; −1 −1].
Row reduce this matrix:
[A − I | 0] = [1 1 0; −1 −1 0] → [1 1 0; 0 0 0].
Thus eigenvectors corresponding to λ = 1 are of the form [−t; t], which is span([−1; 1]).
(d) Since λ = 1 is a double root of the characteristic polynomial, its algebraic multiplicity is 2.
However, the eigenspace is one-dimensional, so its geometric multiplicity is 1.
3. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 1 0; 0 −2−λ 1; 0 0 3−λ] = (1 − λ)(−2 − λ)(3 − λ) = −(λ − 1)(λ + 2)(λ − 3).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ = 1, λ = −2, and
λ = 3.
(c) To find the eigenspace corresponding to the eigenvalue λ = 1 we must find the null space of
A − I = [0 1 0; 0 −3 1; 0 0 2].
Row reduce this matrix:
[A − I | 0] = [0 1 0 0; 0 −3 1 0; 0 0 2 0] → [0 1 0 0; 0 0 1 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 1 are of the form [t; 0; 0], so that E1 = span([1; 0; 0]).
To find the eigenspace corresponding to the eigenvalue λ = −2 we must find the null space of
A + 2I = [3 1 0; 0 0 3; 0 0 5].
Row reduce this matrix:

3 1 0

0 = 0 0 3
0 0 5
i
h
A + 2I


1
0


0 −→ 0
0
0
1
3
0
0
0
1
0

0

0
0
 1 
 
−1
−3t
Thus eigenvectors corresponding to λ = −2 are of the form  t , so that E−2 = span  3.
0
0
To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of


−2
1 0
A − 3I =  0 −5 1
0
0 0
Row reduce this matrix:
[A − 3I | 0] = [−2 1 0 0; 0 −5 1 0; 0 0 0 0] → [1 0 −1/10 0; 0 1 −1/5 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form [(1/10)t; (1/5)t; t], or, clearing fractions,
[1; 2; 10], so that E3 = span([1; 2; 10]).
(d) Since all three eigenvalues are single roots of the characteristic polynomial, the algebraic multiplicity of each is 1. Since each eigenspace is one-dimensional, the geometric multiplicity of each
eigenvalue is also 1.
4. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 0 1; 0 1−λ 1; 1 1 −λ] = (1 − λ) det[1−λ 1; 1 −λ] + det[0 1−λ; 1 1]
= (1 − λ)(λ^2 − λ − 1) + (−1 + λ) = −λ^3 + 2λ^2 + λ − 2
= −(λ + 1)(λ − 1)(λ − 2).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = −1, λ2 = 1,
and λ3 = 2.
(c) To find the eigenspace corresponding to λ1 = −1 we must find the null space of
A + I = [2 0 1; 0 2 1; 1 1 1].
Row reduce this matrix:
[A + I | 0] = [2 0 1 0; 0 2 1 0; 1 1 1 0] → [1 0 1/2 0; 0 1 1/2 0; 0 0 0 0].
Thus eigenvectors corresponding to λ1 are of the form [−(1/2)t; −(1/2)t; t], or [−t; −t; 2t], which is
span([−1; −1; 2]).
To find the eigenspace corresponding to λ2 = 1 we must find the null space of
A − I = [0 0 1; 0 0 1; 1 1 −1].
Row reduce this matrix:
[A − I | 0] = [0 0 1 0; 0 0 1 0; 1 1 −1 0] → [1 1 0 0; 0 0 1 0; 0 0 0 0].
Thus eigenvectors corresponding to λ2 are of the form [−t; t; 0], which is span([−1; 1; 0]).
To find the eigenspace corresponding to λ3 = 2 we must find the null space of
A − 2I = [−1 0 1; 0 −1 1; 1 1 −2].
Row reduce this matrix:
[A − 2I | 0] = [−1 0 1 0; 0 −1 1 0; 1 1 −2 0] → [1 0 −1 0; 0 1 −1 0; 0 0 0 0].
Thus eigenvectors corresponding to λ3 are of the form [t; t; t], which is span([1; 1; 1]).
(d) Since each eigenvalue is a single root of the characteristic polynomial, each has algebraic multiplicity 1. Since each eigenspace is one-dimensional, each eigenvalue also has geometric multiplicity
1.
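A NumPy cross-check of this exercise (A = [1 0 1; 0 1 1; 1 1 0] as reconstructed above):

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
assert np.allclose(sorted(np.linalg.eigvals(A)), [-1, 1, 2])

v = np.array([1.0, 1.0, 1.0])  # the basis vector found above for the eigenspace of lambda = 2
assert np.allclose(A @ v, 2 * v)
```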
5. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 2 0; −1 −1−λ 1; 0 1 1−λ] = −λ^3 + λ^2 = −λ^2(λ − 1).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ = 0 and λ = 1.
(c) To find the eigenspace corresponding to the eigenvalue λ = 0 we must find the null space of A.
Row reduce this matrix:
[A | 0] = [1 2 0 0; −1 −1 1 0; 0 1 1 0] → [1 0 −2 0; 0 1 1 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 0 are of the form
[2t; −t; t] = t[2; −1; 1],
so that the eigenspace is E0 = span([2; −1; 1]).
To find the eigenspace corresponding to the eigenvalue λ = 1 we must find the null space of
A − I = [0 2 0; −1 −2 1; 0 1 0].
Row reduce this matrix:
[A − I | 0] = [0 2 0 0; −1 −2 1 0; 0 1 0 0] → [1 0 −1 0; 0 1 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 1 are of the form [t; 0; t], so that E1 = span([1; 0; 1]).
(d) Since λ = 0 is a double root of the characteristic polynomial, it has algebraic multiplicity 2; since
its eigenspace is one-dimensional, it has geometric multiplicity 1. Since λ = 1 is a single root of the
characteristic polynomial, it has algebraic multiplicity 1; since its eigenspace is one-dimensional,
it also has geometric multiplicity 1.
6. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 0 2; 3 −1−λ 3; 2 0 1−λ]
= (1 − λ) det[−1−λ 3; 0 1−λ] + 2 det[3 −1−λ; 2 0]
= (1 − λ)(−(1 + λ)(1 − λ) − 3 · 0) + 2(3 · 0 + 2(1 + λ))
= (1 − λ)(λ^2 − 1) + 4 + 4λ = −λ^3 + λ^2 + 5λ + 3 = −(λ + 1)^2(λ − 3).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = −1 and λ2 = 3.
(c) To find the eigenspace corresponding to λ1 = −1 we must find the null space of
A + I = [2 0 2; 3 0 3; 2 0 2].
Row reduce this matrix:
[A + I | 0] = [2 0 2 0; 3 0 3 0; 2 0 2 0] → [1 0 1 0; 0 0 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ1 are of the form
[−s; t; s] = s[−1; 0; 1] + t[0; 1; 0],
so that E−1 = span([−1; 0; 1], [0; 1; 0]).
To find the eigenspace corresponding to λ2 = 3 we must find the null space of
A − 3I = [−2 0 2; 3 −4 3; 2 0 −2].
Row reduce this matrix:
[A − 3I | 0] = [−2 0 2 0; 3 −4 3 0; 2 0 −2 0] → [1 0 −1 0; 0 1 −3/2 0; 0 0 0 0].
Thus eigenvectors corresponding to λ2 are of the form [t; (3/2)t; t], or, clearing fractions, [2; 3; 2],
so that E3 = span([2; 3; 2]).
(d) λ1 is a double root of the characteristic polynomial, and its eigenspace has dimension 2, so its
algebraic and geometric multiplicities are both equal to 2. λ2 is a single root of the characteristic
polynomial, and its eigenspace has dimension 1, so its algebraic and geometric multiplicities are
both equal to 1.
7. (a) The characteristic polynomial of A is
det(A − λI) = det[4−λ 0 1; 2 3−λ 2; −1 0 2−λ] = −λ^3 + 9λ^2 − 27λ + 27 = −(λ − 3)^3.
(b) The eigenvalues of A are the roots of the characteristic polynomial, so the only eigenvalue is λ = 3.
(c) To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of
A − 3I = [1 0 1; 2 0 2; −1 0 −1].
Row reduce this matrix:
[A − 3I | 0] = [1 0 1 0; 2 0 2 0; −1 0 −1 0] → [1 0 1 0; 0 0 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form
[−t; s; t] = s[0; 1; 0] + t[−1; 0; 1],
so that E3 = span([0; 1; 0], [−1; 0; 1]).
(d) Since λ = 3 is a triple root of the characteristic polynomial, it has algebraic multiplicity 3; since
its eigenspace is two-dimensional, it has geometric multiplicity 2.
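The gap between algebraic and geometric multiplicity here can be seen numerically (A as reconstructed above): the characteristic polynomial has a triple root at 3, while A − 3I has rank 1, so its null space is only two-dimensional.

```python
import numpy as np

A = np.array([[4.0, 0.0, 1.0],
              [2.0, 3.0, 2.0],
              [-1.0, 0.0, 2.0]])

# Algebraic multiplicity 3: det(lambda I - A) = (lambda - 3)^3.
assert np.allclose(np.poly(A), [1, -9, 27, -27])

# Geometric multiplicity: dim null(A - 3I) = 3 - rank(A - 3I) = 2.
assert 3 - np.linalg.matrix_rank(A - 3 * np.eye(3)) == 2
```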
8. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ −1 −1; 0 2−λ 0; −1 −1 1−λ] = (2 − λ) det[1−λ −1; −1 1−λ]
= (2 − λ)((1 − λ)^2 − (−1)(−1)) = (2 − λ)(λ^2 − 2λ) = −λ(λ − 2)^2.
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 0 and λ2 = 2.
(c) To find the eigenspace corresponding to λ1 = 0 we must find the null space of
A + 0I = [1 −1 −1; 0 2 0; −1 −1 1].
Row reduce this matrix:
[A | 0] = [1 −1 −1 0; 0 2 0 0; −1 −1 1 0] → [1 0 −1 0; 0 1 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ1 are of the form [s; 0; s] = s[1; 0; 1], so that E0 = span([1; 0; 1]).
To find the eigenspace corresponding to λ2 = 2 we must find the null space of
A − 2I = [−1 −1 −1; 0 0 0; −1 −1 −1].
Row reduce this matrix:
[A − 2I | 0] = [−1 −1 −1 0; 0 0 0 0; −1 −1 −1 0] → [1 1 1 0; 0 0 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ2 are of the form
[−s − t; s; t] = s[−1; 1; 0] + t[−1; 0; 1],
so that E2 = span([−1; 1; 0], [−1; 0; 1]).
(d) Since λ1 = 0 is a single root of the characteristic polynomial and its eigenspace has dimension
1, its algebraic and geometric multiplicities are both equal to 1. Since λ2 = 2 is a double root
of the characteristic polynomial and its eigenspace has dimension 2, its algebraic and geometric
multiplicities are both equal to 2.
9. (a) The characteristic polynomial of A is
det(A − λI) = det[3−λ 1 0 0; −1 1−λ 0 0; 0 0 1−λ 4; 0 0 1 1−λ] = λ^4 − 6λ^3 + 9λ^2 + 4λ − 12 = (λ + 1)(λ − 2)^2(λ − 3).
(b) The eigenvalues of A are the roots of the characteristic polynomial, so the eigenvalues are λ = −1,
λ = 2, and λ = 3.
(c) To find the eigenspace corresponding to the eigenvalue λ = −1 we must find the null space of
A + I = [4 1 0 0; −1 2 0 0; 0 0 2 4; 0 0 1 2].
Row reduce this matrix:
[A + I | 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 2 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = −1 are of the form [0; 0; −2t; t], so that E−1 = span([0; 0; −2; 1]).
To find the eigenspace corresponding to the eigenvalue λ = 2 we must find the null space of
A − 2I = [1 1 0 0; −1 −1 0 0; 0 0 −1 4; 0 0 1 −1].
Row reduce this matrix:
[A − 2I | 0] → [1 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = 2 are of the form [−s; s; 0; 0], so that E2 = span([−1; 1; 0; 0]).
To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of
A − 3I = [0 1 0 0; −1 −2 0 0; 0 0 −2 4; 0 0 1 −2].
Row reduce this matrix:
[A − 3I | 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 −2 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form [0; 0; 2t; t], so that E3 = span([0; 0; 2; 1]).
(d) Since λ = 3 and λ = −1 are single roots of the characteristic polynomial, they each have algebraic
multiplicity 1; their eigenspaces are each one-dimensional, so those two eigenvalues have geometric
multiplicity 1 as well. Since λ = 2 is a double root of the characteristic polynomial, it has algebraic
multiplicity 2; since its eigenspace is one-dimensional, it only has geometric multiplicity 1.
10. (a) The characteristic polynomial of A is
det(A − λI) = det[2−λ 1 1 0; 0 1−λ 4 5; 0 0 3−λ 1; 0 0 0 2−λ].
Since this is a triangular matrix, its determinant is the product of the diagonal entries by Theorem
4.2, so the characteristic polynomial is (2 − λ)^2(1 − λ)(3 − λ).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 1, λ2 = 2,
and λ3 = 3.
(c) To find the eigenspace corresponding to λ1 = 1 we must find the null space of
A − I = [1 1 1 0; 0 0 4 5; 0 0 2 1; 0 0 0 1].
Row reduce this matrix:
[A − I | 0] → [1 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ1 are of the form [−t; t; 0; 0] = t[−1; 1; 0; 0], so that
E1 = span([−1; 1; 0; 0]).
To find the eigenspace corresponding to λ2 = 2 we must find the null space of
A − 2I = [0 1 1 0; 0 −1 4 5; 0 0 1 1; 0 0 0 0].
Row reduce this matrix:
[A − 2I | 0] → [0 1 0 −1 0; 0 0 1 1 0; 0 0 0 0 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ2 are of the form
[s; t; −t; t] = s[1; 0; 0; 0] + t[0; 1; −1; 1],
so that E2 = span([1; 0; 0; 0], [0; 1; −1; 1]).
To find the eigenspace corresponding to λ3 = 3 we must find the null space of
A − 3I = [−1 1 1 0; 0 −2 4 5; 0 0 0 1; 0 0 0 0].
Row reduce this matrix:
[A − 3I | 0] → [1 0 −3 0 0; 0 1 −2 0 0; 0 0 0 1 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ3 are of the form [3t; 2t; t; 0] = t[3; 2; 1; 0], so that
E3 = span([3; 2; 1; 0]).
(d) Since λ1 = 1 is a single root of the characteristic polynomial and its eigenspace has dimension
1, its algebraic and geometric multiplicities are both equal to 1. Since λ2 = 2 is a double root
of the characteristic polynomial and its eigenspace has dimension 2, its algebraic and geometric
multiplicities are both equal to 2. Since λ3 = 3 is a single root of the characteristic polynomial
and its eigenspace has dimension 1, its algebraic and geometric multiplicities are both equal to 1.
11. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 0 0 0; 0 1−λ 0 0; 1 1 3−λ 0; −2 1 2 −1−λ] = (λ − 1)^2(λ − 3)(λ + 1).
(b) The eigenvalues of A are the roots of the characteristic polynomial, so the eigenvalues are λ = −1,
λ = 1, and λ = 3.
(c) To find the eigenspace corresponding to the eigenvalue λ = −1 we must find the null space of
A + I = [2 0 0 0; 0 2 0 0; 1 1 4 0; −2 1 2 0].
Row reduce this matrix:
[A + I | 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = −1 are of the form [0; 0; 0; t], so that E−1 = span([0; 0; 0; 1]).
To find the eigenspace corresponding to the eigenvalue λ = 1 we must find the null space of
A − I = [0 0 0 0; 0 0 0 0; 1 1 2 0; −2 1 2 −2].
Row reduce this matrix:
[A − I | 0] → [1 0 0 2/3 0; 0 1 2 −2/3 0; 0 0 0 0 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = 1 are of the form
[−(2/3)t; −2s + (2/3)t; s; t] = s[0; −2; 1; 0] + t[−2/3; 2/3; 0; 1],
so that, clearing fractions,
E1 = span([0; −2; 1; 0], [−2; 2; 0; 3]).
To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of
A − 3I = [−2 0 0 0; 0 −2 0 0; 1 1 0 0; −2 1 2 −4].
Row reduce this matrix:
[A − 3I | 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 −2 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form [0; 0; 2t; t], so that
E3 = span([0; 0; 2; 1]).
(d) Since λ = 3 and λ = −1 are single roots of the characteristic polynomial, they each have algebraic
multiplicity 1; their eigenspaces are each one-dimensional, so those two eigenvalues have geometric
multiplicity 1 as well. Since λ = 1 is a double root of the characteristic polynomial, it has algebraic
multiplicity 2; since its eigenspace is two-dimensional, it also has geometric multiplicity 2.
4.3. EIGENVALUES AND EIGENVECTORS OF N × N MATRICES
12. (a) The characteristic polynomial of A is
$$\det(A-\lambda I)=\begin{vmatrix}4-\lambda&0&1&0\\0&4-\lambda&1&1\\0&0&1-\lambda&2\\0&0&3&-\lambda\end{vmatrix}=-3\begin{vmatrix}4-\lambda&0&0\\0&4-\lambda&1\\0&0&2\end{vmatrix}+(-\lambda)\begin{vmatrix}4-\lambda&0&1\\0&4-\lambda&1\\0&0&1-\lambda\end{vmatrix}$$
$$=-6(4-\lambda)^2-\lambda(1-\lambda)(4-\lambda)^2=(4-\lambda)^2(\lambda^2-\lambda-6)=(4-\lambda)^2(\lambda-3)(\lambda+2),$$
where we have expanded along the last row; since each of the 3 × 3 matrices is upper triangular, by Theorem 4.2 their determinants are the product of their diagonal elements.
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 4, λ2 = 3,
and λ3 = −2.
(c) To find the eigenspace corresponding to λ1 = 4 we must find the null space of
$$A-4I=\begin{bmatrix}0&0&1&0\\0&0&1&1\\0&0&-3&2\\0&0&3&-4\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A-4I\mid\mathbf{0}\,]=\left[\begin{array}{cccc|c}0&0&1&0&0\\0&0&1&1&0\\0&0&-3&2&0\\0&0&3&-4&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}0&0&1&0&0\\0&0&0&1&0\\0&0&0&0&0\\0&0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ1 are of the form
$$\begin{bmatrix}s\\t\\0\\0\end{bmatrix}=s\begin{bmatrix}1\\0\\0\\0\end{bmatrix}+t\begin{bmatrix}0\\1\\0\\0\end{bmatrix},\qquad\text{so that}\qquad E_4=\operatorname{span}\left(\begin{bmatrix}1\\0\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\\0\end{bmatrix}\right).$$
To find the eigenspace corresponding to λ2 = 3 we must find the null space of
$$A-3I=\begin{bmatrix}1&0&1&0\\0&1&1&1\\0&0&-2&2\\0&0&3&-3\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A-3I\mid\mathbf{0}\,]=\left[\begin{array}{cccc|c}1&0&1&0&0\\0&1&1&1&0\\0&0&-2&2&0\\0&0&3&-3&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}1&0&0&1&0\\0&1&0&2&0\\0&0&1&-1&0\\0&0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ2 are of the form
$$\begin{bmatrix}-t\\-2t\\t\\t\end{bmatrix}=t\begin{bmatrix}-1\\-2\\1\\1\end{bmatrix},\qquad\text{so that}\qquad E_3=\operatorname{span}\left(\begin{bmatrix}-1\\-2\\1\\1\end{bmatrix}\right).$$
To find the eigenspace corresponding to λ3 = −2 we must find the null space of
$$A+2I=\begin{bmatrix}6&0&1&0\\0&6&1&1\\0&0&3&2\\0&0&3&2\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A+2I\mid\mathbf{0}\,]=\left[\begin{array}{cccc|c}6&0&1&0&0\\0&6&1&1&0\\0&0&3&2&0\\0&0&3&2&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}1&0&0&-\frac19&0\\0&1&0&\frac{1}{18}&0\\0&0&1&\frac23&0\\0&0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ3 are of the form
$$\begin{bmatrix}\frac19 t\\[2pt]-\frac{1}{18}t\\[2pt]-\frac23 t\\[2pt]t\end{bmatrix}=\begin{bmatrix}2s\\-s\\-12s\\18s\end{bmatrix}=s\begin{bmatrix}2\\-1\\-12\\18\end{bmatrix},\qquad\text{so that}\qquad E_{-2}=\operatorname{span}\left(\begin{bmatrix}2\\-1\\-12\\18\end{bmatrix}\right),$$
where we have set t = 18s to clear fractions.
(d) Since λ1 = 4 is a double root of the characteristic polynomial and its eigenspace has dimension
2, its algebraic and geometric multiplicities are both equal to 2. Since λ2 = 3 is a single root
of the characteristic polynomial and its eigenspace has dimension 1, its algebraic and geometric
multiplicities are both equal to 1. Since λ3 = −2 is a single root of the characteristic polynomial
and its eigenspace has dimension 1, its algebraic and geometric multiplicities are both equal to 1.
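As a supplementary sanity check (not part of the manual's solution), the four eigenpairs above can be verified with plain Python, assuming the matrix A reconstructed here from the shifted matrices in part (c).

```python
def matvec(A, v):
    # multiply a matrix (list of rows) by a vector
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# A inferred from A - 4I, A - 3I, and A + 2I above (an assumption of this check)
A = [[4, 0, 1, 0], [0, 4, 1, 1], [0, 0, 1, 2], [0, 0, 3, 0]]

pairs = [(4, [1, 0, 0, 0]), (4, [0, 1, 0, 0]),
         (3, [-1, -2, 1, 1]), (-2, [2, -1, -12, 18])]
for lam, v in pairs:
    assert matvec(A, v) == [lam * x for x in v]
```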
13. We must show that if A is invertible and Ax = λx, then $A^{-1}\mathbf{x} = \frac{1}{\lambda}\mathbf{x} = \lambda^{-1}\mathbf{x}$. First note that since A is invertible, 0 is not an eigenvalue, so that λ ≠ 0 and this statement makes sense. Now,
$$\mathbf{x} = (A^{-1}A)\mathbf{x} = A^{-1}(A\mathbf{x}) = A^{-1}(\lambda\mathbf{x}) = \lambda(A^{-1}\mathbf{x}).$$
Dividing through by λ gives $A^{-1}\mathbf{x} = \lambda^{-1}\mathbf{x}$ as required.
14. The remark referred to in Section 3.3 tells us that if A is invertible and n > 0 is an integer, then $A^{-n} = (A^{-1})^n = (A^n)^{-1}$. Now, part (a) gives us the result in part (c) when n > 0 is an integer. When n = 0, the statement says that λ⁰ = 1 is an eigenvalue of A⁰ = I, which is true. So it remains to prove the statement for n < 0. We use induction on n. For n = −1, this is the statement that if Ax = λx, then $A^{-1}\mathbf{x} = \lambda^{-1}\mathbf{x}$, which is Theorem 4.18(b). Now assume the statement holds for n = −k, and prove it for n = −(k + 1). Note that by the remark in Section 3.3,
$$A^{-(k+1)} = (A^{k+1})^{-1} = (A^kA)^{-1} = A^{-1}(A^k)^{-1} = A^{-1}A^{-k}.$$
Now
$$A^{-(k+1)}\mathbf{x} = A^{-1}(A^{-k}\mathbf{x}) = A^{-1}(\lambda^{-k}\mathbf{x}) = \lambda^{-k}A^{-1}\mathbf{x} = \lambda^{-k}\lambda^{-1}\mathbf{x} = \lambda^{-(k+1)}\mathbf{x},$$
where the second equality uses the inductive hypothesis and the fourth uses the case n = −1, and we are done.
15. If we can express x as a linear combination of the eigenvectors, it will be easy to compute A¹⁰x since we know what A does to the eigenvectors. To so express x, we solve the system
$$\begin{bmatrix}\mathbf{v}_1&\mathbf{v}_2\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}=\begin{bmatrix}5\\1\end{bmatrix},\qquad\text{which is}\qquad\begin{aligned}c_1+c_2&=5\\-c_1+c_2&=1.\end{aligned}$$
This has the solution c₁ = 2 and c₂ = 3, so that x = 2v₁ + 3v₂. But by Theorem 4.4(a), v₁ and v₂ are eigenvectors of A¹⁰ with eigenvalues $\lambda_1^{10}$ and $\lambda_2^{10}$. Thus
$$A^{10}\mathbf{x} = A^{10}(2\mathbf{v}_1+3\mathbf{v}_2) = 2A^{10}\mathbf{v}_1+3A^{10}\mathbf{v}_2 = 2\lambda_1^{10}\mathbf{v}_1+3\lambda_2^{10}\mathbf{v}_2 = \frac{1}{512}\mathbf{v}_1+3072\,\mathbf{v}_2.$$
Substituting for v₁ and v₂ gives
$$A^{10}\mathbf{x} = \begin{bmatrix}3072+\frac{1}{512}\\[2pt]3072-\frac{1}{512}\end{bmatrix}.$$
16. As in Exercise 15,
$$A^k\mathbf{x} = 2\cdot\frac{1}{2^k}\,\mathbf{v}_1+3\cdot 2^k\,\mathbf{v}_2 = \begin{bmatrix}3\cdot 2^k+2^{1-k}\\3\cdot 2^k-2^{1-k}\end{bmatrix}.$$
As k → ∞, the $2^{1-k}$ terms approach zero, so that $A^k\mathbf{x} \approx 3\cdot 2^k\,\mathbf{v}_2$.
17. We want to express x as a linear combination of the eigenvectors, say x = c₁v₁ + c₂v₂ + c₃v₃, so we want a solution of the linear system
$$\begin{aligned}c_1+c_2+c_3&=2\\c_2+c_3&=1\\c_3&=2.\end{aligned}$$
Working from the bottom and substituting gives c₃ = 2, c₂ = −1, and c₁ = 1, so that x = v₁ − v₂ + 2v₃. Then
$$A^{20}\mathbf{x} = A^{20}(\mathbf{v}_1-\mathbf{v}_2+2\mathbf{v}_3) = A^{20}\mathbf{v}_1-A^{20}\mathbf{v}_2+2A^{20}\mathbf{v}_3 = \lambda_1^{20}\mathbf{v}_1-\lambda_2^{20}\mathbf{v}_2+2\lambda_3^{20}\mathbf{v}_3$$
$$=\left(-\frac13\right)^{20}\begin{bmatrix}1\\0\\0\end{bmatrix}-\left(\frac13\right)^{20}\begin{bmatrix}1\\1\\0\end{bmatrix}+2\cdot 1^{20}\begin{bmatrix}1\\1\\1\end{bmatrix} = \begin{bmatrix}2\\[2pt]2-\frac{1}{3^{20}}\\[2pt]2\end{bmatrix}.$$
18. As in Exercise 17,
$$A^k\mathbf{x} = A^k(\mathbf{v}_1-\mathbf{v}_2+2\mathbf{v}_3) = A^k\mathbf{v}_1-A^k\mathbf{v}_2+2A^k\mathbf{v}_3 = \lambda_1^k\mathbf{v}_1-\lambda_2^k\mathbf{v}_2+2\lambda_3^k\mathbf{v}_3$$
$$=\left(-\frac13\right)^{k}\begin{bmatrix}1\\0\\0\end{bmatrix}-\left(\frac13\right)^{k}\begin{bmatrix}1\\1\\0\end{bmatrix}+2\cdot 1^{k}\begin{bmatrix}1\\1\\1\end{bmatrix} = \begin{bmatrix}\pm\frac{1}{3^k}-\frac{1}{3^k}+2\\[2pt]-\frac{1}{3^k}+2\\[2pt]2\end{bmatrix}.$$
As k → ∞, the $\frac{1}{3^k}$ terms approach zero, and $A^k\mathbf{x}\to\begin{bmatrix}2\\2\\2\end{bmatrix}$.
19. (a) Since $(\lambda I)^T = \lambda I^T = \lambda I$, we have
$$A^T-\lambda I = A^T-(\lambda I)^T = (A-\lambda I)^T,$$
so that using Theorem 4.10,
$$\det(A^T-\lambda I) = \det\left((A-\lambda I)^T\right) = \det(A-\lambda I).$$
But the left-hand side is the characteristic polynomial of $A^T$, and the right-hand side is the characteristic polynomial of A. Thus the two matrices have the same characteristic polynomial and thus the same eigenvalues.
(b) There are many examples. For instance,
$$A = \begin{bmatrix}1&0\\1&2\end{bmatrix}$$
has eigenvalues λ₁ = 1 and λ₂ = 2, with eigenspaces
$$E_1 = \operatorname{span}\left(\begin{bmatrix}-1\\1\end{bmatrix}\right),\qquad E_2 = \operatorname{span}\left(\begin{bmatrix}0\\1\end{bmatrix}\right).$$
$A^T$ has the same eigenvalues by part (a), but its eigenspaces are
$$E_1 = \operatorname{span}\left(\begin{bmatrix}1\\0\end{bmatrix}\right),\qquad E_2 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right).$$
20. By Theorem 4.18(b), if λ is an eigenvalue of A, then $\lambda^m$ is an eigenvalue of $A^m = O$. But the only eigenvalue of O is zero, since $A^m = O$ takes any vector to zero. Thus $\lambda^m = 0$, so that λ = 0.
21. By Theorem 4.18(a), if λ is an eigenvalue of A with eigenvector x, then λ² is an eigenvalue of A² = A with eigenvector x. Since different eigenvalues have linearly independent eigenvectors by Theorem 4.20, we must have λ² = λ, so that λ = 0 or λ = 1.
22. If Av = λv, then Av − cIv = λv − cIv, so that (A − cI)v = (λ − c)v. But this equation means exactly that v is an eigenvector of A − cI with eigenvalue λ − c.
23. (a) The characteristic polynomial of A is
$$\det(A-\lambda I) = \begin{vmatrix}3-\lambda&2\\5&-\lambda\end{vmatrix} = \lambda^2-3\lambda-10 = (\lambda-5)(\lambda+2).$$
So the eigenvalues of A are λ₁ = 5 and λ₂ = −2. For λ₁, we have
$$A-5I = \begin{bmatrix}-2&2\\5&-5\end{bmatrix}\longrightarrow\begin{bmatrix}1&-1\\0&0\end{bmatrix},$$
and for λ₂, we have
$$A+2I = \begin{bmatrix}5&2\\5&2\end{bmatrix}\longrightarrow\begin{bmatrix}1&\frac25\\0&0\end{bmatrix}.$$
It follows that
$$E_5 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad E_{-2} = \operatorname{span}\left(\begin{bmatrix}-\frac25\\1\end{bmatrix}\right) = \operatorname{span}\left(\begin{bmatrix}-2\\5\end{bmatrix}\right).$$
(b) By Theorem 4.4(b), $A^{-1}$ has eigenvalues $\frac15$ and $-\frac12$, with
$$E_{1/5} = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad E_{-1/2} = \operatorname{span}\left(\begin{bmatrix}-2\\5\end{bmatrix}\right).$$
By Exercise 22, A − 2I has eigenvalues 5 − 2 = 3 and −2 − 2 = −4, with
$$E_3 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad E_{-4} = \operatorname{span}\left(\begin{bmatrix}-2\\5\end{bmatrix}\right).$$
Also by Exercise 22, A + 2I = A − (−2I) has eigenvalues 5 + 2 = 7 and −2 + 2 = 0, with
$$E_7 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad E_0 = \operatorname{span}\left(\begin{bmatrix}-2\\5\end{bmatrix}\right).$$
24. (a) There are many examples. For instance, let
$$A = \begin{bmatrix}0&1\\1&0\end{bmatrix},\qquad B = \begin{bmatrix}1&1\\0&1\end{bmatrix}.$$
Then the characteristic equation of A is λ² − 1, so that A has eigenvalues λ = ±1. The characteristic equation of B is (µ − 1)², so that B has eigenvalue µ = 1. But
$$A+B = \begin{bmatrix}1&2\\1&1\end{bmatrix}$$
has characteristic polynomial λ² − 2λ − 1, so it has eigenvalues $1\pm\sqrt2$. The possible values of λ + µ are 0 and 2.
(b) Use the same A and B as above. Then
$$AB = \begin{bmatrix}0&1\\1&1\end{bmatrix},$$
with characteristic polynomial λ² − λ − 1 and eigenvalues $\frac12(1\pm\sqrt5)$, which again do not match the possible values of λµ, namely 1 and −1.
(c) If λ and µ both have eigenvector x, then
(A + B)x = Ax + Bx = λx + µx = (λ + µ)x,
which shows that λ + µ is an eigenvalue of A + B with eigenvector x. For the product, we have
(AB)x = A(Bx) = A(µx) = µAx = µλx = λµx,
showing that λµ is an eigenvalue of AB with eigenvector x.
25. No, they do not. For example, take B = I₂ and let A be any 2 × 2 matrix of rank 2 that has an eigenvalue other than 1, for example
$$A = \begin{bmatrix}0&2\\2&0\end{bmatrix},$$
which has eigenvalues 2 and −2. Since A has rank 2, it row-reduces to I₂, so A and I₂ are row-equivalent but do not have the same eigenvalues.
26. By definition, the companion matrix of p(x) = x² − 7x + 12 is
$$C(p) = \begin{bmatrix}-(-7)&-12\\1&0\end{bmatrix} = \begin{bmatrix}7&-12\\1&0\end{bmatrix},$$
which has characteristic polynomial
$$\begin{vmatrix}7-\lambda&-12\\1&-\lambda\end{vmatrix} = (7-\lambda)(-\lambda)+12 = \lambda^2-7\lambda+12.$$
27. By definition, the companion matrix of p(x) = x³ + 3x² − 4x + 12 is
$$C(p) = \begin{bmatrix}-3&-(-4)&-12\\1&0&0\\0&1&0\end{bmatrix} = \begin{bmatrix}-3&4&-12\\1&0&0\\0&1&0\end{bmatrix},$$
which has characteristic polynomial (expanding along the third column)
$$\begin{vmatrix}-3-\lambda&4&-12\\1&-\lambda&0\\0&1&-\lambda\end{vmatrix} = -12\begin{vmatrix}1&-\lambda\\0&1\end{vmatrix}+(-\lambda)\begin{vmatrix}-3-\lambda&4\\1&-\lambda\end{vmatrix}$$
$$= -12-\lambda\left((-3-\lambda)(-\lambda)-4\right) = -\lambda^3-3\lambda^2+4\lambda-12.$$
28. (a) If p(x) = x² + ax + b, then
$$C(p) = \begin{bmatrix}-a&-b\\1&0\end{bmatrix},$$
so the characteristic polynomial of C(p) is
$$\begin{vmatrix}-a-\lambda&-b\\1&-\lambda\end{vmatrix} = (-a-\lambda)(-\lambda)+b = \lambda^2+a\lambda+b.$$
(b) Suppose that λ is an eigenvalue of C(p), and that a corresponding eigenvector is $\mathbf{x} = \begin{bmatrix}x_1\\x_2\end{bmatrix}$. Then $\lambda\begin{bmatrix}x_1\\x_2\end{bmatrix} = C(p)\begin{bmatrix}x_1\\x_2\end{bmatrix}$ means that
$$\begin{bmatrix}\lambda x_1\\\lambda x_2\end{bmatrix} = \begin{bmatrix}-a&-b\\1&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}-ax_1-bx_2\\x_1\end{bmatrix},$$
so that $x_1 = \lambda x_2$. Setting $x_2 = 1$ gives an eigenvector $\begin{bmatrix}\lambda\\1\end{bmatrix}$.
29. (a) If p(x) = x³ + ax² + bx + c, then
$$C(p) = \begin{bmatrix}-a&-b&-c\\1&0&0\\0&1&0\end{bmatrix},$$
so the characteristic polynomial of C(p) is (expanding along the third column)
$$\begin{vmatrix}-a-\lambda&-b&-c\\1&-\lambda&0\\0&1&-\lambda\end{vmatrix} = -c\begin{vmatrix}1&-\lambda\\0&1\end{vmatrix}+(-\lambda)\begin{vmatrix}-a-\lambda&-b\\1&-\lambda\end{vmatrix}$$
$$= -c-\lambda\left((-a-\lambda)(-\lambda)+b\right) = -\lambda^3-a\lambda^2-b\lambda-c.$$
(b) Suppose that λ is an eigenvalue of C(p), and that a corresponding eigenvector is $\mathbf{x} = \begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}$. Then $\lambda\mathbf{x} = C(p)\mathbf{x}$ means that
$$\begin{bmatrix}\lambda x_1\\\lambda x_2\\\lambda x_3\end{bmatrix} = \begin{bmatrix}-a&-b&-c\\1&0&0\\0&1&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}-ax_1-bx_2-cx_3\\x_1\\x_2\end{bmatrix},$$
so that $x_1 = \lambda x_2$ and $x_2 = \lambda x_3$. Setting $x_3 = 1$ gives an eigenvector $\begin{bmatrix}\lambda^2\\\lambda\\1\end{bmatrix}$.
30. Using Exercise 28, the characteristic polynomial of the matrix
$$C(p) = \begin{bmatrix}-a&-b\\1&0\end{bmatrix}$$
is λ² + aλ + b. If we want this matrix to have eigenvalues 2 and 5, then λ² + aλ + b = (λ − 2)(λ − 5) = λ² − 7λ + 10, so we can take
$$A = \begin{bmatrix}7&-10\\1&0\end{bmatrix}.$$
31. Using Exercise 29, the characteristic polynomial of the matrix
$$C(p) = \begin{bmatrix}-a&-b&-c\\1&0&0\\0&1&0\end{bmatrix}$$
is −(λ³ + aλ² + bλ + c). If we want this matrix to have eigenvalues −2, 1, and 3, then
$$\lambda^3+a\lambda^2+b\lambda+c = (\lambda+2)(\lambda-1)(\lambda-3) = \lambda^3-2\lambda^2-5\lambda+6,$$
so we can take
$$A = \begin{bmatrix}2&5&-6\\1&0&0\\0&1&0\end{bmatrix}.$$
32. (a) Exercises 28 and 29 provide a basis for the induction, proving the cases n = 2 and n = 3. So assume that the statement holds for n = k; that is, that
$$\begin{vmatrix}-a_{k-1}-\lambda&-a_{k-2}&\cdots&-a_1&-a_0\\1&-\lambda&\cdots&0&0\\0&1&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&-\lambda\end{vmatrix} = (-1)^k p(\lambda),$$
where $p(x) = x^k+a_{k-1}x^{k-1}+\cdots+a_1x+a_0$. Now suppose that $q(x) = x^{k+1}+b_kx^k+\cdots+b_1x+b_0$ has degree k + 1; then
$$\det(C(q)-\lambda I) = \begin{vmatrix}-b_k-\lambda&-b_{k-1}&\cdots&-b_1&-b_0\\1&-\lambda&\cdots&0&0\\0&1&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&-\lambda\end{vmatrix}.$$
Evaluate this determinant by expanding along the last column, giving
$$\det(C(q)-\lambda I) = (-1)^k(-b_0)\begin{vmatrix}1&-\lambda&\cdots&0\\0&1&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&1\end{vmatrix}+(-\lambda)\begin{vmatrix}-b_k-\lambda&-b_{k-1}&\cdots&-b_2&-b_1\\1&-\lambda&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&-\lambda\end{vmatrix}.$$
The first determinant is of an upper triangular matrix, so it is the product of its diagonal entries, which is 1. The second determinant is the characteristic polynomial of the companion matrix of $x^k+b_kx^{k-1}+\cdots+b_2x+b_1$, which is a kth degree polynomial. So by the inductive hypothesis, this determinant is $(-1)^k(\lambda^k+b_k\lambda^{k-1}+\cdots+b_2\lambda+b_1)$. So the entire expression becomes
$$\det(C(q)-\lambda I) = (-1)^k(-b_0)-\lambda(-1)^k(\lambda^k+b_k\lambda^{k-1}+\cdots+b_2\lambda+b_1)$$
$$= (-1)^{k+1}b_0+(-1)^{k+1}(\lambda^{k+1}+b_k\lambda^k+\cdots+b_2\lambda^2+b_1\lambda) = (-1)^{k+1}q(\lambda).$$
(b) The method is quite similar to Exercises 28(b) and 29(b). Suppose that λ is an eigenvalue of C(p) with eigenvector
$$\mathbf{x} = \begin{bmatrix}x_1\\x_2\\\vdots\\x_{n-1}\\x_n\end{bmatrix}.$$
Then λx = C(p)x, so that
$$\begin{bmatrix}\lambda x_1\\\lambda x_2\\\vdots\\\lambda x_{n-1}\\\lambda x_n\end{bmatrix} = \begin{bmatrix}-a_{n-1}&-a_{n-2}&\cdots&-a_1&-a_0\\1&0&\cdots&0&0\\0&1&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\\\vdots\\x_{n-1}\\x_n\end{bmatrix} = \begin{bmatrix}-a_{n-1}x_1-a_{n-2}x_2-\cdots-a_1x_{n-1}-a_0x_n\\x_1\\x_2\\\vdots\\x_{n-1}\end{bmatrix}.$$
But this means that $x_i = \lambda x_{i+1}$ for i = 1, 2, . . . , n − 1. Setting $x_n = 1$ gives an eigenvector
$$\mathbf{x} = \begin{bmatrix}\lambda^{n-1}\\\lambda^{n-2}\\\vdots\\\lambda\\1\end{bmatrix}.$$
33. The characteristic polynomial of A is
$$c_A(\lambda) = \det(A-\lambda I) = \begin{vmatrix}1-\lambda&-1\\2&3-\lambda\end{vmatrix} = (1-\lambda)(3-\lambda)+2 = \lambda^2-4\lambda+5.$$
Now,
$$A^2 = \begin{bmatrix}1&-1\\2&3\end{bmatrix}\begin{bmatrix}1&-1\\2&3\end{bmatrix} = \begin{bmatrix}-1&-4\\8&7\end{bmatrix},$$
so that
$$c_A(A) = A^2-4A+5I_2 = \begin{bmatrix}-1&-4\\8&7\end{bmatrix}-4\begin{bmatrix}1&-1\\2&3\end{bmatrix}+5\begin{bmatrix}1&0\\0&1\end{bmatrix} = O.$$
34. The characteristic polynomial of A is
$$c_A(\lambda) = \det(A-\lambda I) = \begin{vmatrix}1-\lambda&1&0\\1&-\lambda&1\\0&1&1-\lambda\end{vmatrix} = -\lambda^3+2\lambda^2+\lambda-2.$$
Now,
$$A^2 = \begin{bmatrix}1&1&0\\1&0&1\\0&1&1\end{bmatrix}^2 = \begin{bmatrix}2&1&1\\1&2&1\\1&1&2\end{bmatrix},\qquad A^3 = \begin{bmatrix}1&1&0\\1&0&1\\0&1&1\end{bmatrix}^3 = \begin{bmatrix}3&3&2\\3&2&3\\2&3&3\end{bmatrix},$$
so that
$$c_A(A) = -A^3+2A^2+A-2I_3 = -\begin{bmatrix}3&3&2\\3&2&3\\2&3&3\end{bmatrix}+2\begin{bmatrix}2&1&1\\1&2&1\\1&1&2\end{bmatrix}+\begin{bmatrix}1&1&0\\1&0&1\\0&1&1\end{bmatrix}-2\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}$$
$$= \begin{bmatrix}-3+4+1-2&-3+2+1-0&-2+2+0-0\\-3+2+1-0&-2+4+0-2&-3+2+1-0\\-2+2+0-0&-3+2+1-0&-3+4+1-2\end{bmatrix} = O.$$
35. Since A² − 4A + 5I = O, we have A² = 4A − 5I, and
$$A^3 = A(4A-5I) = 4A^2-5A = 4(4A-5I)-5A = 11A-20I,$$
$$A^4 = (A^2)(A^2) = (4A-5I)(4A-5I) = 16A^2-40A+25I = 16(4A-5I)-40A+25I = 24A-55I.$$
36. Since −A³ + 2A² + A − 2I = O, we have A³ = 2A² + A − 2I, and
$$A^4 = A(A^3) = A(2A^2+A-2I) = 2A^3+A^2-2A = 2(2A^2+A-2I)+A^2-2A = 5A^2-4I.$$
37. Since A² − 4A + 5I = O, we get A(A − 4I) = −5I. Then solving for $A^{-1}$ yields
$$A^{-1} = -\frac15(A-4I) = -\frac15A+\frac45I,$$
$$A^{-2} = \left(A^{-1}\right)^2 = \left(-\frac15A+\frac45I\right)^2 = \frac{1}{25}A^2-\frac{8}{25}A+\frac{16}{25}I = \frac{1}{25}(4A-5I)-\frac{8}{25}A+\frac{16}{25}I = -\frac{4}{25}A+\frac{11}{25}I.$$
38. Since −A³ + 2A² + A − 2I = O, rearranging terms gives A(−A² + 2A + I) = 2I. Then solving for $A^{-1}$ yields (using Exercise 36)
$$A^{-1} = \frac12(-A^2+2A+I) = -\frac12A^2+A+\frac12I,$$
$$A^{-2} = \left(A^{-1}\right)^2 = \left(-\frac12A^2+A+\frac12I\right)^2 = \frac14A^4-A^3+\frac12A^2+A+\frac14I$$
$$= \frac14(5A^2-4I)-(2A^2+A-2I)+\frac12A^2+A+\frac14I = -\frac14A^2+\frac54I.$$
39. Apply Exercise 69 in Section 4.2 to the matrix A − λI:
$$c_A(\lambda) = \det(A-\lambda I) = \det\begin{bmatrix}P-\lambda I&Q\\O&S-\lambda I\end{bmatrix} = \det(P-\lambda I)\det(S-\lambda I) = c_P(\lambda)\,c_S(\lambda).$$
(Note that neither O nor Q is affected by subtracting an identity matrix, since all the 1's are along the diagonal, in either P or S.)
40. Start with the equation given in the problem statement:
$$\det(A-\lambda I) = (-1)^n(\lambda-\lambda_1)(\lambda-\lambda_2)\cdots(\lambda-\lambda_n).$$
This is an identity in λ, so in particular it holds for λ = 0, where it becomes
$$\det A = \det(A-0I) = (-1)^n(-\lambda_1)(-\lambda_2)\cdots(-\lambda_n) = (-1)^n(-1)^n\lambda_1\lambda_2\cdots\lambda_n = \lambda_1\lambda_2\cdots\lambda_n,$$
proving the first result. For the second result, we will find the coefficients of $\lambda^{n-1}$ on the left and right sides. On the left side, the easiest method given the tools we have is induction on n. For n = 2,
$$\det(A-\lambda I) = \det\begin{bmatrix}a_{11}-\lambda&a_{12}\\a_{21}&a_{22}-\lambda\end{bmatrix} = (a_{11}-\lambda)(a_{22}-\lambda)-a_{12}a_{21} = \lambda^2-(a_{11}+a_{22})\lambda+(a_{11}a_{22}-a_{12}a_{21}),$$
so that the coefficient of $\lambda^{n-1} = \lambda$ is $-\operatorname{tr}(A)$. We claim that in general, for n ≥ 2, the coefficient of $\lambda^{n-1}$ in det(A − λI) is $(-1)^{n-1}\operatorname{tr}(A)$. Suppose this statement holds for n = k. Then if A is (k + 1) × (k + 1), we have (expanding along the first column of A − λI)
$$\det(A-\lambda I) = (a_{11}-\lambda)\det(A_{11}-\lambda I)+\sum_{i=2}^{k+1}(-1)^{i+1}a_{i1}\det(A_{i1}-\lambda I).$$
Now, the terms in the sum on the right each consist of a constant times the determinant of a k × k matrix obtained by removing row i and column 1 from A − λI. Since i > 1, this removes both the diagonal entry $a_{11}-\lambda$ and the diagonal entry in row i, so that only k − 1 entries containing λ remain. So the sum on the right cannot produce any terms with $\lambda^k$. For the other term, note that $A_{11}-\lambda I$ is a k × k matrix; using the inductive hypothesis we see that the coefficient of $\lambda^{k-1}$ in $\det(A_{11}-\lambda I)$ is $(-1)^{k-1}\operatorname{tr}(A_{11})$. Now
$$\det(A-\lambda I) = (a_{11}-\lambda)\left((-1)^k\lambda^k+(-1)^{k-1}\operatorname{tr}(A_{11})\lambda^{k-1}+\cdots\right)+\cdots,$$
so that the coefficient of $\lambda^k$ in det(A − λI) is
$$(-1)^ka_{11}-(-1)^{k-1}\operatorname{tr}(A_{11}) = (-1)^k\left(a_{11}+\operatorname{tr}(A_{11})\right) = (-1)^k(a_{11}+a_{22}+\cdots+a_{k+1,k+1}) = (-1)^k\operatorname{tr}(A).$$
Now look at the product on the right in the original expression
$$\det(A-\lambda I) = (-1)^n(\lambda-\lambda_1)(\lambda-\lambda_2)\cdots(\lambda-\lambda_n).$$
We get a multiple of $\lambda^{n-1}$ on the right-hand side when exactly n − 1 of the factors contribute λ and one contributes $-\lambda_i$ for some i; that contribution to the $\lambda^{n-1}$ term is $-\lambda_i\lambda^{n-1}$. Summing over all possible choices of i shows that the coefficient of the $\lambda^{n-1}$ term is
$$(-1)^n(-\lambda_1-\lambda_2-\cdots-\lambda_n) = (-1)^{n+1}(\lambda_1+\lambda_2+\cdots+\lambda_n) = (-1)^{n-1}(\lambda_1+\lambda_2+\cdots+\lambda_n).$$
Since the coefficient of $\lambda^{n-1}$ on the left must be the same as that on the right, we have
$$(-1)^{n-1}\operatorname{tr}(A) = (-1)^{n-1}(\lambda_1+\lambda_2+\cdots+\lambda_n),\qquad\text{or}\qquad\operatorname{tr}(A) = \lambda_1+\lambda_2+\cdots+\lambda_n.$$
41. Suppose that the eigenvalues of A are α₁, α₂, . . . , αₙ (repetitions included) and those of B are β₁, β₂, . . . , βₙ. Let the eigenvalues of A + B be λ₁, λ₂, . . . , λₙ. Then by Exercise 40,
$$\operatorname{tr}(A+B) = \lambda_1+\lambda_2+\cdots+\lambda_n,\qquad\operatorname{tr}(A)+\operatorname{tr}(B) = \alpha_1+\cdots+\alpha_n+\beta_1+\cdots+\beta_n.$$
Since tr(A + B) = tr(A) + tr(B) by Exercise 44(a) in Section 3.2, we get
$$\lambda_1+\lambda_2+\cdots+\lambda_n = \alpha_1+\cdots+\alpha_n+\beta_1+\cdots+\beta_n,$$
as desired. Let µ₁, µ₂, . . . , µₙ be the eigenvalues of AB. Then again by Exercise 40 we have
$$\mu_1\mu_2\cdots\mu_n = \det(AB) = \det(A)\det(B) = \alpha_1\alpha_2\cdots\alpha_n\,\beta_1\beta_2\cdots\beta_n,$$
as desired.
42. Suppose $\mathbf{x} = \sum_{i=1}^n c_i\mathbf{v}_i$, where the $\mathbf{v}_i$ are eigenvectors of A with corresponding eigenvalues $\lambda_i$. Then Theorem 4.18 tells us that the $\mathbf{v}_i$ are eigenvectors of $A^k$ with corresponding eigenvalues $\lambda_i^k$, so that
$$A^k\mathbf{x} = A^k\left(\sum_{i=1}^n c_i\mathbf{v}_i\right) = \sum_{i=1}^n c_iA^k\mathbf{v}_i = \sum_{i=1}^n c_i\lambda_i^k\mathbf{v}_i.$$

4.4 Similarity and Diagonalization
1. The two characteristic polynomials are
$$c_A(\lambda) = \begin{vmatrix}4-\lambda&1\\3&1-\lambda\end{vmatrix} = (4-\lambda)(1-\lambda)-3 = \lambda^2-5\lambda+1,$$
$$c_B(\lambda) = \begin{vmatrix}1-\lambda&0\\0&1-\lambda\end{vmatrix} = (1-\lambda)^2 = \lambda^2-2\lambda+1.$$
Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22.
2. The two characteristic polynomials are
$$c_A(\lambda) = \begin{vmatrix}2-\lambda&1\\-4&6-\lambda\end{vmatrix} = (2-\lambda)(6-\lambda)+4 = \lambda^2-8\lambda+16,$$
$$c_B(\lambda) = \begin{vmatrix}3-\lambda&-1\\-5&7-\lambda\end{vmatrix} = (3-\lambda)(7-\lambda)-5 = \lambda^2-10\lambda+16.$$
Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22.
3. Both A and B are triangular matrices, so by Theorem 4.15, A has eigenvalues 2 and 4, while B has
eigenvalues 1 and 4. By Theorem 4.22, the two matrices are not similar.
4. The two characteristic polynomials are
$$c_A(\lambda) = \begin{vmatrix}1-\lambda&0&0\\2&1-\lambda&-1\\0&-1&1-\lambda\end{vmatrix} = (1-\lambda)\begin{vmatrix}1-\lambda&-1\\-1&1-\lambda\end{vmatrix} = (1-\lambda)\left((1-\lambda)^2-1\right) = (1-\lambda)(\lambda^2-2\lambda) = -\lambda^3+3\lambda^2-2\lambda,$$
$$c_B(\lambda) = \begin{vmatrix}2-\lambda&0&2\\0&1-\lambda&0\\1&0&1-\lambda\end{vmatrix} = (1-\lambda)\begin{vmatrix}2-\lambda&2\\1&1-\lambda\end{vmatrix} = (1-\lambda)\left((2-\lambda)(1-\lambda)-2\right) = (1-\lambda)(\lambda^2-3\lambda) = -\lambda^3+4\lambda^2-3\lambda.$$
Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22.
5. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus
$$\lambda_1 = 4\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad\lambda_2 = 3\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}1\\2\end{bmatrix}\right).$$
6. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus
$$\lambda_1 = 2\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}3\\1\\2\end{bmatrix}\right),\qquad\lambda_2 = 0\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}1\\-1\\0\end{bmatrix}\right),\qquad\lambda_3 = -1\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}0\\1\\-1\end{bmatrix}\right).$$
7. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus
$$\lambda_1 = 6\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}3\\2\\3\end{bmatrix}\right),\qquad\lambda_2 = -2\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}0\\1\\-1\end{bmatrix},\begin{bmatrix}1\\0\\-1\end{bmatrix}\right).$$
8. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}5-\lambda&2\\2&5-\lambda\end{vmatrix} = (5-\lambda)^2-4 = \lambda^2-10\lambda+21 = (\lambda-3)(\lambda-7).$$
Since A is a 2 × 2 matrix with 2 distinct eigenvalues, it is diagonalizable by Theorem 4.25. To find the diagonalizing matrix P, we find the eigenvectors of A. To find the eigenspace corresponding to λ₁ = 3 we must find the null space of
$$A-3I = \begin{bmatrix}2&2\\2&2\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A-3I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}2&2&0\\2&2&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&1&0\\0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ₁ are of the form
$$\begin{bmatrix}-t\\t\end{bmatrix} = t\begin{bmatrix}-1\\1\end{bmatrix},\qquad\text{so}\qquad E_3 = \operatorname{span}\left(\begin{bmatrix}-1\\1\end{bmatrix}\right).$$
To find the eigenspace corresponding to λ₂ = 7 we must find the null space of
$$A-7I = \begin{bmatrix}-2&2\\2&-2\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A-7I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-2&2&0\\2&-2&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-1&0\\0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ₂ are of the form
$$\begin{bmatrix}t\\t\end{bmatrix} = t\begin{bmatrix}1\\1\end{bmatrix},\qquad\text{so}\qquad E_7 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right).$$
Thus
$$P = \begin{bmatrix}-1&1\\1&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}-\frac12&\frac12\\[2pt]\frac12&\frac12\end{bmatrix},$$
and so by Theorem 4.23,
$$P^{-1}AP = \begin{bmatrix}-\frac12&\frac12\\[2pt]\frac12&\frac12\end{bmatrix}\begin{bmatrix}5&2\\2&5\end{bmatrix}\begin{bmatrix}-1&1\\1&1\end{bmatrix} = \begin{bmatrix}3&0\\0&7\end{bmatrix}.$$
9. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}-3-\lambda&4\\-1&1-\lambda\end{vmatrix} = (-3-\lambda)(1-\lambda)-4\cdot(-1) = \lambda^2+2\lambda+1 = (\lambda+1)^2.$$
So the only eigenvalue is λ = −1, with algebraic multiplicity 2. To see if A is diagonalizable, we must see if its geometric multiplicity is also 2 by finding its eigenvectors. To find the eigenspace corresponding to λ = −1, we must find the null space of
$$A+I = \begin{bmatrix}-2&4\\-1&2\end{bmatrix}$$
by row-reducing it:
$$[\,A+I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-2&4&0\\-1&2&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-2&0\\0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ = −1 are of the form $\begin{bmatrix}2t\\t\end{bmatrix}$, so a basis for the eigenspace is $\begin{bmatrix}2\\1\end{bmatrix}$. Thus λ = −1 has geometric multiplicity 1, which is less than the algebraic multiplicity, so that A is not diagonalizable by Theorem 4.27.
10. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}3-\lambda&1&0\\0&3-\lambda&1\\0&0&3-\lambda\end{vmatrix} = (3-\lambda)^3$$
since the matrix is a triangular matrix. Thus the only eigenvalue is λ = 3, with algebraic multiplicity 3. To see if A is diagonalizable, we must compute the eigenspace corresponding to this eigenvalue. To find the eigenspace corresponding to λ = 3 we row-reduce [A − 3I | 0]:
$$[\,A-3I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}0&1&0&0\\0&0&1&0\\0&0&0&0\end{array}\right].$$
But this matrix is already row-reduced, and thus the eigenspace is
$$\left\{\begin{bmatrix}s\\0\\0\end{bmatrix}\right\} = \operatorname{span}\left(\begin{bmatrix}1\\0\\0\end{bmatrix}\right).$$
Thus the geometric multiplicity is only 1 while the algebraic multiplicity is 3. So by Theorem 4.27, the matrix is not diagonalizable.
11. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}1-\lambda&0&1\\0&1-\lambda&1\\1&1&-\lambda\end{vmatrix} = -\lambda^3+2\lambda^2+\lambda-2 = -(\lambda+1)(\lambda-1)(\lambda-2).$$
Since A is a 3 × 3 matrix with 3 distinct eigenvalues, it is diagonalizable by Theorem 4.25. To find the diagonalizing matrix P, we find the eigenvectors of A. To find the eigenspace corresponding to λ = −1, we must find the null space of
$$A+I = \begin{bmatrix}2&0&1\\0&2&1\\1&1&1\end{bmatrix}$$
by row-reducing it:
$$[\,A+I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}2&0&1&0\\0&2&1&0\\1&1&1&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&\frac12&0\\0&1&\frac12&0\\0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ = −1 are of the form
$$\begin{bmatrix}-\frac12t\\[2pt]-\frac12t\\[2pt]t\end{bmatrix}\qquad\text{or, clearing fractions,}\qquad\begin{bmatrix}-t\\-t\\2t\end{bmatrix}.$$
Thus a basis for this eigenspace is $\begin{bmatrix}-1\\-1\\2\end{bmatrix}$.
To find the eigenspace corresponding to λ = 1, we must find the null space of
$$A-I = \begin{bmatrix}0&0&1\\0&0&1\\1&1&-1\end{bmatrix}$$
by row-reducing it:
$$[\,A-I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}0&0&1&0\\0&0&1&0\\1&1&-1&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&1&0&0\\0&0&1&0\\0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ = 1 are of the form $\begin{bmatrix}-t\\t\\0\end{bmatrix}$, so a basis for the eigenspace is $\begin{bmatrix}-1\\1\\0\end{bmatrix}$.
To find the eigenspace corresponding to λ = 2, we must find the null space of
$$A-2I = \begin{bmatrix}-1&0&1\\0&-1&1\\1&1&-2\end{bmatrix}$$
by row-reducing it:
$$[\,A-2I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}-1&0&1&0\\0&-1&1&0\\1&1&-2&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&-1&0\\0&1&-1&0\\0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ = 2 are of the form $\begin{bmatrix}t\\t\\t\end{bmatrix}$, so a basis for the eigenspace is $\begin{bmatrix}1\\1\\1\end{bmatrix}$.
Then a diagonalizing matrix P is the matrix whose columns are the eigenvectors of A:
$$P = \begin{bmatrix}-1&-1&1\\-1&1&1\\2&0&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}-\frac16&-\frac16&\frac13\\[2pt]-\frac12&\frac12&0\\[2pt]\frac13&\frac13&\frac13\end{bmatrix},$$
and so by Theorem 4.23,
$$P^{-1}AP = \begin{bmatrix}-\frac16&-\frac16&\frac13\\[2pt]-\frac12&\frac12&0\\[2pt]\frac13&\frac13&\frac13\end{bmatrix}\begin{bmatrix}1&0&1\\0&1&1\\1&1&0\end{bmatrix}\begin{bmatrix}-1&-1&1\\-1&1&1\\2&0&1\end{bmatrix} = \begin{bmatrix}-1&0&0\\0&1&0\\0&0&2\end{bmatrix}.$$
12. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}1-\lambda&0&0\\2&2-\lambda&1\\3&0&1-\lambda\end{vmatrix} = (1-\lambda)\begin{vmatrix}2-\lambda&1\\0&1-\lambda\end{vmatrix} = (1-\lambda)^2(2-\lambda).$$
So the eigenvalues are λ₁ = 1 with algebraic multiplicity 2, and λ₂ = 2 with algebraic multiplicity 1. To find the eigenspace corresponding to λ₁ = 1 we row-reduce [A − I | 0]:
$$[\,A-I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}0&0&0&0\\2&1&1&0\\3&0&0&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&0&0\\0&1&1&0\\0&0&0&0\end{array}\right].$$
Thus the eigenspace is
$$\left\{\begin{bmatrix}0\\-t\\t\end{bmatrix}\right\} = \operatorname{span}\left(\begin{bmatrix}0\\-1\\1\end{bmatrix}\right).$$
Thus the geometric multiplicity of this eigenvalue is only 1 while the algebraic multiplicity is 2. So by Theorem 4.27, the matrix is not diagonalizable.
13. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}1-\lambda&2&1\\-1&-\lambda&1\\1&1&-\lambda\end{vmatrix} = \lambda^2-\lambda^3 = -\lambda^2(\lambda-1).$$
So A has eigenvalues 0 and 1. Considering λ = 0, we row-reduce [A − 0I | 0] = [A | 0]:
$$\left[\begin{array}{ccc|c}1&2&1&0\\-1&0&1&0\\1&1&0&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&-1&0\\0&1&1&0\\0&0&0&0\end{array}\right],$$
so that the eigenspace is
$$\operatorname{span}\left(\begin{bmatrix}1\\-1\\1\end{bmatrix}\right).$$
Since λ = 0 has algebraic multiplicity 2 but geometric multiplicity 1, the matrix is not diagonalizable.
14. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}2-\lambda&0&0&2\\0&3-\lambda&2&1\\0&0&3-\lambda&0\\0&0&0&1-\lambda\end{vmatrix} = (1-\lambda)(2-\lambda)(3-\lambda)^2$$
since the matrix is upper triangular. So the eigenvalues are λ₁ = 1 with algebraic multiplicity 1, λ₂ = 2 with algebraic multiplicity 1, and λ₃ = 3 with algebraic multiplicity 2. To find the eigenspace corresponding to λ₃ = 3 we row-reduce [A − 3I | 0]:
$$[\,A-3I\mid\mathbf{0}\,] = \left[\begin{array}{cccc|c}-1&0&0&2&0\\0&0&2&1&0\\0&0&0&0&0\\0&0&0&-2&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}1&0&0&0&0\\0&0&1&0&0\\0&0&0&1&0\\0&0&0&0&0\end{array}\right].$$
Thus the eigenspace is
$$\left\{\begin{bmatrix}0\\t\\0\\0\end{bmatrix}\right\} = \operatorname{span}\left(\begin{bmatrix}0\\1\\0\\0\end{bmatrix}\right).$$
Thus the geometric multiplicity of this eigenvalue is only 1 while the algebraic multiplicity is 2. So by Theorem 4.27, the matrix is not diagonalizable.
15. Since A is triangular, we can read off its eigenvalues, which are λ₁ = 2 and λ₂ = −2, each of which has algebraic multiplicity 2. Computing eigenspaces, we have
$$[\,A-2I\mid\mathbf{0}\,] = \left[\begin{array}{cccc|c}0&0&0&4&0\\0&0&0&0&0\\0&0&-4&0&0\\0&0&0&-4&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}0&0&1&0&0\\0&0&0&1&0\\0&0&0&0&0\\0&0&0&0&0\end{array}\right],\quad\text{so } E_2\text{ has basis }\begin{bmatrix}1\\0\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\\0\end{bmatrix},$$
$$[\,A+2I\mid\mathbf{0}\,] = \left[\begin{array}{cccc|c}4&0&0&4&0\\0&4&0&0&0\\0&0&0&0&0\\0&0&0&0&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}1&0&0&1&0\\0&1&0&0&0\\0&0&0&0&0\\0&0&0&0&0\end{array}\right],\quad\text{so } E_{-2}\text{ has basis }\begin{bmatrix}-1\\0\\0\\1\end{bmatrix},\begin{bmatrix}0\\0\\1\\0\end{bmatrix}.$$
Since each eigenvalue also has geometric multiplicity 2, the matrix is diagonalizable, and
$$P = \begin{bmatrix}1&0&-1&0\\0&1&0&0\\0&0&0&1\\0&0&1&0\end{bmatrix},\qquad D = \begin{bmatrix}2&0&0&0\\0&2&0&0\\0&0&-2&0\\0&0&0&-2\end{bmatrix}$$
provide a solution to $P^{-1}AP = D$.
16. As in Example 4.29, we must find a matrix P such that
$$P^{-1}AP = P^{-1}\begin{bmatrix}-4&6\\-3&5\end{bmatrix}P$$
is diagonal. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
$$\det(A-\lambda I) = \begin{vmatrix}-4-\lambda&6\\-3&5-\lambda\end{vmatrix} = (-4-\lambda)(5-\lambda)+18 = \lambda^2-\lambda-2.$$
This polynomial factors as (λ − 2)(λ + 1), so the eigenvalues are λ₁ = −1 and λ₂ = 2. The eigenspace associated with λ₁ can be found by finding the null space of A + I:
$$[\,A+I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-3&6&0\\-3&6&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-2&0\\0&0&0\end{array}\right].$$
Thus a basis for this eigenspace is given by $\begin{bmatrix}2\\1\end{bmatrix}$. The eigenspace associated with λ₂ can be found by finding the null space of A − 2I:
$$[\,A-2I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-6&6&0\\-3&3&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-1&0\\0&0&0\end{array}\right].$$
Thus a basis for this eigenspace is given by $\begin{bmatrix}1\\1\end{bmatrix}$. Set
$$P = \begin{bmatrix}1&2\\1&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}-1&2\\1&-1\end{bmatrix}.$$
Then
$$D = P^{-1}AP = \begin{bmatrix}2&0\\0&-1\end{bmatrix},\qquad\text{so that}\qquad A^5 = PD^5P^{-1} = \begin{bmatrix}1&2\\1&1\end{bmatrix}\begin{bmatrix}32&0\\0&-1\end{bmatrix}\begin{bmatrix}-1&2\\1&-1\end{bmatrix} = \begin{bmatrix}-34&66\\-33&65\end{bmatrix}.$$
17. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
$$\det(A-\lambda I) = \begin{vmatrix}-1-\lambda&6\\1&-\lambda\end{vmatrix} = (-1-\lambda)(-\lambda)-6 = \lambda^2+\lambda-6 = (\lambda+3)(\lambda-2),$$
so the eigenvalues are λ₁ = −3 and λ₂ = 2. To compute the eigenspaces, we row-reduce [A + 3I | 0] and [A − 2I | 0]:
$$[\,A+3I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}2&6&0\\1&3&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&3&0\\0&0&0\end{array}\right],\quad\text{so } E_{-3}\text{ has basis }\begin{bmatrix}-3\\1\end{bmatrix},$$
$$[\,A-2I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-3&6&0\\1&-2&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-2&0\\0&0&0\end{array}\right],\quad\text{so } E_2\text{ has basis }\begin{bmatrix}2\\1\end{bmatrix}.$$
Thus
$$P = \begin{bmatrix}-3&2\\1&1\end{bmatrix},\qquad P^{-1} = \begin{bmatrix}-\frac15&\frac25\\[2pt]\frac15&\frac35\end{bmatrix},\qquad D = \begin{bmatrix}-3&0\\0&2\end{bmatrix}$$
satisfy $P^{-1}AP = D$, so that
$$A^{10} = (PDP^{-1})^{10} = PD^{10}P^{-1} = \begin{bmatrix}-3&2\\1&1\end{bmatrix}\begin{bmatrix}(-3)^{10}&0\\0&2^{10}\end{bmatrix}\begin{bmatrix}-\frac15&\frac25\\[2pt]\frac15&\frac35\end{bmatrix} = \begin{bmatrix}35{,}839&-69{,}630\\-11{,}605&24{,}234\end{bmatrix}.$$
1
18. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for
the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
4−λ
−1
det(A − λI) =
−3
= (4 − λ)(2 − λ) − 3 = λ2 − 6λ + 5 = (λ − 5)(λ − 1),
2−λ
A−I
so
the
eigenvalues
are
λ
=
1
and
λ
=
5.
To
compute
the
eigenspaces,
we
row-reduce
1
2
A − 5I 0 :
A−I
A − 5I
0
1
→
0
0
0
1
→
0
0
−3
1
3
0 =
−1
−1
0 =
−1
−3
−3
−1
0
3
0
0 and
0
1
, so that E1 has basis
,
0
1
0
−3
, so that E5 has basis
.
0
1
Set
"
P =
1
#
−3
1
1
−6
−1
"
,
#
so that P −1 =
1
4
− 14
"
#"
3
4
1
4,
"
,
D=
1
#
0
0
5
.
Then
A
−6
= PD
P
=
1
#"
−3 1
0
1
1
5−6
0
1
4
− 14
3
4
1
4
#
"
=
3907
56
3906
56
11718
56
11719
56
#
.
19. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
$$\det(A-\lambda I) = \begin{vmatrix}-\lambda&3\\1&2-\lambda\end{vmatrix} = (-\lambda)(2-\lambda)-3 = \lambda^2-2\lambda-3 = (\lambda-3)(\lambda+1),$$
so the eigenvalues are λ₁ = 3 and λ₂ = −1. To compute the eigenspaces, we row-reduce [A − 3I | 0] and [A + I | 0]:
$$[\,A-3I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-3&3&0\\1&-1&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-1&0\\0&0&0\end{array}\right],\quad\text{so } E_3\text{ has basis }\begin{bmatrix}1\\1\end{bmatrix},$$
$$[\,A+I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}1&3&0\\1&3&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&3&0\\0&0&0\end{array}\right],\quad\text{so } E_{-1}\text{ has basis }\begin{bmatrix}-3\\1\end{bmatrix}.$$
Set
$$P = \begin{bmatrix}1&-3\\1&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}\frac14&\frac34\\[2pt]-\frac14&\frac14\end{bmatrix},\qquad D = \begin{bmatrix}3&0\\0&-1\end{bmatrix}.$$
Then
$$A^k = PD^kP^{-1} = \begin{bmatrix}1&-3\\1&1\end{bmatrix}\begin{bmatrix}3^k&0\\0&(-1)^k\end{bmatrix}\begin{bmatrix}\frac14&\frac34\\[2pt]-\frac14&\frac14\end{bmatrix} = \begin{bmatrix}\frac{3^k+3(-1)^k}{4}&\frac{3^{k+1}-3(-1)^k}{4}\\[2pt]\frac{3^k-(-1)^k}{4}&\frac{3^{k+1}+(-1)^k}{4}\end{bmatrix}.$$
20. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
$$\det(A-\lambda I) = \begin{vmatrix}2-\lambda&1&2\\2&1-\lambda&2\\2&1&2-\lambda\end{vmatrix} = 5\lambda^2-\lambda^3 = -\lambda^2(\lambda-5),$$
so the eigenvalues are λ₁ = 0 and λ₂ = 5. To compute the eigenspaces we row-reduce [A − 0I | 0] = [A | 0] and [A − 5I | 0]:
$$[\,A\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}2&1&2&0\\2&1&2&0\\2&1&2&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&\frac12&1&0\\0&0&0&0\\0&0&0&0\end{array}\right],\quad\text{so } E_0\text{ has basis }\begin{bmatrix}-1\\2\\0\end{bmatrix},\begin{bmatrix}-1\\0\\1\end{bmatrix},$$
$$[\,A-5I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}-3&1&2&0\\2&-4&2&0\\2&1&-3&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&-1&0\\0&1&-1&0\\0&0&0&0\end{array}\right],\quad\text{so } E_5\text{ has basis }\begin{bmatrix}1\\1\\1\end{bmatrix}.$$
Set
$$P = \begin{bmatrix}-1&-1&1\\2&0&1\\0&1&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}-\frac15&\frac25&-\frac15\\[2pt]-\frac25&-\frac15&\frac35\\[2pt]\frac25&\frac15&\frac25\end{bmatrix}\qquad\text{and}\qquad D = P^{-1}AP = \begin{bmatrix}0&0&0\\0&0&0\\0&0&5\end{bmatrix},$$
so that
$$A^8 = PD^8P^{-1} = \begin{bmatrix}156250&78125&156250\\156250&78125&156250\\156250&78125&156250\end{bmatrix}.$$
21. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. Since A is upper triangular, its eigenvalues are the diagonal entries, which are $\lambda_1 = -1$ and $\lambda_2 = 1$. To compute the eigenspaces, we row-reduce $[A + I \mid 0]$ and $[A - I \mid 0]$:

$[A + I \mid 0] = \begin{bmatrix} 2 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & \frac12 & \frac12 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_{-1}$ has basis $\{(-1, 2, 0), (-1, 0, 2)\}$,

$[A - I \mid 0] = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & -2 & 0 & 0 \\ 0 & 0 & -2 & 0 \end{bmatrix} \to \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_1$ has basis $\{(1, 0, 0)\}$.

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

Set

$P = \begin{bmatrix} -1 & -1 & 1 \\ 2 & 0 & 0 \\ 0 & 2 & 0 \end{bmatrix}$, so that $P^{-1} = \begin{bmatrix} 0 & \frac12 & 0 \\ 0 & 0 & \frac12 \\ 1 & \frac12 & \frac12 \end{bmatrix}$.

Then

$D = P^{-1} A P = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$

so that, since $(-1)^{2015} = -1$,

$A^{2015} = P D^{2015} P^{-1} = P \begin{bmatrix} (-1)^{2015} & 0 & 0 \\ 0 & (-1)^{2015} & 0 \\ 0 & 0 & 1^{2015} \end{bmatrix} P^{-1} = P D P^{-1} = A.$
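A short numerical confirmation (a sketch, not part of the manual, assuming from the solution above that A is the upper triangular matrix with rows (1, 1, 1), (0, -1, 0), (0, 0, -1)):

```python
import numpy as np

# Assumed matrix from the solution above.
A = np.array([[1, 1, 1], [0, -1, 0], [0, 0, -1]])

# A^2 = I, so every odd power of A equals A; in particular A^2015 = A.
assert (np.linalg.matrix_power(A, 2) == np.eye(3, dtype=int)).all()
assert (np.linalg.matrix_power(A, 2015) == A).all()
```

The integer entries stay at 0 and ±1 throughout, so there is no overflow even at the 2015th power.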
22. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial

$\det(A - \lambda I) = \begin{vmatrix} 2-\lambda & 0 & 1 \\ 1 & 1-\lambda & 1 \\ 1 & 0 & 2-\lambda \end{vmatrix} = (1-\lambda)\begin{vmatrix} 2-\lambda & 1 \\ 1 & 2-\lambda \end{vmatrix} = (1-\lambda)((2-\lambda)^2 - 1) = (1-\lambda)(\lambda^2 - 4\lambda + 3) = -(\lambda-1)^2(\lambda-3).$

Thus the eigenvalues are $\lambda_1 = 1$ and $\lambda_2 = 3$. To compute the eigenspaces, we row-reduce $[A - I \mid 0]$ and $[A - 3I \mid 0]$:

$[A - I \mid 0] = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_1$ has basis $\{(-1, 0, 1), (0, 1, 0)\}$,

$[A - 3I \mid 0] = \begin{bmatrix} -1 & 0 & 1 & 0 \\ 1 & -2 & 1 & 0 \\ 1 & 0 & -1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_3$ has basis $\{(1, 1, 1)\}$.

Set

$P = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$, so that $P^{-1} = \begin{bmatrix} \frac12 & 0 & \frac12 \\ -\frac12 & 0 & \frac12 \\ -\frac12 & 1 & -\frac12 \end{bmatrix}$ and $D = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$

so that

$A^k = P D^k P^{-1} = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 3^k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \frac12 & 0 & \frac12 \\ -\frac12 & 0 & \frac12 \\ -\frac12 & 1 & -\frac12 \end{bmatrix} = \begin{bmatrix} \frac12(3^k+1) & 0 & \frac12(3^k-1) \\ \frac12(3^k-1) & 1 & \frac12(3^k-1) \\ \frac12(3^k-1) & 0 & \frac12(3^k+1) \end{bmatrix}.$
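The closed form can again be checked numerically (a sketch, not part of the manual, assuming from the reductions above that A = [[2,0,1],[1,1,1],[1,0,2]]):

```python
import numpy as np

# Assumed matrix from the row reductions above.
A = np.array([[2, 0, 1], [1, 1, 1], [1, 0, 2]])

def A_power(k):
    p, m = (3**k + 1) / 2, (3**k - 1) / 2  # the two recurring entries
    return np.array([[p, 0, m],
                     [m, 1, m],
                     [m, 0, p]])

for k in range(7):
    assert np.allclose(np.linalg.matrix_power(A, k), A_power(k))
```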
23. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 1 & 0 \\ 2 & -2-\lambda & 2 \\ 0 & 1 & 1-\lambda \end{vmatrix} = -(\lambda-1)(\lambda-2)(\lambda+3).$

Thus the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = 2$, and $\lambda_3 = -3$. To compute the eigenspaces, we row-reduce $[A - I \mid 0]$, $[A - 2I \mid 0]$, and $[A + 3I \mid 0]$:

$[A - I \mid 0] = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 2 & -3 & 2 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_1$ has basis $\{(-1, 0, 1)\}$,

$[A - 2I \mid 0] = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 2 & -4 & 2 & 0 \\ 0 & 1 & -1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_2$ has basis $\{(1, 1, 1)\}$,

$[A + 3I \mid 0] = \begin{bmatrix} 4 & 1 & 0 & 0 \\ 2 & 1 & 2 & 0 \\ 0 & 1 & 4 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_{-3}$ has basis $\{(1, -4, 1)\}$.

Set

$P = \begin{bmatrix} -1 & 1 & 1 \\ 0 & 1 & -4 \\ 1 & 1 & 1 \end{bmatrix}$, so that $P^{-1} = \begin{bmatrix} -\frac12 & 0 & \frac12 \\ \frac25 & \frac15 & \frac25 \\ \frac1{10} & -\frac15 & \frac1{10} \end{bmatrix}$ and $D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -3 \end{bmatrix},$

so that

$A^k = P D^k P^{-1} = \begin{bmatrix} -1 & 1 & 1 \\ 0 & 1 & -4 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2^k & 0 \\ 0 & 0 & (-3)^k \end{bmatrix} \begin{bmatrix} -\frac12 & 0 & \frac12 \\ \frac25 & \frac15 & \frac25 \\ \frac1{10} & -\frac15 & \frac1{10} \end{bmatrix}$

$= \begin{bmatrix} \frac1{10}(5 + (-3)^k + 2^{k+2}) & \frac15(2^k - (-3)^k) & \frac1{10}(-5 + (-3)^k + 2^{k+2}) \\ -\frac25((-3)^k - 2^k) & \frac15(4(-3)^k + 2^k) & -\frac25((-3)^k - 2^k) \\ \frac1{10}(-5 + (-3)^k + 2^{k+2}) & \frac15(2^k - (-3)^k) & \frac1{10}(5 + (-3)^k + 2^{k+2}) \end{bmatrix}.$
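A numerical check of this closed form (a sketch, not part of the manual, assuming from the reductions above that A = [[1,1,0],[2,-2,2],[0,1,1]]):

```python
import numpy as np

# Assumed matrix from the row reductions above; eigenvalues 1, 2, -3.
A = np.array([[1, 1, 0], [2, -2, 2], [0, 1, 1]])

def A_power(k):
    a, b = 2**k, (-3)**k
    d = (5 + b + 4*a) / 10    # (5 + (-3)^k + 2^(k+2)) / 10
    e = (-5 + b + 4*a) / 10
    f = (a - b) / 5
    g = -2 * (b - a) / 5
    h = (4*b + a) / 5
    return np.array([[d, f, e],
                     [g, h, g],
                     [e, f, d]])

for k in range(6):
    assert np.allclose(np.linalg.matrix_power(A, k), A_power(k))
```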
24. The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 1 \\ 0 & k-\lambda \end{vmatrix} = (\lambda - 1)(\lambda - k),$

so that if $k \neq 1$, then A is a $2 \times 2$ matrix with two distinct eigenvalues, so it is diagonalizable. If $k = 1$, then

$A - I = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},$

which gives a one-dimensional eigenspace with basis $\{(1, 0)\}$. Since $\lambda = 1$ has algebraic multiplicity 2 but geometric multiplicity 1, A is not diagonalizable. So A is diagonalizable exactly when $k \neq 1$.
25. The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & k \\ 0 & 1-\lambda \end{vmatrix} = (\lambda - 1)^2.$

The only eigenvalue is 1, and then

$A - I = \begin{bmatrix} 0 & k \\ 0 & 0 \end{bmatrix}.$

If $k = 0$, then A is diagonal to start with, while if $k \neq 0$, then we get a one-dimensional eigenspace with basis $\{(1, 0)\}$. Since $\lambda = 1$ then has algebraic multiplicity 2 but geometric multiplicity 1, A is not diagonalizable. So A is diagonalizable exactly when $k = 0$.
26. The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} k-\lambda & 1 \\ 1 & -\lambda \end{vmatrix} = \lambda(\lambda - k) - 1 = \lambda^2 - k\lambda - 1,$

so that A has the eigenvalues $\lambda = \frac{k \pm \sqrt{k^2 + 4}}{2}$. Since $k^2 + 4$ is always positive, the eigenvalues of A are real and distinct, so A is diagonalizable for all k.
27. Since A is triangular, the only eigenvalue is 1, with algebraic multiplicity 3. Then

$A - I = \begin{bmatrix} 0 & 0 & k \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$

If $k = 0$, then A is diagonal to start with, while if $k \neq 0$, then we get a two-dimensional eigenspace with basis $\{(1, 0, 0), (0, 1, 0)\}$. Since $\lambda = 1$ then has algebraic multiplicity 3 but geometric multiplicity 2, A is not diagonalizable. So A is diagonalizable exactly when $k = 0$.
28. Since A is triangular, the eigenvalues are 1, with algebraic multiplicity 2, and 2, with algebraic multiplicity 1. Then

$A - I = \begin{bmatrix} 0 & k & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$

regardless of the value of k, so $\lambda = 1$ has a two-dimensional eigenspace with basis $\{(1, 0, 0), (0, 0, 1)\}$, i.e., with geometric multiplicity 2. Since this is equal to the algebraic multiplicity of $\lambda$, we see that A is diagonalizable for all k.
29. The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 1 & k \\ 1 & 1-\lambda & k \\ 1 & 1 & k-\lambda \end{vmatrix} = -\lambda^3 + 2\lambda^2 + k\lambda^2 = -\lambda^2(\lambda - (k+2)).$

If $k = -2$, then A has only the eigenvalue 0, with algebraic multiplicity 3, and then

$A = \begin{bmatrix} 1 & 1 & -2 \\ 1 & 1 & -2 \\ 1 & 1 & -2 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & -2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$

so the eigenspace is two-dimensional and therefore the geometric multiplicity is not equal to the algebraic multiplicity, so that A is not diagonalizable. If $k \neq -2$, then 0 has algebraic multiplicity 2, and then

$A = \begin{bmatrix} 1 & 1 & k \\ 1 & 1 & k \\ 1 & 1 & k \end{bmatrix} \to \begin{bmatrix} 1 & 1 & k \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$

and the eigenspace is again two-dimensional, so the geometric multiplicity equals the algebraic multiplicity and A is diagonalizable. Thus A is diagonalizable exactly when $k \neq -2$.
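The multiplicity comparison above can be checked numerically for sample values of k (a sketch, not part of the manual; the function name `diagonalizable` is mine):

```python
import numpy as np

def diagonalizable(k):
    # A has rank 1, so the eigenvalue 0 always has geometric multiplicity 2;
    # A is diagonalizable exactly when that matches the algebraic multiplicity,
    # i.e. when the remaining eigenvalue k + 2 is nonzero.
    A = np.array([[1.0, 1.0, k]] * 3)
    geom_mult_0 = 3 - np.linalg.matrix_rank(A)
    alg_mult_0 = 3 if k == -2 else 2
    return geom_mult_0 == alg_mult_0

assert not diagonalizable(-2)
assert diagonalizable(0) and diagonalizable(5) and diagonalizable(-1)
```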
30. We are given that A ∼ B, so that B = P −1 AP for some invertible matrix P , and that B ∼ C, so that
C = Q−1 BQ for some invertible matrix Q. But then
C = Q−1 BQ = Q−1 P −1 AP Q = (P Q)−1 AP Q,
so that C is R−1 AR for the invertible matrix R = P Q.
31. A is invertible if and only if $\det A \neq 0$. But since A and B are similar, $\det A = \det B$ by part (a) of this theorem. Thus A is invertible if and only if $\det B \neq 0$, which happens if and only if B is invertible.
32. By Exercise 61 in Section 3.5, if U is invertible then rank A = rank(U A) = rank(AU ). Now, since A
and B are similar, we have B = P −1 AP for some invertible matrix P , so that
rank B = rank(P −1 (AP )) = rank(AP ) = rank A,
using the result above twice.
33. Suppose that B = P −1 AP where P is invertible. This can be rewritten as BP −1 = P −1 A. Let λ be
an eigenvalue of A with eigenvector x. Then
B(P −1 x) = (BP −1 )x = (P −1 A)x = P −1 (Ax) = P −1 (λx) = λ(P −1 x).
Thus P −1 x is an eigenvector for B, corresponding to λ. So every eigenvalue of A is also an eigenvalue
of B. By writing the similarity equation as A = Q−1 BQ, we see in exactly the same way that every
eigenvalue of B is an eigenvalue of A. So A and B have the same eigenvalues. (Note that we have also
proven a relationship between the eigenvectors corresponding to a given eigenvalue).
34. Suppose that $A \sim B$, say $A = P^{-1}BP$. Then if $m = 0$, we have $A^0 = B^0 = I$, so that $A^0 \sim B^0$. If $m > 0$, then

$A^m = (P^{-1}BP)^m = \underbrace{(P^{-1}BP)(P^{-1}BP) \cdots (P^{-1}BP)}_{m \text{ times}} = P^{-1}B^mP.$

But $A^m = P^{-1}B^mP$ means that $A^m \sim B^m$.
35. From part (f), $A^m \sim B^m$ for $m \geq 0$. If $m < 0$, then

$A^m = (A^{-m})^{-1} = (P^{-1}B^{-m}P)^{-1},$

since $A^{-m}$ and $B^{-m}$ are similar because $-m > 0$. But

$(P^{-1}B^{-m}P)^{-1} = P^{-1}(B^{-m})^{-1}(P^{-1})^{-1} = P^{-1}B^mP.$

Thus $A^m \sim B^m$ for $m < 0$, so they are similar for all integers m.
36. Let P = B −1 . Then
BA = BABB −1 = B(AB)B −1 = P −1 (AB)P,
showing that AB ∼ BA.
37. Exercise 45 in Section 3.2 shows that if A and B are n × n, then tr(AB) = tr(BA). But then if
B = P −1 AP , we have
tr B = tr(P −1 AP ) = tr((P −1 A)P ) = tr(P (P −1 A)) = tr((P P −1 )A) = tr A.
38. A is triangular, so it has eigenvalues $-1$ and 3. For B, we have

$\det(B - \lambda I) = \begin{vmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{vmatrix} = (1-\lambda)^2 - 4 = \lambda^2 - 2\lambda - 3 = (\lambda+1)(\lambda-3).$

So the eigenvalues of B are also $-1$ and 3, and thus

$A \sim B \sim \begin{bmatrix} -1 & 0 \\ 0 & 3 \end{bmatrix} = D.$

Reducing $A + I$ and $A - 3I$ shows that $E_{-1} = \operatorname{span}\{(1, -4)\}$ and $E_3 = \operatorname{span}\{(1, 0)\}$. Reducing $B + I$ and $B - 3I$ shows that $E_{-1} = \operatorname{span}\{(1, -1)\}$ and $E_3 = \operatorname{span}\{(1, 1)\}$. Thus

$D = Q^{-1}AQ = \begin{bmatrix} 0 & -\frac14 \\ 1 & \frac14 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -4 & 0 \end{bmatrix}, \qquad D = R^{-1}BR = \begin{bmatrix} \frac12 & -\frac12 \\ \frac12 & \frac12 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},$

so that finally

$B = RDR^{-1} = RQ^{-1}AQR^{-1} = (QR^{-1})^{-1}A(QR^{-1}).$

Thus

$P = QR^{-1} = \begin{bmatrix} 1 & 0 \\ -2 & 2 \end{bmatrix}$

is an invertible matrix such that $B = P^{-1}AP$.
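The change-of-basis matrix can be verified without computing an inverse, since B = P⁻¹AP is equivalent to PB = AP (a sketch, not part of the manual, assuming the matrices reconstructed above):

```python
import numpy as np

# Assumed from the solution above.
A = np.array([[3, 1], [0, -1]])
B = np.array([[1, 2], [2, 1]])
P = np.array([[1, 0], [-2, 2]])

# B = P^{-1} A P  <=>  P B = A P (avoids forming P^{-1}).
assert np.allclose(P @ B, A @ P)
```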
39. We have

$\det(A - \lambda I) = \begin{vmatrix} 5-\lambda & -3 \\ 4 & -2-\lambda \end{vmatrix} = (5-\lambda)(-2-\lambda) + 12 = \lambda^2 - 3\lambda + 2 = (\lambda-1)(\lambda-2),$

$\det(B - \lambda I) = \begin{vmatrix} -1-\lambda & 1 \\ -6 & 4-\lambda \end{vmatrix} = (-1-\lambda)(4-\lambda) + 6 = \lambda^2 - 3\lambda + 2 = (\lambda-1)(\lambda-2).$

So A and B both have eigenvalues 1 and 2, so that

$A \sim B \sim \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} = D.$

Reducing $A - I$ and $A - 2I$ shows that $E_1 = \operatorname{span}\{(3, 4)\}$ and $E_2 = \operatorname{span}\{(1, 1)\}$. Reducing $B - I$ and $B - 2I$ shows that $E_1 = \operatorname{span}\{(1, 2)\}$ and $E_2 = \operatorname{span}\{(1, 3)\}$. Thus

$D = Q^{-1}AQ = \begin{bmatrix} -1 & 1 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} 5 & -3 \\ 4 & -2 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 4 & 1 \end{bmatrix}, \qquad D = R^{-1}BR = \begin{bmatrix} 3 & -1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} -1 & 1 \\ -6 & 4 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix},$

so that finally

$B = RDR^{-1} = RQ^{-1}AQR^{-1} = (QR^{-1})^{-1}A(QR^{-1}).$

Thus

$P = QR^{-1} = \begin{bmatrix} 7 & -2 \\ 10 & -3 \end{bmatrix}$

is an invertible matrix such that $B = P^{-1}AP$.
40. A is upper triangular, so it has eigenvalues $-2$, 1, and 2. For B, we have

$\det(B - \lambda I) = \begin{vmatrix} 3-\lambda & 2 & -5 \\ 1 & 2-\lambda & -1 \\ 2 & 2 & -4-\lambda \end{vmatrix} = -(\lambda+2)(\lambda-1)(\lambda-2),$

so B also has eigenvalues $-2$, 1, and 2, so that

$A \sim B \sim \begin{bmatrix} -2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} = D.$

Reducing $A + 2I$, $A - I$, and $A - 2I$ gives

$E_{-2} = \operatorname{span}\{(1, -4, 0)\}, \quad E_1 = \operatorname{span}\{(1, -1, -3)\}, \quad E_2 = \operatorname{span}\{(1, 0, 0)\}.$

Reducing $B + 2I$, $B - I$, and $B - 2I$ gives

$E_{-2} = \operatorname{span}\{(1, 0, 1)\}, \quad E_1 = \operatorname{span}\{(1, -1, 0)\}, \quad E_2 = \operatorname{span}\{(1, 2, 1)\}.$

Therefore

$D = Q^{-1}AQ = \begin{bmatrix} 0 & -\frac14 & \frac1{12} \\ 0 & 0 & -\frac13 \\ 1 & \frac14 & \frac14 \end{bmatrix} \begin{bmatrix} 2 & 1 & 0 \\ 0 & -2 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ -4 & -1 & 0 \\ 0 & -3 & 0 \end{bmatrix},$

$D = R^{-1}BR = \begin{bmatrix} -\frac12 & -\frac12 & \frac32 \\ 1 & 0 & -1 \\ \frac12 & \frac12 & -\frac12 \end{bmatrix} \begin{bmatrix} 3 & 2 & -5 \\ 1 & 2 & -1 \\ 2 & 2 & -4 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & 2 \\ 1 & 0 & 1 \end{bmatrix},$

so that finally

$B = RDR^{-1} = RQ^{-1}AQR^{-1} = (QR^{-1})^{-1}A(QR^{-1}).$

Thus

$P = QR^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 2 & -5 \\ -3 & 0 & 3 \end{bmatrix}$

is an invertible matrix such that $B = P^{-1}AP$.
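As in the 2 × 2 case, the identity B = P⁻¹AP is equivalent to PB = AP, which gives a quick check (a sketch, not part of the manual, assuming the matrices reconstructed above):

```python
import numpy as np

# Assumed from the solution above.
A = np.array([[2, 1, 0], [0, -2, 1], [0, 0, 1]])
B = np.array([[3, 2, -5], [1, 2, -1], [2, 2, -4]])
P = np.array([[1, 0, 0], [1, 2, -5], [-3, 0, 3]])

# B = P^{-1} A P  <=>  P B = A P.
assert np.allclose(P @ B, A @ P)
# Both matrices should share the eigenvalues -2, 1, 2.
assert np.allclose(sorted(np.linalg.eigvals(A)), sorted(np.linalg.eigvals(B)))
```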
41. We have

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 0 & 2 \\ 1 & -1-\lambda & 1 \\ 2 & 0 & 1-\lambda \end{vmatrix} = -(\lambda+1)^2(\lambda-3),$

$\det(B - \lambda I) = \begin{vmatrix} -3-\lambda & -2 & 0 \\ 6 & 5-\lambda & 0 \\ 4 & 4 & -1-\lambda \end{vmatrix} = -(\lambda+1)^2(\lambda-3),$

so that A and B both have eigenvalues $-1$ (of algebraic multiplicity 2) and 3. Therefore

$A \sim B \sim \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 3 \end{bmatrix} = D.$

Reducing $A + I$ and $A - 3I$ gives

$E_{-1} = \operatorname{span}\{(1, 0, -1), (0, 1, 0)\}, \qquad E_3 = \operatorname{span}\{(2, 1, 2)\}.$

Reducing $B + I$ and $B - 3I$ gives

$E_{-1} = \operatorname{span}\{(1, -1, 0), (0, 0, 1)\}, \qquad E_3 = \operatorname{span}\{(1, -3, -2)\}.$

Therefore

$D = Q^{-1}AQ = \begin{bmatrix} \frac12 & 0 & -\frac12 \\ -\frac14 & 1 & -\frac14 \\ \frac14 & 0 & \frac14 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 1 & -1 & 1 \\ 2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ -1 & 0 & 2 \end{bmatrix},$

$D = R^{-1}BR = \begin{bmatrix} \frac32 & \frac12 & 0 \\ -1 & -1 & 1 \\ -\frac12 & -\frac12 & 0 \end{bmatrix} \begin{bmatrix} -3 & -2 & 0 \\ 6 & 5 & 0 \\ 4 & 4 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ -1 & 0 & -3 \\ 0 & 1 & -2 \end{bmatrix},$

so that finally

$B = RDR^{-1} = RQ^{-1}AQR^{-1} = (QR^{-1})^{-1}A(QR^{-1}).$

Thus

$P = QR^{-1} = \begin{bmatrix} \frac12 & -\frac12 & 0 \\ -\frac32 & -\frac32 & 1 \\ -\frac52 & -\frac32 & 0 \end{bmatrix}$

is an invertible matrix such that $B = P^{-1}AP$.
42. Suppose that B = P −1 AP . This gives P B = AP , so that (P B)T = (AP )T . Expanding yields
B T P T = P T AT , so that B T ∼ AT and thus AT ∼ B T .
43. Suppose that P −1 AP = D, where D is diagonal. Then
D = DT = (P −1 AP )T = P T AT (P T )−1 .
Now let Q = (P T )−1 ; the equation above becomes D = Q−1 AT Q, so that AT is diagonalizable.
44. If A is invertible, it has no zero eigenvalues. So if it is diagonalizable, the diagonal matrix D is also invertible since none of the diagonal entries are zero. Then $D^{-1}$ is also diagonal, consisting of the inverses of each of the diagonal entries of D. Suppose that $D = P^{-1}AP$. Then

$D^{-1} = (P^{-1}AP)^{-1} = P^{-1}A^{-1}(P^{-1})^{-1} = P^{-1}A^{-1}P,$

showing that $A^{-1}$ is also diagonalizable (and, incidentally, showing again that the eigenvalues of $A^{-1}$ are the inverses of the eigenvalues of A, by looking at the entries of $D^{-1}$).
45. Suppose that A is diagonalizable, and that λ is its only eigenvalue. Then if D = P −1 AP is diagonal,
all of its diagonal entries must equal λ, so that D = λI. Then λI = P −1 AP implies that P (λI)P −1 =
P P −1 AP P −1 = A. However, P (λI)P −1 = λP IP −1 = λI. Thus A = λI.
46. Since A and B each have n distinct eigenvalues, they are both diagonalizable by Theorem 4.25. Recall
that diagonal matrices commute: if D and E are both diagonal, then DE = ED. Now, if A and B have
the same eigenvectors, then since the diagonalizing matrix P has columns equal to the eigenvectors,
we see that for the same invertible matrix P , both DA = P −1 AP and DB = P −1 BP are diagonal.
But then
AB = (P DA P −1 )(P DB P −1 ) = P DA DB P −1 = P DB DA P −1 = (P DB P −1 )(P DA P −1 ) = BA.
Conversely, if AB = BA, let α1 , α2 , . . . , αn be the distinct eigenvalues of A, and pi the corresponding
eigenvectors, which are also distinct since they are linearly independent. Then
A(Bpi ) = (AB)pi = (BA)pi = B(Api ) = B(αi pi ) = αi Bpi .
Thus Bpi is also an eigenvector of A corresponding to αi , so it must be a multiple of pi . That is,
Bpi = βi pi for all i, so that the eigenvectors of B are also pi . So every eigenvector of A is an
eigenvector of B. Since B also has n distinct eigenvectors, the set of eigenvectors must be the same.
47. If A ∼ B, then A and B have the same characteristic polynomial, so the algebraic multiplicities of
their eigenvalues are the same.
48. From the solution to Exercise 33, we know that if $B = P^{-1}AP$ and x is an eigenvector of A with eigenvalue $\lambda$, then $P^{-1}x$ is an eigenvector of B with eigenvalue $\lambda$. Therefore, if $\lambda$ is an eigenvalue of A with geometric multiplicity m, its eigenspace has a basis consisting of m vectors $v_1, v_2, \dots, v_m$, so that the vectors $P^{-1}v_1, P^{-1}v_2, \dots, P^{-1}v_m$ are in the eigenspace of B corresponding to $\lambda$. These vectors are linearly independent since $P^{-1}$ is an invertible linear transformation, so the eigenspace of $\lambda$ for B has dimension at least m. Writing the similarity as $A = PBP^{-1}$ and applying the same argument in reverse gives the opposite inequality, so the dimension of the eigenspace of $\lambda$ for B is exactly m. So the geometric multiplicities of the eigenvalues of A and B are the same.
49. If $D = P^{-1}AP$ and D is diagonal, then every entry of D is either 0 or 1, since every eigenvalue of A is either 0 or 1 by assumption. Since $D^2$ is computed by squaring each diagonal entry, we see that $D^2 = D$. But then $A = PDP^{-1}$, and

$A^2 = (PDP^{-1})^2 = (PDP^{-1})(PDP^{-1}) = PD^2P^{-1} = PDP^{-1} = A,$

so that A is idempotent.
50. Suppose that A is nilpotent and diagonalizable, so that D = P −1 AP (so that A = P DP −1 ) and
Am = O for some m > 1. Then
Dm = (P −1 AP )m = P −1 Am P = P −1 OP = O.
Since D is diagonal, Dm is found by taking the mth power of each diagonal entry. If this is the zero
matrix, then all the diagonal entries must be zero, so that D = O. But then A = P OP −1 = O, so that
A is the zero matrix.
51. (a) Suppose we have three vectors $v_1$, $v_2$, and $v_3$ with

$Av_1 = v_1, \quad Av_2 = v_2, \quad Av_3 = v_3.$

Then all three vectors must lie in the eigenspace $E_1$ corresponding to $\lambda = 1$, since all three vectors are multiplied by 1 when A is applied. However, since the algebraic multiplicity of $\lambda = 1$ is 2, the geometric multiplicity is no greater than 2, by Lemma 4.26. But any three vectors in a two-dimensional space must be linearly dependent, so that $v_1$, $v_2$, and $v_3$ are linearly dependent.
(b) By Theorem 4.27, A is diagonalizable if and only if the algebraic multiplicity of each eigenvalue
equals its geometric multiplicity. So the dimensions of the eigenspaces must be dim E−1 = 1,
dim E1 = 2, dim E2 = 3, since the algebraic multiplicity of −1 is 1, that of 1 is 2, and that of 2 is
3.
52. (a) The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} a-\lambda & b \\ c & d-\lambda \end{vmatrix} = \lambda^2 - (a+d)\lambda + (ad - bc) = \lambda^2 - \operatorname{tr}(A)\lambda + \det(A).$

Therefore the eigenvalues are

$\lambda = \frac{a+d \pm \sqrt{(a+d)^2 - 4(ad-bc)}}{2} = \frac{a+d \pm \sqrt{(a-d)^2 + 4bc}}{2}.$

So if $(a-d)^2 + 4bc > 0$, then the two eigenvalues of A are real and distinct and therefore A is diagonalizable. If $(a-d)^2 + 4bc < 0$, then A has no real eigenvalues and so is not diagonalizable.

(b) If $(a-d)^2 + 4bc = 0$, then A has one real eigenvalue, and A may or may not be diagonalizable depending on the geometric multiplicity of that eigenvalue. For example, the zero matrix O has $(a-d)^2 + 4bc = 0$, and it is already a diagonal matrix, so is diagonalizable. However,

$A = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}$

satisfies $(a-d)^2 + 4bc = 0$; from the formula above, its only real eigenvalue is $\lambda = 0$, and it is easy to see by row-reducing A that $E_0 = \operatorname{span}\{(1, 1)\}$. So the geometric multiplicity of $\lambda = 0$ is only 1 and thus A is not diagonalizable, by Theorem 4.27.
4.5 Iterative Methods for Computing Eigenvalues
1. (a) For the given value of $x_5$, we estimate a dominant eigenvector of A to be

$\begin{bmatrix} 1 \\ 11109/4443 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 2.500 \end{bmatrix}.$

The corresponding dominant eigenvalue is computed by finding

$x_6 = Ax_5 = \begin{bmatrix} 1 & 2 \\ 5 & 4 \end{bmatrix} \begin{bmatrix} 4443 \\ 11109 \end{bmatrix} = \begin{bmatrix} 26661 \\ 66651 \end{bmatrix},$

and $\frac{26661}{4443} \approx 6.001$, $\frac{66651}{11109} \approx 6.000$.

(b) $\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 2 \\ 5 & 4-\lambda \end{vmatrix} = (1-\lambda)(4-\lambda) - 10 = \lambda^2 - 5\lambda - 6 = (\lambda+1)(\lambda-6)$, so the dominant eigenvalue is 6.
2. (a) For the given value of $x_5$, we estimate a dominant eigenvector of A to be

$\begin{bmatrix} 1 \\ -3904/7811 \end{bmatrix} \approx \begin{bmatrix} 1 \\ -0.500 \end{bmatrix}.$

The corresponding dominant eigenvalue is computed by finding

$x_6 = Ax_5 = \begin{bmatrix} 7 & 4 \\ -3 & -1 \end{bmatrix} \begin{bmatrix} 7811 \\ -3904 \end{bmatrix} = \begin{bmatrix} 39061 \\ -19529 \end{bmatrix},$

and $\frac{39061}{7811} \approx 5.001$, $\frac{-19529}{-3904} \approx 5.002$.

(b) $\det(A - \lambda I) = \begin{vmatrix} 7-\lambda & 4 \\ -3 & -1-\lambda \end{vmatrix} = (7-\lambda)(-1-\lambda) + 12 = \lambda^2 - 6\lambda + 5 = (\lambda-1)(\lambda-5)$, so the dominant eigenvalue is 5.
3. (a) For the given value of $x_5$, we estimate a dominant eigenvector of A to be

$\begin{bmatrix} 1 \\ 89/144 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 0.618 \end{bmatrix}.$

The corresponding dominant eigenvalue is computed by finding

$x_6 = Ax_5 = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 144 \\ 89 \end{bmatrix} = \begin{bmatrix} 377 \\ 233 \end{bmatrix},$

and $\frac{377}{144} \approx 2.618$, $\frac{233}{89} \approx 2.618$.

(b) $\det(A - \lambda I) = \begin{vmatrix} 2-\lambda & 1 \\ 1 & 1-\lambda \end{vmatrix} = (2-\lambda)(1-\lambda) - 1 = \lambda^2 - 3\lambda + 1$, which has roots $\lambda = \frac12(3 \pm \sqrt5)$. So the dominant eigenvalue is $\frac12(3 + \sqrt5) \approx 2.618$.
4. (a) For the given value of $x_5$, we estimate a dominant eigenvector of A to be

$\begin{bmatrix} 1 \\ 239.500/60.625 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 3.951 \end{bmatrix}.$

The corresponding dominant eigenvalue is computed by finding

$x_6 = Ax_5 = \begin{bmatrix} 1.5 & 0.5 \\ 2.0 & 3.0 \end{bmatrix} \begin{bmatrix} 60.625 \\ 239.500 \end{bmatrix} = \begin{bmatrix} 210.688 \\ 839.75 \end{bmatrix},$

and $\frac{210.688}{60.625} \approx 3.475$, $\frac{839.75}{239.500} \approx 3.506$.

(b) $\det(A - \lambda I) = \begin{vmatrix} 1.5-\lambda & 0.5 \\ 2.0 & 3.0-\lambda \end{vmatrix} = (1.5-\lambda)(3.0-\lambda) - 1.0 = \lambda^2 - 4.5\lambda + 3.5$, which has roots $\lambda = 1$ and $\lambda = 3.5$. So the dominant eigenvalue is 3.5.
5. (a) The entry of largest magnitude in $x_5$ is $m_5 = 11.001$, so that

$y_5 = \frac{1}{11.001}x_5 = \begin{bmatrix} -3.667/11.001 \\ 1 \end{bmatrix} = \begin{bmatrix} -0.333 \\ 1 \end{bmatrix}.$

So the dominant eigenvalue is about 11 and $y_5$ approximates the dominant eigenvector.

(b) We have

$Ay_5 = \begin{bmatrix} 2 & -3 \\ -3 & 10 \end{bmatrix} \begin{bmatrix} -0.333 \\ 1 \end{bmatrix} = \begin{bmatrix} -3.666 \\ 10.999 \end{bmatrix}, \qquad \lambda_1 y_5 = 11.001 \begin{bmatrix} -0.333 \\ 1 \end{bmatrix} = \begin{bmatrix} -3.663 \\ 11.001 \end{bmatrix},$

so we have indeed approximated an eigenvalue and an eigenvector of A.
6. (a) The entry of largest magnitude in $x_{10}$ is $m_{10} = 5.530$, so that

$y_{10} = \frac{1}{5.530}x_{10} = \begin{bmatrix} 1 \\ 1.470/5.530 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 0.266 \end{bmatrix}.$

So the dominant eigenvalue is about 5.53 and $y_{10}$ approximates the dominant eigenvector.

(b) We have

$Ay_{10} = \begin{bmatrix} 5 & 2 \\ 2 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 0.266 \end{bmatrix} = \begin{bmatrix} 5.532 \\ 1.468 \end{bmatrix}, \qquad \lambda_1 y_{10} = 5.530 \begin{bmatrix} 1 \\ 0.266 \end{bmatrix} = \begin{bmatrix} 5.530 \\ 1.471 \end{bmatrix},$

so we have indeed approximated an eigenvalue and an eigenvector of A.
7. (a) The entry of largest magnitude in $x_8$ is $m_8 = 10$, so that

$y_8 = \frac{1}{10}x_8 = \frac{1}{10}\begin{bmatrix} 10.0 \\ 0.001 \\ 10.0 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 0.0001 \\ 1 \end{bmatrix}.$

So the dominant eigenvalue is about 10 and $y_8$ approximates the dominant eigenvector.

(b) We have

$Ay_8 = \begin{bmatrix} 4 & 0 & 6 \\ -1 & 3 & 1 \\ 6 & 0 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 0.0001 \\ 1 \end{bmatrix} = \begin{bmatrix} 10 \\ 0.0003 \\ 10 \end{bmatrix}, \qquad \lambda_1 y_8 = 10 \begin{bmatrix} 1 \\ 0.0001 \\ 1 \end{bmatrix} = \begin{bmatrix} 10 \\ 0.001 \\ 10 \end{bmatrix},$

so we have indeed approximated an eigenvalue and an eigenvector of A, though the approximation in the middle component is not great.
8. (a) The entry of largest magnitude in $x_{10}$ is $m_{10} = 3.415$, so that

$y_{10} = \frac{1}{3.415}x_{10} = \frac{1}{3.415}\begin{bmatrix} 3.415 \\ 2.914 \\ -1.207 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 0.853 \\ -0.353 \end{bmatrix}.$

So the dominant eigenvalue is about 3.415 and $y_{10}$ approximates the dominant eigenvector.

(b) We have

$Ay_{10} = \begin{bmatrix} 1 & 2 & -2 \\ 1 & 1 & -3 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0.853 \\ -0.353 \end{bmatrix} = \begin{bmatrix} 3.412 \\ 2.912 \\ -1.206 \end{bmatrix}, \qquad \lambda_1 y_{10} = 3.415 \begin{bmatrix} 1 \\ 0.853 \\ -0.353 \end{bmatrix} = \begin{bmatrix} 3.415 \\ 2.913 \\ -1.205 \end{bmatrix},$

so we have indeed approximated an eigenvalue and an eigenvector of A.
9. With the given A and $x_0$, we get the values below:

k:    0       1               2                 3                 4                 5
xk:  (1, 1)  (26, 8)         (17.692, 5.923)   (18.017, 6.004)   (17.999, 6.000)   (18.000, 6.000)
yk:  (1, 1)  (1.000, 0.308)  (1.000, 0.335)    (1.000, 0.333)    (1.000, 0.333)    (1.000, 0.333)
mk:   1      26              17.692            18.017            17.999            18.000

We conclude that the dominant eigenvalue of A is 18, with eigenvector $(3, 1)$.
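The iteration in the table can be reproduced in a few lines (a sketch, not part of the manual; it assumes, consistent with Exercise 29 below, that the matrix here is A = [[14,12],[5,3]]):

```python
import numpy as np

def power_method(A, x0, steps):
    """Normalized power iteration: returns the final scaling factor m_k
    (the eigenvalue estimate) and the normalized vector y_k."""
    y = np.asarray(x0, dtype=float)
    m = 1.0
    for _ in range(steps):
        x = A @ y
        m = x[np.argmax(np.abs(x))]   # entry of largest magnitude, keeping its sign
        y = x / m
    return m, y

# Assumed matrix: eigenvalues 18 and -1, dominant eigenvector (3, 1).
A = np.array([[14.0, 12.0], [5.0, 3.0]])
m, y = power_method(A, [1, 1], 10)
assert abs(m - 18) < 1e-6
assert np.allclose(y, [1, 1/3], atol=1e-6)   # direction of (3, 1)
```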
10. With the given A and $x_0$, we get the values below:

k:    0       1              2                 3                 4                 5                 6
xk:  (1, 0)  (−6, 8)        (8.500, −8.000)   (−9.765, 9.882)   (9.929, −9.905)   (−9.990, 9.995)   (9.997, −9.996)
yk:  (1, 0)  (−0.75, 1.00)  (1.000, −0.941)   (−0.988, 1.000)   (1.000, −0.998)   (−1.000, 1.000)   (1.000, −1.000)
mk:   1      8              8.5               9.882             9.929             9.995             9.997

We conclude that the dominant eigenvalue of A is −10 (not 10, since at each step the sign of each component of $y_k$ is reversed), with eigenvector $(1, -1)$.
11. With the given A and $x_0$, we get the values below:

k:    0       1           2               3               4               5               6
xk:  (1, 0)  (7, 2)      (7.571, 2.857)  (7.755, 3.132)  (7.808, 3.212)  (7.823, 3.234)  (7.827, 3.240)
yk:  (1, 0)  (1, 0.286)  (1, 0.377)      (1, 0.404)      (1, 0.411)      (1, 0.413)      (1, 0.414)
mk:   1      7           7.571           7.755           7.808           7.823           7.827

We conclude that the dominant eigenvalue of A is about 7.83, with eigenvector about $(1, 0.414)$. The true values are $\lambda = 5 + 2\sqrt2$ with eigenvector $(1, \sqrt2 - 1)$.
12. With the given A and $x_0$, we get the values below:

k:    0       1           2               3               4               5               6
xk:  (1, 0)  (3.5, 1.5)  (4.143, 1.286)  (3.966, 1.345)  (4.009, 1.330)  (3.998, 1.334)  (4.001, 1.333)
yk:  (1, 0)  (1, 0.429)  (1, 0.310)      (1, 0.339)      (1, 0.332)      (1, 0.334)      (1, 0.333)
mk:   1      3.5         4.143           3.966           4.009           3.998           4.001

We conclude that the dominant eigenvalue of A is 4, with eigenvector $(3, 1)$.
13. With the given A and $x_0$, we get the values below:

k:    0          1             2                         3                         4                         5
xk:  (1, 1, 1)  (21, 15, 13)  (16.810, 12.238, 10.714)  (17.011, 12.371, 10.824)  (16.999, 12.363, 10.818)  (17.000, 12.364, 10.818)
yk:  (1, 1, 1)  (1, 0.714, 0.619)  (1, 0.728, 0.637)    (1, 0.727, 0.636)         (1, 0.727, 0.636)         (1, 0.727, 0.636)
mk:   1         21            16.810                    17.011                    16.999                    17.000

We conclude that the dominant eigenvalue of A is 17, with corresponding eigenvector about $(1, 0.727, 0.636)$. In fact the eigenspace corresponding to this eigenvalue is $\operatorname{span}\{(1, 2, 0), (0, -2, 1)\}$, and we have found one eigenvector in this space.
14. With the given A and $x_0$, we get the values below:

k:    0          1          2                3                      4                      5                      6
xk:  (1, 1, 1)  (4, 5, 4)  (3.4, 4.6, 3.4)  (3.217, 4.478, 3.217)  (3.155, 4.437, 3.155)  (3.133, 4.422, 3.133)  (3.126, 4.417, 3.126)
yk:  (1, 1, 1)  (0.8, 1.0, 0.8)  (0.739, 1, 0.739)  (0.718, 1, 0.718)  (0.711, 1, 0.711)  (0.709, 1, 0.709)      (0.708, 1, 0.708)
mk:   1         5          4.6              4.478                  4.437                  4.422                  4.417

We conclude that the dominant eigenvalue of A is $\approx 4.4$, with corresponding eigenvector $\approx (0.708, 1, 0.708)$. In fact the dominant eigenvalue is $3 + \sqrt2$ with eigenspace spanned by $(\frac{\sqrt2}{2}, 1, \frac{\sqrt2}{2})$.
 
15. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1          2                  3                  4                  5                  6                  7               8
xk:  (1, 1, 1)  (8, 2, 4)  (5.75, 0.50, 2.25) (5.26, 0.17, 1.87) (5.10, 0.07, 1.74) (5.04, 0.03, 1.70) (5.02, 0.01, 1.68) (5.01, 0, 1.67) (5, 0, 1.67)
yk:  (1, 1, 1)  (1, 0.25, 0.50) (1, 0.09, 0.39) (1, 0.03, 0.36)  (1, 0.01, 0.34)    (1, 0.01, 0.34)    (1, 0, 0.33)       (1, 0, 0.33)    (1, 0, 0.33)
mk:   1         8          5.75               5.26               5.10               5.04               5.02               5.01            5

We conclude that the dominant eigenvalue of A is 5, with corresponding eigenvector $(3, 0, 1)$.
 
16. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1           2                 3                    4                    5
xk:  (1, 1, 1)  (24, 2, 6)  (11.0, 1.5, −2.5) (14.18, 2.45, −7.91) (16.38, 3.12, −11.65) (17.41, 3.48, −13.41)
yk:  (1, 1, 1)  (1, 0.08, 0.25) (1, 0.14, −0.23) (1, 0.17, −0.56)  (1, 0.19, −0.71)     (1, 0.20, −0.77)
mk:   1         24          11                14.18                16.38                17.41

k:    6                     7                     8                     9                     10                   11
xk:  (17.80, 3.54, −14.05)  (17.93, 3.58, −14.28) (17.98, 3.59, −14.36) (17.99, 3.60, −14.39) (18.00, 3.60, −14.40) (18.00, 3.60, −14.40)
yk:  (1, 0.20, −0.79)       (1, 0.2, −0.8)        (1, 0.2, −0.8)        (1, 0.2, −0.8)        (1, 0.2, −0.8)        (1, 0.2, −0.8)
mk:  17.80                  17.93                 17.98                 17.99                 18                    18

We conclude that the dominant eigenvalue of A is 18, with corresponding eigenvector $(1, 0.2, -0.8)$, or $(5, 1, -4)$.
17. With the given A and $x_0$, we get the values below:

k:      0       1        2               3
xk:    (1, 0)  (7, 2)   (7.571, 2.857)  (7.755, 3.132)
R(xk):  7      7.755    7.823           7.828

The Rayleigh quotient converges to the dominant eigenvalue, to three decimal places, after three iterations.
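The Rayleigh quotient estimate can be sketched directly (not part of the manual; it assumes, consistent with Exercise 11, that the matrix here is A = [[7,2],[2,3]], whose dominant eigenvalue is 5 + 2√2):

```python
import numpy as np

def rayleigh(A, x):
    # R(x) = (x . Ax) / (x . x)
    return x @ A @ x / (x @ x)

# Assumed symmetric matrix; dominant eigenvalue 5 + 2*sqrt(2) ≈ 7.828.
A = np.array([[7.0, 2.0], [2.0, 3.0]])
x = np.array([1.0, 0.0])
for _ in range(5):
    x = A @ x
    x /= np.abs(x).max()          # normalize by the largest-magnitude entry

assert abs(rayleigh(A, x) - (5 + 2 * np.sqrt(2))) < 1e-3
```

For a symmetric matrix the Rayleigh quotient converges roughly twice as fast (in digits) as the scaling factors m_k, which is why the table above stabilizes after only three iterations.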
18. With the given A and $x_0$, we get the values below:

k:      0       1           2               3
xk:    (1, 0)  (3.5, 1.5)  (4.143, 1.286)  (3.966, 1.345)
R(xk):  3.5    3.966       3.998           4

The Rayleigh quotient converges to the dominant eigenvalue after three iterations.
19. With the given A and $x_0$, we get the values below:

k:      0          1             2
xk:    (1, 1, 1)  (21, 15, 13)  (16.810, 12.238, 10.714)
R(xk):  16.333    16.998        17

The Rayleigh quotient converges to the dominant eigenvalue after two iterations.
20. With the given A and $x_0$, we get the values below:

k:      0          1          2                3
xk:    (1, 1, 1)  (4, 5, 4)  (3.4, 4.6, 3.4)  (3.217, 4.478, 3.217)
R(xk):  4.333     4.404      4.413            4.414

The Rayleigh quotient converges to the dominant eigenvalue, to three decimal places, after three iterations.
21. With the given matrix, and using $x_0 = (1, 1)$, we get

k:    0       1         2           3             4             5             6             7             8
xk:  (1, 1)  (5, 4)    (4.8, 3.2)  (4.67, 2.67)  (4.57, 2.29)  (4.50, 2.00)  (4.44, 1.78)  (4.40, 1.60)  (4.36, 1.45)
yk:  (1, 1)  (1, 0.8)  (1, 0.67)   (1, 0.57)     (1, 0.5)      (1, 0.44)     (1, 0.40)     (1, 0.36)     (1, 0.33)
mk:   1      5         4.8         4.67          4.57          4.50          4.44          4.40          4.36

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} 4-\lambda & 1 \\ 0 & 4-\lambda \end{vmatrix} = (\lambda - 4)^2,$

so that A has the double eigenvalue 4. To find the corresponding eigenspace,

$[A - 4I \mid 0] = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix},$

so that the eigenspace has basis $\{(1, 0)\}$. The values $m_k$ appear to be converging to 4, but very slowly; the vectors $y_k$ may also be converging to the eigenvector, but very slowly if at all.
22. With the given matrix, and using $x_0 = (1, 1)$, we get

k:    0       1       2           3              4            5            6              7              8
xk:  (1, 1)  (4, 0)  (3, −1)     (2.67, −1.33)  (2.5, −1.5)  (2.4, −1.6)  (2.33, −1.67)  (2.29, −1.71)  (2.25, −1.75)
yk:  (1, 1)  (1, 0)  (1, −0.33)  (1, −0.5)      (1, −0.6)    (1, −0.67)   (1, −0.71)     (1, −0.75)     (1, −0.78)
mk:   1      4       3           2.67           2.5          2.4          2.33           2.29           2.25

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} 3-\lambda & 1 \\ -1 & 1-\lambda \end{vmatrix} = \lambda^2 - 4\lambda + 4 = (\lambda - 2)^2,$

so that A has the double eigenvalue 2. To find the corresponding eigenspace,

$[A - 2I \mid 0] = \begin{bmatrix} 1 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix},$

so that the eigenspace has basis $\{(1, -1)\}$. The values $m_k$ appear to be converging to 2, but very slowly; the vectors $y_k$ may also be converging to the eigenvector, but very slowly.
23. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1             2                  3                   4                   5–8
xk:  (1, 1, 1)  (5, 4, 1)     (4.2, 3.2, 0.2)    (4.05, 3.05, 0.05)  (4.01, 3.01, 0.01)  (4, 3, 0)
yk:  (1, 1, 1)  (1, 0.8, 0.2) (1, 0.76, 0.05)    (1, 0.75, 0.01)     (1, 0.75, 0)        (1, 0.75, 0)
mk:   1         5             4.2                4.05                4.01                4

The matrix is upper triangular, so that A has an eigenvalue 1 and a double eigenvalue 4. To find the eigenspace for $\lambda = 4$,

$[A - 4I \mid 0] = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -3 & 0 \end{bmatrix} \to \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$

so that the eigenspace has basis $\{(1, 0, 0), (0, 1, 0)\}$. The values $m_k$ converge to 4, while the $y_k$ converge apparently to $(1, 0.75, 0)$, which is in the eigenspace.
 
24. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1             2                3                4                5
xk:  (1, 1, 1)  (0, 6, 5)     (0, 5.83, 4.17)  (0, 5.71, 3.57)  (0, 5.62, 3.12)  (0, 5.56, 2.78)
yk:  (1, 1, 1)  (0, 1, 0.83)  (0, 1, 0.71)     (0, 1, 0.62)     (0, 1, 0.56)     (0, 1, 0.50)
mk:   1         6             5.83             5.71             5.62             5.56

k:    6                7                8
xk:  (0, 5.50, 2.50)  (0, 5.45, 2.27)  (0, 5.42, 2.08)
yk:  (0, 1, 0.45)     (0, 1, 0.42)     (0, 1, 0.38)
mk:  5.50             5.45             5.42

The matrix is upper triangular, so that A has an eigenvalue 0 and a double eigenvalue 5. To find the eigenspace for $\lambda = 5$,

$[A - 5I \mid 0] = \begin{bmatrix} -5 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$

so that the eigenspace has basis $\{(0, 1, 0)\}$. The values $m_k$ appear to converge slowly to 5, while the $y_k$ also appear to converge slowly to $(0, 1, 0)$.
25. With the given matrix, and using $x_0 = (1, 1)$, we get

k:    0       1       2         3       4
xk:  (1, 1)  (1, 0)  (−1, −1)  (1, 0)  (−1, −1)
yk:  (1, 1)  (1, 0)  (1, 1)    (1, 0)  (1, 1)
mk:   1      1       −1        1       −1

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} -1-\lambda & 2 \\ -1 & 1-\lambda \end{vmatrix} = \lambda^2 + 1,$

so that A has no real eigenvalues, so the power method cannot succeed.
26. With the given matrix, and using $x_0 = (1, 1)$, we get

k:    0       1       2       3
xk:  (1, 1)  (3, 3)  (3, 3)  (3, 3)
yk:  (1, 1)  (1, 1)  (1, 1)  (1, 1)
mk:   1      3       3       3

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} 2-\lambda & 1 \\ -2 & 5-\lambda \end{vmatrix} = \lambda^2 - 7\lambda + 12 = (\lambda-3)(\lambda-4),$

so that the power method converges to a non-dominant eigenvalue. This happened because we chose $(1, 1)$ as the initial value, which is in fact an eigenvector for $\lambda = 3$:

$\begin{bmatrix} 2 & 1 \\ -2 & 5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 1 \end{bmatrix}.$
27. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1                2                3                4                5
xk:  (1, 1, 1)  (3, 4, 3)        (2.5, 4.0, 2.5)  (2.25, 4, 2.25)  (2.12, 4, 2.12)  (2.06, 4, 2.06)
yk:  (1, 1, 1)  (0.75, 1, 0.75)  (0.62, 1, 0.62)  (0.56, 1, 0.56)  (0.53, 1, 0.53)  (0.52, 1, 0.52)
mk:   1         4                4                4                4                4

k:    6                7                8
xk:  (2.03, 4, 2.03)  (2.02, 4, 2.02)  (2.01, 4, 2.01)
yk:  (0.51, 1, 0.51)  (0.5, 1, 0.5)    (0.5, 1, 0.5)
mk:  4                4                4

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} -5-\lambda & 1 & 7 \\ 0 & 4-\lambda & 0 \\ 7 & 1 & -5-\lambda \end{vmatrix} = -(\lambda + 12)(\lambda - 4)(\lambda - 2),$

so that the power method converges to a non-dominant eigenvalue. This happened because we chose $(1, 1, 1)$ as the initial value. This vector is orthogonal to the eigenvector corresponding to the dominant eigenvalue $\lambda = -12$, which is $(1, 0, -1)$:

$\begin{bmatrix} -5 & 1 & 7 \\ 0 & 4 & 0 \\ 7 & 1 & -5 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} -12 \\ 0 \\ 12 \end{bmatrix} = -12\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}.$
28. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1            2              3              4                5
xk:  (1, 1, 1)  (0, 2, 1)    (−1, 1, −0.5)  (−2, 0, −2.5)  (0.8, 0.8, 1.8)  (0, 0.89, 1)
yk:  (1, 1, 1)  (0, 1, 0.5)  (−1, 1, −0.5)  (0.8, 0, 1)    (0.44, 0.44, 1)  (0, 0.89, 1)
mk:   1         2            1              −2.5           1.8              1

k:    6                     7              8
xk:  (−0.89, 0.89, 0.11)   (−2, 0, −1.88) (1, 1, 1.94)
yk:  (−1, 1, 0.12)         (1, 0, 0.94)   (0.52, 0.52, 1)
mk:  0.89                  −2             1.94

Neither $m_k$ nor $y_k$ shows clear signs of converging to anything. The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & -1 & 0 \\ 1 & 1-\lambda & 0 \\ 1 & -1 & 1-\lambda \end{vmatrix} = (1-\lambda)(\lambda^2 - 2\lambda + 2),$

so that A has eigenvalues 1 and $1 \pm i$. The absolute value of each of the complex eigenvalues is $\sqrt2 > 1$, so 1 is not the eigenvalue of largest magnitude, and the power method cannot succeed.
29. In Exercise 9, we found that $\lambda_1 = 18$. To find $\lambda_2$, we apply the power method to

$A - 18I = \begin{bmatrix} -4 & 12 \\ 5 & -15 \end{bmatrix}$ with $x_0 = (1, 1)$.

k:    0       1          2              3
xk:  (1, 1)  (8, −10)   (15.2, −19.0)  (15.2, −19.0)
yk:  (1, 1)  (−0.8, 1)  (−0.8, 1)      (−0.8, 1)
mk:   1      −10        −19            −19

Thus $\lambda_2 - \lambda_1 = -19$, so that $\lambda_2 = -1$ is the second eigenvalue of A.
30. In Exercise 10, we found that $\lambda_1 = -10$, so we apply the power method to

$A + 10I = \begin{bmatrix} 4 & 4 \\ 8 & 8 \end{bmatrix}$ with $x_0 = (1, 1)$:

k:    0       1         2         3
xk:  (1, 1)  (8, 16)   (6, 12)   (6, 12)
yk:  (1, 1)  (0.5, 1)  (0.5, 1)  (0.5, 1)
mk:   1      16        12        12

Thus $\lambda_2 - \lambda_1 = 12$, so that $\lambda_2 = 2$ is the second eigenvalue of A.
31. In Exercise 13, we found that $\lambda_1 = 17$, so we apply the power method to

$A - 17I = \begin{bmatrix} -8 & 4 & 8 \\ 4 & -2 & -4 \\ 8 & -4 & -8 \end{bmatrix}$ with $x_0 = (1, 1, 1)$.

k:    0          1             2             3
xk:  (1, 1, 1)  (4, −2, −4)   (18, −9, −18) (18, −9, −18)
yk:  (1, 1, 1)  (−1, 0.5, 1)  (−1, 0.5, 1)  (−1, 0.5, 1)
mk:   1         −4            −18           −18

Thus $\lambda_2 - \lambda_1 = -18$, so that $\lambda_2 = -1$ is the second eigenvalue of A.
32. In Exercise 14, we found that $\lambda_1 \approx 4.4$, so we apply the power method to

$A - 4.4I = \begin{bmatrix} -1.4 & 1 & 0 \\ 1 & -1.4 & 1 \\ 0 & 1 & -1.4 \end{bmatrix}$ with $x_0 = (1, 1, 1)$.

k:    0          1                   2                       3                       4                       5
xk:  (1, 1, 1)  (−0.4, 0.6, −0.4)   (1.933, −2.733, 1.933)  (1.990, −2.814, 1.990)  (1.990, −2.815, 1.990)  (1.990, −2.814, 1.990)
yk:  (1, 1, 1)  (−0.667, 1, −0.667) (−0.707, 1, −0.707)     (−0.707, 1, −0.707)     (−0.707, 1, −0.707)     (−0.707, 1, −0.707)
mk:   1         0.6                 −2.733                  −2.814                  −2.815                  −2.814

Thus $\lambda_2 - \lambda_1 \approx -2.81$, so that $\lambda_2 \approx 1.59$ is the second eigenvalue of A (in fact, $\lambda_2 = 3 - \sqrt2$).
33. With A and x0 as in Exercise 9, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get

k    xk                  mk        yk
0    (1, 1)              1         (1, 1)
1    (0.5, −0.5)         −0.5      (−1, 1)
2    (0.833, −1.056)     −1.056    (−0.789, 1)
3    (0.798, −0.997)     −0.997    (−0.801, 1)
4    (0.8, −1)           −1        (−0.8, 1)
5    (0.8, −1)           −1        (−0.8, 1)

We conclude that the eigenvalue of A of smallest magnitude is 1/(−1) = −1.
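The inverse power method above never inverts A explicitly; each step solves Axk = yk−1. The sketch below is a hedged illustration of that idea, assuming the Exercise 9 matrix is A = [14 12; 5 3] (consistent with A − 18I in Exercise 29); `solve2` and `inverse_power` are hypothetical helper names.

```python
# A hedged sketch of the inverse power method of Exercise 33.
def solve2(A, b):
    """Solve a 2x2 system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def inverse_power(A, y, steps=60):
    m = 1.0
    for _ in range(steps):
        x = solve2(A, y)             # solve A x_k = y_{k-1}
        m = max(x, key=abs)          # mk: entry of largest magnitude
        y = [xi / m for xi in x]
    return m

A = [[14.0, 12.0], [5.0, 3.0]]       # assumed Exercise 9 matrix
m = inverse_power(A, [1.0, 1.0])
smallest = 1 / m                     # m -> -1, so the smallest eigenvalue is -1
```

Since A has eigenvalues 18 and −1, the dominant eigenvalue of A⁻¹ is −1, and mk converges to it quickly.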
34. With A and x0 as in Exercise 10, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. The iterates converge quickly: by k = 6 we have xk = (0.25, 0.50), yk = (0.5, 1), and mk = 0.5. We conclude that the eigenvalue of A of smallest magnitude is 1/0.5 = 2.
4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES
319
35. With A as in Exercise 7, and x0 = (1, 1, −1) as given, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get

k    xk                        mk      yk
0    (1, 1, −1)                1       (1, 1, −1)
1    (−0.5, 0, 0.5)            −0.5    (1, 0, −1)
2    (0.500, 0.333, −0.500)    −0.5    (−1, −0.667, 1)
3    (0.500, 0.111, −0.500)    −0.5    (−1, −0.222, 1)
4    (0.500, 0.259, −0.500)    −0.5    (−1, −0.519, 1)
5    (0.50, 0.16, −0.50)       −0.5    (−1, −0.321, 1)

The mk stabilize at −0.5 immediately, even though the middle components converge more slowly. We conclude that the eigenvalue of A of smallest magnitude is 1/(−0.5) = −2.
36. With A and x0 as in Exercise 14, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. The iterates converge with mk → ≈ 0.645 and yk → ≈ (−0.725, 1, −0.725). We conclude that the eigenvalue of A of smallest magnitude is ≈ 1/0.645 ≈ 1.55. In fact the eigenvalue is 3 − √2.
37. With α = 0, we are just using the inverse power method, so the solution to Exercise 33 provides the
eigenvalue closest to zero.
38. Since α = 0, there is no shift required, so this is the inverse power method. Take x0 = y0 = (1, 1, 1), and proceed to calculate xk by solving Axk = yk−1 at each step. The mk are

k:  0, 1, 2, 3, 4, 5, 6;   mk:  1, 0.375, −0.75, −1.083, −0.981, −1.005, −0.999,

so mk → −1. We conclude that the eigenvalue of A closest to 0 (i.e., the eigenvalue smallest in magnitude) is 1/(−1) = −1.
39. We first shift A:

A − 5I = [−1 0 6; −1 −2 1; 6 0 −1].

Now take x0 = y0 = (1, 1, 1), and proceed to calculate xk by solving (A − 5I)xk = yk−1 at each step to apply the inverse power method. We get

k    xk                       mk      yk
0    (1, 1, 1)                1       (1, 1, 1)
1    (0.2, −0.5, 0.2)         −0.5    (−0.4, 1, −0.4)
2    (−0.08, −0.5, −0.08)     −0.5    (0.16, 1, 0.16)
3    (0.032, −0.5, 0.032)     −0.5    (−0.064, 1, −0.064)

We conclude that the eigenvalue of A closest to 5 is 5 + 1/(−0.5) = 3.
40. We first shift A:

A + 2I = [11 4 8; 4 17 −4; 8 −4 11].

Now take x0 = y0 = (1, 1, 1), and proceed to calculate xk by solving (A + 2I)xk = yk−1 at each step to apply the inverse power method. We get

k    xk                          mk       yk
0    (1, 1, 1)                   1        (1, 1, 1)
1    (−0.158, 0.158, 0.263)      0.263    (−0.6, 0.6, 1)
2    (−0.832, 0.432, 0.853)      0.853    (−0.975, 0.506, 1)
3    (−0.999, 0.496, 1.000)      1.000    (−0.999, 0.496, 1)
4    (−0.990, 0.500, 0.991)      0.991    (−1, 0.5, 1)
5    (−1, 0.5, 1)                1        (−1, 0.5, 1)

We conclude that the eigenvalue of A closest to −2 is −2 + 1/1 = −1.
41. The companion matrix of p(x) = x² + 2x − 2 is

C(p) = [−2 2; 1 0],

so we apply the inverse power method with x0 = y0 = (1, 1), calculating xk by solving C(p)xk = yk−1 at each step. The mk are

k:  1, 2, 3, 4, 5, 6;   mk:  1.5, 1.333, 1.375, 1.364, 1.367, 1.366,

so mk → 1.366. So the root of p(x) closest to 0 is ≈ 1/1.366 ≈ 0.732. The exact value is −1 + √3.
42. The companion matrix of p(x) = x² − x − 3 is

C(p) = [1 3; 1 0],
so we apply the inverse power method to

C(p) − 2I = [−1 3; 1 −2]  with  x0 = y0 = (1, 1),

calculating xk by solving (C(p) − 2I)xk = yk−1 at each step. The mk are

k:  1, 2, 3, 4, 5, 6;   mk:  5, 3.2, 3.312, 3.302, 3.303, 3.303,

so mk → 3.303. So the root of p(x) closest to 2 is ≈ 2 + 1/3.303 ≈ 2.303. The exact value is (1 + √13)/2.
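The companion-matrix root-finding in Exercises 41 and 42 can be sketched as follows. This is a hedged illustration under the companion-matrix convention C(p) = [−a1 −a0; 1 0] for p(x) = x² + a1x + a0; `root_near` is a hypothetical helper name.

```python
# A hedged sketch of Exercises 41-42: find the root of a quadratic closest
# to alpha by inverse power iteration on C(p) - alpha*I.
def root_near(a1, a0, alpha, steps=100):
    A = [[-a1 - alpha, -a0], [1.0, -alpha]]      # C(p) - alpha*I
    y = [1.0, 1.0]
    m = 1.0
    for _ in range(steps):
        det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
        x = [(y[0] * A[1][1] - A[0][1] * y[1]) / det,   # solve A x = y
             (A[0][0] * y[1] - y[0] * A[1][0]) / det]
        m = max(x, key=abs)
        y = [xi / m for xi in x]
    return alpha + 1 / m             # m -> 1/(root - alpha)

r41 = root_near(2.0, -2.0, 0.0)    # x^2 + 2x - 2, root near 0: -1 + sqrt(3)
r42 = root_near(-1.0, -3.0, 2.0)   # x^2 - x - 3, root near 2: (1 + sqrt(13))/2
```

The mk here converge to 1.366 and 3.303 respectively, matching the tables in both exercises.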
43. The companion matrix of p(x) = x³ − 2x² + 1 is

C(p) = [2 0 −1; 1 0 0; 0 1 0],

so we apply the inverse power method to C(p). If we first start with y0 = (1, 1, 1), we find that this is an eigenvector of C(p) with eigenvalue 1, but this may not be the root closest to 0, so we start instead with x0 = y0 = (1, 0, 1), calculating xk by solving C(p)xk = yk−1 at each step. We get
k    xk                       mk
0    (1, 0, 1)                1
1    (0, 1, −1)               −1
2    (−1, 1, −2)              −2
3    (−0.5, 1, −1.5)          −1.5
4    (−0.667, 1, −1.667)      −1.667
5    (−0.6, 1, −1.6)          −1.6
6    (−0.625, 1, −1.625)      −1.625
7    (−0.615, 1, −1.615)      −1.615
8    (−0.619, 1, −1.619)      −1.619
9    (−0.618, 1, −1.618)      −1.618

(with yk = xk/mk at each step). We conclude that the root of p(x) closest to zero is 1/(−1.618) ≈ −0.618. The actual root is (1 − √5)/2.
44. The companion matrix of p(x) = x³ − 5x² + x + 1 is

C(p) = [5 −1 −1; 1 0 0; 0 1 0],

so we apply the inverse power method to C(p) − 5I, starting with x0 = y0 = (1, 1, 1) and calculating xk by solving (C(p) − 5I)xk = yk−1 at each step. The mk are

k:  1, 2, 3, 4, 5, 6;   mk:  −2.333, −3.762, −3.909, −3.918, −3.919, −3.919,

with yk → (1, 0.211, 0.044). We conclude that the root of p(x) closest to 5 is 5 + 1/(−3.919) ≈ 4.745.
45. First, A − αI is invertible, since α is an eigenvalue of A precisely when A − αI is not invertible. Let x be an eigenvector for λ. Then Ax = λx, so that Ax − αx = λx − αx, and therefore (A − αI)x = (λ − α)x. Now left multiply both sides of this equation by (A − αI)⁻¹, to get

x = (λ − α)(A − αI)⁻¹x,  so that  1/(λ − α) x = (A − αI)⁻¹x.

But the last equation says exactly that 1/(λ − α) is an eigenvalue of (A − αI)⁻¹ with eigenvector x.
46. If λ is dominant, then |λ| > |γ| for all other eigenvalues γ. But this means that the algebraic multiplicity
of λ is 1, since it appears only once in the list of eigenvalues listed with multiplicity, so its geometric
multiplicity is 1 and thus its eigenspace is one-dimensional.
47. Using rows, the Gerschgorin disks are:

Center 1, radius 1 + 0 = 1,
Center 4, radius 1/2 + 1/2 = 1,
Center 5, radius 1 + 0 = 1.

Using columns, they are

Center 1, radius 1/2 + 1 = 3/2,
Center 4, radius 1 + 0 = 1,
Center 5, radius 0 + 1/2 = 1/2.
From this diagram, we see that A has a real eigenvalue between 0 and 2.
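The disk computation used here and in Exercises 48-50 is mechanical and can be sketched in code. The matrix below is an assumption reconstructed to reproduce the radii listed in Exercise 47 (the actual matrix appears in the textbook); `gerschgorin` is a hypothetical helper name.

```python
# A hedged sketch of the Gerschgorin disk computation.
def gerschgorin(A):
    """Return (center, radius) per row: center a_ii, radius sum of |a_ij|, j != i."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

A = [[1.0, 1.0, 0.0], [0.5, 4.0, 0.5], [1.0, 0.0, 5.0]]   # assumed matrix
rows = gerschgorin(A)                             # row disks
cols = gerschgorin([list(c) for c in zip(*A)])    # column disks: transpose first
```

Row disks come out as centers 1, 4, 5 each with radius 1, and column disks as centers 1, 4, 5 with radii 3/2, 1, 1/2, matching the lists above.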
48. Using rows, the Gerschgorin disks are:

Center 2, radius |−i| + 0 = 1,
Center 2i, radius |1| + |1 + i| = 1 + √2,
Center −2i, radius 0 + 1 = 1.

Using columns, they are

Center 2, radius 1 + 0 = 1,
Center 2i, radius |−i| + 1 = 2,
Center −2i, radius 0 + |1 + i| = √2.
From this diagram, we conclude that A has an eigenvalue within one unit of −2i.
49. Using rows, the Gerschgorin disks are:

Center 4 − 3i, radius |i| + 2 + 2 = 5,
Center −1 + i, radius |i| + 0 + 0 = 1,
Center 5 + 6i, radius |1 + i| + |−i| + |2i| = 3 + √2,
Center −5 − 5i, radius 1 + |−2i| + |2i| = 5.

Using columns, they are

Center 4 − 3i, radius |i| + |1 + i| + 1 = 2 + √2,
Center −1 + i, radius |i| + |−i| + |−2i| = 4,
Center 5 + 6i, radius 2 + 0 + |2i| = 4,
Center −5 − 5i, radius |−2| + 0 + |2i| = 4.
From this diagram, we conclude that A has an eigenvalue within 1 unit of −1 + i.
50. Using rows, the Gerschgorin disks are:

Center 2, radius 1/2,
Center 4, radius 1/4 + 1/4 = 1/2,
Center 6, radius 1/6 + 1/6 = 1/3,
Center 8, radius 1/8.

Using columns, they are

Center 2, radius 1/4,
Center 4, radius 1/2 + 1/6 = 2/3,
Center 6, radius 1/4 + 1/8 = 3/8,
Center 8, radius 1/6.

Since the circles are all disjoint, we conclude that A has four real eigenvalues, near 2, 4, 6, and 8.
51. Let A be a strictly diagonally dominant square matrix. Then in particular, none of the diagonal entries of A can be zero, since each row has the property that |aii| > Σ_{j≠i} |aij|. Therefore each Gerschgorin disk is centered at a nonzero point, and its radius is less than the absolute value of its center, so the disk does not contain the origin. Since every eigenvalue is contained in some Gerschgorin disk, it follows that 0 is not an eigenvalue, so that the matrix is invertible by Theorem 4.17.
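The dominance condition in this argument is easy to check mechanically. Below is a hedged sketch with a hypothetical example matrix; `strictly_diagonally_dominant` is a helper name not in the text.

```python
# A hedged sketch of the strict diagonal dominance test behind Exercise 51.
def strictly_diagonally_dominant(A):
    """Check |a_ii| > sum of |a_ij| over j != i for every row i."""
    n = len(A)
    return all(abs(A[i][i]) > sum(abs(A[i][j]) for j in range(n) if j != i)
               for i in range(n))

# Hypothetical example: each Gerschgorin disk has |center| > radius, so no
# disk contains 0 and the matrix is invertible, as argued above.
A = [[4.0, 1.0, 1.0], [0.5, 3.0, 1.0], [1.0, 1.0, 5.0]]
```

By the argument of Exercise 51, whenever this function returns True, the matrix is invertible.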
52. Let λ be any eigenvalue of A. Then by Theorem 4.29, λ is contained in the Gerschgorin disk for some row i, so that

|λ − aii| ≤ Σ_{j≠i} |aij|.

By the Triangle Inequality, |λ| ≤ |λ − aii| + |aii|, so that

|λ| ≤ |λ − aii| + |aii| ≤ |aii| + Σ_{j≠i} |aij| = Σ_{j=1}^{n} |aij| ≤ ‖A‖,

since ‖A‖ is the maximum of the possible row sums.
53. A stochastic matrix is a square matrix whose columns are probability vectors, which means that the sum of the entries in any column of A is 1, so the sum of the entries of any row in Aᵀ is 1, so that ‖Aᵀ‖ = 1. By Exercise 52, any eigenvalue of Aᵀ must satisfy |λ| ≤ ‖Aᵀ‖ = 1. But by Exercise 19 in Section 4.3, the eigenvalues of A and of Aᵀ are the same, so that if λ is any eigenvalue of A, then |λ| ≤ 1.
54. Using rows, the Gerschgorin disks are:

Center 0, radius 1,
Center 5, radius 2,
Center 3, radius 1/2 + 1/2 = 1,
Center 7, radius 3/4.

Using columns, they are

Center 0, radius 2 + 1/2 = 5/2,
Center 5, radius 1,
Center 3, radius 3/4,
Center 7, radius 1/2.

From the row diagram, we see that there is exactly one eigenvalue in the circle of radius 1 around 0, so it must be real and therefore there is exactly one eigenvalue in the interval [−1, 1]. From the column diagram, we similarly conclude that there is exactly one eigenvalue in the interval [4, 6] and exactly one in [13/2, 15/2].

Since the matrix has real entries, the fourth eigenvalue must be real as well. From the row diagram, since the circle around zero contains exactly one eigenvalue, the fourth eigenvalue must be in the interval [2, 7 3/4]. But it cannot be larger than 3 3/4 since then, from the column diagram, it would either not be in a disk or would be in an isolated disk already containing an eigenvalue. So it must lie in the range [2, 3 3/4].

In summary, the four eigenvalues are each in a different one of the four intervals

[−1, 1],  [2, 3.75],  [4, 6],  [6.5, 7.5].

The actual values are ≈ −0.372, ≈ 2.908, ≈ 5.372, and ≈ 7.092.
4.6
Applications and the Perron-Frobenius Theorem
1. The square of the given matrix has zero entries in the same positions as the matrix itself, and it follows that any power of the given matrix contains zeroes. So this matrix is not regular.
2. Since the given matrix is upper triangular, any of its powers is upper triangular as well, so will contain
a zero. Thus this matrix is not regular.
3. Since

[1/3 2/3; 1 0]² = [7/9 2/9; 1/3 2/3],

the square of the given matrix has all positive entries, so this matrix is regular.
4. With P = [1/2 0 1; 1/2 0 0; 0 1 0], we have

P² = [1/4 1 1/2; 1/4 0 1/2; 1/2 0 0],
P³ = [5/8 1/2 1/4; 1/8 1/2 1/4; 1/4 0 1/2],
P⁴ = [9/16 1/4 5/8; 5/16 1/4 1/8; 1/8 1/2 1/4],

so that the fourth power of this matrix contains all positive entries, so this matrix is regular.
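The power-until-positive test used in Exercises 1-6 can be sketched as follows; this is a hedged illustration, and the matrix P is the one reconstructed above for Exercise 4 (an assumption about the textbook's matrix).

```python
# A hedged sketch of the regularity test: power the matrix and look for a
# power with all positive entries.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_regular(P, max_power=20):
    """A stochastic matrix is regular if some power has all positive entries."""
    Q = P
    for _ in range(max_power):
        if all(entry > 0 for row in Q for entry in row):
            return True
        Q = matmul(Q, P)
    return False

P = [[0.5, 0.0, 1.0], [0.5, 0.0, 0.0], [0.0, 1.0, 0.0]]   # P^4 > O, so regular
```

For the permutation-like matrices of Exercises 1 and 2, the loop never finds a positive power, so the function returns False.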
5. If A and B are any 3 × 3 matrices with a12 = a32 = b12 = b32 = 0, then
(AB)12 = a11 b12 + a12 b22 + a13 b32 = 0 + 0 + 0 = 0,
(AB)32 = a31 b12 + a32 b22 + a33 b32 = 0 + 0 + 0 = 0,
so that AB has the same property. Thus any power of the given matrix A has this property, so no
power of A consists of positive entries, so A is not regular.
6. Since the third row of the matrix is all zero, the third row of any power will also be all zero, so this
matrix is not regular.
7. The transition matrix P has characteristic equation

det(P − λI) = |1/3 − λ  1/6; 2/3  5/6 − λ| = 1/6 (λ − 1)(6λ − 1),

so its eigenvalues are λ1 = 1 and λ2 = 1/6. Thus P has two distinct eigenvalues, so is diagonalizable. Now, this matrix is regular, so the remark at the end of the proof of Theorem 4.33 shows that the vector x which forms the columns of L is a probability vector parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I:

[P − I | 0] = [−2/3 1/6 | 0; 2/3 −1/6 | 0] → [1 −1/4 | 0; 0 0 | 0],

so that the corresponding eigenspace is E1 = span((1, 4)). Thus the columns of L are of the form (t, 4t) where t + 4t = 5t = 1, so that t = 1/5. Therefore

L = [1/5 1/5; 4/5 4/5].
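The steady-state vector found by row reduction above can also be approximated numerically by iterating P on any probability vector; this is a hedged sketch of that alternative, not the book's method.

```python
# A hedged sketch: approximate the steady-state vector of the Exercise 7
# transition matrix by repeated multiplication instead of row-reducing P - I.
P = [[1/3, 1/6], [2/3, 5/6]]
v = [1.0, 0.0]                       # any probability vector works
for _ in range(100):
    v = [sum(P[i][j] * v[j] for j in range(2)) for i in range(2)]
# v approaches (1/5, 4/5), the column of L found above.
```

Convergence is fast because the second eigenvalue, 1/6, has small magnitude.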
8. The transition matrix P has characteristic equation

det(P − λI) = |1/2 − λ  1/3  1/6; 1/2  1/2 − λ  1/3; 0  1/6  1/2 − λ| = −1/36 (λ − 1)(36λ² − 18λ + 1).

So one eigenvalue is 1; the roots of the quadratic are distinct, so that P has three distinct eigenvalues, so is diagonalizable. Now, this matrix is regular (its square has all positive entries), so the remark at the end of the proof of Theorem 4.33 shows that the vector x which forms the columns of L is a probability vector parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I:

[P − I | 0] = [−1/2 1/3 1/6 | 0; 1/2 −1/2 1/3 | 0; 0 1/6 −1/2 | 0] → [1 0 −7/3 | 0; 0 1 −3 | 0; 0 0 0 | 0],

so that the corresponding eigenspace is E1 = span((7, 9, 3)). Thus the columns of L are of the form (7t, 9t, 3t) where 7t + 9t + 3t = 19t = 1, so that t = 1/19. Therefore

L = 1/19 [7 7 7; 9 9 9; 3 3 3].
9. The transition matrix P has characteristic equation

det(P − λI) = |0.2 − λ  0.3  0.4; 0.6  0.1 − λ  0.4; 0.2  0.6  0.2 − λ| = −(λ − 1)(λ² + 0.5λ + 0.08).

So one eigenvalue is 1; the roots of the quadratic are distinct (and complex), so that P has three distinct eigenvalues, so is diagonalizable. Now, this matrix is regular, so the remark at the end of the proof of Theorem 4.33 shows that the vector x which forms the columns of L is a probability vector parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I:

[P − I | 0] = [−0.8 0.3 0.4 | 0; 0.6 −0.9 0.4 | 0; 0.2 0.6 −0.8 | 0] → [1 0 −0.889 | 0; 0 1 −1.037 | 0; 0 0 0 | 0],

so that the corresponding eigenspace is E1 = span((0.889, 1.037, 1)). Thus the columns of L are of the form (0.889t, 1.037t, t) where 0.889t + 1.037t + t = 2.926t = 1, so that t = 1/2.926. Therefore

L = [0.304 0.304 0.304; 0.354 0.354 0.354; 0.341 0.341 0.341].

(If the matrix is converted to fractions and the computation done that way, the resulting eigenvector will be (24, 28, 27), which must then be scaled to a probability vector.)
10. Suppose that the sequence of iterates xk approaches both u and v, and choose ε > 0. Choose k such that |xk − u| < ε/2 and |xk − v| < ε/2. Then

|u − v| = |(u − xk) + (xk − v)| ≤ |u − xk| + |xk − v| < ε,

so that |u − v| is less than any positive ε. So it must be zero; that is, we must have u = v.
11. The characteristic polynomial is

det(L − λI) = |−λ 2; 0.5 −λ| = λ² − 1 = (λ + 1)(λ − 1),

so the positive eigenvalue is 1. To find a corresponding eigenvector, row-reduce

[L − I | 0] = [−1 2 | 0; 0.5 −1 | 0] → [1 −2 | 0; 0 0 | 0].

Thus E1 = span((2, 1)).
12. The characteristic polynomial is

det(L − λI) = |1 − λ 1.5; 0.5 −λ| = λ² − λ − 0.75 = (λ − 1.5)(λ + 0.5),

so the positive eigenvalue is 1.5. To find a corresponding eigenvector, row-reduce

[L − 1.5I | 0] = [−0.5 1.5 | 0; 0.5 −1.5 | 0] → [1 −3 | 0; 0 0 | 0].

Thus E1.5 = span((3, 1)).
13. The characteristic polynomial is

det(L − λI) = |−λ 7 4; 0.5 −λ 0; 0 0.5 −λ| = −λ³ + 3.5λ + 1 = −1/2 (λ − 2)(2λ² + 4λ + 1),

so 2 is a positive eigenvalue. The other two roots are (−4 ± √8)/4, both of which are negative. To find an eigenvector for 2, row-reduce

[L − 2I | 0] = [−2 7 4 | 0; 0.5 −2 0 | 0; 0 0.5 −2 | 0] → [1 0 −16 | 0; 0 1 −4 | 0; 0 0 0 | 0].

Thus E2 = span((16, 4, 1)).
14. The characteristic polynomial is

det(L − λI) = |1 − λ 5 3; 1/3 −λ 0; 0 2/3 −λ| = −λ³ + λ² + 5/3 λ + 2/3 = −1/3 (λ − 2)(3λ² + 3λ + 1),

so 2 is a positive eigenvalue. The other two eigenvalues are complex. To find an eigenvector for 2, row-reduce

[L − 2I | 0] = [−1 5 3 | 0; 1/3 −2 0 | 0; 0 2/3 −2 | 0] → [1 0 −18 | 0; 0 1 −3 | 0; 0 0 0 | 0].

Thus E2 = span((18, 3, 1)).
15. Since the eigenvector corresponding to the positive eigenvalue λ1 of the Leslie matrix represents a stable ratio of population classes, once that proportion is reached, if λ1 > 1 the population will grow without bound; if λ1 = 1, it will be stable; and if λ1 < 1 it will decrease and eventually die out.
16. From the Leslie matrix in Equation (3), det(L − λI) is the determinant of the n × n matrix whose first row is (b1 − λ, b2, b3, …, bn), whose subdiagonal entries are s1, s2, …, sn−1, whose remaining diagonal entries are −λ, and whose other entries are zero. For n = 1, this is just b1 − λ, which is equal to

cL(λ) = (−1)¹(λ¹ − b1λ⁰) = b1 − λ.

This forms the basis for the induction. Now assume the statement holds for n − 1, and we prove it for n. Expanding the determinant along the last column gives two terms: (−1)^(n+1) bn times the determinant of the upper triangular minor whose diagonal entries are s1, s2, …, sn−1, plus (−λ) times the corresponding (n − 1) × (n − 1) Leslie determinant built from b1, …, bn−1 and s1, …, sn−2. The first determinant is the product of its diagonal entries; for the second, we use the inductive hypothesis. So

det(L − λI) = (−1)^(n+1) bn(s1s2⋯sn−1) − λ(−1)^(n−1)(λ^(n−1) − b1λ^(n−2) − b2s1λ^(n−3) − ⋯ − bn−2s1s2⋯sn−3λ − bn−1s1s2⋯sn−2)
            = (−1)^(n+1) bn(s1s2⋯sn−1) + (−1)^n (λ^n − b1λ^(n−1) − b2s1λ^(n−2) − ⋯ − bn−2s1s2⋯sn−3λ² − bn−1s1s2⋯sn−2λ)
            = (−1)^n (λ^n − b1λ^(n−1) − b2s1λ^(n−2) − ⋯ − bn−1s1s2⋯sn−2λ − bns1s2⋯sn−1),

proving the result.
17. P = diag(1, s1, s1s2, …, s1s2⋯sn−1) is a diagonal matrix, so inverting it consists of inverting each of the diagonal entries. Now, multiplying any matrix on the left by a diagonal matrix multiplies each row by the corresponding diagonal entry, and multiplying on the right by a diagonal matrix multiplies each column by the corresponding diagonal entry. So P⁻¹L has first row (b1, b2, …, bn) and subdiagonal entries 1, 1/s1, 1/(s1s2), …, 1/(s1s2⋯sn−2), and then multiplying on the right by P rescales each column so that

P⁻¹LP = [b1  b2s1  b3s1s2  ⋯  bn−1s1s2⋯sn−2  bns1s2⋯sn−1;
         1   0     0       ⋯  0              0;
         0   1     0       ⋯  0              0;
         ⋮                  ⋱                 ⋮;
         0   0     0       ⋯  1              0].

But this is the companion matrix of the polynomial

p(x) = x^n − b1x^(n−1) − b2s1x^(n−2) − ⋯ − bn−1s1s2⋯sn−2x − bns1s2⋯sn−1,

and by Exercise 32 in Section 4.3, the characteristic polynomial of P⁻¹LP is (−1)^n p(λ). Finally, the characteristic polynomials of L and P⁻¹LP are the same since similar matrices have the same eigenvalues, so the characteristic polynomial of L is

det(L − λI) = det(P⁻¹LP − λI) = (−1)^n p(λ) = (−1)^n (λ^n − b1λ^(n−1) − b2s1λ^(n−2) − ⋯ − bn−1s1s2⋯sn−2λ − bns1s2⋯sn−1).
18. By Exercise 46 in Section 4.4, the eigenvectors of P⁻¹LP are of the form P⁻¹x where x is an eigenvector of L, and by Exercise 32 in Section 4.3, an eigenvector of P⁻¹LP for λ1 is

(λ1^(n−1), λ1^(n−2), λ1^(n−3), …, λ1, 1)ᵀ.

So an eigenvector of L corresponding to λ1 is

P(λ1^(n−1), λ1^(n−2), …, λ1, 1)ᵀ = (λ1^(n−1), s1λ1^(n−2), s1s2λ1^(n−3), …, s1s2⋯sn−2λ1, s1s2⋯sn−2sn−1)ᵀ.

Finally, we can scale this vector by dividing each entry by λ1^(n−1) to get

(1, s1/λ1, s1s2/λ1², …, s1s2⋯sn−2/λ1^(n−2), s1s2⋯sn−2sn−1/λ1^(n−1))ᵀ.
19. Using a CAS, we find the positive eigenvalue of

L = [1 1 3; 0.7 0 0; 0 0.5 0]

to be λ ≈ 1.7456, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is

(1, 0.7/1.7456, 0.7 · 0.5/1.7456²) ≈ (1, 0.401, 0.115).

So the long-term proportions of the age classes are in the ratios 1 : 0.401 : 0.115.
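In place of a CAS, the growth rate can be estimated by power iteration, which is natural here because a Leslie matrix acts very simply on an age-class vector. This is a hedged sketch under that observation; `leslie_apply` is a hypothetical helper name.

```python
# A hedged sketch of Exercise 19 without a CAS: power iteration on the
# Leslie matrix L = [1 1 3; 0.7 0 0; 0 0.5 0], stored by its birth rates b
# and survival rates s.
def leslie_apply(b, s, x):
    top = sum(bi * xi for bi, xi in zip(b, x))        # births into class 1
    return [top] + [si * xi for si, xi in zip(s, x)]  # survivors age one class

b = [1.0, 1.0, 3.0]
s = [0.7, 0.5]
x = [1.0, 1.0, 1.0]
lam = 1.0
for _ in range(200):
    y = leslie_apply(b, s, x)
    lam = max(y, key=abs)            # growth-rate estimate
    x = [yi / lam for yi in y]
# lam approaches 1.7456 and x the stable ratios 1 : 0.401 : 0.115.
```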
20. Using a CAS, we find the positive eigenvalue of

L = [0 1 2 5; 0.5 0 0 0; 0 0.7 0 0; 0 0 0.3 0]

to be λ ≈ 1.2023, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is

(1, 0.5/1.2023, 0.5 · 0.7/1.2023², 0.5 · 0.7 · 0.3/1.2023³) ≈ (1, 0.416, 0.242, 0.060).

So the long-term proportions of the age classes are in the ratios 1 : 0.416 : 0.242 : 0.06.
21. Using a CAS, we find the positive eigenvalue of the 7 × 7 Leslie matrix L with first row

(0, 0.4, 1.8, 1.8, 1.8, 1.6, 0.6)

and subdiagonal (survival) entries 0.3, 0.7, 0.9, 0.9, 0.9, 0.6, all other entries zero, to be λ ≈ 1.0924, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is

(1, 0.3/1.0924, 0.3 · 0.7/1.0924², 0.3 · 0.7 · 0.9/1.0924³, 0.3 · 0.7 · 0.9 · 0.9/1.0924⁴, 0.3 · 0.7 · 0.9 · 0.9 · 0.9/1.0924⁵, 0.3 · 0.7 · 0.9 · 0.9 · 0.9 · 0.6/1.0924⁶)
≈ (1, 0.275, 0.176, 0.145, 0.119, 0.098, 0.054).

The entries of this vector give the long-term proportions of the age classes.
22. (a) The table gives us 13 separate age classes, each of duration two years; from the table, the Leslie matrix of the system is the 13 × 13 matrix with first row (birth rates)

(0, 0.02, 0.70, 1.53, 1.67, 1.65, 1.56, 1.45, 1.22, 0.91, 0.70, 0.22, 0)

and subdiagonal (survival) entries

0.91, 0.88, 0.85, 0.80, 0.74, 0.67, 0.59, 0.49, 0.38, 0.27, 0.17, 0.15,

all other entries being zero. Using a CAS, this matrix has positive eigenvalue ≈ 1.333, with a corresponding eigenvector (normalized so its entries add to 100)

(36.1, 24.65, 16.27, 10.38, 6.23, 3.46, 1.74, 0.77, 0.28, 0.08, 0.016, 0.0021, 0.00023).
(b) In the long run, the percentage of seals in each age group is given by the entries in the eigenvector
in part (a), and the growth rate will be about 1.333.
23. (a) Each term in r is the number of daughters born to a female in a particular age class times the
probability of a given individual surviving to be in that age class, so it represents the expected
number of daughters produced by a given individual while she is in that age class. So the sum
of the terms gives the expected number of daughters ever born to a particular female over her
lifetime.
(b) Define g(λ) as suggested. Then g(λ) = 1 if and only if

λ^n g(λ) = b1λ^(n−1) + b2s1λ^(n−2) + ⋯ + bns1s2⋯sn−1 = λ^n,

that is, if and only if

λ^n − b1λ^(n−1) − b2s1λ^(n−2) − ⋯ − bns1s2⋯sn−1 = 0.
But from Exercise 16, this is true if and only if λ is a zero of the characteristic polynomial of the
Leslie matrix, indicating that λ is an eigenvalue of L. So r = g(1) = 1 if and only if λ = 1 is an
eigenvalue of L, as desired.
(c) For given values of bi and si , it is clear that g(λ) is a decreasing function of λ. Also, r = g(1) and
g(λ1 ) = 1. So if λ1 > 1, then r = g(1) > g(λ1 ) = 1 so that r > 1. On the other hand, if λ1 < 1,
then r = g(1) < g(λ1 ) = 1, so that r < 1. So we have shown that if the population is decreasing,
then r < 1; if the population is increasing, then r > 1, and (part (b)) if the population is stable,
then r = 1. Since this exhausts the possibilities, we have proven the converses as well. Thus r < 1
if and only if the population is decreasing and r > 1 if and only if the population is increasing.
Note that this corresponds to our intuition, since if the net reproduction rate is less than one, we
would expect the population in the next generation to be smaller.
24. If λ1 is the unique positive eigenvalue, then λ1x = Lx for the corresponding eigenvector x. If h is the sustainable harvest ratio, then (1 − h)Lx = x, which means that Lx = 1/(1 − h) x. Thus λ1 = 1/(1 − h), so that h = 1 − 1/λ1.
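The harvest relation just derived is a one-line formula; the sketch below simply applies it to the growth rates found in Exercises 21 and 22, as the next two exercises do. The variable names are hypothetical.

```python
# A hedged sketch of the sustainable harvest relation h = 1 - 1/lambda1
# from Exercise 24.
def sustainable_harvest(lambda1):
    """Fraction of the population that can be harvested each period."""
    return 1 - 1 / lambda1

h25 = sustainable_harvest(1.0924)   # Exercise 25: about 0.0846
h26 = sustainable_harvest(1.333)    # Exercise 26: about 0.25
```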
25. (a) Exercise 21 found the positive eigenvalue of L to be λ1 ≈ 1.0924, so from Exercise 24, h = 1 − 1/λ1 ≈ 1 − 1/1.0924 ≈ 0.0846.
(b) Reducing the initial population levels by the fraction 0.0846 gives

x0′ = (1 − 0.0846)(10, 2, 8, 5, 12, 0, 1) = (9.15, 1.831, 7.323, 4.577, 10.985, 0, 0.915).

Then

x1 = Lx0′ = (42.47, 2.75, 1.28, 6.59, 4.12, 9.89, 0).

The population has still grown substantially, because we did not start with a stable population vector. If we repeat with the stable population ratios computed in Exercise 21, we get

x0′′ = (1 − 0.0846)(1, 0.275, 0.176, 0.145, 0.119, 0.098, 0.054) = (0.915, 0.252, 0.161, 0.133, 0.109, 0.090, 0.049),

and

x1′ = Lx0′′ = (0.999, 0.275, 0.176, 0.145, 0.119, 0.098, 0.053),

which is essentially x0.
26. Since λ1 ≈ 1.333, we have h = 1 − 1/λ1 = 1 − 1/1.333 ≈ 0.25, which is about one quarter of the population.
27. Write λ1 = r1(cos 0 + i sin 0); then r1 = λ1 > 0 since λ1 is a positive eigenvalue. Let λ = r(cos θ + i sin θ) be some other eigenvalue. Then g(λ) = g(λ1) = 1. Computing g(λ) gives

1 = g(λ) = b1/(r(cos θ + i sin θ)) + b2s1/(r²(cos θ + i sin θ)²) + ⋯ + bns1s2⋯sn−1/(r^n (cos θ + i sin θ)^n)
         = (b1/r)(cos θ − i sin θ) + (b2s1/r²)(cos θ − i sin θ)² + ⋯ + (bns1s2⋯sn−1/r^n)(cos θ − i sin θ)^n,

where we get to the final equation by multiplying each fraction by (cos θ − i sin θ)^k / (cos θ − i sin θ)^k = 1 and using the identity (cos θ + i sin θ)(cos θ − i sin θ) = cos²θ + sin²θ = 1. Now apply DeMoivre's theorem to get

1 = g(λ) = (b1/r)(cos θ − i sin θ) + (b2s1/r²)(cos 2θ − i sin 2θ) + ⋯ + (bns1s2⋯sn−1/r^n)(cos nθ − i sin nθ).

Since g(λ) = 1 is real, the imaginary parts must all cancel, so we are left with

1 = g(λ) = (b1/r) cos θ + (b2s1/r²) cos 2θ + ⋯ + (bns1s2⋯sn−1/r^n) cos nθ.

But cos kθ ≤ 1 for all k, and the bi, si, and r are all nonnegative, so

1 = g(λ) ≤ b1/r + b2s1/r² + ⋯ + bns1s2⋯sn−1/r^n.

Since g(λ1) = 1 as well, we get

b1/r1 + b2s1/r1² + ⋯ + bns1s2⋯sn−1/r1^n ≤ b1/r + b2s1/r² + ⋯ + bns1s2⋯sn−1/r^n.

Since each term on either side is a decreasing function of the positive variable in its denominator, and r1 > 0, we get r1 ≥ r.
28. The characteristic polynomial of A is

det(A − λI) = |2 − λ 0; 1 1 − λ| = (λ − 2)(λ − 1),

so that the Perron root is λ1 = 2. To find its Perron eigenvector,

[A − 2I | 0] = [0 0 | 0; 1 −1 | 0] → [1 −1 | 0; 0 0 | 0],

so that (1, 1) is an eigenvector for λ1 = 2. The entries of the Perron eigenvector must sum to 1, however, so dividing through by 2 gives the Perron eigenvector (1/2, 1/2).
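The Perron root and eigenvector in Exercises 28-31 can also be approximated numerically; the sketch below is a hedged power-iteration alternative to the row reductions in the text, and `perron` is a hypothetical helper name.

```python
# A hedged sketch: Perron root and Perron eigenvector (entries summing to 1)
# of a nonnegative matrix via power iteration.
def perron(A, steps=200):
    n = len(A)
    x = [1.0] * n
    lam = 1.0
    for _ in range(steps):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(y)                  # entries stay positive for these matrices
        x = [yi / lam for yi in y]
    s = sum(x)
    return lam, [xi / s for xi in x]  # scale so entries sum to 1

lam, v = perron([[2.0, 0.0], [1.0, 1.0]])   # Exercise 28: lam = 2, v = (1/2, 1/2)
```

Applying the same function to the Exercise 29 matrix [1 3; 2 0] gives the Perron root 3 and Perron eigenvector (3/5, 2/5).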
29. The characteristic polynomial of A is

det(A − λI) = |1 − λ 3; 2 −λ| = (1 − λ)(−λ) − 6 = λ² − λ − 6 = (λ − 3)(λ + 2),

so that the Perron root is λ1 = 3. To find its Perron eigenvector,

[A − 3I | 0] = [−2 3 | 0; 2 −3 | 0] → [1 −3/2 | 0; 0 0 | 0],

so that (3, 2) is an eigenvector for λ1 = 3. The entries of the Perron eigenvector must sum to 1, however, so dividing through by 5 gives the Perron eigenvector (3/5, 2/5).
30. The characteristic polynomial of A is

det(A − λI) = |−λ 1 1; 1 −λ 1; 1 1 −λ| = −(λ + 1)²(λ − 2),

so that the Perron root is λ1 = 2. To find its Perron eigenvector,

[A − 2I | 0] = [−2 1 1 | 0; 1 −2 1 | 0; 1 1 −2 | 0] → [1 0 −1 | 0; 0 1 −1 | 0; 0 0 0 | 0],

so that (1, 1, 1) is an eigenvector for λ1 = 2. The entries of the Perron eigenvector must sum to 1, however, so dividing through by 3 gives the Perron eigenvector (1/3, 1/3, 1/3).
31. The characteristic polynomial of A is

det(A − λI) = |2 − λ 1 1; 1 1 − λ 0; 1 0 1 − λ| = −λ(λ − 1)(λ − 3),

so that the Perron root is λ1 = 3. To find its Perron eigenvector,

[A − 3I | 0] = [−1 1 1 | 0; 1 −2 0 | 0; 1 0 −2 | 0] → [1 0 −2 | 0; 0 1 −1 | 0; 0 0 0 | 0],

so that (2, 1, 1) is an eigenvector for λ1 = 3. The entries of the Perron eigenvector must sum to 1, however, so dividing through by 4 gives the Perron eigenvector (1/2, 1/4, 1/4).
32. Here n = 4, and every entry of (I + A)³ is positive (its entries are 1s and 3s), so (I + A)^(n−1) > O and A is irreducible.
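The test applied in Exercises 32-35 can be sketched directly; this is a hedged illustration with a hypothetical example graph, and `is_irreducible` is a helper name not in the text.

```python
# A hedged sketch of the irreducibility test: an n x n nonnegative matrix A
# is irreducible exactly when every entry of (I + A)^(n-1) is positive.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_irreducible(A):
    n = len(A)
    M = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    P = M
    for _ in range(n - 2):           # P becomes (I + A)^(n-1)
        P = matmul(P, M)
    return all(entry > 0 for row in P for entry in row)

# Example: the adjacency matrix of a connected 4-cycle is irreducible.
C4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
```

A block-diagonal adjacency matrix (a disconnected graph) fails the test, matching the reducible cases in Exercises 33 and 34.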
33. Here n = 4, and (I + A)³ contains zero entries (its entries are 0s, 4s, and 6s), so A is reducible. Interchanging rows 1 and 4 and then columns 1 and 4 produces a matrix in the required block form.
34. Here n = 5, and (I + A)⁴ contains zero entries, so A is reducible. Exchanging rows 1 and 4 and then exchanging columns 1 and 4 produces a matrix in the required form, where B is a 1 × 1 matrix and D is 4 × 4.
35. Here n = 5, and every entry of (I + A)⁴ is positive, so that A is irreducible.
36. (a) Let G be a graph and A its adjacency matrix. Let G′ be the graph obtained from G by adding an edge from each vertex to itself if no such edge exists in G. Then G′ is connected if and only if G is, since every vertex is always connected to itself. The adjacency matrix A′ of G′ results from A by replacing every 0 on the diagonal by a 1. Now consider the matrix A + I. This matrix is identical to A′ except that some diagonal entries of A′ equal to 1 may be equal to 2 in A + I (if the vertex in question had an edge from it to itself in G). But since all entries in A + I and A′ are nonnegative, (A′)^(n−1) > O if and only if (A + I)^(n−1) > O (whether those diagonal entries are 1 or 2 does not affect whether a particular entry in the power is zero or not). So G is connected if and only if G′ is connected, which is the case if and only if (A′)^(n−1) > O, so that there is a path between any two vertices of length n − 1 (note that any shorter path can be made into a path of length n − 1 by adding traverses of the edge from the vertex to itself). But (A′)^(n−1) > O then means that (A + I)^(n−1) > O, so that A is irreducible.
(b) Since all the graphs in Section 4.0 are connected, they all have irreducible adjacency matrices.
The graph in Figure 4.1 has a primitive adjacency matrix, since

        [0 1 1 1]^2   [3 2 2 2]
  A^2 = [1 0 1 1]   = [2 3 2 2] > O.
        [1 1 0 1]     [2 2 3 2]
        [1 1 1 0]     [2 2 2 3]
So does the graph in Figure 4.2: computing A^4 for its adjacency matrix shows that every entry is positive (each diagonal entry of A^4 is 15, and each off-diagonal entry is 4 or 9), so A^4 > O.
The graph in Figure 4.3 has a primitive adjacency matrix:

        [0 1 0 0 1]^4   [6 1 4 4 1]
  A^4 = [1 0 1 0 0]   = [1 6 1 4 4] > O.
        [0 1 0 1 0]     [4 1 6 1 4]
        [0 0 1 0 1]     [4 4 1 6 1]
        [1 0 0 1 0]     [1 4 4 1 6]
However, the adjacency matrix for the graph in Figure 4.4 is

      [0 0 0 1 1 1]
      [0 0 0 1 1 1]
  A = [0 0 0 1 1 1] ,
      [1 1 1 0 0 0]
      [1 1 1 0 0 0]
      [1 1 1 0 0 0]
and any power of A will always have a 3 × 3 block of zeroes in either the upper left or upper right,
so A is not primitive.
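The primitivity test used above can be checked mechanically: power up the adjacency matrix and look for a strictly positive power. A minimal sketch (function and variable names are ours, not the text's):

```python
# Check primitivity of an adjacency matrix by testing whether some power
# A^k is strictly positive. Pure-Python matrix multiplication, no libraries.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_primitive(A, max_power=50):
    """Return True if some power A^k (k <= max_power) has all entries > 0."""
    P = A
    for _ in range(max_power):
        if all(entry > 0 for row in P for entry in row):
            return True
        P = mat_mul(P, A)
    return False

# Adjacency matrix of the 5-cycle (Figure 4.3): primitive, since A^4 > O.
C5 = [[0, 1, 0, 0, 1], [1, 0, 1, 0, 0], [0, 1, 0, 1, 0],
      [0, 0, 1, 0, 1], [1, 0, 0, 1, 0]]
# Adjacency matrix of the bipartite graph of Figure 4.4: never primitive,
# since every power keeps a 3 x 3 block of zeros.
K33 = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1],
       [1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0]]
```

The cutoff `max_power=50` is an arbitrary safety bound for this sketch; for an n × n primitive matrix a positive power appears well before n² steps.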
37. (a) Suppose G is bipartite with disjoint vertex sets U and V and A is its adjacency matrix. Recall
that (Ak )ij gives the number of paths between vertices i and j. Also, from Exercise 80 in Section
3.7, if u ∈ U and v ∈ V , then any path from u to itself will have even length, and any path from
CHAPTER 4. EIGENVALUES AND EIGENVECTORS
u to v will have odd length, since in any path, each successive edge moves you from U to V and
back. So odd powers of A will have zero entries along the diagonal, while even powers of A will
have zero entries in any cell corresponding to a vertex in U and a vertex in V . Since every power
of A has a zero entry, A is not primitive.
(b) Number the vertices so that v1, v2, ..., vk are in U and vk+1, vk+2, ..., vn are in V. Then the
adjacency matrix of G is

  A = [ O   B ]
      [ B^T O ] .

Write the eigenvector v corresponding to λ as [v1; v2], where v1 is a k-element column vector and
v2 is an (n − k)-element column vector. Then

  Av = λv  ⟺  [B v2; B^T v1] = [λ v1; λ v2],

so that B v2 = λ v1 and B^T v1 = λ v2. Then

  A [−v1; v2] = [B v2; −B^T v1] = [λ v1; −λ v2] = −λ [−v1; v2].

This shows that −λ is an eigenvalue of A with corresponding eigenvector [−v1; v2].
38. (a) Since each vertex of G connects to k other vertices, we see that each column of the adjacency
matrix A of G sums to k. Thus A = kP , where P is a matrix each of whose columns sums to
1. So P is a transition matrix, and by Theorem 4.30, 1 is an eigenvalue of P , so that k is an
eigenvalue of kP = A.
(b) If A is primitive, then Ar > O for some r, and therefore P r > O as well. Therefore P is regular,
so by Theorem 4.31, every other eigenvalue of P has absolute value strictly less than 1, so that
every other eigenvalue of A = kP has absolute value strictly less than k.
39. This exercise is left to the reader after doing the exploration suggested in Section 4.0.
40. (a) Let A = [aij ]. Then cA = [caij ], so that
|cA| = [|caij |] = [|c| · |aij |] = |c| [|aij |] = |c| · |A| .
(b) Using the Triangle Inequality,
|A + B| = [|aij + bij |] ≤ [|aij | + |bij |] = [|aij |] + [|bij |] = |A| + |B| .
(c) Ax is the n × 1 matrix whose ith entry is
ai1 x1 + ai2 x2 + · · · + ain xn .
Thus the ith entry of |Ax| is
|(Ax)i | = |ai1 x1 + ai2 x2 + · · · + ain xn |
≤ |ai1 x1 | + |ai2 x2 | + · · · + |ain xn |
= |ai1 | |x1 | + |ai2 | |x2 | + · · · + |ain | |xn | .
But this is just the ith entry of |A| |x|.
(d) The ij entry of |AB| is
|AB|ij = |ai1 b1j + ai2 b2j + · · · + ain bnj |
≤ |ai1 b1j | + |ai2 b2j | + · · · + |ain bnj |
= |ai1 | |b1j | + |ai2 | |b2j | + · · · + |ain | |bnj | ,
which is the ij entry of |A| |B|.
41. A matrix is reducible if, after permuting rows and columns according to the same permutation, it can
be written

  [B C]
  [O D] ,

where B and D are square and O is the zero matrix. If A is 2 × 2, then B, C, O, and D must all be
1 × 1 matrices (which are therefore just the entries of A). If A is in the above form after applying no
permutation (leaving A alone), then a21 = 0. If A is in this form after applying the only other possible
permutation, exchanging the two rows and the two columns, then the resulting matrix is

  [a22 a21]
  [a12 a11] ,

so that a12 = 0. This proves the result.
42. (a) If λ1 and v1 are the Perron eigenvalue and eigenvector respectively, then by Theorem 4.37, λ1 > 0
and v1 > 0. By Exercise 22 in Section 4.3, λ1 − 1 is an eigenvalue, with eigenvector v1, for
A − I, and therefore 1 − λ1 is an eigenvalue for I − A, again with eigenvector v1. Since I − A is
invertible, Theorem 4.18 says that 1/(1 − λ1) is an eigenvalue for (I − A)^{-1}, again with eigenvector v1.
But (I − A)^{-1} ≥ O and v1 ≥ 0, so that

  (1/(1 − λ1)) v1 = (I − A)^{-1} v1 ≥ 0.

It follows that 1/(1 − λ1) ≥ 0 and thus λ1 < 1. Since we also have λ1 > 0, the result follows: 0 < λ1 < 1.
(b) Since 0 < λ1 < 1, v1 > λ1 v1 = Av1.
43. x0 = 1, x1 = 2x0 = 2, x2 = 2x1 = 4, x3 = 2x2 = 8, x4 = 2x3 = 16, x5 = 2x4 = 32.
44. a1 = 128, a2 = a1/2 = 64, a3 = a2/2 = 32, a4 = a3/2 = 16, a5 = a4/2 = 8.
45. y0 = 0, y1 = 1, y2 = y1 − y0 = 1, y3 = y2 − y1 = 0, y4 = y3 − y2 = −1, y5 = y4 − y3 = −1.
46. b0 = 1, b1 = 1, b2 = 2b1 + b0 = 3, b3 = 2b2 + b1 = 7, b4 = 2b3 + b2 = 17, b5 = 2b4 + b3 = 41.
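The four sequences in Exercises 43-46 can be regenerated by iterating each recurrence directly. A quick check (helper name is ours):

```python
# Iterate a recurrence given as a function of the history so far,
# and compare against the hand-computed terms of Exercises 43-46.
def iterate(step, initial, n):
    """Return the first n terms of a sequence defined by a one-step rule."""
    seq = list(initial)
    while len(seq) < n:
        seq.append(step(seq))
    return seq

xs = iterate(lambda s: 2 * s[-1], [1], 6)              # Exercise 43: x_n = 2x_{n-1}
as_ = iterate(lambda s: s[-1] // 2, [128], 6)          # Exercise 44: a_n = a_{n-1}/2 ("as" is a keyword)
ys = iterate(lambda s: s[-1] - s[-2], [0, 1], 6)       # Exercise 45: y_n = y_{n-1} - y_{n-2}
bs = iterate(lambda s: 2 * s[-1] + s[-2], [1, 1], 6)   # Exercise 46: b_n = 2b_{n-1} + b_{n-2}
```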
47. The recurrence is xn − 3xn−1 − 4xn−2 = 0, so the characteristic equation is λ^2 − 3λ − 4 = (λ − 4)(λ + 1) = 0.
Thus the system has eigenvalues −1 and 4, so xn = c1(−1)^n + c2 · 4^n, where c1 and c2 are constants.
Using the initial conditions, we have

  0 = x0 = c1(−1)^0 + c2 · 4^0 = c1 + c2,
  5 = x1 = c1(−1)^1 + c2 · 4^1 = −c1 + 4c2.

Add the two equations to solve for c2, then substitute back in to get c1 = −1 and c2 = 1, so that
xn = −(−1)^n + 4^n = 4^n + (−1)^{n−1}.
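The closed form just derived can be verified against the recurrence itself (a sanity check of ours, not part of the solution):

```python
# Exercise 47: the recurrence x_n = 3x_{n-1} + 4x_{n-2}, x_0 = 0, x_1 = 5,
# should agree with the closed form x_n = 4^n + (-1)^(n-1) = 4^n - (-1)^n.
x = [0, 5]
for n in range(2, 12):
    x.append(3 * x[n - 1] + 4 * x[n - 2])

closed = [4 ** n - (-1) ** n for n in range(12)]
```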
48. The recurrence is xn − 4xn−1 + 3xn−2 = 0, so the characteristic equation is λ^2 − 4λ + 3 = (λ − 3)(λ − 1) = 0.
Thus the system has eigenvalues 1 and 3, so xn = c1 · 1^n + c2 · 3^n, where c1 and c2 are constants. Using
the initial conditions, we have

  0 = x0 = c1 · 1^0 + c2 · 3^0 = c1 + c2,
  1 = x1 = c1 · 1^1 + c2 · 3^1 = c1 + 3c2.

Subtract the two equations to solve for c2, then substitute back in to get c1 = −1/2 and c2 = 1/2, so that
xn = −(1/2) · 1^n + (1/2) · 3^n = (3^n − 1)/2.
49. The recurrence is yn − 4yn−1 + 4yn−2 = 0, so the characteristic equation is λ^2 − 4λ + 4 = (λ − 2)^2 = 0.
Thus the system has the double eigenvalue 2, so yn = c1 · 2^n + c2 · n2^n, where c1 and c2 are constants.
Using the initial conditions, we have

  1 = y1 = c1 · 2^1 + c2 · 1 · 2^1 = 2c1 + 2c2,
  6 = y2 = c1 · 2^2 + c2 · 2 · 2^2 = 4c1 + 8c2.

Solving the system gives c1 = −1/2 and c2 = 1, so that yn = −(1/2) · 2^n + n · 2^n = n2^n − 2^{n−1}.
50. The recurrence is an − an−1 + (1/4)an−2 = 0, so the characteristic equation is (1/4)(4λ^2 − 4λ + 1) = (1/4)(2λ − 1)^2 =
0. Thus the system has the double eigenvalue 1/2 = 2^{−1}, so an = c1 · 2^{−n} + c2 · n2^{−n}, where c1 and c2
are constants. Using the initial conditions, we have

  4 = a0 = c1 · 2^{−0} + c2 · 0 · 2^{−0} = c1,
  1 = a1 = c1 · 2^{−1} + c2 · 1 · 2^{−1} = (1/2)c1 + (1/2)c2.

Solving the system gives c1 = 4 and c2 = −2, so that an = 4 · 2^{−n} − 2n · 2^{−n} = 2^{2−n} − n2^{1−n}.
51. The recurrence is bn − 2bn−1 − 2bn−2 = 0, so the characteristic equation is λ^2 − 2λ − 2 = 0. Using
the quadratic formula, this equation has roots λ = 1 ± √3, which are therefore the eigenvalues. So
bn = c1 · (1 + √3)^n + c2 · (1 − √3)^n, where c1 and c2 are constants. Using the initial conditions, we have

  0 = b0 = c1 · (1 + √3)^0 + c2 · (1 − √3)^0 = c1 + c2,
  1 = b1 = c1 · (1 + √3)^1 + c2 · (1 − √3)^1 = c1 + c2 + (c1 − c2)√3.

From the first equation, c1 + c2 = 0, so the second becomes 1 = (c1 − c2)√3, so that c1 − c2 = √3/3.
Since c2 = −c1, we get c1 = √3/6 and c2 = −√3/6, so that the recurrence is

  bn = (√3/6)(1 + √3)^n − (√3/6)(1 − √3)^n = (√3/6)[(1 + √3)^n − (1 − √3)^n].
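Even though the closed form involves √3, it must produce the integer sequence defined by the recurrence. A numerical check of ours:

```python
# Exercise 51: compare the integer recurrence b_n = 2b_{n-1} + 2b_{n-2},
# b_0 = 0, b_1 = 1, with the closed form
# b_n = (sqrt(3)/6) * ((1 + sqrt(3))^n - (1 - sqrt(3))^n).
import math

b = [0, 1]
for n in range(2, 15):
    b.append(2 * b[n - 1] + 2 * b[n - 2])

s = math.sqrt(3)
closed = [(s / 6) * ((1 + s) ** n - (1 - s) ** n) for n in range(15)]
errors = [abs(b[n] - closed[n]) for n in range(15)]
```

The floating-point evaluation agrees with the exact integers to well below rounding noise at these sizes.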
52. The recurrence is yn − yn−1 + yn−2 = 0, so the characteristic equation is λ^2 − λ + 1 = 0. Using the
quadratic formula, this equation has roots λ = (1 ± √3 i)/2, which are therefore the eigenvalues. So
yn = c1 · ((1 + √3 i)/2)^n + c2 · ((1 − √3 i)/2)^n, where c1 and c2 are constants. Using the initial conditions,
we have

  0 = y0 = c1 · ((1 + √3 i)/2)^0 + c2 · ((1 − √3 i)/2)^0 = c1 + c2,
  1 = y1 = c1 · ((1 + √3 i)/2)^1 + c2 · ((1 − √3 i)/2)^1 = (1/2)c1(1 + √3 i) + (1/2)c2(1 − √3 i).

From the first equation, c1 + c2 = 0, so the second becomes 2 = (c1 − c2)√3 i, so that c1 − c2 = −(2√3/3) i.
Since c2 = −c1, we get c1 = −(√3/3) i and c2 = (√3/3) i, so that the recurrence is

  yn = −(√3/3) i ((1 + √3 i)/2)^n + (√3/3) i ((1 − √3 i)/2)^n.
53. Working with the matrix A from the Theorem, since it has distinct eigenvalues λ1 ≠ λ2, it can be
diagonalized, so that there is some invertible

  P = [e f]   with   P^{-1}AP = D = [λ1  0]
      [g h]                         [ 0 λ2] .

First consider the case where one of the eigenvalues, say λ2, is zero. In this case the matrix must be
of the form

  A = [a 0]
      [1 0] ,

so that the other eigenvalue is a. In this case, the recurrence relation is xn = axn−1 + 0xn−2 = axn−1,
so if we are given xj and xj−1 as initial conditions, we have xn = a^{n−j} xj. So take c1 = xj a^{−j} and
c2 = 0; then xn = c1 λ1^n + c2 λ2^n as desired. Now suppose that neither eigenvalue is zero. Then

  A^k = P D^k P^{-1} = [e f] [λ1^k   0 ] · (1/(eh − fg)) [ h −f]
                       [g h] [ 0   λ2^k]                 [−g  e]

      = (1/(eh − fg)) [eh λ1^k − fg λ2^k   −ef λ1^k + ef λ2^k]
                      [gh λ1^k − gh λ2^k   −fg λ1^k + eh λ2^k] .
If we are given two terms of the sequence, say xj−1 and xj, then for any n we have

  [xn  ] = A^{n−j} [xj  ] = (1/(eh − fg)) [eh λ1^{n−j} − fg λ2^{n−j}   −ef λ1^{n−j} + ef λ2^{n−j}] [xj  ]
  [xn−1]           [xj−1]                 [gh λ1^{n−j} − gh λ2^{n−j}   −fg λ1^{n−j} + eh λ2^{n−j}] [xj−1] .

Looking at the first row, we see that

  xn = (1/(eh − fg)) [(eh xj − ef xj−1) λ1^{−j} · λ1^n + (−fg xj + ef xj−1) λ2^{−j} · λ2^n],

so that for

  c1 = ((eh xj − ef xj−1)/(eh − fg)) λ1^{−j},   c2 = ((−fg xj + ef xj−1)/(eh − fg)) λ2^{−j},

c1 and c2 are independent of n, and we have xn = c1 λ1^n + c2 λ2^n, as required.
54. (a) First suppose that the eigenvalues are distinct, i.e., that λ1 ≠ λ2. Then Exercise 53 gives one
formula for c1 and c2 involving the diagonalizing matrix P. But since we now know that c1 and c2
exist, we can derive a simpler formula. We know the solutions are of the form xn = c1 λ1^n + c2 λ2^n,
valid for all n. Since x0 = r and x1 = s, we get the following linear system in c1 and c2:

  λ1^0 c1 + λ2^0 c2 = r          c1 + c2 = r
                          or
  λ1^1 c1 + λ2^1 c2 = s          λ1 c1 + λ2 c2 = s.

Solving this system, for example by setting up the augmented matrix and row-reducing, gives a
unique solution

  c1 = r − (s − λ1 r)/(λ2 − λ1),   c2 = (s − λ1 r)/(λ2 − λ1).

So the system has a unique solution in terms of x0 = r, x1 = s, λ1, and λ2. Note that the system
has a unique solution (and these expressions make sense) since λ1 ≠ λ2.
Next suppose that the eigenvalues are the same, say λ1 = λ2 = λ. Then the solutions are of
the form xn = c1 λ^n + c2 nλ^n. Note that we must have λ ≠ 0, since otherwise the characteristic
polynomial of the recurrence matrix is λ^2, so the recurrence relation is xn = 0. With x0 = r and
x1 = s, we get the following linear system in c1 and c2:

  λ^0 c1 + 0 · λ^0 c2 = r          c1 = r
                            or
  λ^1 c1 + 1 · λ^1 c2 = s          λ c1 + λ c2 = s.

Thus c1 = r, and substituting into the second equation gives (since λ ≠ 0) c2 = (s − λr)/λ. So in this
case as well we can find the scalars c1 and c2 uniquely. Finally, note that 0 is an eigenvalue of the
recurrence matrix only when b = 0 in the recurrence relation xn = axn−1 + bxn−2.
(b) From part (a), the general solution to the recurrence is

  xn = c1 λ1^n + c2 λ2^n = (r − (s − λ1 r)/(λ2 − λ1)) λ1^n + ((s − λ1 r)/(λ2 − λ1)) λ2^n.

Substituting x0 = r = 0 and x1 = s = 1 gives

  xn = −(1/(λ2 − λ1)) λ1^n + (1/(λ2 − λ1)) λ2^n = (1/(λ2 − λ1))(−λ1^n + λ2^n) = (1/(λ1 − λ2))(λ1^n − λ2^n).
55. (a) For n = 1, since f0 = 0 and f1 = f2 = 1, the equation holds:

  A^1 = A = [1 1] = [f2 f1]
            [1 0]   [f1 f0] .

Now assume that the equation is true for n = k, and consider A^{k+1}:

  A^{k+1} = A^k A = [fk+1 fk  ] [1 1] = [fk+1 + fk   fk+1] = [fk+2 fk+1]
                    [fk   fk−1] [1 0]   [fk + fk−1   fk  ]   [fk+1 fk  ]

as desired.
(b) Since det A = −1, it follows that det(A^n) = (det A)^n = (−1)^n. But from part (a),

  det(A^n) = fn+1 fn−1 − fn^2 = (−1)^n.

(c) If we look at the 5 × 13 "rectangle" as being composed of two "triangles", we see that in fact they
are not triangles: the "diagonal" of the "rectangle" is not straight, because the 8 × 3 triangular
pieces are not similar to the entire triangles, since 5/13 ≠ 3/8. So there is empty space along the
diagonal of the figure that adds up to the missing 1 square unit.
56. (a) t1 = 1 (the only way is using the 1 × 1 rectangle); t2 = 3 (two 1 × 1 rectangles or either of the
1 × 2 rectangles); t3 = 5; t4 = 11; t5 = 21. For t0 , we can think of that as the number of ways to
tile an empty rectangle; there is only one way — using no rectangles.
(b) The rightmost rectangle in a tiling of a 1 × n rectangle is either the 1 × 1 or one of the 1 × 2
rectangles. So the number of tilings of a 1 × n rectangle is the number of tilings of a 1 × (n − 1)
rectangle (adding the 1 × 1) plus twice the number of tilings of a 1 × (n − 2) rectangle (adding
either 1 × 2). So the recurrence relation is
tn = tn−1 + 2tn−2 .
(c) The characteristic equation is λ^2 − λ − 2 = (λ − 2)(λ + 1) = 0, so the eigenvalues are −1 and 2. To
find the constants, we solve

  t1 = 1 = c1(−1)^1 + c2 · 2^1 = −c1 + 2c2
  t2 = 3 = c1(−1)^2 + c2 · 2^2 = c1 + 4c2.

This gives c1 = 1/3 and c2 = 2/3, so that

  tn = (1/3)(−1)^n + (2/3) · 2^n = (1/3)[(−1)^n + 2^{n+1}].

Evaluating for various values of n gives

  t0 = (1/3)[(−1)^0 + 2] = 1,      t1 = (1/3)[(−1)^1 + 2^2] = 1,
  t2 = (1/3)[(−1)^2 + 2^3] = 3,    t3 = (1/3)[(−1)^3 + 2^4] = 5,
  t4 = (1/3)[(−1)^4 + 2^5] = 11,   t5 = (1/3)[(−1)^5 + 2^6] = 21.

These agree with part (a).
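The recurrence from part (b) and the closed form from part (c) can be compared directly (a check of ours):

```python
# Exercise 56: t_n = t_{n-1} + 2t_{n-2} counts tilings of a 1 x n board by one
# 1 x 1 square and two distinguishable 1 x 2 pieces; the closed form is
# t_n = ((-1)^n + 2^(n+1)) / 3.
def tilings(n):
    if n < 0:
        return 0
    if n == 0:
        return 1          # the empty tiling
    return tilings(n - 1) + 2 * tilings(n - 2)

counts = [tilings(n) for n in range(8)]
formula = [((-1) ** n + 2 ** (n + 1)) // 3 for n in range(8)]
```

Integer division is exact because (−1)^n + 2^{n+1} is always divisible by 3.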
57. (a) d1 = 1, d2 = 2, d3 = 3, d4 = 5, d5 = 8. As in Exercise 56, d0 can be interpreted as the number of
ways to cover a 2 × 0 rectangle, which is one — use no dominos.
(b) The right end of any tiling of a 2 × n rectangle either consists of a domino laid vertically, or of
two dominos each laid horizontally. Removing those leaves either a 2 × (n − 1) or a 2 × (n − 2)
rectangle. So the number of tilings of a 2 × n rectangle is equal to the number of tilings of a
2 × (n − 1) rectangle plus the number of tilings of a 2 × (n − 2) rectangle, or
dn = dn−1 + dn−2 .
(c) We recognize the recurrence as the recurrence for the Fibonacci sequence; the initial values shift
it by 1, so we get

  dn = (1/√5) ((1 + √5)/2)^{n+1} − (1/√5) ((1 − √5)/2)^{n+1}.

Computing values using this formula gives the same values as in part (a), including the value for
d0.
58. Row-reducing gives

  [A − ((1+√5)/2)I | 0] = [1 − (1+√5)/2      1       0] → [1  −(1+√5)/2  0] ⇒ v1 = [(1+√5)/2]
                          [     1       −(1+√5)/2    0]   [0      0      0]        [   1    ]

  [A − ((1−√5)/2)I | 0] = [1 − (1−√5)/2      1       0] → [1  −(1−√5)/2  0] ⇒ v2 = [(1−√5)/2]
                          [     1       −(1−√5)/2    0]   [0      0      0]        [   1    ] .

So the eigenvectors are v1 = [(1+√5)/2; 1] and v2 = [(1−√5)/2; 1]. Then

  xk/λ1^k = [ (1/√5)((1+√5)/2)^k − (1/√5)((1−√5)/2)^k         ] / ((1+√5)/2)^k
            [ (1/√5)((1+√5)/2)^{k−1} − (1/√5)((1−√5)/2)^{k−1} ]

          = [ 1/√5 − (1/√5)((1−√5)/(1+√5))^k                           ]
            [ 2/(√5(1+√5)) − (1/√5)(2/(1+√5))((1−√5)/(1+√5))^{k−1}     ] .

Since |(1−√5)/(1+√5)| ≈ 0.38 < 1, the terms involving kth powers of this fraction go to zero as k → ∞, leaving
us with (in the limit)

  lim_{k→∞} xk/λ1^k = [1/√5; 2/(√5(1+√5))] = (2/(√5(1+√5))) [(1+√5)/2; 1] = ((5 − √5)/10) v1,

which is a constant times v1.
59. The coefficient matrix has characteristic polynomial

  det [1 − λ    3  ] = (1 − λ)(2 − λ) − 6 = λ^2 − 3λ − 4 = (λ − 4)(λ + 1),
      [  2    2 − λ]

so that the eigenvalues are λ1 = 4 and λ2 = −1. The corresponding eigenvectors are found by row-reduction:

  [A − 4I | 0] = [−3  3  0] → [1 −1  0] ⇒ v1 = [1]
                 [ 2 −2  0]   [0  0  0]        [1]

  [A + I | 0] = [2  3  0] → [1  3/2  0] ⇒ v2 = [ 3]
                [2  3  0]   [0   0   0]        [−2] .

Then by Theorem 4.40,

  x = C1 e^{4t} [1] + C2 e^{−t} [ 3]
                [1]             [−2] .

Substitute t = 0 to determine the constants:

  x(0) = [0] = C1 e^0 [1] + C2 e^0 [ 3] = [C1 + 3C2]
         [5]          [1]          [−2]   [C1 − 2C2] .

These two equations give C1 = 3 and C2 = −1, so that the solution is

  x = 3e^{4t} [1] − e^{−t} [ 3] ,   or   x(t) = 3e^{4t} − 3e^{−t}
              [1]          [−2]          y(t) = 3e^{4t} + 2e^{−t}.
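A numerical sanity check of ours: the solution just found should satisfy the system x′ = x + 3y, y′ = 2x + 2y and the initial condition, which we test with a centered finite difference.

```python
# Exercise 59 check: x(t) = 3e^{4t} - 3e^{-t}, y(t) = 3e^{4t} + 2e^{-t}
# against x' = x + 3y, y' = 2x + 2y with x(0) = 0, y(0) = 5.
import math

def x(t): return 3 * math.exp(4 * t) - 3 * math.exp(-t)
def y(t): return 3 * math.exp(4 * t) + 2 * math.exp(-t)

t, h = 0.5, 1e-6
dx = (x(t + h) - x(t - h)) / (2 * h)   # centered difference for x'(t)
dy = (y(t + h) - y(t - h)) / (2 * h)   # centered difference for y'(t)
res_x = abs(dx - (x(t) + 3 * y(t)))
res_y = abs(dy - (2 * x(t) + 2 * y(t)))
```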
60. The coefficient matrix has characteristic polynomial

  det [2 − λ   −1 ] = (2 − λ)^2 − 1 = λ^2 − 4λ + 3 = (λ − 1)(λ − 3),
      [ −1   2 − λ]

so that the eigenvalues are λ1 = 1 and λ2 = 3. The corresponding eigenvectors are found by row-reduction:

  [A − I | 0] = [ 1 −1  0] → [1 −1  0] ⇒ v1 = [1]
                [−1  1  0]   [0  0  0]        [1]

  [A − 3I | 0] = [−1 −1  0] → [1  1  0] ⇒ v2 = [ 1]
                 [−1 −1  0]   [0  0  0]        [−1] .

Then by Theorem 4.40,

  x = C1 e^t [1] + C2 e^{3t} [ 1]
             [1]             [−1] .

Substitute t = 0 to determine the constants:

  x(0) = [1] = C1 e^0 [1] + C2 e^0 [ 1] = [C1 + C2]
         [1]          [1]          [−1]   [C1 − C2] .

These two equations give C1 = 1 and C2 = 0, so that the solution is

  x = e^t [1] + 0 · e^{3t} [ 1] ,   or   x(t) = e^t
          [1]              [−1]          y(t) = e^t.
61. The coefficient matrix has characteristic polynomial

  det [1 − λ     1   ] = (1 − λ)(−1 − λ) − 1 = λ^2 − 2 = (λ − √2)(λ + √2),
      [  1    −1 − λ ]

so that the eigenvalues are λ1 = √2 and λ2 = −√2. The corresponding eigenvectors are found by
row-reduction:

  [A − √2 I | 0] = [1 − √2      1     0] → [1  −1 − √2  0] ⇒ v1 = [1 + √2]
                   [  1     −1 − √2   0]   [0      0    0]        [  1   ]

  [A + √2 I | 0] = [1 + √2      1     0] → [1  −1 + √2  0] ⇒ v2 = [1 − √2]
                   [  1     −1 + √2   0]   [0      0    0]        [  1   ] .

Then by Theorem 4.40,

  x = C1 e^{√2 t} [1 + √2] + C2 e^{−√2 t} [1 − √2]
                  [  1   ]                [  1   ] .

Substitute t = 0 to determine the constants:

  x(0) = [1] = C1 e^0 [1 + √2] + C2 e^0 [1 − √2] = [C1(1 + √2) + C2(1 − √2)]
         [0]          [  1   ]          [  1   ]   [        C1 + C2        ] .
The second equation gives C1 + C2 = 0; substituting into the first equation gives (C1 − C2)√2 = 1, so
that C1 − C2 = √2/2. It follows that C1 = √2/4 and C2 = −√2/4, so that the solution is

  x = (√2/4) e^{√2 t} [1 + √2] − (√2/4) e^{−√2 t} [1 − √2] ,
                      [  1   ]                    [  1   ]

or

  x1(t) = ((2 + √2)/4) e^{√2 t} + ((2 − √2)/4) e^{−√2 t}
  x2(t) = (√2/4) e^{√2 t} − (√2/4) e^{−√2 t}.
62. The coefficient matrix has characteristic polynomial

  det [1 − λ    −1  ] = (1 − λ)^2 + 1 = λ^2 − 2λ + 2,
      [  1    1 − λ ]

so that the eigenvalues are λ1 = 1 + i and λ2 = 1 − i. The corresponding eigenvectors are found by
row-reduction:

  [A − (1 + i)I | 0] = [−i −1  0] → [1 −i  0] ⇒ v1 = [i]
                       [ 1 −i  0]   [0  0  0]        [1]

  [A − (1 − i)I | 0] = [i −1  0] → [1  i  0] ⇒ v2 = [−i]
                       [1  i  0]   [0  0  0]        [ 1] .

Then by Theorem 4.40,

  y = C1 e^{(1+i)t} [i] + C2 e^{(1−i)t} [−i]
                    [1]                 [ 1] .

Substitute t = 0 to determine the constants:

  y(0) = [1] = C1 e^0 [i] + C2 e^0 [−i] = [(C1 − C2)i]
         [1]          [1]          [ 1]   [ C1 + C2  ] .

Solving gives C1 = (1 − i)/2 and C2 = (1 + i)/2, so that the solution is

  y = ((1 − i)/2) e^{(1+i)t} [i] + ((1 + i)/2) e^{(1−i)t} [−i]
                             [1]                          [ 1] ,

or

  y1(t) = ((1 − i)/2) e^{(1+i)t} · i + ((1 + i)/2) e^{(1−i)t} · (−i) = ((1 + i)/2) e^{(1+i)t} + ((1 − i)/2) e^{(1−i)t}
  y2(t) = ((1 − i)/2) e^{(1+i)t} + ((1 + i)/2) e^{(1−i)t}.

Now substitute cos t + i sin t for e^{it}, giving

  y1(t) = ((1 + i)/2) e^t (cos t + i sin t) + ((1 − i)/2) e^t (cos t − i sin t)
        = e^t cos t + ((i − 1)/2 + (−i − 1)/2) e^t sin t
        = e^t (cos t − sin t)

  y2(t) = ((1 − i)/2) e^t (cos t + i sin t) + ((1 + i)/2) e^t (cos t − i sin t)
        = e^t cos t + ((1 + i)/2 + (1 − i)/2) e^t sin t
        = e^t (cos t + sin t).
63. The coefficient matrix has characteristic polynomial

  det [−λ   1  −1] = −λ^3 + λ = −λ(1 − λ)(1 + λ),
      [ 1  −λ   1]
      [ 1   1  −λ]

so that the eigenvalues are λ1 = 0, λ2 = 1, and λ3 = −1. The corresponding eigenvectors are found by
row-reduction:

  [A − 0I | 0] = [0  1 −1  0] → [1 0  1  0] ⇒ v1 = [−1]
                 [1  0  1  0]   [0 1 −1  0]        [ 1]
                 [1  1  0  0]   [0 0  0  0]        [ 1]

  [A − I | 0] = [−1  1 −1  0] → [1 0  0  0] ⇒ v2 = [0]
                [ 1 −1  1  0]   [0 1 −1  0]        [1]
                [ 1  1 −1  0]   [0 0  0  0]        [1]

  [A + I | 0] = [1 1 −1  0] → [1 1 0  0] ⇒ v3 = [−1]
                [1 1  1  0]   [0 0 1  0]        [ 1]
                [1 1  1  0]   [0 0 0  0]        [ 0] .

Then by Theorem 4.40,

  x = C1 e^{0t} [−1] + C2 e^t [0] + C3 e^{−t} [−1]
                [ 1]          [1]             [ 1]
                [ 1]          [1]             [ 0] .

Substitute t = 0 to determine the constants:

  x(0) = [ 1] = C1 e^0 [−1] + C2 e^0 [0] + C3 e^0 [−1] = [−C1 − C3    ]
         [ 0]          [ 1]          [1]          [ 1]   [C1 + C2 + C3]
         [−1]          [ 1]          [1]          [ 0]   [C1 + C2     ] .

Solving gives C1 = −2, C2 = 1, C3 = 1, so that the solution is

  x = −2 [−1] + e^t [0] + e^{−t} [−1]
         [ 1]       [1]          [ 1]
         [ 1]       [1]          [ 0]

or

  x(t) = 2 − e^{−t}
  y(t) = −2 + e^t + e^{−t}
  z(t) = −2 + e^t.
64. The coefficient matrix has characteristic polynomial

  det [1 − λ     0      3  ] = −(λ − 4)(λ + 2)^2,
      [  1    −2 − λ    1  ]
      [  3       0    1 − λ]

so that the eigenvalues are λ1 = 4 and λ2 = −2 (with algebraic multiplicity 2). The corresponding
eigenvectors are found by row-reduction:

  [A − 4I | 0] = [−3  0  3  0] → [1 0 −1    0] ⇒ v1 = [3]
                 [ 1 −6  1  0]   [0 1 −1/3  0]        [1]
                 [ 3  0 −3  0]   [0 0  0    0]        [3]

  [A + 2I | 0] = [3 0 3  0] → [1 0 1  0] ⇒ v2 = [0] ,  v3 = [−1]
                 [1 0 1  0]   [0 0 0  0]        [1]         [ 0]
                 [3 0 3  0]   [0 0 0  0]        [0]         [ 1] .
Then by Theorem 4.40,

  x = C1 e^{4t} [3] + C2 e^{−2t} [0] + C3 e^{−2t} [−1]
                [1]              [1]              [ 0]
                [3]              [0]              [ 1] .

Substitute t = 0 to determine the constants:

  x(0) = [2] = C1 e^0 [3] + C2 e^0 [0] + C3 e^0 [−1] = [3C1 − C3]
         [3]          [1]          [1]          [ 0]   [C1 + C2 ]
         [4]          [3]          [0]          [ 1]   [3C1 + C3] .

Solving gives C1 = 1, C2 = 2, C3 = 1, so that the solution is

  x = e^{4t} [3] + 2e^{−2t} [0] + e^{−2t} [−1]
             [1]            [1]           [ 0]
             [3]            [0]           [ 1]

or

  x(t) = 3e^{4t} − e^{−2t}
  y(t) = e^{4t} + 2e^{−2t}
  z(t) = 3e^{4t} + e^{−2t}.
65. (a) The coefficient matrix has characteristic polynomial

  det [1.2 − λ   −0.2  ] = (1.2 − λ)(1.5 − λ) − 0.04 = λ^2 − 2.7λ + 1.76 = (λ − 1.1)(λ − 1.6),
      [ −0.2    1.5 − λ]

so that the eigenvalues are λ1 = 1.1 and λ2 = 1.6. The corresponding eigenvectors are found by
row-reduction:

  [A − 1.1I | 0] = [ 0.1 −0.2  0] → [1 −2  0] ⇒ v1 = [2]
                   [−0.2  0.4  0]   [0  0  0]        [1]

  [A − 1.6I | 0] = [−0.4 −0.2  0] → [1  1/2  0] ⇒ v2 = [−1]
                   [−0.2 −0.1  0]   [0   0   0]        [ 2] .

Then by Theorem 4.40,

  x = C1 e^{1.1t} [2] + C2 e^{1.6t} [−1]
                  [1]               [ 2] .

Substitute t = 0 to determine the constants:

  x(0) = [400] = C1 e^0 [2] + C2 e^0 [−1] = [2C1 − C2]
         [500]          [1]          [ 2]   [C1 + 2C2] .

Solving gives C1 = 260, C2 = 120, so that the solution is

  x = 260e^{1.1t} [2] + 120e^{1.6t} [−1]
                  [1]               [ 2]

or

  x(t) = 520e^{1.1t} − 120e^{1.6t}
  y(t) = 260e^{1.1t} + 240e^{1.6t}.
A plot of the populations over time is below [figure omitted: Population vs. Time (days), with curves for X and Y].
From the graph, the population of Y dies out after about three days.
(b) Substituting a and b for 400 and 500 in the initial value computation above gives

  x(0) = [a] = C1 e^0 [2] + C2 e^0 [−1] = [2C1 − C2]
         [b]          [1]          [ 2]   [C1 + 2C2] .

Solving gives C1 = (1/5)(2a + b), C2 = (1/5)(−a + 2b), so that the solution is

  x = (1/5)(2a + b)e^{1.1t} [2] + (1/5)(−a + 2b)e^{1.6t} [−1]
                            [1]                          [ 2]

or

  x(t) = (2/5)(2a + b)e^{1.1t} − (1/5)(−a + 2b)e^{1.6t}
  y(t) = (1/5)(2a + b)e^{1.1t} + (2/5)(−a + 2b)e^{1.6t}.

In the long run, the e^{1.6t} term will dominate, so if a > 2b, then −a + 2b < 0, so the coefficient
of e^{1.6t} in x(t) is positive and in y(t) is negative, so that Y will die out and X will dominate.
If a < 2b, then −a + 2b > 0, so the reverse will happen: X will die out and Y will dominate.
Finally, if a = 2b, then the e^{1.6t} term disappears in the solutions, and both populations will grow
at a rate proportional to e^{1.1t}, though one will grow twice as fast as the other.
66. The coefficient matrix has characteristic polynomial

  det [−0.8 − λ    0.4   ] = (−0.8 − λ)(−0.2 − λ) − 0.16 = λ^2 + λ = λ(λ + 1),
      [  0.4    −0.2 − λ ]

so that the eigenvalues are λ1 = 0 and λ2 = −1. The corresponding eigenvectors are found by
row-reduction:

  [A − 0I | 0] = [−0.8  0.4  0] → [1 −1/2  0] ⇒ v1 = [1]
                 [ 0.4 −0.2  0]   [0   0   0]        [2]

  [A + I | 0] = [0.2  0.4  0] → [1 2  0] ⇒ v2 = [−2]
                [0.4  0.8  0]   [0 0  0]        [ 1] .

Then by Theorem 4.40,

  x = C1 e^{0t} [1] + C2 e^{−t} [−2]
                [2]             [ 1] .

Substitute t = 0 to determine the constants:

  x(0) = [15] = C1 e^0 [1] + C2 e^0 [−2] = [C1 − 2C2]
         [10]          [2]          [ 1]   [2C1 + C2] .

Solving gives C1 = 7, C2 = −4, so that the solution is

  x = 7 [1] − 4e^{−t} [−2]
        [2]           [ 1]

or

  x(t) = 7 + 8e^{−t}
  y(t) = 14 − 4e^{−t}.

As t → ∞, the exponential terms go to zero, and the populations stabilize at 7 of X and 14 of Y.
67. We want to convert x′ = Ax + b into an equation u′ = Au. If we could write b = Ab′, then letting
u = x + b′ would do the trick. To find b′, we invert A:

  b′ = A^{-1}b = [1/2 −1/2] [−30] = [−10]
                 [1/2  1/2] [−10]   [−20] .

Then let u = x + [−10; −20], so that

  u′ = (x + [−10; −20])′ = x′ = Ax + b = Ax + AA^{-1}b = A(x + A^{-1}b) = Au.

So now we want to solve this equation. A has characteristic polynomial

  det [1 − λ    1  ] = (1 − λ)^2 + 1 = λ^2 − 2λ + 2,
      [ −1   1 − λ ]

so the eigenvalues are 1 + i and 1 − i. Eigenvectors are found by row-reducing:

  [A − (1 + i)I | 0] = [−i  1  0] → [1  i  0] ⇒ v1 = [1]
                       [−1 −i  0]   [0  0  0]        [i]

  [A − (1 − i)I | 0] = [ i  1  0] → [1 −i  0] ⇒ v2 = [i]
                       [−1  i  0]   [0  0  0]        [1] .

Then by Theorem 4.40,

  u = C1 e^{(1+i)t} [1] + C2 e^{(1−i)t} [i]
                    [i]                 [1] .

Substitute t = 0 to determine the constants:

  u(0) = x(0) + [−10] = [10] = C1 e^0 [1] + C2 e^0 [i] = [C1 + iC2]
                [−20]   [10]          [i]          [1]   [iC1 + C2] .

Solving gives C1 = C2 = 5 − 5i, so that the solution is

  u = (5 − 5i)e^{(1+i)t} [1] + (5 − 5i)e^{(1−i)t} [i]
                         [i]                      [1]

    = (5 − 5i)e^t (cos t + i sin t) [1] + (5 − 5i)e^t (cos t − i sin t) [i]
                                    [i]                                [1]

    = e^t [(5 − 5i) cos t + (5 + 5i) sin t + (5 + 5i) cos t + (5 − 5i) sin t]
          [(5 + 5i) cos t + (−5 + 5i) sin t + (5 − 5i) cos t + (−5 − 5i) sin t]

    = e^t [10 cos t + 10 sin t]
          [10 cos t − 10 sin t] .

Rewriting in terms of the original variables gives

  x = e^t [10 cos t + 10 sin t] − [−10] = e^t [10 cos t + 10 sin t] + [10]
          [10 cos t − 10 sin t]   [−20]       [10 cos t − 10 sin t]   [20] .

A plot of the populations shows both populations dying out; species Y after about 1.2 time units and
species X after about 2.4 [figure omitted: Population vs. Time (days), with curves for X and Y].
68. We want to convert x′ = Ax + b into an equation u′ = Au. If we could write b = Ab′, then letting
u = x + b′ would do the trick. To find b′, we invert A:

  b′ = A^{-1}b = [−1/2 −1/2] [ 0] = [−20]
                 [ 1/2 −1/2] [40]   [−20] .

Then let u = x + [−20; −20], so that

  u′ = (x + [−20; −20])′ = x′ = Ax + b = Ax + AA^{-1}b = A(x + A^{-1}b) = Au.

So now we want to solve this equation. A has characteristic polynomial

  det [−1 − λ     1   ] = (−1 − λ)^2 + 1 = λ^2 + 2λ + 2,
      [  −1    −1 − λ ]

so the eigenvalues are −1 + i and −1 − i. Eigenvectors are found by row-reducing:

  [A − (−1 + i)I | 0] = [−i  1  0] → [1  i  0] ⇒ v1 = [1]
                        [−1 −i  0]   [0  0  0]        [i]

  [A − (−1 − i)I | 0] = [ i  1  0] → [1 −i  0] ⇒ v2 = [i]
                        [−1  i  0]   [0  0  0]        [1] .

Then by Theorem 4.40,

  u = C1 e^{(−1+i)t} [1] + C2 e^{(−1−i)t} [i]
                     [i]                  [1] .

Substitute t = 0 to determine the constants:

  u(0) = x(0) + [−20] = [−10] = C1 e^0 [1] + C2 e^0 [i] = [C1 + iC2]
                [−20]   [ 10]          [i]          [1]   [iC1 + C2] .

Solving gives C1 = −5 − 5i and C2 = 5 + 5i, so that the solution is

  u = (−5 − 5i)e^{(−1+i)t} [1] + (5 + 5i)e^{(−1−i)t} [i]
                           [i]                       [1]

    = (−5 − 5i)e^{−t} (cos t + i sin t) [1] + (5 + 5i)e^{−t} (cos t − i sin t) [i]
                                        [i]                                   [1]

    = e^{−t} [−10 cos t + 10 sin t]
             [ 10 cos t + 10 sin t] .

Rewriting in terms of the original variables gives

  x = e^{−t} [−10 cos t + 10 sin t] − [−20] = e^{−t} [−10 cos t + 10 sin t] + [20]
             [ 10 cos t + 10 sin t]   [−20]          [ 10 cos t + 10 sin t]   [20] .

A plot of the populations shows both populations stabilizing at 20 as the e^{−t} term goes to zero
[figure omitted: Population vs. Time (days), with curves for X and Y].
69. (a) Making the change of variables y = x′ and z = x, the original equation x″ + ax′ + bx = 0 becomes
y′ + ay + bz = 0; since z = x, we have z′ = x′ = y. So we get the system

  y′ = −ay − bz
  z′ = y.

(b) The coefficient matrix of the system in (a) is

  [−a −b]
  [ 1  0] ,

whose characteristic polynomial is (−a − λ)(−λ) + b = λ^2 + aλ + b.
70. Make the substitutions y1 = x^{(n−1)}, y2 = x^{(n−2)}, ..., yn−1 = x′, yn = x. Then the original equation

  x^{(n)} + a_{n−1}x^{(n−1)} + · · · + a1x′ + a0x = 0

can be written as

  y1′ + a_{n−1}y1 + a_{n−2}y2 + · · · + a2 y_{n−2} + a1 y_{n−1} + a0 yn = 0,

and for i = 2, 3, ..., n, we have yi′ = y_{i−1}. So we get the n equations

  y1′ = −a_{n−1}y1 − a_{n−2}y2 − · · · − a2 y_{n−2} − a1 y_{n−1} − a0 yn
  y2′ = y1
  y3′ = y2
  ...
  yn′ = y_{n−1},

whose coefficient matrix is

  [−a_{n−1} −a_{n−2} · · · −a1 −a0]
  [   1        0     · · ·   0   0]
  [   0        1     · · ·   0   0]
  [  ...      ...     ...   ... ...]
  [   0        0     · · ·   1   0] .

This is the companion matrix of p(λ).
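The companion-matrix construction of Exercises 69-70 is easy to verify in the 2 × 2 case (sample coefficients below are our own choice): the eigenvalues of [[−a, −b], [1, 0]] are exactly the roots of λ² + aλ + b, with eigenvector [λ; 1].

```python
# Companion matrix check for p(l) = l^2 + a*l + b.
# Sample: a = -5, b = 6, so p(l) = l^2 - 5l + 6 with roots 2 and 3.
import math

a, b = -5, 6
A = [[-a, -b], [1, 0]]

disc = math.sqrt(a * a - 4 * b)
roots = sorted([(-a - disc) / 2, (-a + disc) / 2])

# Verify A v = lambda v for the eigenvector v = [lambda, 1].
checks = []
for lam in roots:
    v = [lam, 1]
    Av = [A[0][0] * v[0] + A[0][1] * v[1],
          A[1][0] * v[0] + A[1][1] * v[1]]
    checks.append(abs(Av[0] - lam * v[0]) + abs(Av[1] - lam * v[1]))
```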
71. From Exercise 69, we know that the characteristic polynomial is λ2 − 5λ + 6, so the eigenvalues are 2
and 3. Therefore the general solution is x(t) = C1 e2t + C2 e3t .
72. From Exercise 69, we know that the characteristic polynomial is λ2 + 4λ + 3, so the eigenvalues are −1
and −3. Therefore the general solution is x(t) = C1 e−t + C2 e−3t .
73. From Exercise 59, the matrix of the system has eigenvalues 4 and −1 with corresponding eigenvectors
[1; 1] and [3; −2]. So by Theorem 4.41, the system x′ = Ax has solution x = e^{At}x(0). In this case
x(0) = [0; 5], and we get

  e^{At} = P e^{Dt} P^{-1}
         = [1  3] [e^{4t}    0  ] [1  3]^{-1}
           [1 −2] [  0    e^{−t}] [1 −2]
         = [(2/5)e^{4t} + (3/5)e^{−t}   (3/5)e^{4t} − (3/5)e^{−t}]
           [(2/5)e^{4t} − (2/5)e^{−t}   (3/5)e^{4t} + (2/5)e^{−t}] .

Thus the solution is

  [(2/5)e^{4t} + (3/5)e^{−t}   (3/5)e^{4t} − (3/5)e^{−t}] [0] = [3e^{4t} − 3e^{−t}]
  [(2/5)e^{4t} − (2/5)e^{−t}   (3/5)e^{4t} + (2/5)e^{−t}] [5]   [3e^{4t} + 2e^{−t}] ,

which matches the result from Exercise 59.
74. From Exercise 60, the matrix of the system has eigenvalues 1 and 3 with corresponding eigenvectors
[1; 1] and [1; −1]. So by Theorem 4.41, the system x′ = Ax has solution x = e^{At}x(0). In this case
x(0) = [1; 1], and we get

  e^{At} = P e^{Dt} P^{-1}
         = [1  1] [e^t    0   ] [1  1]^{-1}
           [1 −1] [ 0   e^{3t}] [1 −1]
         = [(1/2)e^t + (1/2)e^{3t}   (1/2)e^t − (1/2)e^{3t}]
           [(1/2)e^t − (1/2)e^{3t}   (1/2)e^t + (1/2)e^{3t}] .

Thus the solution is

  [(1/2)e^t + (1/2)e^{3t}   (1/2)e^t − (1/2)e^{3t}] [1] = [e^t]
  [(1/2)e^t − (1/2)e^{3t}   (1/2)e^t + (1/2)e^{3t}] [1]   [e^t] ,

which matches the result from Exercise 60.
75. From Exercise 63, the matrix of the system has eigenvalues 0, 1, and −1 with corresponding eigenvectors
[−1; 1; 1], [0; 1; 1], and [−1; 1; 0]. So by Theorem 4.41, the system x′ = Ax has solution x = e^{At}x(0). In this
case x(0) = [1; 0; −1], and we get

  e^{At} = P e^{Dt} P^{-1}
         = [−1 0 −1] [1  0     0   ] [−1 0 −1]^{-1}
           [ 1 1  1] [0 e^t    0   ] [ 1 1  1]
           [ 1 1  0] [0  0   e^{−t}] [ 1 1  0]
         = [    1         1 − e^{−t}        −1 + e^{−t}]
           [−1 + e^t   −1 + e^t + e^{−t}    1 − e^{−t} ]
           [−1 + e^t      −1 + e^t             1       ] .

Thus the solution is

  [    1         1 − e^{−t}        −1 + e^{−t}] [ 1]   [2 − e^{−t}       ]
  [−1 + e^t   −1 + e^t + e^{−t}    1 − e^{−t} ] [ 0] = [−2 + e^t + e^{−t}]
  [−1 + e^t      −1 + e^t             1       ] [−1]   [−2 + e^t         ] ,

which matches the result from Exercise 63.
76. From Exercise 64, the matrix of the system
has
 eigenvalues
  4 and −2 (with algebraic multiplicity 2)
0
−1 
3

with corresponding eigenvectors 1 and 1 ,  0 . So by Theorem 4.41, the system x0 = Ax


3
0 
1
2
has solution x = eAt x(0). In this case x(0) = 3, and we get
4

3

AT
t D −1
e
= P (e ) P = 1
3

0
1
0

−1 e4t

0  0
1
0
1 4t
1 −2t
2e + 2e
 1 4t 1 −2t
= 6e − 6e
1 4t
1 −2t
2e − 2e
0
e−2t
0
0
e−2t
0

0
3

0  1
−2t
e
3
0
1
0

1 4t
1 −2t
2e − 2e
1 4t
1 −2t 
.
6e − 6e
1 4t
1 −2t
e
+
e
2
2
−1
−1

0
1
Thus the solution is

  [(1/2)e^{4t} + (1/2)e^{−2t}      0      (1/2)e^{4t} − (1/2)e^{−2t}] [2]   [3e^{4t} − e^{−2t}]
  [(1/6)e^{4t} − (1/6)e^{−2t}   e^{−2t}   (1/6)e^{4t} − (1/6)e^{−2t}] [3] = [e^{4t} + 2e^{−2t}]
  [(1/2)e^{4t} − (1/2)e^{−2t}      0      (1/2)e^{4t} + (1/2)e^{−2t}] [4]   [3e^{4t} + e^{−2t}] ,

which matches the result from Exercise 64.
77. (a) With A = [2 1; 0 3], we have

  x1 = A [1] = [3] ,  x2 = Ax1 = [9] ,  x3 = Ax2 = [27]
         [1]   [3]               [9]               [27] .

[Figure omitted: the points (1, 1), (3, 3), (9, 9), (27, 27) moving away from the origin along the line y = x.]

(b) With A = [2 1; 0 3], we have

  x1 = A [1] = [2] ,  x2 = Ax1 = [4] ,  x3 = Ax2 = [8]
         [0]   [0]               [0]               [0] .

[Figure omitted: the points (1, 0), (2, 0), (4, 0), (8, 0) moving away from the origin along the x-axis.]

(c) Since the matrix is upper triangular, we see that its eigenvalues are 2 and 3; since these are both
greater than 1 in magnitude, the origin is a repeller.

(d) Some other trajectories are [figure omitted].
78. (a) With A = [0.5 −0.5; 0 0.5], we have

  x1 = A [1] = [ 0 ] ,  x2 = Ax1 = [−0.25] ,  x3 = Ax2 = [−0.25 ]
         [1]   [0.5]               [ 0.25]               [0.125] .

[Figure omitted: the trajectory of x0 = (1, 1) approaching the origin.]

(b) With A = [0.5 −0.5; 0 0.5], we have

  x1 = A [1] = [0.5] ,  x2 = Ax1 = [0.25] ,  x3 = Ax2 = [0.125]
         [0]   [ 0 ]               [ 0  ]               [  0  ] .

[Figure omitted: the trajectory of x0 = (1, 0) approaching the origin along the x-axis.]

(c) Since the matrix is upper triangular, we see that its only eigenvalue is 0.5; since this is less than
1 in magnitude, the origin is an attractor.

(d) Some other trajectories are [figure omitted].
79. (a) With A = [2 −1; −1 2], we have
x1 = A[1; 1] = [1; 1],
so that [1; 1] is a fixed point.
(b) With A = [2 −1; −1 2], we have
x1 = A[1; 0] = [2; −1], x2 = Ax1 = [5; −4], x3 = Ax2 = [14; −13].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |2 − λ  −1; −1  2 − λ| = λ² − 4λ + 3 = (λ − 3)(λ − 1).
Thus one eigenvalue is greater than 1 in magnitude and the other is equal to 1. So the origin is
not a repeller, an attractor, or a saddle point; there are no nonzero values of x0 for which the
trajectory of x0 approaches the origin.
(d) Some other trajectories are shown in the accompanying plots.
80. (a) With A = [−4 2; 1 −3], we have
x1 = A[1; 1] = [−2; −2], x2 = Ax1 = [4; 4], x3 = Ax2 = [−8; −8].
(b) With A = [−4 2; 1 −3], we have
x1 = A[1; 0] = [−4; 1], x2 = Ax1 = [18; −7], x3 = Ax2 = [−86; 39].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |−4 − λ  2; 1  −3 − λ| = λ² + 7λ + 10 = (λ + 5)(λ + 2).
Since both eigenvalues, −5 and −2, are greater than 1 in magnitude, the origin is a repeller.
(d) Some other trajectories are shown in the accompanying plots.
81. (a) With A = [1.5 −1; −1 0], we have
x1 = A[1; 1] = [0.5; −1], x2 = Ax1 = [1.75; −0.5], x3 = Ax2 = [3.125; −1.75].
(b) With A = [1.5 −1; −1 0], we have
x1 = A[1; 0] = [1.5; −1], x2 = Ax1 = [3.25; −1.5], x3 = Ax2 = [6.375; −3.25].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |1.5 − λ  −1; −1  −λ| = λ² − 1.5λ − 1 = (λ − 2)(λ + 0.5).
Since one eigenvalue is greater than 1 in magnitude and the other is less, the origin is a saddle
point. Eigenvectors for λ = −0.5 are attracted to the origin, and other vectors are repelled.
(d) Some other trajectories are shown in the accompanying plots.
82. (a) With A = [0.1 0.9; 0.5 0.5], we have
x1 = A[1; 1] = [1; 1],
so [1; 1] is a fixed point; it is an eigenvector for the eigenvalue λ = 1.
(b) With A = [0.1 0.9; 0.5 0.5], we have
x1 = A[1; 0] = [0.1; 0.5], x2 = Ax1 = [0.46; 0.30], x3 = Ax2 = [0.316; 0.38].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |0.1 − λ  0.9; 0.5  0.5 − λ| = λ² − 0.6λ − 0.4 = (λ − 1)(λ + 0.4).
Since one eigenvalue is less than 1 in magnitude and the other is equal to 1, the origin is neither
an attractor, a repeller, nor a saddle point.
(d) Some other trajectories are shown in the accompanying plots.
83. (a) With A = [0.2 0.4; −0.2 0.8], we have
x1 = A[1; 1] = [0.6; 0.6], x2 = Ax1 = [0.36; 0.36], x3 = Ax2 = [0.216; 0.216].
(b) With A = [0.2 0.4; −0.2 0.8], we have
x1 = A[1; 0] = [0.2; −0.2], x2 = Ax1 = [−0.04; −0.20], x3 = Ax2 = [−0.088; −0.152].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |0.2 − λ  0.4; −0.2  0.8 − λ| = λ² − λ + 0.24 = (λ − 0.4)(λ − 0.6).
Since both eigenvalues are smaller than 1 in magnitude, the origin is an attractor.
(d) Some other trajectories are shown in the accompanying plots.
84. (a) With A = [0 −1.5; 1.2 3.6], we have
x1 = A[1; 1] = [−1.5; 4.8], x2 = Ax1 = [−7.2; 15.48], x3 = Ax2 = [−23.22; 47.088].
(b) With A = [0 −1.5; 1.2 3.6], we have
x1 = A[1; 0] = [0; 1.2], x2 = Ax1 = [−1.8; 4.32], x3 = Ax2 = [−6.48; 13.392].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |−λ  −1.5; 1.2  3.6 − λ| = λ² − 3.6λ + 1.8 = (λ − 3)(λ − 0.6).
Since one eigenvalue is smaller than 1 in magnitude and the other is greater, the origin is a saddle
point; the origin attracts some initial points and repels others.
(d) Some other trajectories are shown in the accompanying plots.
85. With the given matrix A, we have r = √(a² + b²) = √2 and θ = π/4, so that
A = [r 0; 0 r][cos θ  −sin θ; sin θ  cos θ] = [√2 0; 0 √2][1/√2  −1/√2; 1/√2  1/√2].
Since r = |λ| = √2 > 1, the origin is a spiral repeller.
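The factorization used in Exercises 85–88 can be sketched as a small routine (assuming only the standard library; `polar_form` is an illustrative name, not from the text). For A = [a −b; b a], r = √(a² + b²) and θ = atan2(b, a):

```python
import math

def polar_form(a, b):
    """For A = [[a, -b], [b, a]], return (r, theta) with A = r * R_theta."""
    r = math.hypot(a, b)
    theta = math.atan2(b, a) % (2 * math.pi)
    return r, theta

# Exercise 85 as reconstructed above: A = [[1, -1], [1, 1]], so a = b = 1.
r, theta = polar_form(1.0, 1.0)
assert math.isclose(r, math.sqrt(2))
assert math.isclose(theta, math.pi / 4)
# r > 1, so the origin is a spiral repeller.
```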
86. With the given matrix A, we have r = √(a² + b²) = 0.5 and θ = 3π/2, so that
A = [r 0; 0 r][cos θ  −sin θ; sin θ  cos θ] = [0.5 0; 0 0.5][0 1; −1 0].
Since r = |λ| = 0.5 < 1, the origin is a spiral attractor.
87. With the given matrix A, we have r = √(a² + b²) = 2 and θ = 5π/3, so that
A = [r 0; 0 r][cos θ  −sin θ; sin θ  cos θ] = [2 0; 0 2][1/2  √3/2; −√3/2  1/2].
Since r = |λ| = 2 > 1, the origin is a spiral repeller.
88. With the given matrix A, we have r = √(a² + b²) = √((−√3/2)² + (1/2)²) = 1 and θ = 5π/6, so that
A = [r 0; 0 r][cos θ  −sin θ; sin θ  cos θ] = [1 0; 0 1][−√3/2  −1/2; 1/2  −√3/2].
Since r = |λ| = 1, the origin is an orbital center.
89. The characteristic polynomial of A is
det(A − λI) = |0.1 − λ  −0.2; 0.1  0.3 − λ| = λ² − 0.4λ + 0.05;
solving this quadratic gives λ = 0.2 ± 0.1i for the eigenvalues. Row-reducing gives [−1 − i; 1] for an eigenvector. Thus P = [Re x  Im x] = [−1 −1; 1 0], and so
C = P⁻¹AP = [0 1; −1 −1][0.1 −0.2; 0.1 0.3][−1 −1; 1 0] = [0.2 −0.1; 0.1 0.2].
Since |λ| = √0.05 < 1, the origin is a spiral attractor. The first six points are
x0 = [1; 1], x1 = [−0.1; 0.4], x2 = [−0.09; 0.11], x3 = [−0.031; 0.024], x4 = [−0.0079; 0.0041], x5 = [−0.00161; 0.00044].
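The construction of the real canonical form C = [a −b; b a] used in Exercises 89–92 can be automated (a sketch assuming NumPy; `real_canonical_form` is an illustrative helper name). Taking λ = a − bi with b > 0 and P = [Re x  Im x] gives C = P⁻¹AP:

```python
import numpy as np

def real_canonical_form(A):
    """For a real 2x2 matrix with eigenvalues a +/- bi (b != 0), return
    (P, C) where P = [Re x  Im x] for an eigenvector x of lam = a - bi,
    and C = P^{-1} A P = [[a, -b], [b, a]]."""
    vals, vecs = np.linalg.eig(A)
    i = int(np.argmin(vals.imag))      # eigenvalue with negative imaginary part
    x = vecs[:, i]
    P = np.column_stack([x.real, x.imag])
    return P, np.linalg.inv(P) @ A @ P

A = np.array([[0.1, -0.2],
              [0.1,  0.3]])
P, C = real_canonical_form(A)
assert np.allclose(C, [[0.2, -0.1], [0.1, 0.2]])   # matches Exercise 89
# |lam| = sqrt(0.05) < 1, so the origin is a spiral attractor.
```

Note that NumPy scales eigenvectors differently from the hand computation, but any eigenvector of λ = a − bi yields the same C.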
90. The characteristic polynomial of A is
det(A − λI) = |2 − λ  1; −2  −λ| = λ² − 2λ + 2;
solving this quadratic gives λ = 1 ± i for the eigenvalues. Row-reducing gives [−1 − i; 2] for an eigenvector. Thus P = [Re x  Im x] = [−1 −1; 2 0], and so
C = P⁻¹AP = [0  1/2; −1  −1/2][2 1; −2 0][−1 −1; 2 0] = [1 1; −1 1].
Since |λ| = √2 > 1, the origin is a spiral repeller. The first six points are
x0 = [1; 1], x1 = [3; −2], x2 = [4; −6], x3 = [2; −8], x4 = [−4; −4], x5 = [−12; 8].
91. The characteristic polynomial of A is
det(A − λI) = |1 − λ  −1; 1  −λ| = λ² − λ + 1;
solving this quadratic gives λ = 1/2 ± (√3/2)i for the eigenvalues. Row-reducing gives [1/2 − (√3/2)i; 1] for an eigenvector. Thus
P = [Re x  Im x] = [1/2  −√3/2; 1  0],
and so
C = P⁻¹AP = [0  1; −2/√3  1/√3][1 −1; 1 0][1/2  −√3/2; 1  0] = [1/2  −√3/2; √3/2  1/2].
Since |λ| = 1, the origin is an orbital center. The first six points are
x0 = [1; 1], x1 = [0; 1], x2 = [−1; 0], x3 = [−1; −1], x4 = [0; −1], x5 = [1; 0].
92. The characteristic polynomial of A is
det(A − λI) = |−λ  −1; 1  √3 − λ| = λ² − √3 λ + 1;
solving this quadratic gives λ = √3/2 ± (1/2)i for the eigenvalues. Row-reducing gives [−√3/2 − (1/2)i; 1] for an eigenvector. Thus
P = [Re x  Im x] = [−√3/2  −1/2; 1  0],
and so
C = P⁻¹AP = [0  1; −2  −√3][0 −1; 1 √3][−√3/2  −1/2; 1  0] = [√3/2  −1/2; 1/2  √3/2].
Since |λ| = 1, the origin is an orbital center. The first six points are
x0 = [1; 1], x1 = [−1; 1 + √3], x2 = [−1 − √3; 2 + √3], x3 = [−2 − √3; 2 + √3], x4 = [−2 − √3; 1 + √3], x5 = [−1 − √3; 1].
Chapter Review
1. (a) False. What is true is that if A is an n × n matrix, then det(−A) = (−1)n det A. This is because
multiplying any row by −1 multiplies the determinant by −1, and −A multiplies every row by
−1. See Theorem 4.7.
(b) True, since det(AB) = det(A) det(B) = det(B) det(A) = det(BA). See Theorem 4.8.
(c) False. It depends on what order the columns are in. Swapping any two columns of A multiplies
the determinant by −1. See Theorem 4.4.
(d) False. While det Aᵀ = det A by Theorem 4.10, det A⁻¹ = 1/det A by Theorem 4.9. So unless det A = ±1, the statement is false.
(e) False. For example, the matrix [0 1; 0 0] has characteristic polynomial λ², so 0 is its only eigenvalue.
(f ) False. For example, the identity matrix has only λ = 1 as an eigenvalue, but ei is an eigenvector
for every i.
(g) True. See Theorem 4.25 in Section 4.4.
(h) False. Any diagonal matrix with two identical entries on the diagonal (such as the identity matrix)
provides a counterexample.
(i) False. If A and B are similar, then they have the same eigenvalues, but not necessarily the same
eigenvectors. Exercise 48 in Section 4.4 shows that if B = P −1 AP and v is an eigenvector of A,
then P −1 v is an eigenvector for B for the same eigenvalue.
(j) False. For example, if A is invertible, then its reduced row echelon form is the identity matrix,
which has only the eigenvalue 1. So if A has an eigenvalue other than 1, it cannot be similar to
the identity matrix.
2. (a) For example, expand along the first row:
|1 3 5; 3 5 7; 7 9 11| = 1·|5 7; 9 11| − 3·|3 7; 7 11| + 5·|3 5; 7 9|
= 1(5 · 11 − 7 · 9) − 3(3 · 11 − 7 · 7) + 5(3 · 9 − 5 · 7)
= 0.
(b) Apply row operations to reduce the matrix to triangular form:
[1 3 5; 3 5 7; 7 9 11] −(R2 − 3R1, R3 − 7R1)→ [1 3 5; 0 −4 −8; 0 −12 −24] −(R3 − 3R2)→ [1 3 5; 0 −4 −8; 0 0 0].
Then det A is the product of the diagonal entries, which is zero.
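Both computations in parts (a) and (b) can be mirrored in code (a sketch assuming NumPy; `det_cofactor` is an illustrative name for the Laplace-expansion method of part (a)):

```python
import numpy as np

def det_cofactor(M):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)
        total += (-1) ** j * M[0, j] * det_cofactor(minor)
    return total

A = [[1, 3, 5],
     [3, 5, 7],
     [7, 9, 11]]
assert abs(det_cofactor(A)) < 1e-12              # expansion gives 0, as above
assert np.isclose(det_cofactor(A), np.linalg.det(A))
```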
3. To get from the first matrix to the second:
• Multiply the first column by 3; this multiplies the determinant by 3.
• Multiply the second column by 2; this multiplies the determinant by 2.
• Subtract 4 times the third column from the second; this does not change the determinant.
• Interchange the first and second rows; this multiplies the determinant by −1.
So the determinant of the resulting matrix is 3 · 3 · 2 · (−1) = −18.
4. (a) We have
det C = det((AB)⁻¹) = 1/det(AB) = 1/(det A det B) = 1/(2 · (−1/4)) = −2.
(b) We have
det C = det(A²B(3Aᵀ)) = det A · det A · det B · det(3Aᵀ)
= −det(3Aᵀ) = −3⁴ det(Aᵀ) = −3⁴ det A = −162.
5. For any n × n matrix, det A = det(AT ), and det(−A) = (−1)n det A. So if A is skew-symmetric, then
AT = −A and
det A = det(AT ) = det(−A) = (−1)n det A = − det A
since n is odd. But det A = − det A means that det A = 0.
6. Compute the determinant by expanding along the first row:
|1 −1 2; 1 1 k; 2 4 k²| = 1·|1 k; 4 k²| − (−1)·|1 k; 2 k²| + 2·|1 1; 2 4|
= 1(k² − 4k) + 1(k² − 2k) + 2(4 − 2)
= 2k² − 6k + 4 = 2(k − 2)(k − 1).
So the determinant is zero when k = 1 or k = 2.
7. Since
Ax = [3 1; 4 3][1; 2] = [5; 10] = 5[1; 2],
we see that x is indeed an eigenvector, with corresponding eigenvalue 5.
8. Since
Ax = [13 −60 −45; −5 18 15; 10 −40 −32][3; −1; 2] = [3 · 13 − 1 · (−60) + 2 · (−45); 3 · (−5) − 1 · 18 + 2 · 15; 3 · 10 − 1 · (−40) + 2 · (−32)] = [9; −3; 6] = 3[3; −1; 2],
we see that x is indeed an eigenvector, with corresponding eigenvalue 3.
9. (a) The characteristic polynomial of A is
det(A − λI) = |−5 − λ  −6  3; 3  4 − λ  −3; 0  0  −2 − λ| = (−2 − λ)·|−5 − λ  −6; 3  4 − λ|
= −(λ + 2)(λ² + λ − 2) = −(λ + 2)(λ + 2)(λ − 1).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 1 and λ2 = −2
(with algebraic multiplicity 2).
(c) We row-reduce A − λI for each eigenvalue λ:
A − I = [−6 −6 3; 3 3 −3; 0 0 −3] → [1 1 0; 0 0 1; 0 0 0],
A + 2I = [−3 −6 3; 3 6 −3; 0 0 0] → [1 2 −1; 0 0 0; 0 0 0].
Thus the eigenspaces are
E1 = span([−1; 1; 0]),
E−2 = {[−2s + t; s; t]} = span([−2; 1; 0], [1; 0; 1]).
(d) Even though A does not have three distinct eigenvalues, it is diagonalizable since the eigenvalue of algebraic multiplicity 2, which is λ2 = −2, also has geometric multiplicity 2. A diagonalizing matrix is a matrix P whose columns are the eigenvectors above, so
P = [−1 −2 1; 1 1 0; 0 0 1],
and
P⁻¹AP = [1 2 −1; −1 −1 1; 0 0 1][−5 −6 3; 3 4 −3; 0 0 −2][−1 −2 1; 1 1 0; 0 0 1] = [1 0 0; 0 −2 0; 0 0 −2] = D.
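The diagonalization found above is easy to verify numerically (a sketch assuming NumPy is available):

```python
import numpy as np

A = np.array([[-5, -6,  3],
              [ 3,  4, -3],
              [ 0,  0, -2]], dtype=float)

# Eigenvectors found above, one per column of P.
P = np.array([[-1, -2, 1],
              [ 1,  1, 0],
              [ 0,  0, 1]], dtype=float)

D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag([1.0, -2.0, -2.0]))
```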
10. Since A is diagonalizable, A ∼ D where D is a diagonal matrix whose diagonal entries are its eigenvalues. Since A ∼ D, we know that A and D have the same eigenvalues, so the diagonal entries of
D are −2, 3, and 4 and thus det D = −24. But A ∼ D also implies that det D = det(P −1 AP ) =
det(P −1 ) det A det P = det A, so that det A = −24.
11. Since
[3; 7] = 5[1; 1] − 2[1; −1],
we know by Theorem 4.19 that
A⁻⁵[3; 7] = 5(1/2)⁻⁵[1; 1] − 2(−1)⁻⁵[1; −1] = [160; 160] + [2; −2] = [162; 158].
12. Since A is diagonalizable, it has n linearly independent eigenvectors v1, …, vn, which therefore form a basis for Rn. So if x is any vector, then x = Σ_{i=1}^n ci vi for some constants ci, so that by Theorem 4.19, and using the Triangle Inequality,
‖Aⁿx‖ = ‖Aⁿ(Σ_{i=1}^n ci vi)‖ = ‖Σ_{i=1}^n ci λiⁿ vi‖ ≤ Σ_{i=1}^n |ci| |λi|ⁿ ‖vi‖.
Then |λi|ⁿ → 0 as n → ∞ since |λi| < 1, so each term of this sum goes to zero, so that ‖Aⁿx‖ → 0 as n → ∞. Since this holds for all vectors x, it follows that Aⁿ → O as n → ∞.
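The conclusion Aⁿ → O can be illustrated numerically (a sketch assuming NumPy; the matrix below is a hypothetical example built to have all eigenvalue magnitudes below 1, not one from the text):

```python
import numpy as np

# A diagonalizable 3x3 matrix whose eigenvalues all satisfy |lambda| < 1,
# built from an invertible P (hypothetical example).
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
A = P @ np.diag([0.9, -0.5, 0.2]) @ np.linalg.inv(P)

# The powers A^n tend to the zero matrix, as the argument shows.
assert np.allclose(np.linalg.matrix_power(A, 200), np.zeros((3, 3)))
assert not np.allclose(np.linalg.matrix_power(A, 1), np.zeros((3, 3)))
```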
13. The two characteristic polynomials are
pA = |4 − λ  2; 3  1 − λ| = (4 − λ)(1 − λ) − 6 = λ² − 5λ − 2,
pB = |2 − λ  2; 3  2 − λ| = (2 − λ)(2 − λ) − 6 = λ² − 4λ − 2.
Since the characteristic polynomials are different, the matrices are not similar.
14. Since both matrices are diagonal, we can read off their eigenvalues; both matrices have eigenvalues 2 and 3. To convert A to B, we want to reverse the rows and reverse the columns, so choose
P = P⁻¹ = [0 1; 1 0].
Then
P⁻¹AP = [0 1; 1 0][2 0; 0 3][0 1; 1 0] = [3 0; 0 2] = B.
15. These matrices are not similar. If they were, then there would be some 3 × 3 matrix
P = [a b c; d e f; g h i]
such that P⁻¹AP = B, or AP = PB. But
AP = [a + d  b + e  c + f; d + g  e + h  f + i; g  h  i],  PB = [a  a + b  c; d  d + e  f; g  g + h  i].
If these matrices are to be equal, then comparing entries in the last row and column gives f = i = g = 0.
The upper left entries also give d = 0. The two middle entries are e + h and d + e = e, so that h = 0 as
well. Thus the entire bottom row of P is zero, so P cannot be invertible and A and B are not similar.
Note that this is a counterexample to the converse of Theorem 4.22. These two matrices satisfy parts
(a) through (g) of the conclusion, yet they are not similar.
16. The characteristic polynomial of A is
det(A − λI) = |2 − λ  k; 1  −λ| = (2 − λ)(−λ) − k = λ² − 2λ − k.
(a) A has eigenvalues 3 and −1 if λ2 − 2λ − k = (λ − 3)(λ + 1), so when k = 3.
(b) A has an eigenvalue with algebraic multiplicity 2 when its characteristic polynomial has a double
root, which occurs when its discriminant is 0. But its discriminant is (−2)2 − 4 · 1 · (−k) = 4 + 4k,
so we must have k = −1.
(c) A has no real eigenvalues when the discriminant 4 + 4k of its characteristic polynomial is negative,
so for k < −1.
17. If λ is an eigenvalue of A with eigenvector x, then Ax = λx, so that
A3 x = A(A(A(x))) = A(A(λx)) = A(λ2 x) = λ3 x.
If A3 = A, then Ax = λ3 x, so that λ = λ3 and then λ = 0 or λ = ±1.
18. If A has two identical rows, then Theorem 4.3(c) says that det A = 0, so that A is not invertible. But
then by Theorem 4.17, 0 must be an eigenvalue of A.
19. If x is an eigenvector of A with eigenvalue 3, then
(A2 − 5A + 2I)x = A2 x − 5Ax + 2Ix = 9x − 5 · 3x + 2x = −4x.
This shows that x is also an eigenvector of A2 − 5A + 2I, with eigenvalue −4.
20. Suppose that B = P −1 AP where P is invertible. This can be rewritten as BP −1 = P −1 A. Let λ be
an eigenvalue of A with eigenvector x. Then
B(P −1 x) = (BP −1 )x = (P −1 A)x = P −1 (Ax) = P −1 (λx) = λ(P −1 x).
Thus P −1 x is an eigenvector for B, corresponding to λ.
Chapter 5
Orthogonality
5.1 Orthogonality in Rn
1. Since
v1 · v2 = (−3) · 2 + 1 · 4 + 2 · 1 = 0,
v1 · v3 = (−3) · 1 + 1 · (−1) + 2 · 2 = 0,
v2 · v3 = 2 · 1 + 4 · (−1) + 1 · 2 = 0,
this set of vectors is orthogonal.
2. Since
v1 · v2 = 4 · (−1) + 2 · 2 − 5 · 0 = 0,
v1 · v3 = 4 · 2 + 2 · 1 − 5 · 2 = 0,
v2 · v3 = −1 · 2 + 2 · 1 + 0 · 2 = 0,
this set of vectors is orthogonal.
3. Since v1 · v2 = 3 · (−1) + 1 · 2 − 1 · 1 = −2 ≠ 0, this set of vectors is not orthogonal. There is no need
to check the other two dot products; all three must be zero for the set to be orthogonal.
4. Since v1 · v3 = 5 · 3 + 3 · 1 + 1 · (−1) = 17 ≠ 0, this set of vectors is not orthogonal. There is no need
to check the other two dot products; all three must be zero for the set to be orthogonal.
5. Since
v1 · v2 = 2 · (−2) + 3 · 1 − 1 · (−1) + 4 · 0 = 0,
v1 · v3 = 2 · (−4) + 3 · (−6) − 1 · 2 + 4 · 7 = 0,
v2 · v3 = −2 · (−4) + 1 · (−6) − 1 · 2 + 0 · 7 = 0,
this set of vectors is orthogonal.
6. Since v2 · v4 = −1 · 0 + 0 · (−1) + 1 · 1 + 2 · 1 = 3 ≠ 0, this set of vectors is not orthogonal. There is no
need to check the other five dot products (they are all zero); all dot products must be zero for the set
to be orthogonal.
7. Since the vectors are orthogonal:
[4; −2] · [1; 2] = 4 · 1 − 2 · 2 = 0,
they are linearly independent by Theorem 5.1. Since there are two vectors, they form an orthogonal basis for R². By Theorem 5.2, we have
w = ((v1 · w)/(v1 · v1))v1 + ((v2 · w)/(v2 · v2))v2 = ((4 · 1 − 2 · (−3))/(4 · 4 − 2 · (−2)))v1 + ((1 · 1 + 2 · (−3))/(1 · 1 + 2 · 2))v2 = (1/2)v1 − v2,
so that
[w]B = [1/2; −1].
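The Theorem 5.2 coordinate formula used here (and in Exercises 8–10) is mechanical and can be sketched in code (assuming NumPy; `coords_orthogonal` is an illustrative name):

```python
import numpy as np

def coords_orthogonal(basis, w):
    """Coordinates of w with respect to an orthogonal basis (Theorem 5.2):
    c_i = (v_i . w) / (v_i . v_i)."""
    return np.array([np.dot(v, w) / np.dot(v, v) for v in basis])

v1, v2 = np.array([4.0, -2.0]), np.array([1.0, 2.0])
w = np.array([1.0, -3.0])
c = coords_orthogonal([v1, v2], w)
assert np.allclose(c, [0.5, -1.0])            # [w]_B as computed above
assert np.allclose(c[0] * v1 + c[1] * v2, w)  # reconstructs w
```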
8. Since the vectors are orthogonal:
[1; 3] · [−6; 2] = 1 · (−6) + 3 · 2 = 0,
they are linearly independent by Theorem 5.1. Since there are two vectors, they form an orthogonal basis for R². By Theorem 5.2, we have
w = ((v1 · w)/(v1 · v1))v1 + ((v2 · w)/(v2 · v2))v2 = ((1 · 1 + 3 · 1)/(1 · 1 + 3 · 3))v1 + ((−6 · 1 + 2 · 1)/(−6 · (−6) + 2 · 2))v2 = (2/5)v1 − (1/10)v2,
so that
[w]B = [2/5; −1/10].
9. Since the vectors are orthogonal:
[1; 0; −1] · [1; 2; 1] = 1 · 1 + 0 · 2 − 1 · 1 = 0,
[1; 0; −1] · [1; −1; 1] = 1 · 1 + 0 · (−1) − 1 · 1 = 0,
[1; 2; 1] · [1; −1; 1] = 1 · 1 + 2 · (−1) + 1 · 1 = 0,
they are linearly independent by Theorem 5.1. Since there are three vectors, they form an orthogonal basis for R³. By Theorem 5.2, we have
w = ((v1 · w)/(v1 · v1))v1 + ((v2 · w)/(v2 · v2))v2 + ((v3 · w)/(v3 · v3))v3
= ((1 · 1 + 0 · 1 − 1 · 1)/(1 · 1 + 0 · 0 − 1 · (−1)))v1 + ((1 · 1 + 2 · 1 + 1 · 1)/(1 · 1 + 2 · 2 + 1 · 1))v2 + ((1 · 1 − 1 · 1 + 1 · 1)/(1 · 1 − 1 · (−1) + 1 · 1))v3
= (2/3)v2 + (1/3)v3,
so that
[w]B = [0; 2/3; 1/3].
10. Since the vectors are orthogonal:
[1; 1; 1] · [1; −1; 0] = 1 · 1 + 1 · (−1) + 1 · 0 = 0,
[1; 1; 1] · [1; 1; −2] = 1 · 1 + 1 · 1 + 1 · (−2) = 0,
[1; −1; 0] · [1; 1; −2] = 1 · 1 − 1 · 1 + 0 · (−2) = 0,
they are linearly independent by Theorem 5.1. Since there are three vectors, they form an orthogonal basis for R³. By Theorem 5.2, we have
w = ((v1 · w)/(v1 · v1))v1 + ((v2 · w)/(v2 · v2))v2 + ((v3 · w)/(v3 · v3))v3
= ((1 · 1 + 1 · 2 + 1 · 3)/(1 · 1 + 1 · 1 + 1 · 1))v1 + ((1 · 1 − 1 · 2 + 0 · 3)/(1 · 1 − 1 · (−1) + 0 · 0))v2 + ((1 · 1 + 1 · 2 − 2 · 3)/(1 · 1 + 1 · 1 − 2 · (−2)))v3
= 2v1 − (1/2)v2 − (1/2)v3,
so that
[w]B = [2; −1/2; −1/2].
11. Since ‖v1‖ = √((3/5)² + (4/5)²) = 1 and ‖v2‖ = √((−4/5)² + (3/5)²) = 1, the given set of vectors is orthonormal.
12. Since ‖v1‖ = √((1/2)² + (1/2)²) = 1/√2 ≠ 1 and ‖v2‖ = √((1/2)² + (−1/2)²) = 1/√2 ≠ 1, the given set of vectors is not orthonormal. To normalize the vectors, we divide each by its norm to get
{√2[1/2; 1/2], √2[1/2; −1/2]} = {[1/√2; 1/√2], [1/√2; −1/√2]}.
13. Taking norms, we get
‖v1‖ = √((2/3)² + (1/3)² + (2/3)²) = 1,
‖v2‖ = √((2/3)² + (−1/3)² + 0²) = √5/3,
‖v3‖ = √(1² + 2² + (5/2)²) = 3√5/2.
So the first vector is already normalized; we must normalize the other two by dividing by their norms, to get
{[2/3; 1/3; 2/3], [2/√5; −1/√5; 0], [2/(3√5); 4/(3√5); 5/(3√5)]}.
14. Taking norms, we get
‖v1‖ = √((1/2)² + (1/2)² + (−1/2)² + (1/2)²) = 1,
‖v2‖ = √(0² + (1/3)² + (1/3)² + (2/3)²) = √6/3,
‖v3‖ = √((1/2)² + (−1/6)² + (1/6)² + (−1/6)²) = √3/3.
So the first vector is already normalized; we must normalize the other two by dividing by their norms (that is, multiplying v2 by 3/√6 and v3 by √3), which yields an orthonormal set.
15. Taking norms, we get
‖v1‖ = √((1/2)² + (1/2)² + (−1/2)² + (1/2)²) = 1,
‖v2‖ = √(0² + (√6/3)² + (1/√6)² + (−1/√6)²) = 1,
‖v3‖ = √((√3/2)² + (−√3/6)² + (√3/6)² + (−√3/6)²) = 1,
‖v4‖ = √(0² + 0² + (1/√2)² + (1/√2)²) = 1.
So these vectors are already orthonormal.
16. The columns of the matrix are orthogonal:
[0; 1] · [−1; 0] = 0 · (−1) + 1 · 0 = 0.
Further, they are orthonormal since
[0; 1] · [0; 1] = 0 · 0 + 1 · 1 = 1,  [−1; 0] · [−1; 0] = −1 · (−1) + 0 · 0 = 1.
Thus the matrix is an orthogonal matrix, so by Theorem 5.5, the inverse of this matrix is its transpose, or
[0 1; −1 0].
17. The columns of the matrix are orthogonal:
[1/√2; −1/√2] · [1/√2; 1/√2] = (1/√2)(1/√2) − (1/√2)(1/√2) = 0.
Further, they are orthonormal since
[1/√2; −1/√2] · [1/√2; −1/√2] = (1/√2)(1/√2) + (−1/√2)(−1/√2) = 1,
[1/√2; 1/√2] · [1/√2; 1/√2] = (1/√2)(1/√2) + (1/√2)(1/√2) = 1.
Thus the matrix is an orthogonal matrix, so by Theorem 5.5, the inverse of this matrix is its transpose, or
[1/√2  −1/√2; 1/√2  1/√2].
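The checks in Exercises 16–21 follow a single pattern — the columns must be orthonormal, equivalently QᵀQ = I — which can be sketched as follows (assuming NumPy; `is_orthogonal` is an illustrative name):

```python
import numpy as np

def is_orthogonal(Q, tol=1e-12):
    """Q is orthogonal iff its columns are orthonormal, i.e. Q^T Q = I."""
    Q = np.asarray(Q, dtype=float)
    return np.allclose(Q.T @ Q, np.eye(Q.shape[1]), atol=tol)

s = 1 / np.sqrt(2)
Q = np.array([[ s, s],
              [-s, s]])          # the matrix of Exercise 17
assert is_orthogonal(Q)
# For an orthogonal matrix, the inverse is just the transpose (Theorem 5.5).
assert np.allclose(np.linalg.inv(Q), Q.T)
```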
18. This is not an orthogonal matrix, since none of its columns is a unit vector:
q1 · q1 = 1/9 + 1/9 + 1/9 = 1/3,  q2 · q2 = 1/4 + 1/4 + 0 = 1/2,  q3 · q3 = 4/25 + 1/25 + 1/25 = 6/25.
19. This matrix Q is orthogonal by Theorem 5.4: multiplying out QQᵀ and simplifying repeatedly with sin²θ + cos²θ = 1 gives QQᵀ = I. Thus by Theorem 5.5, Q⁻¹ = Qᵀ.
20. The columns are all orthogonal, since each dot product consists of two terms equal to (1/2)(1/2) and two equal to −(1/2)(1/2). The columns are orthonormal since the norm of each column is
√((±1/2)² + (±1/2)² + (±1/2)² + (±1/2)²) = 1.
So the matrix is orthogonal, and by Theorem 5.5 its inverse is equal to its transpose,
[1/2 1/2 1/2 −1/2; −1/2 1/2 1/2 1/2; 1/2 1/2 −1/2 1/2; 1/2 −1/2 1/2 1/2].
21. The dot product of the first and fourth columns is
[1; 0; 0; 0] · [1/√6; 1/√6; −1/√6; 1/√2] = 1 · (1/√6) ≠ 0,
so the columns are not orthonormal and the matrix is not orthogonal.
22. By Theorem 5.5, Q⁻¹ is orthogonal if and only if (Q⁻¹)⁻¹ = (Q⁻¹)ᵀ. But since Q is orthogonal, Q⁻¹ = Qᵀ, so that
(Q⁻¹)ᵀ = (Qᵀ)ᵀ = Q = (Q⁻¹)⁻¹,
and Q⁻¹ is orthogonal.
23. Since det Q = det QT , we have
1 = det(I) = det(QQ−1 ) = det(QQT ) = (det Q)(det QT ) = (det Q)2 .
Since (det Q)2 = 1, we must have det Q = ±1.
24. By Theorem 5.5, it suffices to show that (Q1Q2)⁻¹ = (Q1Q2)ᵀ. But using the fact that Q1⁻¹ = Q1ᵀ and Q2⁻¹ = Q2ᵀ, we have
(Q1Q2)⁻¹ = Q2⁻¹Q1⁻¹ = Q2ᵀQ1ᵀ = (Q1Q2)ᵀ.
Thus Q1Q2 is orthogonal.
25. By Theorem 3.17, if P is a permutation matrix, then P −1 = P T . So by Theorem 5.5, P is orthogonal.
26. Reordering the rows of Q is the same as multiplying it on the left by a permutation matrix P . By the
previous exercise, P is orthogonal, and then by Exercise 24, P Q is also orthogonal. So reordering the
rows of an orthogonal matrix results in an orthogonal matrix.
27. Let θ be the angle between x and y, and φ the angle between Qx and Qy. By Theorem 5.6, ‖Qx‖ = ‖x‖, ‖Qy‖ = ‖y‖, and Qx · Qy = x · y, so that
cos θ = (x · y)/(‖x‖ ‖y‖) = (Qx · Qy)/(‖Qx‖ ‖Qy‖) = cos φ.
Since 0 ≤ θ, φ ≤ π, the fact that their cosines are equal implies that θ = φ.
28. (a) Suppose that Q = [a c; b d] is an orthogonal matrix. Then QQᵀ = QᵀQ = I gives
[a b; c d][a c; b d] = [a² + b²  ac + bd; ac + bd  c² + d²] = [1 0; 0 1].
So a2 + b2 = 1 and c2 + d2 = 1, and thus both columns of Q are unit vectors. Also, ac + bd = 0,
which implies that ac = −bd and therefore a2 c2 = b2 d2 . But c2 = 1 − d2 and b2 = 1 − a2 ;
substituting gives
a²(1 − d²) = (1 − a²)d²  ⇒  a² − a²d² = d² − a²d²  ⇒  a² = d²  ⇒  a = ±d.
If a = d = 0, then b = c = ±1, so we get one of the matrices
[0 1; 1 0], [0 −1; 1 0], [0 1; −1 0], [0 −1; −1 0],
all of which are of one of the required forms. Otherwise, if a = d ≠ 0, then c² = 1 − d² = 1 − a² = b²,
so that b = ±c and we get
[a ±b; b a],
and we must choose the minus sign in order that the columns be orthogonal. So the result is
[a −b; b a],
which is also of the required form. Finally, if a = −d ≠ 0, then again b² = c² and we get
[a ±b; b −a];
this time we must choose the plus sign for the columns to be orthogonal, resulting in
[a b; b −a],
which is again of the required form.
(b) Since [a; b] is a unit vector, we have a² + b² = 1, so that (a, b) is a point on the unit circle. Therefore there is some θ with (a, b) = (cos θ, sin θ). Substituting into the forms from part (a) gives the desired matrices.
(c) Let
Q = [cos θ  −sin θ; sin θ  cos θ],  R = [cos θ  sin θ; sin θ  −cos θ].
From Example 3.58 in Section 3.6, Q is the standard matrix of a rotation through angle θ. From Exercise 26(c) in Section 3.6, R is a reflection through the line ℓ that makes an angle of θ/2 with the x-axis, which is the line y = (tan θ/2)x (or x = 0 if θ = π).
(d) From part (c), we have det Q = cos2 θ + sin2 θ = 1 and det R = − cos2 θ − sin2 θ = −1.
29. Since det Q = (1/√2)(1/√2) − (−1/√2)(1/√2) = 1, this matrix represents a rotation. To find the angle, we have cos θ = 1/√2 and sin θ = 1/√2, so that θ = π/4 = 45°.
30. Since det Q = (−1/2)(−1/2) − (√3/2)(−√3/2) = 1, this matrix represents a rotation. To find the angle, we have cos θ = −1/2 and sin θ = −√3/2, so that θ = 4π/3 = 240°.
31. Since det Q = (−1/2)(1/2) − (√3/2)(√3/2) = −1, this matrix represents a reflection. If cos θ = −1/2 and sin θ = √3/2, then θ = 2π/3, so that by part (d) of the previous exercise the reflection occurs through the line y = (tan π/3)x = √3 x.
32. Since det Q = (−3/5)(3/5) − (−4/5)(−4/5) = −1, this matrix represents a reflection. Let θ be the (third-quadrant) angle with cos θ = −3/5 and sin θ = −4/5; then by part (d) of the previous exercise, the reflection occurs through the line y = (tan θ/2)x.
33. (a) Since A and B are orthogonal matrices, we know that AT = A−1 and B T = B −1 , so that
A(Aᵀ + Bᵀ)B = A(A⁻¹ + B⁻¹)B = AA⁻¹B + AB⁻¹B = B + A = A + B.
(b) Since A and B are orthogonal matrices, we have det A = ±1 and det B = ±1. If det A+det B = 0,
then one of them must be 1 and the other must be −1, so that (det A)(det B) = −1. But by part
(a),
det(A + B) = det(A(Aᵀ + Bᵀ)B) = (det A)(det(Aᵀ + Bᵀ))(det B)
= (det A)(det((A + B)ᵀ))(det B) = (det A)(det B)(det(A + B)) = −det(A + B).
Since det(A+B) = − det(A+B), we conclude that det(A+B) = 0 so that A+B is not invertible.
34. It suffices to show that QQᵀ = I. Now,
Q = [x1  yᵀ; y  I − (1/(1 − x1))yyᵀ].
Also, x1² + yᵀy = x1² + x2² + ⋯ + xn² = 1, since x is a unit vector. Rearranging gives yᵀy = 1 − x1². Further, we have
yᵀ(I − (1/(1 − x1))yyᵀ) = yᵀ − (1/(1 − x1))(yᵀy)yᵀ = yᵀ − (1 + x1)yᵀ = −x1yᵀ,
(I − (1/(1 − x1))yyᵀ)y = y − (1/(1 − x1))(1 − x1²)y = y − (1 + x1)y = −x1y,
(I − (1/(1 − x1))yyᵀ)² = I − (2/(1 − x1))yyᵀ + (1/(1 − x1))²(yyᵀ)(yyᵀ)
= I − (2(1 − x1)/(1 − x1)²)yyᵀ + ((1 − x1²)/(1 − x1)²)yyᵀ
= I − ((1 − 2x1 + x1²)/(1 − x1)²)yyᵀ = I − yyᵀ.
Then evaluating QQᵀ gives
QQᵀ = [x1² + yᵀy    x1yᵀ + yᵀ(I − (1/(1 − x1))yyᵀ); x1y + (I − (1/(1 − x1))yyᵀ)y    yyᵀ + (I − (1/(1 − x1))yyᵀ)²]
= [1    x1yᵀ − x1yᵀ; x1y − x1y    yyᵀ + I − yyᵀ] = [1  O; O  I] = I.
Thus Q is orthogonal.
35. Let Q be orthogonal and upper triangular. Then the columns qi of Q are orthonormal; that is,
qi · qj = Σ_{k=1}^n qki qkj = 0 if i ≠ j, and 1 if i = j.
Since Q is upper triangular, we know that qij = 0 when i > j. We prove by induction on the rows that
each row has only one nonzero entry, along the diagonal. Note that q11 6= 0, since all other entries in
the first column are zero and the norm of the first column vector must be 1. Now the dot product of
q1 with qj for j > 1 is
q1 · qj = Σ_{k=1}^n qk1 qkj = q11 q1j.
Since these columns must be orthogonal, this product is zero, so that q1j = 0 for j > 1 and therefore
q11 is the only nonzero entry in the first row. Now assume that q11 , q22 , . . . , qii are the only nonzero
entries in their rows, and consider qi+1,i+1 . It is the only nonzero entry in its column, since entries
in lower-numbered rows are zero by the inductive hypothesis and entries in higher-numbered rows are
zero since Q is upper triangular. Therefore, qi+1,i+1 6= 0 since the (i + 1)st column must have norm 1.
Now consider the dot product of qi+1 with qj for j > i + 1:
qi+1 · qj = Σ_{k=1}^n qk,i+1 qkj = qi+1,i+1 qi+1,j.
Since these columns must be orthogonal, this product is zero, so that qi+1,j = 0 for j > i + 1. So all
entries to the right of the (i + 1)st column are zero. But also entries to the left of the (i + 1)st column
are zero since Q is upper triangular. Thus qi+1,i+1 is the only nonzero entry in its row. So inductively,
qkk is the only nonzero entry in the k th row for k = 1, 2, . . . , n, so that Q is diagonal.
36. If n > m, then the maximum rank of A is m. By the Rank Theorem (Theorem 3.26, in Section 3.5),
we know that rank(A) + nullity(A) = n; since n > m, we have nullity(A) = n − m > 0. So there is
some nonzero vector x such that Ax = 0. But then ‖x‖ ≠ 0 while ‖Ax‖ = ‖0‖ = 0.
37. (a) Since the {vi} form a basis, we may choose αi, βi such that
x = Σ_{k=1}^n αk vk,  y = Σ_{k=1}^n βk vk.
Because {vi} form an orthonormal basis, we know that vi · vi = 1 and vi · vj = 0 for i ≠ j. Then
x · y = (Σ_{k=1}^n αk vk) · (Σ_{k=1}^n βk vk) = Σ_{i,j=1}^n αi βj (vi · vj) = Σ_{i=1}^n αi βi (vi · vi) = Σ_{i=1}^n αi βi.
But x · vi = Σ_{k=1}^n αk (vk · vi) = αi and similarly y · vi = βi. Therefore the sum at the end of the displayed equation above is
Σ_{i=1}^n αi βi = Σ_{i=1}^n (x · vi)(y · vi) = (x · v1)(y · v1) + (x · v2)(y · v2) + ⋯ + (x · vn)(y · vn),
proving Parseval’s Identity.
(b) Since [x]B = [αk] and [y]B = [βk], Parseval's Identity says that
x · y = Σ_{k=1}^n αk βk = [x]B · [y]B.
In other words, the dot product is invariant under a change of orthonormal basis.
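Parseval's Identity is easy to verify numerically for any orthonormal basis (a sketch assuming NumPy; the basis here is a hypothetical one built by QR factorization of a random matrix):

```python
import numpy as np

rng = np.random.default_rng(1)

# Build an orthonormal basis of R^4 via QR (any orthonormal basis works
# in Parseval's Identity).
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
basis = [Q[:, i] for i in range(4)]

x = rng.standard_normal(4)
y = rng.standard_normal(4)

parseval = sum((x @ v) * (y @ v) for v in basis)
assert np.isclose(parseval, x @ y)
```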
5.2 Orthogonal Complements and Orthogonal Projections
1. W is the column space of A = [1; 2], so that W⊥ is the null space of Aᵀ = [1 2]. The augmented matrix
[1 2 | 0]
is already row-reduced, so that a basis for the null space, and thus for W⊥, is given by
[−2; 1].
2. W is the column space of A = [−4; 3], so that W⊥ is the null space of Aᵀ = [−4 3]. Row-reducing gives
[−4 3 | 0] → [1 −3/4 | 0],
so that a basis for the null space, and thus for W⊥, is given by
[3/4; 1], or [3; 4].
3. W consists of all vectors of the form
[x; y; x + y] = x[1; 0; 1] + y[0; 1; 1], so that {[1; 0; 1], [0; 1; 1]} is a basis for W.
Then W is the column space of
A = [1 0; 0 1; 1 1],
so that W⊥ is the null space of Aᵀ. The augmented matrix is already row-reduced:
[1 0 1 | 0; 0 1 1 | 0],
so that the null space, which is W⊥, is the set
{[−t; −t; t]} = {t[−1; −1; 1]} = span([−1; −1; 1]).
This vector forms a basis for W⊥.
4. W consists of
   

 
 
x
1
0
0 
 1
2x + 3z  = x 2 + y 3 , so that 2 , 3 is a basis for W .


z
0
1
0
1

Then W is the column space of

1
A = 2
0

0
3 ,
1
so that W ⊥ is the null space of AT . Row-reduce the augmented matrix:
"
#
"
#
1 2 0 0
1 0 − 32 0
→
1
0 3 1 0
0 1
0
3
so that the null space, which is W ⊥ , is the set


 
 
2
2
2
t
3
3
 1 
 
 
− 3 t = t − 13  = span −1 .
3
t
1
This vector forms a basis for W ⊥ .


5. W is the column space of A = [1; −1; 3], so that W⊥ is the null space of Aᵀ = [1 −1 3]. The augmented matrix [1 −1 3 | 0] is already row-reduced, so that the null space, which is W⊥, is the set of vectors

[s − 3t, s, t]ᵀ = s[1, 1, 0]ᵀ + t[−3, 0, 1]ᵀ = span([1, 1, 0]ᵀ, [−3, 0, 1]ᵀ).

These two vectors form a basis for W⊥.

6. W is the column space of A = [1/2; −1/2; 2], so that W⊥ is the null space of Aᵀ = [1/2 −1/2 2]. Row-reducing gives

[1/2 −1/2 2 | 0] → [1 −1 4 | 0],

so the null space, which is W⊥, is the set of vectors

[s − 4t, s, t]ᵀ = s[1, 1, 0]ᵀ + t[−4, 0, 1]ᵀ = span([1, 1, 0]ᵀ, [−4, 0, 1]ᵀ).

These two vectors form a basis for W⊥.
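Exercises 1–6 all reduce to computing null(Aᵀ). Numerically this is usually done with the SVD rather than row reduction: right singular vectors belonging to zero singular values span the null space. A sketch (the rank tolerance 1e-10 is an arbitrary choice), checked against Exercise 6:

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis for null(M): right singular vectors whose
    singular values are (numerically) zero."""
    U, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T  # columns span null(M)

A = np.array([[0.5], [-0.5], [2.0]])   # Exercise 6: W = span of this column
N = null_space(A.T)                    # basis for W-perp = null(A^T)

assert N.shape == (3, 2)               # W-perp is 2-dimensional
# The hand-computed basis vectors [1,1,0] and [-4,0,1] lie in null(A^T):
for w in ([1, 1, 0], [-4, 0, 1]):
    assert np.allclose(A.T @ np.array(w, dtype=float), 0)
```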
7. To find the required bases, we row-reduce [A | 0]:

[1 −1 3 | 0; 5 2 1 | 0; 0 1 −2 | 0; −1 −1 1 | 0] → [1 0 1 | 0; 0 1 −2 | 0; 0 0 0 | 0; 0 0 0 | 0].

Therefore, a basis for the row space of A is given by the nonzero rows of the reduced matrix, or

u₁ = [1 0 1],  u₂ = [0 1 −2].

The null space null(A) is the set of solutions of the augmented matrix, which are the vectors [−s, 2s, s]ᵀ. Thus a basis for null(A) is

v = [−1, 2, 1]ᵀ.

Every vector in row(A) is orthogonal to every vector in null(A):

u₁ · v = 1·(−1) + 0·2 + 1·1 = 0,  u₂ · v = 0·(−1) + 1·2 − 2·1 = 0.

8. To find the required bases, we row-reduce [A | 0]:

[1 1 −1 0 2 | 0; −2 0 2 4 4 | 0; 2 2 −2 0 1 | 0; −3 −1 3 4 5 | 0] → [1 0 −1 −2 0 | 0; 0 1 0 2 0 | 0; 0 0 0 0 1 | 0; 0 0 0 0 0 | 0].

Thus a basis for the row space of A is given by the nonzero rows of the reduced matrix, or

u₁ = [1 0 −1 −2 0],  u₂ = [0 1 0 2 0],  u₃ = [0 0 0 0 1].

The null space null(A) is the set of solutions of the augmented matrix, which are the vectors

[s + 2t, −2t, s, t, 0]ᵀ = s[1, 0, 1, 0, 0]ᵀ + t[2, −2, 0, 1, 0]ᵀ = span([1, 0, 1, 0, 0]ᵀ, [2, −2, 0, 1, 0]ᵀ).

Thus a basis for null(A) is

v₁ = [1, 0, 1, 0, 0]ᵀ,  v₂ = [2, −2, 0, 1, 0]ᵀ.

Every vector in row(A) is orthogonal to every vector in null(A):

u₁ · v₁ = 1·1 + 0·0 − 1·1 − 2·0 + 0·0 = 0,  u₁ · v₂ = 1·2 + 0·(−2) − 1·0 − 2·1 + 0·0 = 0,
u₂ · v₁ = 0·1 + 1·0 + 0·1 + 2·0 + 0·0 = 0,  u₂ · v₂ = 0·2 + 1·(−2) + 0·0 + 2·1 + 0·0 = 0,
u₃ · v₁ = 0·1 + 0·0 + 0·1 + 0·0 + 1·0 = 0,  u₃ · v₂ = 0·2 + 0·(−2) + 0·0 + 0·1 + 1·0 = 0.
9. Using the row reduction from Exercise 7, we see that a basis for the column space of A is

u₁ = [1, 5, 0, −1]ᵀ,  u₂ = [−1, 2, 1, −1]ᵀ.

To find the null space of Aᵀ, we row-reduce [Aᵀ | 0]:

[1 5 0 −1 | 0; −1 2 1 −1 | 0; 3 1 −2 1 | 0] → [1 0 −5/7 3/7 | 0; 0 1 1/7 −2/7 | 0; 0 0 0 0 | 0].

The null space null(Aᵀ) is the set of solutions of the augmented matrix, which are the vectors

[(5/7)s − (3/7)t, −(1/7)s + (2/7)t, s, t]ᵀ = s[5/7, −1/7, 1, 0]ᵀ + t[−3/7, 2/7, 0, 1]ᵀ.

Clearing fractions, a basis for null(Aᵀ) is

v₁ = [5, −1, 7, 0]ᵀ,  v₂ = [−3, 2, 0, 7]ᵀ.

Every vector in col(A) is orthogonal to every vector in null(Aᵀ):

u₁ · v₁ = 1·5 + 5·(−1) + 0·7 − 1·0 = 0,  u₁ · v₂ = 1·(−3) + 5·2 + 0·0 − 1·7 = 0,
u₂ · v₁ = −1·5 + 2·(−1) + 1·7 − 1·0 = 0,  u₂ · v₂ = −1·(−3) + 2·2 + 1·0 − 1·7 = 0.
10. Using the row reduction from Exercise 8, we see that a basis for the column space of A is

u₁ = [1, −2, 2, −3]ᵀ,  u₂ = [1, 0, 2, −1]ᵀ,  u₃ = [2, 4, 1, 5]ᵀ.

To find the null space of Aᵀ, we row-reduce [Aᵀ | 0]:

[1 −2 2 −3 | 0; 1 0 2 −1 | 0; −1 2 −2 3 | 0; 0 4 0 4 | 0; 2 4 1 5 | 0] → [1 0 0 1 | 0; 0 1 0 1 | 0; 0 0 1 −1 | 0; 0 0 0 0 | 0; 0 0 0 0 | 0].

The null space null(Aᵀ) is the set of solutions of the augmented matrix, which are the vectors [−s, −s, s, s]ᵀ = s[−1, −1, 1, 1]ᵀ. Thus a basis for null(Aᵀ) is

v₁ = [−1, −1, 1, 1]ᵀ.

Every vector in col(A) is orthogonal to every vector in null(Aᵀ):

u₁ · v₁ = 1·(−1) − 2·(−1) + 2·1 − 3·1 = 0,
u₂ · v₁ = 1·(−1) + 0·(−1) + 2·1 − 1·1 = 0,
u₃ · v₁ = 2·(−1) + 4·(−1) + 1·1 + 5·1 = 0.
11. Let

A = [w₁ w₂] = [2 4; 1 0; −2 1].

Then the subspace W spanned by w₁ and w₂ is the column space of A, so that by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). To compute null(Aᵀ), we row-reduce

[Aᵀ | 0] = [2 1 −2 | 0; 4 0 1 | 0] → [1 0 1/4 | 0; 0 1 −5/2 | 0].

Thus null(Aᵀ) is the set of solutions of this augmented matrix, which is

[−(1/4)t, (5/2)t, t]ᵀ = span([−1/4, 5/2, 1]ᵀ) = span([−1, 10, 4]ᵀ),

so this vector forms a basis for W⊥.
12. Let

A = [w₁ w₂] = [1 0; −1 1; 3 −2; −2 1].

Then the subspace W spanned by w₁ and w₂ is the column space of A, so that by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). To compute null(Aᵀ), we row-reduce

[Aᵀ | 0] = [1 −1 3 −2 | 0; 0 1 −2 1 | 0] → [1 0 1 −1 | 0; 0 1 −2 1 | 0].

Thus null(Aᵀ) is the set of solutions of this augmented matrix, which is

[−s + t, 2s − t, s, t]ᵀ = s[−1, 2, 1, 0]ᵀ + t[1, −1, 0, 1]ᵀ = span([−1, 2, 1, 0]ᵀ, [1, −1, 0, 1]ᵀ),

so these two vectors are a basis for W⊥.
13. Let

A = [w₁ w₂ w₃] = [2 −1 2; −1 2 5; 6 −3 6; 3 −2 1].

Then the subspace W spanned by w₁, w₂, and w₃ is the column space of A, so that by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). To compute null(Aᵀ), we row-reduce

[Aᵀ | 0] = [2 −1 6 3 | 0; −1 2 −3 −2 | 0; 2 5 6 1 | 0] → [1 0 3 4/3 | 0; 0 1 0 −1/3 | 0; 0 0 0 0 | 0].

Thus null(Aᵀ) is the set of solutions of this augmented matrix, which is

[−3s − (4/3)t, (1/3)t, s, t]ᵀ = s[−3, 0, 1, 0]ᵀ + t[−4/3, 1/3, 0, 1]ᵀ = span([−3, 0, 1, 0]ᵀ, [−4, 1, 0, 3]ᵀ),

so these two vectors form a basis for W⊥.

14. Let A = [w₁ w₂ w₃] be the matrix with the given vectors as its columns. Then the subspace W spanned by w₁, w₂, and w₃ is the column space of A, so that by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). Row-reducing [Aᵀ | 0] gives

[Aᵀ | 0] → [1 0 0 −2 7 | 0; 0 1 0 3/2 −5 | 0; 0 0 1 0 −1 | 0].

Thus null(Aᵀ) is the set of solutions of this augmented matrix, which is

[2s − 7t, −(3/2)s + 5t, t, s, t]ᵀ = s[2, −3/2, 0, 1, 0]ᵀ + t[−7, 5, 1, 0, 1]ᵀ.

Clearing fractions, we get [4, −3, 0, 2, 0]ᵀ and [−7, 5, 1, 0, 1]ᵀ as a basis for W⊥.
15. We have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ = ((1·7 + 1·(−4))/(1·1 + 1·1)) u₁ = (3/2) u₁ = [3/2, 3/2]ᵀ.

16. We have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂
= ((1·3 + 1·1 + 1·(−2))/(1·1 + 1·1 + 1·1)) u₁ + ((1·3 − 1·1 + 0·(−2))/(1·1 − 1·(−1) + 0·0)) u₂
= (2/3) u₁ + u₂
= (2/3)[1, 1, 1]ᵀ + [1, −1, 0]ᵀ = [5/3, −1/3, 2/3]ᵀ.
17. We have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂
= ((−1·1 + 1·2 + 4·3)/(−1·(−1) + 1·1 + 4·4)) u₁ + ((2·1 − 2·2 + 1·3)/(2·2 − 2·(−2) + 1·1)) u₂
= (13/18) u₁ + (1/9) u₂
= [−13/18, 13/18, 52/18]ᵀ + [2/9, −2/9, 1/9]ᵀ = [−1/2, 1/2, 3]ᵀ.

18. We have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂ + ((u₃ · v)/(u₃ · u₃)) u₃
= ((1·3 + 1·(−2) + 0·4 + 0·(−3))/(1·1 + 1·1 + 0·0 + 0·0)) u₁ + ((1·3 − 1·(−2) − 1·4 + 1·(−3))/(1·1 − 1·(−1) − 1·(−1) + 1·1)) u₂ + ((0·3 + 0·(−2) + 1·4 + 1·(−3))/(0·0 + 0·0 + 1·1 + 1·1)) u₃
= (1/2) u₁ − (1/2) u₂ + (1/2) u₃
= (1/2)[1, 1, 0, 0]ᵀ − (1/2)[1, −1, −1, 1]ᵀ + (1/2)[0, 0, 1, 1]ᵀ = [0, 1, 1, 0]ᵀ.
19. Since W is spanned by the vector u₁ = [1, 3]ᵀ, we have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ = ((1·2 + 3·(−2))/(1·1 + 3·3)) u₁ = −(2/5)[1, 3]ᵀ = [−2/5, −6/5]ᵀ.

The component of v orthogonal to W is thus

proj_{W⊥}(v) = v − proj_W(v) = [12/5, −4/5]ᵀ.
 
20. Since W is spanned by the vector u₁ = [1, 1, 1]ᵀ, we have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ = ((1·3 + 1·2 + 1·(−1))/(1·1 + 1·1 + 1·1)) u₁ = (4/3) u₁ = [4/3, 4/3, 4/3]ᵀ.

The component of v orthogonal to W is thus

proj_{W⊥}(v) = v − proj_W(v) = [5/3, 2/3, −7/3]ᵀ.
21. Since W is spanned by u₁ = [1, 2, 1]ᵀ and u₂ = [1, −1, 1]ᵀ, we have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂
= ((1·4 + 2·(−2) + 1·3)/(1·1 + 2·2 + 1·1)) u₁ + ((1·4 − 1·(−2) + 1·3)/(1·1 − 1·(−1) + 1·1)) u₂
= (1/2) u₁ + 3 u₂
= [1/2, 1, 1/2]ᵀ + [3, −3, 3]ᵀ = [7/2, −2, 7/2]ᵀ.

The component of v orthogonal to W is thus

proj_{W⊥}(v) = v − proj_W(v) = [1/2, 0, −1/2]ᵀ.
22. Since W is spanned by u₁ = [1, 1, 1, 0]ᵀ and u₂ = [1, 0, −1, 1]ᵀ, we have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂
= ((1·2 + 1·(−1) + 1·5 + 0·6)/(1·1 + 1·1 + 1·1 + 0·0)) u₁ + ((1·2 + 0·(−1) − 1·5 + 1·6)/(1·1 + 0·0 − 1·(−1) + 1·1)) u₂
= 2 u₁ + u₂
= [2, 2, 2, 0]ᵀ + [1, 0, −1, 1]ᵀ = [3, 2, 1, 1]ᵀ.

The component of v orthogonal to W is thus

proj_{W⊥}(v) = v − proj_W(v) = [−1, −3, 4, 5]ᵀ.
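All of Exercises 15–22 use the same formula: for a mutually orthogonal basis {u₁, . . . , u_k} of W, proj_W(v) = Σ ((uᵢ · v)/(uᵢ · uᵢ)) uᵢ. A small sketch, checked against the data of Exercise 22:

```python
import numpy as np

def proj(v, basis):
    """Project v onto span(basis); basis vectors must be mutually orthogonal."""
    v = np.asarray(v, dtype=float)
    p = np.zeros_like(v)
    for u in basis:
        u = np.asarray(u, dtype=float)
        p += (u @ v) / (u @ u) * u
    return p

u1 = [1, 1, 1, 0]
u2 = [1, 0, -1, 1]
v = np.array([2, -1, 5, 6], dtype=float)

p = proj(v, [u1, u2])
assert np.allclose(p, [3, 2, 1, 1])        # proj_W(v)
assert np.allclose(v - p, [-1, -3, 4, 5])  # component orthogonal to W
# The perpendicular component really is orthogonal to W:
assert np.isclose((v - p) @ np.array(u1, dtype=float), 0)
```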
23. W ⊥ is defined to be the set of all vectors in Rn that are orthogonal to every vector in W . So if
w ∈ W ∩ W ⊥ , then w · w = 0. But this means that w = 0. Thus 0 is the only element of W ∩ W ⊥ .
24. If v ∈ W ⊥ , then since v is orthogonal to every vector in W , it is certainly orthogonal to wi , so that
v · wi = 0 for i = 1, 2, . . . , k. For the converse, suppose w ∈ W ; then since W = span(w1 , . . . , wk ),
there are scalars a_i such that w = a₁w₁ + a₂w₂ + ··· + a_k w_k. Then

v · w = v · (a₁w₁ + ··· + a_k w_k) = a₁(v · w₁) + ··· + a_k(v · w_k) = 0
since v is orthogonal to each of the wi . But this means that v is orthogonal to every vector in W , so
that v ∈ W ⊥ .
 
 
25. No, it is not true. For example, let W ⊂ R³ be the xy-plane, spanned by e₁ = [1, 0, 0]ᵀ and e₂ = [0, 1, 0]ᵀ. Then W⊥ = span(e₃) = span([0, 0, 1]ᵀ). Now let w = e₁ and w′ = e₂; then w · w′ = 0 but w′ ∉ W⊥.
26. Yes, this is true. Let V = span(v_{k+1}, . . . , v_n); we want to show that V = W⊥. By Theorem 5.9(d), all we must show is that for every v ∈ V, v_i · v = 0 for i = 1, 2, . . . , k. But v = c_{k+1}v_{k+1} + ··· + c_n v_n, so if i ≤ k, since the v_i are an orthogonal basis for Rⁿ, we get

v_i · (c_{k+1}v_{k+1} + ··· + c_n v_n) = c_{k+1}(v_i · v_{k+1}) + ··· + c_n(v_i · v_n) = 0.
27. Let {u_k} be an orthonormal basis for W. If

x = proj_W(x) = Σ_{k=1}^n ((u_k · x)/(u_k · u_k)) u_k = Σ_{k=1}^n (u_k · x) u_k,

then x is in the span of the u_k, which is W. For the converse, suppose that x ∈ W. Then x = Σ_{k=1}^n a_k u_k. Since the u_k are an orthonormal basis for W, we get

x · u_i = Σ_{k=1}^n a_k (u_k · u_i) = a_i (u_i · u_i) = a_i.

Therefore

x = Σ_{k=1}^n a_k u_k = Σ_{k=1}^n (x · u_k) u_k = proj_W(x).
28. Let {u_k} be an orthonormal basis for W. Then

proj_W(x) = Σ_{k=1}^n ((u_k · x)/(u_k · u_k)) u_k = Σ_{k=1}^n (u_k · x) u_k.

So if proj_W(x) = 0, then u_k · x = 0 for all k, since the u_k are linearly independent. But then by Theorem 5.9(d), x ∈ W⊥, so that x is orthogonal to W. Conversely, if x is orthogonal to W, then u_k · x = 0 for all k, so the sum above is zero and thus proj_W(x) = 0.
29. If {u_k} is an orthonormal basis for W, then

proj_W(x) = Σ_{k=1}^n (u_k · x) u_k ∈ W.

By Exercise 27, x ∈ W if and only if proj_W(x) = x. Applying this to proj_W(x) ∈ W gives

proj_W(proj_W(x)) = proj_W(x)
as desired.
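Exercise 29's identity proj_W(proj_W(x)) = proj_W(x) says that projections are idempotent: with an orthonormal basis of W stored as the columns of a matrix Q, the projection acts as P = QQᵀ, and P² = P. A quick numerical sketch (the subspace here is arbitrary test data):

```python
import numpy as np

# Orthonormal basis for a 2-dimensional W inside R^4 (Q factor of random data).
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 2)))

P = Q @ Q.T                  # proj_W as a matrix: x -> sum over k of (u_k . x) u_k

assert np.allclose(P @ P, P)           # projecting twice changes nothing
x = rng.standard_normal(4)
assert np.allclose(P @ (P @ x), P @ x)
```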
30. (a) Let W = span(S) = span{v₁, . . . , v_k}, and let T = {b₁, . . . , b_j} be an orthonormal basis for W⊥. By Theorem 5.13, dim W + dim W⊥ = k + j = n. Therefore S ∪ T is a basis for Rⁿ. Further, it is an orthonormal basis: any two distinct members of S are orthogonal since S is an orthonormal set; similarly, any two distinct members of T are orthogonal. Also, v_i · b_j = 0 since one of the vectors is in W and the other is in W⊥. Finally, v_i · v_i = b_j · b_j = 1 since both sets are orthonormal sets.

Choose x ∈ Rⁿ; then there are α_r, β_s such that

x = Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s.

Then using orthonormality of S ∪ T, we have

x · v_i = (Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s) · v_i = α_i  and  x · b_i = (Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s) · b_i = β_i.

But then (again using orthonormality)

‖x‖² = x · x = (Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s) · (Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s)
= Σ_{r=1}^k α_r² + Σ_{s=1}^j β_s²
= Σ_{r=1}^k |x · v_r|² + Σ_{s=1}^j |x · b_s|²
≥ Σ_{r=1}^k |x · v_r|²,

as required.

(b) Looking at the string of equalities and inequalities above, we see that equality holds if and only if Σ_{s=1}^j |x · b_s|² = 0. But this means that x · b_s = 0 for all s, which in turn means that x ∈ span{v₁, . . . , v_k} = W.
5.3
The Gram-Schmidt Process and the QR Factorization
1. First obtain an orthogonal basis:

v₁ = x₁ = [1, 1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [1, 2]ᵀ − ((1·1 + 1·2)/(1·1 + 1·1))[1, 1]ᵀ = [1, 2]ᵀ − (3/2)[1, 1]ᵀ = [−1/2, 1/2]ᵀ.

Normalizing each of the vectors gives

q₁ = v₁/‖v₁‖ = (1/√2)[1, 1]ᵀ = [1/√2, 1/√2]ᵀ,
q₂ = v₂/‖v₂‖ = √2 [−1/2, 1/2]ᵀ = [−1/√2, 1/√2]ᵀ.
2. First obtain an orthogonal basis:

v₁ = x₁ = [3, −3]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [3, 1]ᵀ − ((3·3 − 3·1)/(3·3 − 3·(−3)))[3, −3]ᵀ = [3, 1]ᵀ − (1/3)[3, −3]ᵀ = [2, 2]ᵀ.

Normalizing each of the vectors gives

q₁ = v₁/‖v₁‖ = (1/√18)[3, −3]ᵀ = [1/√2, −1/√2]ᵀ,
q₂ = v₂/‖v₂‖ = (1/√8)[2, 2]ᵀ = [1/√2, 1/√2]ᵀ.
3. First obtain an orthogonal basis:

v₁ = x₁ = [1, −1, −1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [0, 3, 3]ᵀ − ((1·0 − 1·3 − 1·3)/(1·1 − 1·(−1) − 1·(−1)))[1, −1, −1]ᵀ = [0, 3, 3]ᵀ + 2[1, −1, −1]ᵀ = [2, 1, 1]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [3, 2, 4]ᵀ − ((1·3 − 1·2 − 1·4)/3)[1, −1, −1]ᵀ − ((2·3 + 1·2 + 1·4)/(2·2 + 1·1 + 1·1))[2, 1, 1]ᵀ
= [3, 2, 4]ᵀ + [1, −1, −1]ᵀ − 2[2, 1, 1]ᵀ = [0, −1, 1]ᵀ.

Normalizing each of the vectors gives

q₁ = v₁/‖v₁‖ = (1/√3)[1, −1, −1]ᵀ = [1/√3, −1/√3, −1/√3]ᵀ,
q₂ = v₂/‖v₂‖ = (1/√6)[2, 1, 1]ᵀ = [2/√6, 1/√6, 1/√6]ᵀ,
q₃ = v₃/‖v₃‖ = (1/√2)[0, −1, 1]ᵀ = [0, −1/√2, 1/√2]ᵀ.
4. First obtain an orthogonal basis:

v₁ = x₁ = [1, 1, 1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [1, 1, 0]ᵀ − ((1·1 + 1·1 + 1·0)/(1·1 + 1·1 + 1·1))[1, 1, 1]ᵀ = [1, 1, 0]ᵀ − (2/3)[1, 1, 1]ᵀ = [1/3, 1/3, −2/3]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [1, 0, 0]ᵀ − ((1·1 + 1·0 + 1·0)/3)[1, 1, 1]ᵀ − (((1/3)·1 + (1/3)·0 − (2/3)·0)/((1/3)² + (1/3)² + (2/3)²))[1/3, 1/3, −2/3]ᵀ
= [1, 0, 0]ᵀ − (1/3)[1, 1, 1]ᵀ − (1/2)[1/3, 1/3, −2/3]ᵀ = [1/2, −1/2, 0]ᵀ.

Normalizing each of the vectors gives

q₁ = v₁/‖v₁‖ = (1/√3)[1, 1, 1]ᵀ = [1/√3, 1/√3, 1/√3]ᵀ,
q₂ = v₂/‖v₂‖ = (1/√(2/3))[1/3, 1/3, −2/3]ᵀ = [1/√6, 1/√6, −2/√6]ᵀ,
q₃ = v₃/‖v₃‖ = √2 [1/2, −1/2, 0]ᵀ = [1/√2, −1/√2, 0]ᵀ.
5. Using Gram-Schmidt, we get

v₁ = x₁ = [1, 1, 0]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [3, 4, 2]ᵀ − ((1·3 + 1·4 + 0·2)/(1·1 + 1·1 + 0·0))[1, 1, 0]ᵀ = [3, 4, 2]ᵀ − (7/2)[1, 1, 0]ᵀ = [−1/2, 1/2, 2]ᵀ.
6. Using Gram-Schmidt, we get

v₁ = x₁ = [2, −1, 1, 2]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [3, −1, 0, 4]ᵀ − ((2·3 − 1·(−1) + 1·0 + 2·4)/(2² + (−1)² + 1² + 2²))[2, −1, 1, 2]ᵀ
= [3, −1, 0, 4]ᵀ − (3/2)[2, −1, 1, 2]ᵀ = [0, 1/2, −3/2, 1]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [1, 1, 1, 1]ᵀ − ((2·1 − 1·1 + 1·1 + 2·1)/10)[2, −1, 1, 2]ᵀ − ((0·1 + (1/2)·1 − (3/2)·1 + 1·1)/(0² + (1/2)² + (3/2)² + 1²))[0, 1/2, −3/2, 1]ᵀ
= [1, 1, 1, 1]ᵀ − (2/5)[2, −1, 1, 2]ᵀ − 0 = [1/5, 7/5, 3/5, 1/5]ᵀ.
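Exercises 1–6 repeat the same Gram–Schmidt loop; a minimal implementation (returning the orthogonal, unnormalized vᵢ), checked against Exercise 6:

```python
import numpy as np

def gram_schmidt(xs):
    """Orthogonalize the list xs; returns the orthogonal (not normalized) v_i."""
    vs = []
    for x in xs:
        v = np.array(x, dtype=float)
        for u in vs:
            v -= (u @ v) / (u @ u) * u   # subtract the projection onto u
        vs.append(v)
    return vs

x1 = [2, -1, 1, 2]
x2 = [3, -1, 0, 4]
x3 = [1, 1, 1, 1]
v1, v2, v3 = gram_schmidt([x1, x2, x3])

assert np.allclose(v2, [0, 0.5, -1.5, 1])
assert np.allclose(v3, [0.2, 1.4, 0.6, 0.2])   # [1/5, 7/5, 3/5, 1/5]
assert abs(v1 @ v2) < 1e-12 and abs(v1 @ v3) < 1e-12 and abs(v2 @ v3) < 1e-12
```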
7. First compute the projection of v on each of the vectors v₁ and v₂ from Exercise 5:

proj_{v₁}(v) = ((v₁ · v)/(v₁ · v₁)) v₁ = ((1·4 + 1·(−4) + 0·3)/(1·1 + 1·1 + 0·0)) v₁ = 0 v₁ = [0, 0, 0]ᵀ
proj_{v₂}(v) = ((v₂ · v)/(v₂ · v₂)) v₂ = ((−(1/2)·4 + (1/2)·(−4) + 2·3)/((1/2)² + (1/2)² + 2²)) v₂ = (4/9)[−1/2, 1/2, 2]ᵀ = [−2/9, 2/9, 8/9]ᵀ.

Then the component of v orthogonal to both v₁ and v₂ is

[4, −4, 3]ᵀ − [−2/9, 2/9, 8/9]ᵀ = [38/9, −38/9, 19/9]ᵀ,

so that

v = [38/9, −38/9, 19/9]ᵀ + (4/9) v₂.
8. First compute the projection of v on each of the vectors v₁, v₂, and v₃ from Exercise 6:

proj_{v₁}(v) = ((v₁ · v)/(v₁ · v₁)) v₁ = ((2·1 − 1·4 + 1·0 + 2·2)/(2² + 1² + 1² + 2²)) v₁ = (1/5)[2, −1, 1, 2]ᵀ = [2/5, −1/5, 1/5, 2/5]ᵀ
proj_{v₂}(v) = ((v₂ · v)/(v₂ · v₂)) v₂ = ((0·1 + (1/2)·4 − (3/2)·0 + 1·2)/(0² + (1/2)² + (3/2)² + 1²)) v₂ = (8/7)[0, 1/2, −3/2, 1]ᵀ = [0, 4/7, −12/7, 8/7]ᵀ
proj_{v₃}(v) = ((v₃ · v)/(v₃ · v₃)) v₃ = (((1/5)·1 + (7/5)·4 + (3/5)·0 + (1/5)·2)/((1/5)² + (7/5)² + (3/5)² + (1/5)²)) v₃ = (31/12)[1/5, 7/5, 3/5, 1/5]ᵀ = [31/60, 217/60, 93/60, 31/60]ᵀ.

Then the component of v orthogonal to v₁, v₂, and v₃ is

[1, 4, 0, 2]ᵀ − [2/5, −1/5, 1/5, 2/5]ᵀ − [0, 4/7, −12/7, 8/7]ᵀ − [31/60, 217/60, 93/60, 31/60]ᵀ = [1/12, 1/84, −1/28, −5/84]ᵀ,

so that

v = [1/12, 1/84, −1/28, −5/84]ᵀ + (1/5) v₁ + (8/7) v₂ + (31/12) v₃.
9. To find the column space of the matrix, row-reduce it:

[0 1 1; 1 0 1; 1 1 0] → [1 0 0; 0 1 0; 0 0 1].

It follows that, since there is a leading 1 in every column, the columns of the original matrix are linearly independent, so they form a basis for the column space. We must orthogonalize those three vectors.

v₁ = x₁ = [0, 1, 1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [1, 0, 1]ᵀ − ((0·1 + 1·0 + 1·1)/(0·0 + 1·1 + 1·1))[0, 1, 1]ᵀ = [1, 0, 1]ᵀ − (1/2)[0, 1, 1]ᵀ = [1, −1/2, 1/2]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [1, 1, 0]ᵀ − ((0·1 + 1·1 + 1·0)/2)[0, 1, 1]ᵀ − ((1·1 − (1/2)·1 + (1/2)·0)/(1² + (1/2)² + (1/2)²))[1, −1/2, 1/2]ᵀ
= [1, 1, 0]ᵀ − (1/2)[0, 1, 1]ᵀ − (1/3)[1, −1/2, 1/2]ᵀ = [2/3, 2/3, −2/3]ᵀ.
10. To find the column space of the matrix, row-reduce it:

[1 1 1; 1 −1 2; −1 1 0; 1 5 1] → [1 0 0; 0 1 0; 0 0 1; 0 0 0].

It follows that, since there is a leading 1 in every column, the columns of the original matrix are linearly independent, so they form a basis for the column space. We must orthogonalize those three vectors.

v₁ = x₁ = [1, 1, −1, 1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [1, −1, 1, 5]ᵀ − ((1·1 + 1·(−1) − 1·1 + 1·5)/(1·1 + 1·1 − 1·(−1) + 1·1))[1, 1, −1, 1]ᵀ = [1, −1, 1, 5]ᵀ − [1, 1, −1, 1]ᵀ = [0, −2, 2, 4]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [1, 2, 0, 1]ᵀ − ((1·1 + 1·2 − 1·0 + 1·1)/4)[1, 1, −1, 1]ᵀ − ((0·1 − 2·2 + 2·0 + 4·1)/(0·0 − 2·(−2) + 2·2 + 4·4))[0, −2, 2, 4]ᵀ
= [1, 2, 0, 1]ᵀ − [1, 1, −1, 1]ᵀ − 0 = [0, 1, 1, 0]ᵀ.
11. The given vector x₁, together with x₂ = [0, 1, 0]ᵀ and x₃ = [0, 0, 1]ᵀ, forms a basis for R³; we want to orthogonalize that basis.

v₁ = x₁ = [3, 1, 5]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [0, 1, 0]ᵀ − ((3·0 + 1·1 + 5·0)/(3² + 1² + 5²))[3, 1, 5]ᵀ = [0, 1, 0]ᵀ − (1/35)[3, 1, 5]ᵀ = [−3/35, 34/35, −1/7]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [0, 0, 1]ᵀ − ((3·0 + 1·0 + 5·1)/(3² + 1² + 5²))[3, 1, 5]ᵀ − ((−(3/35)·0 + (34/35)·0 − (1/7)·1)/((3/35)² + (34/35)² + (1/7)²))[−3/35, 34/35, −1/7]ᵀ
= [0, 0, 1]ᵀ − (1/7)[3, 1, 5]ᵀ + (5/34)[−3/35, 34/35, −1/7]ᵀ = [−15/34, 0, 9/34]ᵀ.
12. Let

v₁ = [1, 2, −1, 0]ᵀ,  v₂ = [1, 0, 1, 3]ᵀ.

Note that v₁ · v₂ = 0. Now let

x₃ = [0, 0, 1, 0]ᵀ,  x₄ = [0, 0, 0, 1]ᵀ.

To see that v₁, v₂, x₃, and x₄ form a basis for R⁴, row-reduce the matrix that has these vectors as its columns:

[1 1 0 0; 2 0 0 0; −1 1 1 0; 0 3 0 1] → [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1].

Since there is a leading 1 in every column, the column vectors are linearly independent, so they form a basis for R⁴. Next we need to orthogonalize that basis. Since v₁ and v₂ are already orthogonal, we start with x₃:

v₃ = x₃ − ((v₁ · x₃)/(v₁ · v₁)) v₁ − ((v₂ · x₃)/(v₂ · v₂)) v₂ = [0, 0, 1, 0]ᵀ + (1/6)[1, 2, −1, 0]ᵀ − (1/11)[1, 0, 1, 3]ᵀ = [5/66, 1/3, 49/66, −3/11]ᵀ
v₄ = x₄ − ((v₁ · x₄)/(v₁ · v₁)) v₁ − ((v₂ · x₄)/(v₂ · v₂)) v₂ − ((v₃ · x₄)/(v₃ · v₃)) v₃
= [0, 0, 0, 1]ᵀ − 0 − (3/11)[1, 0, 1, 3]ᵀ + (18/49)[5/66, 1/3, 49/66, −3/11]ᵀ
= [−12/49, 6/49, 0, 4/49]ᵀ.
13. To start, we let x3 = 0 be the third column of the matrix. Then the determinant of this matrix
1
is (for example, expanding along the second row) √16 6= 0, so the resulting matrix is invertible and
therefore the three column vectors form a basis for R3 . The first two vectors are orthogonal and each
is of unit length, so we must orthogonalize the third vector as well:
v3 = x3 −
q1 · x3
q1 · q1
q1 −
q2 · x3
q2 · q2
q2
 1 
 1 
 
√
√
0
3
2
√1
√1
−

 √1 
 

2
3
= 0 − 2
0 − 2 2 2  3 
2 
√1
√1
+ 02 + − √12
+ √13 + √13
√1
1
− √12
2
3
3
 
 1 
 1   
1
√
√
0
2
1 
1  13   61 
 

= 0 + √ 
0 − √  √3  = − 3  .
2
3 1
1
1
√
√
1
− 2
6
3
Then v3 is orthogonal to the other two columns of the matrix, but we must normalize v3 to convert it
to a unit vector.
s 2 2
2
1
1
1
1
kv3 k =
+ −
+
=√ ,
6
3
6
6
5.3. THE GRAM-SCHMIDT PROCESS AND THE QR FACTORIZATION
395
so that

√1
6
1


q3 = √ v3 = − √26  ,
1/ 6
√1
6

so that the completed matrix Q is

√1
2

Q=
0
− √12
√1
3
√1
3
√1
3

√1
6

− √26  .
√1
6
 
 
2
1
 1
1

 
14. To simplify the calculations, we will start with the multiples v₁ = [1, 1, 1, 1]ᵀ and v₂ = [2, 1, 0, −3]ᵀ of the given basis vectors (the column vectors of Q). We will find an orthogonal basis containing those two vectors and then normalize all four at the very end. Note that v₁ and v₂ are already orthogonal. If we choose

x₃ = [0, 0, 1, 0]ᵀ and x₄ = [0, 0, 0, 1]ᵀ,

then calculating the determinant of [v₁ v₂ x₃ x₄], for example by expanding along the last column, shows that the matrix is invertible, so that these four vectors form a basis for R⁴. We now need to orthogonalize the last two vectors:

p₃ = x₃ − ((v₁ · x₃)/(v₁ · v₁)) v₁ − ((v₂ · x₃)/(v₂ · v₂)) v₂ = [0, 0, 1, 0]ᵀ − (1/4)[1, 1, 1, 1]ᵀ − 0 = [−1/4, −1/4, 3/4, −1/4]ᵀ.

Now let v₃ = 4p₃ = [−1, −1, 3, −1]ᵀ, again avoiding fractions. Next,

v₄ = x₄ − ((v₁ · x₄)/(v₁ · v₁)) v₁ − ((v₂ · x₄)/(v₂ · v₂)) v₂ − ((v₃ · x₄)/(v₃ · v₃)) v₃
= [0, 0, 0, 1]ᵀ − (1/4)[1, 1, 1, 1]ᵀ + (3/14)[2, 1, 0, −3]ᵀ + (1/12)[−1, −1, 3, −1]ᵀ
= [2/21, −5/42, 0, 1/42]ᵀ.

We now normalize v₁ through v₄; note that we already have normalized forms for the first two.

‖v₃‖ = √(1² + 1² + 3² + 1²) = 2√3,
‖v₄‖ = √((2/21)² + (5/42)² + 0² + (1/42)²) = 1/√42,

so that the matrix is

Q = [1/2 2/√14 −√3/6 4/√42; 1/2 1/√14 −√3/6 −5/√42; 1/2 0 √3/2 0; 1/2 −3/√14 −√3/6 1/√42].

15. From Exercise 9, ‖v₁‖ = √2, ‖v₂‖ = √6/2, and ‖v₃‖ = 2√3/3, so that

Q = [q₁ q₂ q₃] = [0 2/√6 1/√3; 1/√2 −1/√6 1/√3; 1/√2 1/√6 −1/√3].

Therefore

R = QᵀA = [0 1/√2 1/√2; 2/√6 −1/√6 1/√6; 1/√3 1/√3 −1/√3][0 1 1; 1 0 1; 1 1 0] = [√2 1/√2 1/√2; 0 3/√6 1/√6; 0 0 2/√3].
16. From Exercise 10, ‖v₁‖ = 2, ‖v₂‖ = 2√6, and ‖v₃‖ = √2, so that

Q = [q₁ q₂ q₃] = [1/2 0 0; 1/2 −√6/6 1/√2; −1/2 √6/6 1/√2; 1/2 √6/3 0].

Therefore

R = QᵀA = [1/2 1/2 −1/2 1/2; 0 −√6/6 √6/6 √6/3; 0 1/√2 1/√2 0][1 1 1; 1 −1 2; −1 1 0; 1 5 1] = [2 2 2; 0 2√6 0; 0 0 √2].

17. We have

R = QᵀA = [2/3 1/3 2/3; 1/3 2/3 −2/3; 2/3 −2/3 −1/3][2 8 2; 1 7 −1; 2 2 −1] = [3 9 1/3; 0 6 2/3; 0 0 7/3].
18. We have

R = QᵀA = [1/√6 2/√6 −1/√6; 1/√3 0 1/√3][1 0; 2 1; −1 3] = [√6 −1/√6; 0 √3].
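Exercises 15–18 compute R as QᵀA; a library QR routine performs the same factorization, though it may differ from a hand computation by the signs of corresponding columns of Q and rows of R. A sketch with the matrix of Exercise 15:

```python
import numpy as np

A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])

Q, R = np.linalg.qr(A)

# A = QR with Q orthogonal and R upper triangular:
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(3))
assert np.allclose(R, np.triu(R))
# And R is recovered as Q^T A:
assert np.allclose(Q.T @ A, R)
```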
19. If A is already orthogonal, let Q = A and R = I, which is upper triangular; then A = QR.
20. First, if A is invertible, then A has linearly independent columns, so that by Theorem 5.16, A = QR
where R is invertible. Since R is upper triangular, its determinant is the product of its diagonal entries,
so that R invertible implies that all of its diagonal entries are nonzero. For the converse, if R is upper
triangular with all diagonal entries nonzero, then it has nonzero determinant; since Q is orthogonal, it
is invertible so has nonzero determinant as well. But then det A = det(QR) = (det Q)(det R) ≠ 0, so that A is invertible. Note that when this occurs,

A⁻¹ = (QR)⁻¹ = R⁻¹Q⁻¹ = R⁻¹Qᵀ.
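The observation A⁻¹ = R⁻¹Qᵀ from Exercise 20 is how QR factorizations are used to solve systems in practice: only the triangular factor needs inverting (or back-substitution). A sketch, checked against the inverse found in Exercise 21:

```python
import numpy as np
from numpy.linalg import qr, inv

A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])   # invertible, as in Exercise 15

Q, R = qr(A)
A_inv = inv(R) @ Q.T           # A^{-1} = R^{-1} Q^T  (R is triangular)

assert np.allclose(A_inv @ A, np.eye(3))
# Matches the hand-computed inverse from Exercise 21:
assert np.allclose(A_inv, 0.5 * np.array([[-1, 1, 1], [1, -1, 1], [1, 1, -1]]))
```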
21. From Exercise 20, A⁻¹ = R⁻¹Qᵀ. To compute R⁻¹, we row-reduce [R | I] using Exercise 15:

[√2 1/√2 1/√2 | 1 0 0; 0 3/√6 1/√6 | 0 1 0; 0 0 2/√3 | 0 0 1] → [1 0 0 | √2/2 −√6/6 −√3/6; 0 1 0 | 0 √6/3 −√3/6; 0 0 1 | 0 0 √3/2].

Thus

A⁻¹ = R⁻¹Qᵀ = [√2/2 −√6/6 −√3/6; 0 √6/3 −√3/6; 0 0 √3/2][0 1/√2 1/√2; 2/√6 −1/√6 1/√6; 1/√3 1/√3 −1/√3] = [−1/2 1/2 1/2; 1/2 −1/2 1/2; 1/2 1/2 −1/2].
22. From Exercise 20, A⁻¹ = R⁻¹Qᵀ. To compute R⁻¹, we row-reduce [R | I] using Exercise 17:

[3 9 1/3 | 1 0 0; 0 6 2/3 | 0 1 0; 0 0 7/3 | 0 0 1] → [1 0 0 | 1/3 −1/2 2/21; 0 1 0 | 0 1/6 −1/21; 0 0 1 | 0 0 3/7].

Thus

A⁻¹ = R⁻¹Qᵀ = [1/3 −1/2 2/21; 0 1/6 −1/21; 0 0 3/7][2/3 1/3 2/3; 1/3 2/3 −2/3; 2/3 −2/3 −1/3] = [5/42 −2/7 11/21; 1/42 1/7 −2/21; 2/7 −2/7 −1/7].
23. If A = QR is a QR factorization of A, then we want to show that Rx = 0 implies x = 0, so that R is invertible by part (c) of the Fundamental Theorem. Note that since Q is orthogonal, so is Qᵀ, so that by Theorem 5.6(b), ‖Qᵀv‖ = ‖v‖ for any vector v. Then

Rx = 0 ⇒ 0 = ‖Rx‖ = ‖(QᵀA)x‖ = ‖Qᵀ(Ax)‖ = ‖Ax‖ ⇒ Ax = 0.

If aᵢ are the columns of A, then Ax = 0 means that x₁a₁ + ··· + xₙaₙ = 0. But this means that all of the xᵢ are zero, since the columns of A are linearly independent. Thus x = 0. By part (c) of the Fundamental Theorem, we conclude that R is invertible.
24. Recall that row(A) = row(B) if and only if there is an invertible matrix M such that A = MB. Then A = QR implies that Aᵀ = RᵀQᵀ; since A has linearly independent columns, R is invertible, so that Rᵀ is as well. Thus row(Aᵀ) = row(Qᵀ), so that col(A) = col(Q).
Exploration: The Modified QR Process
1. We have

I − 2uuᵀ = I − 2[d₁; d₂][d₁ d₂] = I − [2d₁² 2d₁d₂; 2d₁d₂ 2d₂²] = [1 − 2d₁² −2d₁d₂; −2d₁d₂ 1 − 2d₂²].
2. (a) From Exercise 1,

Q = [1 − 2d₁² −2d₁d₂; −2d₁d₂ 1 − 2d₂²] = [1 − 2·(9/25) −2·(3/5)(4/5); −2·(3/5)(4/5) 1 − 2·(16/25)] = [7/25 −24/25; −24/25 −7/25].

(b) Since ‖x − y‖ = √((5 − 1)² + (5 − 7)²) = 2√5, we have

u = (1/(2√5))(x − y) = (1/(2√5))[4, −2]ᵀ = [2/√5, −1/√5]ᵀ.

Then from Exercise 1,

Q = [1 − 2d₁² −2d₁d₂; −2d₁d₂ 1 − 2d₂²] = [1 − 2·(4/5) −2·(2/√5)(−1/√5); −2·(2/√5)(−1/√5) 1 − 2·(1/5)] = [−3/5 4/5; 4/5 3/5].

3. (a) Qᵀ = (I − 2uuᵀ)ᵀ = I − 2(uuᵀ)ᵀ = I − 2(uᵀ)ᵀuᵀ = I − 2uuᵀ = Q, so that Q is symmetric.

(b) Start with

QQᵀ = QQ = (I − 2uuᵀ)(I − 2uuᵀ) = I − 4uuᵀ + (−2uuᵀ)² = I − 4uuᵀ + 4u(uᵀu)uᵀ.

Now uᵀu = u · u = 1 since u is a unit vector, so the equation above becomes

QQᵀ = I − 4uuᵀ + 4uuᵀ = I.

Thus Qᵀ = Q⁻¹ and it follows that Q is orthogonal.

(c) This follows from parts (a) and (b): Q² = QQ = QQᵀ = QQ⁻¹ = I.
4. First suppose that v is in the span of u; say v = cu. Then, again using the fact that uᵀu = u · u = 1,

Qv = Q(cu) = (I − 2uuᵀ)(cu) = c(u − 2u(uᵀu)) = c(u − 2u) = −cu = −v.

If v · u = uᵀv = 0, then

Qv = (I − 2uuᵀ)v = v − 2u(uᵀv) = v.
5. First normalize the given vector to get

u = (1/√6)[1, −1, 2]ᵀ.

Then

uuᵀ = [1/6 −1/6 1/3; −1/6 1/6 −1/3; 1/3 −1/3 2/3],

so that

Q = I − 2uuᵀ = [2/3 1/3 −2/3; 1/3 2/3 2/3; −2/3 2/3 −1/3].

For Exercise 3, clearly Q is symmetric, and each column has norm 1; a short computation shows that each pair of columns is orthogonal, so Q is orthogonal. Finally, multiplying Q by itself gives Q² = I.

For Exercise 4, first suppose that v = cu. Then

Qv = Q(cu) = cQu = cQ[1/√6, −1/√6, 2/√6]ᵀ = c[−1/√6, 1/√6, −2/√6]ᵀ = −cu = −v.

If v · u = uᵀv = 0, then

(1/√6)x₁ − (1/√6)x₂ + (2/√6)x₃ = 0,

so that x₁ − x₂ + 2x₃ = 0 and therefore x₂ = x₁ + 2x₃. Thus if v · u = 0, then v = s[1, 1, 0]ᵀ + t[0, 2, 1]ᵀ. But then

Qv = sQ[1, 1, 0]ᵀ + tQ[0, 2, 1]ᵀ = s[1, 1, 0]ᵀ + t[0, 2, 1]ᵀ = v,

since Q[1, 1, 0]ᵀ = [1, 1, 0]ᵀ and Q[0, 2, 1]ᵀ = [0, 2, 1]ᵀ.
6. x − y is a multiple of u, so that Q(x − y) = −(x − y) by Exercise 4. Next,

(x + y) · u = (1/‖x − y‖)(x + y) · (x − y).

But by Exercise 61 in Section 1.2, this dot product is zero because ‖x‖ = ‖y‖. Thus x + y is orthogonal to u, so that Q(x + y) = x + y. So we have

Q(x − y) = −(x − y) and Q(x + y) = x + y ⇒ Qx − Qy = −x + y and Qx + Qy = x + y ⇒ 2Qx = 2y ⇒ Qx = y.

7. We have

u = (1/‖x − y‖)(x − y) = (1/(2√3))[−2, 2, 2]ᵀ,

so that

uuᵀ = [1/3 −1/3 −1/3; −1/3 1/3 1/3; −1/3 1/3 1/3] ⇒ Q = I − 2uuᵀ = [1/3 2/3 2/3; 2/3 1/3 −2/3; 2/3 −2/3 1/3].

Then

Qx = [1/3 2/3 2/3; 2/3 1/3 −2/3; 2/3 −2/3 1/3][1, 2, 2]ᵀ = [3, 0, 0]ᵀ = y.

8. Since ‖y‖ = ‖x‖, if aᵢ is the i-th column vector of A, we can apply Exercise 6 to get

Q₁A = [Q₁a₁ Q₁a₂ · · · Q₁aₙ] = [Q₁x Q₁a₂ · · · Q₁aₙ] = [y Q₁a₂ · · · Q₁aₙ] = [∗ ∗; 0 A₁],

since the last m − 1 entries in y are all zero. Here A₁ is (m − 1) × (n − 1), the left-hand ∗ is 1 × 1, and the right-hand ∗ is 1 × (n − 1).
9. Since P₂ is a Householder matrix, it is orthogonal, so that

Q₂Q₂ᵀ = [1 0; 0 P₂][1 0; 0 P₂ᵀ] = [1 0; 0 P₂P₂ᵀ] = I,

so that Q₂ is orthogonal as well. Further,

Q₂Q₁A = [1 0; 0 P₂][∗ ∗; 0 A₁] = [∗ ∗; 0 P₂A₁] = [∗ ∗ ∗; 0 ∗ ∗; 0 0 A₂].
10. Using induction, suppose that

QₖQₖ₋₁ · · · Q₁A = [∗ ∗ ∗; 0 ∗ ∗; 0 0 Aₖ].

Repeat Exercise 8 on the matrix Aₖ, giving a Householder matrix Pₖ₊₁ such that

Pₖ₊₁Aₖ = [∗ ∗; 0 Aₖ₊₁].

Let Qₖ₊₁ = [1 0; 0 Pₖ₊₁], with the identity block sized to match. Then, using the fact that Pₖ₊₁ is orthogonal, we again conclude that Qₖ₊₁ is orthogonal, and that

Qₖ₊₁QₖQₖ₋₁ · · · Q₁A = [∗ ∗ ∗; 0 ∗ ∗; 0 0 Aₖ₊₁].

The result then follows by induction. The resulting matrix R = Qₘ₋₁ · · · Q₂Q₁A is upper triangular by construction.
11. If Q = Q₁Q₂ · · · Qₘ₋₁, then Q is orthogonal since each Qᵢ is orthogonal, and

Qᵀ = (Q₁Q₂ · · · Qₘ₋₁)ᵀ = Qₘ₋₁ᵀ · · · Q₂ᵀQ₁ᵀ = Qₘ₋₁ · · · Q₂Q₁.

Thus QᵀA = Qₘ₋₁ · · · Q₂Q₁A = R, so that QQᵀA = A = QR, since QQᵀ = I.
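Exercises 8–11 describe Householder QR: reflect the current leading column onto ‖x‖e₁, recurse on the trailing block, and multiply the reflectors together. A compact sketch of that process (square matrices only, for simplicity; each reflector is chosen as in Exercise 6):

```python
import numpy as np

def householder_qr(A):
    """QR via Householder reflections, following Exercises 8-11."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.eye(m)
    R = A.copy()
    for k in range(min(m - 1, n)):
        x = R[k:, k]
        y = np.zeros_like(x)
        y[0] = np.linalg.norm(x)
        if np.allclose(x, y):
            continue                      # column already in the required form
        u = (x - y) / np.linalg.norm(x - y)
        P = np.eye(m - k) - 2.0 * np.outer(u, u)
        Qk = np.eye(m)
        Qk[k:, k:] = P                    # Q_k = [I 0; 0 P_k]
        R = Qk @ R                        # zeros out entries below (k, k)
        Q = Q @ Qk                        # Q = Q_1 Q_2 ... (each Q_k symmetric)
    return Q, R

A = np.array([[1., 2., 0.], [2., 0., 1.], [2., 1., 3.]])
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(3))
assert np.allclose(np.tril(R, -1), 0)
```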
12. (a) We start with x = [3, −4]^T, so that ‖x‖ = √(3² + (−4)²) = 5 and y = [5, 0]^T. Then
u = (1/‖x − y‖)(x − y) = [−1/√5, −2/√5]^T   ⇒   uu^T = [1/5 2/5; 2/5 4/5].
Then
Q1 = I − 2uu^T = [3/5 −4/5; −4/5 −3/5]   ⇒   Q1 A = [5 3 −1; 0 −9 −2] = R,
and A = Q1^T R.
(b) We start with x = [1, 2, 2]^T, so that y = [3, 0, 0]^T. Then
u = (1/‖x − y‖)(x − y) = [−1/√3, 1/√3, 1/√3]^T   ⇒   uu^T = [1/3 −1/3 −1/3; −1/3 1/3 1/3; −1/3 1/3 1/3].
Thus
Q1 = I − 2uu^T = [1/3 2/3 2/3; 2/3 1/3 −2/3; 2/3 −2/3 1/3]   ⇒   Q1 A = [3 −5 8/3; 0 4 1/3; 0 3 4/3],
so that
A1 = [4 1/3; 3 4/3],
and in the next iteration
x = [4, 3]^T   ⇒   y = [5, 0]^T.
Then
u = [−1/√10, 3/√10]^T   ⇒   uu^T = [1/10 −3/10; −3/10 9/10]   ⇒   P2 = I − 2uu^T = [4/5 3/5; 3/5 −4/5].
So
Q2 = [1 0; 0 P2] = [1 0 0; 0 4/5 3/5; 0 3/5 −4/5].
CHAPTER 5. ORTHOGONALITY
Finally,
Q2 Q1 A = [1 0 0; 0 4/5 3/5; 0 3/5 −4/5][3 −5 8/3; 0 4 1/3; 0 3 4/3] = [3 −5 8/3; 0 5 16/15; 0 0 −13/15] = R,
and Q1 Q2 R = A.
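The whole process of Exercises 8-11 can be sketched in a few lines. The matrix used below is an arbitrary 3 × 3 example chosen for illustration (it is not specified in the text); the function reproduces the construction R = Qm−1 · · · Q1 A with each Qk a Householder reflector embedded in an identity block, as in Exercises 9 and 10.

```python
import numpy as np

def modified_qr(A):
    # Triangularize A with embedded Householder reflectors:
    # R = Q_{m-1} ... Q_1 A and Q = Q_1 ... Q_{m-1}, so that A = QR.
    m, n = A.shape
    Q, R = np.eye(m), A.astype(float).copy()
    for k in range(m - 1):
        x = R[k:, k]
        y = np.zeros_like(x)
        y[0] = np.linalg.norm(x)
        if np.isclose(np.linalg.norm(x - y), 0.0):
            continue                           # column already has zeros below the diagonal
        u = (x - y) / np.linalg.norm(x - y)
        Qk = np.eye(m)
        Qk[k:, k:] -= 2.0 * np.outer(u, u)     # embed P = I - 2uu^T as in Exercise 9
        R = Qk @ R
        Q = Q @ Qk                             # each Q_k is symmetric, so Q_k^T = Q_k

    return Q, R

A = np.array([[1.0, 3.0, 2.0],
              [2.0, -4.0, 1.0],
              [2.0, -5.0, 2.0]])               # illustrative example only
Q, R = modified_qr(A)
```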
Exploration: Approximating Eigenvalues with the QR Algorithm
1. We have
A1 = RQ = (Q−1 Q)RQ = Q−1 (QR)Q = Q−1 AQ,
so that A1 ∼ A. Thus by Theorem 4.22(e), A1 and A have the same eigenvalues.
2. Using Gram-Schmidt, we have
v1 = [1, 1]^T,
v2 = [0, 3]^T − ((1 · 0 + 1 · 3)/(1 · 1 + 1 · 1)) [1, 1]^T = [0, 3]^T − (3/2)[1, 1]^T = [−3/2, 3/2]^T.
Normalizing these vectors (taking −v2 rather than v2, which still gives an orthonormal basis) produces
Q = [1/√2 1/√2; 1/√2 −1/√2], so that R = Q^T A = [√2 3√2/2; 0 −3√2/2].
Thus
A1 = RQ = [√2 3√2/2; 0 −3√2/2][1/√2 1/√2; 1/√2 −1/√2] = [5/2 −1/2; −3/2 3/2].
Then
det(A − λI) = |1−λ 0; 1 3−λ| = (1 − λ)(3 − λ),
det(A1 − λI) = |5/2−λ −1/2; −3/2 3/2−λ| = (5/2 − λ)(3/2 − λ) − 3/4 = λ² − 4λ + 3 = (λ − 1)(λ − 3).
So A and A1 do indeed have the same eigenvalues.
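One step of the QR algorithm is easy to replicate with technology. The sketch below applies it to the matrix A = [1 0; 1 3] underlying Exercise 2; NumPy may choose different column signs for Q than the hand computation, but that does not affect the similarity A1 = Q^-1 AQ.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 3.0]])   # the matrix behind Exercise 2 (eigenvalues 1 and 3)
Q, R = np.linalg.qr(A)       # numpy's sign conventions may differ from the text
A1 = R @ Q                   # one QR-algorithm step: A1 = Q^T A Q, so A1 ~ A
```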
3. The proof is by induction on k. We know that A1 ∼ A. Now assume An−1 ∼ A. Then
An = Rn−1 Qn−1 = (Qn−1^-1 Qn−1)Rn−1 Qn−1 = Qn−1^-1 (Qn−1 Rn−1)Qn−1 = Qn−1^-1 An−1 Qn−1,
so that An ∼ An−1 ∼ A.
4. Converting A1 = Q1 R1 to decimals gives, using technology,
A1 = Q1 R1 ≈ [0.86 0.51; −0.51 0.86][2.92 −1.20; 0 1.04]   ⇒   A2 = R1 Q1 ≈ [3.12 0.47; −0.53 0.88]
A2 = Q2 R2 ≈ [−0.99 0.17; 0.17 0.99][−3.16 −0.32; 0 0.95]   ⇒   A3 = R2 Q2 ≈ [3.06 −0.84; 0.16 0.93]
A3 = Q3 R3 ≈ [−1.00 −0.05; −0.05 1.00][−3.06 0.79; 0 0.97]   ⇒   A4 = R3 Q3 ≈ [3.02 0.94; −0.05 0.97]
A4 = Q4 R4 ≈ [−1.00 0.02; 0.02 1.00][−3.02 −0.92; 0 0.99]   ⇒   A5 = R4 Q4 ≈ [3.00 −0.98; 0.02 0.99].
The Ak are approaching an upper triangular matrix U.
5. Since Ak ∼ A for all k, the eigenvalues of A and Ak are the same, so in the limit, the upper triangular
matrix U will have eigenvalues equal to those of A; but the eigenvalues of a triangular matrix are its
diagonal entries. Thus the diagonal entries of U will be the eigenvalues of A.
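The convergence described in Exercises 4 and 5 can be observed directly by looping. This sketch iterates on the matrix from Exercise 2, whose eigenvalues 1 and 3 have distinct absolute values, and recovers them on the diagonal.

```python
import numpy as np

Ak = np.array([[1.0, 0.0],
               [1.0, 3.0]])    # start from the matrix of Exercise 2
for _ in range(50):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q                 # each iterate is similar to the previous one
# Ak is now (nearly) upper triangular with the eigenvalues on the diagonal
```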
6. (a) Follow the same process as in Exercise 4.
A = QR ≈ [0.71 0.71; 0.71 −0.71][2.83 2.83; 0 1.41]   ⇒   A1 = RQ ≈ [4.02 0.00; 1.00 −1.00]
A1 = Q1 R1 ≈ [−0.97 −0.24; −0.24 0.97][−4.14 0.24; 0 −0.97]   ⇒   A2 = R1 Q1 ≈ [3.96 1.23; 0.23 −0.94]
A2 = Q2 R2 ≈ [−1.00 −0.06; −0.06 1.00][−3.97 −1.17; 0 −1.01]   ⇒   A3 = R2 Q2 ≈ [4.04 −0.93; 0.06 −1.01]
A3 = Q3 R3 ≈ [−1.00 −0.01; −0.01 1.00][−4.04 0.94; 0 −1.00]   ⇒   A4 = R3 Q3 ≈ [4.03 0.98; 0.01 −1.00]
A4 = Q4 R4 ≈ [−1.00 0.00; 0.00 1.00][−4.03 −0.98; 0 −1.00]   ⇒   A5 = R4 Q4 ≈ [4.03 −0.98; 0.00 −1.00].
The eigenvalues of A are approximately 4.03 and −1.00.
(b) Follow the same process as in Exercise 4.
A = QR ≈ [0.45 0.89; 0.89 −0.45][2.24 1.34; 0 0.45]   ⇒   A1 = RQ ≈ [2.20 1.40; 0.40 −0.20]
A1 = Q1 R1 ≈ [−0.98 −0.18; −0.18 0.98][−2.24 −1.34; 0 −0.45]   ⇒   A2 = R1 Q1 ≈ [2.44 −0.92; 0.08 −0.44]
A2 = Q2 R2 ≈ [−1.00 −0.03; −0.03 1.00][−2.44 0.93; 0 −0.41]   ⇒   A3 = R2 Q2 ≈ [2.41 1.01; 0.01 −0.41]
A3 = Q3 R3 ≈ [−1 0; 0 1][−2.41 −1.01; 0 −0.42]   ⇒   A4 = R3 Q3 ≈ [2.42 −1; 0 −0.42]
A4 = Q4 R4 ≈ [−1.00 0.00; 0.00 1.00][−2.42 1; 0 −0.41]   ⇒   A5 = R4 Q4 ≈ [2.41 1; 0 −0.41].
The eigenvalues of A are approximately 2.41 and −0.41 (the true values are 1 ± √2).
(c) Follow the same process as in Exercise 4.
A = QR ≈ [0.24 −0.06 −0.97; 0.24 0.97 0; −0.94 0.23 −0.24][4.24 0.47 −0.94; 0 1.94 1.26; 0 0 0.73]   ⇒
A1 = RQ ≈ [2.00 0 −3.89; −0.73 2.18 −0.31; −0.69 0.17 −0.18]
A1 = Q1 R1 ≈ [−0.89 −0.33 0.30; 0.33 −0.94 −0.07; 0.31 0.03 0.95][−2.24 0.76 3.32; 0 −2.05 1.57; 0 0 −1.31]   ⇒
A2 = R1 Q1 ≈ [3.27 0.13 2.44; −0.18 1.98 1.64; −0.40 −0.04 −1.25]
A2 = Q2 R2 ≈ [−0.99 −0.05 0.12; 0.06 −1.00 0.01; 0.12 0.02 0.99][−3.3 −0.03 −2.47; 0 −1.99 −1.8; 0 0 −0.92]   ⇒
A3 = R2 Q2 ≈ [2.96 0.16 −2.86; −0.33 1.95 −1.81; −0.11 −0.02 −0.91]
A3 = Q3 R3 ≈ [−0.99 −0.11 0.04; 0.11 −0.99 0.01; 0.04 0.01 1.00][−2.98 0.06 2.61; 0 −1.95 2.10; 0 0 −1.03]   ⇒
A4 = R3 Q3 ≈ [3.07 0.30 2.49; −0.14 1.96 2.09; −0.04 −0.01 −1.03]
A4 = Q4 R4 ≈ [−1 −0.04 0.01; 0.04 −1 0; 0.01 0 1][−3.07 −0.21 −2.41; 0 −1.97 −2.20; 0 0 −0.99]   ⇒
A5 = R4 Q4 ≈ [3.03 0.34 −2.45; −0.12 1.96 −2.21; −0.01 0 −0.99].
The eigenvalues of A are approximately 3.03, 1.96, and −0.99. (The true values are 3, 2, and −1.)
(d) Follow the same process as in Exercise 4. Here A has rank 2, so its QR factorization uses a 3 × 2 matrix Q with orthonormal columns:
A = QR ≈ [0.45 0.72; 0 0.60; −0.89 0.36][2.24 −3.13 −2.24; 0 3.35 0]   ⇒   A1 = RQ ≈ [3.00 −1.07; 0 2.00]
A1 = Q1 R1 ≈ [1.00 0.00; 0.00 1.00][3.00 −1.07; 0 2.00]   ⇒   A2 = R1 Q1 ≈ [3.00 −1.07; 0 2.00].
Since Q1 is the identity matrix, we get Q2 = I and R2 = R1, so that successive Ai are equal to A1. Thus the eigenvalues of A are approximately 3 and 2 (in fact, they are 3, 2, and 0).
7. Following the process in Exercise 4,
A = QR = [2/√5 −1/√5; −1/√5 −2/√5][√5 8/√5; 0 1/√5]   ⇒   A1 = RQ = [2/5 −21/5; −1/5 −2/5]
A1 = Q1 R1 = [2/√5 −1/√5; −1/√5 −2/√5][1/√5 −8/√5; 0 √5]   ⇒   A2 = R1 Q1 = [2 3; −1 −2] = A.
Here A2 = A (Q1 = Q^-1 and R1 = R^-1); the algorithm fails because the eigenvalues of A, which are ±1, do not have distinct absolute values.
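The failure can be reproduced numerically. Two QR steps applied to A = [2 3; −1 −2] return a matrix with the same entries up to sign; NumPy's sign conventions may differ from the hand computation, which conjugates the iterates by a diagonal ±1 matrix but changes nothing essential, so the iteration cycles instead of converging.

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [-1.0, -2.0]])   # eigenvalues +1 and -1: equal absolute values
Q, R = np.linalg.qr(A)
A1 = R @ Q
Q1, R1 = np.linalg.qr(A1)
A2 = R1 @ Q1                   # same entries as A up to sign: no convergence
```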
8. Set
B = A + 0.9I = [2.9 3; −1 −1.1].
Then
B = QR ≈ [−0.95 0.33; 0.33 0.95][−3.07 −3.19; 0 −0.06]   ⇒   B1 = RQ ≈ [1.86 −4.02; −0.02 −0.06]
B1 = Q1 R1 ≈ [−1 0.01; 0.01 1][−1.86 4.02; 0 −0.1]   ⇒   B2 = R1 Q1 ≈ [1.9 4; 0 −0.1]
B2 = Q2 R2 ≈ [−1.00 0; 0 1.00][−1.9 −4; 0 −0.1]   ⇒   B3 = R2 Q2 ≈ [1.9 −4; 0 −0.1].
Further iterations will simply reverse the sign of the upper right entry of Bi. The eigenvalues of B are approximately 1.9 and −0.1, so the eigenvalues of A are approximately 1.9 − 0.9 = 1.0 and −0.1 − 0.9 = −1.0. These are correct (see Exercise 7).
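The shift trick is easy to automate. The sketch below iterates on B = A + 0.9I, whose eigenvalues 1.9 and −0.1 have distinct absolute values, and then subtracts the shift from the diagonal to recover the eigenvalues of A.

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [-1.0, -2.0]])
B = A + 0.9 * np.eye(2)              # shift: eigenvalues become 1.9 and -0.1
Bk = B.copy()
for _ in range(60):
    Q, R = np.linalg.qr(Bk)
    Bk = R @ Q
eigs_A = np.sort(np.diag(Bk)) - 0.9  # undo the shift: approximately -1.0 and 1.0
```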
9. We prove the first equality using induction on k. For k = 1, it says that Q0 A1 = AQ0, or QA1 = AQ. Now, A1 = RQ and A = QR, so that QA1 = QRQ = (QR)Q = AQ. This proves the basis of the induction. Assume the equality holds for k = n; we prove it for k = n + 1:
Q0 Q1 Q2 · · · Qn An+1 = Q0 Q1 Q2 · · · Qn−1 (Qn An+1)
= Q0 Q1 Q2 · · · Qn−1 (Qn (Rn Qn))
= Q0 Q1 Q2 · · · Qn−1 (Qn Rn) Qn
= (Q0 Q1 Q2 · · · Qn−1 An) Qn
= (A Q0 Q1 Q2 · · · Qn−1) Qn
= A Q0 Q1 Q2 · · · Qn.
For the second equality, note that Ak = Qk Rk, so that
Q0 Q1 Q2 · · · Qk−1 Qk Rk = Q0 Q1 Q2 · · · Qk−1 Ak = A Q0 Q1 Q2 · · · Qk−1.
Multiply both sides by Rk−1 · · · R2 R1 to get
(Q0 Q1 Q2 · · · Qk−1 Qk)(Rk Rk−1 · · · R2 R1) = A(Q0 Q1 Q2 · · · Qk−1)(Rk−1 · · · R2 R1),
as desired.
For the QR decomposition of A^(k+1), we again use induction. For k = 0, the equation says that Q0 R0 = A^1 = A, which is true. Now assume the statement holds for k = n. Then using the first equation in this exercise together with the fact that An = Qn Rn and the inductive hypothesis, we get
A^(n+1) = A · A^n
= A(Q0 Q1 · · · Qn−1)(Rn−1 · · · R1 R0)
= Q0 Q1 · · · Qn−1 An Rn−1 · · · R1 R0
= Q0 Q1 · · · Qn−1 Qn Rn Rn−1 · · · R1 R0
= (Q0 Q1 · · · Qn)(Rn Rn−1 · · · R1 R0),
as desired.
5.4
Orthogonal Diagonalization of Symmetric Matrices
1. The characteristic polynomial of A is
det(A − λI) = |4−λ 1; 1 4−λ| = λ² − 8λ + 15 = (λ − 5)(λ − 3).
Thus the eigenvalues of A are λ1 = 3 and λ2 = 5. To find the eigenspace corresponding to λ1 = 3, we row-reduce
[A − 3I | 0] = [1 1 0; 1 1 0] → [1 1 0; 0 0 0].
A basis for the eigenspace is given by [−1, 1]^T.
To find the eigenspace corresponding to λ2 = 5, we row-reduce
[A − 5I | 0] = [−1 1 0; 1 −1 0] → [1 −1 0; 0 0 0].
A basis for the eigenspace is given by [1, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of each vector is √2, so setting
Q = [−1/√2 1/√2; 1/√2 1/√2]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [3 0; 0 5] = D.
2. The characteristic polynomial of A is
det(A − λI) = |−1−λ 3; 3 −1−λ| = λ² + 2λ − 8 = (λ + 4)(λ − 2).
Thus the eigenvalues of A are λ1 = −4 and λ2 = 2. To find the eigenspace corresponding to λ1 = −4, we row-reduce
[A + 4I | 0] = [3 3 0; 3 3 0] → [1 1 0; 0 0 0].
A basis for the eigenspace is given by [−1, 1]^T.
To find the eigenspace corresponding to λ2 = 2, we row-reduce
[A − 2I | 0] = [−3 3 0; 3 −3 0] → [1 −1 0; 0 0 0].
A basis for the eigenspace is given by [1, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of each vector is √2, so setting
Q = [−1/√2 1/√2; 1/√2 1/√2]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [−4 0; 0 2] = D.
3. The characteristic polynomial of A is
det(A − λI) = |1−λ √2; √2 −λ| = λ² − λ − 2 = (λ − 2)(λ + 1).
Thus the eigenvalues of A are λ1 = −1 and λ2 = 2. To find the eigenspace corresponding to λ1 = −1, we row-reduce
[A + I | 0] = [2 √2 0; √2 1 0] → [1 √2/2 0; 0 0 0].
A basis for the eigenspace is given by [−1, √2]^T.
To find the eigenspace corresponding to λ2 = 2, we row-reduce
[A − 2I | 0] = [−1 √2 0; √2 −2 0] → [1 −√2 0; 0 0 0].
A basis for the eigenspace is given by [√2, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of each vector is √3, so setting
Q = [−1/√3 √6/3; √6/3 1/√3]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [−1 0; 0 2] = D.
4. The characteristic polynomial of A is
det(A − λI) = |9−λ −2; −2 6−λ| = λ² − 15λ + 50 = (λ − 5)(λ − 10).
Thus the eigenvalues of A are λ1 = 5 and λ2 = 10. To find the eigenspace corresponding to λ1 = 5, we row-reduce
[A − 5I | 0] = [4 −2 0; −2 1 0] → [1 −1/2 0; 0 0 0].
A basis for the eigenspace is given by [1, 2]^T.
To find the eigenspace corresponding to λ2 = 10, we row-reduce
[A − 10I | 0] = [−1 −2 0; −2 −4 0] → [1 2 0; 0 0 0].
A basis for the eigenspace is given by [−2, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of each vector is √5, so setting
Q = [1/√5 −2/√5; 2/√5 1/√5]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [5 0; 0 10] = D.
5. The characteristic polynomial of A is
det(A − λI) = |5−λ 0 0; 0 1−λ 3; 0 3 1−λ| = −λ³ + 7λ² − 2λ − 40 = −(λ − 4)(λ − 5)(λ + 2).
Thus the eigenvalues of A are λ1 = 4, λ2 = 5, and λ3 = −2. To find the eigenspace corresponding to λ1 = 4, we row-reduce
[A − 4I | 0] = [1 0 0 0; 0 −3 3 0; 0 3 −3 0] → [1 0 0 0; 0 1 −1 0; 0 0 0 0].
A basis for the eigenspace is given by [0, 1, 1]^T.
To find the eigenspace corresponding to λ2 = 5, we row-reduce
[A − 5I | 0] = [0 0 0 0; 0 −4 3 0; 0 3 −4 0] → [0 1 0 0; 0 0 1 0; 0 0 0 0].
A basis for the eigenspace is given by [1, 0, 0]^T.
To find the eigenspace corresponding to λ3 = −2, we row-reduce
[A + 2I | 0] = [7 0 0 0; 0 3 3 0; 0 3 3 0] → [1 0 0 0; 0 1 1 0; 0 0 0 0].
A basis for the eigenspace is given by [0, −1, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of the first and third vectors is √2, while the second is already a unit vector, so setting
Q = [0 1 0; 1/√2 0 −1/√2; 1/√2 0 1/√2]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [4 0 0; 0 5 0; 0 0 −2] = D.
6. The characteristic polynomial of A is
det(A − λI) = |2−λ 3 0; 3 2−λ 4; 0 4 2−λ| = −λ³ + 6λ² + 13λ − 42 = −(λ + 3)(λ − 2)(λ − 7).
Thus the eigenvalues of A are λ1 = −3, λ2 = 2, and λ3 = 7. To find the eigenspace corresponding to λ1 = −3, we row-reduce
[A + 3I | 0] = [5 3 0 0; 3 5 4 0; 0 4 5 0] → [1 0 −3/4 0; 0 1 5/4 0; 0 0 0 0].
A basis for the eigenspace is given (after clearing fractions) by [3, −5, 4]^T.
To find the eigenspace corresponding to λ2 = 2, we row-reduce
[A − 2I | 0] = [0 3 0 0; 3 0 4 0; 0 4 0 0] → [1 0 4/3 0; 0 1 0 0; 0 0 0 0].
A basis for the eigenspace is given (after clearing fractions) by [−4, 0, 3]^T.
To find the eigenspace corresponding to λ3 = 7, we row-reduce
[A − 7I | 0] = [−5 3 0 0; 3 −5 4 0; 0 4 −5 0] → [1 0 −3/4 0; 0 1 −5/4 0; 0 0 0 0].
A basis for the eigenspace is given (after clearing fractions) by [3, 5, 4]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norms of these vectors are respectively 5√2, 5, and 5√2, so setting
Q = [3/(5√2) −4/5 3/(5√2); −1/√2 0 1/√2; 4/(5√2) 3/5 4/(5√2)]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [−3 0 0; 0 2 0; 0 0 7] = D.
7. The characteristic polynomial of A is
det(A − λI) = |1−λ 0 −1; 0 1−λ 0; −1 0 1−λ| = −λ³ + 3λ² − 2λ = −λ(λ − 1)(λ − 2).
Thus the eigenvalues of A are λ1 = 0, λ2 = 1, and λ3 = 2. To find the eigenspace corresponding to λ1 = 0, we row-reduce
[A − 0I | 0] = [1 0 −1 0; 0 1 0 0; −1 0 1 0] → [1 0 −1 0; 0 1 0 0; 0 0 0 0].
A basis for the eigenspace is given by [1, 0, 1]^T.
To find the eigenspace corresponding to λ2 = 1, we row-reduce
[A − I | 0] = [0 0 −1 0; 0 0 0 0; −1 0 0 0] → [1 0 0 0; 0 0 1 0; 0 0 0 0].
A basis for the eigenspace is given by [0, 1, 0]^T.
To find the eigenspace corresponding to λ3 = 2, we row-reduce
[A − 2I | 0] = [−1 0 −1 0; 0 −1 0 0; −1 0 −1 0] → [1 0 1 0; 0 1 0 0; 0 0 0 0].
A basis for the eigenspace is given by [−1, 0, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of the first and third vectors is √2, while the second is already a unit vector, so setting
Q = [1/√2 0 −1/√2; 0 1 0; 1/√2 0 1/√2]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [0 0 0; 0 1 0; 0 0 2] = D.
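With technology, the orthogonal diagonalization can be checked in one call to a symmetric eigensolver; the sketch below (not part of the text) applies it to the matrix of Exercise 7.

```python
import numpy as np

A = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0,  0.0],
              [-1.0, 0.0, 1.0]])   # the symmetric matrix from Exercise 7
evals, Q = np.linalg.eigh(A)       # eigh returns orthonormal eigenvectors for symmetric A
D = Q.T @ A @ Q                    # should be diag(0, 1, 2)
```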
8. The characteristic polynomial of A is
det(A − λI) = |1−λ 2 2; 2 1−λ 2; 2 2 1−λ| = −λ³ + 3λ² + 9λ + 5 = −(λ + 1)²(λ − 5).
Thus the eigenvalues of A are λ1 = −1 and λ2 = 5. To find the eigenspace E−1 corresponding to λ1 = −1, we row-reduce
[A + I | 0] = [2 2 2 0; 2 2 2 0; 2 2 2 0] → [1 1 1 0; 0 0 0 0; 0 0 0 0].
Thus the eigenspace E−1 is
{[−s − t, s, t]^T} = span{x1 = [−1, 1, 0]^T, x2 = [−1, 0, 1]^T}.
To find the eigenspace E5 corresponding to λ2 = 5, we row-reduce
[A − 5I | 0] = [−4 2 2 0; 2 −4 2 0; 2 2 −4 0] → [1 0 −1 0; 0 1 −1 0; 0 0 0 0].
Thus a basis for E5 is v3 = [1, 1, 1]^T.
E5 is orthogonal to E−1 as guaranteed by Theorem 5.19, so to turn this basis into an orthogonal basis we must orthogonalize the basis for E−1 above. Use Gram-Schmidt:
v1 = [−1, 1, 0]^T,
v2 = x2 − ((v1 · x2)/(v1 · v1)) v1 = [−1, 0, 1]^T − (1/2)[−1, 1, 0]^T = [−1/2, −1/2, 1]^T.
We double v2 to remove fractions, so we get the orthogonal basis
v1 = [−1, 1, 0]^T, v2 = [−1, −1, 2]^T, v3 = [1, 1, 1]^T.
The norms of these vectors are, respectively, √2, √6, and √3, so normalizing each vector and setting Q equal to the matrix whose columns are the normalized eigenvectors gives
Q = [−1/√2 −1/√6 1/√3; 1/√2 −1/√6 1/√3; 0 2/√6 1/√3],
so that
Q^-1 AQ = Q^T AQ = [−1 0 0; 0 −1 0; 0 0 5] = D.
9. The characteristic polynomial of A is
det(A − λI) = |1−λ 1 0 0; 1 1−λ 0 0; 0 0 1−λ 1; 0 0 1 1−λ| = λ⁴ − 4λ³ + 4λ² = λ²(λ − 2)².
Thus the eigenvalues of A are λ1 = 0 and λ2 = 2. To find the eigenspace E0 corresponding to λ1 = 0, we row-reduce
[A + 0I | 0] = [1 1 0 0 0; 1 1 0 0 0; 0 0 1 1 0; 0 0 1 1 0] → [1 1 0 0 0; 0 0 1 1 0; 0 0 0 0 0; 0 0 0 0 0].
Thus the eigenspace E0 is
{[−s, s, −t, t]^T} = span{x1 = [−1, 1, 0, 0]^T, x2 = [0, 0, −1, 1]^T}.
To find the eigenspace E2 corresponding to λ2 = 2, we row-reduce
[A − 2I | 0] = [−1 1 0 0 0; 1 −1 0 0 0; 0 0 −1 1 0; 0 0 1 −1 0] → [1 −1 0 0 0; 0 0 1 −1 0; 0 0 0 0 0; 0 0 0 0 0].
Thus the eigenspace E2 is
{[s, s, t, t]^T} = span{x3 = [1, 1, 0, 0]^T, x4 = [0, 0, 1, 1]^T}.
E0 is orthogonal to E2 as guaranteed by Theorem 5.19, and x1 and x2 are also orthogonal, as are x3 and x4. So these four vectors form an orthogonal basis. The norm of each vector is √2, so normalizing each vector and setting Q equal to the matrix whose columns are the normalized eigenvectors gives
Q = [−1/√2 0 1/√2 0; 1/√2 0 1/√2 0; 0 −1/√2 0 1/√2; 0 1/√2 0 1/√2],
so that
Q^-1 AQ = Q^T AQ = [0 0 0 0; 0 0 0 0; 0 0 2 0; 0 0 0 2] = D.
10. The characteristic polynomial of A is
det(A − λI) = |2−λ 0 0 1; 0 1−λ 0 0; 0 0 1−λ 0; 1 0 0 2−λ| = λ⁴ − 6λ³ + 12λ² − 10λ + 3 = (λ − 1)³(λ − 3).
Thus the eigenvalues of A are λ1 = 1 and λ2 = 3. To find the eigenspace E1 corresponding to λ1 = 1, we row-reduce
[A − I | 0] = [1 0 0 1 0; 0 0 0 0 0; 0 0 0 0 0; 1 0 0 1 0] → [1 0 0 1 0; 0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 0].
Thus the eigenspace E1 is
{[−s, t, u, s]^T} = span{[−1, 0, 0, 1]^T, [0, 1, 0, 0]^T, [0, 0, 1, 0]^T}.
Note that these vectors are mutually orthogonal.
To find the eigenspace E3 corresponding to λ2 = 3, we row-reduce
[A − 3I | 0] = [−1 0 0 1 0; 0 −2 0 0 0; 0 0 −2 0 0; 1 0 0 −1 0] → [1 0 0 −1 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 0 0].
Thus a basis for E3 is [1, 0, 0, 1]^T.
Since the eigenvectors of E1 are orthogonal to those of E3 by Theorem 5.19, and the eigenvectors of E1 are already mutually orthogonal, all we need to do is to normalize the vectors. The norms are, respectively, √2, 1, 1, and √2, so setting Q equal to the matrix whose columns are the normalized eigenvectors gives
Q = [−1/√2 0 0 1/√2; 0 1 0 0; 0 0 1 0; 1/√2 0 0 1/√2],
so that
Q^-1 AQ = Q^T AQ = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 3] = D.
11. The characteristic polynomial is
det(A − λI) = |a−λ b; b a−λ| = (a − λ)² − b².
This is zero when a − λ = ±b, so the eigenvalues are a + b and a − b. To find the corresponding eigenspaces, row-reduce:
[A − (a + b)I | 0] = [−b b 0; b −b 0] → [1 −1 0; 0 0 0]
[A − (a − b)I | 0] = [b b 0; b b 0] → [1 1 0; 0 0 0].
So bases for the two eigenspaces (which are orthogonal by Theorem 5.19) are [1, 1]^T and [−1, 1]^T. Each of these vectors has norm √2, so normalizing each gives
Q = [1/√2 −1/√2; 1/√2 1/√2],
so that
Q^-1 AQ = Q^T AQ = [a+b 0; 0 a−b].
12. The characteristic polynomial is (expanding along the second row)
det(A − λI) = |a−λ 0 b; 0 a−λ 0; b 0 a−λ| = (a − λ)((a − λ)² − b²).
This is zero when a = λ or when a − λ = ±b, so the eigenvalues are a, a + b, and a − b. To find the corresponding eigenspaces, row-reduce:
[A − aI | 0] = [0 0 b 0; 0 0 0 0; b 0 0 0] → [1 0 0 0; 0 0 1 0; 0 0 0 0]
[A − (a + b)I | 0] = [−b 0 b 0; 0 −b 0 0; b 0 −b 0] → [1 0 −1 0; 0 1 0 0; 0 0 0 0]
[A − (a − b)I | 0] = [b 0 b 0; 0 b 0 0; b 0 b 0] → [1 0 1 0; 0 1 0 0; 0 0 0 0].
So bases for the three eigenspaces (which are orthogonal by Theorem 5.19) are [0, 1, 0]^T, [1, 0, 1]^T, and [−1, 0, 1]^T.
Normalizing the second and third vectors gives
Q = [0 1/√2 −1/√2; 1 0 0; 0 1/√2 1/√2],
so that
Q^-1 AQ = Q^T AQ = [a 0 0; 0 a+b 0; 0 0 a−b].
13. (a) Since A and B are each orthogonally diagonalizable, they are each symmetric by the Spectral
Theorem. Therefore A + B is also symmetric, so it too is orthogonally diagonalizable by the
Spectral Theorem.
(b) Since A is orthogonally diagonalizable, it is symmetric by the Spectral Theorem. Therefore cA is
also symmetric, so it too is orthogonally diagonalizable by the Spectral Theorem.
(c) Since A is orthogonally diagonalizable, it is symmetric by the Spectral Theorem. Therefore
A² = AA is also symmetric, so it too is orthogonally diagonalizable by the Spectral Theorem.
14. If A is invertible and orthogonally diagonalizable, then Q^T AQ = D for some invertible diagonal matrix D and orthogonal matrix Q, so that
D^-1 = (Q^T AQ)^-1 = Q^-1 A^-1 (Q^T)^-1 = Q^T A^-1 Q.
Thus A^-1 is also orthogonally diagonalizable (and by the same orthogonal matrix).
15. Since A and B are orthogonally diagonalizable, they are both symmetric. Since AB = BA, Exercise
36 in Section 3.2 implies that AB is also symmetric, so it is orthogonally diagonalizable as well.
16. Let A be a symmetric matrix. Then it is orthogonally diagonalizable, say D = Q^T AQ. Then the diagonal entries of D are the eigenvalues of A. First, if every eigenvalue of A is nonnegative, let
D = diag(λ1, . . . , λn), and define √D = diag(√λ1, . . . , √λn).
Let B = Q √D Q^T. Then B is orthogonally diagonalizable by construction, so it is symmetric. Further, since Q is orthogonal,
B² = Q √D Q^T Q √D Q^T = Q √D Q^-1 Q √D Q^T = Q √D √D Q^T = Q D Q^T = A.
Conversely, suppose A = B² for some symmetric matrix B. Then B is orthogonally diagonalizable, say Q^T BQ = D, so that
Q^T AQ = Q^T B² Q = (Q^T BQ)(Q^T BQ) = D².
But D² is a diagonal matrix whose diagonal entries are the squares of the diagonal entries of D, so they are all nonnegative. Thus the eigenvalues of A are all nonnegative.
17. From Exercise 1 we have
λ1 = 3, q1 = [−1/√2, 1/√2]^T;   λ2 = 5, q2 = [1/√2, 1/√2]^T.
Therefore
A = λ1 q1 q1^T + λ2 q2 q2^T = 3 [1/2 −1/2; −1/2 1/2] + 5 [1/2 1/2; 1/2 1/2].
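The spectral decomposition of Exercise 17 can be verified by reassembling A = λ1 q1 q1^T + λ2 q2 q2^T and checking that the result is the matrix A = [4 1; 1 4] of Exercise 1:

```python
import numpy as np

lam = [3.0, 5.0]
q = [np.array([-1.0, 1.0]) / np.sqrt(2),   # unit eigenvector for lambda = 3
     np.array([1.0, 1.0]) / np.sqrt(2)]    # unit eigenvector for lambda = 5
A = sum(l * np.outer(v, v) for l, v in zip(lam, q))   # spectral decomposition
```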
18. From Exercise 2 we have
λ1 = −4, q1 = [−1/√2, 1/√2]^T;   λ2 = 2, q2 = [1/√2, 1/√2]^T.
Therefore
A = λ1 q1 q1^T + λ2 q2 q2^T = −4 [1/2 −1/2; −1/2 1/2] + 2 [1/2 1/2; 1/2 1/2].
19. From Exercise 5 we have
λ1 = 4, q1 = [0, 1/√2, 1/√2]^T;   λ2 = 5, q2 = [1, 0, 0]^T;   λ3 = −2, q3 = [0, −1/√2, 1/√2]^T.
Therefore
A = λ1 q1 q1^T + λ2 q2 q2^T + λ3 q3 q3^T = 4 [0 0 0; 0 1/2 1/2; 0 1/2 1/2] + 5 [1 0 0; 0 0 0; 0 0 0] − 2 [0 0 0; 0 1/2 −1/2; 0 −1/2 1/2].
20. From Exercise 8 we have
λ1 = −1, q1 = [−1/√2, 1/√2, 0]^T, q2 = [−1/√6, −1/√6, 2/√6]^T;   λ2 = 5, q3 = [1/√3, 1/√3, 1/√3]^T.
Therefore
A = λ1 q1 q1^T + λ1 q2 q2^T + λ2 q3 q3^T = − [1/2 −1/2 0; −1/2 1/2 0; 0 0 0] − [1/6 1/6 −1/3; 1/6 1/6 −1/3; −1/3 −1/3 2/3] + 5 [1/3 1/3 1/3; 1/3 1/3 1/3; 1/3 1/3 1/3].
21. Since v1 and v2 are already orthogonal, normalize each of them to get
q1 = [1/√2, 1/√2]^T,   q2 = [1/√2, −1/√2]^T.
Now let Q be the matrix whose columns are q1 and q2; then the desired matrix is
Q [−1 0; 0 2] Q^-1 = Q [−1 0; 0 2] Q^T = [1/√2 1/√2; 1/√2 −1/√2][−1 0; 0 2][1/√2 1/√2; 1/√2 −1/√2] = [1/2 −3/2; −3/2 1/2].
22. Since v1 and v2 are already orthogonal, normalize each of them to get
q1 = [1/√5, 2/√5]^T,   q2 = [−2/√5, 1/√5]^T.
Now let Q be the matrix whose columns are q1 and q2; then the desired matrix is
Q [3 0; 0 −3] Q^-1 = Q [3 0; 0 −3] Q^T = [1/√5 −2/√5; 2/√5 1/√5][3 0; 0 −3][1/√5 2/√5; −2/√5 1/√5] = [−9/5 12/5; 12/5 9/5].
23. Note that v1, v2, and v3 are mutually orthogonal, with norms √2, √3, and √6 respectively. Let Q be the matrix whose columns are q1 = v1/‖v1‖, q2 = v2/‖v2‖, and q3 = v3/‖v3‖; then the desired matrix is
Q [1 0 0; 0 2 0; 0 0 3] Q^-1 = Q [1 0 0; 0 2 0; 0 0 3] Q^T
= [1/√2 1/√3 −1/√6; 1/√2 −1/√3 1/√6; 0 1/√3 2/√6][1 0 0; 0 2 0; 0 0 3][1/√2 1/√2 0; 1/√3 −1/√3 1/√3; −1/√6 1/√6 2/√6]
= [5/3 −2/3 −1/3; −2/3 5/3 1/3; −1/3 1/3 8/3].
24. Note that v1, v2, and v3 are mutually orthogonal, with norms √42, √3, and √14 respectively. Let Q be the matrix whose columns are q1 = v1/‖v1‖, q2 = v2/‖v2‖, and q3 = v3/‖v3‖; then the desired matrix is
Q [1 0 0; 0 −4 0; 0 0 −4] Q^-1 = Q [1 0 0; 0 −4 0; 0 0 −4] Q^T
= [4/√42 −1/√3 2/√14; 5/√42 1/√3 −1/√14; −1/√42 1/√3 3/√14][1 0 0; 0 −4 0; 0 0 −4][4/√42 5/√42 −1/√42; −1/√3 1/√3 1/√3; 2/√14 −1/√14 3/√14]
= [−44/21 50/21 −10/21; 50/21 −43/42 −25/42; −10/21 −25/42 −163/42].
25. Since q is a unit vector, q · q = 1, so that
projW v = ((q · v)/(q · q)) q = (q · v)q = q(q · v) = q(q^T v) = (qq^T)v.
26. (a) By definition (see Section 5.2), the projection of v onto W is
projW v = ((q1 · v)/(q1 · q1)) q1 + · · · + ((qk · v)/(qk · qk)) qk
= (q1 · v)q1 + · · · + (qk · v)qk
= q1(q1 · v) + · · · + qk(qk · v)
= q1(q1^T v) + · · · + qk(qk^T v)
= (q1 q1^T + · · · + qk qk^T)v.
Therefore the matrix of the projection is indeed q1 q1^T + · · · + qk qk^T.
(b) First,
P^T = (q1 q1^T + · · · + qk qk^T)^T = (q1 q1^T)^T + · · · + (qk qk^T)^T = q1 q1^T + · · · + qk qk^T = P.
Next, since qi · qj = 0 for i ≠ j (so all the cross terms vanish) and qi · qi = qi^T qi = 1, we have
P² = (q1 q1^T)(q1 q1^T) + · · · + (qk qk^T)(qk qk^T) = q1 (q1^T q1) q1^T + · · · + qk (qk^T qk) qk^T = q1 q1^T + · · · + qk qk^T = P.
(c) We have
QQ^T = [q1 · · · qk][q1^T; . . . ; qk^T] = q1 q1^T + · · · + qk qk^T = P.
Thus rank(P) = rank(QQ^T) = rank(Q) by Theorem 3.28 in Section 3.5. But if q1, . . . , qk is an orthonormal basis, then rank(Q) = k, so that rank(P) = k as well.
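The three properties in Exercise 26 can be spot-checked numerically. The orthonormal set below is a hypothetical example, not one taken from the text; it spans a 2-dimensional subspace of R³.

```python
import numpy as np

q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)   # hypothetical orthonormal basis
q2 = np.array([0.0, 0.0, 1.0])
Q = np.column_stack([q1, q2])
P = Q @ Q.T                                   # P = q1 q1^T + q2 q2^T
```

P is symmetric and idempotent, with rank equal to the number of basis vectors.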
27. Suppose the eigenvalues are λ1, . . . , λn and that v1, . . . , vn are a corresponding orthonormal set of eigenvectors. Let Q1 = [v1 · · · vn]. Then
Q1^T A Q1 = [v1^T; . . . ; vn^T] A [v1 · · · vn] = [v1^T; . . . ; vn^T][Av1 · · · Avn] = [v1^T; . . . ; vn^T][λ1 v1 · · · λn vn] = [λ1 ∗; 0 A1] = T,
where A1 is an (n − 1) × (n − 1) matrix. The result now follows by induction since T is block upper triangular.
28. We know that if A is nilpotent, then its only eigenvalue is 0. So if A is nilpotent, then all of its
eigenvalues are real, so that by Exercise 27 there is an orthogonal matrix Q such that QT AQ is upper
triangular. But A and QT AQ have the same eigenvalues; since the eigenvalues of QT AQ are the
diagonal entries, all the diagonal entries must be zero.
5.5
Applications
1. f(x) = x^T Ax = [x y][2 3; 3 4][x; y] = 2x² + 6xy + 4y².
2. f(x) = x^T Ax = [x1 x2][5 1; 1 −1][x1; x2] = 5x1² + 2x1x2 − x2².
3. f(x) = x^T Ax = [1 6][3 −2; −2 4][1; 6] = 3 · 1² − 4 · 1 · 6 + 4 · 6² = 123.
4. f(x) = [x y z][1 0 −3; 0 2 1; −3 1 3][x; y; z] = x² + 2y² + 3z² + 0xy − 6xz + 2yz = x² + 2y² + 3z² − 6xz + 2yz.
5. From Exercise 4,
f([2, −1, 1]^T) = 2² + 2 · (−1)² + 3 · 1² − 6 · 2 · 1 + 2 · (−1) · 1 = −5.
6. f(x) = x^T Ax = [1 2 3][2 2 0; 2 0 1; 0 1 1][1; 2; 3] = 2 · 1² + 0 · 2² + 1 · 3² + 4 · 1 · 2 + 0 · 1 · 3 + 2 · 2 · 3 = 31.
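Evaluating a quadratic form x^T Ax is a one-liner with technology; the sketch below rechecks the values computed in Exercises 3 and 5.

```python
import numpy as np

def f(x, A):
    # Evaluate the quadratic form f(x) = x^T A x.
    return float(x @ A @ x)

A3 = np.array([[3.0, -2.0], [-2.0, 4.0]])     # the matrix of Exercise 3
A5 = np.array([[1.0, 0.0, -3.0],
               [0.0, 2.0, 1.0],
               [-3.0, 1.0, 3.0]])             # the matrix of Exercises 4 and 5
```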
7. The diagonal entries are 1 and 2, and the corners are each (1/2) · 6 = 3, so A = [1 3; 3 2].
8. The diagonal entries are both zero, and the corners are each (1/2) · 1 = 1/2, so A = [0 1/2; 1/2 0].
9. The diagonal entries are 3 and −1, and the corners are each (1/2) · (−3) = −3/2, so A = [3 −3/2; −3/2 −1].
10. The diagonal entries are 1, 0, and −1, while the off-diagonal entries are (1/2) · 8 = 4, 0, and (1/2) · (−6) = −3. Thus
A = [1 4 0; 4 0 −3; 0 −3 −1].
11. The diagonal entries are 5, −1, and 2, while the off-diagonal entries are (1/2) · 2 = 1, (1/2) · (−4) = −2, and (1/2) · 4 = 2. Thus
A = [5 1 −2; 1 −1 2; −2 2 2].
12. The diagonal entries are 2, −3, and 1, while the off-diagonal entries are (1/2) · 0 = 0, (1/2) · (−4) = −2, and (1/2) · 0 = 0. Thus
A = [2 0 −2; 0 −3 0; −2 0 1].
13. The matrix of this form is A = [2 −2; −2 5], so that its characteristic polynomial is
|2−λ −2; −2 5−λ| = λ² − 7λ + 6 = (λ − 1)(λ − 6).
So the eigenvalues are 1 and 6; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [2, 1]^T and v2 = [1, −2]^T. Normalizing these and constructing Q gives
Q = [2/√5 1/√5; 1/√5 −2/√5].
The new quadratic form is
f(y) = y^T Dy = [y1 y2][1 0; 0 6][y1; y2] = y1² + 6y2².
14. The matrix of this form is A = [1 4; 4 1], so that its characteristic polynomial is
|1−λ 4; 4 1−λ| = λ² − 2λ − 15 = (λ − 5)(λ + 3).
So the eigenvalues are 5 and −3; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [1, 1]^T and v2 = [1, −1]^T. Normalizing these and constructing Q gives
Q = [1/√2 1/√2; 1/√2 −1/√2].
The new quadratic form is
f(y) = y^T Dy = [y1 y2][5 0; 0 −3][y1; y2] = 5y1² − 3y2².
15. The matrix of this form is A = [7 4 4; 4 1 −8; 4 −8 1], so that its characteristic polynomial is
|7−λ 4 4; 4 1−λ −8; 4 −8 1−λ| = −(λ − 9)²(λ + 9).
So the eigenvalues are 9 (of algebraic multiplicity 2) and −9; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [2, 0, 1]^T and v2 = [2, 1, 0]^T for λ = 9 and [−1, 2, 2]^T for λ = −9. Orthogonalizing v1 and v2 gives
v2' = v2 − ((v2 · v1)/(v1 · v1)) v1 = [2, 1, 0]^T − (4/5)[2, 0, 1]^T = [2/5, 1, −4/5]^T.
Normalizing these and constructing Q gives
Q = [2/√5 2/(3√5) −1/3; 0 √5/3 2/3; 1/√5 −4/(3√5) 2/3].
The new quadratic form is
f(y) = y^T Dy = [y1 y2 y3][9 0 0; 0 9 0; 0 0 −9][y1; y2; y3] = 9y1² + 9y2² − 9y3².
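The diagonalization in Exercise 15 can be confirmed by computing Q^T AQ directly and checking that it equals diag(9, 9, −9):

```python
import numpy as np

A = np.array([[7.0, 4.0, 4.0],
              [4.0, 1.0, -8.0],
              [4.0, -8.0, 1.0]])          # the matrix of Exercise 15
s5 = np.sqrt(5.0)
Q = np.array([[2/s5, 2/(3*s5), -1/3],
              [0.0,  s5/3,      2/3],
              [1/s5, -4/(3*s5), 2/3]])    # normalized (orthogonalized) eigenvectors
D = Q.T @ A @ Q                            # should be diag(9, 9, -9)
```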
16. The matrix of this form is A = [1 −2 0; −2 1 0; 0 0 3], so that its characteristic polynomial is
|1−λ −2 0; −2 1−λ 0; 0 0 3−λ| = −(λ − 3)²(λ + 1).
So the eigenvalues are 3 (of algebraic multiplicity 2) and −1; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [1, −1, 0]^T and v2 = [0, 0, 1]^T for λ = 3 and [1, 1, 0]^T for λ = −1. Since v1 and v2 are already orthogonal, we normalize all three vectors and construct Q to get
Q = [1/√2 0 1/√2; −1/√2 0 1/√2; 0 1 0].
The new quadratic form is
f(y) = y^T Dy = [y1 y2 y3][3 0 0; 0 3 0; 0 0 −1][y1; y2; y3] = 3y1² + 3y2² − y3².
17. The matrix of this form is A = [1 −1 0; −1 0 1; 0 1 1], so that its characteristic polynomial is
|1−λ −1 0; −1 −λ 1; 0 1 1−λ| = −(λ − 2)(λ − 1)(λ + 1).
So the eigenvalues are 2, 1, and −1; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [1, −1, −1]^T, v2 = [1, 0, 1]^T, and v3 = [1, 2, −1]^T. Normalize all three vectors and construct Q to get
Q = [1/√3 1/√2 1/√6; −1/√3 0 2/√6; −1/√3 1/√2 −1/√6].
The new quadratic form is
f(x') = (x')^T Dx' = [x' y' z'][2 0 0; 0 1 0; 0 0 −1][x'; y'; z'] = 2(x')² + (y')² − (z')².
18. The matrix of this form is A = [0 1 1; 1 0 1; 1 1 0], so that its characteristic polynomial is
|−λ 1 1; 1 −λ 1; 1 1 −λ| = −(λ − 2)(λ + 1)².
So the eigenvalues are 2 and −1; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [1, 1, 1]^T for λ = 2 and v2 = [0, 1, −1]^T and v3 = [−2, 1, 1]^T for λ = −1. The latter two are already orthogonal, so simply normalize all three vectors and construct Q to get
Q = [1/√3 0 −2/√6; 1/√3 1/√2 1/√6; 1/√3 −1/√2 1/√6].
The new quadratic form is
f(x') = (x')^T Dx' = [x' y' z'][2 0 0; 0 −1 0; 0 0 −1][x'; y'; z'] = 2(x')² − (y')² − (z')².
19. The matrix of this form is A = [1 0; 0 2], so that its characteristic polynomial is
|1−λ 0; 0 2−λ| = (1 − λ)(2 − λ).
Since the eigenvalues, 1 and 2, are both positive, this is a positive definite form.
20. The matrix of this form is A = [1 −1; −1 1], so that its characteristic polynomial is
|1−λ −1; −1 1−λ| = λ(λ − 2).
Since the eigenvalues, 0 and 2, are both nonnegative, but one is zero, this is a positive semidefinite form.
21. The matrix of this form is A = [−2 1; 1 −2], so that its characteristic polynomial is
|−2−λ 1; 1 −2−λ| = (λ + 3)(λ + 1).
Since the eigenvalues, −1 and −3, are both negative, this is a negative definite form.
22. The matrix of this form is A = [1 2; 2 1], so that its characteristic polynomial is
|1−λ 2; 2 1−λ| = (λ − 3)(λ + 1).
Since the eigenvalues, −1 and 3, have opposite signs, this is an indefinite form.


23. The matrix of this form is A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]], so that its characteristic polynomial is

|2−λ  1  1;  1  2−λ  1;  1  1  2−λ| = −(λ − 1)²(λ − 4).

Since the eigenvalues, 1 and 4, are both positive, this is a positive definite form.


24. The matrix of this form is A = [[1, 0, 1], [0, 1, 0], [1, 0, 1]], so that its characteristic polynomial is

|1−λ  0  1;  0  1−λ  0;  1  0  1−λ| = −λ(λ − 1)(λ − 2).

Since the eigenvalues, 0, 1, and 2, are all nonnegative, but one of them is zero, this is a positive semidefinite form.


25. The matrix of this form is A = [[1, 2, 0], [2, 1, 0], [0, 0, −1]], so that its characteristic polynomial is

|1−λ  2  0;  2  1−λ  0;  0  0  −1−λ| = −(λ + 1)²(λ − 3).

Since the eigenvalues, −1 and 3, have opposite signs, this is an indefinite form.


26. The matrix of this form is A = [[−1, −1, −1], [−1, −1, −1], [−1, −1, −1]], so that its characteristic polynomial is

|−1−λ  −1  −1;  −1  −1−λ  −1;  −1  −1  −1−λ| = −λ²(λ + 3).

Since the eigenvalues, 0 and −3, are both nonpositive but one of them is zero, this is a negative semidefinite form.
27. By the Principal Axes Theorem, if A is the matrix of f(x), then there is an orthogonal matrix Q such that QᵀAQ = D is a diagonal matrix, and then if y = Qᵀx, we have

f(x) = xᵀAx = yᵀDy = λ₁y₁² + · · · + λₙyₙ².

• For 5.22(a), f(x) is positive definite if and only if λ₁y₁² + · · · + λₙyₙ² > 0 for all y₁, . . . , yₙ not all zero. But this happens if and only if λᵢ > 0 for all i, since if some λᵢ ≤ 0, take yᵢ = 1 and all other yⱼ = 0 to get a value for f(x) that is not positive.
• For 5.22(b), f(x) is positive semidefinite if and only if λ₁y₁² + · · · + λₙyₙ² ≥ 0 for all y₁, . . . , yₙ. But this happens if and only if λᵢ ≥ 0 for all i, since if some λᵢ < 0, take yᵢ = 1 and all other yⱼ = 0 to get a value for f(x) that is negative.
• For 5.22(c), f(x) is negative definite if and only if λ₁y₁² + · · · + λₙyₙ² < 0 for all y₁, . . . , yₙ not all zero. But this happens if and only if λᵢ < 0 for all i, since if some λᵢ ≥ 0, take yᵢ = 1 and all other yⱼ = 0 to get a value for f(x) that is nonnegative.
• For 5.22(d), f(x) is negative semidefinite if and only if λ₁y₁² + · · · + λₙyₙ² ≤ 0 for all y₁, . . . , yₙ. But this happens if and only if λᵢ ≤ 0 for all i, since if some λᵢ > 0, take yᵢ = 1 and all other yⱼ = 0 to get a value for f(x) that is positive.
• For 5.22(e), f(x) is indefinite if and only if λ₁y₁² + · · · + λₙyₙ² can be either negative or positive. But from the arguments above, we see that this can happen if and only if some λᵢ is positive and some λⱼ is negative.
28. The given matrix represents the form ax² + 2bxy + dy², which can be written in the form given in the hint:

ax² + 2bxy + dy² = a(x + (b/a)y)² + (d − b²/a)y².

This is a positive definite form if and only if the right-hand side is positive for all values of x and y not both zero. First suppose that a > 0 and det A = ad − b² > 0. Then ad > b², so that d > b²/a (using the fact that a > 0), and thus d − b²/a > 0. Now if y ≠ 0, the second term on the right-hand side is positive and the first is nonnegative; and if y = 0 and x ≠ 0, the first term is ax² > 0. Thus the form is positive definite. Next assume the form is positive definite; we must show a > 0 and det A > 0. By way of contradiction, assume a ≤ 0; then setting x = 1 and y = 0 we get a ≤ 0 for the value of the form, which contradicts the assumption that it is positive definite. Thus a > 0. Next assume that det A = ad − b² ≤ 0; then (since a > 0) d ≤ b²/a. Now setting x = −b/a and y = 1 gives d − b²/a ≤ 0 for the value of the form, which is again a contradiction. Thus a > 0 and det A > 0.
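Exercise 28's 2×2 criterion can be compared against the eigenvalue test of Exercise 27 on a few sample matrices; a quick numerical sketch (our addition):

```python
import numpy as np

# Exercise 28: [[a, b], [b, d]] is positive definite iff a > 0 and ad - b^2 > 0.
def pd_by_criterion(a, b, d):
    return a > 0 and a * d - b * b > 0

def pd_by_eigenvalues(a, b, d):
    M = np.array([[a, b], [b, d]], dtype=float)
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

# Sample matrices drawn from Exercises 19-22 style examples.
cases = [(2, 1, 2), (1, 2, 1), (-2, 1, -2), (1, -1, 1)]
agree = all(pd_by_criterion(*c) == pd_by_eigenvalues(*c) for c in cases)
```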
29. Since A = BᵀB, we have xᵀAx = xᵀBᵀBx = (Bx)ᵀ(Bx) = ‖Bx‖² ≥ 0. Now, if xᵀAx = 0, then ‖Bx‖² = 0, so Bx = 0. But since B is invertible, this implies that x = 0. Therefore xᵀAx > 0 for all x ≠ 0, so that A is positive definite.
30. Since A is symmetric and positive definite, we can find an orthogonal matrix Q and a diagonal matrix D such that A = QDQᵀ, where D = diag(λ₁, . . . , λₙ) with λᵢ > 0 for all i. Let

C = Cᵀ = diag(√λ₁, . . . , √λₙ);

then D = C² = CᵀC. This gives

A = QDQᵀ = QCᵀCQᵀ = (CQᵀ)ᵀ(CQᵀ).

Since all the λᵢ are positive, C is invertible; since Q is orthogonal, it is invertible. Thus B = CQᵀ is invertible and we are done.
31. (a) xᵀ(cA)x = c(xᵀAx) > 0 because c > 0 and A is positive definite.
(b) Since A is symmetric, xᵀA²x = xᵀAᵀAx = (Ax)ᵀ(Ax) = ‖Ax‖² ≥ 0; and since A is positive definite it is invertible, so Ax ≠ 0 when x ≠ 0. Thus xᵀA²x > 0.
(c) xᵀ(A + B)x = xᵀAx + xᵀBx > 0 since both A and B are positive definite.
(d) First note that A is invertible by Exercise 30, since it factors as A = BᵀB where B is invertible. Since A is positive definite, its eigenvalues {λᵢ} are positive; the eigenvalues of A⁻¹ are {1/λᵢ}, which are also positive. Thus A⁻¹ is positive definite.
32. From Exercise 30, we have A = QCᵀCQᵀ, where Q is orthogonal and C is a diagonal matrix. But then Cᵀ = C, so that A = QCCQᵀ = QCQᵀQCQᵀ = (QCQᵀ)(QCQᵀ) = B². B is symmetric since B = QCQᵀ is orthogonally diagonalizable, and it is positive definite since its eigenvalues are the diagonal entries of C, which are all positive.
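Exercise 32's square-root construction B = QCQᵀ can be checked numerically on any positive definite matrix; a sketch using the matrix of Exercise 23 (our addition):

```python
import numpy as np

# B = Q C Q^T with C = diag(sqrt(lambda_i)) satisfies B^2 = A and B symmetric.
A = np.array([[2.0, 1, 1],
              [1, 2, 1],
              [1, 1, 2]])
lam, Q = np.linalg.eigh(A)       # A = Q diag(lam) Q^T, Q orthogonal
C = np.diag(np.sqrt(lam))        # valid since all lam > 0 (A is pos. definite)
B = Q @ C @ Q.T                  # the positive definite square root of A
```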
33. The eigenvalues of this form, from Exercise 20, are λ₁ = 2 and λ₂ = 0, so that by Theorem 5.23, the maximum value of f subject to the constraint ‖x‖ = 1 is 2 and the minimum value is 0. These occur at the unit eigenvectors of A, which are found by row-reducing:

[A − 2I | 0] = [−1 −1 | 0; −1 −1 | 0] → [1 1 | 0; 0 0 | 0]
[A − 0I | 0] = [1 −1 | 0; −1 1 | 0] → [1 −1 | 0; 0 0 | 0].

So normalized eigenvectors are

v1 = (1/√2)(1, −1) = (1/√2, −1/√2),  v2 = (1/√2)(1, 1) = (1/√2, 1/√2).

Therefore the maximum value of 2 occurs at x = ±v1, and the minimum of 0 occurs at x = ±v2.
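The constrained extremes found in Exercise 33 can be checked by sampling the unit circle; a numerical sketch (our addition, not part of the printed solution):

```python
import numpy as np

# On the unit circle, x^T A x should range between the eigenvalues 0 and 2.
A = np.array([[1.0, -1],
              [-1, 1]])
theta = np.linspace(0, 2 * np.pi, 10001)
X = np.vstack([np.cos(theta), np.sin(theta)])   # unit vectors x(theta)
values = np.einsum("it,ij,jt->t", X, A, X)      # x^T A x at each sample
max_val, min_val = values.max(), values.min()
```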
34. The eigenvalues of this form, from Exercise 22, are λ₁ = 3 and λ₂ = −1, so that by Theorem 5.23, the maximum value of f subject to the constraint ‖x‖ = 1 is 3 and the minimum value is −1. These occur at the unit eigenvectors of A, which are found by row-reducing:

[A − 3I | 0] = [−2 2 | 0; 2 −2 | 0] → [1 −1 | 0; 0 0 | 0]
[A + I | 0] = [2 2 | 0; 2 2 | 0] → [1 1 | 0; 0 0 | 0].

So normalized eigenvectors are

v1 = (1/√2)(1, 1) = (1/√2, 1/√2),  v2 = (1/√2)(1, −1) = (1/√2, −1/√2).

Therefore the maximum value of 3 occurs at x = ±v1, and the minimum of −1 occurs at x = ±v2.
35. The eigenvalues of this form, from Exercise 23, are λ₁ = 4 and λ₂ = 1, so that by Theorem 5.23, the maximum value of f subject to the constraint ‖x‖ = 1 is 4 and the minimum value is 1. These occur at the unit eigenvectors of A, which are found by row-reducing:

[A − 4I | 0] = [−2 1 1 | 0; 1 −2 1 | 0; 1 1 −2 | 0] → [1 0 −1 | 0; 0 1 −1 | 0; 0 0 0 | 0]
[A − I | 0] = [1 1 1 | 0; 1 1 1 | 0; 1 1 1 | 0] → [1 1 1 | 0; 0 0 0 | 0; 0 0 0 | 0].
So normalized eigenvectors are

v1 = (1/√3)(1, 1, 1),  v2 = (1/√2)(1, −1, 0),  v3 = (1/√2)(1, 0, −1).

Therefore the maximum value of 4 occurs at x = ±v1, and the minimum of 1 occurs at x = ±v2 as well as at x = ±v3.
36. The eigenvalues of this form, from Exercise 24, are λ₁ = 2, λ₂ = 1, and λ₃ = 0, so that by Theorem 5.23, the maximum value of f subject to the constraint ‖x‖ = 1 is 2 and the minimum value is 0. These occur at the corresponding unit eigenvectors of A, which are found by row-reducing:

[A − 2I | 0] = [−1 0 1 | 0; 0 −1 0 | 0; 1 0 −1 | 0] → [1 0 −1 | 0; 0 1 0 | 0; 0 0 0 | 0]
[A − 0I | 0] = [1 0 1 | 0; 0 1 0 | 0; 1 0 1 | 0] → [1 0 1 | 0; 0 1 0 | 0; 0 0 0 | 0].

So normalized eigenvectors are

v1 = (1/√2)(1, 0, 1),  v2 = (1/√2)(1, 0, −1).

Therefore the maximum value of 2 occurs at x = ±v1, and the minimum of 0 occurs at x = ±v2.
37. Since λ₁ ≥ λ₂ ≥ · · · ≥ λₙ, we have

f(x) = λ₁y₁² + λ₂y₂² + · · · + λₙyₙ² ≥ λₙ(y₁² + y₂² + · · · + yₙ²) = λₙ‖y‖² = λₙ,

since we are assuming the constraint ‖y‖ = 1.
38. If qₙ is a unit eigenvector corresponding to λₙ, then Aqₙ = λₙqₙ, so that

f(qₙ) = qₙᵀAqₙ = qₙᵀ(λₙqₙ) = λₙ(qₙᵀqₙ) = λₙ.

Thus the quadratic form actually takes the value λₙ, so by property (a) of Theorem 5.23, it must be the minimum value of f(x), and it occurs when x = qₙ.
39. Divide this equation through by 25 to get

x²/25 + y²/5 = 1,

which is an ellipse, from Figure 5.15.
40. Divide through by 4 and rearrange terms to get

x²/4 − y²/4 = 1,

which is a hyperbola, from Figure 5.15.
41. Rearrange terms to get y = x² − 1, which is a parabola, from Figure 5.15.
42. Divide through by 8 and rearrange terms to get

y²/4 + x²/8 = 1,

which is an ellipse, from Figure 5.15.
43. Rearrange terms to get y² − 3x² = 1, which is a hyperbola, from Figure 5.15.
44. This is a parabola, from Figure 5.15.
45. Group the x and y terms and then complete the square, giving

(x² − 4x + 4) + (y² − 4y + 4) = −4 + 4 + 4  ⇒  (x − 2)² + (y − 2)² = 4.

This is a circle of radius 2 centered at (2, 2). Letting x′ = x − 2 and y′ = y − 2, we have (x′)² + (y′)² = 4.

[Graph: the circle of radius 2 centered at (2, 2), with the x′- and y′-axes through its center.]
46. Group the x and y terms and simplify:

(4x² − 8x) + (2y² + 12y) = −6  ⇒  4(x² − 2x) + 2(y² + 6y) = −6.

Now complete the square and simplify again:

4(x² − 2x + 1) + 2(y² + 6y + 9) = −6 + 4 + 18  ⇒  4(x − 1)² + 2(y + 3)² = 16  ⇒  (x − 1)²/4 + (y + 3)²/8 = 1.

This is an ellipse centered at (1, −3). Letting x′ = x − 1 and y′ = y + 3, we have (x′)²/4 + (y′)²/8 = 1.

[Graph: the ellipse centered at (1, −3), with the x′- and y′-axes through its center.]
47. Group the x and y terms to get 9x² − 4(y² + y) = 37. Now complete the square and simplify:

9x² − 4(y² + y + 1/4) = 37 − 1  ⇒  9x² − 4(y + 1/2)² = 36  ⇒  x²/4 − (y + 1/2)²/9 = 1.

This is a hyperbola centered at (0, −1/2). Letting x′ = x and y′ = y + 1/2, we have (x′)²/4 − (y′)²/9 = 1.

[Graph: the hyperbola centered at (0, −1/2), with the x′- and y′-axes through its center.]
48. Rearrange terms to get 3y − 13 = x² + 10x; then complete the square:

3y − 13 + 25 = x² + 10x + 25 = (x + 5)²  ⇒  3y + 12 = (x + 5)²  ⇒  y + 4 = (1/3)(x + 5)².

This is a parabola. Letting x′ = x + 5 and y′ = y + 4, we have y′ = (1/3)(x′)².

[Graph: the parabola with vertex (−5, −4), with the x′- and y′-axes through the vertex.]
49. Rearrange terms to get 4x = −2y² − 8y; then divide by 2, giving 2x = −y² − 4y. Finally, complete the square:

2x − 4 = −(y² + 4y + 4) = −(y + 2)²  ⇒  x − 2 = −(1/2)(y + 2)².

This is a parabola. Letting x′ = x − 2 and y′ = y + 2, we have x′ = −(1/2)(y′)².

[Graph: the parabola with vertex (2, −2), with the x′- and y′-axes through the vertex.]
50. Complete the square and simplify:

2y² − 3x² − 18x − 20y + 11 = 0  ⇒  2(y² − 10y + 25) − 3(x² + 6x + 9) + 11 − 50 + 27 = 0  ⇒  2(y − 5)² − 3(x + 3)² = 12  ⇒  (y − 5)²/6 − (x + 3)²/4 = 1.

This is a hyperbola. Letting x′ = x + 3 and y′ = y − 5, we have (y′)²/6 − (x′)²/4 = 1.

[Graph: the hyperbola centered at (−3, 5), with the x′- and y′-axes through its center.]
51. The matrix of this form is A = [[1, 1/2], [1/2, 1]], which has characteristic polynomial

(1 − λ)² − 1/4 = λ² − 2λ + 3/4 = (λ − 3/2)(λ − 1/2).

Therefore the associated diagonal matrix is D = diag(3/2, 1/2), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = (3/2)(x′)² + (1/2)(y′)² = 6.

Dividing through by 6 to simplify gives

(x′)²/4 + (y′)²/12 = 1,

which is an ellipse. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 3/2 and λ₂ = 1/2 by row-reduction to get

Q = [[1/√2, −1/√2], [1/√2, 1/√2]].

Then

e1 → Qe1 = (1/√2, 1/√2),  e2 → Qe2 = (−1/√2, 1/√2),

so that the axis rotation is through a 45° angle.

[Graph: the ellipse with the x′- and y′-axes rotated 45° from the x- and y-axes.]
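The rotation in Exercise 51 can be verified numerically: QᵀAQ should be diagonal, and any point on the rotated ellipse should satisfy the original equation. A sketch (our addition):

```python
import numpy as np

# Exercise 51: x^2 + xy + y^2 = 6 becomes (3/2)(x')^2 + (1/2)(y')^2 = 6.
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])
Q = np.array([[1, -1],
              [1, 1]]) / np.sqrt(2)
D = Q.T @ A @ Q                  # should be diag(3/2, 1/2)

# A point on (x')^2/4 + (y')^2/12 = 1 maps via x = Q x' back to the conic:
xp = np.array([2 * np.cos(0.7), np.sqrt(12) * np.sin(0.7)])
x = Q @ xp
value = x @ A @ x                # should equal 6
```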
52. The matrix of this form is A = [[4, 5], [5, 4]],
which has characteristic polynomial

(4 − λ)² − 25 = λ² − 8λ − 9 = (λ − 9)(λ + 1).

Therefore the associated diagonal matrix is D = diag(9, −1), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = 9(x′)² − (y′)² = 9.

Dividing through by 9 to simplify gives

(x′)² − (y′)²/9 = 1,

which is a hyperbola. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 9 and λ₂ = −1 by row-reduction to get

Q = [[1/√2, −1/√2], [1/√2, 1/√2]].

Then

e1 → Qe1 = (1/√2, 1/√2),  e2 → Qe2 = (−1/√2, 1/√2),

so that the axis rotation is through a 45° angle.

[Graph: the hyperbola with the x′- and y′-axes rotated 45° from the x- and y-axes.]
53. The matrix of this form is A = [[4, 3], [3, −4]], which has characteristic polynomial

(4 − λ)(−4 − λ) − 9 = λ² − 25 = (λ − 5)(λ + 5).

Therefore the associated diagonal matrix is D = diag(5, −5), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = 5(x′)² − 5(y′)² = 5.

Dividing through by 5 to simplify gives

(x′)² − (y′)² = 1,

which is a hyperbola. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 5 and λ₂ = −5 by row-reduction to get

Q = [[3/√10, −1/√10], [1/√10, 3/√10]].

Then

e1 → Qe1 = (3/√10, 1/√10),  e2 → Qe2 = (−1/√10, 3/√10),

so that the axis rotation is through an angle whose tangent is (1/√10)/(3/√10) = 1/3, or about 18.435°.

[Graph: the hyperbola with the x′- and y′-axes rotated about 18.4° from the x- and y-axes.]
54. The matrix of this form is A = [[3, −1], [−1, 3]], which has characteristic polynomial

(3 − λ)(3 − λ) − 1 = λ² − 6λ + 8 = (λ − 4)(λ − 2).

Therefore the associated diagonal matrix is D = diag(4, 2), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = 4(x′)² + 2(y′)² = 8.

Dividing through by 8 to simplify gives

(x′)²/2 + (y′)²/4 = 1,

which is an ellipse. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 4 and λ₂ = 2 by row-reduction to get

Q = [[1/√2, 1/√2], [−1/√2, 1/√2]].

Then

e1 → Qe1 = (1/√2, −1/√2),  e2 → Qe2 = (1/√2, 1/√2),

so that the axis rotation is through a −45° angle.

[Graph: the ellipse with the x′- and y′-axes rotated −45° from the x- and y-axes.]
55. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[3, −2], [−2, 3]],  B = [−28√2  22√2],  C = 84.

A has characteristic polynomial

(3 − λ)(3 − λ) − 4 = λ² − 6λ + 5 = (λ − 5)(λ − 1).

Therefore the associated diagonal matrix is D = diag(5, 1). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 5 and λ₂ = 1 by row-reduction to get

Q = [[1/√2, 1/√2], [−1/√2, 1/√2]].

Thus

BQ = [−28√2  22√2] Q = [−50  −6],

so the equation becomes

5(x′)² − 50x′ + (y′)² − 6y′ = −84  ⇒  5((x′)² − 10x′ + 25) + ((y′)² − 6y′ + 9) = −84 + 125 + 9  ⇒  5(x′ − 5)² + (y′ − 3)² = 50  ⇒  (x′ − 5)²/10 + (y′ − 3)²/50 = 1.

Letting x″ = x′ − 5 and y″ = y′ − 3 gives

(x″)²/10 + (y″)²/50 = 1,

which is an ellipse.
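When linear terms are present, as in Exercise 55, the substitution x = Qx′ turns Bx into (BQ)x′; this step can be checked numerically (a sketch, our addition):

```python
import numpy as np

# Exercise 55: verify the rotated data Q^T A Q = diag(5, 1) and B Q = [-50, -6].
A = np.array([[3.0, -2],
              [-2, 3]])
B = np.array([-28 * np.sqrt(2), 22 * np.sqrt(2)])
Q = np.array([[1, 1],
              [-1, 1]]) / np.sqrt(2)

D = Q.T @ A @ Q                  # diagonal form of the quadratic part
BQ = B @ Q                       # linear part in the rotated coordinates
```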
56. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[6, −2], [−2, 9]],  B = [−20  −10],  C = −5.

A has characteristic polynomial

(6 − λ)(9 − λ) − 4 = λ² − 15λ + 50 = (λ − 10)(λ − 5).

Therefore the associated diagonal matrix is D = diag(10, 5). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 10 and λ₂ = 5 by row-reduction to get

Q = [[−1/√5, 2/√5], [2/√5, 1/√5]].

Thus

BQ = [−20  −10] Q = [0  −10√5],

so the equation becomes

10(x′)² + 5(y′)² − 10√5 y′ = 5  ⇒  10(x′)² + 5((y′)² − 2√5 y′ + 5) = 5 + 25  ⇒  10(x′)² + 5(y′ − √5)² = 30  ⇒  (x′)²/3 + (y′ − √5)²/6 = 1.

Letting x″ = x′ and y″ = y′ − √5 gives

(x″)²/3 + (y″)²/6 = 1,

which is an ellipse.
57. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[0, 1], [1, 0]],  B = [2√2  0],  C = −1.

A has characteristic polynomial

(−λ)(−λ) − 1 = λ² − 1 = (λ − 1)(λ + 1).

Therefore the associated diagonal matrix is D = diag(1, −1). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 1 and λ₂ = −1 by row-reduction to get

Q = [[1/√2, 1/√2], [1/√2, −1/√2]].

Thus

BQ = [2√2  0] Q = [2  2],

so the equation becomes

(x′)² − (y′)² + 2x′ + 2y′ = 1  ⇒  ((x′)² + 2x′ + 1) − ((y′)² − 2y′ + 1) = 1 + 1 − 1  ⇒  (x′ + 1)² − (y′ − 1)² = 1.

Letting x″ = x′ + 1 and y″ = y′ − 1 gives

(x″)² − (y″)² = 1,

which is a hyperbola.
58. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[1, −1], [−1, 1]],  B = [4√2  0],  C = −4.

A has characteristic polynomial

(1 − λ)(1 − λ) − 1 = λ² − 2λ = λ(λ − 2).

Therefore the associated diagonal matrix is D = diag(2, 0). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 2 and λ₂ = 0 by row-reduction to get

Q = [[1/√2, 1/√2], [−1/√2, 1/√2]].

Thus

BQ = [4√2  0] Q = [4  4],

so the equation becomes

2(x′)² + 4x′ + 4y′ = 4  ⇒  2y′ = −(x′)² − 2x′ + 2  ⇒  2y′ − 3 = −((x′)² + 2x′ + 1) = −(x′ + 1)²  ⇒  y′ − 3/2 = −(1/2)(x′ + 1)².

Letting x″ = x′ + 1 and y″ = y′ − 3/2 gives

y″ = −(1/2)(x″)²,

which is a parabola.
59. x² − y² = (x + y)(x − y) = 0 means that x = y or x = −y, so the graph of this degenerate conic is the pair of lines y = x and y = −x.

[Graph: the two lines y = x and y = −x.]
60. Rearranging terms gives x² + 2y² = −2, which has no solutions since the left-hand side is a sum of squares, so is always nonnegative. Therefore this is an imaginary conic.
61. The left-hand side is a sum of two squares, so it equals zero if and only if x = y = 0. The graph of this degenerate conic is the single point (0, 0).

[Graph: the single point at the origin.]
62. x² + 2xy + y² = (x + y)². Then (x + y)² = 0 is satisfied only when x + y = 0, so the graph of this degenerate conic is the line y = −x.

[Graph: the line y = −x.]
63. Rearranging terms gives

x² − 2xy + y² = 2√2(y − x), or (x − y)² = 2√2(y − x).

This is satisfied when y = x. If y ≠ x, then we can divide through by y − x to get

y − x = 2√2, or y = x + 2√2.

So the graph of this degenerate conic is the union of the lines y = x and y = x + 2√2.

[Graph: the parallel lines y = x and y = x + 2√2.]
64. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[2, 1], [1, 2]],  B = [2√2  −2√2],  C = 6.
A has characteristic polynomial

(2 − λ)(2 − λ) − 1 = λ² − 4λ + 3 = (λ − 3)(λ − 1).

Therefore the associated diagonal matrix is D = diag(3, 1). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 3 and λ₂ = 1 by row-reduction to get

Q = [[1/√2, −1/√2], [1/√2, 1/√2]].

Thus

BQ = [2√2  −2√2] Q = [0  −4],

so the equation becomes

3(x′)² + (y′)² − 4y′ = −6  ⇒  3(x′)² + ((y′)² − 4y′ + 4) = −6 + 4  ⇒  3(x′)² + (y′ − 2)² = −2.

The left-hand side is always nonnegative, so this equation has no solutions and this is an imaginary conic.
65. Let the eigenvalues of A be λ₁ and λ₂. Then there is an orthogonal matrix Q such that

QᵀAQ = D = [[λ₁, 0], [0, λ₂]].

Then by the Principal Axes Theorem, xᵀAx = (x′)ᵀDx′ = λ₁(x′)² + λ₂(y′)² = k. If k ≠ 0, then we may divide through by k to get the standard equation

(λ₁/k)(x′)² + (λ₂/k)(y′)² = 1.

(a) Since det A = det D = λ₁λ₂ < 0, the coefficients of (x′)² and (y′)² in the standard equation have opposite signs, so this curve is a hyperbola.
(b) Since det A = det D = λ₁λ₂ > 0, the coefficients of (x′)² and (y′)² in the standard equation have the same sign. If they are both positive, the curve is an ellipse; if they are positive and equal, then the ellipse is in fact a circle. If they are both negative, then the equation has no solutions, so is an imaginary conic.
(c) Since det A = det D = λ₁λ₂ = 0, at least one of the coefficients of (x′)² and (y′)² in the standard equation is zero. If the remaining coefficient is positive, say λ₁/k > 0, then we get the two parallel lines x′ = ±√(k/λ₁), while if it is zero or negative, the equation has no solutions and we get an imaginary conic.
(d) If k = 0 but det A = det D = λ₁λ₂ ≠ 0, then the equation is

λ₁(x′)² + λ₂(y′)² = 0,

and both λ₁ and λ₂ are nonzero. Then solving for y′ gives

y′ = ±√(−λ₁/λ₂) x′.

If λ₁ and λ₂ have the same sign, then this equation has only the solution x′ = y′ = 0, so the graph is just the origin. If they have opposite signs, then the graph is the two straight lines whose equations are above.
(e) If k = 0 and det A = det D = λ₁λ₂ = 0, then the equation is

λ₁(x′)² + λ₂(y′)² = 0,

and at least one of λ₁ and λ₂ is zero. If λ₁ = 0 (and λ₂ ≠ 0), we get y′ = 0, so the graph is the x′-axis, a straight line. Similarly, if λ₂ = 0, we get x′ = 0, so the graph is the y′-axis, also a straight line (if both λ₁ and λ₂ are zero, then A was the zero matrix to start, so it does not correspond to a quadratic form).
66. The matrix of this form is A = [[4, 2, 2], [2, 4, 2], [2, 2, 4]], which has characteristic polynomial

−λ³ + 12λ² − 36λ + 32 = −(λ − 2)²(λ − 8).

Therefore the associated diagonal matrix is D = diag(8, 2, 2), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = 8(x′)² + 2(y′)² + 2(z′)² = 8, or (x′)² + (y′)²/4 + (z′)²/4 = 1.

This is an ellipsoid.
67. The matrix of this form is A = [[1, 0, 0], [0, 1, −2], [0, −2, 1]], which has characteristic polynomial

−λ³ + 3λ² + λ − 3 = −(λ + 1)(λ − 1)(λ − 3).

Therefore the associated diagonal matrix is D = diag(−1, 1, 3), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = −(x′)² + (y′)² + 3(z′)² = 1,

which is a hyperboloid of one sheet.

68. The matrix of this form is A = [[−1, 2, 2], [2, −1, 2], [2, 2, −1]], which has characteristic polynomial

−λ³ − 3λ² + 9λ + 27 = −(λ + 3)²(λ − 3).
Therefore the associated diagonal matrix is D = diag(−3, −3, 3), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = −3(x′)² − 3(y′)² + 3(z′)² = 12, or (x′)²/4 + (y′)²/4 − (z′)²/4 = −1.

This is a hyperboloid of two sheets.
69. Writing the form as xᵀAx + Bx + C = 0, we have

A = [[0, 1, 0], [1, 0, 0], [0, 0, 0]],  B = [0  0  1],  C = 0.

The characteristic polynomial of A is λ − λ³ = −λ(λ − 1)(λ + 1). Computing unit eigenvectors for each eigenspace gives

v₀ = (0, 0, 1),  v₁ = (1/√2, 1/√2, 0),  v₋₁ = (−1/√2, 1/√2, 0).

Then setting

Q = [[0, 1/√2, −1/√2], [0, 1/√2, 1/√2], [1, 0, 0]],  D = diag(0, 1, −1),

we get

BQ = [1  0  0].

Then the form becomes

xᵀAx + Bx + C = 0  ⇒  (x′)ᵀDx′ + BQx′ = 0  ⇒  (y′)² − (z′)² + x′ = 0  ⇒  x′ = (z′)² − (y′)².

This is a hyperbolic paraboloid.
70. Writing the form as xᵀAx + Bx + C = 0, we have

A = [[16, 0, −12], [0, 100, 0], [−12, 0, 9]],  B = [−60  0  −80],  C = 0.

The characteristic polynomial of A is

−λ³ + 125λ² − 2500λ = −λ(λ − 25)(λ − 100).

Computing unit eigenvectors for each eigenspace gives

v₀ = (3/5, 0, 4/5),  v₂₅ = (−4/5, 0, 3/5),  v₁₀₀ = (0, 1, 0).

Then setting

Q = [[3/5, −4/5, 0], [0, 0, 1], [4/5, 3/5, 0]],  D = diag(0, 25, 100),

we get

BQ = [−100  0  0].

Then the form becomes

xᵀAx + Bx + C = 0  ⇒  (x′)ᵀDx′ + BQx′ = 0  ⇒  25(y′)² + 100(z′)² − 100x′ = 0  ⇒  x′ = (y′)²/4 + (z′)².

This is an elliptic paraboloid.
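The three-variable reductions, like Exercise 70's, follow the same pattern as the conics: QᵀAQ diagonalizes the quadratic part and BQ transforms the linear part. A numerical sketch (our addition):

```python
import numpy as np

# Exercise 70: check Q^T A Q = diag(0, 25, 100) and B Q = [-100, 0, 0].
A = np.array([[16.0, 0, -12],
              [0, 100, 0],
              [-12, 0, 9]])
B = np.array([-60.0, 0, -80])
Q = np.array([[3/5, -4/5, 0],
              [0,    0,  1],
              [4/5,  3/5, 0]])

D = Q.T @ A @ Q
BQ = B @ Q
```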
71. Writing the form as xᵀAx + Bx + C = 0, we have

A = [[1, 2, −1], [2, 1, 1], [−1, 1, −2]],  B = [−1  1  1],  C = 0.

The characteristic polynomial of A is

−λ³ + 9λ = −λ(λ − 3)(λ + 3).

Computing unit eigenvectors for each eigenspace gives

v₀ = (−1/√3, 1/√3, 1/√3),  v₃ = (1/√2, 1/√2, 0),  v₋₃ = (1/√6, −1/√6, 2/√6).

Then setting

Q = [[−1/√3, 1/√2, 1/√6], [1/√3, 1/√2, −1/√6], [1/√3, 0, 2/√6]],  D = diag(0, 3, −3),

we get

BQ = [√3  0  0].

Then the form becomes

xᵀAx + Bx + C = 0  ⇒  (x′)ᵀDx′ + BQx′ = 0  ⇒  3(y′)² − 3(z′)² + √3 x′ = 0  ⇒  x′ = √3(z′)² − √3(y′)².

This is a hyperbolic paraboloid.
72. Writing the form as xᵀAx + Bx = 15, we have

A = [[10, 0, −20], [0, 25, 0], [−20, 0, 10]],  B = [20√2  50  20√2].

The characteristic polynomial of A is

−λ³ + 45λ² − 200λ − 7500 = −(λ + 10)(λ − 25)(λ − 30).

Computing unit eigenvectors for each eigenspace gives

v₋₁₀ = (1/√2, 0, 1/√2),  v₂₅ = (0, 1, 0),  v₃₀ = (−1/√2, 0, 1/√2).
Then setting

Q = [[1/√2, 0, −1/√2], [0, 1, 0], [1/√2, 0, 1/√2]],  D = diag(−10, 25, 30),

we get

BQ = [40  50  0].

Then the form becomes

xᵀAx + Bx = 15  ⇒  (x′)ᵀDx′ + BQx′ = 15  ⇒  −10(x′)² + 25(y′)² + 30(z′)² + 40x′ + 50y′ = 15  ⇒  −10((x′)² − 4x′ + 4) + 25((y′)² + 2y′ + 1) + 30(z′)² = 15 − 40 + 25  ⇒  −10(x′ − 2)² + 25(y′ + 1)² + 30(z′)² = 0  ⇒  (x′ − 2)² = (y′ + 1)²/(2/5) + (z′)²/(1/3).

Letting x″ = x′ − 2, y″ = y′ + 1, z″ = z′ gives

(x″)² = (y″)²/(2/5) + (z″)²/(1/3).

This is an elliptic cone.
73. Writing the form as xᵀAx + Bx = 6, we have

A = [[11, 1, 4], [1, 11, −4], [4, −4, 14]],  B = [−12  12  12].

The characteristic polynomial of A is

−λ³ + 36λ² − 396λ + 1296 = −(λ − 6)(λ − 12)(λ − 18).

Computing unit eigenvectors for each eigenspace gives

v₆ = (−1/√3, 1/√3, 1/√3),  v₁₂ = (1/√2, 1/√2, 0),  v₁₈ = (1/√6, −1/√6, 2/√6).

Then setting

Q = [[−1/√3, 1/√2, 1/√6], [1/√3, 1/√2, −1/√6], [1/√3, 0, 2/√6]],  D = diag(6, 12, 18),

we get

BQ = [12√3  0  0].

Then the form becomes

xᵀAx + Bx = 6  ⇒  (x′)ᵀDx′ + BQx′ = 6  ⇒  6(x′)² + 12(y′)² + 18(z′)² + 12√3 x′ = 6  ⇒  6((x′)² + 2√3 x′ + 3) + 12(y′)² + 18(z′)² = 6 + 18  ⇒  6(x′ + √3)² + 12(y′)² + 18(z′)² = 24  ⇒  (x′ + √3)²/4 + (y′)²/2 + (z′)²/(4/3) = 1.

Letting x″ = x′ + √3, y″ = y′, z″ = z′ gives

(x″)²/4 + (y″)²/2 + (z″)²/(4/3) = 1.

This is an ellipsoid.
74. The strategy is to reduce the problem to one in which we can apply part (b) of Exercise 65. Using the hint, let v be an eigenvector corresponding to λ = a − bi, and let P = [Re v  Im v]. Set B = (PPᵀ)⁻¹. Now, B is a real matrix; it is also symmetric since

Bᵀ = ((PPᵀ)⁻¹)ᵀ = ((PPᵀ)ᵀ)⁻¹ = (PPᵀ)⁻¹ = B.

Also,

det B = det((PPᵀ)⁻¹) = 1/det(PPᵀ) = 1/(det P · det Pᵀ) = 1/(det P)² > 0.

Thus Exercise 65(b) applies to B, so that xᵀBx = k is an ellipse if both eigenvalues of B are positive (since k > 0). But by Exercise 29 in this section, B⁻¹ = PPᵀ = (Pᵀ)ᵀ(Pᵀ) is positive definite, so that B is as well (Exercise 31(d)), and therefore has positive eigenvalues. Hence the level curves xᵀBx = k are ellipses.

Next, let

C = [[a, −b], [b, a]],

so that by Theorem 4.43, A = PCP⁻¹. Note that

CᵀC = [[a, b], [−b, a]][[a, −b], [b, a]] = [[a² + b², 0], [0, a² + b²]] = |λ|² I = I,

since |λ| = 1. Now,

(Ax)ᵀB(Ax) = xᵀAᵀ(PPᵀ)⁻¹Ax
= xᵀ(PCP⁻¹)ᵀ(PPᵀ)⁻¹(PCP⁻¹)x
= xᵀ(P⁻¹)ᵀCᵀPᵀ(Pᵀ)⁻¹P⁻¹PCP⁻¹x
= xᵀ(P⁻¹)ᵀCᵀCP⁻¹x
= xᵀ(P⁻¹)ᵀP⁻¹x
= xᵀ(PPᵀ)⁻¹x = xᵀBx.

Thus the quadratics xᵀBx = k and (Ax)ᵀB(Ax) = k are the same, so that the trajectories of the dynamical system x ↦ Ax lie on ellipses as well.
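The invariance argument of Exercise 74 can be illustrated numerically for a particular choice of P and λ with |λ| = 1 (the specific P below is our own example, not from the text):

```python
import numpy as np

# A = P C P^{-1} with C a rotation (a^2 + b^2 = 1, so |lambda| = 1).
# Then x^T B x with B = (P P^T)^{-1} is unchanged by x -> Ax,
# so every trajectory stays on an ellipse x^T B x = k.
a, b = 0.6, 0.8                      # a^2 + b^2 = 1
C = np.array([[a, -b], [b, a]])
P = np.array([[2.0, 1], [1, 1]])     # any invertible real matrix
A = P @ C @ np.linalg.inv(P)
Bmat = np.linalg.inv(P @ P.T)        # the matrix B of the invariant form

x = np.array([1.0, 3.0])
before = x @ Bmat @ x
after = (A @ x) @ Bmat @ (A @ x)     # should equal `before`
```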
Chapter Review
1. (a) True. See Theorem 5.1 and the definition preceding it.
(b) True. See Theorem 5.15; the Gram-Schmidt Process allows us to turn any basis into an orthonormal basis, showing that any subspace has such a basis.
(c) True. See the definition preceding Theorem 5.5 in Section 5.1.
(d) True. See Theorem 5.5.
(e) False. If Q is an orthogonal matrix, then |det Q| = 1. However, there are many matrices with determinant 1 that are not orthogonal. For example,

[[1, 1], [0, 1]],  [[2, 5], [1, 3]].

(f) True. By Theorem 5.10, (row(A))⊥ = null(A). But if (row(A))⊥ = Rⁿ, then null(A) = Rⁿ, so that Ax = 0 for every x, and therefore A = O (since Aeᵢ = 0 for every i, showing that every column of A is zero).
(g) False. For example, let W be the subset of R² spanned by e1, and let v = e2. Then projW(v) = 0, but v ≠ 0.
(h) True. If A is orthogonal, then A⁻¹ = Aᵀ; but if A is also symmetric, then Aᵀ = A. Thus A⁻¹ = A, so that A² = AA⁻¹ = I.
(i) False. For example, any diagonal matrix is orthogonally diagonalizable, since it is already diagonal. But if it has a zero entry on the diagonal, it is not invertible.
(j) True. For example, the diagonal matrix with λ₁, λ₂, . . . , λₙ on the diagonal has this property.
2. The first pair of vectors is already orthogonal, since 1 · 4 + 2 · 1 + 3 · (−2) = 0. The other two pairs are orthogonal if and only if

1 · a + 2 · b + 3 · 3 = 0  ⇒  a + 2b = −9
4 · a + 1 · b − 2 · 3 = 0  ⇒  4a + b = 6.

Forming and row-reducing the augmented matrix gives

[1 2 | −9; 4 1 | 6] → [1 0 | 3; 0 1 | −6].

So the unique solution is a = 3 and b = −6; these are the only values of a and b for which these vectors form an orthogonal set.
3. Let u1 = (1, 0, 1), u2 = (1, 1, −1), and u3 = (−1, 2, 1). Note that these vectors form an orthogonal set. Therefore using Theorem 5.2, we write

v = c1 u1 + c2 u2 + c3 u3
= ((v · u1)/(u1 · u1)) u1 + ((v · u2)/(u2 · u2)) u2 + ((v · u3)/(u3 · u3)) u3
= ((7 · 1 − 3 · 0 + 2 · 1)/(1 · 1 + 0 · 0 + 1 · 1)) u1 + ((7 · 1 − 3 · 1 + 2 · (−1))/(1 · 1 + 1 · 1 − 1 · (−1))) u2 + ((7 · (−1) − 3 · 2 + 2 · 1)/(−1 · (−1) + 2 · 2 + 1 · 1)) u3
= (9/2) u1 + (2/3) u2 − (11/6) u3.

So the coordinate vector is

[v]_B = (9/2, 2/3, −11/6).
4. We first find v2 . Since v1 and v2 =
x
are orthogonal, we have
y
3
4
x + y = 0,
5
5
4
or x = − y.
3
25 2
3
4
2
2
But also v2 is a unit vector, so that x2 + y 2 = 16
9 y + y = 9 y = 1. Therefore y = ± 5 and x = ∓ 5 .
h
i
• If v2 = − 45 35 , then
• If v2 =
h
4
5
1
v = −3v1 + v2 = −3
2
" #
3
5
4
5
" # "
#
− 11
1 − 54
5
+
=
.
3
2
− 21
5
10
1
v = −3v1 + v2 = −3
2
" #
#
" # "
4
− 57
1
5
+
=
.
2 − 35
− 27
10
i
− 35 , then
3
5
4
5
5. Using Theorem 5.5, we show that the matrix is orthogonal by showing that QQᵀ = I:

QQᵀ = [[6/7, 3/7, 2/7], [−1/√5, 2/√5, 0], [4/(7√5), 2/(7√5), −15/(7√5)]] · [[6/7, −1/√5, 4/(7√5)], [3/7, 2/√5, 2/(7√5)], [2/7, 0, −15/(7√5)]] = [[1, 0, 0], [0, 1, 0], [0, 0, 1]] = I.
6. Let Q be the given matrix. If Q is to be orthogonal, then by Theorem 5.5, we must have

QQᵀ = [[a, 1/2], [b, c]] [[a, b], [1/2, c]] = [[a² + 1/4, ab + c/2], [ab + c/2, b² + c²]] = [[1, 0], [0, 1]].

Therefore we must have

a² + 1/4 = 1,  ab + (1/2)c = 0,  b² + c² = 1.

The first equation gives a = ±√3/2. Substituting this into the second equation gives

±(√3/2)b + (1/2)c = 0  ⇒  c = ∓√3 b  ⇒  c² = 3b².

But b² + c² = 1 and c² = 3b² means that 4b² = 1, so that b = ±1/2. So there are two possibilities:

• If a = √3/2, then c = −√3 b, so we get the two matrices

[[√3/2, 1/2], [1/2, −√3/2]],  [[√3/2, 1/2], [−1/2, √3/2]].

• If a = −√3/2, then c = √3 b, so we get the two matrices

[[−√3/2, 1/2], [1/2, √3/2]],  [[−√3/2, 1/2], [−1/2, −√3/2]].
7. We must show that Qvᵢ · Qvⱼ = 0 for i ≠ j and that Qvᵢ · Qvᵢ = 1. Since Q is orthogonal, we know that QQᵀ = QᵀQ = I. Then for any i and j (possibly equal) we have

Qvᵢ · Qvⱼ = (Qvᵢ)ᵀQvⱼ = vᵢᵀ(QᵀQ)vⱼ = vᵢᵀvⱼ = vᵢ · vⱼ.

Thus if i ≠ j, the dot product is zero since vᵢ · vⱼ = 0 for i ≠ j, and if i = j, then the dot product is 1 since vᵢ · vᵢ = 1.
8. The given condition means that for all vectors x and y,

(x · y)/(‖x‖ ‖y‖) = (Qx · Qy)/(‖Qx‖ ‖Qy‖).

Let qᵢ be the ith column of Q. Then qᵢ = Qeᵢ. Since the eᵢ are orthonormal and the qᵢ are unit vectors, we have

qᵢ · qⱼ = (qᵢ · qⱼ)/(‖qᵢ‖ ‖qⱼ‖) = (eᵢ · eⱼ)/(‖eᵢ‖ ‖eⱼ‖) = eᵢ · eⱼ = 0 if i ≠ j, and 1 if i = j.

It follows that Q is an orthogonal matrix.
9. W is the column space of A = [5; 2], so that W⊥ is the null space of Aᵀ = [5  2]. Row-reduce the augmented matrix:

[5 2 | 0] → [1 2/5 | 0],

so that a basis for the null space, and thus for W⊥, is given by

(−2/5, 1), or (−2, 5).

10. W is the column space of A = [1; 2; −1], so that W⊥ is the null space of Aᵀ = [1  2  −1]. The augmented matrix is already row-reduced:

[1 2 −1 | 0],

so that the null space, which is W⊥, is the set

{(−2s + t, s, t)} = {s(−2, 1, 0) + t(1, 0, 1)} = span((−2, 1, 0), (1, 0, 1)).

These two vectors form a basis for W⊥.
11. Let A be the matrix whose columns are the given basis of W. Then by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). Row-reducing Aᵀ, we get

[1 −1 4 | 0; 0 1 −3 | 0] → [1 0 1 | 0; 0 1 −3 | 0],

so that a basis for W⊥ is

(−1, 3, 1).
12. Let A be the matrix whose columns are the given basis of W. Then by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). Row-reducing Aᵀ gives

[1 0 1 3/2 | 0; 0 1 0 −1/2 | 0],

so that

W⊥ = {(−s − (3/2)t, (1/2)t, s, t)} = {s(−1, 0, 1, 0) + (t/2)(−3, 1, 0, 2)} = span((−1, 0, 1, 0), (−3, 1, 0, 2)).
13. Row-reducing A gives

   A → [1 0 2 3 4; 0 1 0 2 1; 0 0 0 0 0; 0 0 0 0 0],

so that a basis for row(A) is given by the nonzero rows of the reduced matrix, or

   [1 0 2 3 4], [0 1 0 2 1].

A basis for col(A) is given by the columns of A corresponding to leading 1s in the reduced matrix, so one basis is

   (1, −1, 2, 3), (−1, 2, 1, −5).

A basis for null(A) is given by the solution set of Ax = 0, which, from the reduced form, is

   {(−2s − 3t − 4u, −2t − u, s, t, u)} = span{(−2, 0, 1, 0, 0), (−3, −2, 0, 1, 0), (−4, −1, 0, 0, 1)}.

Finally, to determine null(AT), we row-reduce AT:

   [1 −1 2 3; −1 2 1 −5; 2 −2 4 6; 1 1 8 −1; 3 −2 9 7] → [1 0 5 1; 0 1 3 −2; 0 0 0 0; 0 0 0 0; 0 0 0 0].

Then

   null(AT) = {(−5s − t, −3s + 2t, s, t)} = span{(−5, −3, 1, 0), (−1, 2, 0, 1)}.
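The four claimed bases can be cross-checked by direct computation. The sketch below (not part of the printed solution; it uses exact rational arithmetic from Python's `fractions` module, and the matrix A reconstructed from this exercise) verifies that each claimed basis vector of null(A) and null(AT) really is annihilated by A or AT:

```python
from fractions import Fraction

# The matrix of Exercise 13 (as reconstructed from the solution's row reductions).
A = [[ 1, -1,  2,  1,  3],
     [-1,  2, -2,  1, -2],
     [ 2,  1,  4,  8,  9],
     [ 3, -5,  6, -1,  7]]

def mat_vec(M, v):
    # Exact matrix-vector product using rational arithmetic.
    return [sum(Fraction(M[i][j]) * v[j] for j in range(len(v)))
            for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

# Claimed bases: every v here should satisfy A v = 0, every w should satisfy A^T w = 0.
null_A  = [[-2, 0, 1, 0, 0], [-3, -2, 0, 1, 0], [-4, -1, 0, 0, 1]]
null_AT = [[-5, -3, 1, 0], [-1, 2, 0, 1]]

ok = all(all(x == 0 for x in mat_vec(A, v)) for v in null_A) and \
     all(all(x == 0 for x in mat_vec(transpose(A), w)) for w in null_AT)
print(ok)  # True
```

Since rank(A) = 2, the three vectors for null(A) and the two for null(AT) have the right counts to be bases once they are checked to be independent, which is clear from the positions of their trailing 1s.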
14. The orthogonal decomposition of v with respect to W is
v = projW (v) + (v − projW (v)),
446
CHAPTER 5. ORTHOGONALITY
so we must find projW(v). Let the three given vectors in W be w1, w2, and w3. Then

   projW(v) = [(w1 · v)/(w1 · w1)] w1 + [(w2 · v)/(w2 · w2)] w2 + [(w3 · v)/(w3 · w3)] w3
            = [(0·1 + 1·0 + 1·(−1) + 1·2)/(0·0 + 1·1 + 1·1 + 1·1)] (0, 1, 1, 1)
              + [(1·1 + 0·0 + 1·(−1) − 1·2)/(1·1 + 0·0 + 1·1 − 1·(−1))] (1, 0, 1, −1)
              + [(3·1 + 1·0 − 2·(−1) + 1·2)/(3·3 + 1·1 − 2·(−2) + 1·1)] (3, 1, −2, 1)
            = (1/3)(0, 1, 1, 1) − (2/3)(1, 0, 1, −1) + (7/15)(3, 1, −2, 1)
            = (11/15, 4/5, −19/15, 22/15).

Then

   v − projW(v) = (1, 0, −1, 2) − (11/15, 4/5, −19/15, 22/15) = (4/15, −4/5, 4/15, 8/15).
15. (a) Using Gram-Schmidt, we get

   v1 = x1 = (1, 1, 1, 1)

   v2 = x2 − [(x2 · v1)/(v1 · v1)] v1
      = (1, 1, 1, 0) − [(1·1 + 1·1 + 1·1 + 0·1)/(1·1 + 1·1 + 1·1 + 1·1)] (1, 1, 1, 1)
      = (1, 1, 1, 0) − (3/4)(1, 1, 1, 1) = (1/4, 1/4, 1/4, −3/4)

   v3 = x3 − [(x3 · v1)/(v1 · v1)] v1 − [(x3 · v2)/(v2 · v2)] v2
      = (0, 1, 1, 1) − (3/4)(1, 1, 1, 1) − [(−1/4)/(3/4)] (1/4, 1/4, 1/4, −3/4)
      = (0, 1, 1, 1) − (3/4)(1, 1, 1, 1) + (1/3)(1/4, 1/4, 1/4, −3/4)
      = (−2/3, 1/3, 1/3, 0).
(b) To find a QR factorization, we normalize the vectors from (a) and use those as the column vectors of Q. Then we set R = QT A. Normalizing v1, v2, and v3 yields

   q1 = v1/‖v1‖ = (1/2)(1, 1, 1, 1) = (1/2, 1/2, 1/2, 1/2)
   q2 = v2/‖v2‖ = (2/√3)(1/4, 1/4, 1/4, −3/4) = (√3/6, √3/6, √3/6, −√3/2)
   q3 = v3/‖v3‖ = (3/√6)(−2/3, 1/3, 1/3, 0) = (−√6/3, √6/6, √6/6, 0).

Then

   Q = [1/2 √3/6 −√6/3; 1/2 √3/6 √6/6; 1/2 √3/6 √6/6; 1/2 −√3/2 0],

and

   R = QT A = QT [1 1 0; 1 1 1; 1 1 1; 1 0 1] = [2 3/2 3/2; 0 √3/2 −√3/6; 0 0 √6/3].

Then A = QR is a QR factorization of A.
16. Note that the two given vectors,

   v1 = (1, 0, 2, 2) and v2 = (0, 1, 1, −1),

are indeed orthogonal. We extend these to a basis by adjoining x3 = e3 and x4 = e4. We can verify that this is a basis by computing the determinant of the matrix whose column vectors are these four vectors; this determinant is 1, so the matrix is invertible and therefore the vectors are linearly independent, so they do in fact
form a basis for R4. We apply Gram-Schmidt to x3 and x4:

   v3 = x3 − [(x3 · v1)/(v1 · v1)] v1 − [(x3 · v2)/(v2 · v2)] v2
      = (0, 0, 1, 0) − [(0·1 + 0·0 + 1·2 + 0·2)/(1² + 0² + 2² + 2²)] (1, 0, 2, 2)
        − [(0·0 + 0·1 + 1·1 + 0·(−1))/(0² + 1² + 1² + (−1)²)] (0, 1, 1, −1)
      = (0, 0, 1, 0) − (2/9)(1, 0, 2, 2) − (1/3)(0, 1, 1, −1)
      = (−2/9, −1/3, 2/9, −1/9)

   v4 = x4 − [(x4 · v1)/(v1 · v1)] v1 − [(x4 · v2)/(v2 · v2)] v2 − [(x4 · v3)/(v3 · v3)] v3
      = (0, 0, 0, 1) − (2/9)(1, 0, 2, 2) + (1/3)(0, 1, 1, −1) + (1/2)(−2/9, −1/3, 2/9, −1/9)
      = (−1/3, 1/6, 0, 1/6).
17. If x1 + x2 + x3 + x4 = 0, then x4 = −x1 − x2 − x3, so that a basis for W is given by

   B = {x1 = (1, 0, 0, −1), x2 = (0, 1, 0, −1), x3 = (0, 0, 1, −1)}.

Then using Gram-Schmidt,

   v1 = x1 = (1, 0, 0, −1)

   v2 = x2 − [(x2 · v1)/(v1 · v1)] v1
      = (0, 1, 0, −1) − [(1·0 + 0·1 + 0·0 + (−1)·(−1))/(1² + 0² + 0² + (−1)²)] (1, 0, 0, −1)
      = (0, 1, 0, −1) − (1/2)(1, 0, 0, −1) = (−1/2, 1, 0, −1/2)
   v3 = x3 − [(x3 · v1)/(v1 · v1)] v1 − [(x3 · v2)/(v2 · v2)] v2
      = (0, 0, 1, −1) − (1/2)(1, 0, 0, −1) − [(1/2)/(3/2)] (−1/2, 1, 0, −1/2)
      = (0, 0, 1, −1) − (1/2)(1, 0, 0, −1) − (1/3)(−1/2, 1, 0, −1/2)
      = (−1/3, −1/3, 1, −1/3).
18. (a) The characteristic polynomial of A is

   det(A − λI) = |2−λ 1 −1; 1 2−λ 1; −1 1 2−λ| = −λ³ + 6λ² − 9λ = −λ(λ − 3)²,

so the eigenvalues are 0 and 3. To find the eigenspaces, we row-reduce [A − λI | 0]:

   [A − 0I | 0] = [2 1 −1 | 0; 1 2 1 | 0; −1 1 2 | 0] → [1 0 −1 | 0; 0 1 1 | 0; 0 0 0 | 0]
   [A − 3I | 0] = [−1 1 −1 | 0; 1 −1 1 | 0; −1 1 −1 | 0] → [1 −1 1 | 0; 0 0 0 | 0; 0 0 0 | 0].

Thus the eigenspaces are

   E0 = span{v1 = (1, −1, 1)},   E3 = span{v2 = (1, 1, 0), x3 = (−1, 0, 1)}.

The vectors in E0 are orthogonal to those in E3 since they correspond to different eigenvalues, but the basis vectors above in E3 are not orthogonal, so we use Gram-Schmidt to convert the second basis vector into one orthogonal to the first:

   v3 = (−1, 0, 1) − [(1·(−1) + 1·0 + 0·1)/(1·1 + 1·1 + 0·0)] (1, 1, 0) = (−1/2, 1/2, 1).

Finally, normalize v1, v2, and v3 and let Q be the matrix whose columns are those vectors; then Q is the orthogonal matrix

   Q = [1/√3 1/√2 −1/√6; −1/√3 1/√2 1/√6; 1/√3 0 2/√6].

Then an orthogonal diagonalization of A is given by

   QT AQ = D = [0 0 0; 0 3 0; 0 0 3].
(b) From part (a), we have λ1 = 0 and λ2 = λ3 = 3, with

   q1 = (1/√3, −1/√3, 1/√3),   q2 = (1/√2, 1/√2, 0),   q3 = (−1/√6, 1/√6, 2/√6).

The spectral decomposition of A is

   A = λ1 q1 q1T + λ2 q2 q2T + λ3 q3 q3T
     = 3 q2 q2T + 3 q3 q3T
     = 3 [1/2 1/2 0; 1/2 1/2 0; 0 0 0] + 3 [1/6 −1/6 −1/3; −1/6 1/6 1/3; −1/3 1/3 2/3].
19. See Example 5.20, or Exercises 23 and 24 in Section 5.4. The basis vector for E−2, which we name v3 = (1, −1, 0), is orthogonal to the other two since they correspond to different eigenvalues. However, we must orthogonalize the given basis of the other eigenspace using Gram-Schmidt:

   v1 = (1, 1, 0),   v2 = (1, 1, 1) − [(1·1 + 1·1 + 1·0)/(1² + 1² + 0²)] (1, 1, 0) = (0, 0, 1).

Now v1, v2, and v3 are mutually orthogonal, with norms √2, 1, and √2 respectively. Let Q be the matrix whose columns are q1 = v1/‖v1‖, q2 = v2/‖v2‖, and q3 = v3/‖v3‖; then the desired matrix is

   Q [1 0 0; 0 1 0; 0 0 −2] QT
     = [1/√2 0 1/√2; 1/√2 0 −1/√2; 0 1 0] [1 0 0; 0 1 0; 0 0 −2] [1/√2 1/√2 0; 0 0 1; 1/√2 −1/√2 0]
     = [−1/2 3/2 0; 3/2 −1/2 0; 0 0 1].

20. vi viT is symmetric since

   (vi viT)T = (viT)T viT = vi viT,
so it is equal to its own transpose. Since a scalar times a symmetric matrix is a symmetric matrix, and
the sum of symmetric matrices is symmetric, it follows that A is symmetric. Next,
   Avj = (Σⁿᵢ₌₁ ci vi viT) vj = Σⁿᵢ₌₁ ci vi (viT vj) = Σⁿᵢ₌₁ ci vi (vi · vj) = cj vj
since {vi } is a set of orthonormal vectors. This shows that each cj is an eigenvalue for A with
corresponding eigenvector vj . Since there are n of them, these are all the eigenvalues and eigenvectors.
Chapter 6
Vector Spaces
6.1 Vector Spaces and Subspaces
1. Let V = {(x, x)}; V is the set of all vectors in R² whose first and second components are the same. We verify all ten axioms of a vector space:

1. If (x, x), (y, y) ∈ V then (x, x) + (y, y) = (x + y, x + y) ∈ V since its first and second components are the same.
2. (x, x) + (y, y) = (x + y, x + y) = (y + x, y + x) = (y, y) + (x, x).
3. ((x, x) + (y, y)) + (z, z) = (x + y, x + y) + (z, z) = (x + y + z, x + y + z) = (x, x) + (y + z, y + z) = (x, x) + ((y, y) + (z, z)).
4. (0, 0) ∈ V, and (x, x) + (0, 0) = (x + 0, x + 0) = (x, x) ∈ V.
5. If (x, x) ∈ V, then (−x, −x) ∈ V, and (x, x) + (−x, −x) = (x − x, x − x) = (0, 0).
6. If (x, x) ∈ V, then c(x, x) = (cx, cx) ∈ V since its first and second components are the same.
7. c((x, x) + (y, y)) = c(x + y, x + y) = (c(x + y), c(x + y)) = (cx + cy, cx + cy) = c(x, x) + c(y, y).
8. (c + d)(x, x) = ((c + d)x, (c + d)x) = (cx + dx, cx + dx) = c(x, x) + d(x, x).
9. c(d(x, x)) = c(dx, dx) = (c(dx), c(dx)) = ((cd)x, (cd)x) = (cd)(x, x).
10. 1(x, x) = (1x, 1x) = (x, x).

2. V = {(x, y) : x, y ≥ 0} fails to satisfy Axioms 5 and 6, so it is not a vector space:

5. If (x, y) ∈ V is nonzero, then −(x, y) = (−x, −y) ∉ V since at least one of −x and −y is negative.
6. If (x, y) ∈ V is nonzero, choose c < 0; then c(x, y) = (cx, cy) ∉ V since at least one of cx and cy is negative.
3. V = {(x, y) : xy ≥ 0} fails to satisfy Axiom 1, so it is not a vector space:

1. Choose x ≠ y with xy ≥ 0. Then (x, y) and (−y, −x) are both in V, but

   (x, y) + (−y, −x) = (x − y, y − x) = (x − y, −(x − y)) ∉ V

since (x − y)(−(x − y)) = −(x − y)² < 0 because x ≠ y.

4. V = {(x, y) : x ≥ y} fails to satisfy Axioms 5 and 6, so it is not a vector space:

5. If (x, y) ∈ V and x ≠ y, then x > y, so that −x < −y. Therefore (−x, −y) ∉ V.
6. Again let (x, y) ∈ V with x ≠ y, and choose c = −1. Then c(x, y) = (−x, −y) ∉ V as above.
5. R² with the given scalar multiplication fails to satisfy Axiom 8, so it is not a vector space:

8. Choose (x, y) ∈ R² with y ≠ 0, and let c and d be nonzero scalars. Then

   (c + d)(x, y) = ((c + d)x, y) = (cx + dx, y), while c(x, y) + d(x, y) = (cx, y) + (dx, y) = (cx + dx, 2y).

These two are unequal since we chose y ≠ 0.
6. R² with the given addition fails to satisfy Axiom 7, so it is not a vector space:

7. Choose (x1, y1), (x2, y2) ∈ R² and let c be a scalar not equal to 1. Then

   c((x1, y1) + (x2, y2)) = c(x1 + x2 + 1, y1 + y2 + 1) = (cx1 + cx2 + c, cy1 + cy2 + c), while
   c(x1, y1) + c(x2, y2) = (cx1, cy1) + (cx2, cy2) = (cx1 + cx2 + 1, cy1 + cy2 + 1).

These two are unequal since we chose c ≠ 1.
7. All axioms hold, so this set is a vector space. Let x, y, and z be positive real numbers, so that x, y, z ∈ V, and let c and d be any real numbers. Then

1. x ⊕ y = xy > 0, so that x ⊕ y ∈ V.
2. x ⊕ y = xy = yx = y ⊕ x.
3. (x ⊕ y) ⊕ z = (xy) ⊕ z = (xy)z = x(yz) = x ⊕ (yz) = x ⊕ (y ⊕ z).
4. 0 = 1, since x ⊕ 0 = x · 1 = x.
5. −x is the positive real number 1/x, since x ⊕ (−x) = x · (1/x) = 1 = 0.
6. c ⊙ x = x^c > 0, so that c ⊙ x ∈ V.
7. c ⊙ (x ⊕ y) = (xy)^c = x^c y^c = x^c ⊕ y^c = (c ⊙ x) ⊕ (c ⊙ y).
8. (c + d) ⊙ x = x^(c+d) = x^c x^d = x^c ⊕ x^d = (c ⊙ x) ⊕ (d ⊙ x).
9. c ⊙ (d ⊙ x) = c ⊙ x^d = (x^d)^c = x^(cd) = (cd) ⊙ x.
10. 1 ⊙ x = x¹ = x.
8. This is not a vector space, since Axiom 6 fails. Choose any nonzero rational number q, and let c = π. Then qπ is not rational, so that V is not closed under scalar multiplication.
9. All axioms hold, so this is a vector space. Let x, y, z, u, v, w, c, and d be real numbers. Then [x y; 0 z] and [u v; 0 w] ∈ V, and
1. The sum of upper triangular matrices is again upper triangular.
2. Matrix addition is commutative.
3. Matrix addition is associative.
4. The zero matrix O is upper triangular, and [x y; 0 z] + O = [x y; 0 z].
5. −[x y; 0 z] is the matrix [−x −y; 0 −z].
6. c[x y; 0 z] = [cx cy; 0 cz] ∈ V since it is upper triangular.
7. c([x y; 0 z] + [u v; 0 w]) = c[x+u y+v; 0 z+w] = [c(x+u) c(y+v); 0 c(z+w)] = [cx+cu cy+cv; 0 cz+cw] = [cx cy; 0 cz] + [cu cv; 0 cw] = c[x y; 0 z] + c[u v; 0 w].
8. (c + d)[x y; 0 z] = [(c+d)x (c+d)y; 0 (c+d)z] = [cx+dx cy+dy; 0 cz+dz] = [cx cy; 0 cz] + [dx dy; 0 dz] = c[x y; 0 z] + d[x y; 0 z].
9. c(d[x y; 0 z]) = c[dx dy; 0 dz] = [c(dx) c(dy); 0 c(dz)] = [(cd)x (cd)y; 0 (cd)z] = (cd)[x y; 0 z].
10. 1[x y; 0 z] = [x y; 0 z].
10. This set V is not a vector space, since it is not closed under addition. For example,

   [1 0; 0 0], [0 0; 0 1] ∈ V, but [1 0; 0 0] + [0 0; 0 1] = [1 0; 0 1] ∉ V

since the product of the diagonal entries is not zero.
11. This set V is a vector space. Let A be a skew-symmetric matrix and c a constant. Then
1. The sum of skew-symmetric matrices is skew-symmetric.
2. Matrix addition is commutative.
3. Matrix addition is associative.
4. The zero matrix O is skew-symmetric, and A + O = A.
5. If A is skew-symmetric, then so is −A, since (−A)ij = −aij = aji = (A)ji = −(−A)ji .
6. If A is skew-symmetric, then so is cA, since (cA)ij = c(A)ij = −c(A)ji = −(cA)ji .
7. Scalar multiplication distributes over matrix addition.
8, 9. These properties hold for general matrices, and skew-symmetric matrices are closed under the relevant operations, so the results are in V.
10. 1A = A is skew-symmetric.
12. Axioms 1, 2, and 6 were verified in Example 6.3. For the remaining axioms, using the notation from Example 6.3, and letting r(x) = c0 + c1 x + c2 x²,
3.
(p(x) + q(x)) + r(x) = ((a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2 ) + r(x)
= (a0 + b0 + c0 ) + (a1 + b1 + c1 )x + (a2 + b2 + c2 )x2
= (a0 + a1 x + a2 x2 ) + ((b0 + c0 ) + (b1 + c1 )x + (b2 + c2 )x2 )
= p(x) + (q(x) + r(x)).
4. The zero polynomial 0 + 0x + 0x² is the zero vector, since
0 + p(x) = (0 + a0 ) + (0 + a1 )x + (0 + a2 )x² = a0 + a1 x + a2 x² = p(x).
5. −p(x) is the polynomial −a0 − a1 x − a2 x2 .
7.
c(p(x) + q(x)) = c (a0 + a1 x + a2 x2 ) + (b0 + b1 x + b2 x2 )
= c (a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2
= c(a0 + b0 ) + c(a1 + b1 )x + c(a2 + b2 )x2
= (ca0 + cb0 ) + (ca1 + cb1 )x + (ca2 + cb2 )x2
= ca0 + ca1 x + ca2 x2 + cb0 + cb1 x + cb2 x2
= c a0 + a1 x + a2 x2 + c b0 + b1 x + b2 x2
= cp(x) + cq(x).
8.
(c + d)p(x) = (c + d) a0 + a1 x + a2 x2
= (c + d)a0 + (c + d)a1 x + (c + d)a2 x2
= (ca0 + da0 ) + (ca1 + da1 )x + (ca2 + da2 )x2
= (ca0 + ca1 x + ca2 x2 ) + (da0 + da1 x + da2 x2 )
= c a0 + a1 x + a2 x2 + d a0 + a1 x + a2 x2
= cp(x) + dp(x).
9.
c(dp(x)) = c da0 + da1 x + da2 x2
= c(da0 ) + c(da1 )x + c(da2 )x2
= (cd)a0 + (cd)a1 x + (cd)a2 x2
= (cd) a0 + a1 x + a2 x2
= (cd)p(x).
10. 1p(x) = 1a0 + 1a1 x + 1a2 x2 = a0 + a1 x + a2 x2 = p(x).
13. Axioms 1 and 6 were verified in Example 6.4. For the rest, using the notation from Example 6.4,
2, 3. Addition of functions is commutative and associative.
4. The zero function f (x) = 0 for all x is the zero vector, since (0 + f )(x) = 0(x) + f (x) = f (x).
5. The function −f such that (−f )(x) = −f (x) is such that (−f + f )(x) = (−f )(x) + f (x) =
−f (x) + f (x) = 0, so their sum is the zero function.
7. The function c(f + g) is defined by (c(f + g))(x) = c((f + g)(x)) by the definition of scalar multiplication. But (f + g)(x) = f (x) + g(x), so we get (c(f + g))(x) = c(f (x) + g(x)) = cf (x) + cg(x) =
(cf + cg)(x).
8. Again using the definition of scalar multiplication, the function (c+d)f is defined by ((c+d)f )(x) =
(c + d)(f (x)) = cf (x) + df (x) = (cf + df )(x).
9. The function df is defined by (df )(x) = df (x), so c(df ) is defined by (c(df ))(x) = c((df )(x)) =
cdf (x) = ((cd)f )(x).
10. (1f )(x) = 1f (x) = f (x).
14. This is not a complex vector space, since Axiom 6 fails. Let z ∈ C be any nonzero complex number, and let c ∈ C be any complex number with nonzero real and imaginary parts. Then

   c(z, z̄) = (cz, cz̄) ≠ (cz, (cz)‾),

since (cz)‾ = c̄z̄ ≠ cz̄ when c is not real; thus c(z, z̄) ∉ V.
15. This is a vector space over C; the proof is identical to the proof that Mmn (R) is a real vector space.
16. This is a complex vector space. Let (z1, z2), (w1, w2) ∈ C², and let c, d ∈ C. Then

1, 2, 3, 4, 5. These follow since addition is the same as in the usual vector space C².
6. c ⊙ (z1, z2) = (c̄z1, c̄z2) ∈ C².
7. c ⊙ ((z1, z2) + (w1, w2)) = c ⊙ (z1 + w1, z2 + w2) = (c̄(z1 + w1), c̄(z2 + w2)) = (c̄z1 + c̄w1, c̄z2 + c̄w2) = (c̄z1, c̄z2) + (c̄w1, c̄w2) = c ⊙ (z1, z2) + c ⊙ (w1, w2).
8. (c + d) ⊙ (z1, z2) = ((c + d)‾z1, (c + d)‾z2) = ((c̄ + d̄)z1, (c̄ + d̄)z2) = (c̄z1, c̄z2) + (d̄z1, d̄z2) = c ⊙ (z1, z2) + d ⊙ (z1, z2).
9. (cd) ⊙ (z1, z2) = ((cd)‾z1, (cd)‾z2) = (c̄d̄z1, c̄d̄z2) = c ⊙ (d̄z1, d̄z2) = c ⊙ (d ⊙ (z1, z2)).
10. 1 ⊙ (z1, z2) = (1̄z1, 1̄z2) = (1z1, 1z2) = (z1, z2).
17. This is not a complex vector space, since Axiom 6 fails. Let x ∈ Rⁿ be any nonzero vector, and let c ∈ C be any complex number with nonzero imaginary part. Then cx ∉ Rⁿ, so that Rⁿ is not closed under scalar multiplication.
18. This set V is a vector space over Z2 . Let x, y ∈ Zn2 each have an even number of 1s, say x has 2r ones
and y has 2s ones.
1. x + y has a 1 where either x or y has a 1, except that if both x and y have a 1, x + y has a zero. Say
this happens in k different places. Then the total number of 1s in x+y is 2r+2s−2k = 2(r+s−k),
which is even, so V is closed under addition.
2. Since addition in Z2 is commutative, we see that x + y = y + x since we are simply adding
corresponding components.
3. Since addition in Z2 is associative, we see that (x + y) + z = x + (y + z) since we are simply adding
corresponding components.
4. 0 ∈ Zn2 has an even number (zero) of ones, so 0 ∈ V , and x + 0 = x.
5. Since 1 = −1 in Z2 , we can take −x = x; then −x + x = x + x = 0.
6. If c ∈ Z2 , then cx is equal to either 0 or x depending on whether c = 0 or c = 1. In either case,
x ∈ V implies that cx ∈ V .
7, 8. Since multiplication distributes over addition in Z2 , these axioms follow since operations are
component by component.
9. Since multiplication in Z2 is distributive, these axioms follow since operations are component by
component.
10. 1x = x since multiplication is component by component.
19. This set V is not a vector space:

1. For example, let x and y be vectors with exactly one 1. Then x + y has either 0 or 2 ones (it has zero if the locations of the 1s in x and y are the same), so x + y ∉ V and V is not closed under addition.
4. 0 ∉ V since 0 does not have an odd number of 1s.
6. If c = 0, then cx = 0 ∉ V if x ∈ V, so that V is not closed under scalar multiplication.
20. This is a vector space over Zp ; the proof is identical to the proof that Mmn (R) is a real vector space.
21. This is not a vector space. Axioms 1 through 7 are all satisfied, as is Axiom 10, but Axioms 8 and 9
fail since addition and multiplication are not the same in Z6 as they are in Z3 . For example:
8. Let c = 2, d = 1, and u = 1. Then (c + d)u = (2 + 1)u = 0u = 0 since addition of scalars is
performed in Z3 , while cu + du = 2 · 1 + 1 · 1 = 2 + 1 = 3 since here the addition is performed in
Z6 .
9. Let c = d = 2 and u = 1. Then (cd)u = 1 · 1 = 1 since the scalar multiplication is done in Z3 , while
c(du) = 2(2 · 1) = 2 · 2 = 4 since here the multiplication is done in Z6 .
22. Since 0 = 0 + 0, we have
0u = (0 + 0)u = 0u + 0u
by Axiom 8. By Axiom 5, we add −0u to both sides of this equation, giving
−0u + 0u = (−0u + 0u) + 0u, so that by Axiom 5 0 = 0 + 0u.
Finally, 0 + 0u = 0u by Axiom 4, so that 0 = 0u.
23. To show that (−1)u = −u, it suffices to show that (−1)u + u = 0, by Axiom 5. But (−1)u + u =
(−1)u + 1u by Axiom 10. Then use Axiom 8 to rewrite this as (−1 + 1)u = 0u. By part (a) of Theorem
6.1, 0u = 0 and we are done.
24. This is a subspace. We check each of the conditions in Theorem 6.2:

1. (a, 0, a) + (b, 0, b) = (a + b, 0, a + b) ∈ W since its first and third components are equal and its second component is zero.
2. c(a, 0, a) = (ca, 0, ca) ∈ W.
25. This is a subspace. We check each of the conditions in Theorem 6.2:

1. (a, −a, 2a) + (b, −b, 2b) = (a + b, −(a + b), 2(a + b)) ∈ W.
2. c(a, −a, 2a) = (ac, −(ac), 2(ac)) ∈ W.
26. This is not a subspace. It fails both conditions in Theorem 6.2:

1. (a, b, a + b + 1) + (c, d, c + d + 1) = (a + c, b + d, (a + c) + (b + d) + 2), which is not in W.
2. c(a, b, a + b + 1) = (ca, cb, ca + cb + c), which is not in W if c ≠ 1.
27. This is not a subspace. It fails both conditions in Theorem 6.2:

1. (a, b, |a|) + (c, d, |c|) = (a + c, b + d, |a| + |c|). If a and c do not have the same sign, then |a| + |c| ≠ |a + c|, so that the sum is not in W.
2. c(a, b, |a|) = (ca, cb, c|a|). If c < 0, then c|a| ≠ |ca|, so that the result is not in W.
28. This is a subspace. We check each of the conditions in Theorem 6.2:

1. [a b; b 2a] + [c d; d 2c] = [a+c b+d; b+d 2a+2c] = [a+c b+d; b+d 2(a+c)] ∈ W.
2. c[a b; b 2a] = [ca cb; cb 2(ca)] ∈ W.
29. This is not a subspace; condition 1 in Theorem 6.2 fails. For example, if a, d > 0, then

   [a 0; 0 d], [−a a; d −d] ∈ W, but [a 0; 0 d] + [−a a; d −d] = [0 a; d 0] ∉ W

since det[0 a; d 0] = −ad < 0.
30. This is not a subspace; both conditions of Theorem 6.2 fail. For example:
Let A = B = I; then det A = det B = 1, so that A, B ∈ W, but det(A + B) = det 2I = 2ⁿ ≠ 1, so A + B ∉ W.
Let A = I and c be any constant other than ±1. Then det(cA) = cⁿ det A ≠ det A = 1, so that cA ∉ W.
31. Since the sum of two diagonal matrices is again diagonal, and a scalar times a diagonal matrix is
diagonal, this subset satisfies both conditions of Theorem 6.2, so it is a subspace.
32. This is not a subspace. For example, it fails the second condition: let A be idempotent and c be any
constant other than 0 or 1. Then
(cA)2 = c2 A2 = c2 A 6= cA
so that cA is not idempotent, so is not in W .
33. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Suppose that A, C ∈ W . Then (A + C)B = AB + CB = BA + BC = B(A + C), so that A + C ∈ W .
2. Suppose that A ∈ W and c is any scalar. Then (cA)B = c(AB) = c(BA) = B(cA), so that cA ∈ W .
34. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let v = bx + cx², v′ = b′x + c′x². Then
v + v′ = bx + cx² + b′x + c′x² = (b + b′)x + (c + c′)x² ∈ W.
2. Let v = bx + cx2 and d be any scalar. Then dv = d(bx + cx2 ) = (db)x + (dc)x2 ∈ W .
35. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let v = a + bx + cx², v′ = a′ + b′x + c′x² ∈ W. Then
v + v′ = a + bx + cx² + a′ + b′x + c′x² = (a + a′) + (b + b′)x + (c + c′)x².
Then (a + a′) + (b + b′) + (c + c′) = (a + b + c) + (a′ + b′ + c′) = 0, so that v + v′ ∈ W.
2. Let v = a + bx + cx² and d be any scalar. Then dv = d(a + bx + cx²) = (da) + (db)x + (dc)x², and da + db + dc = d(a + b + c) = d · 0 = 0, so that dv ∈ W.
36. This is not a subspace; it fails condition 1 in Theorem 6.2. For example, let v = 1 + x and v′ = x². Then both v and v′ are in V, but v + v′ = 1 + x + x² ∉ V since the product of its coefficients is not zero.
37. This is not a subspace; it fails both conditions of Theorem 6.2.
1. For example, let v = x³ and v′ = −x³. Then v + v′ = 0, which is not a degree 3 polynomial.
2. Let c = 0; then cv = 0 for any v, but 0 ∉ W, so that this condition fails.
38. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let f, g ∈ W . Then (f + g)(−x) = f (−x) + g(−x) = f (x) + g(x) = (f + g)(x), so that f + g ∈ W .
2. Let f ∈ W and c be any scalar. Then (cf )(−x) = cf (−x) = cf (x) = (cf )(x), so that cf ∈ W .
39. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let f, g ∈ W . Then (f + g)(−x) = f (−x) + g(−x) = −f (x) − g(x) = −(f (x) + g(x)) = −(f + g)(x),
so that f + g ∈ W .
2. Let f ∈ W and c be any scalar. Then (cf )(−x) = cf (−x) = −cf (x) = −(cf )(x), so that cf ∈ W .
40. This is not a subspace; it fails both conditions in Theorem 6.2. For example, if f, g ∈ W, then (f + g)(0) = f(0) + g(0) = 1 + 1 ≠ 1, so that f + g ∉ W. Similarly, if c is a scalar not equal to 1, then (cf)(0) = cf(0) = c ≠ 1, so that cf ∉ W.
41. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let f, g ∈ W . Then (f + g)(0) = f (0) + g(0) = 0 + 0 = 0, so that f + g ∈ W .
2. Let f ∈ W and c be any scalar. Then (cf )(0) = cf (0) = c · 0 = 0, so that cf ∈ W .
42. If f and g ∈ W are integrable and c is a constant, then

   ∫(f + g) = ∫f + ∫g,   ∫(cf) = c∫f,

so that both f + g and cf ∈ W. Thus W is a subspace.
43. This is not a subspace; it fails condition 2 in Theorem 6.2. For example, let f ∈ W be such that f′(x) > 0 everywhere, and choose c = −1; then (cf)′(x) = cf′(x) < 0, so that cf ∉ W.
44. If f, g ∈ W and c is a constant, then
(f + g)00 = f 00 + g 00 ,
(cf )00 = cf 00 ,
so that both f + g and cf have continuous second derivatives, so they are both in W .
45. This is not a subspace; it fails both conditions in Theorem 6.2.

1. f(x) = 1/x² ∈ W, but −f(x) = −1/x² ∉ W, since lim_{x→0} (−f(x)) = −∞.
2. Take c = 0 and f ∈ W; then lim_{x→0} (cf)(x) = lim_{x→0} 0 = 0, so that cf ∉ W.
46. Suppose that v, v′ ∈ U ∩ W, and let c be a scalar. Then v, v′ ∈ U, so that v + v′ and cv ∈ U. Similarly, v, v′ ∈ W, so that v + v′ and cv ∈ W. Therefore v + v′ ∈ U ∩ W and cv ∈ U ∩ W, so that U ∩ W is a subspace.
47. For example, let U = {(x, 0)}, the x-axis, and let W = {(0, y)}, the y-axis. Then U ∪ W consists of all vectors in R² in which at least one component is zero. Choose a, b ≠ 0; then

   (a, 0) ∈ U, (0, b) ∈ W, but (a, 0) + (0, b) = (a, b) ∉ U ∪ W

since neither a nor b is zero.
48. (a) U + W is the xy-plane, since

   U = {(x, 0, 0)},   W = {(0, y, 0)},

so that any point in the xy-plane is

   (a, b, 0) = (a, 0, 0) + (0, b, 0) ∈ U + W.

However, a point not in the xy-plane has nonzero z-coordinate, so it cannot be a sum of an element of U and an element of W. Thus U + W is exactly the xy-plane.
(b) Suppose x, x′ ∈ U + W; say x = u + w and x′ = u′ + w′, where u, u′ ∈ U and w, w′ ∈ W. Then
x + x′ = u + w + u′ + w′ = (u + u′) + (w + w′) ∈ U + W
since both U and W are subspaces. If c is a scalar, then
cx = c(u + w) = cu + cw ∈ U + W,
again since both U and W are subspaces.
49. We verify all ten axioms of a vector space. Note that both U and V are vector spaces, so they satisfy
all the vector space axioms. These facts will be used without comment below. Let (u1 , v1 ), (u2 , v2 ),
and (u3 , v3 ) be elements of U × V , and let c and d be scalars. Then
1. (u1 , v1 ) + (u2 , v2 ) = (u1 + u2 , v1 + v2 ) ∈ U × V .
2. (u1 , v1 ) + (u2 , v2 ) = (u1 + u2 , v1 + v2 ) = (u2 + u1 , v2 + v1 ) = (u2 , v2 ) + (u1 , v1 ).
3.
((u1 , v1 ) + (u2 , v2 )) + (u3 , v3 ) = (u1 + u2 , v1 + v2 ) + (u3 , v3 )
= ((u1 + u2 ) + u3 , (v1 + v2 ) + v3 )
= (u1 + (u2 + u3 ), v1 + (v2 + v3 ))
= (u1 , v1 ) + (u2 + u3 , v2 + v3 )
= (u1 , v1 ) + ((u2 , v2 ) + (u3 , v3 )).
4. (0, 0) ∈ U × V , and (u1 , v1 ) + (0, 0) = (u1 + 0, v1 + 0) = (u1 , v1 ).
5. (u1 , v1 ) + (−u1 , −v1 ) = (u1 − u1 , v1 − v1 ) = (0, 0).
6. c(u1 , v1 ) = (cu1 , cv1 ) ∈ U × V .
7.
c ((u1 , v1 ) + (u2 , v2 )) = c(u1 + u2 , v1 + v2 )
= (c(u1 + u2 ), c(v1 + v2 ))
= (cu1 + cu2 , cv1 + cv2 )
= (cu1 , cv1 ) + (cu2 , cv2 )
= c(u1 , v1 ) + c(u2 , v2 ).
8.
(c + d)(u1 , v1 ) = ((c + d)u1 , (c + d)v1 )
= (cu1 + du1 , cv1 + dv1 )
= (cu1 , cv1 ) + (du1 , dv1 )
= c(u1 , v1 ) + d(u1 , v1 ).
9. c (d(u1 , v1 )) = c(du1 , dv1 ) = (c(du1 ), c(dv1 )) = ((cd)u1 , (cd)v1 ) = (cd)(u1 , v1 ).
10. 1(u1 , v1 ) = (1u1 , 1v1 ) = (u1 , v1 ).
50. Suppose that (w, w), (w′, w′) ∈ ∆. Then

   (w, w) + (w′, w′) = (w + w′, w + w′),

which is in ∆ since its two components are the same. Let c be a scalar. Then

   c(w, w) = (cw, cw),

which is also in ∆ since its components are the same. Therefore ∆ satisfies both conditions of Theorem 6.2, so is a subspace.
51. If aA + bB = C, then

   a[1 1; −1 1] + b[1 −1; 1 0] = [a+b a−b; b−a a] = [1 2; 3 4].

Then a − b = 2 and b − a = 3, which is impossible. Therefore C is not in the span of A and B.
52. If aA + bB = C, then

   a[1 1; −1 1] + b[1 −1; 1 0] = [a+b a−b; b−a a] = [3 −5; 5 −1].

Then a = −1 from the lower right. The upper left then gives −1 + b = 3, so that b = 4. Then a − b = −5 and b − a = 5, so that −A + 4B = C.
53. Suppose that

   a(1 − 2x) + b(x − x²) + c(−2 + 3x + x²) = (a − 2c) + (−2a + b + 3c)x + (−b + c)x² = 3 − 5x − x².

Then

   a − 2c = 3
   −2a + b + 3c = −5
   −b + c = −1.

The third equation gives b = c + 1; substituting this into the second equation gives −2a + 4c = −6, which is −2 times the first equation. So there is a family of solutions, which can be parametrized as a = 2t + 3, b = t + 1, c = t. For example, with t = 0 we get

   s(x) = 3p(x) + q(x),

and s(x) ∈ span(p(x), q(x), r(x)).
54. Suppose that

   a(1 − 2x) + b(x − x²) + c(−2 + 3x + x²) = (a − 2c) + (−2a + b + 3c)x + (−b + c)x² = 1 + x + x².

Then

   a − 2c = 1
   −2a + b + 3c = 1
   −b + c = 1.

The third equation gives b = c − 1; substituting into the second equation gives −2a + 4c − 1 = 1, or a − 2c = −1. This conflicts with the first equation, so the system has no solution. Thus s(x) ∉ span(p(x), q(x), r(x)).
55. Since sin2 x + cos2 x = 1, we have 1 = f (x) + g(x) ∈ span(sin2 x, cos2 x).
56. Since cos 2x = cos2 x − sin2 x = g(x) − f (x), we have cos 2x = −f (x) + g(x) ∈ span(sin2 x, cos2 x).
57. Suppose that sin 2x = af(x) + bg(x) = a sin²x + b cos²x for all x. Then

   Let x = 0; then 0 = a sin²0 + b cos²0 = b, so that b = 0.
   Let x = π/2; then sin(2 · π/2) = sin π = 0 = a sin²(π/2) + b cos²(π/2) = a, so that a = 0.

Thus the only solution is a = b = 0, but this is not a valid solution since sin 2x is not identically zero. So sin 2x ∉ span(sin²x, cos²x).
58. Suppose that sin x = af(x) + bg(x) = a sin²x + b cos²x for all x. Then

   Let x = 0; then 0 = a sin²0 + b cos²0 = b, so that b = 0.
   Let x = π/2; then sin(π/2) = 1 = a sin²(π/2) + b cos²(π/2) = a, so that a = 1.

Thus the only possibility is sin x = sin²x. But this is not a valid equation; for example, if x = 3π/2, then the left-hand side is −1 while the right-hand side is 1. So sin x ∉ span(sin²x, cos²x).
59. Let

   V1 = [1 1; 0 1],  V2 = [0 1; 1 0],  V3 = [1 0; 1 1],  V4 = [0 −1; 1 0].

Now, V4 = V3 − V1, so that span(V1, V2, V3, V4) = span(V1, V2, V3). If span(V1, V2, V3) = M22, then every matrix is in the span, so in particular E11 is. But

   aV1 + bV2 + cV3 = E11 ⇒ [a+c a+b; b+c a+c] = [1 0; 0 0].

Now the diagonal entries give a + c = 1 and a + c = 0, which has no solution. So E11 is not in the span and thus span(V1, V2, V3) ≠ M22.
60. Let

   V1 = [1 0; 1 0],  V2 = [1 1; 1 0],  V3 = [1 1; 1 1],  V4 = [0 −1; 1 0].

Solving the four independent systems

   E11 = aV1 + bV2 + cV3 + dV4,  E12 = aV1 + bV2 + cV3 + dV4,
   E21 = aV1 + bV2 + cV3 + dV4,  E22 = aV1 + bV2 + cV3 + dV4

gives

   E11 = 2V1 − V2 − V4,  E12 = −V1 + V2,  E21 = −V1 + V2 + V4,  E22 = −V2 + V3.

Therefore span(V1, V2, V3, V4) ⊇ span(E11, E12, E21, E22) = M22.
61. Let p(x) = 1 + x, q(x) = x + x², and r(x) = 1 + x². Then

   1 = ap(x) + bq(x) + cr(x) ⇒ a + c = 1, a + b = 0, b + c = 0 ⇒ a = 1/2, b = −1/2, c = 1/2
   x = ap(x) + bq(x) + cr(x) ⇒ a + c = 0, a + b = 1, b + c = 0 ⇒ a = 1/2, b = 1/2, c = −1/2
   x² = ap(x) + bq(x) + cr(x) ⇒ a + c = 0, a + b = 0, b + c = 1 ⇒ a = −1/2, b = 1/2, c = 1/2.

Thus span(p(x), q(x), r(x)) ⊇ span(1, x, x²) = P2.
62. Let p(x) = 1 + x + 2x², q(x) = 2 + x + 2x², and r(x) = −1 + x + 2x². Then

   x = ap(x) + bq(x) + cr(x) ⇒ a + 2b − c = 0, a + b + c = 1, 2a + 2b + 2c = 0.

But the last two equations are inconsistent, so that x ∉ span(p(x), q(x), r(x)). Therefore span(p(x), q(x), r(x)) ≠ P2.
63. Suppose 0 and 0′ both satisfy Axiom 4. Then 0 + 0′ = 0 since 0′ satisfies Axiom 4, and also 0 + 0′ = 0′ since 0 satisfies Axiom 4. Putting these two equalities together gives 0 = 0′, so that 0 is unique.
64. Suppose v′ and v″ both satisfy Axiom 5 for v. Then using Axiom 5,

   v″ = v″ + 0 = v″ + (v + v′) = (v″ + v) + v′ = 0 + v′ = v′.
6.2 Linear Independence, Basis, and Dimension
1. We wish to determine whether there are constants c1, c2, and c3, not all zero, such that

   c1[1 1; 0 −1] + c2[1 −1; 1 0] + c3[1 0; 3 2] = O.

This is equivalent to asking for solutions of the linear system

   c1 + c2 + c3 = 0
   c1 − c2 = 0
   c2 + 3c3 = 0
   −c1 + 2c3 = 0.

Row-reducing the corresponding augmented matrix, we get

   [1 1 1 | 0; 1 −1 0 | 0; 0 1 3 | 0; −1 0 2 | 0] → [1 0 0 | 0; 0 1 0 | 0; 0 0 1 | 0; 0 0 0 | 0].

This system has only the trivial solution c1 = c2 = c3 = 0, so that these matrices are linearly independent.
2. We wish to determine if there are constants c1 , c2 , and c3 , not all zero, such that
c1 [2 −3; 4 2] + c2 [1 −1; 3 3] + c3 [−1 3; 1 5] = 0.
This is equivalent to asking for solutions to the linear system
2c1 + c2 − c3 = 0
−3c1 − c2 + 3c3 = 0
4c1 + 3c2 + c3 = 0
2c1 + 3c2 + 5c3 = 0.
Row-reducing the corresponding augmented matrix, we get
[2 1 −1 0; −3 −1 3 0; 4 3 1 0; 2 3 5 0] → [1 0 −2 0; 0 1 3 0; 0 0 0 0; 0 0 0 0].
This system has a nontrivial solution c1 = 2t, c2 = −3t, c3 = t; setting t = 1 for example gives
2 [2 −3; 4 2] − 3 [1 −1; 3 3] + [−1 3; 1 5] = 0,
so that the matrices are linearly dependent.
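The row reduction above can be reproduced with a small exact-arithmetic routine. This is a sketch; `rref` is a helper written here (not a library function), and the rows of the coefficient matrix are the four entry equations from Exercise 2:

```python
from fractions import Fraction

def rref(M):
    """Row-reduce a matrix of Fractions to reduced row echelon form."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [x / M[r][c] for x in M[r]]       # scale pivot row to leading 1
        for i in range(rows):
            if i != r and M[i][c] != 0:          # clear the rest of the column
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M

# Each row is one entry equation in the unknowns c1, c2, c3.
A = [[2, 1, -1], [-3, -1, 3], [4, 3, 1], [2, 3, 5]]
R = rref([[Fraction(x) for x in row] for row in A])
print(R)
```

The result shows free variable c3 with c1 = 2c3 and c2 = −3c3, matching the dependence found above.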
464
CHAPTER 6. VECTOR SPACES
3. We wish to determine if there are constants c1 , c2 , c3 , and c4 , not all zero, such that
c1 [−1 1; −2 2] + c2 [3 0; 1 1] + c3 [0 2; −3 1] + c4 [−1 0; −1 7] = 0.
This is equivalent to asking for solutions to the linear system
−c1 + 3c2 − c4 = 0
c1 + 2c3 = 0
−2c1 + c2 − 3c3 − c4 = 0
2c1 + c2 + c3 + 7c4 = 0.
Row-reducing the corresponding augmented matrix, we get
[−1 3 0 −1 0; 1 0 2 0 0; −2 1 −3 −1 0; 2 1 1 7 0] → [1 0 0 4 0; 0 1 0 1 0; 0 0 1 −2 0; 0 0 0 0 0].
This system has a nontrivial solution c1 = −4t, c2 = −t, c3 = 2t, c4 = t; setting t = 1, for example, gives
−4 [−1 1; −2 2] − [3 0; 1 1] + 2 [0 2; −3 1] + [−1 0; −1 7] = 0,
so that the matrices are linearly dependent.
4. We wish to determine if there are constants c1 , c2 , c3 , and c4 , not all zero, such that
c1 [1 1; 0 1] + c2 [1 0; 1 1] + c3 [0 1; 1 1] + c4 [1 1; 1 0] = 0.
This is equivalent to asking for solutions to the linear system
c1 + c2 + c4 = 0
c1 + c3 + c4 = 0
c2 + c3 + c4 = 0
c1 + c2 + c3 = 0.
Row-reducing the corresponding augmented matrix, we get
[1 1 0 1 0; 1 0 1 1 0; 0 1 1 1 0; 1 1 1 0 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0].
Since the only solution is the trivial solution, it follows that these four matrices are linearly independent in M22.
5. Following, for instance, Example 6.26, we wish to determine if there are constants c1 and c2 , not both
zero, such that
c1 (x) + c2 (1 + x) = 0.
This is equivalent to asking for solutions to the linear system
c2 = 0
c1 + c2 = 0.
By forward substitution, we see that the only solution is the trivial solution, c1 = c2 = 0. It follows
that these polynomials are linearly independent in P2.
6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION
465
6. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , and c3 , not
all zero, such that
c1(1 + x) + c2(1 + x²) + c3(1 − x + x²) = (c1 + c2 + c3) + (c1 − c3)x + (c2 + c3)x² = 0.
This is equivalent to asking for solutions to the linear system
c1 + c2 + c3 = 0
c1 − c3 = 0
c2 + c3 = 0.
Row-reducing the corresponding augmented matrix, we get
[1 1 1 0; 1 0 −1 0; 0 1 1 0] → [1 0 0 0; 0 1 0 0; 0 0 1 0].
Since the only solution is the trivial solution, it follows that these three polynomials are linearly independent in P2.
7. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , and c3 , not
all zero, such that
c1(x) + c2(2x − x²) + c3(3x + 2x²) = (c1 + 2c2 + 3c3)x + (−c2 + 2c3)x² = 0.
This is a homogeneous system of two equations in three unknowns, so we know it has a nontrivial
solution and these polynomials are linearly dependent. To find an expression showing the linear
dependence, we must find solutions to the linear system
c1 + 2c2 + 3c3 = 0
−c2 + 2c3 = 0.
Row-reducing the corresponding augmented matrix, we get
[1 2 3 0; 0 −1 2 0] → [1 0 7 0; 0 1 −2 0].
The general solution is c1 = −7t, c2 = 2t, c3 = t. Setting t = 1, for example, we get
−7(x) + 2(2x − x²) + (3x + 2x²) = 0.
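Working with coefficient vectors over the basis (1, x, x²) turns this polynomial identity into plain arithmetic; a minimal sketch:

```python
# Coefficient vectors of x, 2x - x^2, 3x + 2x^2 over the basis (1, x, x^2).
p1 = [0, 1, 0]
p2 = [0, 2, -1]
p3 = [0, 3, 2]

# The dependence found above: -7*p1 + 2*p2 + 1*p3 should be the zero polynomial.
combo = [-7 * a + 2 * b + 1 * c for a, b, c in zip(p1, p2, p3)]
print(combo)
```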
8. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , c3 , and c4 ,
not all zero, such that
c1(2x) + c2(x − x²) + c3(1 + x³) + c4(2 − x² + x³) = (c3 + 2c4) + (2c1 + c2)x + (−c2 − c4)x² + (c3 + c4)x³ = 0.
This is equivalent to asking for solutions to the linear system
c3 + 2c4 = 0
2c1 + c2 = 0
−c2 − c4 = 0
c3 + c4 = 0.
Row-reducing the corresponding augmented matrix, we get
[0 0 1 2 0; 2 1 0 0 0; 0 −1 0 −1 0; 0 0 1 1 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0].
Since the only solution is the trivial solution, it follows that these four polynomials are linearly independent in P3.
9. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , c3 , and c4 ,
not all zero, such that
c1(1 − 2x) + c2(3x + x² − x³) + c3(1 + x² + 2x³) + c4(3 + 2x + 3x³)
= (c1 + c3 + 3c4) + (−2c1 + 3c2 + 2c4)x + (c2 + c3)x² + (−c2 + 2c3 + 3c4)x³ = 0.
This is equivalent to asking for solutions to the linear system
c1 + c3 + 3c4 = 0
−2c1 + 3c2 + 2c4 = 0
c2 + c3 = 0
−c2 + 2c3 + 3c4 = 0.
Row-reducing the corresponding augmented matrix, we get
[1 0 1 3 0; −2 3 0 2 0; 0 1 1 0 0; 0 −1 2 3 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0].
Since the only solution is the trivial solution, it follows that these four polynomials are linearly independent in P3.
10. We wish to find solutions to a + b sin x + c cos x = 0 that hold for all x. Setting x = 0 gives a + c = 0, so that a = −c; setting x = π gives a − c = 0, so that a = c. This forces a = c = 0; then setting x = π/2 gives b = 0. So the only solution is the trivial solution, and these functions are linearly independent in F.
11. Since sin²x + cos²x = 1, the functions are linearly dependent.
12. Suppose that ae^x + be^(−x) = 0 for all x. Setting x = 0 gives a + b = 0, while setting x = ln 2 gives 2a + (1/2)b = 0. Since the coefficient matrix
[1 1; 2 1/2]
has nonzero determinant, the only solution to the system is the trivial solution, so that these functions are linearly independent in F.
13. Using properties of the logarithm, we have ln(2x) = ln 2 + ln x and ln(x²) = 2 ln x, so we want to see if {1, ln 2 + ln x, 2 ln x} is linearly independent in F. But clearly it is not, since by inspection
(−2 ln 2) · 1 + 2 · (ln 2 + ln x) − 2 ln x = 0.
So these functions are linearly dependent in F.
14. We wish to find solutions to a sin x + b sin 2x + c sin 3x = 0 that hold for all x. Setting
x = π/2 ⇒ a − c = 0
x = π/6 ⇒ (1/2)a + (√3/2)b + c = 0
x = 2π/3 ⇒ (√3/2)a − (√3/2)b = 0.
The first equation gives c = a, and the third gives b = a. Substituting both into the second gives ((3 + √3)/2)a = 0, so that a = 0, and then b = c = 0. So the only solution is the trivial solution, and thus these functions are linearly independent in F.
15. We want to prove that if the Wronskian is not identically zero, then f and g are linearly independent. We do this by proving the contrapositive: assume f and g are linearly dependent. If either f or g is zero, then clearly the Wronskian is zero as well. So assume that neither f nor g is zero; since they are linearly dependent, each is a nonzero multiple of the other. Say f(x) = ag(x), so that f′(x) = ag′(x). Now, the Wronskian is
W(x) = |f(x) g(x); f′(x) g′(x)| = f(x)g′(x) − g(x)f′(x) = ag(x)g′(x) − g(x)(ag′(x)) = 0.
Thus the Wronskian is identically zero, proving the result.
16. Exercise 10:
W(x) = |1 sin x cos x; 0 cos x −sin x; 0 −sin x −cos x| = |cos x −sin x; −sin x −cos x| = −cos²x − sin²x = −1 ≠ 0.
Since the Wronskian is not zero, the functions are linearly independent.
Exercise 11:
W(x) = |1 sin²x cos²x; 0 2 sin x cos x −2 sin x cos x; 0 2(cos²x − sin²x) 2(sin²x − cos²x)|
= |1 sin²x cos²x; 0 sin 2x −sin 2x; 0 2 cos 2x −2 cos 2x| = |sin 2x −sin 2x; 2 cos 2x −2 cos 2x| = 0.
The Wronskian is zero, and the functions are linearly dependent (as we showed directly in Exercise 11).
Exercise 12: W(x) = |e^x e^(−x); e^x −e^(−x)| = −1 − 1 = −2 ≠ 0. Since the Wronskian is not zero, the functions are linearly independent.
Exercise 13:
W(x) = |1 ln(2x) ln(x²); 0 1/x 2/x; 0 −1/x² −2/x²| = |1/x 2/x; −1/x² −2/x²| = −2/x³ + 2/x³ = 0.
The Wronskian is zero, and the functions are linearly dependent (as we showed directly in Exercise 13).
Exercise 14:
W(x) = |sin x sin 2x sin 3x; cos x 2 cos 2x 3 cos 3x; −sin x −4 sin 2x −9 sin 3x|.
Rather than computing the determinant and showing it is not identically zero, it suffices to find some value of x such that W(x) ≠ 0. Let x = π/3; then
W(π/3) = |sin(π/3) sin(2π/3) sin π; cos(π/3) 2 cos(2π/3) 3 cos π; −sin(π/3) −4 sin(2π/3) −9 sin π|
= |√3/2 √3/2 0; 1/2 −1 −3; −√3/2 −2√3 0|
= 3 |√3/2 √3/2; −√3/2 −2√3| = 3(−3 + 3/4) = −27/4 ≠ 0.
So the Wronskian is not identically zero and therefore the functions are linearly independent.
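The value W(π/3) = −27/4 found for Exercise 14 can be confirmed numerically. A sketch, with the 3 × 3 determinant expanded by hand along the first row:

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def wronskian(x):
    # Rows: sin x, sin 2x, sin 3x and their first and second derivatives.
    return det3([[math.sin(x), math.sin(2 * x), math.sin(3 * x)],
                 [math.cos(x), 2 * math.cos(2 * x), 3 * math.cos(3 * x)],
                 [-math.sin(x), -4 * math.sin(2 * x), -9 * math.sin(3 * x)]])

w = wronskian(math.pi / 3)
print(w)  # approximately -27/4 = -6.75
```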
17. (a) Yes, it is. Suppose
a(u + v) + b(v + w) + c(u + w) = (a + c)u + (a + b)v + (b + c)w = 0.
Since {u, v, w} are linearly independent, it follows that a + c = 0, a + b = 0, and b + c = 0.
Subtracting the latter two from each other gives a − c = 0; combining this with the first equation
shows that a = c = 0, and then b = 0. So the only solution to the system is the trivial one, so
these three vectors are linearly independent.
(b) No, they are not, since
(u − v) + (v − w) − (u − w) = 0.
18. Since dim M22 = 4 but B has only three elements, it cannot be a basis.
19. If
a [1 0; 0 1] + b [0 −1; 1 0] + c [1 1; 1 1] + d [1 1; 1 −1] = [0 0; 0 0],
then
a + c + d = 0
−b + c + d = 0
b + c + d = 0
a + c − d = 0.
Subtracting the second and third equations shows that b = 0; subtracting the first and fourth shows
that d = 0. Then from the second equation, c = 0 as well, which forces a = 0. So the only solution
is the trivial solution, and these matrices are linearly independent. Since there are four of them, and
dim M22 = 4, it follows that they form a basis for M22 .
20. Since dim M22 = 4, if these vectors are linearly independent then they span and thus form a basis. To
see if they are linearly independent, we want to solve
c1 [1 0; 0 1] + c2 [0 1; 1 0] + c3 [1 1; 0 1] + c4 [1 0; 1 1] = 0.
Solving this linear system gives c1 = −2c4, c2 = −c4, and c3 = c4. For example, letting c4 = 1 gives
−2 [1 0; 0 1] − [0 1; 1 0] + [1 1; 0 1] + [1 0; 1 1] = 0.
Thus the matrices are linearly dependent, so they do not form a basis (see Theorem 6.10).
21. Since dim M22 = 4 but B has five elements, it cannot be a basis.
22. Since dim P2 = 3, if these vectors are linearly independent then they span and thus form a basis. To see if they are linearly independent, we want to solve
c1x + c2(1 + x) + c3(x − x²) = c2 + (c1 + c2 + c3)x − c3x² = 0.
Rather than setting up the augmented matrix and row-reducing it, simply note that the first and third terms force c2 = c3 = 0; then the second term forces c1 = 0. So the only solution is the trivial solution, and these three polynomials are linearly independent, so they form a basis for P2.
6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION
469
23. Since dim P2 = 3, if these vectors are linearly independent then they span and thus form a basis. To see if they are linearly independent, we want to solve
c1(1 − x) + c2(1 − x²) + c3(x − x²) = (c1 + c2) + (−c1 + c3)x + (−c2 − c3)x² = 0.
To solve the resulting linear system, we row-reduce its augmented matrix:
[1 1 0 0; −1 0 1 0; 0 −1 −1 0] → [1 0 −1 0; 0 1 1 0; 0 0 0 0].
This has general solution c1 = t, c2 = −t, c3 = t, so setting t = 1 we get
(1 − x) − (1 − x2 ) + (x − x2 ) = 0.
Therefore the vectors are not linearly independent, so they cannot form a basis for P2 , which has
dimension 3.
24. Since dim P2 = 3 but there are only two vectors in B, they cannot form a basis.
25. Since dim P2 = 3 but there are four vectors in B, they must be linearly dependent, so cannot form a
basis.
26. We want to find constants c1, c2, c3, and c4 such that
c1 E22 + c2 E21 + c3 E12 + c4 E11 = A = [1 2; 3 4].
Setting up the equivalent linear system gives
c1 = 4
c2 = 3
c3 = 2
c4 = 1.
This system obviously has the solution c1 = 4, c2 = 3, c3 = 2, c4 = 1, so the coordinate vector is
[A]B = [4 3 2 1]^T.
27. We want to find constants c1, c2, c3, and c4 such that
c1 [1 0; 0 0] + c2 [1 1; 0 0] + c3 [1 1; 1 0] + c4 [1 1; 1 1] = [1 2; 3 4].
Setting up the equivalent linear system gives
c1 + c2 + c3 + c4 = 1
c2 + c3 + c4 = 2
c3 + c4 = 3
c4 = 4.
By back substitution, the solution is c4 = 4, c3 = −1, c2 = −1, c1 = −1, so the coordinate vector is
[A]B = [−1 −1 −1 4]^T.
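A quick check that these coordinates reproduce A. This is a sketch; the basis matrices and A are those of this exercise, stored as nested lists:

```python
# The four basis matrices of B, in order, and the coordinates found above.
B = [[[1, 0], [0, 0]], [[1, 1], [0, 0]], [[1, 1], [1, 0]], [[1, 1], [1, 1]]]
c = [-1, -1, -1, 4]

# Form the linear combination c1*B1 + c2*B2 + c3*B3 + c4*B4 entrywise.
A = [[sum(ci * M[i][j] for ci, M in zip(c, B)) for j in range(2)]
     for i in range(2)]
print(A)  # [[1, 2], [3, 4]]
```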
470
CHAPTER 6. VECTOR SPACES
28. We want to find constants c1, c2, and c3 such that
p(x) = 1 + 2x + 3x² = c1(1 + x) + c2(1 − x) + c3(x²) = (c1 + c2) + (c1 − c2)x + c3x².
The corresponding linear system is
c1 + c2 = 1
c1 − c2 = 2
c3 = 3.
Row-reducing the corresponding augmented matrix gives
[1 1 0 1; 1 −1 0 2; 0 0 1 3] → [1 0 0 3/2; 0 1 0 −1/2; 0 0 1 3],
so that c1 = 3/2, c2 = −1/2, and c3 = 3. Thus the coordinate vector is
[p]B = [3/2 −1/2 3]^T.
29. We want to find constants c1, c2, and c3 such that
p(x) = 2 − x + 3x² = c1(1) + c2(1 + x) + c3(−1 + x²) = (c1 + c2 − c3) + c2x + c3x².
Equating coefficients shows that c3 = 3 and c2 = −1; substituting into the equation c1 + c2 − c3 = 2 gives c1 = 6, so that the coordinate vector is
[p]B = [6 −1 3]^T.
30. To show B is a basis, we must show that it spans V and that the elements of B are linearly independent. B spans V, since if v ∈ V, then it can be written as a linear combination of vectors in B by hypothesis. To show the elements of B are linearly independent, note that 0 = 0u1 + · · · + 0un; since the representation of 0 as a linear combination of the ui is unique by assumption, it follows that the only solution to 0 = c1u1 + · · · + cnun is the trivial solution, so that B = {ui} is a linearly independent set. Thus B is a basis for V.
31. [c1 u1 + · · · + ck uk ]B = [c1 u1 ]B + · · · + [ck uk ]B = c1 [u1 ]B + · · · + ck [uk ]B .
32. Suppose c1 u1 + · · · + cn un = 0. Then clearly
0 = [c1 u1 + · · · + cn un ]B .
But by Exercise 31, the latter expression is equal to
c1 [u1 ]B + · · · + cn [un ]B ;
since [ui ]B are linearly independent in Rn , we get ci = 0 for all i, so that the only solution to the
original equation is the trivial solution. Thus the ui are linearly independent in V .
33. First suppose that span(u1, . . . , um) = V. Choose x ∈ Rn, and let v be the vector in V whose coordinate vector is [v]B = x. Since V = span(u1, . . . , um), we can write v = c1u1 + · · · + cmum. Therefore by Exercise 31,
x = [c1u1 + · · · + cmum]B = c1[u1]B + · · · + cm[um]B,
which shows that x ∈ span(S).
For the converse, suppose that span(S) = Rn. Choose v ∈ V; then [v]B ∈ Rn, and using Exercise 31 again,
[v]B = c1[u1]B + · · · + cm[um]B = [c1u1 + · · · + cmum]B.
But since B is a basis, and the two elements v and c1u1 + · · · + cmum of V have the same coordinate vector, they must be the same element. Thus v = c1u1 + · · · + cmum, so that {ui} spans V.
34. Using the standard basis {x², x, 1} for P2, if p(x) = ax² + bx + c ∈ V, then p(0) = c = 0, so that c = 0 and therefore p(x) = ax² + bx. Since x² and x are linearly independent and span V, {x, x²} is a basis, so that dim V = 2.
35. Two polynomials in V are p(x) = 1 − x and q(x) = 1 − x², since p(1) = q(1) = 0. If we show these polynomials are linearly independent and span V, we are done. They are linearly independent, since if
a(1 − x) + b(1 − x²) = (a + b) − ax − bx² = 0,
then equating coefficients shows that a = b = 0. To see that they span, suppose r(x) = c + bx + ax² ∈ V. Then r(1) = a + b + c = 0, so that c = −b − a and therefore
r(x) = −(b + a) + bx + ax² = −b(1 − x) − a(1 − x²),
so that r is a linear combination of p and q. Thus {p(x), q(x)} forms a basis for V, which has dimension 2.
36. Suppose p(x) = c + bx + ax²; then xp′(x) = bx + 2ax². So p(x) = xp′(x) means that c = 0 and a = 2a, so that a = 0. Thus p(x) = bx. Then clearly V = span(x), so that {x} forms a basis, and dim V = 1.
37. Upper triangular matrices in M22 have the form
[a b; 0 d] = aE11 + bE12 + dE22.
Since E11, E12, and E22 span V, and we know they are linearly independent, it follows that they form a basis for V, which therefore has dimension 3.
38. Skew-symmetric matrices in M22 have the form
[0 b; −b 0] = b [0 1; −1 0] = bA.
Since A spans V, and {A} is a linearly independent set, we see that {A} is a basis for V, which therefore has dimension 1.
39. Here B = [1 1; 0 1]. Suppose that
A = [a b; c d].
Then
AB = [a a+b; c c+d],   BA = [a+c b+d; c d].
Then AB = BA means that a = a + c, so that c = 0, and also a + b = b + d, so that a = d. So the matrix A has the form
A = [a b; 0 a] = a [1 0; 0 1] + bE12 = aI + bE12.
I and E12 are linearly independent since I = E11 + E22, and {E11, E22, E12} is a linearly independent set. Since they span V, we see that {I, E12} forms a basis for V, which therefore has dimension 2.
472
CHAPTER 6. VECTOR SPACES
40. dim Mnn = n². For symmetric matrices, we have aij = aji. Therefore specifying the entries on or above the diagonal determines the matrix, and these entries may take any value. Since there are
n + (n − 1) + · · · + 2 + 1 = n(n + 1)/2
entries on or above the diagonal, this is the dimension of the space of symmetric matrices.
41. dim Mnn = n². For skew-symmetric matrices, we have aij = −aji. Therefore all the diagonal entries are zero; further, specifying the entries above the diagonal determines the matrix, and these entries may take any value. Since there are
(n − 1) + · · · + 2 + 1 = n(n − 1)/2
entries above the diagonal, this is the dimension of the space of skew-symmetric matrices.
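The two dimension counts can be sanity-checked by counting the free positions directly; a minimal sketch:

```python
def dim_symmetric(n):
    # Entries on or above the diagonal determine a symmetric matrix.
    return sum(1 for i in range(n) for j in range(i, n))

def dim_skew(n):
    # Entries strictly above the diagonal determine a skew-symmetric matrix.
    return sum(1 for i in range(n) for j in range(i + 1, n))

print([(n, dim_symmetric(n), dim_skew(n)) for n in (2, 3, 4)])
```

Note that the two dimensions sum to n², matching the fact that every matrix is the sum of a symmetric and a skew-symmetric matrix (in characteristic ≠ 2).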
42. Following the hint, let B = {v1, . . . , vk} be a basis for U ∩ W. By Theorem 6.10(e), this can be extended to a basis C for U and to a basis D for W. Consider the set C ∪ D. Clearly it spans U + W, since an element of U + W is u + w for u ∈ U, w ∈ W, and C and D are bases for U and W. Next we must show that this set is linearly independent. Suppose dim U = m and dim W = n; then
C = {uk+1, . . . , um} ∪ B,   D = {wk+1, . . . , wn} ∪ B.
Suppose that a linear combination of C ∪ D is the zero vector, say
a1v1 + · · · + akvk + ck+1uk+1 + · · · + cmum + dk+1wk+1 + · · · + dnwn = A + C + D = 0,
where A, C, and D denote the three groups of terms in order. Then C = −A − D. Since A ∈ U ∩ W and D ∈ W, we see that −A − D ∈ W, so that C ∈ W. But also C ∈ U, since it is a combination of basis vectors for U. Thus C ∈ U ∩ W, which is impossible unless all of the ci are zero, since the ui for i > k are elements of U not in the subspace U ∩ W. We are left with A + D = 0; this is a linear combination of basis elements of W, so that all the ai and all the dj must be zero as well. Therefore C ∪ D is linearly independent.
It follows that C ∪ D forms a basis for U + W , so it remains to count the elements in this set. Clearly
C ∩ D = B, for any basis element in common between C and D must be in U ∩ W , so it must be in B.
Write #X for the number of elements in the set X. Then
dim(U + W ) = #(C ∪ D) = #C + #D − #B = n + m − k = dim U + dim W − dim(U ∩ W ).
43. Let dim U = m, with basis B = {u1, . . . , um}, and dim V = n, with basis C = {v1, . . . , vn}.
(a) Claim that a basis for U × V is D = {(u1, 0), . . . , (um, 0), (0, v1), . . . , (0, vn)}; counting the basis vectors, this will also show that dim(U × V) = m + n. First we want to show that D spans. Let x = (u, v) ∈ U × V. Then
x = (u, v) = (c1u1 + · · · + cmum, d1v1 + · · · + dnvn) = c1(u1, 0) + · · · + cm(um, 0) + d1(0, v1) + · · · + dn(0, vn),
proving that D spans U × V. To see that D is linearly independent, suppose that 0 is a linear combination of the elements of D. This means that
0 = (0, 0) = c1(u1, 0) + · · · + cm(um, 0) + d1(0, v1) + · · · + dn(0, vn) = (c1u1 + · · · + cmum, d1v1 + · · · + dnvn).
But the ui are linearly independent, so that c1u1 + · · · + cmum = 0 means that all of the ci are zero; similarly the vj are linearly independent, so that d1v1 + · · · + dnvn = 0 means that all of the dj are zero. Thus D is linearly independent.
So D is a basis for U × V, and therefore dim(U × V) = m + n.
(b) Let dim W = k, with basis {w1, . . . , wk}. Claim that D = {(w1, w1), . . . , (wk, wk)} is a basis for ∆; proving this claim also shows, by counting the basis vectors, that dim ∆ = dim W. To see that D spans ∆, choose (w, w) ∈ ∆; then w = c1w1 + · · · + ckwk, so that
(w, w) = (c1w1 + · · · + ckwk, c1w1 + · · · + ckwk) = c1(w1, w1) + · · · + ck(wk, wk).
Thus D spans ∆. To see that D is linearly independent, suppose that c1(w1, w1) + · · · + ck(wk, wk) = 0. Then
0 = c1(w1, w1) + · · · + ck(wk, wk) = (c1w1 + · · · + ckwk, c1w1 + · · · + ckwk).
This means that c1w1 + · · · + ckwk = 0; since the wi are linearly independent, we see that all the ci are zero. So D is linearly independent, showing that it is a basis for ∆ and thus that dim ∆ = dim W.
44. Suppose that P is finite-dimensional, with basis {p1(x), . . . , pn(x)}. Let m be the largest power of x that appears in any of the basis elements, and consider q(x) = x^(m+1). Since any linear combination of the pi has powers of x at most equal to m, it follows that q ∉ span{pi}. This contradicts the assumption that the pi form a basis, so P cannot have a finite basis.
45. Start by adding all three standard basis vectors for P2 to the set, yielding {1 + x, 1 + x + x², 1, x, x²}. Since (1 + x) + x² = 1 + x + x², this set is linearly dependent, so discard x², leaving {1 + x, 1 + x + x², 1, x}. Next, since 1 + x = 1 · 1 + 1 · x, the set {1 + x, 1, x} is linearly dependent, so discard x, leaving B = {1 + x, 1 + x + x², 1}. This is a set of three vectors, so to prove that it is a basis, it suffices to show that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. Hence B is a basis for P2.
46. Start by adding the standard basis vectors Eij to the set, giving
B = {C = [0 1; 0 1], D = [1 1; 0 1], E11, E12, E21, E22}.
Since E11 = D − C is the difference of the two given matrices, we discard it from B, leaving us with five elements. Next, C = E12 + E22, so discard E22, leaving us with B = {C, D, E12, E21}. This has four elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans and therefore forms a basis for M22.
47. Start by adding the standard basis vectors Eij to the set, giving
B = {I = [1 0; 0 1], C = [0 1; 1 0], D = [0 −1; 1 0], E11, E12, E21, E22}.
Since E21 = (1/2)(C + D), we discard it from B. Next, I = E11 + E22, so discard E22 from B. Finally, E12 = (1/2)(C − D), so we discard it from B. This leaves us with B = {I, C, D, E11}. This has four elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans and therefore forms a basis for M22.
48. Start by adding the standard basis vectors Eij to the set, giving
B = {I = [1 0; 0 1], C = [0 1; 1 0], E11, E12, E21, E22}.
Since I + C = E11 + E12 + E21 + E22 , these are a linearly dependent set; discard E22 from B. Next,
C = E12 + E21 , so discard E21 from B. This leaves us with B = {I, C, E11 , E12 }. This has four
elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by
construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans
and therefore forms a basis for M22 .
49. The standard basis for P1 is {1, x}. If we show that each of these vectors is in the span of the given
vectors, then those vectors span P1 . But 1 ∈ span(1, 1 + x, 2x), and x = (1 + x) − 1 ∈ span(1, 1 + x, 2x).
Thus
span(1, 1 + x, 2x) ⊇ span(1, x) = P1 ,
so that a basis for the span is {1, x}.
50. Since (1 − 2x) + (2x − x²) = 1 − x², these three elements are linearly dependent, so we can discard 1 − x² from the set without changing the span, leaving {1 − 2x, 2x − x², 1 + x²}. Claim these are linearly independent. Suppose that
a(1 − 2x) + b(2x − x²) + c(1 + x²) = (a + c) + (−2a + 2b)x + (−b + c)x² = 0.
Then a = −c and a = b from the constant term and the coefficient of x, so that b = −c. But then b = −c together with −b + c = 0 shows that b = c = 0, so that a = 0. Hence these three polynomials are linearly independent, so they form a basis for the span of the original set.
51. Since (1 − x) + (x − x²) = 1 − x² and (1 − x) − (x − x²) = 1 − 2x + x², we can remove both 1 − x² and 1 − 2x + x² from the set without changing the span. This leaves us with {1 − x, x − x²}. Claim these are linearly independent. Suppose that
a(1 − x) + b(x − x²) = a + (−a + b)x − bx² = 0.
Then clearly a = b = 0. Hence these two polynomials are linearly independent, so they form a basis for the span of the original set.
52. Let
V1 = [1 0; 0 1],  V2 = [0 1; 1 0],  V3 = [−1 1; 1 −1],  V4 = [1 −1; −1 1].
Then V3 + V4 = 0, so we can discard V4 from the set without changing the span. Next, V3 = V2 − V1 ,
so we can discard V3 from the set without changing the span. So we are left with {V1 , V2 }; these are
linearly independent since they are not scalar multiples of one another. So this set forms a basis for
the span of the given set.
53. Since cos 2x = cos²x − sin²x, we can discard cos 2x from the set without changing the span, leaving us with {sin²x, cos²x}. Claim that this set is linearly independent. For suppose that a sin²x + b cos²x = 0 holds for all x. Setting x = 0 gives b = 0, while setting x = π/2 gives a = 0. Thus a = b = 0 and these functions are linearly independent. So {sin²x, cos²x} forms a basis for the span of the original set.
54. Suppose that av + c1v1 + · · · + cnvn = 0, so that c1v1 + · · · + cnvn = −av. If a ≠ 0, then dividing through by −a shows that v ∈ span(S), a contradiction. Therefore we must have a = 0, and we are left with
c1v1 + · · · + cnvn = 0,
which implies (since the vi are linearly independent) that all of the ci are zero. Therefore all coefficients must be zero, so that S′ is linearly independent.
55. If vn ∈ span(v1 , . . . , vn−1 ), then there are constants such that
vn = c1 v1 + · · · + cn−1 vn−1 .
Now choose x ∈ V = span(S). Then
x = a1 v1 + · · · + an−1 vn−1 + an vn
= a1 v1 + · · · + an−1 vn−1 + an (c1 v1 + · · · + cn−1 vn−1 )
= (a1 + an c1 )v1 + · · · + (an−1 + an cn−1 )vn−1 ,
showing that x ∈ span(S′). Then V ⊆ span(S′) ⊆ span(S) = V, so that span(S′) = V.
6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION
475
56. Use induction on n. If S = {v1}, then S is linearly independent, so it is a basis. Next, suppose the statement is true for all sets that span V and have no more than n − 1 elements. Let S = {v1, . . . , vn}. If S is linearly independent, then it is a basis and we are done. Otherwise it is a linearly dependent set, so we can write some element of S (say, vn, after a possible reordering) as a linear combination of the others, so that vn ∈ span{v1, . . . , vn−1}. Then by Exercise 55, we can remove vn from S, giving S′ = {v1, . . . , vn−1}, which also spans V. By the inductive assumption, S′ can be made into a basis for V by removing some (possibly no) elements, so S can as well.
57. Suppose x ∈ V. Since S = {v1, . . . , vn} is a basis, there are scalars a1, . . . , an such that
x = a1v1 + a2v2 + · · · + anvn.
Since the ci are nonzero, we also have
x = (a1/c1)(c1v1) + (a2/c2)(c2v2) + · · · + (an/cn)(cnvn),
so that T = {c1v1, . . . , cnvn} spans V. To show T is linearly independent, the argument is much the same. Suppose that
b1(c1v1) + b2(c2v2) + · · · + bn(cnvn) = (b1c1)v1 + (b2c2)v2 + · · · + (bncn)vn = 0.
Since S is linearly independent, it follows that bici = 0 for all i. Since all of the ci are nonzero, all of the bi must be zero, so that T is linearly independent. Therefore T is also a basis for V.
58. Let
S = {v1 , v2 , . . . , vn },
T = {v1 , v1 + v2 , v1 + v2 + v3 , . . . , v1 + v2 + · · · + vn }.
By Exercise 21 in Section 2.3, to show that span(T ) = span(S) = V it suffices to show that every
element of S is in span(T ) and vice versa. But
vi = (v1 + v2 + · · · + vi−1 + vi ) − (v1 + v2 + · · · + vi−1 ) ∈ span(T ) and
v1 + · · · + vi = (v1 ) + (v2 ) + · · · + (vi ) ∈ span(S).
Thus the two spans are equal. Since S is a basis, and T has the same number of elements as S, it must
also be a basis, by Theorem 6.10(d).
59. (a)
p0(x) = ((x − a1)(x − a2)) / ((a0 − a1)(a0 − a2)) = ((x − 2)(x − 3)) / ((1 − 2)(1 − 3)) = (x² − 5x + 6)/2 = (1/2)x² − (5/2)x + 3
p1(x) = ((x − a0)(x − a2)) / ((a1 − a0)(a1 − a2)) = ((x − 1)(x − 3)) / ((2 − 1)(2 − 3)) = (x² − 4x + 3)/(−1) = −x² + 4x − 3
p2(x) = ((x − a0)(x − a1)) / ((a2 − a0)(a2 − a1)) = ((x − 1)(x − 2)) / ((3 − 1)(3 − 2)) = (x² − 3x + 2)/2 = (1/2)x² − (3/2)x + 1.
(b) If i = j, then substituting aj for x in the definition of pi gives a numerator equal to the denominator, so that pi(aj) = 1 when i = j. If i ≠ j, however, then substituting aj for x in the definition of pi gives zero, since the factor x − aj in the numerator of pi becomes zero. Thus pi(aj) = 0 if i ≠ j and pi(aj) = 1 if i = j.
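The defining property pi(aj) = 1 if i = j and 0 otherwise can be checked mechanically; a sketch using exact rational arithmetic, with the nodes a0 = 1, a1 = 2, a2 = 3 from part (a):

```python
from fractions import Fraction

def lagrange_basis(nodes):
    """Return the Lagrange basis polynomials for the given nodes as callables."""
    def p(i):
        def eval_at(x):
            num = den = Fraction(1)
            for j, a in enumerate(nodes):
                if j != i:
                    num *= x - a          # factor (x - a_j) in the numerator
                    den *= nodes[i] - a   # factor (a_i - a_j) in the denominator
            return num / den
        return eval_at
    return [p(i) for i in range(len(nodes))]

nodes = [1, 2, 3]
ps = lagrange_basis(nodes)
table = [[ps[i](a) for a in nodes] for i in range(3)]
print(table)  # the identity pattern: p_i(a_j) = 1 if i == j else 0
```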
60. (a) Note first that the degree of each pi is n, since there are n factors in the numerator, so that
pi ∈ Pn for all i. Suppose that there are constants ci such that
c0 p0 (x) + c1 p1 (x) + · · · + cn pn (x) = 0.
Then this equation must hold for all x. Substitute ai for x. Then by Exercise 59(b), all pj (ai )
for j 6= i vanish, leaving only ci pi (ai ) = ci . Therefore ci = 0. Since this holds for every i, we see
that all of the ci are zero, so that the pi are linearly independent.
(b) Since dim Pn = n + 1 and there are n + 1 polynomials p0 , p1 , . . . , pn that span, they must form
a basis, by Theorem 6.10(d).
61. (a) This is similar to part (a) of Exercise 60. Suppose that q(x) ∈ P, and that
q(x) = c0 p0 (x) + c1 p1 (x) + · · · + cn pn (x).
Then substituting ai for x gives q(ai ) = ci pi (ai ) = ci , since pj (ai ) = 0 for j 6= i by Exercise
59(b). Since the pi form a basis for P, any element of P has a unique representation as a linear
combination of the pi , so that unique representation must be the one we just found:
q(x) = q(a0 )p0 (x) + q(a1 )p1 (x) + · · · + q(an )pn (x).
(b) q(x) passes through the given n + 1 points if and only if q(ai) = ci for each i. But this is exactly what we proved in part (a). Since the representation in part (a) is unique, this is the only such polynomial of degree at most n.
(c) (i) Here a0 = 1, a1 = 2, a2 = 3, so we use the Lagrange polynomials determined in Exercise 59(a) and therefore
q(x) = c0p0(x) + c1p1(x) + c2p2(x)
= 6((1/2)x² − (5/2)x + 3) − (−x² + 4x − 3) − 2((1/2)x² − (3/2)x + 1)
= 3x² − 16x + 19.
(ii) Here a0 = −1, a1 = 0, a2 = 3. We first compute the Lagrange polynomials:
p0(x) = ((x − a1)(x − a2)) / ((a0 − a1)(a0 − a2)) = ((x − 0)(x − 3)) / ((−1 − 0)(−1 − 3)) = (x² − 3x)/4 = (1/4)x² − (3/4)x
p1(x) = ((x − a0)(x − a2)) / ((a1 − a0)(a1 − a2)) = ((x − (−1))(x − 3)) / ((0 − (−1))(0 − 3)) = (x² − 2x − 3)/(−3) = −(1/3)x² + (2/3)x + 1
p2(x) = ((x − a0)(x − a1)) / ((a2 − a0)(a2 − a1)) = ((x − (−1))(x − 0)) / ((3 − (−1))(3 − 0)) = (x² + x)/12 = (1/12)x² + (1/12)x.
Using these polynomials, we get
q(x) = c0p0(x) + c1p1(x) + c2p2(x)
= 10((1/4)x² − (3/4)x) + 5(−(1/3)x² + (2/3)x + 1) + 2((1/12)x² + (1/12)x)
= (5/2 − 5/3 + 1/6)x² + (−15/2 + 10/3 + 1/6)x + 5
= x² − 4x + 5.
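The interpolation in part (ii) can be checked with a short routine. This is a sketch, using exact arithmetic; the data points are those of the exercise:

```python
from fractions import Fraction

def interpolate(points):
    """Lagrange interpolation through (a_i, c_i); returns q as a callable."""
    def q(x):
        total = Fraction(0)
        for i, (ai, ci) in enumerate(points):
            term = Fraction(ci)
            for j, (aj, _) in enumerate(points):
                if j != i:
                    term *= Fraction(x - aj, ai - aj)
            total += term
        return total
    return q

q = interpolate([(-1, 10), (0, 5), (3, 2)])
print([q(x) for x in (-1, 0, 3)])  # recovers the data values 10, 5, 2
```

Evaluating q at other points agrees with the closed form x² − 4x + 5 found above.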
62. Let (a0, 0), (a1, 0), . . . , (an, 0) be the n + 1 distinct zeros. Then by Exercise 61(a), the only polynomial of degree at most n passing through these n + 1 points is
q(x) = 0p0(x) + 0p1(x) + · · · + 0pn(x) = 0,
so q is the zero polynomial.
63. An n × n matrix is invertible if and only if it row-reduces to the identity matrix, so that its columns are linearly independent. Since n linearly independent columns in Z_p^n form a basis for Z_p^n, finding the number of invertible matrices in Mnn(Zp) is the same as finding the number of ways to construct a basis for Z_p^n.
To construct a basis, we can start by choosing any nonzero vector v1 ∈ Z_p^n. There are p^n − 1 ways to choose this vector (since Z_p^n has p^n elements, and we are excluding only the zero vector). For the second basis element v2, we can choose any vector not in span(v1) = {a1v1 : a1 ∈ Zp}, so there are p^n − p choices for v2. Likewise, suppose we have chosen v1, . . . , vk−1. Then we can choose for vk any vector not in the span of the previous vectors, so any vector not in
span(v1, . . . , vk−1) = {a1v1 + a2v2 + · · · + ak−1vk−1 : a1, a2, . . . , ak−1 ∈ Zp},
a set with p^(k−1) elements. There are therefore p^n − p^(k−1) ways of choosing vk.
Thus the total number of bases is (p^n − 1)(p^n − p)(p^n − p²) · · · (p^n − p^(n−1)).
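For small cases this count can be checked by brute force. A sketch for n = 2 and prime p, using the fact that over the field Zp a 2 × 2 matrix is invertible exactly when its determinant is nonzero mod p:

```python
from itertools import product

def count_invertible(p):
    """Brute-force count of invertible 2x2 matrices over Z_p (p prime)."""
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p != 0)

for p in (2, 3, 5):
    # Compare against the formula (p^n - 1)(p^n - p) with n = 2.
    print(p, count_invertible(p), (p**2 - 1) * (p**2 - p))
```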
Exploration: Magic Squares
1. A classical magic square M contains each of the numbers 1, 2, . . . , n² exactly once. Since wt(M) is the common sum of any row (or column), and there are n rows, we have (using the comments preceding Exercise 51 in Section 2.4)
wt(M) = (1/n)(1 + 2 + · · · + n²) = (1/n) · n²(n² + 1)/2 = n(n² + 1)/2.
2
2. A classical 3 × 3 magic square will have common sum 3(3^2 + 1)/2 = 15. Therefore no two of 7, 8,
and 9 can be in the same row, column, or diagonal, since the sum of any two of these numbers is at
least 15. The central square cannot be 1: every other entry lies on a line through the center, so some
line would contain both 1 and 2, forcing a third entry k with 1 + 2 + k = 15, which is impossible.
Similarly, it cannot be any of 2, 3, or 4, since the line through the center and the entry 1 would then
need a third entry of 12, 11, or 10, respectively, which is impossible. The central square cannot be
7, 8, or 9, since then some line will contain two of those numbers. Finally, it cannot be 6, since then
some line will contain both 6 and 9. So the central square must be 5. Also, 7 and 1 cannot be in the
same line, since the third number would have to be another 7. Finally, 9 cannot be in a corner, since
it would then participate in three lines adding to 15, but there are only two possible sets of numbers
containing 9 that add to 15: {9, 5, 1} and {9, 4, 2}. Thus 9 must be in the middle of a side. Keeping
these constraints in mind and trying a few configurations gives, for example,

[2 7 6]       [8 1 6]
[9 5 1]  and  [3 5 7]
[4 3 8]       [4 9 2]

These examples are essentially the same: the second one is a rotated and reflected version of the
first.
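The two squares can be checked mechanically. A minimal sketch in pure Python:

```python
def lines(M):
    """The three rows, three columns, and two diagonals of a 3 x 3 square."""
    rows = [list(r) for r in M]
    cols = [[M[i][j] for i in range(3)] for j in range(3)]
    diags = [[M[i][i] for i in range(3)], [M[i][2 - i] for i in range(3)]]
    return rows + cols + diags

def is_classical_magic(M, w=15):
    """Entries 1..9 exactly once, and every line sums to w."""
    entries = sorted(v for row in M for v in row)
    return entries == list(range(1, 10)) and all(sum(l) == w for l in lines(M))

M1 = [[2, 7, 6], [9, 5, 1], [4, 3, 8]]
M2 = [[8, 1, 6], [3, 5, 7], [4, 9, 2]]
```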
3. Since the magic squares in Exercise 2 have a weight of 15, subtracting 14/3 from each entry will subtract
14 from each row, column, and diagonal sum while maintaining common sums, so the result will have
weight 1. For example, using the second magic square above, we get

[ 10/3  −11/3    4/3]
[ −5/3    1/3    7/3]
[ −2/3   13/3   −8/3]

In general, starting from any 3 × 3 classical magic square, to construct a magic square of weight w,
add (w − 15)/3 to each entry in the magic square. This clearly preserves the common-sum property, and
the new sum of each row is 15 + 3 · (w − 15)/3 = w.
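The weight-w construction can be checked with exact rational arithmetic. A minimal sketch in pure Python, starting from the second square of Exercise 2:

```python
from fractions import Fraction

M = [[8, 1, 6], [3, 5, 7], [4, 9, 2]]  # classical square, weight 15

def reweight(M, w):
    """Add (w - 15)/3 to every entry, as described above."""
    shift = Fraction(w - 15, 3)
    return [[e + shift for e in row] for row in M]

def line_sums(M):
    rows = [sum(r) for r in M]
    cols = [sum(M[i][j] for i in range(3)) for j in range(3)]
    diags = [sum(M[i][i] for i in range(3)), sum(M[i][2 - i] for i in range(3))]
    return rows + cols + diags

weight_one = reweight(M, 1)   # every line now sums to 1
```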
4. (a) We must show that if A and B are magic squares, then so is A + B, and if c is any scalar, then
cA is also a magic square. Suppose wt(A) = a and wt(B) = b. Then the sum of the elements in
any row of A + B is the sum of the elements in that row of A plus the sum of the elements in that
row of B, which is therefore a + b. The same argument shows that the sum of the elements in any
column or either diagonal of A + B is also a + b, so that A + B ∈ Mag_3 with weight a + b. In the
same way, the sum of the elements in any row or column, or either diagonal, of cA is c times the
sum of the elements in that line of A, so it is equal to ca. Thus cA ∈ Mag_3 with weight ca.
478
CHAPTER 6. VECTOR SPACES
(b) If A, B ∈ Mag_3^0, then wt(A) = wt(B) = 0. From part (a), wt(A + B) = 0 as well, and also
wt(cA) = c wt(A) = 0. Thus A + B, cA ∈ Mag_3^0, so by Theorem 6.2, Mag_3^0 is a subspace of Mag_3.
5. Let M be any 3 × 3 magic square, with wt(M) = w. Using Exercise 3, we define M_0 to be the magic
square of weight 0 derived from M by adding −w/3 to every element of M. This is the same as adding
the matrix

[−w/3 −w/3 −w/3]
[−w/3 −w/3 −w/3]  =  −(w/3) J
[−w/3 −w/3 −w/3]

to M. Thus

M_0 = M − (w/3) J, so that M = M_0 + (w/3) J.

Thus k = w/3.
6. The linear equations express the fact that every row sum, every column sum, and each diagonal sum is zero:

a + b + c = 0
d + e + f = 0
g + h + i = 0
a + d + g = 0
b + e + h = 0
c + f + i = 0
a + e + i = 0
c + e + g = 0
Row-reducing the associated matrix gives

[1 1 1 0 0 0 0 0 0]       [1 0 0 0 0 0 0  0  1]
[0 0 0 1 1 1 0 0 0]       [0 1 0 0 0 0 0  1  0]
[0 0 0 0 0 0 1 1 1]       [0 0 1 0 0 0 0 −1 −1]
[1 0 0 1 0 0 1 0 0]  →    [0 0 0 1 0 0 0 −1 −2]
[0 1 0 0 1 0 0 1 0]       [0 0 0 0 1 0 0  0  0]
[0 0 1 0 0 1 0 0 1]       [0 0 0 0 0 1 0  1  2]
[1 0 0 0 1 0 0 0 1]       [0 0 0 0 0 0 1  1  1]
[0 0 1 0 1 0 1 0 0]       [0 0 0 0 0 0 0  0  0]
So there are two free variables in the general solution, which gives the magic square

[ −s       −t      s + t  ]
[ t + 2s    0     −t − 2s ]
[ −s − t    t      s      ]
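The two-parameter family can be verified directly: for every choice of s and t, all eight line sums vanish. A minimal sketch in pure Python:

```python
def magic_zero(s, t):
    """The general weight-0 magic square found above."""
    return [[-s, -t, s + t],
            [t + 2 * s, 0, -t - 2 * s],
            [-s - t, t, s]]

def line_sums(M):
    rows = [sum(r) for r in M]
    cols = [sum(M[i][j] for i in range(3)) for j in range(3)]
    diags = [sum(M[i][i] for i in range(3)), sum(M[i][2 - i] for i in range(3))]
    return rows + cols + diags

all_zero = all(v == 0
               for s in range(-3, 4) for t in range(-3, 4)
               for v in line_sums(magic_zero(s, t)))
```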
7. From Exercise 6, a basis for Mag_3^0 is given by

     [−1 0  1]        [ 0 −1  1]
A =  [ 2 0 −2] ,  B = [ 1  0 −1] ,
     [−1 0  1]        [−1  1  0]

so that Mag_3^0 has dimension 2. Note that these matrices are linearly independent since they are not
scalar multiples of each other. (The matrix in Exercise 6 is not the same as that given in the hint. To
show that the two are just different forms of one another, make the substitutions u = −s, v = s + t;
then s = −u and t = v − s = u + v, so the matrix becomes

[ u       −u − v    v    ]
[−u + v    0        u − v]
[−v        u + v   −u    ]

which is the matrix given in the hint.)
8. From Exercise 5, an arbitrary element M ∈ Mag_3 can be written

M = M_0 + kJ,    M_0 ∈ Mag_3^0,    J = [1 1 1; 1 1 1; 1 1 1].

Since Mag_3^0 has basis {A, B} (see Exercise 7), it follows that Mag_3 is spanned by S = {A, B, J}. To see that
this is a basis, we must prove that these matrices are linearly independent. Now, we have seen that
A and B are linearly independent, so if S is linearly dependent, then span(S) = span(A, B) = Mag_3^0.
But this is impossible, since S spans Mag_3, which contains matrices not in Mag_3^0. Thus S is linearly
independent, so it forms a basis for Mag_3, which therefore has dimension 3.
9. Writing M = [a b c; d e f; g h i] as in Exercise 6, adding both diagonals and the middle column gives

(a + e + i) + (c + e + g) + (b + e + h) = (a + b + c) + (g + h + i) + 3e,

which is the sum of the first and third rows, plus 3 times the central entry. Since each row, column,
and diagonal sum is w, we get w + w + w = w + w + 3e, so that e = w/3.
10. The sum of the squares of the entries of M is

s^2 + (−s − t)^2 + t^2 + (−s + t)^2 + 0^2 + (s − t)^2 + (−t)^2 + (s + t)^2 + (−s)^2
  = s^2 + (s^2 + 2st + t^2) + t^2 + (s^2 − 2st + t^2) + 0 + (s^2 − 2st + t^2) + t^2 + (s^2 + 2st + t^2) + s^2
  = 6s^2 + 6t^2.

We have another way of computing this sum, however. Since M was obtained from a classical 3 × 3
magic square by subtracting 5 from each entry (see Exercise 5), the sum of the squares of the entries
of M is also

(1 − 5)^2 + (2 − 5)^2 + ··· + (9 − 5)^2 = 60.

So the equation is 6s^2 + 6t^2 = 60, or s^2 + t^2 = 10, which is a circle of radius √10 centered at the origin.
The only way to write 10 as a sum of two integral squares is (±1)^2 + (±3)^2, so this equation has the
eight integral solutions

(s, t) = (−3, −1), (−3, 1), (−1, −3), (−1, 3), (3, −1), (3, 1), (1, −3), (1, 3).
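Both the enumeration of integer points and the squares they produce in the next step can be checked by brute force. A minimal sketch in pure Python, where classical(s, t) applies the M + 5J construction used below:

```python
# All integer points on s^2 + t^2 = 10
sols = [(s, t) for s in range(-4, 5) for t in range(-4, 5)
        if s * s + t * t == 10]

def classical(s, t):
    """M + 5J for the given parameters."""
    return [[s + 5, -s - t + 5, t + 5],
            [-s + t + 5, 5, s - t + 5],
            [-t + 5, s + t + 5, -s + 5]]

# Every solution yields a square using each of 1..9 exactly once
all_classical = all(sorted(v for row in classical(s, t) for v in row)
                    == list(range(1, 10))
                    for s, t in sols)
```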
Substituting those values of s and t into

          [ s + 5       −s − t + 5    t + 5    ]
M + 5J =  [−s + t + 5    5            s − t + 5]
          [−t + 5        s + t + 5   −s + 5    ]
gives the eight classical magic squares

[2 9 4]   [8 3 4]   [2 7 6]   [8 1 6]
[7 5 3]   [1 5 9]   [9 5 1]   [3 5 7]
[6 1 8]   [6 7 2]   [4 3 8]   [4 9 2]

[4 9 2]   [6 7 2]   [4 3 8]   [6 1 8]
[3 5 7]   [1 5 9]   [9 5 1]   [7 5 3]
[8 1 6]   [8 3 4]   [2 7 6]   [2 9 4]

These matrices can all be obtained from one another by some combination of row and column interchanges and matrix transposition. A plot of the circle, with these eight points marked, is
[Figure: the circle s^2 + t^2 = 10 in the (s, t)-plane, with the eight integer points (±1, ±3) and (±3, ±1) marked.]
6.3 Change of Basis
1. (a)
[2; 3] = b_1 [1; 0] + b_2 [0; 1]  ⇒  b_1 = 2, b_2 = 3  ⇒  [x]_B = [2; 3]
[2; 3] = c_1 [1; 1] + c_2 [1; −1]  ⇒  c_1 = 5/2, c_2 = −1/2  ⇒  [x]_C = [5/2; −1/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1  1 | 1 0]  →  [1 0 | 1/2  1/2]   ⇒  P_{C←B} = [1/2  1/2]
[1 −1 | 0 1]     [0 1 | 1/2 −1/2]                [1/2 −1/2]

(c) [x]_C = P_{C←B}[x]_B = [1/2 1/2; 1/2 −1/2][2; 3] = [5/2; −1/2], which is the same as the answer from part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 | 1  1]  ⇒  P_{B←C} = [1  1]
[0 1 | 1 −1]               [1 −1]

(e) [x]_B = P_{B←C}[x]_C = [1 1; 1 −1][5/2; −1/2] = [2; 3], which is the same as the answer from part (a).
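The coordinate computations in Exercise 1 can be replayed with exact arithmetic. A minimal sketch in pure Python, using the matrices found above:

```python
from fractions import Fraction as F

def matvec(A, v):
    return [sum(F(A[i][j]) * v[j] for j in range(len(v))) for i in range(len(A))]

C = [[1, 1], [1, -1]]                            # columns are the C basis vectors
P_CB = [[F(1, 2), F(1, 2)], [F(1, 2), F(-1, 2)]]  # P_{C<-B}
P_BC = [[1, 1], [1, -1]]                          # P_{B<-C}

xB = [F(2), F(3)]           # coordinates in B (the standard basis)
xC = matvec(P_CB, xB)       # change to C coordinates
xB_back = matvec(P_BC, xC)  # and back again
```

Since B is the standard basis here, C applied to [x]_C must reproduce [x]_B; that is the "same vector, two coordinate systems" check.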
2. (a)
[4; −1] = b_1 [1; 0] + b_2 [1; 1]  ⇒  b_1 = 5, b_2 = −1  ⇒  [x]_B = [5; −1]
[4; −1] = c_1 [0; 1] + c_2 [2; 3]  ⇒  c_1 = −7, c_2 = 2  ⇒  [x]_C = [−7; 2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[0 2 | 1 1]  →  [1 0 | −3/2 −1/2]  ⇒  P_{C←B} = [−3/2 −1/2]
[1 3 | 0 1]     [0 1 |  1/2  1/2]               [ 1/2  1/2]

(c) [x]_C = P_{C←B}[x]_B = [−3/2 −1/2; 1/2 1/2][5; −1] = [−7; 2], which is the same as the answer from part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 1 | 0 2]  →  [1 0 | −1 −1]  ⇒  P_{B←C} = [−1 −1]
[0 1 | 1 3]     [0 1 |  1  3]               [ 1  3]

(e) [x]_B = P_{B←C}[x]_C = [−1 −1; 1 3][−7; 2] = [5; −1], which is the same as the answer from part (a).
3. (a)
[1; 0; −1] = b_1 [1; 0; 0] + b_2 [0; 1; 0] + b_3 [0; 0; 1]  ⇒  b_1 = 1, b_2 = 0, b_3 = −1  ⇒  [x]_B = [1; 0; −1]
[1; 0; −1] = c_1 [1; 1; 1] + c_2 [0; 1; 1] + c_3 [0; 0; 1]  ⇒  c_1 = 1, c_2 = −1, c_3 = −1  ⇒  [x]_C = [1; −1; −1]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 0 0 | 1 0 0]      [1 0 0 |  1  0 0]                  [ 1  0 0]
[1 1 0 | 0 1 0]  →   [0 1 0 | −1  1 0]  ⇒  P_{C←B} =   [−1  1 0]
[1 1 1 | 0 0 1]      [0 0 1 |  0 −1 1]                  [ 0 −1 1]

(c) [x]_C = P_{C←B}[x]_B = [1 0 0; −1 1 0; 0 −1 1][1; 0; −1] = [1; −1; −1], which is the same as the answer from part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 0 | 1 0 0]                  [1 0 0]
[0 1 0 | 1 1 0]  ⇒  P_{B←C} =   [1 1 0]
[0 0 1 | 1 1 1]                  [1 1 1]

(e) [x]_B = P_{B←C}[x]_C = [1 0 0; 1 1 0; 1 1 1][1; −1; −1] = [1; 0; −1], which is the same as the answer from part (a).
4. (a)
[3; 1; 5] = b_1 [0; 1; 0] + b_2 [0; 0; 1] + b_3 [1; 0; 0]  ⇒  b_1 = 1, b_2 = 5, b_3 = 3  ⇒  [x]_B = [1; 5; 3]
[3; 1; 5] = c_1 [1; 1; 0] + c_2 [0; 1; 1] + c_3 [1; 0; 1]  ⇒  c_1 = −1/2, c_2 = 3/2, c_3 = 7/2  ⇒  [x]_C = [−1/2; 3/2; 7/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 0 1 | 0 0 1]      [1 0 0 |  1/2 −1/2  1/2]                  [ 1/2 −1/2  1/2]
[1 1 0 | 1 0 0]  →   [0 1 0 |  1/2  1/2 −1/2]  ⇒  P_{C←B} =   [ 1/2  1/2 −1/2]
[0 1 1 | 0 1 0]      [0 0 1 | −1/2  1/2  1/2]                  [−1/2  1/2  1/2]

(c) [x]_C = P_{C←B}[x]_B = [1/2 −1/2 1/2; 1/2 1/2 −1/2; −1/2 1/2 1/2][1; 5; 3] = [−1/2; 3/2; 7/2], which is the same as the answer from part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[0 0 1 | 1 0 1]      [1 0 0 | 1 1 0]                  [1 1 0]
[1 0 0 | 1 1 0]  →   [0 1 0 | 0 1 1]  ⇒  P_{B←C} =   [0 1 1]
[0 1 0 | 0 1 1]      [0 0 1 | 1 0 1]                  [1 0 1]

(e) [x]_B = P_{B←C}[x]_C = [1 1 0; 0 1 1; 1 0 1][−1/2; 3/2; 7/2] = [1; 5; 3], which is the same as the answer from part (a).
5. Note that in terms of the standard basis {1, x},

B = {[1; 0], [0; 1]},   C = {[0; 1], [1; 1]}.

(a)
2 − x = b_1(1) + b_2(x)  ⇒  b_1 = 2, b_2 = −1  ⇒  [2 − x]_B = [2; −1]
2 − x = c_1(x) + c_2(1 + x)  ⇒  c_1 = −3, c_2 = 2  ⇒  [2 − x]_C = [−3; 2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[0 1 | 1 0]  →  [1 0 | −1 1]  ⇒  P_{C←B} = [−1 1]
[1 1 | 0 1]     [0 1 |  1 0]                [ 1 0]

(c) [2 − x]_C = P_{C←B}[2 − x]_B = [−1 1; 1 0][2; −1] = [−3; 2], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 | 0 1]  ⇒  P_{B←C} = [0 1]
[0 1 | 1 1]                [1 1]

(e) [2 − x]_B = P_{B←C}[2 − x]_C = [0 1; 1 1][−3; 2] = [2; −1], which agrees with the answer in part (a).
6. Note that in terms of the standard basis {1, x},

B = {[1; 1], [1; −1]},   C = {[0; 2], [4; 0]}.

(a)
1 + 3x = b_1(1 + x) + b_2(1 − x)  ⇒  b_1 = 2, b_2 = −1  ⇒  [1 + 3x]_B = [2; −1]
1 + 3x = c_1(2x) + c_2(4)  ⇒  c_1 = 3/2, c_2 = 1/4  ⇒  [1 + 3x]_C = [3/2; 1/4]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[0 4 | 1  1]  →  [1 0 | 1/2 −1/2]  ⇒  P_{C←B} = [1/2 −1/2]
[2 0 | 1 −1]     [0 1 | 1/4  1/4]               [1/4  1/4]

(c) [1 + 3x]_C = P_{C←B}[1 + 3x]_B = [1/2 −1/2; 1/4 1/4][2; −1] = [3/2; 1/4], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1  1 | 0 4]  →  [1 0 |  1 2]  ⇒  P_{B←C} = [ 1 2]
[1 −1 | 2 0]     [0 1 | −1 2]               [−1 2]

(e) [1 + 3x]_B = P_{B←C}[1 + 3x]_C = [1 2; −1 2][3/2; 1/4] = [2; −1], which agrees with the answer in part (a).

7. Note that in terms of the standard basis {1, x, x^2},

B = {[1; 1; 1], [0; 1; 1], [0; 0; 1]},   C = {[1; 0; 0], [0; 1; 0], [0; 0; 1]}.
(a)
1 + x^2 = b_1(1 + x + x^2) + b_2(x + x^2) + b_3(x^2)  ⇒  b_1 = 1, b_2 = −1, b_3 = 1  ⇒  [1 + x^2]_B = [1; −1; 1]
1 + x^2 = c_1(1) + c_2(x) + c_3(x^2)  ⇒  c_1 = 1, c_2 = 0, c_3 = 1  ⇒  [1 + x^2]_C = [1; 0; 1]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 0 0 | 1 0 0]                  [1 0 0]
[0 1 0 | 1 1 0]  ⇒  P_{C←B} =   [1 1 0]
[0 0 1 | 1 1 1]                  [1 1 1]

(c) [1 + x^2]_C = P_{C←B}[1 + x^2]_B = [1 0 0; 1 1 0; 1 1 1][1; −1; 1] = [1; 0; 1], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 0 | 1 0 0]      [1 0 0 |  1  0 0]                  [ 1  0 0]
[1 1 0 | 0 1 0]  →   [0 1 0 | −1  1 0]  ⇒  P_{B←C} =   [−1  1 0]
[1 1 1 | 0 0 1]      [0 0 1 |  0 −1 1]                  [ 0 −1 1]

(e) [1 + x^2]_B = P_{B←C}[1 + x^2]_C = [1 0 0; −1 1 0; 0 −1 1][1; 0; 1] = [1; −1; 1], which agrees with the answer in part (a).
8. Note that in terms of the standard basis {1, x, x^2},

B = {[0; 1; 0], [1; 0; 1], [0; 1; 1]},   C = {[1; 0; 0], [1; 1; 0], [0; 0; 1]}.

(a)
4 − 2x − x^2 = b_1(x) + b_2(1 + x^2) + b_3(x + x^2)  ⇒  b_1 = 3, b_2 = 4, b_3 = −5  ⇒  [4 − 2x − x^2]_B = [3; 4; −5]
4 − 2x − x^2 = c_1(1) + c_2(1 + x) + c_3(x^2)  ⇒  c_1 = 6, c_2 = −2, c_3 = −1  ⇒  [4 − 2x − x^2]_C = [6; −2; −1]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 1 0 | 0 1 0]      [1 0 0 | −1 1 −1]                  [−1 1 −1]
[0 1 0 | 1 0 1]  →   [0 1 0 |  1 0  1]  ⇒  P_{C←B} =   [ 1 0  1]
[0 0 1 | 0 1 1]      [0 0 1 |  0 1  1]                  [ 0 1  1]

(c) [4 − 2x − x^2]_C = P_{C←B}[4 − 2x − x^2]_B = [−1 1 −1; 1 0 1; 0 1 1][3; 4; −5] = [6; −2; −1], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[0 1 0 | 1 1 0]      [1 0 0 |  1  2 −1]                  [ 1  2 −1]
[1 0 1 | 0 1 0]  →   [0 1 0 |  1  1  0]  ⇒  P_{B←C} =   [ 1  1  0]
[0 1 1 | 0 0 1]      [0 0 1 | −1 −1  1]                  [−1 −1  1]

(e) [4 − 2x − x^2]_B = P_{B←C}[4 − 2x − x^2]_C = [1 2 −1; 1 1 0; −1 −1 1][6; −2; −1] = [3; 4; −5], which agrees with the answer in part (a).
9. (a) Working with coordinate vectors in R^4 (so that a 2 × 2 matrix corresponds to the column of its entries):

[4; 2; 0; −1] = b_1 [1; 0; 0; 0] + b_2 [0; 1; 0; 0] + b_3 [0; 0; 1; 0] + b_4 [0; 0; 0; 1]
  ⇒  b_1 = 4, b_2 = 2, b_3 = 0, b_4 = −1  ⇒  [A]_B = [4; 2; 0; −1]
[4; 2; 0; −1] = c_1 [1; 2; 0; −1] + c_2 [2; 1; 1; 0] + c_3 [1; 1; 0; 1] + c_4 [1; 0; 0; 1]
  ⇒  c_1 = 5/2, c_2 = 0, c_3 = −3, c_4 = 9/2  ⇒  [A]_C = [5/2; 0; −3; 9/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[ 1 2 1 1 | 1 0 0 0]      [1 0 0 0 |  1/2  0 −1 −1/2]
[ 2 1 1 0 | 0 1 0 0]  →   [0 1 0 0 |  0    0  1  0  ]
[ 0 1 0 0 | 0 0 1 0]      [0 0 1 0 | −1    1  1  1  ]
[−1 0 1 1 | 0 0 0 1]      [0 0 0 1 |  3/2 −1 −2 −1/2]

⇒ P_{C←B} = [1/2 0 −1 −1/2; 0 0 1 0; −1 1 1 1; 3/2 −1 −2 −1/2].

(c) [A]_C = P_{C←B}[A]_B = P_{C←B}[4; 2; 0; −1] = [5/2; 0; −3; 9/2], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]; since B is the standard basis, this gives immediately

P_{B←C} = [1 2 1 1; 2 1 1 0; 0 1 0 0; −1 0 1 1].

(e) [A]_B = P_{B←C}[A]_C = P_{B←C}[5/2; 0; −3; 9/2] = [4; 2; 0; −1], which agrees with the result in part (a).
10. (a)
[1; 1; 1; 1] = b_1 [1; 0; 0; 1] + b_2 [0; 1; 1; 0] + b_3 [1; 1; 0; 0] + b_4 [1; 0; 1; 0]
  ⇒  b_1 = 1, b_2 = 1, b_3 = 0, b_4 = 0  ⇒  [A]_B = [1; 1; 0; 0]
[1; 1; 1; 1] = c_1 [1; 1; 0; 1] + c_2 [1; 1; 1; 0] + c_3 [1; 0; 1; 1] + c_4 [0; 1; 1; 1]
  ⇒  c_1 = c_2 = c_3 = c_4 = 1/3  ⇒  [A]_C = [1/3; 1/3; 1/3; 1/3]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 1 1 0 | 1 0 1 1]      [1 0 0 0 |  2/3 −1/3  2/3 −1/3]
[1 1 0 1 | 0 1 1 0]  →   [0 1 0 0 | −1/3  2/3  2/3  2/3]
[0 1 1 1 | 0 1 0 1]      [0 0 1 0 |  2/3 −1/3 −1/3  2/3]
[1 0 1 1 | 1 0 0 0]      [0 0 0 1 | −1/3  2/3 −1/3 −1/3]

⇒ P_{C←B} = [2/3 −1/3 2/3 −1/3; −1/3 2/3 2/3 2/3; 2/3 −1/3 −1/3 2/3; −1/3 2/3 −1/3 −1/3].

(c) [A]_C = P_{C←B}[A]_B = P_{C←B}[1; 1; 0; 0] = [1/3; 1/3; 1/3; 1/3], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 1 1 | 1 1 1 0]      [1 0 0 0 |  1    0    1    1  ]
[0 1 1 0 | 1 1 0 1]  →   [0 1 0 0 |  1/2  1/2  1/2  3/2]
[0 1 0 1 | 0 1 1 1]      [0 0 1 0 |  1/2  1/2 −1/2 −1/2]
[1 0 0 0 | 1 0 1 1]      [0 0 0 1 | −1/2  1/2  1/2 −1/2]

⇒ P_{B←C} = [1 0 1 1; 1/2 1/2 1/2 3/2; 1/2 1/2 −1/2 −1/2; −1/2 1/2 1/2 −1/2].

(e) [A]_B = P_{B←C}[A]_C = P_{B←C}[1/3; 1/3; 1/3; 1/3] = [1; 1; 0; 0], which agrees with the result in part (a).
11. Note that in terms of the standard basis {sin x, cos x},

B = {[1; 1], [0; 1]},   C = {[1; 1], [1; −1]}.

(a)
2 sin x − 3 cos x = b_1(sin x + cos x) + b_2(cos x)  ⇒  b_1 = 2, b_2 = −5  ⇒  [2 sin x − 3 cos x]_B = [2; −5]
2 sin x − 3 cos x = c_1(sin x + cos x) + c_2(sin x − cos x)  ⇒  c_1 = −1/2, c_2 = 5/2  ⇒  [2 sin x − 3 cos x]_C = [−1/2; 5/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1  1 | 1 0]  →  [1 0 | 1  1/2]  ⇒  P_{C←B} = [1  1/2]
[1 −1 | 0 1]     [0 1 | 0 −1/2]               [0 −1/2]

(c) [2 sin x − 3 cos x]_C = P_{C←B}[2 sin x − 3 cos x]_B = [1 1/2; 0 −1/2][2; −5] = [−1/2; 5/2], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 | 1  1]  →  [1 0 | 1  1]  ⇒  P_{B←C} = [1  1]
[1 1 | 1 −1]     [0 1 | 0 −2]               [0 −2]

(e) [2 sin x − 3 cos x]_B = P_{B←C}[2 sin x − 3 cos x]_C = [1 1; 0 −2][−1/2; 5/2] = [2; −5], which agrees with the answer in part (a).
12. Note that in terms of the standard basis {sin x, cos x},

B = {[1; 1], [0; 1]},   C = {[−1; 1], [1; 1]}.

(a)
sin x = b_1(sin x + cos x) + b_2(cos x)  ⇒  b_1 = 1, b_2 = −1  ⇒  [sin x]_B = [1; −1]
sin x = c_1(cos x − sin x) + c_2(sin x + cos x)  ⇒  c_1 = −1/2, c_2 = 1/2  ⇒  [sin x]_C = [−1/2; 1/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[−1 1 | 1 0]  →  [1 0 | 0 1/2]  ⇒  P_{C←B} = [0 1/2]
[ 1 1 | 0 1]     [0 1 | 1 1/2]               [1 1/2]

(c) [sin x]_C = P_{C←B}[sin x]_B = [0 1/2; 1 1/2][1; −1] = [−1/2; 1/2], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 | −1 1]  →  [1 0 | −1 1]  ⇒  P_{B←C} = [−1 1]
[1 1 |  1 1]     [0 1 |  2 0]               [ 2 0]

(e) [sin x]_B = P_{B←C}[sin x]_C = [−1 1; 2 0][−1/2; 1/2] = [1; −1], which agrees with the answer in part (a).
13. From Section 3.6, Example 3.58, if B is the usual basis and C is the basis whose basis vectors are the
rotations of the usual basis vectors through 60°, the matrix of the rotation is

P_{C←B} = [cos 60°  sin 60°; −sin 60°  cos 60°] = [1/2  √3/2; −√3/2  1/2].

Then

(a) [x]_C = P_{C←B}[x]_B = [1/2 √3/2; −√3/2 1/2][3; 2] = [3/2 + √3; 1 − 3√3/2].

(b) Since P_{C←B} is an orthogonal matrix, we have

[x]_B = (P_{C←B})^{−1}[x]_C = (P_{C←B})^T [x]_C = [1/2 −√3/2; √3/2 1/2][4; −4] = [2√3 + 2; 2√3 − 2].
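The rotation computation can be checked in floating point; since the matrix is orthogonal, its transpose undoes it. A minimal sketch in pure Python for the 60° case:

```python
import math

theta = math.radians(60)
P = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]   # P_{C<-B} for a 60 degree rotation

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

xB = [3.0, 2.0]
xC = matvec(P, xB)
expected = [1.5 + math.sqrt(3), 1.0 - 1.5 * math.sqrt(3)]

# applying the transpose (= inverse, by orthogonality) recovers xB
PT = [[P[0][0], P[1][0]], [P[0][1], P[1][1]]]
xB_back = matvec(PT, xC)
```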
14. From Section 3.6, Example 3.58, if B is the usual basis and C is the basis whose basis vectors are the
rotations of the usual basis vectors through 135°, the matrix of the rotation is

P_{C←B} = [cos 135°  sin 135°; −sin 135°  cos 135°] = [−√2/2  √2/2; −√2/2  −√2/2].

Then

(a) [x]_C = P_{C←B}[x]_B = [−√2/2 √2/2; −√2/2 −√2/2][3; 2] = [−√2/2; −5√2/2].

(b) Since P_{C←B} is an orthogonal matrix, we have

[x]_B = (P_{C←B})^{−1}[x]_C = (P_{C←B})^T [x]_C = [−√2/2 −√2/2; √2/2 −√2/2][4; −4] = [0; 4√2].
15. By definition, the columns of P_{C←B} are the coordinates of the vectors of B in the basis C, so we have

[u_1]_C = [1; −1], so that u_1 = 1·[1; 2] − 1·[2; 3] = [−1; −1]
[u_2]_C = [−1; 2], so that u_2 = −1·[1; 2] + 2·[2; 3] = [3; 4].

Therefore

B = {u_1, u_2} = {[−1; −1], [3; 4]}.
16. Let C = {p_1(x), p_2(x), p_3(x)}. By definition, the columns of P_{C←B} are the coordinates of the vectors of B in the basis
C, so that

[x]_C = [1; 0; −1]  ⇒  x = p_1(x) − p_3(x),
[1 + x]_C = [0; 2; 1]  ⇒  1 + x = 2p_2(x) + p_3(x),
[1 − x + x^2]_C = [0; 1; 1]  ⇒  1 − x + x^2 = p_2(x) + p_3(x).

Subtract the third equation from the second, giving p_2(x) = 2x − x^2; substituting back into the third
equation gives p_3(x) = 1 − 3x + 2x^2; then from the first equation, p_1(x) = 1 − 2x + 2x^2. So

C = {p_1(x), p_2(x), p_3(x)} = {1 − 2x + 2x^2, 2x − x^2, 1 − 3x + 2x^2}.
17. Since a = 1 in this case and we are considering a quadratic, we have for the Taylor basis of P_2

B = {1, x − 1, (x − 1)^2} = {1, x − 1, x^2 − 2x + 1}.

Let C = {1, x, x^2} be the standard basis for P_2. Then

[p(x)]_C = [1; 2; −5].

To find the change-of-basis matrix P_{B←C}, we reduce [B | C] → [I | P_{B←C}]:

[1 −1  1 | 1 0 0]      [1 0 0 | 1 1 1]                  [1 1 1]
[0  1 −2 | 0 1 0]  →   [0 1 0 | 0 1 2]  ⇒  P_{B←C} =   [0 1 2]
[0  0  1 | 0 0 1]      [0 0 1 | 0 0 1]                  [0 0 1]

Therefore

[p(x)]_B = P_{B←C}[p(x)]_C = [1 1 1; 0 1 2; 0 0 1][1; 2; −5] = [−2; −8; −5],

so that the Taylor polynomial centered at 1 is

−2 − 8(x − 1) − 5(x − 1)^2.
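The re-centered form can be checked by comparing values: two polynomials of degree at most 2 that agree at three or more points are identical. A minimal sketch in pure Python:

```python
def p(x):
    """p(x) = 1 + 2x - 5x^2, the polynomial with [p]_C = (1, 2, -5)."""
    return 1 + 2 * x - 5 * x * x

def taylor(x):
    """The Taylor form found above, centered at 1."""
    return -2 - 8 * (x - 1) - 5 * (x - 1) ** 2

agree = all(p(x) == taylor(x) for x in range(-10, 11))
```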
18. Since a = −2 in this case and we are considering a quadratic, we have for the Taylor basis of P_2

B = {1, x + 2, (x + 2)^2} = {1, x + 2, x^2 + 4x + 4}.

Let C = {1, x, x^2} be the standard basis for P_2. Then

[p(x)]_C = [1; 2; −5].

To find the change-of-basis matrix P_{B←C}, we reduce [B | C] → [I | P_{B←C}]:

[1 2 4 | 1 0 0]      [1 0 0 | 1 −2  4]                  [1 −2  4]
[0 1 4 | 0 1 0]  →   [0 1 0 | 0  1 −4]  ⇒  P_{B←C} =   [0  1 −4]
[0 0 1 | 0 0 1]      [0 0 1 | 0  0  1]                  [0  0  1]

Therefore

[p(x)]_B = P_{B←C}[p(x)]_C = [1 −2 4; 0 1 −4; 0 0 1][1; 2; −5] = [−23; 22; −5],

so that the Taylor polynomial centered at −2 is

−23 + 22(x + 2) − 5(x + 2)^2.
19. Since a = −1 in this case and we are considering a cubic, we have for the Taylor basis of P_3

B = {1, x + 1, (x + 1)^2, (x + 1)^3} = {1, x + 1, x^2 + 2x + 1, x^3 + 3x^2 + 3x + 1}.

Let C = {1, x, x^2, x^3} be the standard basis for P_3. Then

[p(x)]_C = [0; 0; 0; 1].

To find the change-of-basis matrix P_{B←C}, we reduce [B | C] → [I | P_{B←C}]:

[1 1 1 1 | 1 0 0 0]      [1 0 0 0 | 1 −1  1 −1]
[0 1 2 3 | 0 1 0 0]  →   [0 1 0 0 | 0  1 −2  3]
[0 0 1 3 | 0 0 1 0]      [0 0 1 0 | 0  0  1 −3]
[0 0 0 1 | 0 0 0 1]      [0 0 0 1 | 0  0  0  1]

Therefore

[p(x)]_B = P_{B←C}[p(x)]_C = [1 −1 1 −1; 0 1 −2 3; 0 0 1 −3; 0 0 0 1][0; 0; 0; 1] = [−1; 3; −3; 1],

so that the Taylor polynomial centered at −1 is

−1 + 3(x + 1) − 3(x + 1)^2 + (x + 1)^3.
20. Since a = 1/2 in this case and we are considering a cubic, we have for the Taylor basis of P_3

B = {1, x − 1/2, (x − 1/2)^2, (x − 1/2)^3} = {1, x − 1/2, x^2 − x + 1/4, x^3 − (3/2)x^2 + (3/4)x − 1/8}.

Let C = {1, x, x^2, x^3} be the standard basis for P_3. Then

[p(x)]_C = [0; 0; 0; 1].

To find the change-of-basis matrix P_{B←C}, we reduce [B | C] → [I | P_{B←C}]:

[1 −1/2  1/4 −1/8 | 1 0 0 0]      [1 0 0 0 | 1 1/2 1/4 1/8]
[0  1   −1    3/4 | 0 1 0 0]  →   [0 1 0 0 | 0 1   1   3/4]
[0  0    1   −3/2 | 0 0 1 0]      [0 0 1 0 | 0 0   1   3/2]
[0  0    0    1   | 0 0 0 1]      [0 0 0 1 | 0 0   0   1  ]

Therefore

[p(x)]_B = P_{B←C}[p(x)]_C = [1/8; 3/4; 3/2; 1],

so that the Taylor polynomial centered at 1/2 is

1/8 + (3/4)(x − 1/2) + (3/2)(x − 1/2)^2 + (x − 1/2)^3.
21. Since [x]D = PD←C [x]C and [x]C = PC←B [x]B , we get
[x]D = PD←C [x]C = PD←C (PC←B [x]B ) = (PD←C PC←B )[x]B .
Since the change-of-basis matrix is unique (its columns are determined by the two bases), it follows
that
PD←C PC←B = PD←B .
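The identity P_{D←C} P_{C←B} = P_{D←B} can be illustrated concretely. A minimal sketch in pure Python; the three bases of R^2 are example choices, and each change-of-basis matrix is computed as, e.g., C^{-1}B, which is exactly what the reduction [C | B] → [I | P_{C←B}] produces:

```python
from fractions import Fraction as F

def inv2(M):
    """Exact inverse of a 2 x 2 matrix."""
    a, b = M[0]
    c, d = M[1]
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# columns of each matrix are the basis vectors (example bases)
Bm = [[1, 2], [0, 1]]
Cm = [[1, 1], [1, -1]]
Dm = [[2, 0], [1, 1]]

P_CB = mul(inv2(Cm), Bm)
P_DC = mul(inv2(Dm), Cm)
P_DB = mul(inv2(Dm), Bm)
```

The assertion P_{D←C} P_{C←B} = P_{D←B} then holds exactly, matching the uniqueness argument in Exercise 21.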
22. By Theorem 6.10 (c), to show that C is a basis it suffices to show that it is linearly independent, since
C has n elements and dim V = n. Since P is invertible, the columns of P are linearly independent. But
the given equation says that [ui ]B = pi , since the elements of that column are the coordinates of ui with
respect to the basis B. Since the columns of P are linearly independent, so is {[u1 ]B , [u2 ]B , . . . , [un ]B }.
But then by Theorem 6.7 in Section 6.2, it follows that {u1 , u2 , . . . , un } is linearly independent, so
that C is a basis for V .
To show that P = PB←C , recall that the columns of PB←C are the coordinate vectors of C with respect
to the basis B. But looking at the given equation, we see that this is exactly what the columns of P
are. Thus P = PB←C .
6.4 Linear Transformations
1. T is a linear transformation:

T([a b; c d] + [a′ b′; c′ d′]) = T([a + a′  b + b′; c + c′  d + d′])
  = [(a + a′) + (b + b′)  0; 0  (c + c′) + (d + d′)]
  = [a + b  0; 0  c + d] + [a′ + b′  0; 0  c′ + d′]
  = T([a b; c d]) + T([a′ b′; c′ d′])

and

T(α[a b; c d]) = T([αa αb; αc αd]) = [αa + αb  0; 0  αc + αd] = α[a + b  0; 0  c + d] = αT([a b; c d]).
2. T is not a linear transformation:

T([w x; y z] + [w′ x′; y′ z′]) = T([w + w′  x + x′; y + y′  z + z′]) = [1  (w + w′) − (z + z′); (x + x′) − (y + y′)  1]

while

T([w x; y z]) + T([w′ x′; y′ z′]) = [1  w − z; x − y  1] + [1  w′ − z′; x′ − y′  1] = [2  (w + w′) − (z + z′); (x + x′) − (y + y′)  2].

Since the two are not equal, T is not a linear transformation.
3. T is a linear transformation. If A, C ∈ Mnn and α is a scalar, then
T (A + C) = (A + C)B = AB + CB = T (A) + T (C)
T (αA) = (αA)B = α(AB) = αT (A).
4. T is a linear transformation. If A, C ∈ Mnn and α is a scalar, then
T (A + C) = (A + C)B − B(A + C) = AB + CB − BA − BC = (AB − BA) + (CB − BC)
= T (A) + T (C)
T (αA) = (αA)B − B(αA) = α(AB) − α(BA) = α(AB − BA) = αT (A).
5. T is a linear transformation, since by Exercise 44 in Section 3.2, tr(A + B) = tr(A) + tr(B) and
tr(αA) = α tr(A).
6. T is not a linear transformation. For example, in M22 , let A = I; then T (2A) = T (2I) = 2 · 2 = 4,
while 2T (A) = 2T (I) = 2(1 · 1) = 2.
7. T is not a linear transformation. For example, let A be any invertible matrix in Mnn ; then −A is
invertible as well, so that rank(A) = rank(−A) = n. Then
T (A) + T (−A) = rank(A) + rank(−A) = 2n while T (A + (−A)) = T (O) = rank(O) = 0.
8. Since T(0) = 1 + x + x^2 ≠ 0, T is not a linear transformation by Theorem 6.14(a).
9. T is a linear transformation. Let p(x) = a + bx + cx^2 and p′(x) = a′ + b′x + c′x^2 be elements of P_2,
and let α be a scalar. Then

(T(p + p′))(d) = (T((a + a′) + (b + b′)x + (c + c′)x^2))(d)
  = (a + a′) + (b + b′)(d + 1) + (c + c′)(d + 1)^2
  = (a + b(d + 1) + c(d + 1)^2) + (a′ + b′(d + 1) + c′(d + 1)^2)
  = (T(p))(d) + (T(p′))(d),

and

(T(αp))(d) = (T(αa + αbx + αcx^2))(d) = αa + αb(d + 1) + αc(d + 1)^2
  = α(a + b(d + 1) + c(d + 1)^2) = α(T(p))(d).
10. T is a linear transformation. Let f, g ∈ F and α be a scalar. Then
(T (f + g))(x) = (f + g)(x2 ) = f (x2 ) + g(x2 ) = (T (f ))(x) + (T (g))(x)
(T (αf ))(x) = (αf )(x2 ) = αf (x2 ) = α(T (f ))(x).
11. T is not a linear transformation. For example, let f be any function in F that takes on a value other
than 0 or 1. Then

(T(2f))(x) = ((2f)(x))^2 = (2f(x))^2 = 4f(x)^2 ≠ 2f(x)^2 = 2(T(f))(x),

where the inequality holds because of the conditions on f.
12. This is similar to Exercise 10; T is a linear transformation. Let f, g ∈ F and α be a scalar. Then
(T (f + g))(c) = (f + g)(c) = f (c) + g(c) = (T (f ))(c) + (T (g))(c)
(T (αf ))(c) = (αf )(c) = αf (c) = α(T (f ))(c)
13. For T, we have (with α a scalar)

T([a; b] + [c; d]) = T([a + c; b + d]) = (a + c) + ((a + c) + (b + d))x
  = (a + (a + b)x) + (c + (c + d)x) = T([a; b]) + T([c; d])
T(α[a; b]) = T([αa; αb]) = αa + (αa + αb)x = α(a + (a + b)x) = αT([a; b]).

For S, suppose that p(x), q(x) ∈ P_1 and α is a scalar; then

S(p(x) + q(x)) = x(p(x) + q(x)) = xp(x) + xq(x) = S(p(x)) + S(q(x))
S(αp(x)) = x(αp(x)) = α(xp(x)) = αS(p(x)).
14. Using the comment following the definition of a linear transformation, we have

T([5; 2]) = T(5[1; 0] + 2[0; 1]) = 5T([1; 0]) + 2T([0; 1]) = 5[1; 2; −1] + 2[3; 0; 4] = [5; 10; −5] + [6; 0; 8] = [11; 10; 3].

In a similar way,

T([a; b]) = T(a[1; 0] + b[0; 1]) = aT([1; 0]) + bT([0; 1]) = a[1; 2; −1] + b[3; 0; 4] = [a + 3b; 2a; −a + 4b].
15. We must first find the coordinates of [−7; 9] in the basis {[1; 1], [3; −1]}. So we want to find c, d such that

[−7; 9] = c[1; 1] + d[3; −1]  ⇒  c + 3d = −7, c − d = 9  ⇒  c = 5, d = −4.

Then using the comment following the definition of a linear transformation, we have

T([−7; 9]) = T(5[1; 1] − 4[3; −1]) = 5T([1; 1]) − 4T([3; −1]) = 5(1 − 2x) − 4(x + 2x^2) = −8x^2 − 14x + 5.

The second part is much the same, except that we need to find c, d such that

[a; b] = c[1; 1] + d[3; −1]  ⇒  c + 3d = a, c − d = b  ⇒  c = (a + 3b)/4, d = (a − b)/4.

Then using the comment following the definition of a linear transformation, we have

T([a; b]) = T(((a + 3b)/4)[1; 1] + ((a − b)/4)[3; −1])
  = ((a + 3b)/4) T([1; 1]) + ((a − b)/4) T([3; −1])
  = ((a + 3b)/4)(1 − 2x) + ((a − b)/4)(x + 2x^2)
  = ((a − b)/2)x^2 − ((a + 7b)/4)x + (a + 3b)/4.
16. Since T is linear,

T(6 + x − 4x^2) = 6T(1) + T(x) − 4T(x^2) = 6(3 − 2x) + (4x − x^2) − 4(2 + 2x^2) = 10 − 8x − 9x^2.

Similarly,

T(a + bx + cx^2) = aT(1) + bT(x) + cT(x^2) = a(3 − 2x) + b(4x − x^2) + c(2 + 2x^2) = (3a + 2c) + (4b − 2a)x + (2c − b)x^2.
17. We do the second part first, then substitute for a, b, and c to get the first part. We must find the
coordinates of a + bx + cx^2 in the basis {1 + x, x + x^2, 1 + x^2}. So we want to find r, s, t such that

a + bx + cx^2 = r(1 + x) + s(x + x^2) + t(1 + x^2) = (r + t) + (r + s)x + (s + t)x^2,

so r + t = a, r + s = b, s + t = c, giving

r = (a + b − c)/2,  s = (−a + b + c)/2,  t = (a − b + c)/2.

Then using the comment following the definition of a linear transformation, we have

T(a + bx + cx^2) = ((a + b − c)/2) T(1 + x) + ((−a + b + c)/2) T(x + x^2) + ((a − b + c)/2) T(1 + x^2)
  = ((a + b − c)/2)(1 + x^2) + ((−a + b + c)/2)(x − x^2) + ((a − b + c)/2)(1 + x + x^2)
  = a + cx + ((3a − b − c)/2) x^2.

For the first part, substitute a = 4, b = −1, and c = 3 to get

T(4 − x + 3x^2) = 4 + 3x + ((12 + 1 − 3)/2) x^2 = 4 + 3x + 5x^2.
18. We do the second part first, then substitute for a, b, c, and d to get the first part. We must find the
coordinates of [a b; c d] in the basis {[1 0; 0 0], [1 1; 0 0], [1 1; 1 0], [1 1; 1 1]}. So we want to find r, s, t, u such that

[a b; c d] = r[1 0; 0 0] + s[1 1; 0 0] + t[1 1; 1 0] + u[1 1; 1 1],

so r + s + t + u = a, s + t + u = b, t + u = c, u = d, giving

r = a − b,  s = b − c,  t = c − d,  u = d.

Then using the comment following the definition of a linear transformation, we have

T([a b; c d]) = (a − b)T([1 0; 0 0]) + (b − c)T([1 1; 0 0]) + (c − d)T([1 1; 1 0]) + dT([1 1; 1 1])
  = (a − b)·1 + (b − c)·2 + (c − d)·3 + d·4
  = a + b + c + d.

For the first part, substitute a = 1, b = 3, c = 4, d = 2 to get

T([1 3; 4 2]) = 1 + 3 + 4 + 2 = 10.
19. Let

T(E_11) = a,  T(E_12) = b,  T(E_21) = c,  T(E_22) = d.

Then

T([w x; y z]) = T(wE_11 + xE_12 + yE_21 + zE_22) = wT(E_11) + xT(E_12) + yT(E_21) + zT(E_22) = aw + bx + cy + dz.
20. Writing

[0; 6; −8] = a[2; 1; 0] + b[3; 0; 2]

means that we must have b = −4, and then a = 6, so that if T were a linear transformation, then

T([0; 6; −8]) = 6T([2; 1; 0]) − 4T([3; 0; 2]) = 6(1 + x) − 4(2 − x + x^2) = −2 + 10x − 4x^2 ≠ −2 + 2x^2.

This is a contradiction, so T cannot be linear.
21. Choose v ∈ V . Then −v = 0 − v by definition, so using part (a) of Theorem 6.14 and the definition
of a linear transformation,
T (−v) = T (0 − v) = T (0) − T (v) = 0 − T (v) = −T (v).
22. Choose v ∈ V. Then there are scalars c_1, ..., c_n such that v = Σ_{i=1}^n c_i v_i, since the v_i are a basis.
Since T is linear, using the comment following the definition of a linear transformation we get

T(v) = T(Σ_{i=1}^n c_i v_i) = Σ_{i=1}^n c_i T(v_i) = Σ_{i=1}^n c_i v_i = v.

Since T(v) = v for every vector v ∈ V, T is the identity transformation.
23. We must show that if p(x) ∈ P_n, then T(p(x)) = p′(x). Suppose that p(x) = Σ_{k=0}^n a_k x^k. Then since T
is linear, using the comment following the definition of a linear transformation we get

T(p(x)) = T(Σ_{k=0}^n a_k x^k) = Σ_{k=0}^n a_k T(x^k) = a_0 T(1) + Σ_{k=1}^n a_k T(x^k) = Σ_{k=1}^n k a_k x^{k−1} = p′(x).
24. (a) We prove the contrapositive: suppose that {v1 , . . . , vn } is linearly dependent in V ; then we have
c1 v1 + · · · + cn vn = 0,
where not all of the ci are zero. Then
T (c1 v1 + · · · + cn vn ) = c1 T (v1 ) + · · · + cn T (vn ) = T (0) = 0,
showing that {T (v1 ), . . . , T (vn )} is linearly dependent in W . Therefore if {T (v1 ), . . . , T (vn )} is
linearly independent in W , we must also have that {v1 , . . . , vn } is linearly independent in V .
(b) For example, the zero transformation from V to W, which takes every element of V to 0, shows
that the converse is false. There are many other examples as well, however. For instance, let

T : R^2 → R^2 : [x; y] ↦ [x; 0].

Then e_1 and e_2 are linearly independent in V, but

T(e_1) = [1; 0]  and  T(e_2) = [0; 0]

are linearly dependent.
25. We have

(S ∘ T)([2; 1]) = S(T([2; 1])) = S([5; −1]) = [4 −1; 0 6]

(S ∘ T)([x; y]) = S(T([x; y])) = S([2x + y; −y]) = [2x  −y; 0  2x + 2y].

Since the domain of T is R^2 but the codomain of S is M_22, T ∘ S does not make sense and we cannot
compute it.
26. We have
(S ◦ T )(3 + 2x − x2 ) = S(T (3 + 2x − x2 )) = S(2 + 2 · (−1)x) = S(2 − 2x)
= 2 + (2 − 2)x + 2 · (−2)x2 = 2 − 4x2
(S ◦ T )(a + bx + cx2 ) = S(T (a + bx + cx2 )) = S(b + 2cx) = b + (b + 2c)x + 2(2c)x2
= b + (b + 2c)x + 4cx2 .
Since the domain of S is P1 , which equals the codomain of T , we can compute T ◦ S:
(T ◦ S)(a + bx) = T (S(a + bx)) = T (a + (a + b)x + 2bx2 ) = (a + b) + 2 · 2bx = (a + b) + 4bx.
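These compositions can also be checked symbolically. A minimal sketch, assuming T is the differentiation map P2 → P1 and S(a + bx) = a + (a + b)x + 2bx², consistent with the computations above:

```python
import sympy as sp

x = sp.symbols('x')

def T(p):
    # T : P2 -> P1 is differentiation, as in the solution above
    return sp.diff(p, x)

def S(p):
    # S(a + bx) = a + (a + b)x + 2bx^2; a and b are read off from p
    a = p.subs(x, 0)
    b = sp.diff(p, x).subs(x, 0)
    return a + (a + b) * x + 2 * b * x**2

composed = sp.expand(S(T(3 + 2 * x - x**2)))
print(composed)  # equals 2 - 4*x**2, as computed above
```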
27. The Chain Rule says that (f ∘ g)′(x) = g′(x)f′(g(x)). Let g(x) = x + 1 and f(x) = p(x). Then
(S ∘ T)(p(x)) = S(T(p(x))) = S(p′(x)) = p′(x + 1)
(T ∘ S)(p(x)) = T(S(p(x))) = T(p(x + 1)) = T((p ∘ g)(x)) = (p ∘ g)′(x) = g′(x)p′(g(x)) = p′(x + 1)
since the derivative of g(x) is 1.
28. Use the Chain Rule from the previous exercise, and let g(x) = x + 1:
(S ∘ T)(p(x)) = S(T(p(x))) = S(xp′(x)) = (x + 1)p′(x + 1)
(T ∘ S)(p(x)) = T(S(p(x))) = T(p(x + 1)) = T((p ∘ g)(x)) = x(p ∘ g)′(x) = xg′(x)p′(g(x)) = xp′(x + 1)
since the derivative of g(x) is 1.
29. We must verify that S ∘ T = IR² and T ∘ S = IR²:
(S ∘ T)(x, y) = S(T(x, y)) = S(x − y, −3x + 4y) = (4(x − y) + (−3x + 4y), 3(x − y) + (−3x + 4y)) = (x, y)
(T ∘ S)(x, y) = T(S(x, y)) = T(4x + y, 3x + y) = ((4x + y) − (3x + y), −3(4x + y) + 4(3x + y)) = (x, y).
Thus (S ∘ T)(x, y) = (T ∘ S)(x, y) = (x, y) for any vector (x, y), so both compositions are the identity and
we are done.
30. We must verify that S ∘ T = IP1 and T ∘ S = IP1:
(S ∘ T)(a + bx) = S(T(a + bx)) = S(b/2 + (a + 2b)x) = (−4 · b/2 + (a + 2b)) + 2 · (b/2)x = a + bx
(T ∘ S)(a + bx) = T(S(a + bx)) = T((−4a + b) + 2ax) = 2a/2 + ((−4a + b) + 2 · 2a)x = a + bx.
Thus (S ∘ T)(a + bx) = (T ∘ S)(a + bx) = a + bx for any a + bx ∈ P1, so both compositions are the
identity and we are done.
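The two compositions in Exercise 29 can be verified symbolically. A minimal sketch, assuming the maps T(x, y) = (x − y, −3x + 4y) and S(x, y) = (4x + y, 3x + y) from that exercise:

```python
import sympy as sp

x, y = sp.symbols('x y')

# The maps of Exercise 29
T = lambda v: (v[0] - v[1], -3 * v[0] + 4 * v[1])
S = lambda v: (4 * v[0] + v[1], 3 * v[0] + v[1])

ST = tuple(sp.expand(c) for c in S(T((x, y))))
TS = tuple(sp.expand(c) for c in T(S((x, y))))
print(ST, TS)  # (x, y) (x, y): both compositions are the identity
```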
6.4. LINEAR TRANSFORMATIONS
497
31. Suppose T′ and T″ are such that T′ ∘ T = T″ ∘ T = IV and T ∘ T′ = T ∘ T″ = IW. To show that
T′ = T″, it suffices to show that T′(w) = T″(w) for any vector w ∈ W. But
T′(w) = (IV ∘ T′)(w) = ((T″ ∘ T) ∘ T′)(w) = T″((T ∘ T′)(w)) = (T″ ∘ IW)(w) = T″(w).
32. (a) First, if T(v) = ±v, then clearly {v, Tv} is linearly dependent. For the converse, if {v, T(v)} is
linearly dependent, then cv + dTv = 0 for some c, d not both zero. Then
T(cv + dTv) = cTv + d(T ∘ T)v = cTv + dv = T(0) = 0
since T ∘ T = IV. Thus
cv + dTv = 0
cTv + dv = 0.
Subtracting these two equations gives c(v − Tv) − d(v − Tv) = (c − d)(v − Tv) = 0. But this
means that either c = d ≠ 0 or that Tv = v. If c = d ≠ 0, then divide through by c in the
equation cv + dTv = 0 to get v + Tv = 0, so that Tv = −v. Hence Tv = ±v.
(b) Let T(x, y) = (−x, −y). Then
(T ∘ T)(x, y) = T(T(x, y)) = T(−x, −y) = (−(−x), −(−y)) = (x, y),
so that T ∘ T = IV, and (x, y) and T(x, y) = (−x, −y) are linearly dependent.
33. (a) First, if T (v) = v, then clearly {v, T v} = {v, v} is linearly dependent, and if T (v) = 0, then
{v, T v} = {v, 0} is linearly dependent. For the converse, suppose that {v, T v} is linearly dependent. Then there exist c, d, not both zero, such that
cv + dT v = 0.
As in Exercise 32, we get
T(cv + dTv) = cTv + d(T ∘ T)v = cTv + dTv = (c + d)Tv = T(0) = 0
since T ∘ T = T. Thus
cv + dTv = 0
(c + d)Tv = 0.
Subtracting the two equations gives c(v − T v) = 0, so that either T v = v or c = 0. But if c = 0,
then cv + dT v = dT v = 0, so that since c and d were not both zero, d cannot be zero, so that
T v = 0. Thus either T v = v or T v = 0.
(b) For example, let T(x, y) = (x, 0). This is a projection operator, so we already know that T ∘ T = T;
to show this via computation,
(T ∘ T)(x, y) = T(T(x, y)) = T(x, 0) = (x, 0) = T(x, y).
Then for example T(1, 0) = (1, 0) while T(0, 1) = 0.
34. To show that S+T is a linear transformation, we must show it obeys the rules of a linear transformation.
Using the definition of S + T and of cT , as well as the fact that both S and T are linear, we have
(S + T )(u + v) = S(u + v) + T (u + v) = Su + Sv + T u + T v = (Su + T u) + (Sv + T v)
= (S + T )u + (S + T )v
(S + T)(cv) = S(cv) + T(cv) = c(Sv) + c(Tv) = c(Sv + Tv) = c((S + T)v).
Similar computations show that cT is linear: (cT)(u + v) = c(T(u + v)) = c(Tu + Tv) = (cT)u + (cT)v, and (cT)(dv) = cT(dv) = cd(Tv) = d((cT)v).
35. We show that L = L (V, W ) satisfies the ten axioms of a vector space. We use freely below the fact
that V and W are both vector spaces. Also, note that S = T in L (V, W ) simply means that Sv = T v
for all v ∈ V .
1. From Exercise 34, S, T ∈ L implies that S + T ∈ L .
2. (S + T )v = Sv + T v = T v + Sv = (T + S)v.
3. ((R + S) + T )v = (R + S)v + T v = Rv + Sv + T v = Rv + (S + T )v = (R + (S + T ))v.
4. Let Z : V → W be the map Z(v) = 0 for all v ∈ V. Then Z is the zero map, so it is linear and
thus in L , and
(S + Z)v = Sv + Zv = Sv + 0 = Sv.
5. If S ∈ L , let −S : V → W be the map (−S)v = −Sv. By Exercise 34, −S ∈ L , and
(S + (−S))v = Sv + (−S)v = Sv − Sv = 0.
6. By Exercise 34, if S ∈ L , then cS ∈ L .
7. (c(S + T ))v = c(S + T )v = c(Sv + T v) = cSv + cT v = (cS)v + (cT )v = ((cS) + (cT ))v.
8. ((c + d)S)v = (c + d)Sv = cSv + dSv = (cS)v + (dS)v = ((cS) + (dS))v.
9. (c(dS))v = c(dS)v = c(dSv) = (cd)Sv = ((cd)S)v.
10. By Exercise 34, (1S)v = 1Sv = Sv.
36. To show that these linear transformations are equal, it suffices to show that they have the same value
on every element of V .
(a) (R ◦ (S + T ))v = R((S + T )v) = R(Sv + T v) = R(Sv) + R(T v) = (R ◦ S)v + (R ◦ T )v.
(b)
(c(R ◦ S))v = c(R ◦ S)v = cR(Sv) = (cR)(Sv) = ((cR) ◦ S)v
(c(R ◦ S))v = c(R ◦ S)v = cR(Sv) = (cR)(Sv) = R((cS)v) = (R ◦ (cS))v.
6.5
The Kernel and Range of a Linear Transformation
1. (a) Only (ii) is in ker(T), since
(i) T[1, 2; −1, 3] = [1, 0; 0, 3] ≠ O,
(ii) T[0, 4; 2, 0] = [0, 0; 0, 0] = O,
(iii) T[3, 0; 0, −3] = [3, 0; 0, −3] ≠ O.
(b) From the definition of T, a matrix is in range(T) if and only if its two off-diagonal elements are
zero. Thus only (iii) is in range(T).
2. (a) T is explicitly defined by
T[a, b; c, d] = a + d.
Thus
(i) T[1, 2; −1, 3] = 1 + 3 = 4,
(ii) T[0, 4; 2, 0] = 0 + 0 = 0,
(iii) T[1, 3; 0, −1] = 1 + (−1) = 0,
and therefore (ii) and (iii) are in ker(T) but (i) is not.
(b) All three of these are in range(T). For example,
(i) T[0, 0; 0, 0] = 0 + 0 = 0,
(ii) T[2, 1; 4, 0] = 2 + 0 = 2,
(iii) T[√2/2, π; 12, 0] = √2/2 + 0 = √2/2.
(c) ker(T) is the set of all matrices in M22 whose diagonal elements are negatives of each other.
range(T) is all real numbers.
3. (a) We have
(i) T(1 + x) = T(1 + 1x + 0x²) = (1 − 1, 1 + 0) = (0, 1) ≠ 0,
(ii) T(x − x²) = T(0 + 1x − 1x²) = (0 − 1, 1 + (−1)) = (−1, 0) ≠ 0,
(iii) T(1 + x − x²) = T(1 + 1x − 1x²) = (1 − 1, 1 + (−1)) = (0, 0) = 0,
so only (iii) is in ker(T).
(b) All of them are. For example,
(i) T(1 + x − x²) = (0, 0), from part (a),
(ii) T(1) = (1 − 0, 0 + 0) = (1, 0),
(iii) T(x²) = (0 − 0, 0 + 1) = (0, 1).
(c) ker(T) consists of all polynomials a + bx + cx² such that a = b and b = −c; parametrizing via
a = t gives ker(T) = {t + tx − tx²}, so that ker(T) consists of all multiples of 1 + x − x². range(T)
is all of R², since it is a subspace of R² containing both e1 and e2 (see items (ii) and (iii) of part
(b)).
4. (a) Since xp′(x) = 0 if and only if p′(x) = 0, we see that only (i) is in ker(T). For the other two, we
have T(x) = x · x′ = x and T(x²) = x(x²)′ = 2x².
(b) A polynomial in range(T) must be divisible by x; that is, it must have no constant term. Thus
(i) is not in range(T). However, x = T(x) from part (a), and also from part (a), we see that
T(½x²) = x². Thus (ii) and (iii) are in range(T).
(c) ker(T) consists of all polynomials whose derivative vanishes, so it consists of the constant polynomials.
range(T) consists of all polynomials with zero constant term, since given any such polynomial,
we may write it as xg(x) for some polynomial g(x); then clearly T(∫g(x) dx) = xg(x).
5. Since ker(T) = {[0, b; c, 0]}, a basis for ker(T) is
{[0, 1; 0, 0], [0, 0; 1, 0]}.
Similarly, since range(T) = {[a, 0; 0, d]}, a basis for range(T) is
{[1, 0; 0, 0], [0, 0; 0, 1]}.
Thus dim ker(T) = 2 and dim range(T) = 2. The Rank Theorem says that
4 = dim M22 = nullity(T) + rank(T) = dim ker(T) + dim range(T) = 2 + 2,
which is true.
6. Since
ker(T) = {[a, b; c, −a]},
a basis for ker(T) is
{[1, 0; 0, −1], [0, 1; 0, 0], [0, 0; 1, 0]}.
From Exercise 2, range(T) = R. Thus dim ker(T) = 3 and dim range(T) = 1. The Rank Theorem says
that
4 = dim M22 = nullity(T) + rank(T) = dim ker(T) + dim range(T) = 3 + 1,
which is true.
7. Since ker(T) = {a + ax − ax²}, a basis for ker(T) is {1 + x − x²}. From Exercise 3, range(T) = R².
Thus dim ker(T) = 1 and dim range(T) = 2. The Rank Theorem says that
3 = dim P2 = nullity(T) + rank(T) = dim ker(T) + dim range(T) = 1 + 2,
which is true.
8. Since ker(T) is the constant polynomials, it has a basis {1}. From Exercise 4, range(T) is the set of
polynomials with zero constant term, or {bx + cx²}, so it has a basis {x, x²}. Thus dim ker(T) = 1
and dim range(T) = 2. The Rank Theorem says that
3 = dim P2 = nullity(T) + rank(T) = dim ker(T) + dim range(T) = 1 + 2,
which is true.
9. Since
T[1, 0; 0, 0] = (1 − 0, 0 − 0) = (1, 0) and T[0, 0; 1, 0] = (0 − 0, 1 − 0) = (0, 1),
it follows that range(T) = R², since it is a subspace of R² containing both e1 and e2. Thus rank(T) =
dim range(T) = 2, so that by the Rank Theorem,
nullity(T) = dim M22 − rank(T) = 4 − 2 = 2.
10. Since
T(1 − x) = (1 − 0, 1 − 1) = (1, 0) and T(x) = (0, 1),
it follows that range(T) = R², since it is a subspace of R² containing both e1 and e2. Thus rank(T) =
dim range(T) = 2, so that by the Rank Theorem,
nullity(T) = dim P2 − rank(T) = 3 − 2 = 1.
11. If A = [a, b; c, d], then
T(A) = AB = [a, b; c, d][1, −1; −1, 1] = [a − b, −a + b; c − d, −c + d] = [a − b, −(a − b); c − d, −(c − d)].
Thus T(A) = O if and only if a = b and c = d, so that
ker(T) = {[a, a; c, c]},
so that a basis for ker(T) is
{[1, 1; 0, 0], [0, 0; 1, 1]}.
Hence nullity(T) = dim ker(T) = 2, so that by the Rank Theorem
rank(T) = dim M22 − nullity(T) = 4 − 2 = 2.
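One way to double-check this rank and nullity is to write T in coordinates: the columns of its 4 × 4 matrix are the entries of T(Eij) for the standard basis of M22. A minimal numpy sketch:

```python
import numpy as np

B = np.array([[1, -1], [-1, 1]])
T = lambda A: A @ B  # the map T(A) = AB of this exercise

# Columns of M are vec(T(Eij)) over the standard basis {E11, E12, E21, E22}
cols = []
for i in range(2):
    for j in range(2):
        Eij = np.zeros((2, 2))
        Eij[i, j] = 1
        cols.append(T(Eij).flatten())
M = np.column_stack(cols)

rank = np.linalg.matrix_rank(M)
print(rank, 4 - rank)  # 2 2: rank(T) = 2 and nullity(T) = 2
```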
12. If A = [a, b; c, d], then
T(A) = AB − BA = [a, b; c, d][1, 1; 0, 1] − [1, 1; 0, 1][a, b; c, d] = [a, a + b; c, c + d] − [a + c, b + d; c, d] = [−c, a − d; 0, c].
So A ∈ ker(T) if and only if c = 0 and a = d, so that
A = [a, b; 0, a] = a[1, 0; 0, 1] + b[0, 1; 0, 0].
These two matrices are linearly independent since they are not multiples of one another, so that
nullity(T) = 2. Since dim M22 = 4, we see that by the Rank Theorem
rank(T) = dim M22 − nullity(T) = 4 − 2 = 2.
13. The derivative of p(x) = a + bx + cx² is p′(x) = b + 2cx; then p′(0) = b, so this is zero if and only if b = 0. Thus
ker(T) = {a + cx²}, so that a basis for ker(T) is {1, x²}. Then nullity(T) = dim ker(T) = 2, so that by
the Rank Theorem,
rank(T) = dim P2 − nullity(T) = 3 − 2 = 1.
14. We have
T(A) = A − Aᵀ = [0, a12 − a21, a13 − a31; a21 − a12, 0, a23 − a32; a31 − a13, a32 − a23, 0],
so that A ∈ ker(T) if and only if a12 = a21, a13 = a31, and a23 = a32; that is, if and only if A is
symmetric. By Exercise 40 in Section 6.2, nullity(T) = dim ker(T) = 3·4/2 = 6. Then by the Rank
Theorem,
rank(T) = dim M33 − nullity(T) = 9 − 6 = 3.
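The same coordinate trick as in Exercise 11 confirms these numbers: build the 9 × 9 matrix of T(A) = A − Aᵀ over the standard basis of M33 and read off its rank. A minimal numpy sketch:

```python
import numpy as np

T = lambda A: A - A.T  # the map of this exercise on M33

# 9x9 matrix of T with respect to the standard basis of M33
cols = []
for i in range(3):
    for j in range(3):
        Eij = np.zeros((3, 3))
        Eij[i, j] = 1
        cols.append(T(Eij).flatten())
M = np.column_stack(cols)

rank = np.linalg.matrix_rank(M)
print(rank, 9 - rank)  # 3 6: rank(T) = 3 and nullity(T) = 6
```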
15. (a) Yes, T is one-to-one. If
T(a, b) = (2a − b, a + 2b) = (0, 0),
then 2a − b = 0 and a + 2b = 0. The only solution to this pair of equations is a = b = 0, so that
the only element of the kernel is (0, 0) and thus T is one-to-one.
(b) Since T is one-to-one and the dimensions of the domain and codomain are both 2, T must be onto
by Theorem 6.21.
16. (a) Yes, T is one-to-one. If
T(a, b) = (a − 2b) + (3a + b)x + (a + b)x² = 0,
then a + b = 0 from the third term. Since 3a + b = 0 as well, subtracting gives 2a = 0, so that
a = 0. The third term then forces b = 0. Thus (a, b) = 0 and T is one-to-one.
(b) T cannot be onto. R² has dimension 2, so by the Rank Theorem 2 = dim R² = rank(T) +
nullity(T) ≥ rank(T). It follows that the dimension of the range of T is at most 2 (in fact, since
from part (a) nullity(T) = 0, the dimension of the range is equal to 2). But dim P2 = 3, so T is
not onto.
17. (a) If T(a + bx + cx²) = 0, then
2a − b = 0
a + b − 3c = 0
−a + c = 0.
Row-reducing the augmented matrix gives
[2, −1, 0 | 0; 1, 1, −3 | 0; −1, 0, 1 | 0] → [1, 0, −1 | 0; 0, 1, −2 | 0; 0, 0, 0 | 0].
This has nontrivial solutions a = t, b = 2t, c = t, so that
ker(T) = {t + 2tx + tx²} = {t(1 + 2x + x²)}.
So T is not one-to-one.
(b) Since dim P2 = dim R³ = 3, the fact that T is not one-to-one implies, by Theorem 6.21, that T
is not onto.
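The row reduction in part (a) is easy to confirm symbolically; a minimal sympy sketch:

```python
import sympy as sp

# Coefficient matrix of the homogeneous system in part (a)
A = sp.Matrix([[2, -1, 0],
               [1, 1, -3],
               [-1, 0, 1]])

rref, pivots = A.rref()
print(rref)           # rows (1, 0, -1), (0, 1, -2), (0, 0, 0)
print(A.nullspace())  # kernel spanned by (1, 2, 1), i.e. multiples of 1 + 2x + x^2
```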
18. (a) By Exercise 10, nullity(T) = 1, so that ker(T) ≠ {0}, so that T is not one-to-one.
(b) By Exercise 10, range(T) = R², so that T is onto by definition.
 
a
19. (a) If T  b  = O, then a − b = a + b = 0 and b − c = b + c = 0; these equations give a = b = c = 0 as
c
the only solution. Thus T is one-to-one.
(b) By the Rank Theorem,
dim range(T ) = dim R3 − dim ker(T ) = 3 − 0 = 3.
However, dim M22 = 4 > 3 = dim range(T ), so that T cannot be onto.
 
a
20. (a) If T  b  = O, then a + b + c = 0, b − 2c = 0, and a − c = 0. The latter two equations give a = c
c
and b = 2c, so that from the first equation 4c = 0 and thus c = 0. It follows that a = b = 0. Thus
T is one-to-one.
3
(b) By Exercise 40 in Section 6.2, dim W = 2·3
2 = 3 = dim R . Since the dimensions of the domain
and codomain are equal and T is one-to-one, Theorem 6.21 implies that T is also onto.
21. These vector spaces are isomorphic. Bases for the two vector spaces are
D3: {E11, E22, E33}, R³: {e1, e2, e3}.
Since both have dimension 3, we know that D3 ≅ R³. Define
T(aE11 + bE22 + cE33) = ae1 + be2 + ce3.
Then T is a linear transformation from the way we have defined it. Since the ei are linearly independent,
it follows that ae1 + be2 + ce3 = 0 if and only if a = b = c = 0. Thus ker(T) = {0} so that T is
one-to-one. Since dim D3 = dim R³, we know that T is onto as well, so it is an isomorphism.
22. By Exercise 40 in Section 6.2, dim S3 = 3·4/2 = 6. Since an upper triangular matrix has three entries
that are forced to be zero (those below the diagonal), and the other six may take any value, we see
that dim U3 = 6 as well. Thus S3 ≅ U3. Define
T : S3 → U3 : [a11, a12, a13; a21, a22, a23; a31, a32, a33] ↦ [a11, a12, a13; 0, a22, a23; 0, 0, a33].
Then any matrix in ker(T) must have a11 = a12 = a13 = a22 = a23 = a33 = 0, so that (being symmetric)
it must itself be the zero matrix. Thus ker(T) = {0} so that T is one-to-one. Since T is one-to-one and
dim S3 = dim U3 = 6, it follows that T is onto and is therefore an isomorphism.
23. By Exercises 40 and 41 in Section 6.2, dim S3 = 3·4/2 = 6 while dim S3′ = 2·3/2 = 3. Since their dimensions
are unequal, S3 ≇ S3′.
24. Since a basis for P2 is {1, x, x²}, we have dim P2 = 3. If p(x) ∈ W, say p(x) = a + bx + cx² + dx³,
then p(0) = 0 implies that a = 0, so that p(x) = bx + cx² + dx³; a basis for this subspace is {x, x², x³}
so that dim W = 3 as well. Therefore P2 ≅ W. Define
T : P2 → W : a + bx + cx² ↦ ax + bx² + cx³.
Then T(a + bx + cx²) = ax + bx² + cx³ = 0 implies that a = b = c = 0. Thus T is one-to-one; since
the dimensions of domain and codomain are equal, T is also onto, so it is an isomorphism.
25. A basis for C as a vector space over R is {1, i}, so that dim C = 2. A basis for R² is {e1, e2}, so that
dim R² = 2. Thus C ≅ R². Define
T : C → R² : a + bi ↦ (a, b).
Then T(a + bi) = 0 means that a = b = 0. Thus T is one-to-one; since the dimensions of domain and
codomain are equal, T is also onto, so it is an isomorphism.
26. V consists of all 2 × 2 matrices whose upper-left and lower-right entries are negatives of each other, so
that
V = {[a, b; c, −a]}.
Thus a basis for V is
{[1, 0; 0, −1], [0, 1; 0, 0], [0, 0; 1, 0]},
and therefore dim V = 3. Since dim W = dim R² = 2, these vector spaces are not isomorphic.
27. Since the dimensions of the domain and codomain are the same, it suffices to show that T is one-to-one.
Let p(x) = a0 + a1x + · · · + anxⁿ. Then
p(x) + p′(x) = (a0 + a1x + · · · + an−1xⁿ⁻¹ + anxⁿ) + (a1 + 2a2x + · · · + nanxⁿ⁻¹)
= (a0 + a1) + (a1 + 2a2)x + · · · + (an−1 + nan)xⁿ⁻¹ + anxⁿ.
If p(x) + p′(x) = 0, then the coefficient of xⁿ is zero, so that an = 0. But then the coefficient of xⁿ⁻¹
is 0 = an−1 + nan = an−1, so that an−1 = 0. We continue this process, determining that ai = 0 for all
i. Thus p(x) = 0, so that T is one-to-one and is thus an isomorphism.
28. Since the dimensions of the domain and codomain are the same, it suffices to show that T is one-to-one.
Now, T(xᵏ) = (x − 2)ᵏ. Suppose that
T(a0 + a1x + · · · + anxⁿ) = a0 + a1(x − 2) + · · · + an(x − 2)ⁿ = 0.
If we show that {1, x − 2, (x − 2)², . . . , (x − 2)ⁿ} is a basis for Pn, then it follows that all the ai are
zero, so that ker(T) = {0} and T is one-to-one. To prove they form a basis, it suffices to show they
are linearly independent, since there are n + 1 elements in the set and dim Pn = n + 1. We do this
by induction. For n = 0, this is clear. For n = 1, suppose that a + b(x − 2) = (a − 2b) + bx = 0.
Then b = 0 and thus a = 0, so that linear independence for n = 1 is established. Now suppose that
{1, x − 2, . . . , (x − 2)ᵏ} is a linearly independent set, and suppose that
a0 + a1(x − 2) + · · · + ak−1(x − 2)ᵏ⁻¹ + ak(x − 2)ᵏ = 0.
If this polynomial were expanded, the only term involving xᵏ comes from the last term of the sum,
and it is akxᵏ. Therefore ak = 0, and then the other ai = 0 by the induction hypothesis. Thus this set forms a basis
for Pn, so that T is one-to-one and thus is an isomorphism.
29. Since the dimensions of the domain and codomain are the same, it suffices to show that T is one-to-one.
That is, we want to show that xⁿp(1/x) = 0 implies that p(x) = 0. Suppose
p(x) = a0 + a1x + · · · + anxⁿ.
Then
xⁿp(1/x) = xⁿ(a0 + a1(1/x) + · · · + an(1/xⁿ)) = a0xⁿ + a1xⁿ⁻¹ + · · · + an.
Therefore if xⁿp(1/x) = 0, all of the ai are zero so that p(x) = 0. Thus T is one-to-one and is an
isomorphism.
30. (a) As the hint suggests, define a map T : C [0, 1] → C [2, 3] by (T (f ))(x) = f (x − 2) for f ∈ C [0, 1].
Thus for example (T (f ))(2.5) = f (2.5 − 2) = f (0.5). T is linear, since
(T (f + g))(x) = (f + g)(x − 2) = f (x − 2) + g(x − 2) = (T (f ))(x) + (T (g))(x)
(T (cf ))(x) = (cf )(x − 2) = cf (x − 2) = c(T (f ))(x).
We must show that T is one-to-one and onto. First suppose that T (f ) = 0; that is, suppose that
f ∈ C [0, 1] and that f (x − 2) = 0 for all x ∈ [2, 3]. Let y = x − 2; then x ∈ [2, 3] means y ∈ [0, 1],
and we get f (y) = 0 for all y ∈ [0, 1]. But this means that f = 0 in C [0, 1]. Thus T is one-to-one.
To see that it is onto, choose f ∈ C [2, 3], and let g(x) = f (x + 2). Then
(T (g))(x) = g(x − 2) = f (x + 2 − 2) = f (x),
so that T (g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.
(b) This is similar to part (a). Define a map T : C [0, 1] → C [a, a + 1] by (T (f ))(x) = f (x − a) for
f ∈ C [0, 1]. T is proved linear in a similar fashion to part (a). We must show that T is one-to-one
and onto. First suppose that T (f ) = 0; that is, suppose that f ∈ C [0, 1] and that f (x − a) = 0
for all x ∈ [a, a + 1]. Let y = x − a; then x ∈ [a, a + 1] means y ∈ [0, 1], and we get f (y) = 0 for
all y ∈ [0, 1]. But this means that f = 0 in C [0, 1]. Thus T is one-to-one. To see that it is onto,
choose f ∈ C [a, a + 1], and let g(x) = f (x + a). Then
(T (g))(x) = g(x − a) = f (x + a − a) = f (x),
so that T (g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.
31. This is similar to Exercise 30. Define a map
T : C[0, 1] → C[0, 2] by (T(f))(x) = f(½x) for f ∈ C[0, 1].
Thus for example (T(f))(1) = f(½ · 1) = f(½). T is proved linear in a similar fashion to part (a)
of Exercise 30. We must show that T is one-to-one and onto. First suppose that T(f) = 0; that is,
suppose that f ∈ C[0, 1] and that f(½x) = 0 for all x ∈ [0, 2]. Let y = ½x; then x ∈ [0, 2] means
y ∈ [0, 1], and we get f(y) = 0 for all y ∈ [0, 1]. But this means that f = 0 in C[0, 1]. Thus T is
one-to-one. To see that it is onto, choose f ∈ C[0, 2], and let g(x) = f(2x). Then
(T(g))(x) = g(½x) = f(2 · ½x) = f(x),
so that T(g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.
32. This is again similar to Exercises 30 and 31, and is the most general form. Define a map T : C[a, b] →
C[c, d] by (T(f))(x) = f((b − a)/(d − c) · (x − c) + a) for f ∈ C[a, b]. Then (T(f))(c) = f(a) and (T(f))(d) =
f(b − a + a) = f(b). T is proved linear in a similar fashion to part (a) of Exercise 30. We must show that
T is one-to-one and onto. First suppose that T(f) = 0; that is, suppose that f ∈ C[a, b] and that
f((b − a)/(d − c)(x − c) + a) = 0 for all x ∈ [c, d]. Let y = (b − a)/(d − c)(x − c) + a; then x ∈ [c, d] means y ∈ [a, b], and
we get f(y) = 0 for all y ∈ [a, b]. But this means that f = 0 in C[a, b]. Thus T is one-to-one. To see
that it is onto, choose f ∈ C[c, d], and let g(x) = f((d − c)/(b − a)(x − a) + c). Then
(T(g))(x) = g((b − a)/(d − c)(x − c) + a) = f((d − c)/(b − a) · ((b − a)/(d − c)(x − c) + a − a) + c) = f((x − c) + c) = f(x),
so that T(g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.
33. (a) Suppose that (S ◦ T )(u) = 0. Then (S ◦ T )(u) = S(T u) = 0. Since S is one-to-one, we must have
T u = 0. But then since T is one-to-one, it follows that u = 0. Thus S ◦ T is one-to-one.
(b) Choose w ∈ W . Since S is onto, we can find some v ∈ V such that Sv = w. But T is onto as
well, so there is some u ∈ U such that T u = v. Then
(S ◦ T )(u) = S(T u) = Sv = w,
so that S ◦ T is onto.
34. (a) Suppose that T u = 0. Then since S0 = 0, we have
(S ◦ T )u = S(T u) = S0 = 0.
But S ◦ T is one-to-one, so we must have u = 0, so that T is one-to-one as well. Intuitively, if T
takes two elements of U to the same element of V , then certainly S ◦ T takes those two elements
of U to the same element of W .
(b) Choose w ∈ W . Since S ◦ T is onto, there is some u ∈ U such that (S ◦ T )u = w. Let v = T u.
Then
Sv = S(T u) = (S ◦ T )u = w,
so that S is onto. Intuitively, if S ◦ T “hits” every element of W , then surely S must.
35. (a) Suppose that T : V → W and that dim V < dim W. Then range(T) is a subspace of W, and the
Rank Theorem asserts that
dim range(T) = dim V − nullity(T) ≤ dim V < dim W.
Theorem 6.11(b) in Section 6.2 states that if U is a subspace of W, then dim U = dim W if and
only if U = W. But here dim range(T) ≠ dim W, and range(T) is a subspace of W, so that
range(T) ≠ W and T cannot be onto.
(b) We want to show that ker(T) ≠ {0}. Since range(T) is a subspace of W, then dim W ≥
dim range(T) = rank(T). Then the Rank Theorem asserts that
dim W + nullity(T) ≥ rank(T) + nullity(T) = dim V,
so that dim ker(T) = nullity(T) ≥ dim V − dim W > 0. Thus ker(T) ≠ {0}, so that T is not
one-to-one.
36. Since dim Pn = dim Rⁿ⁺¹ = n + 1, we need only show that T is a one-to-one linear transformation. It
is linear since
T(p + q) = ((p + q)(a0), (p + q)(a1), . . . , (p + q)(an)) = (p(a0) + q(a0), . . . , p(an) + q(an))
= (p(a0), . . . , p(an)) + (q(a0), . . . , q(an)) = T(p) + T(q)
T(cp) = ((cp)(a0), . . . , (cp)(an)) = (cp(a0), . . . , cp(an)) = c(p(a0), . . . , p(an)) = cT(p).
Now suppose T(p) = 0. That means that
p(a0) = p(a1) = · · · = p(an) = 0,
so that p has n + 1 zeroes since the ai are distinct. But Exercise 62 in Section 6.2 implies that p(x)
must be the zero polynomial, so that ker(T) = {0} and thus T is one-to-one.
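In coordinates, this evaluation map is a Vandermonde matrix, which has full rank exactly when the evaluation points are distinct. A minimal numpy sketch with hypothetical sample points:

```python
import numpy as np

# Hypothetical distinct evaluation points a_0, ..., a_3 (so n = 3)
pts = np.array([0.0, 1.0, 2.0, 3.0])
V = np.vander(pts, increasing=True)  # V[i, k] = pts[i] ** k, the matrix of T
rank = np.linalg.matrix_rank(V)
print(rank)  # 4: full rank, so ker(T) = {0} and T is an isomorphism
```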
37. By the Rank Theorem, since T : V → V and T² : V → V, we have
rank(T) + nullity(T) = dim V = rank(T²) + nullity(T²).
Since rank(T) = rank(T²), it follows that nullity(T) = nullity(T²), and thus that dim ker(T) =
dim ker(T²). So if we show ker(T) ⊆ ker(T²), we can conclude that ker(T) = ker(T²). Choose
v ∈ ker(T); then T²v = T(Tv) = T(0) = 0. Hence ker(T) = ker(T²).
Now choose v ∈ range(T) ∩ ker(T); we want to show v = 0. Since v ∈ range(T), there is w ∈ V such
that v = Tw. Then since v ∈ ker(T),
0 = Tv = T(Tw) = T²w,
so that w ∈ ker(T²) = ker(T). So w is also in ker(T). As a result, v = Tw = 0, as we were to show.
38. (a) We have
T ((u1 , w1 ) + (u2 , w2 )) = T (u1 + u2 , w1 + w2 ) = u1 + u2 − w1 − w2
= (u1 − w1 ) + (u2 − w2 ) = T (u1 , w1 ) + T (u2 , w2 )
T (c(u, w)) = T (cu, cw) = cu − cw = c(u − w) = cT (u, w).
(b) By definition, U + W = {u + w : u ∈ U, w ∈ W }. So if u + w ∈ U + W , we have
T (u, −w) = u + w,
so that U + W ⊂ range(T ). But since T (u, w) = u − w ∈ U + W , we also have range(T ) ⊂ U + W .
Thus range(T ) = U + W .
(c) Suppose v = (u, w) ∈ ker(T). Then u − w = 0, so that u = w and thus u ∈ U ∩ W, and
v = (u, u). So
ker(T) = {(u, u) : u ∈ U ∩ W} ⊂ U × W
is a subspace of U × W . Then
S : U ∩ W → ker(T ) : u 7→ (u, u)
is clearly a linear transformation that is one-to-one and onto. Thus S is an isomorphism between
U ∩ W and ker T .
(d) From Exercise 43(b) in Section 6.2, dim(U × W ) = dim U + dim W . Then the Rank Theorem
gives
rank(T ) + nullity(T ) = dim(U × W ) = dim U + dim W.
But from parts (a) and (b), we have
rank(T ) = dim range(T ) = dim(U + W ),
nullity(T ) = dim ker(T ) = dim(U ∩ W ).
Thus
rank(T ) + nullity(T ) = dim(U + W ) + dim(U ∩ W ) = dim U + dim W.
Rearranging this equation gives dim(U + W ) = dim U + dim W − dim(U ∩ W ).
6.6
The Matrix of a Linear Transformation
1. Computing T(v) directly gives
T(v) = T(4 + 2x) = 2 − 4x.
To compute the matrix [T]C←B, we compute the value of T on each basis element of B:
[T(1)]C = [0 − x]C = (0, −1), [T(x)]C = [1 − 0x]C = (1, 0) ⇒ [T]C←B = [[T(1)]C [T(x)]C] = [0, 1; −1, 0].
Then
[T]C←B[4 + 2x]B = [0, 1; −1, 0](4, 2) = (2, −4) = [2 − 4x]C = [T(4 + 2x)]C.
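The matrix-times-coordinates step is a one-line numpy check; a minimal sketch using the matrix [T]C←B computed above:

```python
import numpy as np

# Columns are [T(1)]_C = (0, -1) and [T(x)]_C = (1, 0)
M = np.array([[0, 1],
              [-1, 0]])

v = np.array([4, 2])  # [4 + 2x]_B
print(M @ v)          # [ 2 -4], i.e. the coordinates of 2 - 4x
```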
2. Computing T(v) directly gives (as in Exercise 1)
T(v) = T(4 + 2x) = 2 − 4x.
In terms of the basis B, we have 4 + 2x = 3(1 + x) + (1 − x), so that
[4 + 2x]B = (3, 1).
Now,
[T(1 + x)]C = [1 − x]C = (1, −1), [T(1 − x)]C = [−1 − x]C = (−1, −1) ⇒ [T]C←B = [[T(1 + x)]C [T(1 − x)]C] = [1, −1; −1, −1].
Then
[T]C←B[4 + 2x]B = [1, −1; −1, −1](3, 1) = (2, −4) = [2 − 4x]C = [T(4 + 2x)]C.
3. Computing T(v) directly gives
T(v) = T(a + bx + cx²) = a + b(x + 2) + c(x + 2)².
In terms of the standard basis B, we have
[a + bx + cx²]B = (a, b, c).
Now,
[T(1)]C = [1]C = (1, 0, 0), [T(x)]C = [x + 2]C = (0, 1, 0), [T(x²)]C = [(x + 2)²]C = (0, 0, 1)
⇒ [T]C←B = [[T(1)]C [T(x)]C [T(x²)]C] = [1, 0, 0; 0, 1, 0; 0, 0, 1].
Then
[T]C←B[a + bx + cx²]B = [1, 0, 0; 0, 1, 0; 0, 0, 1](a, b, c) = (a, b, c) = [a + b(x + 2) + c(x + 2)²]C = [T(a + bx + cx²)]C.
4. Computing T(v) directly gives
T(v) = T(a + bx + cx²) = a + b(x + 2) + c(x + 2)² = (a + 2b + 4c) + (b + 4c)x + cx².
To compute [a + bx + cx²]B, we want to solve
a + bx + cx² = r + s(x + 2) + t(x + 2)² = (r + 2s + 4t) + (s + 4t)x + tx²
for r, s, and t. The x² terms give t = c; then from the linear term b = s + 4c, so that s = b − 4c; finally,
a = r + 2(b − 4c) + 4c = r + 2b − 4c, so that r = a − 2b + 4c. Hence
[a + bx + cx²]B = (a − 2b + 4c, b − 4c, c).
Now,
[T(1)]C = [1]C = (1, 0, 0),
[T(x + 2)]C = [x + 4]C = (4, 1, 0),
[T((x + 2)²)]C = [(x + 4)²]C = [x² + 8x + 16]C = (16, 8, 1),
so that
[T]C←B = [[T(1)]C [T(x + 2)]C [T((x + 2)²)]C] = [1, 4, 16; 0, 1, 8; 0, 0, 1].
Then
[T]C←B[a + bx + cx²]B = [1, 4, 16; 0, 1, 8; 0, 0, 1](a − 2b + 4c, b − 4c, c)
= ((a − 2b + 4c) + 4(b − 4c) + 16c, (b − 4c) + 8c, c)
= (a + 2b + 4c, b + 4c, c)
= [(a + 2b + 4c) + (b + 4c)x + cx²]C = [T(a + bx + cx²)]C.
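The final matrix-vector product can be verified symbolically; a minimal sympy sketch using the matrix and coordinate vector derived above:

```python
import sympy as sp

a, b, c = sp.symbols('a b c')

M = sp.Matrix([[1, 4, 16],
               [0, 1, 8],
               [0, 0, 1]])                    # [T]_{C<-B} from above
vB = sp.Matrix([a - 2*b + 4*c, b - 4*c, c])   # [a + bx + cx^2]_B

result = (M * vB).expand()
print(result)  # the column (a + 2b + 4c, b + 4c, c)
```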
5. Directly,
T(a + bx + cx²) = (a, a + b + c).
Using the standard basis B, we have
[a + bx + cx²]B = (a, b, c).
Now,
[T(1)]C = [(1, 1)]C = (1, 1), [T(x)]C = [(0, 1)]C = (0, 1), [T(x²)]C = [(0, 1)]C = (0, 1),
so that
[T]C←B = [[T(1)]C [T(x)]C [T(x²)]C] = [1, 0, 0; 1, 1, 1].
Then
[T]C←B[a + bx + cx²]B = [1, 0, 0; 1, 1, 1](a, b, c) = (a, a + b + c) = [T(a + bx + cx²)]C.
6. Directly, as in Exercise 5,
T(a + bx + cx²) = (a, a + b + c).
Using the basis B, we have
[a + bx + cx²]B = (c, b, a).
Now,
[T(x²)]C = [(0, 1)]C = (−1, 1), [T(x)]C = [(0, 1)]C = (−1, 1), [T(1)]C = [(1, 1)]C = (0, 1),
so that
[T]C←B = [[T(x²)]C [T(x)]C [T(1)]C] = [−1, −1, 0; 1, 1, 1].
Then
[T]C←B[a + bx + cx²]B = [−1, −1, 0; 1, 1, 1](c, b, a) = (−(b + c), a + b + c) = [(a, a + b + c)]C = [T(a + bx + cx²)]C.
7. Computing directly gives
Tv = T(−7, 7) = (−7 + 2 · 7, −(−7), 7) = (7, 7, 7).
We first compute the coordinates of (−7, 7) in B. We want to solve
(−7, 7) = r(1, 2) + s(3, −1)
for r and s; that is, we want r and s such that −7 = r + 3s and 7 = 2r − s. Solving this pair of
equations gives r = 2 and s = −3. Thus
[(−7, 7)]B = (2, −3).
Now,
[T(1, 2)]C = [(5, −1, 2)]C = (6, −3, 2), [T(3, −1)]C = [(1, −3, −1)]C = (4, −2, −1),
so that
[T]C←B = [[T(1, 2)]C [T(3, −1)]C] = [6, 4; −3, −2; 2, −1].
Then
[T]C←B[(−7, 7)]B = [6, 4; −3, −2; 2, −1](2, −3) = (0, 0, 7) = [(7, 7, 7)]C = [T(−7, 7)]C.
8. Computing directly gives (as in the statement of Exercise 7)
Tv = T(a, b) = (a + 2b, −a, b).
We first compute the coordinates of (a, b) in B. We want to solve
(a, b) = r(1, 2) + s(3, −1)
for r and s; that is, we want r and s such that a = r + 3s and b = 2r − s. Solving this pair of equations
gives r = (a + 3b)/7 and s = (2a − b)/7. Thus
[(a, b)]B = ((a + 3b)/7, (2a − b)/7).
Now,
[T(1, 2)]C = [(5, −1, 2)]C = (6, −3, 2), [T(3, −1)]C = [(1, −3, −1)]C = (4, −2, −1),
so that
[T]C←B = [[T(1, 2)]C [T(3, −1)]C] = [6, 4; −3, −2; 2, −1].
Then
[T]C←B[(a, b)]B = [6, 4; −3, −2; 2, −1]((a + 3b)/7, (2a − b)/7)
= (6(a + 3b)/7 + 4(2a − b)/7, −3(a + 3b)/7 − 2(2a − b)/7, 2(a + 3b)/7 − (2a − b)/7)
= (2a + 2b, −a − b, b) = [(a + 2b, −a, b)]C = [T(a, b)]C.
9. Computing directly gives
Tv = TA = Aᵀ = [a, c; b, d].
Since A = aE11 + bE12 + cE21 + dE22, the coordinate vector of A in B is
[A]B = (a, b, c, d).
Now,
[TE11]C = [E11ᵀ]C = [E11]C = (1, 0, 0, 0),
[TE12]C = [E12ᵀ]C = [E21]C = (0, 0, 1, 0),
[TE21]C = [E21ᵀ]C = [E12]C = (0, 1, 0, 0),
[TE22]C = [E22ᵀ]C = [E22]C = (0, 0, 0, 1).
Therefore
[T]C←B = [[TE11]C [TE12]C [TE21]C [TE22]C] = [1, 0, 0, 0; 0, 0, 1, 0; 0, 1, 0, 0; 0, 0, 0, 1].
Then
[T]C←B[A]B = [1, 0, 0, 0; 0, 0, 1, 0; 0, 1, 0, 0; 0, 0, 0, 1](a, b, c, d) = (a, c, b, d) = [[a, c; b, d]]C = [TA]C.
10. Computing directly gives
Tv = TA = Aᵀ = [a, c; b, d].
Since A = aE11 + bE12 + cE21 + dE22, the coordinate vector of A in B = {E22, E21, E12, E11} is
[A]B = (d, c, b, a).
Now,
[TE22]C = [E22ᵀ]C = [E22]C = (0, 0, 0, 1),
[TE21]C = [E21ᵀ]C = [E12]C = (0, 1, 0, 0),
[TE12]C = [E12ᵀ]C = [E21]C = (0, 0, 1, 0),
[TE11]C = [E11ᵀ]C = [E11]C = (1, 0, 0, 0).
Therefore
[T]C←B = [[TE22]C [TE21]C [TE12]C [TE11]C] = [0, 0, 0, 1; 0, 1, 0, 0; 0, 0, 1, 0; 1, 0, 0, 0].
Then
[T]C←B[A]B = [0, 0, 0, 1; 0, 1, 0, 0; 0, 0, 1, 0; 1, 0, 0, 0](d, c, b, a) = (a, c, b, d) = [[a, c; b, d]]C = [TA]C.
11. Computing directly gives
Tv = TA = [a, b; c, d][1, −1; −1, 1] − [1, −1; −1, 1][a, b; c, d]
= [a − b, b − a; c − d, d − c] − [a − c, b − d; c − a, d − b]
= [c − b, d − a; a − d, b − c].
Since A = aE11 + bE12 + cE21 + dE22, the coordinate vector of A in B is
[A]B = (a, b, c, d).
Now,
[TE11]C = [E11B − BE11]C = [[0, −1; 1, 0]]C = (0, −1, 1, 0),
[TE12]C = [E12B − BE12]C = [[−1, 0; 0, 1]]C = (−1, 0, 0, 1),
[TE21]C = [E21B − BE21]C = [[1, 0; 0, −1]]C = (1, 0, 0, −1),
[TE22]C = [E22B − BE22]C = [[0, 1; −1, 0]]C = (0, 1, −1, 0).
Therefore
[T]C←B = [[TE11]C [TE12]C [TE21]C [TE22]C] = [0, −1, 1, 0; −1, 0, 0, 1; 1, 0, 0, −1; 0, 1, −1, 0].
Then
[T]C←B[A]B = [0, −1, 1, 0; −1, 0, 0, 1; 1, 0, 0, −1; 0, 1, −1, 0](a, b, c, d) = (c − b, d − a, a − d, b − c) = [[c − b, d − a; a − d, b − c]]C = [TA]C.
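Building this matrix column-by-column from the commutator is mechanical, so it is easy to verify numerically; a minimal numpy sketch:

```python
import numpy as np

B = np.array([[1, -1], [-1, 1]])
T = lambda A: A @ B - B @ A  # the commutator map of this exercise

# Columns are vec(T(Eij)) over the standard basis of M22
cols = []
for i in range(2):
    for j in range(2):
        Eij = np.zeros((2, 2))
        Eij[i, j] = 1
        cols.append(T(Eij).flatten())
M = np.column_stack(cols)
print(M.astype(int))  # matches [T]_{C<-B} computed above
```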
12. Computing directly gives
Tv = TA = [a, b; c, d] − [a, c; b, d] = [0, b − c; c − b, 0].
Using the given basis B for M22, the coordinate vector of A is
[A]B = (a, b, c, d).
Now,
[TE11]C = [E11 − E11ᵀ]C = (0, 0, 0, 0),
[TE12]C = [E12 − E12ᵀ]C = [E12 − E21]C = (0, 1, −1, 0),
[TE21]C = [E21 − E21ᵀ]C = [E21 − E12]C = (0, −1, 1, 0),
[TE22]C = [E22 − E22ᵀ]C = (0, 0, 0, 0).
Therefore
[T]C←B = [[TE11]C [TE12]C [TE21]C [TE22]C] = [0, 0, 0, 0; 0, 1, −1, 0; 0, −1, 1, 0; 0, 0, 0, 0].
Then
[T]C←B[A]B = [0, 0, 0, 0; 0, 1, −1, 0; 0, −1, 1, 0; 0, 0, 0, 0](a, b, c, d) = (0, b − c, c − b, 0) = [[0, b − c; c − b, 0]]C = [TA]C.
13. (a) If $a\sin x + b\cos x \in W$, then
$$D(a\sin x + b\cos x) = D(a\sin x) + D(b\cos x) = aD\sin x + bD\cos x = a\cos x - b\sin x \in W.$$
(b) Since
$$[\sin x]_B = \begin{bmatrix}1\\0\end{bmatrix}, \qquad [\cos x]_B = \begin{bmatrix}0\\1\end{bmatrix},$$
then
$$[D\sin x]_B = [\cos x]_B = \begin{bmatrix}0\\1\end{bmatrix}, \qquad [D\cos x]_B = [-\sin x]_B = \begin{bmatrix}-1\\0\end{bmatrix} \quad\Rightarrow\quad [D]_B = \begin{bmatrix}0&-1\\1&0\end{bmatrix}.$$
(c) $[f(x)]_B = [3\sin x - 5\cos x]_B = \begin{bmatrix}3\\-5\end{bmatrix}$, so
$$[D(f(x))]_B = [D]_B[f(x)]_B = \begin{bmatrix}0&-1\\1&0\end{bmatrix}\begin{bmatrix}3\\-5\end{bmatrix} = \begin{bmatrix}5\\3\end{bmatrix},$$
so that $D(f(x)) = f'(x) = 5\sin x + 3\cos x$. Directly, since $(\sin x)' = \cos x$ and $(\cos x)' = -\sin x$,
we also get $f'(x) = 3\cos x - 5(-\sin x) = 5\sin x + 3\cos x$.
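The matrix computation in part (c) can be double-checked against a numerical derivative; the short sketch below (variable names are our own) applies $[D]_B$ to the coordinates of $f$ and compares the result with a central-difference derivative at a sample point:

```python
import math

# B = {sin x, cos x}; [D]_B = [[0, -1], [1, 0]] as in Exercise 13(b).
D = [[0, -1], [1, 0]]
f = [3, -5]                       # f(x) = 3 sin x - 5 cos x in B-coordinates
Df = [sum(D[i][j] * f[j] for j in range(2)) for i in range(2)]
print(Df)                         # [5, 3] -> f'(x) = 5 sin x + 3 cos x

x, h = 0.7, 1e-6
fx = lambda t: 3 * math.sin(t) - 5 * math.cos(t)
num = (fx(x + h) - fx(x - h)) / (2 * h)          # numerical f'(x)
sym = Df[0] * math.sin(x) + Df[1] * math.cos(x)  # matrix-computed f'(x)
print(abs(num - sym) < 1e-6)      # True
```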
14. (a) If $ae^{2x} + be^{-2x} \in W$, then
$$D(ae^{2x} + be^{-2x}) = D(ae^{2x}) + D(be^{-2x}) = aDe^{2x} + bDe^{-2x} = 2ae^{2x} - 2be^{-2x} \in W.$$
(b) Since
$$[e^{2x}]_B = \begin{bmatrix}1\\0\end{bmatrix}, \qquad [e^{-2x}]_B = \begin{bmatrix}0\\1\end{bmatrix},$$
then
$$[De^{2x}]_B = [2e^{2x}]_B = \begin{bmatrix}2\\0\end{bmatrix}, \qquad [De^{-2x}]_B = [-2e^{-2x}]_B = \begin{bmatrix}0\\-2\end{bmatrix} \quad\Rightarrow\quad [D]_B = \begin{bmatrix}2&0\\0&-2\end{bmatrix}.$$
(c) $[f(x)]_B = [e^{2x} - 3e^{-2x}]_B = \begin{bmatrix}1\\-3\end{bmatrix}$, so
$$[D(f(x))]_B = [D]_B[f(x)]_B = \begin{bmatrix}2&0\\0&-2\end{bmatrix}\begin{bmatrix}1\\-3\end{bmatrix} = \begin{bmatrix}2\\6\end{bmatrix},$$
so that $D(f(x)) = f'(x) = 2e^{2x} + 6e^{-2x}$. Directly, $f'(x) = e^{2x}\cdot 2 - 3e^{-2x}\cdot(-2) = 2e^{2x} + 6e^{-2x}$.
15. (a) Since
$$[e^{2x}]_B = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [e^{2x}\cos x]_B = \begin{bmatrix}0\\1\\0\end{bmatrix}, \qquad [e^{2x}\sin x]_B = \begin{bmatrix}0\\0\\1\end{bmatrix},$$
then
$$[De^{2x}]_B = [2e^{2x}]_B = \begin{bmatrix}2\\0\\0\end{bmatrix}, \quad [D(e^{2x}\cos x)]_B = [2e^{2x}\cos x - e^{2x}\sin x]_B = \begin{bmatrix}0\\2\\-1\end{bmatrix}, \quad [D(e^{2x}\sin x)]_B = [2e^{2x}\sin x + e^{2x}\cos x]_B = \begin{bmatrix}0\\1\\2\end{bmatrix}.$$
Thus
$$[D]_B = \begin{bmatrix}2&0&0\\0&2&1\\0&-1&2\end{bmatrix}.$$
(b) $[f(x)]_B = [3e^{2x} - e^{2x}\cos x + 2e^{2x}\sin x]_B = \begin{bmatrix}3\\-1\\2\end{bmatrix}$, so
$$[D(f(x))]_B = [D]_B[f(x)]_B = \begin{bmatrix}2&0&0\\0&2&1\\0&-1&2\end{bmatrix}\begin{bmatrix}3\\-1\\2\end{bmatrix} = \begin{bmatrix}6\\0\\5\end{bmatrix},$$
so that $D(f(x)) = f'(x) = 6e^{2x} + 5e^{2x}\sin x$. Directly,
$$f'(x) = 3e^{2x}\cdot 2 - (2e^{2x}\cos x - e^{2x}\sin x) + (4e^{2x}\sin x + 2e^{2x}\cos x) = 6e^{2x} + 5e^{2x}\sin x.$$
16. (a) Since
$$[\cos x]_B = \begin{bmatrix}1\\0\\0\\0\end{bmatrix}, \qquad [\sin x]_B = \begin{bmatrix}0\\1\\0\\0\end{bmatrix}, \qquad [x\cos x]_B = \begin{bmatrix}0\\0\\1\\0\end{bmatrix}, \qquad [x\sin x]_B = \begin{bmatrix}0\\0\\0\\1\end{bmatrix},$$
then
$$[D\cos x]_B = [-\sin x]_B = \begin{bmatrix}0\\-1\\0\\0\end{bmatrix}, \qquad [D\sin x]_B = [\cos x]_B = \begin{bmatrix}1\\0\\0\\0\end{bmatrix},$$
$$[D(x\cos x)]_B = [\cos x - x\sin x]_B = \begin{bmatrix}1\\0\\0\\-1\end{bmatrix}, \qquad [D(x\sin x)]_B = [\sin x + x\cos x]_B = \begin{bmatrix}0\\1\\1\\0\end{bmatrix}.$$
Thus
$$[D]_B = \begin{bmatrix}0&1&1&0\\-1&0&0&1\\0&0&0&1\\0&0&-1&0\end{bmatrix}.$$
(b) $[f(x)]_B = [\cos x + 2x\cos x]_B = \begin{bmatrix}1\\0\\2\\0\end{bmatrix}$, so
$$[D(f(x))]_B = [D]_B[f(x)]_B = \begin{bmatrix}0&1&1&0\\-1&0&0&1\\0&0&0&1\\0&0&-1&0\end{bmatrix}\begin{bmatrix}1\\0\\2\\0\end{bmatrix} = \begin{bmatrix}2\\-1\\0\\-2\end{bmatrix},$$
so that $D(f(x)) = f'(x) = 2\cos x - \sin x - 2x\sin x$. Directly,
$$f'(x) = -\sin x + 2\cos x - 2x\sin x = 2\cos x - \sin x - 2x\sin x.$$
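The same kind of check as in Exercise 13 works for this four-dimensional space; the sketch below (names are ours) multiplies $[D]_B$ by the coordinates of $f(x) = \cos x + 2x\cos x$ and compares with a numerical derivative:

```python
import math

# B = {cos x, sin x, x cos x, x sin x}; [D]_B as in Exercise 16(a).
D = [[0, 1, 1, 0],
     [-1, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 0, -1, 0]]
f = [1, 0, 2, 0]                  # f(x) = cos x + 2x cos x
Df = [sum(D[i][j] * f[j] for j in range(4)) for i in range(4)]
print(Df)                         # [2, -1, 0, -2] -> 2 cos x - sin x - 2x sin x

x, h = 0.3, 1e-6
fx = lambda t: math.cos(t) + 2 * t * math.cos(t)
basis = lambda t: [math.cos(t), math.sin(t), t * math.cos(t), t * math.sin(t)]
num = (fx(x + h) - fx(x - h)) / (2 * h)
sym = sum(c * b for c, b in zip(Df, basis(x)))
print(abs(num - sym) < 1e-6)      # True
```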
17. (a) Let $p(x) = a + bx$; then
$$(S \circ T)p(x) = S(Tp(x)) = S\begin{bmatrix}p(0)\\p(1)\end{bmatrix} = S\begin{bmatrix}a\\a+b\end{bmatrix} = \begin{bmatrix}a - 2(a+b)\\2a - (a+b)\end{bmatrix} = \begin{bmatrix}-a-2b\\a-b\end{bmatrix}.$$
Then
$$[(S\circ T)(1)]_D = \begin{bmatrix}-1\\1\end{bmatrix}, \qquad [(S\circ T)(x)]_D = \begin{bmatrix}-2\\-1\end{bmatrix},$$
so that
$$[S\circ T]_{D\leftarrow B} = \begin{bmatrix}-1&-2\\1&-1\end{bmatrix}.$$
(b) We have
$$[T(1)]_C = \begin{bmatrix}1\\1\end{bmatrix}, \qquad [T(x)]_C = \begin{bmatrix}0\\1\end{bmatrix}, \quad\text{so that}\quad [T]_{C\leftarrow B} = \begin{bmatrix}1&0\\1&1\end{bmatrix}.$$
Next,
$$S\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}1\\2\end{bmatrix}, \qquad S\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}-2\\-1\end{bmatrix}, \quad\text{so that}\quad [S]_{D\leftarrow C} = \begin{bmatrix}1&-2\\2&-1\end{bmatrix}.$$
Then
$$[S\circ T]_{D\leftarrow B} = [S]_{D\leftarrow C}[T]_{C\leftarrow B} = \begin{bmatrix}1&-2\\2&-1\end{bmatrix}\begin{bmatrix}1&0\\1&1\end{bmatrix} = \begin{bmatrix}-1&-2\\1&-1\end{bmatrix},$$
which matches the result from part (a).
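The matrix product in part (b) takes one line to verify; the snippet below (names are ours) multiplies the two matrices found above and confirms the answer from part (a):

```python
# [S o T]_{D<-B} = [S]_{D<-C} [T]_{C<-B}, Exercise 17(b).
T = [[1, 0], [1, 1]]      # matrix of T with respect to B = {1, x} and C
S = [[1, -2], [2, -1]]    # matrix of S with respect to C and D
ST = [[sum(S[i][k] * T[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(ST)                 # [[-1, -2], [1, -1]], matching part (a)
```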
18. (a) Let $p(x) = a + bx$; then
$$(S\circ T)p(x) = S(Tp(x)) = S(a + b(x+1)) = S((a+b) + bx) = (a+b) + b(x+1) = (a+2b) + bx.$$
Then
$$[(S\circ T)(1)]_D = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [(S\circ T)(x)]_D = \begin{bmatrix}2\\1\\0\end{bmatrix}, \quad\text{so that}\quad [S\circ T]_{D\leftarrow B} = \begin{bmatrix}1&2\\0&1\\0&0\end{bmatrix}.$$
(b) We have
$$[T(1)]_C = [1]_C = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x)]_C = [x+1]_C = \begin{bmatrix}1\\1\\0\end{bmatrix}, \quad\text{so that}\quad [T]_{C\leftarrow B} = \begin{bmatrix}1&1\\0&1\\0&0\end{bmatrix}.$$
Next,
$$[S(1)]_D = [1]_D = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [S(x)]_D = [x+1]_D = \begin{bmatrix}1\\1\\0\end{bmatrix}, \qquad [S(x^2)]_D = [(x+1)^2]_D = [x^2+2x+1]_D = \begin{bmatrix}1\\2\\1\end{bmatrix},$$
so that
$$[S]_{D\leftarrow C} = \begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix}.$$
Then
$$[S\circ T]_{D\leftarrow B} = [S]_{D\leftarrow C}[T]_{C\leftarrow B} = \begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix}\begin{bmatrix}1&1\\0&1\\0&0\end{bmatrix} = \begin{bmatrix}1&2\\0&1\\0&0\end{bmatrix},$$
which matches the result from part (a).
19. In Exercise 1, the basis for $P_1$ in both the domain and codomain was already the standard basis $E$, so we can use the matrix computed there:
$$[T]_{E\leftarrow E} = \begin{bmatrix}0&1\\-1&0\end{bmatrix}.$$
Since this matrix has determinant 1, it is invertible, so that $T$ is invertible as well. Then
$$[T^{-1}(a+bx)]_E = \left([T]_{E\leftarrow E}\right)^{-1}\begin{bmatrix}a\\b\end{bmatrix} = \begin{bmatrix}0&-1\\1&0\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix} = \begin{bmatrix}-b\\a\end{bmatrix}.$$
So $T^{-1}(a + bx) = -b + ax$.
20. From Exercise 5,
$$[T]_{C\leftarrow B} = \begin{bmatrix}1&0\\1&1\\0&1\end{bmatrix}.$$
Since this matrix is not square, it is not invertible, so that $T$ is not invertible.
21. We first compute $[T]_{E\leftarrow E}$, where $E = \{1, x, x^2\}$ is the standard basis:
$$[T(1)]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \quad [T(x)]_E = [x+2]_E = \begin{bmatrix}2\\1\\0\end{bmatrix}, \quad [T(x^2)]_E = [(x+2)^2]_E = [x^2+4x+4]_E = \begin{bmatrix}4\\4\\1\end{bmatrix},$$
so that
$$[T]_{E\leftarrow E} = \begin{bmatrix}1&2&4\\0&1&4\\0&0&1\end{bmatrix}.$$
Since this is an upper triangular matrix with nonzero diagonal entries, it is invertible, so that $T$ is invertible as well. To find $T^{-1}$, we have
$$[T^{-1}(a+bx+cx^2)]_E = \left([T]_{E\leftarrow E}\right)^{-1}\begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix}1&-2&4\\0&1&-4\\0&0&1\end{bmatrix}\begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix}a-2b+4c\\b-4c\\c\end{bmatrix}.$$
So $T^{-1}(a + bx + cx^2) = (a-2b+4c) + (b-4c)x + cx^2 = a + b(x-2) + c(x-2)^2 = p(x-2)$.
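The closing identity, that the inverse matrix really implements $p(x) \mapsto p(x-2)$, can be spot-checked numerically; the sketch below (the sample coefficients and names are ours) applies the inverse matrix to a polynomial's coefficients and compares $q(x)$ with $p(x-2)$ at a few points:

```python
# Inverse matrix from Exercise 21 acting on the coefficients of p(x) = a + bx + cx^2.
Tinv = [[1, -2, 4], [0, 1, -4], [0, 0, 1]]
a, b, c = 2.0, -3.0, 5.0                     # sample polynomial
coeffs = [sum(Tinv[i][j] * v for j, v in enumerate([a, b, c])) for i in range(3)]
print(coeffs)                                # [28.0, -23.0, 5.0]

p = lambda x: a + b * x + c * x * x
q = lambda x: coeffs[0] + coeffs[1] * x + coeffs[2] * x * x
print(all(abs(q(x) - p(x - 2)) < 1e-9 for x in [-1.0, 0.5, 3.0]))   # True
```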
22. We first compute $[T]_{E\leftarrow E}$, where $E = \{1, x, x^2\}$ is the standard basis:
$$[T(1)]_E = [0]_E = \begin{bmatrix}0\\0\\0\end{bmatrix}, \qquad [T(x)]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x^2)]_E = [2x]_E = \begin{bmatrix}0\\2\\0\end{bmatrix},$$
so that
$$[T]_{E\leftarrow E} = \begin{bmatrix}0&1&0\\0&0&2\\0&0&0\end{bmatrix}.$$
Since this is an upper triangular matrix with a zero entry on the diagonal, it has determinant zero, so it is not invertible; therefore, $T$ is not invertible either.
23. We first compute $[T]_{E\leftarrow E}$, where $E = \{1, x, x^2\}$ is the standard basis:
$$[T(1)]_E = [1+0]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x)]_E = [x+1]_E = \begin{bmatrix}1\\1\\0\end{bmatrix}, \qquad [T(x^2)]_E = [x^2+2x]_E = \begin{bmatrix}0\\2\\1\end{bmatrix},$$
so that
$$[T]_{E\leftarrow E} = \begin{bmatrix}1&1&0\\0&1&2\\0&0&1\end{bmatrix}.$$
Since this is an upper triangular matrix with nonzero diagonal entries, it is invertible, so that $T$ is invertible as well. To find $T^{-1}$, we have
$$[T^{-1}(a+bx+cx^2)]_E = \left([T]_{E\leftarrow E}\right)^{-1}\begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix}1&-1&2\\0&1&-2\\0&0&1\end{bmatrix}\begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix}a-b+2c\\b-2c\\c\end{bmatrix}.$$
So $T^{-1}(a + bx + cx^2) = (a-b+2c) + (b-2c)x + cx^2$.
24. We first compute $[T]_{E\leftarrow E}$, where $E = \{E_{11}, E_{12}, E_{21}, E_{22}\}$ is the standard basis:
$$[T(E_{11})]_E = \left[\begin{bmatrix}1&0\\0&0\end{bmatrix}\begin{bmatrix}3&2\\2&1\end{bmatrix}\right]_E = \left[\begin{bmatrix}3&2\\0&0\end{bmatrix}\right]_E = \begin{bmatrix}3\\2\\0\\0\end{bmatrix}, \qquad [T(E_{12})]_E = \left[\begin{bmatrix}2&1\\0&0\end{bmatrix}\right]_E = \begin{bmatrix}2\\1\\0\\0\end{bmatrix},$$
$$[T(E_{21})]_E = \left[\begin{bmatrix}0&0\\3&2\end{bmatrix}\right]_E = \begin{bmatrix}0\\0\\3\\2\end{bmatrix}, \qquad [T(E_{22})]_E = \left[\begin{bmatrix}0&0\\2&1\end{bmatrix}\right]_E = \begin{bmatrix}0\\0\\2\\1\end{bmatrix},$$
so that
$$[T]_{E\leftarrow E} = \begin{bmatrix}3&2&0&0\\2&1&0&0\\0&0&3&2\\0&0&2&1\end{bmatrix}.$$
This matrix is block triangular, and each diagonal block is invertible, so it is invertible and therefore $T$ is invertible as well. To find $T^{-1}$, we have
$$\left[T^{-1}\begin{bmatrix}a&b\\c&d\end{bmatrix}\right]_{E} = \left([T]_{E\leftarrow E}\right)^{-1}\begin{bmatrix}a\\b\\c\\d\end{bmatrix} = \begin{bmatrix}-1&2&0&0\\2&-3&0&0\\0&0&-1&2\\0&0&2&-3\end{bmatrix}\begin{bmatrix}a\\b\\c\\d\end{bmatrix} = \begin{bmatrix}-a+2b\\2a-3b\\-c+2d\\2c-3d\end{bmatrix}.$$
So
$$T^{-1}\begin{bmatrix}a&b\\c&d\end{bmatrix} = \begin{bmatrix}-a+2b & 2a-3b\\ -c+2d & 2c-3d\end{bmatrix}.$$
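Since $T$ here is right-multiplication by $\begin{bmatrix}3&2\\2&1\end{bmatrix}$, its inverse is right-multiplication by that matrix's inverse; the sketch below (the sample matrix and names are ours) confirms both that fact and the formula just derived:

```python
# T(A) = A * B0 with B0 = [3 2; 2 1]; its inverse multiplies by B0^{-1} = [-1 2; 2 -3].
def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

B0 = [[3, 2], [2, 1]]
B0inv = [[-1, 2], [2, -3]]
prod = matmul2(B0, B0inv)
print(prod)        # [[1, 0], [0, 1]]

A = [[5, 7], [-2, 4]]              # sample input (a, b, c, d) = (5, 7, -2, 4)
Tinv = matmul2(A, B0inv)
print(Tinv)        # [[9, -11], [10, -16]] = [-a+2b, 2a-3b; -c+2d, 2c-3d]
```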
25. From Exercise 11, since the bases given there were both the standard basis,
$$[T]_{E\leftarrow E} = \begin{bmatrix}0&-1&1&0\\-1&0&0&1\\1&0&0&-1\\0&1&-1&0\end{bmatrix}.$$
This matrix has determinant zero (for example, the second and third rows are multiples of one another), so that $T$ is not invertible.
26. From Exercise 12, since the bases given there were both the standard basis,
$$[T]_{E\leftarrow E} = \begin{bmatrix}0&0&0&0\\0&1&-1&0\\0&-1&1&0\\0&0&0&0\end{bmatrix}.$$
This matrix has determinant zero, so that $T$ is not invertible.
27. In Exercise 13, the basis is $B = \{\sin x, \cos x\}$. Using the method of Example 6.83,
$$\left[\int (a\sin x + b\cos x)\,dx\right]_B = [D]_{B\leftarrow B}^{-1}\begin{bmatrix}a\\b\end{bmatrix}.$$
Using $[D]_{B\leftarrow B}$ from Exercise 13, we get
$$\left[\int (\sin x - 3\cos x)\,dx\right]_B = \begin{bmatrix}0&-1\\1&0\end{bmatrix}^{-1}\begin{bmatrix}1\\-3\end{bmatrix} = \begin{bmatrix}0&1\\-1&0\end{bmatrix}\begin{bmatrix}1\\-3\end{bmatrix} = \begin{bmatrix}-3\\-1\end{bmatrix}.$$
So $\int (\sin x - 3\cos x)\,dx = -3\sin x - \cos x + C$. This is the same answer as integrating directly.
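The antiderivative produced by the inverse matrix can be verified by differentiating it numerically; the snippet below (names are ours) does exactly that at a sample point:

```python
import math

# Antidifferentiation on span{sin x, cos x} via the inverse of [D]_B, Exercise 27.
Dinv = [[0, 1], [-1, 0]]          # inverse of [D]_B = [[0, -1], [1, 0]]
g = [1, -3]                       # g(x) = sin x - 3 cos x
F = [sum(Dinv[i][j] * g[j] for j in range(2)) for i in range(2)]
print(F)                          # [-3, -1] -> -3 sin x - cos x + C

x, h = 1.1, 1e-6
Ffun = lambda t: F[0] * math.sin(t) + F[1] * math.cos(t)
gfun = lambda t: math.sin(t) - 3 * math.cos(t)
print(abs((Ffun(x + h) - Ffun(x - h)) / (2 * h) - gfun(x)) < 1e-6)   # True
```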
28. In Exercise 14, the basis is $B = \{e^{2x}, e^{-2x}\}$. Using the method of Example 6.83,
$$\left[\int \left(ae^{2x} + be^{-2x}\right)dx\right]_B = [D]_{B\leftarrow B}^{-1}\begin{bmatrix}a\\b\end{bmatrix}.$$
Using $[D]_{B\leftarrow B}$ from Exercise 14, we get
$$\left[\int 5e^{-2x}\,dx\right]_B = \begin{bmatrix}2&0\\0&-2\end{bmatrix}^{-1}\begin{bmatrix}0\\5\end{bmatrix} = \begin{bmatrix}\frac12&0\\0&-\frac12\end{bmatrix}\begin{bmatrix}0\\5\end{bmatrix} = \begin{bmatrix}0\\-\frac52\end{bmatrix}.$$
So $\int 5e^{-2x}\,dx = -\frac52 e^{-2x} + C$. This is the same answer as integrating directly.
29. In Exercise 15, the basis is $B = \{e^{2x}, e^{2x}\cos x, e^{2x}\sin x\}$. Using the method of Example 6.83,
$$\left[\int \left(ae^{2x} + be^{2x}\cos x + ce^{2x}\sin x\right)dx\right]_B = [D]_{B\leftarrow B}^{-1}\begin{bmatrix}a\\b\\c\end{bmatrix}.$$
Using $[D]_{B\leftarrow B}$ from Exercise 15, we get
$$\left[\int \left(e^{2x}\cos x - 2e^{2x}\sin x\right)dx\right]_B = \begin{bmatrix}2&0&0\\0&2&1\\0&-1&2\end{bmatrix}^{-1}\begin{bmatrix}0\\1\\-2\end{bmatrix} = \begin{bmatrix}\frac12&0&0\\0&\frac25&-\frac15\\0&\frac15&\frac25\end{bmatrix}\begin{bmatrix}0\\1\\-2\end{bmatrix} = \begin{bmatrix}0\\\frac45\\-\frac35\end{bmatrix}.$$
So $\int \left(e^{2x}\cos x - 2e^{2x}\sin x\right)dx = \frac45 e^{2x}\cos x - \frac35 e^{2x}\sin x + C$. Checking via differentiation gives
$$\left(\frac45 e^{2x}\cos x - \frac35 e^{2x}\sin x + C\right)' = \frac45\left(2e^{2x}\cos x - e^{2x}\sin x\right) - \frac35\left(2e^{2x}\sin x + e^{2x}\cos x\right) = e^{2x}\cos x - 2e^{2x}\sin x.$$
30. In Exercise 16, the basis is $B = \{\cos x, \sin x, x\cos x, x\sin x\}$. Using the method of Example 6.83,
$$\left[\int (a\cos x + b\sin x + cx\cos x + dx\sin x)\,dx\right]_B = [D]_{B\leftarrow B}^{-1}\begin{bmatrix}a\\b\\c\\d\end{bmatrix}.$$
Using $[D]_{B\leftarrow B}$ from Exercise 16, we get
$$\left[\int (x\cos x + x\sin x)\,dx\right]_B = \begin{bmatrix}0&1&1&0\\-1&0&0&1\\0&0&0&1\\0&0&-1&0\end{bmatrix}^{-1}\begin{bmatrix}0\\0\\1\\1\end{bmatrix} = \begin{bmatrix}0&-1&1&0\\1&0&0&1\\0&0&0&-1\\0&0&1&0\end{bmatrix}\begin{bmatrix}0\\0\\1\\1\end{bmatrix} = \begin{bmatrix}1\\1\\-1\\1\end{bmatrix}.$$
So $\int (x\cos x + x\sin x)\,dx = \cos x + \sin x - x\cos x + x\sin x + C$. Checking via differentiation gives
$$(\cos x + \sin x - x\cos x + x\sin x + C)' = -\sin x + \cos x - \cos x + x\sin x + \sin x + x\cos x = x\cos x + x\sin x.$$
31. With $E = \{\mathbf{e}_1, \mathbf{e}_2\}$, we have
$$[T\mathbf{e}_1]_E = \begin{bmatrix}-4\cdot 0\\1 + 5\cdot 0\end{bmatrix}_E = \begin{bmatrix}0\\1\end{bmatrix}, \qquad [T\mathbf{e}_2]_E = \begin{bmatrix}-4\cdot 1\\0 + 5\cdot 1\end{bmatrix}_E = \begin{bmatrix}-4\\5\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}0&-4\\1&5\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ will give us a diagonalizing basis $C$. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be
$$\lambda_1 = 4,\; \mathbf{v}_1 = \begin{bmatrix}-1\\1\end{bmatrix}; \qquad \lambda_2 = 1,\; \mathbf{v}_2 = \begin{bmatrix}-4\\1\end{bmatrix}.$$
Thus
$$C = \left\{\begin{bmatrix}-1\\1\end{bmatrix}, \begin{bmatrix}-4\\1\end{bmatrix}\right\}, \quad\text{and}\quad [T]_C = \begin{bmatrix}-1&-4\\1&1\end{bmatrix}^{-1}[T]_E\begin{bmatrix}-1&-4\\1&1\end{bmatrix} = \begin{bmatrix}4&0\\0&1\end{bmatrix}.$$
32. With $E = \{\mathbf{e}_1, \mathbf{e}_2\}$, we have
$$[T\mathbf{e}_1]_E = \begin{bmatrix}1-0\\1+0\end{bmatrix}_E = \begin{bmatrix}1\\1\end{bmatrix}, \qquad [T\mathbf{e}_2]_E = \begin{bmatrix}0-1\\0+1\end{bmatrix}_E = \begin{bmatrix}-1\\1\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}1&-1\\1&1\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ would give us a diagonalizing basis $C$. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be
$$\lambda_1 = 1+i,\; \mathbf{v}_1 = \begin{bmatrix}i\\1\end{bmatrix}; \qquad \lambda_2 = 1-i,\; \mathbf{v}_2 = \begin{bmatrix}-i\\1\end{bmatrix}.$$
Since the eigenvectors are not real, $T$ is not diagonalizable.
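The eigenpairs claimed in Exercise 31 are easy to confirm by direct multiplication; the sketch below (names are ours) checks $A\mathbf{v} = \lambda\mathbf{v}$ for both pairs:

```python
# A = [T]_E from Exercise 31; eigenpairs (4, (-1, 1)) and (1, (-4, 1)).
A = [[0, -4], [1, 5]]
checks = []
for lam, v in [(4, (-1, 1)), (1, (-4, 1))]:
    Av = [A[0][0] * v[0] + A[0][1] * v[1],
          A[1][0] * v[0] + A[1][1] * v[1]]
    checks.append(Av == [lam * v[0], lam * v[1]])
print(checks)   # [True, True]
```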
33. With $E = \{1, x\}$, we have
$$[T(1)]_E = [4+x]_E = \begin{bmatrix}4\\1\end{bmatrix}, \qquad [T(x)]_E = [2+3x]_E = \begin{bmatrix}2\\3\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}4&2\\1&3\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ will give us a diagonalizing basis $C$. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be
$$\lambda_1 = 5,\; \mathbf{v}_1 = \begin{bmatrix}2\\1\end{bmatrix}; \qquad \lambda_2 = 2,\; \mathbf{v}_2 = \begin{bmatrix}-1\\1\end{bmatrix}.$$
Thus
$$C = \left\{\begin{bmatrix}2\\1\end{bmatrix}, \begin{bmatrix}-1\\1\end{bmatrix}\right\}, \quad\text{and}\quad [T]_C = \begin{bmatrix}2&-1\\1&1\end{bmatrix}^{-1}[T]_E\begin{bmatrix}2&-1\\1&1\end{bmatrix} = \begin{bmatrix}5&0\\0&2\end{bmatrix}.$$
34. With $E = \{1, x, x^2\}$, we have
$$[T(1)]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x)]_E = [x+1]_E = \begin{bmatrix}1\\1\\0\end{bmatrix}, \qquad [T(x^2)]_E = [(x+1)^2]_E = [x^2+2x+1]_E = \begin{bmatrix}1\\2\\1\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ would give us a diagonalizing basis $C$. Since this is an upper triangular matrix with 1s on the diagonal, the only eigenvalue is 1; using methods from Chapter 4 we find the eigenspace to be one-dimensional with basis $\left\{\begin{bmatrix}1\\0\\0\end{bmatrix}\right\}$. Thus $T$ is not diagonalizable.
35. With $E = \{1, x\}$, we have
$$[T(1)]_E = [1 + x\cdot 0]_E = [1]_E = \begin{bmatrix}1\\0\end{bmatrix}, \qquad [T(x)]_E = [x + x\cdot 1]_E = [2x]_E = \begin{bmatrix}0\\2\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}1&0\\0&2\end{bmatrix}.$$
Thus $T$ is already diagonal with respect to the standard basis $E$.
36. With $E = \{1, x, x^2\}$, we have
$$[T(1)]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x)]_E = [3x+2]_E = \begin{bmatrix}2\\3\\0\end{bmatrix}, \qquad [T(x^2)]_E = [(3x+2)^2]_E = [9x^2+12x+4]_E = \begin{bmatrix}4\\12\\9\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}1&2&4\\0&3&12\\0&0&9\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ will give us a diagonalizing basis $C$. Since this is an upper triangular matrix, the eigenvalues are its diagonal entries 1, 3, and 9; using methods from Chapter 4 we find the corresponding eigenvectors to be
$$\lambda_1 = 1,\; \mathbf{v}_1 = \begin{bmatrix}1\\0\\0\end{bmatrix}; \qquad \lambda_2 = 3,\; \mathbf{v}_2 = \begin{bmatrix}1\\1\\0\end{bmatrix}; \qquad \lambda_3 = 9,\; \mathbf{v}_3 = \begin{bmatrix}1\\2\\1\end{bmatrix}.$$
Thus
$$C = \left\{\begin{bmatrix}1\\0\\0\end{bmatrix}, \begin{bmatrix}1\\1\\0\end{bmatrix}, \begin{bmatrix}1\\2\\1\end{bmatrix}\right\}, \quad\text{and}\quad [T]_C = \begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix}^{-1}[T]_E\begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix} = \begin{bmatrix}1&0&0\\0&3&0\\0&0&9\end{bmatrix}.$$
37. Let $\ell$ have direction vector $\mathbf{d} = \begin{bmatrix}d_1\\d_2\end{bmatrix}$, and let $T$ be the reflection in $\ell$. Let $\mathbf{d}' = \begin{bmatrix}-d_2\\d_1\end{bmatrix}$. As in Example 6.85, we may as well assume for simplicity that $\mathbf{d}$ is a unit vector; then $\mathbf{d}'$ is as well. Continuing to follow Example 6.85, $D = \{\mathbf{d}, \mathbf{d}'\}$ is a basis for $\mathbb{R}^2$. Since $\mathbf{d}'$ is orthogonal to $\mathbf{d}$, it follows that $T\mathbf{d}' = -\mathbf{d}'$. Thus
$$[T(\mathbf{d})]_D = [\mathbf{d}]_D = \begin{bmatrix}1\\0\end{bmatrix}, \qquad [T(\mathbf{d}')]_D = [-\mathbf{d}']_D = \begin{bmatrix}0\\-1\end{bmatrix}, \quad\text{so}\quad [T]_D = \begin{bmatrix}1&0\\0&-1\end{bmatrix}.$$
Now, $[T]_E = P_{E\leftarrow D}[T]_D P_{D\leftarrow E}$, so we must find $P_{E\leftarrow D}$. The columns of this matrix are the expressions of the basis vectors of $D$ in the standard basis $E$, so we have
$$P_{E\leftarrow D} = \begin{bmatrix}d_1&-d_2\\d_2&d_1\end{bmatrix}.$$
Since we assumed $\mathbf{d}$ was a unit vector, this is an orthogonal matrix, so that
$$[T]_E = P_{E\leftarrow D}[T]_D P_{E\leftarrow D}^{-1} = P_{E\leftarrow D}[T]_D P_{E\leftarrow D}^{T} = \begin{bmatrix}d_1&-d_2\\d_2&d_1\end{bmatrix}\begin{bmatrix}1&0\\0&-1\end{bmatrix}\begin{bmatrix}d_1&d_2\\-d_2&d_1\end{bmatrix} = \begin{bmatrix}d_1^2-d_2^2 & 2d_1d_2\\ 2d_1d_2 & d_2^2-d_1^2\end{bmatrix}.$$
Note that this is the same as the answer from Exercise 26 in Section 3.6, with the assumption that $\mathbf{d}$ is a unit vector.
38. Let $T$ be the projection onto $W$. From Example 5.11, we have an orthogonal basis for $\mathbb{R}^3$,
$$\mathbf{u}_1 = \begin{bmatrix}1\\1\\0\end{bmatrix}, \qquad \mathbf{u}_2 = \begin{bmatrix}-1\\1\\1\end{bmatrix}, \qquad \mathbf{u}_3 = \begin{bmatrix}1\\-1\\2\end{bmatrix},$$
where $\{\mathbf{u}_1, \mathbf{u}_2\}$ is a basis for $W$ and $\{\mathbf{u}_3\}$ is a basis for $W^\perp$. Then
$$[T(\mathbf{u}_1)]_D = \begin{bmatrix}1\\0\\0\end{bmatrix}, \quad [T(\mathbf{u}_2)]_D = \begin{bmatrix}0\\1\\0\end{bmatrix}, \quad [T(\mathbf{u}_3)]_D = \begin{bmatrix}0\\0\\0\end{bmatrix}, \quad\text{so that}\quad [T]_D = \begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}.$$
Further, since the columns of $P_{E\leftarrow D}$ are the expressions of the basis vectors of $D$ in the standard basis $E$,
$$P_{E\leftarrow D} = \begin{bmatrix}1&-1&1\\1&1&-1\\0&1&2\end{bmatrix} \quad\Rightarrow\quad P_{E\leftarrow D}^{-1} = \begin{bmatrix}\frac12&\frac12&0\\-\frac13&\frac13&\frac13\\\frac16&-\frac16&\frac13\end{bmatrix}.$$
Thus
$$[T]_E = P_{E\leftarrow D}[T]_D P_{D\leftarrow E} = P_{E\leftarrow D}[T]_D P_{E\leftarrow D}^{-1} = \begin{bmatrix}\frac56&\frac16&-\frac13\\\frac16&\frac56&\frac13\\-\frac13&\frac13&\frac13\end{bmatrix}.$$
If $\mathbf{v} = \begin{bmatrix}3\\-1\\2\end{bmatrix}$, then
$$\operatorname{proj}_W \mathbf{v} = [T]_E[\mathbf{v}]_E = \begin{bmatrix}\frac56&\frac16&-\frac13\\\frac16&\frac56&\frac13\\-\frac13&\frac13&\frac13\end{bmatrix}\begin{bmatrix}3\\-1\\2\end{bmatrix} = \begin{bmatrix}\frac53\\\frac13\\-\frac23\end{bmatrix},$$
which is the same as was found in Example 11 in Section 5.2.
39. Let B = {v1 , . . . , vn }, and suppose that A is such that A[v]B = [T v]C for all v ∈ V . Then A[vi ]B =
[T vi ]C . But [vi ]B = ei since the vi are the basis vectors of B, so this equation says that Aei = [T vi ]C .
The left-hand side of this equation is the ith column of A, while the right-hand side is the ith column
of [T ]C←B . Thus those ith columns are equal; since this holds for all i, we get A = [T ]C←B .
40. Suppose that [v]B ∈ null([T ]C←B ). That means that
0 = [T ]C←B [v]B = [T v]C ,
so that [T v]C = 0 and thus v ∈ ker(T ). Similarly, if v ∈ ker(T ), then
0 = [T v]C = [T ]C←B [v]B ,
so that [v]B ∈ null([T ]C←B ). Then the map V → Rn : v 7→ [v]B is an isomorphism mapping ker(T ) to
null([T ]C←B ), so the two subspaces are isomorphic and thus have the same dimension. So nullity(T ) =
nullity([T ]C←B ).
41. Suppose $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$, and write $A = [T]_{C\leftarrow B}$ with columns $\mathbf{a}_i$. By definition of the matrix of $T$ with respect to these bases,
$$\mathbf{a}_i = [T]_{C\leftarrow B}[\mathbf{b}_i]_B = [T(\mathbf{b}_i)]_C.$$
Now $\operatorname{rank}(A) = \dim \operatorname{col}(A)$, which is the number of linearly independent columns $\mathbf{a}_i$, and $\operatorname{rank}(T) = \dim \operatorname{range}(T)$, which is the number of linearly independent vectors $T(\mathbf{b}_i)$. Since the coordinate map $\mathbf{w} \mapsto [\mathbf{w}]_C$ is an isomorphism, $\{\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_m}\}$ is linearly independent if and only if $\{T(\mathbf{b}_{i_1}), \ldots, T(\mathbf{b}_{i_m})\}$ is linearly independent, so $\operatorname{col}(A) = \operatorname{span}(\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_m})$ if and only if $\operatorname{range}(T) = \operatorname{span}(T(\mathbf{b}_{i_1}), \ldots, T(\mathbf{b}_{i_m}))$, and therefore $\operatorname{rank}(A) = \operatorname{rank}(T)$.
42. T is diagonalizable if and only if there is some basis C and matrix P such that [T ]C = P −1 AP is
diagonal. Thus if T is diagonalizable, so is A. For the reverse, if A is diagonalizable, then the basis C
whose elements are the columns of P is a basis in which [T ]C is diagonal.
43. Let dim V = n and dim W = m. Then [T ]C←B is an m × n matrix, and by Theorem 3.26 in Section
3.5, rank([T ]C←B ) + nullity([T ]C←B ) = n. But by Exercises 40 and 41, rank([T ]C←B ) = rank(T ) and
nullity([T ]C←B ) = nullity(T ); since dim V = n, we get
dim V = rank(T ) + nullity(T ).
44. Claim that [T ]C 0 ←B0 = PC 0 ←C [T ]C←B PB←B0 . To prove this, it is enough to show that for all x ∈ V ,
[x]C 0 = PC 0 ←C [T ]C←B PB←B0 [x]B0
since the matrix of T with respect to B 0 and C 0 is unique by Exercise 39. And indeed
PC 0 ←C [T ]C←B PB←B0 [x]B0 = PC 0 ←C [T ]C←B (PB←B0 [x]B0 ) = PC 0 ←C ([T ]C←B [x]B ) = PC 0 ←C [x]C = [x]C 0 .
45. For T ∈ L(V, W ), define ϕ(T ) = [T ]C←B ∈ Mmn . Claim that ϕ is an isomorphism. We must first show
it is linear:
$$\varphi(S+T) = [S+T]_{C\leftarrow B} = \begin{bmatrix}[(S+T)\mathbf{b}_1]_C & \cdots & [(S+T)\mathbf{b}_n]_C\end{bmatrix} = \begin{bmatrix}[S\mathbf{b}_1]_C + [T\mathbf{b}_1]_C & \cdots & [S\mathbf{b}_n]_C + [T\mathbf{b}_n]_C\end{bmatrix}$$
$$= \begin{bmatrix}[S\mathbf{b}_1]_C & \cdots & [S\mathbf{b}_n]_C\end{bmatrix} + \begin{bmatrix}[T\mathbf{b}_1]_C & \cdots & [T\mathbf{b}_n]_C\end{bmatrix} = \varphi(S) + \varphi(T).$$
Similarly
$$\varphi(cS) = [cS]_{C\leftarrow B} = \begin{bmatrix}[(cS)\mathbf{b}_1]_C & \cdots & [(cS)\mathbf{b}_n]_C\end{bmatrix} = \begin{bmatrix}c[S\mathbf{b}_1]_C & \cdots & c[S\mathbf{b}_n]_C\end{bmatrix} = c\begin{bmatrix}[S\mathbf{b}_1]_C & \cdots & [S\mathbf{b}_n]_C\end{bmatrix} = c\varphi(S).$$
Next we show that ker ϕ = 0, so that ϕ is one-to-one. Suppose that ϕ(T ) = 0. That means that
[T ]C←B = [0]mn , so that nullity([T ]C←B ) = n. By Exercise 40, this is equal to nullity(T ). Thus
nullity(T ) = n = dim V , so that T is the zero linear transformation, showing that ϕ is one-to-one.
Finally, since dim Mmn = mn, we will have shown that ϕ is an isomorphism if we prove that
dim L(V, W ) = mn. But if {v1 , . . . , vn } and {w1 , . . . , wm } are bases for V and W , then a linear
map from V to W sends each vi to some linear combination of the wj , so it is a sum of maps that send
vi to wk and the other vi to zero. There are mn of these maps, which are clearly linearly independent,
so that dim L(V, W ) = mn.
46. From Exercise 45 with $W = \mathbb{R}$, we have $\dim W = \dim \mathbb{R} = 1$, so that $L(V, \mathbb{R}) \cong M_{1n}$. Since $\dim M_{1n} = n = \dim V$, we get
$$V^* = L(V, \mathbb{R}) \cong M_{1n} \cong V.$$
Exploration: Tiles, Lattices, and the Crystallographic Restriction
1. Let P be the pattern in Figure 6.15. We are given, from the figure, that P +u = P and that P +v = P .
Therefore if a is a positive integer, then P + au = (P + u) + (a − 1)u = P + (a − 1)u, and by induction
this is just P . Similarly if b is a positive integer then P + bv = (P + v) + (b − 1)v = P + (b − 1)v,
which again by induction is equal to P . If a is a negative integer, then since P + (−a)u = P , it follows
that P = P − (−a)u = P + au, and similarly for b < 0 and v.
2. The lattice points correspond to the centers of the trefoils in Figure 6.15. (The accompanying figure, showing the lattice spanned by the vectors u and v, is omitted here.)
3. Let $\theta$ be the smallest positive angle through which the lattice exhibits rotational symmetry. Let $R_O^\theta$ denote the operation of rotation through an angle $\theta > 0$ around a center of rotation $O$, and let $P$ be a pattern or lattice. Now we have
$$R_O^{n\theta}(P) = \left(R_O^\theta\right)^n(P) = P, \qquad n \in \mathbb{Z},\ n > 0,$$
since rotation through $n\theta$ is the same as applying a rotation through $\theta$ $n$ times. If $n < 0$, then rotating through $n\theta$ gives a figure which, if rotated back through $(-n)\theta$, is $P$, so it must be $P$ as well. This shows that $P$ is rotationally symmetric under any integer multiple of $\theta$.
Next, let $m$ be the smallest positive integer such that $m\theta \geq 360^\circ$, and suppose that $m\theta = 360^\circ + \varphi$. Note that $R_O^{360^\circ}(P) = P$, so that $R_O^{m\theta}(P) = R_O^{360^\circ+\varphi}(P) = R_O^{\varphi}(P)$. We want to show that $\frac{360^\circ}{\theta}$ is an integer, so it suffices to show that $\varphi = 0$. Now, since $m$ is minimal, we know that
$$m\theta - \varphi = 360^\circ > (m-1)\theta,$$
so that $m\theta - \varphi > m\theta - \theta$ and thus $-\varphi > -\theta$, so that $\varphi < \theta$. But
$$P = R_O^{m\theta}(P) = R_O^{360^\circ+\varphi}(P) = R_O^{\varphi}(P),$$
and we see that the lattice is rotationally symmetric under a rotation of $\varphi < \theta$. Since $\varphi \geq 0$ and $\theta$ is the smallest positive such angle, it follows that $\varphi = 0$. Thus $\theta = \frac{360^\circ}{m}$.
4. The smallest positive angle of rotational symmetry in the lattice of Exercise 2 is 60◦ ; this rotational
symmetry is also present in the original figure, in Figure 6.14 or 6.15.
5. Some examples are below: lattices with n-fold rotational symmetry for n = 1, 2, 3, 4, and 6, each generated by suitable vectors u and v (figures omitted here).
It is impossible to draw a lattice with 8-fold rotational symmetry, as the rest of this section will show.
6. (a) With $\theta = 60^\circ$, we have
$$\mathbf{u} = \begin{bmatrix}1\\0\end{bmatrix}, \qquad \mathbf{v} = \begin{bmatrix}\frac12\\\frac{\sqrt3}{2}\end{bmatrix}, \qquad\text{and}\qquad [R_\theta]_E = \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix} = \begin{bmatrix}\frac12 & -\frac{\sqrt3}{2}\\ \frac{\sqrt3}{2} & \frac12\end{bmatrix}.$$
(b) From the discussion in part (a),
$$R_\theta(\mathbf{u}) = \begin{bmatrix}\frac12 & -\frac{\sqrt3}{2}\\ \frac{\sqrt3}{2} & \frac12\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}\frac12\\\frac{\sqrt3}{2}\end{bmatrix} = \mathbf{v},$$
$$R_\theta(\mathbf{v}) = \begin{bmatrix}\frac12 & -\frac{\sqrt3}{2}\\ \frac{\sqrt3}{2} & \frac12\end{bmatrix}\begin{bmatrix}\frac12\\\frac{\sqrt3}{2}\end{bmatrix} = \begin{bmatrix}-\frac12\\\frac{\sqrt3}{2}\end{bmatrix} = -\mathbf{u} + \mathbf{v}.$$
With $B = \{\mathbf{u}, \mathbf{v}\}$, we have
$$[R_\theta(\mathbf{u})]_B = [\mathbf{v}]_B = \begin{bmatrix}0\\1\end{bmatrix}, \qquad [R_\theta(\mathbf{v})]_B = [-\mathbf{u}+\mathbf{v}]_B = \begin{bmatrix}-1\\1\end{bmatrix} \quad\Rightarrow\quad [R_\theta]_B = \begin{bmatrix}0&-1\\1&1\end{bmatrix}.$$
7. Since the lattice is invariant under $R_\theta$, $\mathbf{u}$ and $\mathbf{v}$ are mapped to lattice points under this rotation. But each lattice point is by definition $a\mathbf{u} + b\mathbf{v}$ where $a$ and $b$ are integers. Thus
$$[R_\theta(\mathbf{u})]_B = [a\mathbf{u} + c\mathbf{v}]_B = \begin{bmatrix}a\\c\end{bmatrix}, \qquad [R_\theta(\mathbf{v})]_B = [b\mathbf{u} + d\mathbf{v}]_B = \begin{bmatrix}b\\d\end{bmatrix} \quad\Rightarrow\quad [R_\theta]_B = \begin{bmatrix}a&b\\c&d\end{bmatrix},$$
and the matrix entries are integers.
8. Let E = {e1 , e2 } and B = {u, v}. Then by Theorem 6.29 in Section 6.6, [Rθ ]E = P −1 [Rθ ]B P . But this
means that [Rθ ]E ∼ [Rθ ]B (that is, the two matrices are similar). By Exercise 35 in Section 4.4, this
implies that
2 cos θ = tr[Rθ ]E = tr[Rθ ]B = a + d,
which is an integer.
9. If $2\cos\theta = m$ is an integer, then $m$ must be $0$, $\pm 1$, or $\pm 2$, so the angles are those whose cosines are $0$, $\pm\frac12$, or $\pm 1$; these angles are
$$\theta = 60^\circ, 90^\circ, 120^\circ, 180^\circ, 240^\circ, 270^\circ, 300^\circ, 360^\circ,$$
corresponding to $n = 1, 2, 3, 4,$ or $6$.
These are the only possible n-fold rotational symmetries.
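The crystallographic restriction just derived (that $2\cos\frac{360^\circ}{n}$ must be an integer) is easy to scan by machine; the sketch below (names are ours) tests each candidate $n$ and recovers exactly the allowed values:

```python
import math

# For n-fold symmetry, 2*cos(360/n degrees) must be an integer (Exercises 8-9).
allowed = []
for n in range(1, 13):
    t = 2 * math.cos(2 * math.pi / n)
    if abs(t - round(t)) < 1e-9:
        allowed.append(n)
print(allowed)    # [1, 2, 3, 4, 6]
```

In particular n = 8 fails, since $2\cos 45^\circ = \sqrt 2$ is irrational.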
10. Left to the reader.
6.7
Applications
1. Here a = −3, so by Theorem 6.32, {e3t } is a basis for solutions, which therefore have the form
y(t) = ce3t . Since y(1) = ce3 = 2, we have c = 2e−3 , so that y(t) = 2e−3 e3t = 2e3t−3 .
2. Here a = 1, so by Theorem 6.32, {e−t } is a basis for solutions, which therefore have the form x(t) = ce−t .
Since x(1) = ce−1 = 1, we have c = e, so that x(t) = ee−t = e1−t .
3. $y'' - 7y' + 12y = 0$ corresponds to the characteristic equation $\lambda^2 - 7\lambda + 12 = (\lambda-3)(\lambda-4) = 0$. Then $\lambda_1 = 3$ and $\lambda_2 = 4$, so Theorem 6.33 tells us that the general solution has the form $y(t) = c_1e^{3t} + c_2e^{4t}$. Then
$$\begin{aligned} y(0) = 1 &\;\Rightarrow\; c_1 + c_2 = 1 \\ y(1) = 1 &\;\Rightarrow\; e^3c_1 + e^4c_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = \frac{1-e^4}{e^3-e^4}, \quad c_2 = \frac{e^3-1}{e^3-e^4},$$
so that
$$y(t) = \frac{1}{e^3-e^4}\left[\left(1-e^4\right)e^{3t} + \left(e^3-1\right)e^{4t}\right].$$
4. $x'' + x' - 12x = 0$ corresponds to the characteristic equation $\lambda^2 + \lambda - 12 = (\lambda-3)(\lambda+4) = 0$. Then $\lambda_1 = 3$ and $\lambda_2 = -4$, so Theorem 6.33 tells us that the general solution has the form $x(t) = c_1e^{3t} + c_2e^{-4t}$. Then
$$\begin{aligned} x(0) = 0 &\;\Rightarrow\; c_1 + c_2 = 0 \\ x'(0) = 1 &\;\Rightarrow\; 3c_1 - 4c_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = \frac17, \quad c_2 = -\frac17,$$
so that
$$x(t) = \frac17\left(e^{3t} - e^{-4t}\right).$$
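A closed-form ODE solution like this one can be sanity-checked by plugging it back into the equation; the sketch below (names are ours) verifies the initial conditions and the residual $x'' + x' - 12x$ at several points:

```python
import math

# x(t) = (e^{3t} - e^{-4t}) / 7 from Exercise 4, with its exact derivatives.
x = lambda t: (math.exp(3 * t) - math.exp(-4 * t)) / 7
xp = lambda t: (3 * math.exp(3 * t) + 4 * math.exp(-4 * t)) / 7
xpp = lambda t: (9 * math.exp(3 * t) - 16 * math.exp(-4 * t)) / 7

print(abs(x(0)) < 1e-12, abs(xp(0) - 1) < 1e-12)                              # True True
print(all(abs(xpp(t) + xp(t) - 12 * x(t)) < 1e-9 for t in [0.0, 0.5, 1.0]))   # True
```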
5. $f'' - f' - f = 0$ corresponds to the characteristic equation $\lambda^2 - \lambda - 1 = 0$, so that $\lambda_1 = \frac{1+\sqrt5}{2}$ and $\lambda_2 = \frac{1-\sqrt5}{2}$. Therefore by Theorem 6.33, the general solution is $f(t) = c_1e^{(1+\sqrt5)t/2} + c_2e^{(1-\sqrt5)t/2}$. So
$$\begin{aligned} f(0) = 0 &\;\Rightarrow\; c_1 + c_2 = 0 \\ f(1) = 1 &\;\Rightarrow\; e^{(1+\sqrt5)/2}c_1 + e^{(1-\sqrt5)/2}c_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = \frac{e^{(\sqrt5-1)/2}}{e^{\sqrt5}-1}, \quad c_2 = -\frac{e^{(\sqrt5-1)/2}}{e^{\sqrt5}-1},$$
so that
$$f(t) = \frac{e^{(\sqrt5-1)/2}}{e^{\sqrt5}-1}\left(e^{(1+\sqrt5)t/2} - e^{(1-\sqrt5)t/2}\right).$$
6. $g'' - 2g = 0$ corresponds to the characteristic equation $\lambda^2 - 2 = 0$, so that $\lambda_1 = \sqrt2$ and $\lambda_2 = -\sqrt2$. Therefore by Theorem 6.33, the general solution is $g(t) = c_1e^{\sqrt2\,t} + c_2e^{-\sqrt2\,t}$. So
$$\begin{aligned} g(0) = 1 &\;\Rightarrow\; c_1 + c_2 = 1 \\ g(1) = 0 &\;\Rightarrow\; e^{\sqrt2}c_1 + e^{-\sqrt2}c_2 = 0 \end{aligned} \qquad\Rightarrow\qquad c_1 = -\frac{1}{e^{2\sqrt2}-1}, \quad c_2 = \frac{e^{2\sqrt2}}{e^{2\sqrt2}-1},$$
so that
$$g(t) = \frac{1}{e^{2\sqrt2}-1}\left(-e^{\sqrt2\,t} + e^{2\sqrt2 - \sqrt2\,t}\right).$$
7. $y'' - 2y' + y = 0$ corresponds to the characteristic equation $\lambda^2 - 2\lambda + 1 = (\lambda-1)^2 = 0$, so that $\lambda_1 = \lambda_2 = 1$. Therefore by Theorem 6.33, the general solution is $y(t) = c_1e^t + c_2te^t$. So
$$\begin{aligned} y(0) = 1 &\;\Rightarrow\; c_1 = 1 \\ y(1) = 1 &\;\Rightarrow\; ec_1 + ec_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = 1, \quad c_2 = \frac{1-e}{e},$$
so that
$$y(t) = e^t + \frac{1-e}{e}te^t = e^t + (1-e)te^{t-1}.$$
8. $x'' + 4x' + 4x = 0$ corresponds to the characteristic equation $\lambda^2 + 4\lambda + 4 = (\lambda+2)^2 = 0$, so that $\lambda_1 = \lambda_2 = -2$. Therefore by Theorem 6.33, the general solution is $x(t) = c_1e^{-2t} + c_2te^{-2t}$. So
$$\begin{aligned} x(0) = 1 &\;\Rightarrow\; c_1 = 1 \\ x'(0) = 1 &\;\Rightarrow\; -2c_1 + c_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = 1, \quad c_2 = 3,$$
so that
$$x(t) = e^{-2t} + 3te^{-2t}.$$
9. $y'' - k^2y = 0$ corresponds to the characteristic equation $\lambda^2 - k^2 = 0$, so that $\lambda_1 = k$ and $\lambda_2 = -k$. Therefore by Theorem 6.33, the general solution is $y(t) = c_1e^{kt} + c_2e^{-kt}$. So
$$\begin{aligned} y(0) = 1 &\;\Rightarrow\; c_1 + c_2 = 1 \\ y'(0) = 1 &\;\Rightarrow\; kc_1 - kc_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = \frac{k+1}{2k}, \quad c_2 = \frac{k-1}{2k},$$
so that
$$y(t) = \frac{1}{2k}\left[(k+1)e^{kt} + (k-1)e^{-kt}\right].$$
10. $y'' - 2ky' + k^2y = 0$ corresponds to the characteristic equation $\lambda^2 - 2k\lambda + k^2 = (\lambda-k)^2 = 0$, so that $\lambda_1 = \lambda_2 = k$. Therefore by Theorem 6.33, the general solution is $y(t) = c_1e^{kt} + c_2te^{kt}$. So
$$\begin{aligned} y(0) = 1 &\;\Rightarrow\; c_1 = 1 \\ y(1) = 0 &\;\Rightarrow\; c_1 + c_2 = 0 \end{aligned} \qquad\Rightarrow\qquad c_1 = 1, \quad c_2 = -1,$$
so that
$$y(t) = e^{kt} - te^{kt}.$$
11. $f'' - 2f' + 5f = 0$ corresponds to the characteristic equation $\lambda^2 - 2\lambda + 5 = 0$, so that $\lambda_1 = 1 + 2i$ and $\lambda_2 = 1 - 2i$. Therefore by Theorem 6.33, the general solution is $f(t) = c_1e^t\cos 2t + c_2e^t\sin 2t$. Then
$$\begin{aligned} f(0) = 1 &\;\Rightarrow\; c_1 = 1 \\ f\!\left(\tfrac{\pi}{4}\right) = 0 &\;\Rightarrow\; e^{\pi/4}c_2 = 0 \end{aligned} \qquad\Rightarrow\qquad c_1 = 1, \quad c_2 = 0,$$
so that
$$f(t) = e^t\cos 2t.$$
12. $h'' - 4h' + 5h = 0$ corresponds to the characteristic equation $\lambda^2 - 4\lambda + 5 = 0$, so that $\lambda_1 = 2 + i$ and $\lambda_2 = 2 - i$. Therefore by Theorem 6.33, the general solution is $h(t) = c_1e^{2t}\cos t + c_2e^{2t}\sin t$. Then $h'(t) = c_1\left(2e^{2t}\cos t - e^{2t}\sin t\right) + c_2\left(2e^{2t}\sin t + e^{2t}\cos t\right)$, and
$$\begin{aligned} h(0) = 0 &\;\Rightarrow\; c_1 = 0 \\ h'(0) = -1 &\;\Rightarrow\; 2c_1 + c_2 = -1 \end{aligned} \qquad\Rightarrow\qquad c_1 = 0, \quad c_2 = -1,$$
so that
$$h(t) = -e^{2t}\sin t.$$
13. (a) We start with $p(t) = ce^{kt}$; then $p(0) = ce^{0} = c = 100$. Next, $p(3) = 100e^{3k} = 1600$, so that $e^{3k} = 16$ and $k = \frac{\ln 16}{3}$. Hence
$$p(t) = 100e^{(\ln 16/3)t} = 100\left(e^{\ln 16}\right)^{t/3} = 100 \cdot 16^{t/3}.$$
(b) The population doubles when $p(t) = 100e^{(\ln 16/3)t} = 200$. Simplifying gives
$$\frac{\ln 16}{3}t = \ln 2, \quad\text{so that}\quad t = \frac{3\ln 2}{\ln 16} = \frac{3\ln 2}{4\ln 2} = \frac34.$$
The population doubles in $\frac34$ of an hour, or 45 minutes.
(c) We want to find $t$ such that $p(t) = 100e^{(\ln 16/3)t} = 1000000$. Simplifying gives
$$\frac{\ln 16}{3}t = \ln 10000, \quad\text{so that}\quad t = \frac{3\ln 10000}{\ln 16} = \frac{12\ln 10}{4\ln 2} = \frac{3\ln 10}{\ln 2} \approx 9.966 \text{ hours.}$$
14. (a) Start with $p(t) = ce^{kt}$; then $p(0) = ce^{0} = c = 76$. Next, $p(1) = 76e^{k} = 92$, so that $k = \ln\frac{92}{76}$, and therefore $p(t) = 76e^{t\ln(92/76)}$. For the year 2000, we get
$$p(10) = 76e^{10\ln(92/76)} \approx 513.5.$$
This is almost twice the actual population in 2000.
(b) Start with $p(t) = ce^{kt}$; then $p(0) = ce^{0} = c = 203$. Next, $p(1) = 203e^{k} = 227$, so that $k = \ln\frac{227}{203}$, and therefore $p(t) = 203e^{t\ln(227/203)}$. For the year 2000, we get
$$p(3) = 203e^{3\ln(227/203)} \approx 283.8.$$
This is a pretty good approximation.
(c) Since the first estimate is very high but the second estimate is close, we can conclude that population growth was lower than exponential in the early 20th century, but has been close to exponential
since 1970. This could be attributed for example to three major wars during the period from 1900
to 1970.
15. (a) Let $m(t) = ae^{-ct}$. Then $m(0) = ae^{-c\cdot 0} = a = 50$, so that $m(t) = 50e^{-ct}$. Since the half-life is 1590 years, we know that 25 mg will remain after that time, so that
$$m(1590) = 50e^{-1590c} = 25.$$
Simplifying gives $-1590c = \ln\frac12 = -\ln 2$, so that $c = \frac{\ln 2}{1590}$. So finally, $m(t) = 50e^{-t\ln 2/1590}$. After 1000 years, the amount remaining is
$$m(1000) = 50e^{-1000\ln 2/1590} \approx 32.33 \text{ mg.}$$
(b) Only 10 mg will remain when $m(t) = 10$. Solving $50e^{-t\ln 2/1590} = 10$ gives
$$-t\frac{\ln 2}{1590} = \ln\frac15 = -\ln 5, \quad\text{so that}\quad t = \frac{1590\ln 5}{\ln 2} \approx 3692 \text{ years.}$$
16. With $m(t) = ae^{-ct}$, a half-life of 5730 years means that $m(5730) = ae^{-5730c} = \frac{a}{2}$, so that $e^{-5730c} = \frac12$ and thus $c = -\frac{\ln(1/2)}{5730} = \frac{\ln 2}{5730}$. To find out how old the posts are, we want to find $t$ such that
$$m(t) = ae^{-t\ln 2/5730} = 0.45a, \quad\text{which gives}\quad t = -\frac{5730\ln 0.45}{\ln 2} \approx 6600 \text{ years.}$$
17. Following Example 6.92, the general equation of the spring motion is $x(t) = c_1\cos\sqrt{K}\,t + c_2\sin\sqrt{K}\,t$. Then $x(0) = 10 = c_1\cos 0 + c_2\sin 0 = c_1$, so that $c_1 = 10$. Then
$$x(10) = 5 = 10\cos\left(10\sqrt{K}\right) + c_2\sin\left(10\sqrt{K}\right), \quad\text{so that}\quad c_2 = \frac{5 - 10\cos\left(10\sqrt{K}\right)}{\sin\left(10\sqrt{K}\right)}.$$
Therefore the equation of the spring, giving its length at time $t$, is
$$x(t) = 10\cos\sqrt{K}\,t + \frac{5 - 10\cos\left(10\sqrt{K}\right)}{\sin\left(10\sqrt{K}\right)}\sin\sqrt{K}\,t.$$
18. From the analysis in Example 6.92, since $K = \frac{k}{m}$ and $m = 50$, we have $K = \frac{k}{50}$, so that
$$x(t) = c_1\cos\sqrt{K}\,t + c_2\sin\sqrt{K}\,t = c_1\cos\sqrt{\tfrac{k}{50}}\,t + c_2\sin\sqrt{\tfrac{k}{50}}\,t.$$
Since the periods of $\sin bt$ and $\cos bt$ are each $\frac{2\pi}{b}$, the fact that the period is 10 means that $\frac{2\pi}{\sqrt{k/50}} = 10$, so that $\sqrt{\frac{k}{50}} = \frac{\pi}{5}$ and thus $k = 2\pi^2$.
19. The differential equation for the pendulum motion is the same as the equation of spring motion, so we apply the analysis from Example 6.92. Recall that the periods of $\sin bt$ and $\cos bt$ are each $\frac{2\pi}{b}$.
(a) Since $\theta'' + \frac{g}{L}\theta = 0$, we have $K = \frac{g}{L}$ and
$$\theta(t) = c_1\cos\sqrt{K}\,t + c_2\sin\sqrt{K}\,t = c_1\cos\sqrt{\tfrac{g}{L}}\,t + c_2\sin\sqrt{\tfrac{g}{L}}\,t.$$
Then the period is $P = \frac{2\pi}{\sqrt{g/L}} = 2\pi\sqrt{\frac{L}{g}}$. In this case, since $L = 1$, we get $P = 2\pi\sqrt{\frac1g} = \frac{2\pi}{\sqrt g}$.
(b) Since $\theta_1$ does not appear in the formula, the period does not depend on $\theta_1$. It depends only on the gravitational constant $g$ and the length of the pendulum $L$. (Note that it also does not depend on the weight of the pendulum.)
20. We want to show that if y1 and y2 are two solutions, and c is a constant, then y1 + y2 and cy1 are also
solutions. But this follows directly from properties of the derivative:
(y1 + y2 )00 + a(y1 + y2 )0 + b(y1 + y2 ) = (y100 + ay10 + by1 ) + (y200 + ay20 + by2 ) = 0 + 0 = 0
(cy1 )00 + a(cy1 )0 + b(cy1 ) = cy100 + acy10 + bcy1 = c(y100 + ay10 + by1 ) = c · 0 = 0.
Thus the solution set is a subspace of F .
21. Since we know that $\dim S = 2$, it suffices to show that $e^{\lambda t}$ and $te^{\lambda t}$ are solutions (are in $S$) and that they are linearly independent. Note that $\lambda$ is a solution of $\lambda^2 + a\lambda + b = 0$. Then
$$\left(e^{\lambda t}\right)'' + a\left(e^{\lambda t}\right)' + be^{\lambda t} = \lambda^2e^{\lambda t} + a\lambda e^{\lambda t} + be^{\lambda t} = e^{\lambda t}\left(\lambda^2 + a\lambda + b\right) = 0.$$
For $te^{\lambda t}$, note that
$$\left(te^{\lambda t}\right)' = e^{\lambda t} + \lambda te^{\lambda t}, \qquad \left(te^{\lambda t}\right)'' = \left(e^{\lambda t} + \lambda te^{\lambda t}\right)' = \lambda e^{\lambda t} + \lambda e^{\lambda t} + \lambda^2te^{\lambda t} = 2\lambda e^{\lambda t} + \lambda^2te^{\lambda t}.$$
Then
$$\left(te^{\lambda t}\right)'' + a\left(te^{\lambda t}\right)' + bte^{\lambda t} = 2\lambda e^{\lambda t} + \lambda^2te^{\lambda t} + a\left(e^{\lambda t} + \lambda te^{\lambda t}\right) + bte^{\lambda t} = (2\lambda + a)e^{\lambda t} + te^{\lambda t}\left(\lambda^2 + a\lambda + b\right) = (2\lambda + a)e^{\lambda t}.$$
But the solutions of $\lambda^2 + a\lambda + b = 0$ are $\lambda = \frac{-a \pm \sqrt{a^2 - 4b}}{2}$, so if the two roots are equal, we must have $a^2 - 4b = 0$, so that $\lambda = -\frac{a}{2}$, so that $2\lambda = -a$ and then $2\lambda + a = 0$. So the remaining term vanishes, and $te^{\lambda t}$ is a solution.
To see that the two are linearly independent, suppose $c_1e^{\lambda t} + c_2te^{\lambda t}$ is zero for all values of $t$. Setting $t = 0$ gives $c_1 = 0$, so that $c_2te^{\lambda t}$ is zero. Then set $t = 1$ to get $c_2 = 0$. Thus the two are linearly independent, so they form a basis for the solution set $S$.
22. Suppose that c1 ept cos qt + c2 ept sin qt is identically zero. Setting t = 0 gives c1 = 0, so that c2 ept sin qt
is identically zero, which clearly forces c2 = 0, proving linear independence.
Chapter Review
1. (a) False. All we can say is that dim V ≤ n, so that any spanning set for V contains at most n vectors.
For example, let V = R, v1 = e1 and v2 = 2e1 . Then V = span(v1 , v2 ), but V has spanning sets
consisting of only one vector.
(b) True. See Exercise 17(a) in Section 6.2.
(c) True. For example, take the invertible matrices
$$I = \begin{bmatrix}1&0\\0&1\end{bmatrix}, \qquad J = \begin{bmatrix}0&1\\1&0\end{bmatrix}, \qquad L = \begin{bmatrix}1&1\\0&1\end{bmatrix}, \qquad R = \begin{bmatrix}1&1\\1&0\end{bmatrix}.$$
Then $R - J = E_{11}$, $L - I = E_{12}$, $J + I - L = E_{21}$, and $J + I - R = E_{22}$. Therefore this set of four invertible matrices spans; since $\dim M_{22} = 4$, it is a basis.
(d) False. By Exercise 2 in Section 6.5, the kernel of the map tr : M22 → R has dimension 3, so that
matrices with trace 0 span a subspace of M22 of dimension 3.
(e) False. In general, $\|x + y\| \neq \|x\| + \|y\|$. For example, take $x = \mathbf{e}_1$ and $y = \mathbf{e}_2$ in $\mathbb{R}^2$.
(f ) True. If T is one-to-one and {ui } is a basis for V , then {T ui } is a linearly independent set by
Theorem 6.22. It spans range(T ), so it is a basis for range(T ). Thus if T is onto, range(T ) = W ,
so that {T ui } is a basis for W and therefore dim V = dim W .
(g) False. This is only true if T is onto. For example, let T : R → R be T (x) = 0. Then ker T = R
but the codomain is not {0} (although the range is).
(h) True. If nullity(T ) = 4, then rank T = dim M33 − nullity(T ) = 9 − 4 = 5. So range(T ) is a
five-dimensional subspace of P4 , so it is the entire space.
(i) True. Since dim P3 = 4, it suffices to show that dim V = 4. But polynomials in V are of the
form a + bx + cx² + dx³ + ex⁴ where a + b + c + d + e = 0, so they are of the form
(−b − c − d − e) + bx + cx² + dx³ + ex⁴ = b(x − 1) + c(x² − 1) + d(x³ − 1) + e(x⁴ − 1),
where b, c, d, and e are arbitrary. Thus a basis for V is {x − 1, x² − 1, x³ − 1, x⁴ − 1},
so dim V = 4.
(j) False. See Example 6.80 in Section 6.6.
2. This is the subspace {0}, since the only vector [x; y] satisfying x² + 3y² = 0 is [0; 0].
3. The conditions a + b = a + c and a + c = c + d imply that b = c and a = d, so that
W = {[a b; b a]}.
CHAPTER 6. VECTOR SPACES
(Note that matrices of this form satisfy all of the original equalities.) To show that this is a subspace,
we must show that it is closed under sums and scalar multiplication. Thus suppose
[a b; b a], [a′ b′; b′ a′] ∈ W,   and let c be a scalar.
Then
[a b; b a] + [a′ b′; b′ a′] = [a + a′, b + b′; b + b′, a + a′] ∈ W
and
c[a b; b a] = [ca cb; cb ca] ∈ W.
4. To show that this is a subspace, we must show that it is closed under sums and scalar multiplication.
Suppose p(x), q(x) ∈ W. Then
x³(p + q)(1/x) = x³(p(1/x) + q(1/x)) = x³p(1/x) + x³q(1/x) = p(x) + q(x) = (p + q)(x),
so that p + q ∈ W. Next, if c is a scalar, then
x³(cp)(1/x) = cx³p(1/x) = cp(x) = (cp)(x),
so that cp ∈ W. Thus W is a subspace.
5. To show that this is a subspace, we must show that it is closed under sums and scalar multiplication.
Suppose f (x), g(x) ∈ W . Then
(f + g)(x + π) = f (x + π) + g(x + π) = f (x) + g(x) = (f + g)(x),
so that f + g ∈ W . Next, if c is a scalar, then
(cf )(x + π) = cf (x + π) = cf (x) = (cf )(x),
so that cf ∈ W . Thus W is a subspace.
6. They are linearly dependent. By the double angle formula, cos 2x = 1 − 2 sin² x, so that
1 − cos 2x − (2/3)(3 sin² x) = 0.
7. Suppose that cA + dB = O. Taking the transpose of both sides gives cAT + dB T = O; since A is
symmetric and B is skew-symmetric, this is just cA − dB = O. Adding these two equations gives
2cA = O. Since A is nonzero, we must have c = 0, so that dB = O. Then since B is nonzero, we must
have d = 0. Thus A and B are linearly independent.
8. The condition a + d = b + c means that a = b + c − d, so that
W = {[a b; c d] : a + d = b + c} = {[b + c − d, b; c, d]}.
Then define a basis by finding three matrices in W in each of which exactly one of b, c, and d is 1 while the
others are zero:
B = [1 1; 0 0],   C = [1 0; 1 0],   D = [−1 0; 0 1].
Consider the expression
bB + cC + dD = [b + c − d, b; c, d].
If this is O, then clearly b = c = d = 0, so that B, C, and D are linearly independent. But also,
bB + cC + dD is a general matrix in W, so that B, C, and D span W. Thus {B, C, D} is a basis for
W.
9. Suppose p(x) = a + bx + cx2 + dx3 + ex4 + f x5 ∈ P5 with p(−x) = p(x). Then
p(1) = p(−1) ⇒ a + b + c + d + e + f = a − b + c − d + e − f ⇒ b + d + f = 0
p(2) = p(−2) ⇒ a + 2b + 4c + 8d + 16e + 32f = a − 2b + 4c − 8d + 16e − 32f ⇒ b + 4d + 16f = 0
p(3) = p(−3) ⇒ a + 3b + 9c + 27d + 81e + 243f = a − 3b + 9c − 27d + 81e − 243f
⇒ b + 9d + 81f = 0.
These three equations in the unknowns b, d, f have only the trivial solution, so that b = d = f = 0,
and p(x) = a + cx² + ex⁴. Any polynomial of this form clearly has p(x) = p(−x), since (−x)² = x²
and (−x)⁴ = x⁴. So {1, x², x⁴} is a basis for W.
10. Let B = {1, 1 + x, 1 + x + x²} = {b1, b2, b3} and C = {1 + x, x + x², 1 + x²} = {c1, c2, c3}. Then
[c1]B = [0; 1; 0],   [c2]B = [−1; 0; 1],   [c3]B = [1; −1; 1],
so that by Theorem 6.13
PB←C = [0 −1 1; 1 0 −1; 0 1 1].
Then
PC←B = (PB←C)⁻¹ = [1/2, 1, 1/2; −1/2, 0, 1/2; 1/2, 0, 1/2].
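The inverse can be checked numerically, for example with numpy:

```python
import numpy as np

P_BC = np.array([[0, -1,  1],
                 [1,  0, -1],
                 [0,  1,  1]], dtype=float)

P_CB = np.linalg.inv(P_BC)    # the change-of-basis matrix in the other direction
print(P_CB)
```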
11. To show that T is linear, we must show that T (x + z) = T x + T z and that T (cx) = cT x, where c is a
scalar.
T(x + z) = y(x + z)ᵀy = y(xᵀ + zᵀ)y = yxᵀy + yzᵀy = Tx + Tz
T(cx) = y(cx)ᵀy = c·yxᵀy = cTx.
Thus T is linear.
12. T is not a linear transformation. Let A ∈ Mnn and c be a scalar other than 0 or 1. Then
T (cA) = (cA)T (cA) = c2 AT A = c2 T (A).
Since c ≠ 0, 1, it follows that c²T(A) ≠ cT(A), so that T is not linear.
13. To show that T is linear, we must show that (T (p + q))(x) = T p(x) + T q(x) and that (T (cp))(x) =
cT p(x), where c is a scalar.
(T (p + q))(x) = (p + q)(2x − 1) = p(2x − 1) + q(2x − 1) = T p(x) + T q(x)
(T(cp))(x) = (cp)(2x − 1) = cp(2x − 1) = cTp(x).
Thus T is linear.
14. We first write 5 − 3x + 2x2 in terms of 1, 1 + x, and 1 + x + x2 :
5 − 3x + 2x2 = a(1) + b(1 + x) + c(1 + x + x2 ) = (a + b + c) + (b + c)x + cx2 .
Then c = 2, so that (from the x term) b = −5, and then a = 8. Thus
T(5 − 3x + 2x²) = 8T(1) − 5T(1 + x) + 2T(1 + x + x²)
= 8[1 0; 0 1] − 5[1 1; 0 1] + 2[0 −1; 1 0] = [3 −7; 2 3].
15. Since T : Mnn → R is onto, we know from the rank theorem that
n2 = dim Mnn = rank(T ) + nullity(T ) = 1 + nullity(T ),
so that nullity(T ) = n2 − 1.
16. (a) We want a map T : M22 → M22 such that T(A) = O if and only if A is upper triangular. If
A = [a b; c d],
an obvious map to try is
T(A) = [0 0; c 0].
Clearly the matrices sent to O by T are exactly the upper triangular matrices (together with O),
which is the subspace W . To see that T is linear, note that we can rewrite T as T (A) = a21 E21 .
Then
T (A + B) = (A + B)21 E21 = (a21 + b21 )E21 = a21 E21 + b21 E21 = T (A) + T (B)
T (cA) = (cA)21 E21 = ca21 E21 = cT (A).
(b) For this map, we want the image of any matrix A to be upper triangular. With A as above, an
obvious map to try is the one that sets the entry below the main diagonal to zero:
T(A) = [a b; 0 d].
It is obvious that range T = W , since any upper triangular matrix is the image of itself under T ,
and every matrix of the form T (A) is upper triangular by construction. To see that T is linear,
write T (A) = A − a21 E21 ; then
T (A + B) = (A + B) − (A + B)21 E21 = (A + B) − (a21 + b21 )E21
= (A − a21 E21 ) + (B − b21 E21 ) = T (A) + T (B)
T (cA) = (cA) − (cA)21 E21 = cA − ca21 E21 = c(A − a21 E21 ) = cT (A).
17. We have
T(1) = [1 0; 0 1] = 1·E11 + 0·E12 + 0·E21 + 1·E22 ⇒ [T(1)]C = [1; 0; 0; 1]
T(x) = T((x + 1) − 1) = T(x + 1) − T(1) = [1 1; 0 1] − [1 0; 0 1] = [0 1; 0 0]
= 0·E11 + 1·E12 + 0·E21 + 0·E22 ⇒ [T(x)]C = [0; 1; 0; 0]
T(x²) = T((1 + x + x²) − (1 + x)) = T(1 + x + x²) − T(1 + x) = [0 −1; 1 0] − [1 1; 0 1] = [−1 −2; 1 −1]
= −1·E11 − 2·E12 + 1·E21 − 1·E22 ⇒ [T(x²)]C = [−1; −2; 1; −1].
Since these vectors are the columns of [T]C←B, we have
[T]C←B = [1 0 −1; 0 1 −2; 0 0 1; 1 0 −1].
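As a cross-check against Exercise 14: applying [T]C←B to the B-coordinates of 5 − 3x + 2x² (here taking B = {1, x, x²}) should give the C-coordinates of [3 −7; 2 3]. A quick numpy sketch:

```python
import numpy as np

T = np.array([[1, 0, -1],
              [0, 1, -2],
              [0, 0,  1],
              [1, 0, -1]])

p = np.array([5, -3, 2])      # coordinates of 5 - 3x + 2x^2 in the basis {1, x, x^2}
print(T @ p)                  # [ 3 -7  2  3], the coordinates of [3 -7; 2 3]
```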
18. We must show that S spans V and that S is linearly independent. First, since every vector can be
written as a linear combination of elements of S in one way, S spans V . Next, suppose
c1 v1 + c2 v2 + · · · + cn vn = 0.
Since we know that we can write 0 as the linear combination of the elements of S with zero coefficients,
and since the representation of 0 is unique by hypothesis, it follows that ci = 0 for all i, so that S is
linearly independent. Thus S is a basis for V .
19. If u ∈ U , then T u ∈ ker(S), so that (S ◦ T )u = S(T u) = 0. Thus S ◦ T is the zero map.
20. Since {T (v1 ), T (v2 ), . . . , T (vn )} is a basis for V , it follows that range(T ) = V since any element of V
can be expressed as a linear combination of elements in range(T ). By Theorem 6.30 (p) and (t), this
means that T is invertible.
Chapter 7
Distance and Approximation
7.1
Inner Product Spaces
1. With ⟨u, v⟩ = 2u1v1 + 3u2v2, we get
(a) ⟨u, v⟩ = 2·1·4 + 3·(−2)·3 = −10.
(b) ‖u‖ = √⟨u, u⟩ = √(2·1·1 + 3·(−2)·(−2)) = √14.
(c) d(u, v) = ‖u − v‖ = ‖[−3; −5]‖ = √(2·(−3)·(−3) + 3·(−5)·(−5)) = √93.
2. With A = [6 2; 2 3], we get
(a) ⟨u, v⟩ = uᵀAv = [1 −2][6 2; 2 3][4; 3] = −4.
(b) ‖u‖ = √⟨u, u⟩ = √([1 −2][6 2; 2 3][1; −2]) = √10.
(c) d(u, v) = ‖u − v‖ = ‖[−3; −5]‖ = √([−3 −5][6 2; 2 3][−3; −5]) = √189.
3. We want a vector w = [a; b] such that
⟨u, w⟩ = 2·1·a + 3·(−2)·b = 2a − 6b = 0.
Any vector of the form [3t; t] is such a vector; for example, [3; 1].
4. We want a vector w = [a; b] such that
⟨u, w⟩ = uᵀAw = [1 −2][6 2; 2 3][a; b] = 2a − 4b = 0.
There are many such vectors, for example w = [2; 1].
5. (a) ⟨p(x), q(x)⟩ = ⟨3 − 2x, 1 + x + x²⟩ = 3·1 + (−2)·1 + 0·1 = 1.
(b) ‖p(x)‖ = √⟨p(x), p(x)⟩ = √(3·3 + (−2)·(−2) + 0·0) = √13.
(c) d(p(x), q(x)) = ‖p(x) − q(x)‖ = ‖2 − 3x − x²‖ = √(2² + (−3)² + (−1)²) = √14.
CHAPTER 7. DISTANCE AND APPROXIMATION
6. (a) ⟨p(x), q(x)⟩ = ∫₀¹ (3 − 2x)(1 + x + x²) dx = ∫₀¹ (−2x³ + x² + x + 3) dx
= [−(1/2)x⁴ + (1/3)x³ + (1/2)x² + 3x]₀¹ = 10/3.
(b) ‖p(x)‖ = √⟨p(x), p(x)⟩ = √(∫₀¹ (3 − 2x)² dx) = √([9x − 6x² + (4/3)x³]₀¹) = √(13/3).
(c) d(p(x), q(x)) = ‖p(x) − q(x)‖ = ‖2 − 3x − x²‖ = √(∫₀¹ (2 − 3x − x²)² dx)
= √([(1/5)x⁵ + (3/2)x⁴ + (5/3)x³ − 6x² + 4x]₀¹) = √(41/30).
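These three integrals can be verified with sympy:

```python
import sympy as sp

x = sp.symbols('x')
p = 3 - 2*x
q = 1 + x + x**2

inner_pq = sp.integrate(p*q, (x, 0, 1))        # <p, q>
norm_p_sq = sp.integrate(p**2, (x, 0, 1))      # ||p||^2
dist_sq = sp.integrate((p - q)**2, (x, 0, 1))  # d(p, q)^2
print(inner_pq, norm_p_sq, dist_sq)            # 10/3 13/3 41/30
```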
7. We want a polynomial r(x) = a + bx + cx² such that
⟨p(x), r(x)⟩ = 3·a − 2·b + 0·c = 3a − 2b = 0.
There are many such polynomials; two examples are 2 + 3x and 4 + 6x + 7x².
8. We want a polynomial r(x) = a + bx + cx² such that
⟨p(x), r(x)⟩ = ∫₀¹ (3 − 2x)(a + bx + cx²) dx
= ∫₀¹ (3a + (3b − 2a)x + (3c − 2b)x² − 2cx³) dx
= [3ax + ((3b − 2a)/2)x² + ((3c − 2b)/3)x³ − (c/2)x⁴]₀¹
= 2a + (5/6)b + (1/2)c = 0.
There are many such polynomials. Three examples are −1 + 4x², 5 − 12x, and 12x − 20x².
9. Recall the double-angle formulas sin² x = 1/2 − (1/2)cos 2x and cos² x = 1/2 + (1/2)cos 2x. Then
(a) ⟨f, g⟩ = ∫₀^{2π} f(x)g(x) dx = ∫₀^{2π} (sin² x + sin x cos x) dx
= ∫₀^{2π} (1/2 − (1/2)cos 2x + sin x cos x) dx = [(1/2)x − (1/4)sin 2x + (1/2)sin² x]₀^{2π} = π.
(b) ‖f‖ = √(∫₀^{2π} f(x)² dx) = √(∫₀^{2π} sin² x dx) = √(∫₀^{2π} (1/2 − (1/2)cos 2x) dx)
= √([(1/2)x − (1/4)sin 2x]₀^{2π}) = √π.
(c) d(f, g) = ‖f − g‖ = √(∫₀^{2π} (−cos x)² dx) = √(∫₀^{2π} (1/2 + (1/2)cos 2x) dx)
= √([(1/2)x + (1/4)sin 2x]₀^{2π}) = √π.
10. We want to find a function g such that ∫₀^{2π} sin x · g(x) dx = 0. Note from the previous exercise that
∫₀^{2π} sin x cos x dx = [(1/2)sin² x]₀^{2π} = 0, so choose g(x) = cos x.
11. We must show it satisfies all four properties of an inner product. First,
hp(x), q(x)i = p(a)q(a) + p(b)q(b) + p(c)q(c) = q(a)p(a) + q(b)p(b) + q(c)p(c) = hq(x), p(x)i .
Next,
hp(x), q(x) + r(x)i = p(a)(q(a) + r(a)) + p(b)(q(b) + r(b)) + p(c)(q(c) + r(c))
= p(a)q(a) + p(a)r(a) + p(b)q(b) + p(b)r(b) + p(c)q(c) + p(c)r(c)
= p(a)q(a) + p(b)q(b) + p(c)q(c) + p(a)r(a) + p(b)r(b) + p(c)r(c)
= hp(x), q(x)i + hp(x), r(x)i .
Third,
hdp(x), q(x)i = dp(a)q(a) + dp(b)q(b) + dp(c)q(c) = d(p(a)q(a) + p(b)q(b) + p(c)q(c)) = d hp(x), q(x)i .
Finally,
⟨p(x), p(x)⟩ = p(a)p(a) + p(b)p(b) + p(c)p(c) = p(a)² + p(b)² + p(c)² ≥ 0,
and if equality holds, then p(a) = p(b) = p(c) = 0 since squares are always nonnegative. But p ∈ P2 ,
so if it is nonzero, it has at most two roots. Since a, b, and c are distinct, it follows that p(x) = 0.
12. We must show it satisfies all four properties of an inner product. First,
hp(x), q(x)i = p(0)q(0) + p(1)q(1) + p(2)q(2) = q(0)p(0) + q(1)p(1) + q(2)p(2) = hq(x), p(x)i .
Next,
hp(x), q(x) + r(x)i = p(0)(q(0) + r(0)) + p(1)(q(1) + r(1)) + p(2)(q(2) + r(2))
= p(0)q(0) + p(0)r(0) + p(1)q(1) + p(1)r(1) + p(2)q(2) + p(2)r(2)
= p(0)q(0) + p(1)q(1) + p(2)q(2) + p(0)r(0) + p(1)r(1) + p(2)r(2)
= hp(x), q(x)i + hp(x), r(x)i .
Third,
hcp(x), q(x)i = cp(0)q(0) + cp(1)q(1) + cp(2)q(2) = c(p(0)q(0) + p(1)q(1) + p(2)q(2)) = c hp(x), q(x)i .
Finally,
⟨p(x), p(x)⟩ = p(0)p(0) + p(1)p(1) + p(2)p(2) = p(0)² + p(1)² + p(2)² ≥ 0,
and if equality holds, then 0, 1, and 2 are all roots of p since squares are always nonnegative. But
p ∈ P2 , so if it is nonzero, it has at most two roots. It follows that p(x) = 0.
13. The fourth axiom does not hold. For example, let u = [0; 1]. Then ⟨u, u⟩ = 0 but u ≠ 0.
14. The fourth axiom does not hold. For example, let u = [1; 1]. Then ⟨u, u⟩ = 1·1 − 1·1 = 0, but u ≠ 0.
Alternatively, if u = [1; 2], then ⟨u, u⟩ = −3 < 0, so the fourth axiom again fails.
15. The fourth axiom does not hold. For example, let u = [1; 0]. Then ⟨u, u⟩ = 0·1 + 1·0 = 0, but u ≠ 0.
Alternatively, if u = [1; −2], then ⟨u, u⟩ = −4 < 0, so the fourth axiom again fails.
16. The fourth axiom does not hold. For example, if p is any nonzero polynomial with zero constant term,
then hp(x), p(x)i = 0 but p(x) is not the zero polynomial.
17. The fourth axiom does not hold. For example, let p(x) = 1 − x; then p(1) = 0 so that hp(x), p(x)i = 0,
but p(x) is not the zero polynomial.
18. The fourth axiom does not hold. For example, E12² = O, so that ⟨E12, E12⟩ = det O = 0, but
E12 ≠ O. Alternatively, let A = [1 1; 1 1]; then A ≠ O but ⟨A, A⟩ = 0.
19. Examining Example 7.3, the matrix should be [4 1; 1 4]; and indeed
[u1 u2][4 1; 1 4][v1; v2] = [4u1 + u2, u1 + 4u2][v1; v2] = 4u1v1 + u2v1 + u1v2 + 4u2v2,
as desired.
20. Examining Example 7.3, the matrix should be [1 2; 2 5]; and indeed
[u1 u2][1 2; 2 5][v1; v2] = [u1 + 2u2, 2u1 + 5u2][v1; v2] = u1v1 + 2u2v1 + 2u1v2 + 5u2v2,
as desired.
as desired.
21. The unit circle is
{u = [u1; u2] : u1² + (1/4)u2² = 1},
an ellipse meeting the u1-axis at ±1 and the u2-axis at ±2. [Plot omitted.]
22. The unit circle is
{u = [u1; u2] : 4u1² + 2u1u2 + 4u2² = 1},
a tilted ellipse contained in the square |u1| ≤ 0.6, |u2| ≤ 0.6. [Plot omitted.]
23. We have, using properties of the inner product,
hu, cvi = hcv, ui
(Property 1)
= c hv, ui
(Property 3)
= c hu, vi
(Property 1).
24. Let w be any vector. Then hu, 0i = hu, 0wi which, by Theorem 7.1(b), is equal to 0 hu, wi = 0.
Similarly, h0, vi = h0w, vi which, by property 3 of inner products, is equal to 0 hw, vi = 0.
25. Using properties of the inner product and Theorem 7.1 as needed,
⟨u + w, v − w⟩ = ⟨u, v − w⟩ + ⟨w, v − w⟩
= ⟨u, v⟩ + ⟨u, −w⟩ + ⟨w, v⟩ + ⟨w, −w⟩
= ⟨u, v⟩ − ⟨u, w⟩ + ⟨v, w⟩ − ‖w‖²
= 1 − 5 + 0 − 2² = −4 − 4 = −8.
26. Using properties of the inner product and Theorem 7.1 as needed,
⟨2v − w, 3u + 2w⟩ = ⟨2v, 3u + 2w⟩ + ⟨−w, 3u + 2w⟩
= 6⟨v, u⟩ + 4⟨v, w⟩ − 3⟨w, u⟩ − 2⟨w, w⟩
= 6⟨u, v⟩ + 4⟨v, w⟩ − 3⟨u, w⟩ − 2‖w‖²
= 6·1 + 4·0 − 3·5 − 2·2² = −17.
27. Using properties of the inner product and Theorem 7.1 as needed,
‖u + v‖ = √⟨u + v, u + v⟩ = √(⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩) = √(‖u‖² + 2⟨u, v⟩ + ‖v‖²)
= √(1 + 2·1 + 3) = √6.
28. Using properties of the inner product and Theorem 7.1 as needed,
‖2u − 3v + w‖ = √⟨2u − 3v + w, 2u − 3v + w⟩
= √(4⟨u, u⟩ − 12⟨u, v⟩ + 4⟨u, w⟩ − 6⟨v, w⟩ + 9⟨v, v⟩ + ⟨w, w⟩)
= √(4‖u‖² − 12⟨u, v⟩ + 4⟨u, w⟩ − 6⟨v, w⟩ + 9‖v‖² + ‖w‖²)
= √(4·1² − 12·1 + 4·5 − 6·0 + 9·(√3)² + 2²)
= √43.
29. u + v = w means that u + v − w = 0; this is true if and only if ⟨u + v − w, u + v − w⟩ = 0. Computing
this inner product gives
⟨u + v − w, u + v − w⟩ = ⟨u, u⟩ + 2⟨u, v⟩ − 2⟨u, w⟩ − 2⟨v, w⟩ + ⟨v, v⟩ + ⟨w, w⟩
= ‖u‖² + 2·1 − 2·5 − 2·0 + ‖v‖² + ‖w‖²
= 1 + 2 − 10 + 3 + 4 = 0.
30. Since ‖u‖ = 1, we also have 1 = ‖u‖² = ⟨u, u⟩. Similarly, 1 = ⟨v, v⟩. But then
0 ≤ ⟨u + v, u + v⟩ = ⟨u, u⟩ + 2⟨u, v⟩ + ⟨v, v⟩ = 2 + 2⟨u, v⟩.
Solving this inequality gives 2⟨u, v⟩ ≥ −2, so that ⟨u, v⟩ ≥ −1.
31. We have
⟨u + v, u − v⟩ = ⟨u, u⟩ − ⟨u, v⟩ + ⟨v, u⟩ − ⟨v, v⟩.
Since inner products are commutative, the middle two terms cancel, leaving us with
⟨u, u⟩ − ⟨v, v⟩ = ‖u‖² − ‖v‖².
32. We have
‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩.
Since inner products are commutative, the middle two terms are the same, and we get
⟨u, u⟩ + 2⟨u, v⟩ + ⟨v, v⟩ = ‖u‖² + 2⟨u, v⟩ + ‖v‖².
33. From Exercise 32, we have
‖u + v‖² = ‖u‖² + 2⟨u, v⟩ + ‖v‖².    (1)
Substituting −v for v gives another identity,
‖u − v‖² = ‖u‖² + 2⟨u, −v⟩ + ‖−v‖² = ‖u‖² − 2⟨u, v⟩ + ‖v‖².    (2)
Adding these two gives
‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖²,
and dividing through by 2 gives the required identity.
34. Solve (1) and (2) in the previous exercise for ⟨u, v⟩, giving
⟨u, v⟩ = (1/2)(‖u + v‖² − ‖u‖² − ‖v‖²)
⟨u, v⟩ = (1/2)(−‖u − v‖² + ‖u‖² + ‖v‖²).
Add these two equations, giving
2⟨u, v⟩ = (1/2)(‖u + v‖² − ‖u − v‖²);
dividing through by 2 gives
⟨u, v⟩ = (1/4)(‖u + v‖² − ‖u − v‖²) = (1/4)‖u + v‖² − (1/4)‖u − v‖².
35. Use Exercise 34. First, if ‖u + v‖ = ‖u − v‖, then Exercise 34 implies that ⟨u, v⟩ = 0. Second, suppose
that u and v are orthogonal. Then again from Exercise 34,
0 = (1/4)‖u + v‖² − (1/4)‖u − v‖² = (1/4)(‖u + v‖ + ‖u − v‖)(‖u + v‖ − ‖u − v‖),
so that one of the two factors must be zero. If the first factor is zero, then both norms are zero, so
they are equal. If the second factor is zero, then we have ‖u + v‖ − ‖u − v‖ = 0 and again the norms
are equal.
36. From (2) in Exercise 33, we have
d(u, v) = ‖u − v‖ = √(‖u‖² − 2⟨u, v⟩ + ‖v‖²).
This is equal to √(‖u‖² + ‖v‖²) if and only if ⟨u, v⟩ = 0, which is to say when u and v are orthogonal.
37. The inner product we are working with is ⟨u, v⟩ = 2u1v1 + 3u2v2, so that
v1 = [1; 0]
v2 = [1; 1] − ((2·1·1 + 3·1·0)/(2·1·1 + 3·0·0))[1; 0] = [1; 1] − [1; 0] = [0; 1].
The orthogonal basis is {[1; 0], [0; 1]}.
38. The inner product we are working with is
⟨u, v⟩ = uᵀAv = uᵀ[4 −2; −2 7]v = 4u1v1 − 2u1v2 − 2u2v1 + 7u2v2.
Then
v1 = [1; 0]
v2 = [1; 1] − ((4·1·1 − 2·1·1 − 2·1·0 + 7·0·1)/(4·1·1 − 2·1·0 − 2·1·0 + 7·0·0))[1; 0]
= [1; 1] − (1/2)[1; 0] = [1/2; 1].
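One step of Gram-Schmidt under this weighted inner product can be sketched in numpy:

```python
import numpy as np

A = np.array([[4, -2], [-2, 7]], dtype=float)

def inner(u, v):
    # the weighted inner product <u, v> = u^T A v
    return u @ A @ v

x1 = np.array([1.0, 0.0])
x2 = np.array([1.0, 1.0])

v1 = x1
v2 = x2 - (inner(v1, x2) / inner(v1, v1)) * v1
print(v2)                          # v2 = [1/2, 1]
```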
39. The inner product we are working with is ⟨a0 + a1x + a2x², b0 + b1x + b2x²⟩ = a0b0 + a1b1 + a2b2.
Then
v1 = 1
v2 = (1 + x) − (⟨1, 1 + x⟩/⟨1, 1⟩)·1 = (1 + x) − 1 = x
v3 = (1 + x + x²) − (⟨1, 1 + x + x²⟩/⟨1, 1⟩)·1 − (⟨x, 1 + x + x²⟩/⟨x, x⟩)·x = (1 + x + x²) − 1 − x = x².
The orthogonal basis is {1, x, x²}.
40. The inner product we are working with is ⟨p(x), q(x)⟩ = ∫₀¹ p(x)q(x) dx. Then
v1 = 1
v2 = (1 + x) − (∫₀¹ (1 + x) dx / ∫₀¹ 1 dx)·1 = (1 + x) − 3/2 = −1/2 + x
v3 = (1 + x + x²) − (⟨1, 1 + x + x²⟩/⟨1, 1⟩)·1 − (⟨−1/2 + x, 1 + x + x²⟩/⟨−1/2 + x, −1/2 + x⟩)(−1/2 + x)
= (1 + x + x²) − (∫₀¹ (1 + x + x²) dx / ∫₀¹ 1 dx)·1 − (∫₀¹ (−1/2 + x)(1 + x + x²) dx / ∫₀¹ (−1/2 + x)² dx)(−1/2 + x)
= (1 + x + x²) − 11/6 − ((1/6)/(1/12))(−1/2 + x)
= (1 + x + x²) − 11/6 − 2(−1/2 + x)
= 1/6 − x + x².
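The whole Gram-Schmidt computation over this inner product can be reproduced exactly with sympy:

```python
import sympy as sp

x = sp.symbols('x')

def ip(p, q):
    # inner product <p, q> = integral of p*q over [0, 1]
    return sp.integrate(p*q, (x, 0, 1))

basis = [1, 1 + x, 1 + x + x**2]
ortho = []
for p in basis:
    for v in ortho:
        p = p - ip(v, p)/ip(v, v) * v   # subtract the projection onto each earlier v
    ortho.append(sp.expand(p))
print(ortho)   # [1, x - 1/2, x**2 - x + 1/6]
```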
41. The inner product here is ⟨f, g⟩ = ∫_{−1}^{1} f(x)g(x) dx.
(a) To normalize the first three Legendre polynomials, we find the norm of each:
‖1‖ = √(∫_{−1}^{1} 1·1 dx) = √2
‖x‖ = √(∫_{−1}^{1} x·x dx) = √([(1/3)x³]_{−1}^{1}) = √(2/3)
‖x² − 1/3‖ = √(∫_{−1}^{1} (x² − 1/3)² dx) = √(∫_{−1}^{1} (x⁴ − (2/3)x² + 1/9) dx)
= √([(1/5)x⁵ − (2/9)x³ + (1/9)x]_{−1}^{1}) = 2√2/(3√5).
So the normalized Legendre polynomials are
1/‖1‖ = 1/√2,   x/‖x‖ = (√6/2)x,   (x² − 1/3)/‖x² − 1/3‖ = (3√5/(2√2))(x² − 1/3) = (√5/(2√2))(3x² − 1).
(b) For the computation, we will use the unnormalized Legendre polynomials given in Example 3.8,
namely 1, x, and x² − 1/3, and normalize the fourth one at the end. With x4 = x³, we have
v4 = x4 − ((x4·v1)/(v1·v1))v1 − ((x4·v2)/(v2·v2))v2 − ((x4·v3)/(v3·v3))v3
= x³ − (∫_{−1}^{1} x³ dx / ∫_{−1}^{1} 1 dx)·1 − (∫_{−1}^{1} x⁴ dx / ∫_{−1}^{1} x² dx)·x
− (∫_{−1}^{1} x³(x² − 1/3) dx / ∫_{−1}^{1} (x² − 1/3)² dx)(x² − 1/3)
= x³ − 0 − ((2/5)/(2/3))x − 0
= x³ − (3/5)x.
The norm of this polynomial is
‖v4‖ = √(∫_{−1}^{1} (x³ − (3/5)x)² dx) = √(∫_{−1}^{1} (x⁶ − (6/5)x⁴ + (9/25)x²) dx)
= √([(1/7)x⁷ − (6/25)x⁵ + (3/25)x³]_{−1}^{1}) = 2√14/35.
Thus the normalized Legendre polynomial is
(35/(2√14))(x³ − (3/5)x) = (√14/4)(5x³ − 3x).
42. (a) Starting with ℓ0(x) = 1, ℓ1(x) = x, ℓ2(x) = x² − 1/3, ℓ3(x) = x³ − (3/5)x, we have
ℓ0(1) = 1 ⇒ L0(x) = 1
ℓ1(1) = 1 ⇒ L1(x) = x
ℓ2(1) = 1 − 1/3 = 2/3 ⇒ L2(x) = (3/2)(x² − 1/3) = (3/2)x² − 1/2
ℓ3(1) = 1 − 3/5 = 2/5 ⇒ L3(x) = (5/2)(x³ − (3/5)x) = (5/2)x³ − (3/2)x.
(b) For L2 and L3, the given recurrence says
L2(x) = ((2·2 − 1)/2)xL1(x) − ((2 − 1)/2)L0(x) = (3/2)x² − 1/2
L3(x) = ((2·3 − 1)/3)xL2(x) − ((3 − 1)/3)L1(x) = (5/3)x((3/2)x² − 1/2) − (2/3)x = (5/2)x³ − (3/2)x,
so that the recurrence indeed works for n = 2 and n = 3. Then
L4(x) = ((2·4 − 1)/4)xL3(x) − ((4 − 1)/4)L2(x) = (7/4)x((5/2)x³ − (3/2)x) − (3/4)((3/2)x² − 1/2)
= (35/8)x⁴ − (15/4)x² + 3/8
L5(x) = ((2·5 − 1)/5)xL4(x) − ((5 − 1)/5)L3(x) = (9/5)x((35/8)x⁴ − (15/4)x² + 3/8) − (4/5)((5/2)x³ − (3/2)x)
= (63/8)x⁵ − (35/4)x³ + (15/8)x.
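The recurrence is easy to iterate with sympy; the sketch below reproduces L4 and L5 from L0 = 1 and L1 = x:

```python
import sympy as sp

x = sp.symbols('x')
L = [sp.Integer(1), x]            # L0, L1
for n in range(2, 6):
    # Legendre recurrence: L_n = ((2n-1)/n) x L_{n-1} - ((n-1)/n) L_{n-2}
    Ln = sp.expand(sp.Rational(2*n - 1, n)*x*L[n-1] - sp.Rational(n - 1, n)*L[n-2])
    L.append(Ln)
print(L[4])   # L4
print(L[5])   # L5
```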
43. Let B = {ui} be an orthogonal basis for W. Then
⟨projW(v), uj⟩ = ⟨Σᵢ (⟨ui, v⟩/⟨ui, ui⟩)ui, uj⟩ = (⟨uj, v⟩/⟨uj, uj⟩)⟨uj, uj⟩ = ⟨uj, v⟩,
since ⟨ui, uj⟩ = 0 for i ≠ j. Then
⟨perpW(v), uj⟩ = ⟨v − projW(v), uj⟩ = ⟨v, uj⟩ − ⟨projW(v), uj⟩ = ⟨v, uj⟩ − ⟨uj, v⟩ = 0.
Thus perpW(v) is orthogonal to every vector in B, so that perpW(v) is orthogonal to every w ∈ W.
44. (a) We have
⟨tu + v, tu + v⟩ = t²⟨u, u⟩ + 2t⟨u, v⟩ + ⟨v, v⟩ = ‖u‖²t² + 2⟨u, v⟩t + ‖v‖² ≥ 0.
This is a quadratic at² + bt + c in t, where a = ‖u‖², b = 2⟨u, v⟩, and c = ‖v‖².
(b) Since a = ‖u‖² > 0, this is an upward-opening parabola, so its vertex is its lowest point. That
vertex is at t = −b/(2a), so the lowest point, which must be nonnegative, is
a(−b/(2a))² + b(−b/(2a)) + c = b²/(4a) − b²/(2a) + c = (b² − 2b² + 4ac)/(4a) = (4ac − b²)/(4a) ≥ 0.
Since 4a > 0, we must have 4ac ≥ b², so that √(ac) ≥ (1/2)|b|.
(c) Substituting back in the values of a, b, and c from part (a) gives
√(ac) = √(‖u‖²‖v‖²) = ‖u‖‖v‖ ≥ (1/2)|b| = (1/2)|2⟨u, v⟩| = |⟨u, v⟩|.
Thus ‖u‖‖v‖ ≥ |⟨u, v⟩|, which is the Cauchy-Schwarz inequality.
Exploration: Vectors and Matrices with Complex Entries
1. Recall that if z ∈ C, then z̄z = |z|². Then for v as given, we have
‖v‖ = √(v·v) = √(conj(v1)v1 + conj(v2)v2 + · · · + conj(vn)vn) = √(|v1|² + · · · + |vn|²).
Finally, we must show that |z|² = |z²|. But
|z²|² = z²·conj(z²) = z²·conj(z)² = (z·conj(z))² = (|z|²)²,
so |z²| = |z|², and the above is in fact equal to √(|v1²| + · · · + |vn²|).
2. (a) u·v = conj(i)·(2 − 3i) + conj(1)·(1 + 5i) = −i(2 − 3i) + (1 + 5i) = (−3 − 2i) + (1 + 5i) = −2 + 3i.
(b) By Exercise 1, ‖u‖ = √(|i|² + |1|²) = √(1 + 1) = √2.
(c) By Exercise 1, ‖v‖ = √(|2 − 3i|² + |1 + 5i|²) = √(13 + 26) = √39.
(d) d(u, v) = ‖u − v‖ = ‖[−2 + 4i; −5i]‖ = √(|−2 + 4i|² + |−5i|²) = √(20 + 25) = √45 = 3√5.
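These computations can be checked with numpy, whose `vdot` conjugates its first argument, matching the complex dot product used here:

```python
import numpy as np

u = np.array([1j, 1])
v = np.array([2 - 3j, 1 + 5j])

print(np.vdot(u, v))             # conjugates the first argument: (-2+3j)
print(np.linalg.norm(u))         # sqrt(2)
print(np.linalg.norm(u - v))     # 3*sqrt(5)
```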
(e) If w = [a; b] is orthogonal to u, then
u·w = conj(i)·a + conj(1)·b = −ai + b = 0,
so that b = ai. Thus, for example, setting a = 1, we get [1; i].
(f) If w = [a; b] is orthogonal to v, then
v·w = conj(2 − 3i)·a + conj(1 + 5i)·b = (2 + 3i)a + (1 − 5i)b = 0.
So b = −((2 + 3i)/(1 − 5i))a; setting for example a = 1 − 5i, we get b = −2 − 3i, and one possible vector is
[1 − 5i; −2 − 3i].
3. (a) v·u = conj(v1)u1 + · · · + conj(vn)un = conj(conj(u1)v1 + · · · + conj(un)vn) = conj(u·v).
(b) u·(v + w) = conj(u1)(v1 + w1) + conj(u2)(v2 + w2) + · · · + conj(un)(vn + wn)
= conj(u1)v1 + conj(u1)w1 + · · · + conj(un)vn + conj(un)wn
= (conj(u1)v1 + · · · + conj(un)vn) + (conj(u1)w1 + · · · + conj(un)wn)
= u·v + u·w.
(c) First,
(cu)·v = conj(cu1)v1 + · · · + conj(cun)vn = conj(c)(conj(u1)v1 + · · · + conj(un)vn) = conj(c)(u·v).
For the second equality, we have, using part (a),
u·(cv) = conj((cv)·u) = conj(conj(c)(v·u)) = c·conj(v·u) = c(u·v).
(d) By Exercise 1, u·u = ‖u‖² = |u1|² + · · · + |un|² ≥ 0. Further, equality holds if and only if each
|ui|² = 0, so all the ui must be zero and therefore u = 0.
4. Just apply the definition: conjugate A and take its transpose.
(a) A* = conj(A)ᵀ = [−i i; −2i 3].
(b) A* = conj(A)ᵀ = [2, 5 − 2i; 5 + 2i, −1].
(c) A* = conj(A)ᵀ = [2 + i, 4; 1 − 3i, 0; −2, 3 + 4i].
(d) A* = conj(A)ᵀ = [−3i, 1 + i, 1 − i; 0, 4, 0; 1 − i, −i, i].
5. These facts follow directly from standard properties of complex numbers together with the fact that
matrix conjugation is element by element:
(a) conj(conj(A)) = [conj(conj(aij))] = [aij] = A.
(b) conj(A + B) = [conj(aij + bij)] = [conj(aij) + conj(bij)] = [conj(aij)] + [conj(bij)] = conj(A) + conj(B).
(c) conj(cA) = [conj(caij)] = [conj(c)·conj(aij)] = conj(c)[conj(aij)] = conj(c)·conj(A).
(d) Let C = AB; then cij = Σk aik bkj, so
(conj(AB))ij = conj(cij) = conj(Σk aik bkj) = Σk conj(aik)conj(bkj) = (conj(A)·conj(B))ij,
so that conj(AB) = conj(A)·conj(B).
(e) The ij entry of conj(Aᵀ) is conj(aji), which is also the ij entry of conj(A)ᵀ. So the two matrices
have the same entries and thus are equal.
6. These facts follow directly from the facts in Exercise 5.
(a) (A*)* = conj(conj(A)ᵀ)ᵀ = (conj(conj(A))ᵀ)ᵀ = A.
(b) (A + B)* = conj(A + B)ᵀ = (conj(A) + conj(B))ᵀ = conj(A)ᵀ + conj(B)ᵀ = A* + B*.
(c) (cA)* = conj(cA)ᵀ = (conj(c)·conj(A))ᵀ = conj(c)·conj(A)ᵀ = conj(c)A*.
(d) (AB)* = conj(AB)ᵀ = (conj(A)·conj(B))ᵀ = conj(B)ᵀ·conj(A)ᵀ = B*A*.
 
7. We have u·v = Σᵢ conj(ui)vi = [conj(u1) conj(u2) · · · conj(un)][v1; v2; …; vn] = conj(u)ᵀv = u*v.
8. If A is Hermitian, then aij = aji for all i and j. In particular, when i = j, this gives aii = aii , so that
aii is equal to its own conjugate and therefore must be real.
9. (a) A is not Hermitian since it does not have real diagonal entries.
(b) A is not Hermitian since a12 = 2 − 3i while conj(a21) = conj(2 − 3i) = 2 + 3i; the two are not equal.
(c) A is not Hermitian since a12 = 1 − 5i and a21 = −1 − 5i; the two are not equal.
(d) A is Hermitian, since every entry is the conjugate of the same entry in its transpose.
(e) A is not Hermitian. Since it is a real matrix, it is equal to its conjugate, but AT 6= A since A is
not symmetric.
(f ) A is Hermitian. Since it is a real matrix, it is equal to its conjugate, and it is symmetric, so that
T
A = AT = A.
10. Suppose that λ is an eigenvalue of a Hermitian matrix A, with corresponding eigenvector v. Now,
v*A = v*A* = (Av)* = (λv)*
since A is Hermitian. But by Exercise 6(c), (λv)* = conj(λ)v*. Then
conj(λ)(v*v) = (conj(λ)v*)v = (v*A)v = v*(Av) = v*(λv) = λ(v*v).
Then since v ≠ 0, we have by Exercise 7 that v*v = v·v = ‖v‖² ≠ 0. So we can divide the above equation
through by v*v to get conj(λ) = λ, so that λ is real.
11. Following the hint, adapt the proof of Theorem 5.19. Suppose that λ1 6= λ2 are eigenvalues of a
Hermitian matrix A corresponding to eigenvectors v1 and v2 respectively. Then
λ1 (v1 · v2 ) = (λ1 v1 ) · v2 = (Av1 ) · v2 = (Av1 )∗ v2 = (v1∗ A∗ )v2 .
But A is Hermitian, so A∗ = A and we continue:
(v1∗ A∗ )v2 = (v1∗ A)v2 = v1∗ (Av2 ) = v1∗ (λ2 v2 ) = λ2 (v1∗ v2 ) = λ2 (v1 · v2 ).
Thus λ1 (v1 · v2 ) = λ2 (v1 · v2 ), so that (λ1 − λ2 )(v1 · v2 ) = 0. But λ1 6= λ2 by assumption, so we must
have v1 · v2 = 0.
12. (a) Multiplying, we get AA* = I, so that A is unitary and
A⁻¹ = A* = [i/√2, −i/√2; −i/√2, −i/√2].
(b) Multiplying, we get AA* = [4 0; 0 4] ≠ I, so that A is not unitary.
(c) Multiplying, we get AA* = I, so that A is unitary and
A⁻¹ = A* = [3/5, −4i/5; −4/5, −3i/5].
(d) Multiplying, we get AA* = I, so that A is unitary and
A⁻¹ = A* = [(1 − i)/√6, 0, (−1 + i)/√3; 0, 1, 0; 2/√6, 0, 1/√3].
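Part (c) is easy to verify numerically; this sketch builds the matrix from its columns [3/5; 4i/5] and [−4/5; 3i/5] and checks both claims:

```python
import numpy as np

# the matrix from part (c), columns [3/5, 4i/5] and [-4/5, 3i/5]
A = np.array([[3/5, -4/5],
              [4j/5, 3j/5]])
A_star = A.conj().T

print(np.allclose(A @ A_star, np.eye(2)))      # True, so A is unitary
print(np.allclose(np.linalg.inv(A), A_star))   # True, so A^{-1} = A*
```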
13. First show (a) ⇔ (b). Let ui denote the ith column of a matrix U. Then the ith row of U* is conj(ui)ᵀ,
so by the definition of matrix multiplication,
(U*U)ij = ui·uj.
Then the columns of U form an orthonormal set if and only if
ui·uj = 0 for i ≠ j and ui·uj = 1 for i = j,
which holds if and only if (U*U)ij = 0 for i ≠ j and (U*U)ij = 1 for i = j,
Now apply this to the matrix U ∗ . We get that U ∗ is unitary if and only if the columns of U ∗ are
orthogonal. But U ∗ is unitary if and only if U is, since
(U ∗ )∗ U ∗ = U U ∗ ,
and the columns of U ∗ are just the conjugates of the rows of U , so that (letting Ui be the ith row of
U)
∗
Ui · Uj = Ui Uj = (Ui )T Uj = Uj · Ui .
So the columns of U ∗ are orthogonal if and only if the rows of U are orthogonal. This proves (a) ⇔
(c).
Next, we show (a) ⇒ (e) ⇒ (d) ⇒ (e) ⇒ (a). To prove (a) ⇒ (e), assume that U is unitary. Then
U ∗ U = I, so that
∗
U x · U y = (U x) U y = x∗ U ∗ U y = x∗ y = x · y.
Next, for (e) ⇒ (d), assume that (e) holds. Then Ux·Ux = x·x, so that
‖Ux‖ = √(Ux·Ux) = √(x·x) = ‖x‖.
Next, we show (d) ⇒ (e). Assuming (d),
x·y = (1/4)(‖x + y‖² − ‖x − y‖²)
= (1/4)(‖U(x + y)‖² − ‖U(x − y)‖²)
= (1/4)(‖Ux + Uy‖² − ‖Ux − Uy‖²)
= Ux·Uy.
Thus (d) ⇒ (e). Finally, to see that (e) ⇒ (a), let ei be the ith standard basis vector and let ui denote
the ith column of U. Then ui = Uei, so that since (e) holds,
ui·uj = Uei·Uej = ei·ej = 0 for i ≠ j and 1 for i = j.
This shows that the columns of U form an orthonormal set, so that U is unitary.
14. (a) Since
[−i/√2; i/√2]·[i/√2; i/√2] = conj(−i/√2)(i/√2) + conj(i/√2)(i/√2) = −1/2 + 1/2 = 0,
[−i/√2; i/√2]·[−i/√2; i/√2] = conj(−i/√2)(−i/√2) + conj(i/√2)(i/√2) = 1/2 + 1/2 = 1,
[i/√2; i/√2]·[i/√2; i/√2] = conj(i/√2)(i/√2) + conj(i/√2)(i/√2) = 1/2 + 1/2 = 1,
the columns of the matrix are orthonormal, so the matrix is unitary.
(b) The dot product of the first column with itself is
conj(1 + i)(1 + i) + conj(1 − i)(1 − i) = (1 − i)(1 + i) + (1 + i)(1 − i) = 4 ≠ 1,
so that the matrix is not unitary.
(c) Since
[3/5; 4i/5]·[−4/5; 3i/5] = (3/5)(−4/5) + conj(4i/5)(3i/5) = −12/25 + 12/25 = 0,
[3/5; 4i/5]·[3/5; 4i/5] = (3/5)(3/5) + conj(4i/5)(4i/5) = 9/25 + 16/25 = 1,
[−4/5; 3i/5]·[−4/5; 3i/5] = (−4/5)(−4/5) + conj(3i/5)(3i/5) = 16/25 + 9/25 = 1,
the columns of the matrix are orthonormal, so the matrix is unitary.
(d) Clearly column 2 is a unit vector, and is orthogonal to each of the other two columns, so we need
only show the first and third columns are unit vectors and are orthogonal. But
[(1 + i)/√6; 0; (−1 − i)/√3]·[2/√6; 0; 1/√3] = conj((1 + i)/√6)(2/√6) + 0 + conj((−1 − i)/√3)(1/√3)
= (2 − 2i)/6 + (−1 + i)/3 = 0,
[(1 + i)/√6; 0; (−1 − i)/√3]·[(1 + i)/√6; 0; (−1 − i)/√3] = |1 + i|²/6 + 0 + |−1 − i|²/3 = 1/3 + 2/3 = 1,
[2/√6; 0; 1/√3]·[2/√6; 0; 1/√3] = 4/6 + 0 + 1/3 = 1,
and we are done.
15. (a) With A as given, the characteristic polynomial is (λ − 2)² − i·(−i) = λ² − 4λ + 3 = (λ − 1)(λ − 3).
Thus the eigenvalues are λ1 = 1 and λ2 = 3. Corresponding eigenvectors are
λ = 1: [1; i],   λ = 3: [1 + i; 1 − i].
Since these eigenvectors correspond to different eigenvalues, they are already orthogonal. Their
norms are
‖[1; i]‖ = √(1·1 + (−i)·i) = √2,   ‖[1 + i; 1 − i]‖ = √(|1 + i|² + |1 − i|²) = 2.
So defining a matrix U whose columns are the normalized eigenvectors gives
U = [1/√2, (1 + i)/2; i/√2, (1 − i)/2],
which is unitary. Then
U*AU = [1 0; 0 3].
(b) With A as given, the characteristic polynomial is λ² + 1. Thus the eigenvalues are λ1 = i and
λ2 = −i. Corresponding eigenvectors are
λ = i: [i; 1],   λ = −i: [−i; 1].
Since these eigenvectors correspond to different eigenvalues, they are already orthogonal. Their
norms are
‖[i; 1]‖ = √(|i|² + |1|²) = √2,   ‖[−i; 1]‖ = √(|−i|² + |1|²) = √2.
So defining a matrix U whose columns are the normalized eigenvectors gives
U = [i/√2, −i/√2; 1/√2, 1/√2],
which is unitary. Then
U*AU = [i 0; 0 −i].
(c) With A as given, the characteristic polynomial is (λ + 1)λ − (1 + i)(1 − i) = λ² + λ − 2 = (λ + 2)(λ − 1).
Thus the eigenvalues are λ1 = 1 and λ2 = −2. Corresponding eigenvectors are
λ = 1: [1 + i; 2],   λ = −2: [1 + i; −1].
Since these eigenvectors correspond to different eigenvalues, they are already orthogonal. Their
norms are
‖[1 + i; 2]‖ = √(|1 + i|² + |2|²) = √6,   ‖[1 + i; −1]‖ = √(|1 + i|² + |1|²) = √3.
So defining a matrix U whose columns are the normalized eigenvectors gives
U = [(1 + i)/√6, (1 + i)/√3; 2/√6, −1/√3],
which is unitary. Then
U*AU = [1 0; 0 −2].
(d) With A as given, the characteristic polynomial is

    (λ − 1)((λ − 2)(λ − 3) − (1 − i)(1 + i)) = (λ − 1)(λ² − 5λ + 4) = (λ − 1)²(λ − 4).

Thus the eigenvalues are λ₁ = 1 and λ₂ = 4. Corresponding eigenvectors are

    λ = 1 : (0, −1 + i, 1)ᵀ and (1, 0, 0)ᵀ,    λ = 4 : (0, 1 − i, 2)ᵀ.

The eigenvectors corresponding to different eigenvalues are already orthogonal, and the two eigenvectors corresponding to λ = 1 are also orthogonal, so these vectors form an orthogonal set. Their norms are

    ‖(0, −1 + i, 1)ᵀ‖ = √(|0|² + |−1 + i|² + |1|²) = √3,    ‖(1, 0, 0)ᵀ‖ = 1,
    ‖(0, 1 − i, 2)ᵀ‖ = √(|0|² + |1 − i|² + |2|²) = √6.

So defining a matrix U whose columns are the normalized eigenvectors gives

    U = [     0        1      0      ]
        [ (−1 + i)/√3  0  (1 − i)/√6 ]
        [   1/√3       0    2/√6     ],

which is unitary. Then

    U*AU = [ 1 0 0 ]
           [ 0 1 0 ]
           [ 0 0 4 ].
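The diagonalization in part (d) can be checked numerically. A sketch with NumPy follows; the matrix A is read back from the characteristic polynomial quoted above (an assumption, since the exercise statement itself is not reproduced here):

```python
import numpy as np

# Matrix for part (d), reconstructed from its characteristic polynomial
# (lambda - 1)((lambda - 2)(lambda - 3) - (1 - i)(1 + i)).
A = np.array([[1, 0, 0],
              [0, 2, 1 - 1j],
              [0, 1 + 1j, 3]])

# Columns of U are the normalized eigenvectors found above.
U = np.array([[0, 1, 0],
              [(-1 + 1j) / np.sqrt(3), 0, (1 - 1j) / np.sqrt(6)],
              [1 / np.sqrt(3), 0, 2 / np.sqrt(6)]])

# U is unitary: U* U = I.
assert np.allclose(U.conj().T @ U, np.eye(3))

# U* A U is the diagonal matrix of eigenvalues diag(1, 1, 4).
assert np.allclose(U.conj().T @ A @ U, np.diag([1, 1, 4]))
```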
16. If A is Hermitian, then A∗ = A, so that A∗ A = AA = AA∗ . Next, if U is unitary, then U U ∗ = I. But if
U is unitary, then its rows are orthonormal. Since the columns of U ∗ are the conjugates of the rows of
U , they are orthonormal as well and therefore U ∗ is also unitary. Hence U U ∗ = U ∗ U = I and therefore
U is normal. Finally, if A is skew-Hermitian, then A∗ = −A; then A∗ A = −AA = A(−A) = AA∗ . So
all three of these types of matrices are normal.
17. By the result quoted just before Exercise 16, if A is unitarily diagonalizable, then A∗ A = AA∗ ; this is
the definition of A being normal.
Exploration: Geometric Inequalities and Optimization Problems
1. We have

    |u · v| = √x · √y + √y · √x = 2√x√y,
    ‖u‖ = ‖v‖ = √((√x)² + (√y)²) = √(x + y),

so that 2√x√y ≤ √(x + y) · √(x + y) = x + y, and thus

    √(xy) ≤ (x + y)/2.
2. (a) Since (√x − √y)² is a square, it is nonnegative. Expanding then gives x − 2√x√y + y ≥ 0, so that 2√(xy) ≤ x + y and thus

    √(xy) ≤ (x + y)/2.

(b) Let AD = a and BD = b. Then by the Pythagorean Theorem,

    CD² = a² − x² = b² − y²,

so that 2CD² = (a² + b²) − x² − y² = (x + y)² − (x² + y²) = 2xy, so that CD = √(xy). But CD is clearly at most equal to a radius of the circle, since it extends from a diameter to the circle. Since the radius is (x + y)/2, we again get

    √(xy) ≤ (x + y)/2.
3. Since the area is xy = 100, we have 10 = √(xy) ≤ (x + y)/2, so that 40 ≤ 2(x + y), which is the perimeter. By the AMGM, equality holds when x = y; that is, when x = y = 10, so the square of side 10 is the rectangle with smallest perimeter.
4. Let 1/x play the role of y in the AMGM. Then

    √(x · 1/x) ≤ (x + 1/x)/2,  or  2√1 = 2 ≤ x + 1/x.

Equality holds if and only if x = 1/x, i.e., when x = 1. Thus 1 + 1/1 = 2 is the minimum value of f(x) for x > 0.
5. If the dimensions of the cut-out square are x × x, then the base of the box will be (10 − 2x) × (10 − 2x), so its volume will be x(10 − 2x)². Then

    ∛V ≤ (x + (10 − 2x) + (10 − 2x))/3 = (20 − 3x)/3,

with equality holding if x = 10 − 2x. Thus x = 10/3, so that 10 − 2x = 10/3 as well. Then the largest possible volume is 10/3 × 10/3 × 10/3.
6. We have

    ∛f ≤ ((x + y) + (y + z) + (z + x))/3 = (2/3)(x + y + z),

with equality holding if x + y = y + z = z + x. But this implies that x = y = z; since xyz = 1, we must have x = y = z = 1, so that f(x, y, z) = (1 + 1)(1 + 1)(1 + 1) = 8. This is the minimum value of f subject to the constraint xyz = 1.
7. Use the substitution u = x − y. Then the original expression becomes

    u + y + 8/(uy).

Using the AMGM for three variables, we get

    ∛(u · y · 8/(uy)) ≤ (u + y + 8/(uy))/3,  or  3∛8 = 6 ≤ u + y + 8/(uy),

with equality holding if and only if u = y = 8/(uy); this implies that u = y = 2, so that x = u + y = 4. So the minimum value is 4 + 8/(2 · 2) = 6, which occurs when x = 4 and y = 2.

8. Follow the method of Example 7.11. x² + 2y² + z² is the square of the norm of v = (x, √2 y, z)ᵀ, so that x + 2y + 4z is the dot product of that vector with u = (1, √2, 4)ᵀ. Then the component-wise form of the Cauchy-Schwarz Inequality gives

    (1 · x + √2 · √2 y + 4 · z)² = (x + 2y + 4z)² ≤ (1² + (√2)² + 4²)(x² + 2y² + z²) = 19(x² + 2y² + z²) = 19.

So x + 2y + 4z ≤ √19, and the maximum value is √19.

9. Follow the method of Example 7.11. Here f(x, y, z) is the square of the norm of v = (x, y, z/√2)ᵀ, so set u = (1, 1, √2)ᵀ. Then the component-wise form of the Cauchy-Schwarz Inequality gives

    (u · v)² = (x + y + z)² ≤ (1² + 1² + (√2)²)(x² + y² + ½z²).

Since x + y + z = 10, we get

    100 ≤ 4(x² + y² + ½z²),  or  25 ≤ x² + y² + ½z².

Thus the minimum value of the function is 25.
10. Follow the method of Example 7.11 with u = (1, 1)ᵀ and v = (sin θ, cos θ)ᵀ. (This works because ‖v‖² = sin²θ + cos²θ = 1.) Then the component-wise form of the Cauchy-Schwarz Inequality gives

    (u · v)² = (sin θ + cos θ)² ≤ (1² + 1²)(sin²θ + cos²θ) = 2.

Therefore sin θ + cos θ ≤ √2, so the maximum value of the expression is √2.
11. We want to minimize x² + y² subject to the constraint x + 2y = 5. Thus set v = (x, y)ᵀ and u = (1, 2)ᵀ. Then the component-wise form of the Cauchy-Schwarz Inequality gives

    (x + 2y)² ≤ (1² + 2²)(x² + y²).

Since x + 2y = 5, we get

    25 ≤ 5(x² + y²),  or  5 ≤ x² + y².

Thus the minimum value is 5, which occurs at equality. To find the point where equality occurs, substitute 5 − 2y for x to get

    5 = (5 − 2y)² + y² = 25 − 20y + 5y²,  so that  y² − 4y + 4 = 0.

This has the solution y = 2; then x = 5 − 2y = 1. So the point on x + 2y = 5 closest to the origin is (1, 2).
12. To show that √(xy) ≥ 2/(1/x + 1/y), apply the AMGM to x′ = 1/x and y′ = 1/y, giving

    √(x′y′) ≤ (x′ + y′)/2,  or  1/√(x′y′) ≥ 2/(x′ + y′),  or  √(xy) ≥ 2/(1/x + 1/y).

Since this inequality results from the AMGM, we have equality when x′ = y′, so when x = y. For the first inequality, note first that (x − y)² = x² − 2xy + y² ≥ 0, so that x² + y² ≥ 2xy. Then

    (x² + y²)/2 = (x² + y² + (x² + y²))/4 ≥ (x² + 2xy + y²)/4 = ((x + y)/2)²,  so that  √((x² + y²)/2) ≥ (x + y)/2.

Equality occurs when x² + y² = 2xy, which is the same as (x − y)² = 0, so that x = y.
13. The area of the rectangle shown is A = 2xy, where x² + y² = r². From Exercise 12, we have

    √((x² + y²)/2) ≥ √(xy)  ⇒  √(x² + y²) ≥ √(2xy)  ⇒  √(r²) = r ≥ √(2xy) = √A  ⇒  A ≤ r².

The maximum occurs at equality, and the maximum area is A = r². By Exercise 12, this happens when x = y.
14. Following the hint, we have

    (x + y)²/(xy) = (x + y)(1/x + 1/y).

So from Exercise 12, we have

    (x + y)/2 ≥ 2/(1/x + 1/y)  ⇒  (x + y)(1/x + 1/y) ≥ 4.

Thus the minimum value is 4; this minimum occurs when x = y, so when

    (x + x)(1/x + 1/x) = 4x/x = 4,

for example when x = y = 1.
15. Squaring the inequalities in Exercise 12 gives

    (x² + y²)/2 ≥ ((x + y)/2)² ≥ (2/(1/x + 1/y))².

Substituting x + 1/x for x and y + 1/y for y in the first inequality gives

    ((x + 1/x)² + (y + 1/y)²)/2 ≥ (((x + 1/x) + (y + 1/y))/2)².

Since the constraint we are working with is x + y = 1, we get

    (((x + 1/x) + (y + 1/y))/2)² = (((x + y) + 1/x + 1/y)/2)² = (1/2 + (1/x + 1/y)/2)² ≥ (1/2 + 2/(x + y))²,

using the second and fourth terms in Exercise 12 for the final inequality. But now again since x + y = 1, we get

    (1/2 + 2/(x + y))² = (5/2)² = 25/4,

so that f(x, y) = 2 · 25/4 = 25/2 is the minimum when x + y = 1. This equality holds when x = y, so that x = y = 1/2.
7.2
Norms and Distance Functions
1. We have

    ‖u‖_E = √((−1)² + 4² + (−5)²) = √42,
    ‖u‖_s = |−1| + |4| + |−5| = 10,
    ‖u‖_m = max(|−1|, |4|, |−5|) = 5.
2. We have

    ‖v‖_E = √(2² + (−2)² + 0²) = 2√2,
    ‖v‖_s = |2| + |−2| + |0| = 4,
    ‖v‖_m = max(|2|, |−2|, |0|) = 2.

3. u − v = (−3, 6, −5)ᵀ, so

    d_E(u, v) = √((−3)² + 6² + (−5)²) = √70,
    d_s(u, v) = |−3| + |6| + |−5| = 14,
    d_m(u, v) = max(|−3|, |6|, |−5|) = 6.
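These distance computations can be verified with NumPy's `np.linalg.norm`, whose `ord` argument selects the 2-, 1-, and ∞-norms:

```python
import numpy as np

# Exercise 3: the Euclidean, sum, and max distances via u - v.
u_minus_v = np.array([-3, 6, -5])

d_E = np.linalg.norm(u_minus_v, 2)       # Euclidean norm
d_s = np.linalg.norm(u_minus_v, 1)       # sum norm
d_m = np.linalg.norm(u_minus_v, np.inf)  # max norm

assert np.isclose(d_E, np.sqrt(70))
assert d_s == 14
assert d_m == 6
```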
4. (a) ds (u, v) measures the total difference in magnitude between corresponding components.
(b) dm (u, v) measures the greatest difference between any pair of corresponding components.
5. ‖u‖_H = w(u), which is the number of 1s in u, or 4. ‖v‖_H = w(v), which is the number of 1s in v, or 5.

6. d_H(u, v) = w(u − v) = w(u) + w(v) less twice the overlap, or 4 + 5 − 2 · 2 = 5; this is the number of 1s in u − v.
7. (a) If ‖v‖_E = √(Σvᵢ²) = max{|vᵢ|} = ‖v‖_m, then squaring both sides gives Σvᵢ² = max{|vᵢ|²} = max{vᵢ²}. Since vᵢ² ≥ 0 for all i, equality holds if and only if exactly one of the vᵢ is nonzero.

(b) Suppose ‖v‖_s = Σ|vᵢ| = max{|vᵢ|} = ‖v‖_m. Since |vᵢ| ≥ 0 for all i, equality holds if and only if exactly one of the vᵢ is nonzero.

(c) By parts (a) and (b), ‖v‖_E = ‖v‖_m = ‖v‖_s if and only if exactly one of the components of v is nonzero.
8. (a) Since

    ‖u + v‖²_E = Σ(uᵢ + vᵢ)² = Σuᵢ² + Σvᵢ² + 2Σuᵢvᵢ = ‖u‖²_E + ‖v‖²_E + 2u · v,

we see that ‖u + v‖²_E = ‖u‖²_E + ‖v‖²_E if and only if u · v = 0; that is, if and only if u and v are orthogonal.

(b) We have

    ‖u + v‖_s = Σ|uᵢ + vᵢ|,    ‖u‖_s + ‖v‖_s = Σ(|uᵢ| + |vᵢ|).

For a given i, if the signs of uᵢ and vᵢ are the same, then |uᵢ + vᵢ| = |uᵢ| + |vᵢ|, while if they are different, then |uᵢ + vᵢ| < |uᵢ| + |vᵢ|. Thus the two norms are equal exactly when uᵢ and vᵢ have the same sign for each i, which is to say that uᵢvᵢ ≥ 0 for all i.

(c) We have

    ‖u + v‖_m = max{|uᵢ + vᵢ|},    ‖u‖_m + ‖v‖_m = max{|uᵢ|} + max{|vᵢ|}.

Suppose that |u_r| = max{|uᵢ|} and |v_t| = max{|vᵢ|}. Then for any j, we know that

    |u_j + v_j| ≤ |u_j| + |v_j| ≤ |u_r| + |v_t|,

where the first inequality is the triangle inequality and the second comes from the condition on r and t. Choose j such that |u_j + v_j| = ‖u + v‖_m. If the two norms are equal, then

    |u_j + v_j| = ‖u + v‖_m = ‖u‖_m + ‖v‖_m = |u_r| + |v_t|.

Thus both of the inequalities above must be equalities. The first is an equality if u_j and v_j have the same sign. For the second, since |u_j| ≤ |u_r| and |v_j| ≤ |v_t|, equality means that |u_j| = |u_r| and |v_j| = |v_t|. Putting this together gives the following (somewhat complicated) condition for equality to hold in the triangle inequality for the max norm: equality holds if there is some component j such that u_j and v_j have the same sign and such that |u_j| is a component of maximal magnitude of u and |v_j| is a component of maximal magnitude of v.
9. Let v_k be a component of v with maximum magnitude. Then the result follows from the string of equalities and inequalities

    ‖v‖_m = |v_k| = √(v_k²) ≤ √(Σvᵢ²) = ‖v‖_E.

10. We have ‖v‖²_E = Σvᵢ² ≤ (Σ|vᵢ|)² = ‖v‖²_s. Taking square roots gives the desired result since norms are always nonnegative.

11. We have ‖v‖_s = Σ|vᵢ| ≤ Σ max{|vᵢ|} = n max{|vᵢ|} = n‖v‖_m.

12. We have ‖v‖_E = √(Σvᵢ²) ≤ √(Σ(max{|vᵢ|})²) = √(n max{|vᵢ|²}) = √n max{|vᵢ|} = √n ‖v‖_m.
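The chain of inequalities in Exercises 9–12, ‖v‖_m ≤ ‖v‖_E ≤ ‖v‖_s ≤ n‖v‖_m, can be spot-checked numerically on random vectors (a sanity check, not a proof):

```python
import numpy as np

# Check the norm inequalities from Exercises 9-12 on random vectors in R^5.
rng = np.random.default_rng(0)
n = 5
for _ in range(100):
    v = rng.normal(size=n)
    m = np.linalg.norm(v, np.inf)   # max norm
    E = np.linalg.norm(v, 2)        # Euclidean norm
    s = np.linalg.norm(v, 1)        # sum norm
    assert m <= E + 1e-12           # Exercise 9
    assert E <= s + 1e-12           # Exercise 10
    assert s <= n * m + 1e-12       # Exercise 11
    assert E <= np.sqrt(n) * m + 1e-12  # Exercise 12
```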
13. The unit circle in R2 relative to the sum norm is
{(x, y) : |x| + |y| = 1} = {(x, y) : x + y = 1 or x − y = 1 or − x + y = 1 or − x − y = 1}.
The unit circle in R2 relative to the max norm is
{(x, y) : max{|x| , |y|} = 1}.
These unit circles are shown below: [figures: the sum-norm unit circle is the diamond with vertices (±1, 0) and (0, ±1); the max-norm unit circle is the square with vertices (±1, ±1)].
14. For example, in R² with the sum norm, let u = ⟨2, 1⟩ and v = ⟨1, 2⟩. Then ‖u‖ = ‖v‖ = 2 + 1 = 3. Also, ‖u + v‖ = ‖⟨3, 3⟩‖ = 6 and ‖u − v‖ = ‖⟨1, −1⟩‖ = 2. But then

    ‖u‖² + ‖v‖² = 3² + 3² = 18 ≠ ½‖u + v‖² + ½‖u − v‖² = 18 + 2 = 20.
15. Clearly ‖(a, b)ᵀ‖ = max{|2a|, |3b|} ≥ 0, and since max{|2a|, |3b|} = 0 if and only if both a and b are zero, we see that the norm is zero only for the zero vector. Thus property (1) is satisfied. For property (2),

    ‖c(a, b)ᵀ‖ = ‖(ca, cb)ᵀ‖ = max{|2ac|, |3bc|} = |c| max{|2a|, |3b|} = |c| ‖(a, b)ᵀ‖.

Finally,

    ‖(a, b)ᵀ + (c, d)ᵀ‖ = ‖(a + c, b + d)ᵀ‖ = max{|2(a + c)|, |3(b + d)|}.

Now use the triangle inequality:

    max{|2(a + c)|, |3(b + d)|} ≤ max{|2a| + |2c|, |3b| + |3d|} ≤ max{|2a|, |3b|} + max{|2c|, |3d|}.

But the last expression is just ‖(a, b)ᵀ‖ + ‖(c, d)ᵀ‖, and we are done.
16. Clearly max_{i,j}{|a_ij|} ≥ 0, and is zero only when all of the a_ij are zero, so when A = O. Thus property (1) is satisfied. For property (2),

    ‖cA‖ = max_{i,j}{|c a_ij|} = max_{i,j}{|c| |a_ij|} = |c| max_{i,j}{|a_ij|} = |c| ‖A‖.

For property (3), using the triangle inequality for absolute values gives

    ‖A + B‖ = max_{i,j}{|a_ij + b_ij|} ≤ max_{i,j}{|a_ij| + |b_ij|} ≤ max_{i,j}{|a_ij|} + max_{i,j}{|b_ij|} = ‖A‖ + ‖B‖.
17. Clearly ‖f‖ ≥ 0, since |f| ≥ 0. Since f is continuous, |f| is also continuous, so its integral is zero if and only if it is identically zero. So property (1) holds. For property (2), use standard properties of integrals:

    ‖cf‖ = ∫₀¹ |cf(x)| dx = ∫₀¹ |c| · |f(x)| dx = |c| ∫₀¹ |f(x)| dx = |c| ‖f‖.

Finally, property (3) relies on the triangle inequality for absolute value:

    ‖f + g‖ = ∫₀¹ |f(x) + g(x)| dx ≤ ∫₀¹ (|f(x)| + |g(x)|) dx = ∫₀¹ |f(x)| dx + ∫₀¹ |g(x)| dx = ‖f‖ + ‖g‖.
18. Since the norm is the maximum of a set of nonnegative numbers, it is always nonnegative, and is zero exactly when all of the numbers are zero, which means that f is the zero function on [0, 1]. So property (1) holds. For property (2), we have

    ‖cf‖ = max_{0≤x≤1}{|cf(x)|} = max_{0≤x≤1}{|c| |f(x)|} = |c| max_{0≤x≤1}{|f(x)|} = |c| ‖f‖.

Finally, property (3) relies on the triangle inequality for absolute values:

    ‖f + g‖ = max_{0≤x≤1}{|f(x) + g(x)|} ≤ max_{0≤x≤1}{|f(x)| + |g(x)|} ≤ max_{0≤x≤1}{|f(x)|} + max_{0≤x≤1}{|g(x)|} = ‖f‖ + ‖g‖.
19. By property (2) of the definition of a norm,

    d(u, v) = ‖u − v‖ = |−1| ‖u − v‖ = ‖(−1)(u − v)‖ = ‖v − u‖ = d(v, u).
20. The Frobenius norm is ‖A‖_F = √(2² + 3² + 4² + 1²) = √30. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is 2 + 4 = 6, while ‖A‖_∞ is the largest row sum of |A|; both rows of |A| sum to 5 (2 + 3 = 4 + 1 = 5), so ‖A‖_∞ = 5.
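A quick check with NumPy, where the matrix entries (2, 3; 4, 1) are read off from the computation above:

```python
import numpy as np

# Exercise 20: Frobenius, 1-, and infinity-norms of A = [2 3; 4 1].
A = np.array([[2.0, 3.0],
              [4.0, 1.0]])

fro = np.linalg.norm(A, 'fro')       # Frobenius norm
one = np.linalg.norm(A, 1)           # largest absolute column sum
inf = np.linalg.norm(A, np.inf)      # largest absolute row sum

assert np.isclose(fro, np.sqrt(30))
assert one == 6
assert inf == 5
```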
21. The Frobenius norm is ‖A‖_F = √(0² + (−1)² + (−3)² + 3²) = √19. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is |−1| + 3 = 4, while ‖A‖_∞ is the largest row sum, which is |−3| + 3 = 6.

22. The Frobenius norm is ‖A‖_F = √(1² + 5² + (−2)² + (−1)²) = √31. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is 5 + |−1| = 6, while ‖A‖_∞ is the largest row sum, which is 1 + 5 = 6.

23. The Frobenius norm is ‖A‖_F = √(2² + 1² + 1² + 1² + 3² + 2² + 1² + 1² + 3²) = √31. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is 1 + 2 + 3 = 6, while ‖A‖_∞ is the largest row sum, which is 1 + 3 + 2 = 6.

24. The Frobenius norm is ‖A‖_F = √(0² + (−5)² + 2² + 3² + 1² + (−3)² + (−4)² + (−4)² + 3²) = √89. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is |−5| + 1 + |−4| = 10, while ‖A‖_∞ is the largest row sum, which is |−4| + |−4| + 3 = 11.

25. The Frobenius norm is ‖A‖_F = √(4² + (−2)² + (−1)² + 0² + (−1)² + 2² + 3² + (−3)² + 0²) = √44 = 2√11. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is 4 + 0 + 3 = 7, while ‖A‖_∞ is the largest row sum, which is 4 + |−2| + |−1| = 7.
26. Since ‖A‖₁ = 6, which is the sum of the first column, such a vector x is e₁, since

    [2 3; 4 1](1, 0)ᵀ = (2, 4)ᵀ,  and  ‖(2, 4)ᵀ‖_s = 6.

Since ‖A‖_∞ = 5, which is the sum of (say) the first row, let y = (1, 1)ᵀ; then

    [2 3; 4 1](1, 1)ᵀ = (5, 5)ᵀ,  and  ‖(5, 5)ᵀ‖_m = 5.
27. Since ‖A‖₁ = 4, which is the sum of the magnitudes of the second column, such a vector x is e₂, since

    [0 −1; −3 3](0, 1)ᵀ = (−1, 3)ᵀ,  and  ‖(−1, 3)ᵀ‖_s = |−1| + 3 = 4.

Since ‖A‖_∞ = 6, which is the sum of the magnitudes of the second row, let y = (−1, 1)ᵀ since the first column in that row is negative; then

    [0 −1; −3 3](−1, 1)ᵀ = (−1, 6)ᵀ,  and  ‖(−1, 6)ᵀ‖_m = 6.
28. Since ‖A‖₁ = 6, which is the sum of the magnitudes of the second column, such a vector x is e₂, since

    [1 5; −2 −1](0, 1)ᵀ = (5, −1)ᵀ,  and  ‖(5, −1)ᵀ‖_s = 5 + |−1| = 6.

Since ‖A‖_∞ = 6, which is the sum of the first row, let y = (1, 1)ᵀ; then

    [1 5; −2 −1](1, 1)ᵀ = (6, −3)ᵀ,  and  ‖(6, −3)ᵀ‖_m = 6.
29. Since ‖A‖₁ = 6, which is the sum of the magnitudes of the third column, such a vector x is e₃, since

    [2 1 1; 1 3 2; 1 1 3](0, 0, 1)ᵀ = (1, 2, 3)ᵀ,  and  ‖(1, 2, 3)ᵀ‖_s = 6.

Since ‖A‖_∞ = 6, which is the sum of the second row, let y = (1, 1, 1)ᵀ; then

    [2 1 1; 1 3 2; 1 1 3](1, 1, 1)ᵀ = (4, 6, 5)ᵀ,  and  ‖(4, 6, 5)ᵀ‖_m = 6.
30. Since ‖A‖₁ = 10, which is the sum of the magnitudes of the second column, such a vector x is e₂, since

    [0 −5 2; 3 1 −3; −4 −4 3](0, 1, 0)ᵀ = (−5, 1, −4)ᵀ,  and  ‖(−5, 1, −4)ᵀ‖_s = 10.

Since ‖A‖_∞ = 11, which is the sum of the magnitudes of the third row, let y = (−1, −1, 1)ᵀ since the entries in the first two columns are negative; then

    [0 −5 2; 3 1 −3; −4 −4 3](−1, −1, 1)ᵀ = (7, −7, 11)ᵀ,  and  ‖(7, −7, 11)ᵀ‖_m = 11.
31. Since ‖A‖₁ = 7, which is the sum of the magnitudes of the first column, such a vector x is e₁, since

    [4 −2 −1; 0 −1 2; 3 −3 0](1, 0, 0)ᵀ = (4, 0, 3)ᵀ,  and  ‖(4, 0, 3)ᵀ‖_s = 7.

Since ‖A‖_∞ = 7, which is the sum of the magnitudes of the first row, let y = (1, −1, −1)ᵀ since the entries in the last two columns are negative; then

    [4 −2 −1; 0 −1 2; 3 −3 0](1, −1, −1)ᵀ = (7, −1, 6)ᵀ,  and  ‖(7, −1, 6)ᵀ‖_m = 7.
32. Let Aᵢ be the ith row of A, let M = max_{j=1,2,...,n}{‖A_j‖_s} be the maximum absolute row sum, and let x be any vector with ‖x‖_m = 1 (which means that |xᵢ| ≤ 1 for all i). Then

    ‖Ax‖_m = max_i{|Aᵢx|}
           = max_i{|a_{i1}x₁ + a_{i2}x₂ + ··· + a_{in}x_n|}
           ≤ max_i{|a_{i1}| |x₁| + |a_{i2}| |x₂| + ··· + |a_{in}| |x_n|}
           ≤ max_i{|a_{i1}| + |a_{i2}| + ··· + |a_{in}|}
           = max_i{‖Aᵢ‖_s} = M.

Thus ‖Ax‖_m is at most equal to the maximum absolute row sum. Next, we find a specific unit vector so that it is actually equal to M. Let row k be the row of A that gives the maximum sum norm; that is, such that ‖A_k‖_s = M. Let

    y_j = 1 if a_{kj} ≥ 0,  y_j = −1 if a_{kj} < 0,  for j = 1, 2, ..., n.

Then with y = (y₁, ..., y_n)ᵀ, we have ‖y‖_m = 1, and we have corrected any negative signs in row k, so that a_{kj}y_j ≥ 0 for all j and therefore the kth entry of Ay is

    Σ_j a_{kj}y_j = Σ_j |a_{kj}y_j| = Σ_j |a_{kj}| |y_j| = Σ_j |a_{kj}| = ‖A_k‖_s = M.

So we have shown that ‖Ax‖_m is bounded by the maximum row sum, and then found a vector y with ‖y‖_m = 1 for which ‖Ay‖_m is equal to the maximum row sum. Thus ‖A‖_∞ is equal to the maximum row sum, as desired.
33. (a) If ‖·‖ is an operator norm, then ‖I‖ = max_{‖x̂‖=1} ‖Ix̂‖ = max_{‖x̂‖=1} ‖x̂‖ = 1.

(b) No, there is not, for n > 1, since in the Frobenius norm, ‖I‖_F = √(Σ 1²) = √n ≠ 1. When n = 1, the Frobenius norm is the same as the absolute value norm on the real number line.
34. Let λ be an eigenvalue of A and v a corresponding eigenvector, and normalize v to the unit vector v̂. Then Av̂ = λv̂, so that

    ‖A‖ = max_{‖x̂‖=1} ‖Ax̂‖ ≥ ‖Av̂‖ = ‖λv̂‖ = |λ| ‖v̂‖ = |λ|.
35. Computing the inverse gives

    A⁻¹ = [  1  −1/2 ]
          [ −2   3/2 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (3 + 4)(1 + |−2|) = 21,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (4 + 2)(|−2| + 3/2) = 21.

The matrix is well-conditioned.
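This computation can be verified numerically; in the sketch below A is recovered by inverting the A⁻¹ shown above (which gives A = [3 1; 4 2]):

```python
import numpy as np

# Exercise 35: condition numbers from the computed inverse.
A_inv = np.array([[1.0, -0.5],
                  [-2.0, 1.5]])
A = np.linalg.inv(A_inv)          # recovers [[3, 1], [4, 2]]

cond_1 = np.linalg.norm(A, 1) * np.linalg.norm(A_inv, 1)
cond_inf = np.linalg.norm(A, np.inf) * np.linalg.norm(A_inv, np.inf)

assert np.isclose(cond_1, 21)
assert np.isclose(cond_inf, 21)
# np.linalg.cond computes the same products directly:
assert np.isclose(np.linalg.cond(A, 1), 21)
assert np.isclose(np.linalg.cond(A, np.inf), 21)
```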
36. Since det A = 0, by definition cond1 (A) = cond∞ (A) = ∞, so this matrix is ill-conditioned.
37. Computing the inverse gives

    A⁻¹ = [  100  −99 ]
          [ −100  100 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (1 + 1)(100 + |−100|) = 400,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (1 + 1)(100 + |−100|) = 400.

The matrix is ill-conditioned.
38. Computing the inverse gives

    A⁻¹ = [  2001/50    −2  ]
          [ −3001/100   3/2 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (200 + 4002)(2001/50 + 3001/100) ≈ 294266,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (3001 + 4002)(2001/50 + 2) ≈ 294266.

The matrix is ill-conditioned.
39. Computing the inverse gives

    A⁻¹ = [  0   0   1 ]
          [  6  −1  −1 ]
          [ −5   1   0 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (1 + 5 + 1)(0 + 6 + 5) = 77,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (5 + 5 + 6)(6 + 1 + 1) = 128.

The matrix is moderately ill-conditioned.
40. Computing the inverse gives

    A⁻¹ = [   9  −36    30 ]
          [ −36  192  −180 ]
          [  30 −180   180 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (1 + 1/2 + 1/3)(36 + 192 + 180) = 748,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (1 + 1/2 + 1/3)(36 + 192 + 180) = 748.

The matrix is ill-conditioned.
41. (a) Computing the inverse gives

    A⁻¹ = (1/(1 − k)) [  1  −k ]
                      [ −1   1 ].

Then

    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = max{|k| + 1, 2} · max{ |k|/|k − 1| + 1/|k − 1|, 2/|k − 1| }.

(b) As k → 1, since the second factor goes to ∞, cond_∞(A) → ∞. This makes sense since when k = 1, A is not invertible.
42. Since the matrix norm is compatible, we have from the definition that ‖Ax‖ ≤ ‖A‖ ‖x‖ for any x, so that

    ‖b‖ = ‖AA⁻¹b‖ ≤ ‖A‖ ‖A⁻¹b‖,  so that  1/‖A⁻¹b‖ ≤ ‖A‖/‖b‖.

Then

    ‖Δx‖/‖x‖ = ‖A⁻¹Δb‖/‖A⁻¹b‖ ≤ ‖A⁻¹‖ ‖Δb‖/‖A⁻¹b‖ ≤ ‖A‖ ‖A⁻¹‖ ‖Δb‖/‖b‖ = cond(A) · ‖Δb‖/‖b‖.
43. (a) Since

    A⁻¹ = [ −9/10   1 ]
          [   1    −1 ],

we have cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (10 + 10)(1 + 1) = 40.

(b) Here

    ΔA = [10 10; 10 11] − [10 10; 10 9] = [0 0; 0 2]  ⇒  ‖ΔA‖_∞ = 2.

Therefore

    ‖Δx‖_m/‖x′‖_m ≤ cond_∞(A) · ‖ΔA‖_∞/‖A‖_∞ = 40 · (2/20) = 4,

so that this can produce at most a change of a factor of 4, or a 400% relative change.

(c) Row-reducing gives

    [A | b]  = [10 10 100; 10 9 99]  → [1 0 9; 0 1 1],
    [A′ | b] = [10 10 100; 10 11 99] → [1 0 11; 0 1 −1].

Thus x = (9, 1)ᵀ and x′ = (11, −1)ᵀ, so that Δx = x′ − x = (2, −2)ᵀ, and ‖Δx‖_m/‖x‖_m = 2/9, for approximately a 22% relative error.

(d) Using Exercise 42,

    Δb = (100, 101)ᵀ − (100, 99)ᵀ = (0, 2)ᵀ,

so that ‖Δb‖_m = 2 and

    ‖Δx‖_m/‖x‖_m ≤ cond_∞(A) · ‖Δb‖_m/‖b‖_m = 40 · (2/100) = 0.8,

so that at most an 80% change can result.

(e) Row-reducing [A | b] gives the same result as above, while

    [A | b′] = [10 10 100; 10 9 101] → [1 0 11; 0 1 −1].

Thus Δx = (11, −1)ᵀ − (9, 1)ᵀ = (2, −2)ᵀ, and again ‖Δx‖_m/‖x‖_m = 2/9, for about a 22% actual relative error.
44. (a) Since

    A⁻¹ = [ −10   3   5 ]
          [   4  −1  −2 ]
          [   7  −2  −3 ],

we have cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (1 + 5 + |−1|)(|−10| + 4 + 7) = 147.

(b) Here

    ΔA = [1 1 1; 2 5 0; 1 −1 2] − [1 1 1; 1 5 0; 1 −1 2] = [0 0 0; 1 0 0; 0 0 0]  ⇒  ‖ΔA‖₁ = 1.

Therefore

    ‖Δx‖_s/‖x′‖_s ≤ cond₁(A) · ‖ΔA‖₁/‖A‖₁ = 147 · (1/7) = 21,

so that this can produce at most a change of a factor of 21, or a 2100% relative change.

(c) Row-reducing gives

    [A | b]  = [1 1 1 1; 2 5 0 2; 1 −1 2 3] → [1 0 0 11; 0 1 0 −4; 0 0 1 −6],
    [A′ | b] = [1 1 1 1; 1 5 0 2; 1 −1 2 3] → [1 0 0 −11/2; 0 1 0 3/2; 0 0 1 5].

Thus x = (11, −4, −6)ᵀ and x′ = (−11/2, 3/2, 5)ᵀ, so that Δx = x′ − x = (−33/2, 11/2, 11)ᵀ, and ‖Δx‖_s/‖x‖_s = 33/21 ≈ 1.57, for approximately a 157% actual relative error.

(d) Using Exercise 42,

    Δb = (1, 1, 3)ᵀ − (1, 2, 3)ᵀ = (0, −1, 0)ᵀ,

so that ‖Δb‖_s = 1 and

    ‖Δx‖_s/‖x‖_s ≤ cond₁(A) · ‖Δb‖_s/‖b‖_s = 147 · (1/6) = 24.5,

so that at most a 2450% relative change can result.

(e) Row-reducing [A | b] gives the same result as above, while

    [A | b′] = [1 1 1 1; 1 5 0 1; 1 −1 2 3] → [1 0 0 −4; 0 1 0 1; 0 0 1 4].

Thus Δx = (−4, 1, 4)ᵀ − (11, −4, −6)ᵀ = (−15, 5, 10)ᵀ, and ‖Δx‖_s/‖x‖_s = 30/21 ≈ 1.43, for about a 143% actual relative error.
45. From Exercise 33(a), 1 = ‖I‖ = ‖A⁻¹A‖ ≤ ‖A⁻¹‖ ‖A‖ = cond(A).

46. From Exercise 33(a),

    cond(AB) = ‖(AB)⁻¹‖ ‖AB‖ = ‖B⁻¹A⁻¹‖ ‖AB‖ ≤ ‖A⁻¹‖ ‖A‖ ‖B⁻¹‖ ‖B‖ = cond(A) cond(B).

47. From Theorem 4.18(b) in Chapter 4, if λₙ is an eigenvalue of A, then 1/λₙ is an eigenvalue of A⁻¹. By Exercise 34, ‖A‖ ≥ |λ₁| and ‖A⁻¹‖ ≥ |1/λₙ|, so that

    cond(A) = ‖A⁻¹‖ ‖A‖ ≥ |λ₁|/|λₙ|.

48. With

    A = [ 7 −1 ]
        [ 1 −5 ],
we have

    M = −D⁻¹(L + U) = −[7 0; 0 −5]⁻¹ [0 −1; 1 0] = [0 1/7; 1/5 0],

so that ‖M‖_∞ = 1/5. Then

    b = (6, −4)ᵀ  ⇒  c = D⁻¹b = (6/7, 4/5)ᵀ  ⇒  ‖c‖_m = 6/7.

Note that x₀ = 0, so that x₁ = c, and then ‖M‖ᵏ_∞ ‖x₁ − x₀‖_m = ‖M‖ᵏ_∞ ‖c‖_m < 0.0005 means that

    k > (4 + log₁₀(‖c‖_m/5)) / log₁₀(1/‖M‖_∞).

In this case, with ‖M‖_∞ = 1/5 and ‖c‖_m = 6/7, we get

    k > (4 + log₁₀(6/35)) / log₁₀ 5 ≈ 4.6,  so that k ≥ 5.

Computing x₅ using x_{k+1} = Mx_k + c gives x₅ = (0.999, 0.999)ᵀ, as compared with an actual solution of x = (1, 1)ᵀ.
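The Jacobi iteration used in Exercises 48–51 can be sketched in NumPy; the splitting below forms M = −D⁻¹(L + U) and c = D⁻¹b directly from the A and b of Exercise 48:

```python
import numpy as np

# Jacobi iteration x_{k+1} = M x_k + c for Exercise 48.
A = np.array([[7.0, -1.0],
              [1.0, -5.0]])
b = np.array([6.0, -4.0])

D = np.diag(np.diag(A))
M = -np.linalg.inv(D) @ (A - D)   # A - D = L + U
c = np.linalg.inv(D) @ b

x = np.zeros(2)                   # x_0 = 0
for _ in range(5):                # five iterations, as computed above
    x = M @ x + c

# x_5 is within about 10^-3 of the actual solution (1, 1).
assert np.allclose(x, [1, 1], atol=1e-3)
```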
49. With

    A = [ 4.5 −0.5 ]
        [ 1   −3.5 ],

we have

    M = −D⁻¹(L + U) = −[4.5 0; 0 −3.5]⁻¹ [0 −0.5; 1 0] = [0 1/9; 2/7 0],

so that ‖M‖_∞ = 2/7. Then

    b = (1, −1)ᵀ  ⇒  c = D⁻¹b = (2/9, 2/7)ᵀ  ⇒  ‖c‖_m = 2/7.

Note that x₀ = 0, so that x₁ = c, and then ‖M‖ᵏ_∞ ‖x₁ − x₀‖_m = ‖M‖ᵏ_∞ ‖c‖_m < 0.0005 means that

    k > (4 + log₁₀(‖c‖_m/5)) / log₁₀(1/‖M‖_∞).

In this case, with ‖M‖_∞ = 2/7 and ‖c‖_m = 2/7, we get

    k > (4 + log₁₀(2/35)) / log₁₀(7/2) ≈ 5.06,  so that k ≥ 6.

Computing x₆ using x_{k+1} = Mx_k + c gives x₆ = (0.2622, 0.3606)ᵀ, as compared with an actual solution of x = (0.2623, 0.3606)ᵀ.
50. With

    A = [ 20   1  −1 ]
        [  1 −10   1 ]
        [ −1   1  10 ],

we have

    M = −D⁻¹(L + U) = −[20 0 0; 0 −10 0; 0 0 10]⁻¹ [0 1 −1; 1 0 1; −1 1 0]
      = [ 0    −1/20  1/20 ]
        [ 1/10   0    1/10 ]
        [ 1/10 −1/10    0  ],

so that ‖M‖_∞ = 2/10 = 1/5. Then

    b = (17, 13, 18)ᵀ  ⇒  c = D⁻¹b = (17/20, −13/10, 9/5)ᵀ  ⇒  ‖c‖_m = 9/5.

Note that x₀ = 0, so that x₁ = c, and then ‖M‖ᵏ_∞ ‖x₁ − x₀‖_m = ‖M‖ᵏ_∞ ‖c‖_m < 0.0005 means that

    k > (4 + log₁₀(‖c‖_m/5)) / log₁₀(1/‖M‖_∞).

In this case, with ‖M‖_∞ = 1/5 and ‖c‖_m = 9/5, we get

    k > (4 + log₁₀(9/25)) / log₁₀ 5 ≈ 5.08,  so that k ≥ 6.

Computing x₆ using x_{k+1} = Mx_k + c gives x₆ = (0.999, −1.000, 1.999)ᵀ, as compared with an actual solution of x = (1, −1, 2)ᵀ.
51. With

    A = [ 3 1 0 ]
        [ 1 4 1 ]
        [ 0 1 3 ],

we have

    M = −D⁻¹(L + U) = −[3 0 0; 0 4 0; 0 0 3]⁻¹ [0 1 0; 1 0 1; 0 1 0]
      = [  0   −1/3   0  ]
        [ −1/4   0  −1/4 ]
        [  0   −1/3   0  ],

so that ‖M‖_∞ = 2/4 = 1/2. Then

    b = (1, 1, 1)ᵀ  ⇒  c = D⁻¹b = (1/3, 1/4, 1/3)ᵀ  ⇒  ‖c‖_m = 1/3.

Note that x₀ = 0, so that x₁ = c, and then ‖M‖ᵏ_∞ ‖x₁ − x₀‖_m = ‖M‖ᵏ_∞ ‖c‖_m < 0.0005 means that

    k > (4 + log₁₀(‖c‖_m/5)) / log₁₀(1/‖M‖_∞).

In this case, with ‖M‖_∞ = 1/2 and ‖c‖_m = 1/3, we get

    k > (4 + log₁₀(1/15)) / log₁₀ 2 ≈ 9.3,  so that k ≥ 10.

Computing x₁₀ using x_{k+1} = Mx_k + c gives x₁₀ = (0.299, 0.099, 0.299)ᵀ, as compared with an actual solution of x = (0.3, 0.1, 0.3)ᵀ.
52. The sum and max norms are operator norms induced from the sum and max norms on Rⁿ, so they are matrix norms.

(a) If C and D are any two matrices and ‖·‖ is either the sum or max norm, then ‖CD‖ ≤ ‖C‖ ‖D‖. Now assume A is such that ‖A‖ = δ < 1. Then

    ‖Aⁿ‖ ≤ ‖A‖ⁿ = δⁿ.

Thus lim_{n→∞} ‖Aⁿ‖ = lim_{n→∞} δⁿ = 0. So by property (1) in the definition of matrix norms, Aⁿ → O.

(b) A computation shows that

    (I − A)(I + A + ··· + Aⁿ) = I − A^{n+1},

since all the other terms cancel in pairs. Thus

    (I − A)(I + A + A² + A³ + ···) = (I − A) lim_{n→∞}(I + A + ··· + Aⁿ)
                                   = lim_{n→∞}((I − A)(I + A + ··· + Aⁿ))
                                   = lim_{n→∞}(I − A^{n+1})
                                   = I.

It follows that I − A is invertible and that its inverse is I + A + A² + A³ + ···.

(c) Recall that a consumption matrix C is a matrix such that all entries are nonnegative and the sum of the entries in each column does not exceed 1. A consumption matrix C is productive if I − C is invertible and (I − C)⁻¹ ≥ O; that is, if it has only nonnegative entries.

To prove Corollary 3.35, use ‖·‖_∞, the max norm. Note that since C ≥ O, the sum of the absolute values of the entries of a row is the same as the sum of the entries of the row. If the sum of each row of C is less than 1, then ‖C‖_∞ < 1, so by part (b), I − C is invertible. Since C ≥ O, it is clear that each entry of (I − C)⁻¹ = I + C + C² + C³ + ··· is nonnegative, so that (I − C)⁻¹ ≥ O. Thus C is productive.

The proof of Corollary 3.36 is similar, but it uses ‖·‖₁, the sum norm. If the sum of each column of C is less than 1, then ‖C‖₁ < 1. The rest of the argument is identical to that in the preceding paragraph.
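Part (b) can be illustrated numerically: for a small matrix with norm less than 1, partial sums of I + A + A² + ··· approach (I − A)⁻¹. The matrix below is an arbitrary example chosen for the sketch, not one from the text:

```python
import numpy as np

# Neumann series: if ||A|| < 1, then I + A + A^2 + ... = (I - A)^{-1}.
A = np.array([[0.2, 0.1],
              [0.0, 0.3]])          # sum and max norms both < 1

S = np.zeros_like(A)
term = np.eye(2)
for _ in range(100):                 # partial sums of the series
    S = S + term
    term = term @ A

assert np.allclose(S, np.linalg.inv(np.eye(2) - A))
```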
7.3
Least Squares Approximation
1. The corresponding data points predicted by the line y = −2 + 2x are (1, 0), (2, 2), and (3, 4), so the least squares error is

    √((0 − 0)² + (2 − 1)² + (4 − 5)²) = √2.

A plot of the points and the line on the same set of axes is: [figure].
2. The corresponding data points predicted by the line y = x are (1, 1), (2, 2), and (3, 3), so the least squares error is

    √((1 − 0)² + (2 − 1)² + (3 − 5)²) = √6.

A plot of the points and the line on the same set of axes is: [figure].
3. The corresponding data points predicted by the line y = −3 + (5/2)x are (1, −1/2), (2, 2), and (3, 9/2), so the least squares error is

    √((−1/2 − 0)² + (2 − 1)² + (9/2 − 5)²) = √6/2.

A plot of the points and the line on the same set of axes is: [figure].
4. The corresponding data points predicted by the line y = 3 − (1/3)x are

    (−5, 3 − (1/3)(−5)) = (−5, 14/3),    (0, 3 − (1/3) · 0) = (0, 3),
    (5, 3 − (1/3) · 5) = (5, 4/3),       (10, 3 − (1/3) · 10) = (10, −1/3),

so the least squares error is

    √((3 − 14/3)² + (3 − 3)² + (2 − 4/3)² + (0 − (−1/3))²) = √(25/9 + 0 + 4/9 + 1/9) = √30/3.

A plot of the points and the line on the same set of axes is: [figure].
5. The corresponding data points predicted by the line y = 5/2 are

    (−5, 5/2),    (0, 5/2),    (5, 5/2),    (10, 5/2),

so the least squares error is

    √((3 − 5/2)² + (3 − 5/2)² + (2 − 5/2)² + (0 − 5/2)²) = √(1/4 + 1/4 + 1/4 + 25/4) = √7.

A plot of the points and the line on the same set of axes is: [figure].
6. The corresponding data points predicted by the line y = 2 − (1/5)x are

    (−5, 3),    (0, 2),    (5, 1),    (10, 0),

so the least squares error is

    √((3 − 3)² + (3 − 2)² + (2 − 1)² + (0 − 0)²) = √2.

A plot of the points and the line on the same set of axes is: [figure].
7. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 1; 1 2; 1 3],    x = (a, b)ᵀ,    b = (y₁, y₂, ..., yₙ)ᵀ = (0, 1, 5)ᵀ.

Since

    AᵀA = [1 1 1; 1 2 3][1 1; 1 2; 1 3] = [3 6; 6 14]  ⇒  (AᵀA)⁻¹ = [7/3 −1; −1 1/2],

we get

    x = (AᵀA)⁻¹Aᵀb = [7/3 −1; −1 1/2][1 1 1; 1 2 3](0, 1, 5)ᵀ = (−3, 5/2)ᵀ.

Thus the least squares approximation is y = −3 + (5/2)x. To find the least squares error, compute

    e = b − Ax = (0, 1, 5)ᵀ − [1 1; 1 2; 1 3](−3, 5/2)ᵀ = (1/2, −1, 1/2)ᵀ.

Then ‖e‖ = √((1/2)² + (−1)² + (1/2)²) = √6/2.
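The normal-equations solution of Exercise 7 can be reproduced with NumPy; `np.linalg.lstsq` computes the same least squares solution directly:

```python
import numpy as np

# Exercise 7: solve A^T A x = A^T b for the best-fit line y = a + bx
# through (1, 0), (2, 1), (3, 5).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.0, 1.0, 5.0])

x = np.linalg.solve(A.T @ A, A.T @ b)    # (a, b) = (-3, 5/2)
e = b - A @ x                             # least squares error vector

assert np.allclose(x, [-3, 2.5])
assert np.isclose(np.linalg.norm(e), np.sqrt(6) / 2)
# np.linalg.lstsq finds the same solution directly:
assert np.allclose(np.linalg.lstsq(A, b, rcond=None)[0], x)
```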
8. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 1; 1 2; 1 3],    x = (a, b)ᵀ,    b = (y₁, y₂, ..., yₙ)ᵀ = (6, 3, 1)ᵀ.

Since

    AᵀA = [1 1 1; 1 2 3][1 1; 1 2; 1 3] = [3 6; 6 14]  ⇒  (AᵀA)⁻¹ = [7/3 −1; −1 1/2],

we get

    x = (AᵀA)⁻¹Aᵀb = [7/3 −1; −1 1/2][1 1 1; 1 2 3](6, 3, 1)ᵀ = (25/3, −5/2)ᵀ.

Thus the least squares approximation is y = 25/3 − (5/2)x. To find the least squares error, compute

    e = b − Ax = (6, 3, 1)ᵀ − [1 1; 1 2; 1 3](25/3, −5/2)ᵀ = (1/6, −1/3, 1/6)ᵀ.

Then ‖e‖ = √((1/6)² + (−1/3)² + (1/6)²) = √6/6.
9. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 0; 1 1; 1 2],    x = (a, b)ᵀ,    b = (y₁, y₂, ..., yₙ)ᵀ = (4, 1, 0)ᵀ.

Since

    AᵀA = [1 1 1; 0 1 2][1 0; 1 1; 1 2] = [3 3; 3 5]  ⇒  (AᵀA)⁻¹ = [5/6 −1/2; −1/2 1/2],

we get

    x = (AᵀA)⁻¹Aᵀb = [5/6 −1/2; −1/2 1/2][1 1 1; 0 1 2](4, 1, 0)ᵀ = (11/3, −2)ᵀ.

Thus the least squares approximation is y = 11/3 − 2x. To find the least squares error, compute

    e = b − Ax = (4, 1, 0)ᵀ − [1 0; 1 1; 1 2](11/3, −2)ᵀ = (1/3, −2/3, 1/3)ᵀ.

Then ‖e‖ = √((1/3)² + (−2/3)² + (1/3)²) = √6/3.
10. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 0; 1 1; 1 2],    x = (a, b)ᵀ,    b = (y₁, y₂, ..., yₙ)ᵀ = (3, 3, 5)ᵀ.

Since

    AᵀA = [1 1 1; 0 1 2][1 0; 1 1; 1 2] = [3 3; 3 5]  ⇒  (AᵀA)⁻¹ = [5/6 −1/2; −1/2 1/2],

we get

    x = (AᵀA)⁻¹Aᵀb = [5/6 −1/2; −1/2 1/2][1 1 1; 0 1 2](3, 3, 5)ᵀ = (8/3, 1)ᵀ.

Thus the least squares approximation is y = 8/3 + x. To find the least squares error, compute

    e = b − Ax = (3, 3, 5)ᵀ − [1 0; 1 1; 1 2](8/3, 1)ᵀ = (1/3, −2/3, 1/3)ᵀ.

Then ‖e‖ = √((1/3)² + (−2/3)² + (1/3)²) = √6/3.
11. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 −5; 1 0; 1 5; 1 10],    x = (a, b)ᵀ,    b = (y₁, ..., yₙ)ᵀ = (−1, 1, 2, 4)ᵀ.

Since

    AᵀA = [1 1 1 1; −5 0 5 10][1 −5; 1 0; 1 5; 1 10] = [4 10; 10 150]  ⇒  (AᵀA)⁻¹ = [3/10 −1/50; −1/50 1/125],

we get

    x = (AᵀA)⁻¹Aᵀb = [3/10 −1/50; −1/50 1/125][1 1 1 1; −5 0 5 10](−1, 1, 2, 4)ᵀ = (7/10, 8/25)ᵀ.

Thus the least squares approximation is y = 7/10 + (8/25)x. To find the least squares error, compute

    e = b − Ax = (−1, 1, 2, 4)ᵀ − [1 −5; 1 0; 1 5; 1 10](7/10, 8/25)ᵀ = (−1/10, 3/10, −3/10, 1/10)ᵀ.

Then ‖e‖ = √((−1/10)² + (3/10)² + (−3/10)² + (1/10)²) = √5/5.
12. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&x_1 \\ 1&x_2 \\ \vdots&\vdots \\ 1&x_n \end{bmatrix} = \begin{bmatrix} 1&-5 \\ 1&0 \\ 1&5 \\ 1&10 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 3\\3\\2\\0 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 1&1&1&1 \\ -5&0&5&10 \end{bmatrix}\begin{bmatrix} 1&-5 \\ 1&0 \\ 1&5 \\ 1&10 \end{bmatrix} = \begin{bmatrix} 4&10 \\ 10&150 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{3}{10} & -\frac{1}{50} \\ -\frac{1}{50} & \frac{1}{125} \end{bmatrix},$$
7.3. LEAST SQUARES APPROXIMATION
573
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{3}{10} & -\frac{1}{50} \\ -\frac{1}{50} & \frac{1}{125} \end{bmatrix}\begin{bmatrix} 1&1&1&1 \\ -5&0&5&10 \end{bmatrix}\begin{bmatrix} 3\\3\\2\\0 \end{bmatrix} = \begin{bmatrix} \frac52 \\ -\frac15 \end{bmatrix}.$$
Thus the least squares approximation is $y = \frac52 - \frac15 x$. To find the least squares error, compute
$$e = b - Ax = \begin{bmatrix} 3\\3\\2\\0 \end{bmatrix} - \begin{bmatrix} 1&-5 \\ 1&0 \\ 1&5 \\ 1&10 \end{bmatrix}\begin{bmatrix} \frac52 \\ -\frac15 \end{bmatrix} = \begin{bmatrix} -\frac12 \\ \frac12 \\ \frac12 \\ -\frac12 \end{bmatrix}.$$
Then $\|e\| = \sqrt{\left(-\frac12\right)^2 + \left(\frac12\right)^2 + \left(\frac12\right)^2 + \left(-\frac12\right)^2} = 1$.
13. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&x_1 \\ 1&x_2 \\ \vdots&\vdots \\ 1&x_n \end{bmatrix} = \begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1\\3\\4\\5\\7 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 1&1&1&1&1 \\ 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix} = \begin{bmatrix} 5&15 \\ 15&55 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{11}{10} & -\frac{3}{10} \\ -\frac{3}{10} & \frac{1}{10} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{11}{10} & -\frac{3}{10} \\ -\frac{3}{10} & \frac{1}{10} \end{bmatrix}\begin{bmatrix} 1&1&1&1&1 \\ 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 1\\3\\4\\5\\7 \end{bmatrix} = \begin{bmatrix} -\frac15 \\ \frac75 \end{bmatrix}.$$
Thus the least squares approximation is $y = -\frac15 + \frac75 x$. To find the least squares error, compute
$$e = b - Ax = \begin{bmatrix} 1\\3\\4\\5\\7 \end{bmatrix} - \begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix}\begin{bmatrix} -\frac15 \\ \frac75 \end{bmatrix} = \begin{bmatrix} -\frac15 \\ \frac25 \\ 0 \\ -\frac25 \\ \frac15 \end{bmatrix}.$$
Then $\|e\| = \sqrt{\left(-\frac15\right)^2 + \left(\frac25\right)^2 + 0^2 + \left(-\frac25\right)^2 + \left(\frac15\right)^2} = \frac{\sqrt{10}}{5}$.
14. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&x_1 \\ 1&x_2 \\ \vdots&\vdots \\ 1&x_n \end{bmatrix} = \begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 10\\8\\5\\3\\0 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 1&1&1&1&1 \\ 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix} = \begin{bmatrix} 5&15 \\ 15&55 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{11}{10} & -\frac{3}{10} \\ -\frac{3}{10} & \frac{1}{10} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{11}{10} & -\frac{3}{10} \\ -\frac{3}{10} & \frac{1}{10} \end{bmatrix}\begin{bmatrix} 1&1&1&1&1 \\ 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 10\\8\\5\\3\\0 \end{bmatrix} = \begin{bmatrix} \frac{127}{10} \\ -\frac52 \end{bmatrix}.$$
Thus the least squares approximation is $y = \frac{127}{10} - \frac52 x$. To find the least squares error, compute
$$e = b - Ax = \begin{bmatrix} 10\\8\\5\\3\\0 \end{bmatrix} - \begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix}\begin{bmatrix} \frac{127}{10} \\ -\frac52 \end{bmatrix} = \begin{bmatrix} -\frac15 \\ \frac{3}{10} \\ -\frac15 \\ \frac{3}{10} \\ -\frac15 \end{bmatrix}.$$
Then $\|e\| = \sqrt{\left(-\frac15\right)^2 + \left(\frac{3}{10}\right)^2 + \left(-\frac15\right)^2 + \left(\frac{3}{10}\right)^2 + \left(-\frac15\right)^2} = \frac{\sqrt{30}}{10}$.
15. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx² = y to get
a + b + c = 1
a + 2b + 4c = −2
a + 3b + 9c = 3
a + 4b + 16c = 4,
which in matrix form is
$$\begin{bmatrix} 1&1&1 \\ 1&2&4 \\ 1&3&9 \\ 1&4&16 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 1\\-2\\3\\4 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 4 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 1&1&1&1 \\ 1&2&3&4 \\ 1&4&9&16 \end{bmatrix}\begin{bmatrix} 1&1&1 \\ 1&2&4 \\ 1&3&9 \\ 1&4&16 \end{bmatrix} = \begin{bmatrix} 4&10&30 \\ 10&30&100 \\ 30&100&354 \end{bmatrix} \Rightarrow (A^TA)^{-1} = \begin{bmatrix} \frac{31}{4} & -\frac{27}{4} & \frac54 \\ -\frac{27}{4} & \frac{129}{20} & -\frac54 \\ \frac54 & -\frac54 & \frac14 \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{31}{4} & -\frac{27}{4} & \frac54 \\ -\frac{27}{4} & \frac{129}{20} & -\frac54 \\ \frac54 & -\frac54 & \frac14 \end{bmatrix}\begin{bmatrix} 1&1&1&1 \\ 1&2&3&4 \\ 1&4&9&16 \end{bmatrix}\begin{bmatrix} 1\\-2\\3\\4 \end{bmatrix} = \begin{bmatrix} 3 \\ -\frac{18}{5} \\ 1 \end{bmatrix}.$$
Thus the least squares approximation is $y = 3 - \frac{18}{5}x + x^2$.
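The same normal-equations method extends directly to quadratic fits; the sketch below (a numerical check, not part of the original solution, assuming NumPy) reproduces the result of Exercise 15.

```python
import numpy as np

# Exercise 15 data: fit y = a + b*x + c*x^2 to (1, 1), (2, -2), (3, 3), (4, 4)
xs = np.array([1.0, 2, 3, 4])
ys = np.array([1.0, -2, 3, 4])

# Design matrix with columns 1, x, x^2
A = np.column_stack([np.ones_like(xs), xs, xs**2])

# Solve the normal equations A^T A x = A^T b
coef = np.linalg.solve(A.T @ A, A.T @ ys)
print(coef)    # close to [3, -18/5, 1]
```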
16. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx² = y to get
a + b + c = 6
a + 2b + 4c = 0
a + 3b + 9c = 0
a + 4b + 16c = 2,
which in matrix form is
$$\begin{bmatrix} 1&1&1 \\ 1&2&4 \\ 1&3&9 \\ 1&4&16 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 6\\0\\0\\2 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 4 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 1&1&1&1 \\ 1&2&3&4 \\ 1&4&9&16 \end{bmatrix}\begin{bmatrix} 1&1&1 \\ 1&2&4 \\ 1&3&9 \\ 1&4&16 \end{bmatrix} = \begin{bmatrix} 4&10&30 \\ 10&30&100 \\ 30&100&354 \end{bmatrix} \Rightarrow (A^TA)^{-1} = \begin{bmatrix} \frac{31}{4} & -\frac{27}{4} & \frac54 \\ -\frac{27}{4} & \frac{129}{20} & -\frac54 \\ \frac54 & -\frac54 & \frac14 \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{31}{4} & -\frac{27}{4} & \frac54 \\ -\frac{27}{4} & \frac{129}{20} & -\frac54 \\ \frac54 & -\frac54 & \frac14 \end{bmatrix}\begin{bmatrix} 1&1&1&1 \\ 1&2&3&4 \\ 1&4&9&16 \end{bmatrix}\begin{bmatrix} 6\\0\\0\\2 \end{bmatrix} = \begin{bmatrix} 15 \\ -\frac{56}{5} \\ 2 \end{bmatrix}.$$
Thus the least squares approximation is $y = 15 - \frac{56}{5}x + 2x^2$.
17. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx² = y to get
a − 2b + 4c = 4
a − b + c = 7
a = 3
a + b + c = 0
a + 2b + 4c = −1,
which in matrix form is
$$\begin{bmatrix} 1&-2&4 \\ 1&-1&1 \\ 1&0&0 \\ 1&1&1 \\ 1&2&4 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 4\\7\\3\\0\\-1 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 1&1&1&1&1 \\ -2&-1&0&1&2 \\ 4&1&0&1&4 \end{bmatrix}\begin{bmatrix} 1&-2&4 \\ 1&-1&1 \\ 1&0&0 \\ 1&1&1 \\ 1&2&4 \end{bmatrix} = \begin{bmatrix} 5&0&10 \\ 0&10&0 \\ 10&0&34 \end{bmatrix} \Rightarrow (A^TA)^{-1} = \begin{bmatrix} \frac{17}{35} & 0 & -\frac17 \\ 0 & \frac{1}{10} & 0 \\ -\frac17 & 0 & \frac{1}{14} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{17}{35} & 0 & -\frac17 \\ 0 & \frac{1}{10} & 0 \\ -\frac17 & 0 & \frac{1}{14} \end{bmatrix}\begin{bmatrix} 1&1&1&1&1 \\ -2&-1&0&1&2 \\ 4&1&0&1&4 \end{bmatrix}\begin{bmatrix} 4\\7\\3\\0\\-1 \end{bmatrix} = \begin{bmatrix} \frac{18}{5} \\ -\frac{17}{10} \\ -\frac12 \end{bmatrix}.$$
Thus the least squares approximation is $y = \frac{18}{5} - \frac{17}{10}x - \frac12 x^2$.
18. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx² = y to get
a − 2b + 4c = 0
a − b + c = −11
a = −10
a + b + c = −9
a + 2b + 4c = 8,
which in matrix form is
$$\begin{bmatrix} 1&-2&4 \\ 1&-1&1 \\ 1&0&0 \\ 1&1&1 \\ 1&2&4 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 0\\-11\\-10\\-9\\8 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 1&1&1&1&1 \\ -2&-1&0&1&2 \\ 4&1&0&1&4 \end{bmatrix}\begin{bmatrix} 1&-2&4 \\ 1&-1&1 \\ 1&0&0 \\ 1&1&1 \\ 1&2&4 \end{bmatrix} = \begin{bmatrix} 5&0&10 \\ 0&10&0 \\ 10&0&34 \end{bmatrix} \Rightarrow (A^TA)^{-1} = \begin{bmatrix} \frac{17}{35} & 0 & -\frac17 \\ 0 & \frac{1}{10} & 0 \\ -\frac17 & 0 & \frac{1}{14} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{17}{35} & 0 & -\frac17 \\ 0 & \frac{1}{10} & 0 \\ -\frac17 & 0 & \frac{1}{14} \end{bmatrix}\begin{bmatrix} 1&1&1&1&1 \\ -2&-1&0&1&2 \\ 4&1&0&1&4 \end{bmatrix}\begin{bmatrix} 0\\-11\\-10\\-9\\8 \end{bmatrix} = \begin{bmatrix} -\frac{62}{5} \\ \frac95 \\ 4 \end{bmatrix}.$$
Thus the least squares approximation is $y = -\frac{62}{5} + \frac95 x + 4x^2$.
19. The normal equation is $A^TAx = A^Tb$. We have
$$A^TA = \begin{bmatrix} 3&1&1 \\ 1&1&2 \end{bmatrix}\begin{bmatrix} 3&1 \\ 1&1 \\ 1&2 \end{bmatrix} = \begin{bmatrix} 11&6 \\ 6&6 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 3&1&1 \\ 1&1&2 \end{bmatrix}\begin{bmatrix} 1\\1\\1 \end{bmatrix} = \begin{bmatrix} 5\\4 \end{bmatrix},$$
so that the normal form of this system is
$$\begin{bmatrix} 11&6 \\ 6&6 \end{bmatrix}x = \begin{bmatrix} 5\\4 \end{bmatrix}.$$
Thus
$$x = \begin{bmatrix} 11&6 \\ 6&6 \end{bmatrix}^{-1}\begin{bmatrix} 5\\4 \end{bmatrix} = \begin{bmatrix} \frac15 & -\frac15 \\ -\frac15 & \frac{11}{30} \end{bmatrix}\begin{bmatrix} 5\\4 \end{bmatrix} = \begin{bmatrix} \frac15 \\ \frac{7}{15} \end{bmatrix}$$
is the least squares solution.
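The normal system above can be verified numerically; this sketch (not part of the original manual, assuming NumPy) solves it directly from the coefficient matrix and right-hand side computed in the solution.

```python
import numpy as np

# Normal equations of Exercise 19: (A^T A) x = A^T b
ATA = np.array([[11.0, 6], [6, 6]])
ATb = np.array([5.0, 4])

x = np.linalg.solve(ATA, ATb)
print(x)    # close to [1/5, 7/15]
```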
20. The normal equation is $A^TAx = A^Tb$. We have
$$A^TA = \begin{bmatrix} 1&3&2 \\ -2&-2&1 \end{bmatrix}\begin{bmatrix} 1&-2 \\ 3&-2 \\ 2&1 \end{bmatrix} = \begin{bmatrix} 14&-6 \\ -6&9 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 1&3&2 \\ -2&-2&1 \end{bmatrix}\begin{bmatrix} 1\\1\\1 \end{bmatrix} = \begin{bmatrix} 6\\-3 \end{bmatrix},$$
so that the normal form of this system is
$$\begin{bmatrix} 14&-6 \\ -6&9 \end{bmatrix}x = \begin{bmatrix} 6\\-3 \end{bmatrix}.$$
Thus
$$x = \begin{bmatrix} 14&-6 \\ -6&9 \end{bmatrix}^{-1}\begin{bmatrix} 6\\-3 \end{bmatrix} = \begin{bmatrix} \frac{1}{10} & \frac{1}{15} \\ \frac{1}{15} & \frac{7}{45} \end{bmatrix}\begin{bmatrix} 6\\-3 \end{bmatrix} = \begin{bmatrix} \frac25 \\ -\frac{1}{15} \end{bmatrix}$$
is the least squares solution.
21. The normal equation is $A^TAx = A^Tb$. We have
$$A^TA = \begin{bmatrix} 1&0&2&3 \\ -2&-3&5&0 \end{bmatrix}\begin{bmatrix} 1&-2 \\ 0&-3 \\ 2&5 \\ 3&0 \end{bmatrix} = \begin{bmatrix} 14&8 \\ 8&38 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 1&0&2&3 \\ -2&-3&5&0 \end{bmatrix}\begin{bmatrix} 4\\1\\-2\\4 \end{bmatrix} = \begin{bmatrix} 12\\-21 \end{bmatrix},$$
so that the normal form of this system is
$$\begin{bmatrix} 14&8 \\ 8&38 \end{bmatrix}x = \begin{bmatrix} 12\\-21 \end{bmatrix}.$$
Thus
$$x = \begin{bmatrix} 14&8 \\ 8&38 \end{bmatrix}^{-1}\begin{bmatrix} 12\\-21 \end{bmatrix} = \begin{bmatrix} \frac{19}{234} & -\frac{2}{117} \\ -\frac{2}{117} & \frac{7}{234} \end{bmatrix}\begin{bmatrix} 12\\-21 \end{bmatrix} = \begin{bmatrix} \frac43 \\ -\frac56 \end{bmatrix}$$
is the least squares solution.
22. The normal equation is $A^TAx = A^Tb$. We have
$$A^TA = \begin{bmatrix} 1&2&-1&0 \\ 0&-1&1&2 \end{bmatrix}\begin{bmatrix} 1&0 \\ 2&-1 \\ -1&1 \\ 0&2 \end{bmatrix} = \begin{bmatrix} 6&-3 \\ -3&6 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 1&2&-1&0 \\ 0&-1&1&2 \end{bmatrix}\begin{bmatrix} 1\\5\\-1\\2 \end{bmatrix} = \begin{bmatrix} 12\\-2 \end{bmatrix},$$
so that the normal form of this system is
$$\begin{bmatrix} 6&-3 \\ -3&6 \end{bmatrix}x = \begin{bmatrix} 12\\-2 \end{bmatrix}.$$
Thus
$$x = \begin{bmatrix} 6&-3 \\ -3&6 \end{bmatrix}^{-1}\begin{bmatrix} 12\\-2 \end{bmatrix} = \begin{bmatrix} \frac29 & \frac19 \\ \frac19 & \frac29 \end{bmatrix}\begin{bmatrix} 12\\-2 \end{bmatrix} = \begin{bmatrix} \frac{22}{9} \\ \frac89 \end{bmatrix}$$
is the least squares solution.
23. Instead of trying to invert $A^TA$, we instead use an alternative method, row-reducing $\left[A^TA \mid A^Tb\right]$. We will see that $A^TA$ does not reduce to the identity matrix, so that the system does not have a unique solution. We get
$$A^TA = \begin{bmatrix} 3&0&2&1 \\ 0&3&-2&-1 \\ 2&-2&3&2 \\ 1&-1&2&2 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 2\\-5\\3\\-1 \end{bmatrix}.$$
Row-reducing, we have
$$\left[\begin{array}{cccc|c} 3&0&2&1&2 \\ 0&3&-2&-1&-5 \\ 2&-2&3&2&3 \\ 1&-1&2&2&-1 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1&0&0&-1&4 \\ 0&1&0&1&-5 \\ 0&0&1&2&-5 \\ 0&0&0&0&0 \end{array}\right].$$
So $A^TA$ is not invertible, and therefore there is not a unique solution. The general solution is
$$x = \begin{bmatrix} 4+t \\ -5-t \\ -5-2t \\ t \end{bmatrix} = \begin{bmatrix} 4\\-5\\-5\\0 \end{bmatrix} + t\begin{bmatrix} 1\\-1\\-2\\1 \end{bmatrix}.$$
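The rank deficiency in Exercise 23 can be confirmed numerically. This sketch (a check, not part of the original solution, assuming NumPy) uses the normal-equation matrices computed above and verifies that every member of the one-parameter family solves the system.

```python
import numpy as np

# Exercise 23: A^T A is singular, so the normal equations have
# infinitely many least squares solutions x = (4+t, -5-t, -5-2t, t).
ATA = np.array([[3.0, 0, 2, 1],
                [0, 3, -2, -1],
                [2, -2, 3, 2],
                [1, -1, 2, 2]])
ATb = np.array([2.0, -5, 3, -1])

print(np.linalg.matrix_rank(ATA))   # 3, not 4, so A^T A is not invertible

# Every member of the family satisfies the normal equations
for t in (-1.0, 0.0, 2.5):
    x = np.array([4 + t, -5 - t, -5 - 2*t, t])
    assert np.allclose(ATA @ x, ATb)
```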
24. Instead of trying to invert $A^TA$, we instead use an alternative method, row-reducing $\left[A^TA \mid A^Tb\right]$. We will see that $A^TA$ does not reduce to the identity matrix, so that the system does not have a unique solution. We get
$$A^TA = \begin{bmatrix} 3&0&3&0 \\ 0&3&1&2 \\ 3&1&4&0 \\ 0&2&0&2 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 3\\3\\8\\-2 \end{bmatrix}.$$
Row-reducing, we have
$$\left[\begin{array}{cccc|c} 3&0&3&0&3 \\ 0&3&1&2&3 \\ 3&1&4&0&8 \\ 0&2&0&2&-2 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1&0&0&1&-5 \\ 0&1&0&1&-1 \\ 0&0&1&-1&6 \\ 0&0&0&0&0 \end{array}\right].$$
So $A^TA$ is not invertible, and therefore there is not a unique solution. The general solution is
$$x = \begin{bmatrix} -5-t \\ -1-t \\ 6+t \\ t \end{bmatrix} = \begin{bmatrix} -5\\-1\\6\\0 \end{bmatrix} + t\begin{bmatrix} -1\\-1\\1\\1 \end{bmatrix}.$$
25. Solving these equations corresponds to solving Ax = b, where
$$A = \begin{bmatrix} 1&1&-1 \\ 0&-1&2 \\ 3&2&-1 \\ -1&0&1 \end{bmatrix}, \qquad b = \begin{bmatrix} 2\\6\\11\\0 \end{bmatrix}.$$
To solve, we row-reduce $\left[A^TA \mid A^Tb\right]$:
$$A^TA = \begin{bmatrix} 1&0&3&-1 \\ 1&-1&2&0 \\ -1&2&-1&1 \end{bmatrix}\begin{bmatrix} 1&1&-1 \\ 0&-1&2 \\ 3&2&-1 \\ -1&0&1 \end{bmatrix} = \begin{bmatrix} 11&7&-5 \\ 7&6&-5 \\ -5&-5&7 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 1&0&3&-1 \\ 1&-1&2&0 \\ -1&2&-1&1 \end{bmatrix}\begin{bmatrix} 2\\6\\11\\0 \end{bmatrix} = \begin{bmatrix} 35\\18\\-1 \end{bmatrix}.$$
Row-reducing, we have
$$\left[\begin{array}{ccc|c} 11&7&-5&35 \\ 7&6&-5&18 \\ -5&-5&7&-1 \end{array}\right] \longrightarrow \left[\begin{array}{ccc|c} 1&0&0&\frac{42}{11} \\ 0&1&0&\frac{19}{11} \\ 0&0&1&\frac{42}{11} \end{array}\right].$$
Thus the best approximation is
$$x = \begin{bmatrix} x\\y\\z \end{bmatrix} = \begin{bmatrix} \frac{42}{11} \\ \frac{19}{11} \\ \frac{42}{11} \end{bmatrix}.$$
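For an inconsistent system like the one in Exercise 25, the least squares solution can be checked numerically; this sketch (not part of the original manual, assuming NumPy) solves the normal equations for the reconstructed A and b.

```python
import numpy as np

# Exercise 25: the 4x3 system Ax = b is inconsistent,
# so we compute the least squares solution instead.
A = np.array([[1.0, 1, -1],
              [0, -1, 2],
              [3, 2, -1],
              [-1, 0, 1]])
b = np.array([2.0, 6, 11, 0])

x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)    # close to [42/11, 19/11, 42/11]
```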
26. Solving these equations corresponds to solving Ax = b, where
$$A = \begin{bmatrix} 2&3&1 \\ 1&1&1 \\ -1&1&-1 \\ 0&2&1 \end{bmatrix}, \qquad b = \begin{bmatrix} 21\\7\\14\\0 \end{bmatrix}.$$
To solve, we row-reduce $\left[A^TA \mid A^Tb\right]$:
$$A^TA = \begin{bmatrix} 2&1&-1&0 \\ 3&1&1&2 \\ 1&1&-1&1 \end{bmatrix}\begin{bmatrix} 2&3&1 \\ 1&1&1 \\ -1&1&-1 \\ 0&2&1 \end{bmatrix} = \begin{bmatrix} 6&6&4 \\ 6&15&5 \\ 4&5&4 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 2&1&-1&0 \\ 3&1&1&2 \\ 1&1&-1&1 \end{bmatrix}\begin{bmatrix} 21\\7\\14\\0 \end{bmatrix} = \begin{bmatrix} 35\\84\\14 \end{bmatrix}.$$
Row-reducing, we have
$$\left[\begin{array}{ccc|c} 6&6&4&35 \\ 6&15&5&84 \\ 4&5&4&14 \end{array}\right] \longrightarrow \left[\begin{array}{ccc|c} 1&0&0&\frac{469}{66} \\ 0&1&0&\frac{224}{33} \\ 0&0&1&-\frac{133}{11} \end{array}\right].$$
Thus the best approximation is
$$x = \begin{bmatrix} x\\y\\z \end{bmatrix} = \begin{bmatrix} \frac{469}{66} \\ \frac{224}{33} \\ -\frac{133}{11} \end{bmatrix}.$$
27. With Ax = QRx = b, we get $Rx = Q^Tb$. Back-substitution in this upper triangular system gives the least squares solution
$$x = \begin{bmatrix} \frac53 \\ -2 \end{bmatrix}.$$
28. With Ax = QRx = b, we get $Rx = Q^Tb$, or
$$\begin{bmatrix} \sqrt6 & -\frac{\sqrt6}{2} \\ 0 & \frac{1}{\sqrt2} \end{bmatrix}x = \begin{bmatrix} \frac{1}{\sqrt6} & \frac{2}{\sqrt6} & -\frac{1}{\sqrt6} \\ \frac{1}{\sqrt2} & 0 & \frac{1}{\sqrt2} \end{bmatrix}\begin{bmatrix} 1\\1\\1 \end{bmatrix} = \begin{bmatrix} \frac{2}{\sqrt6} \\ \sqrt2 \end{bmatrix}.$$
Back-substitution gives the solution
$$x = \begin{bmatrix} \frac43 \\ 2 \end{bmatrix}.$$
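The QR route in Exercise 28 can be checked numerically. In the sketch below (not part of the original solution, assuming NumPy), the matrix A = QR is recovered by multiplying the factors shown above; that product, [[1, 0], [2, -1], [-1, 1]], is an inference from the factors rather than a value stated in this excerpt.

```python
import numpy as np

# Q and R as in the solution to Exercise 28
Q = np.array([[1/np.sqrt(6), 1/np.sqrt(2)],
              [2/np.sqrt(6), 0],
              [-1/np.sqrt(6), 1/np.sqrt(2)]])
R = np.array([[np.sqrt(6), -np.sqrt(6)/2],
              [0, 1/np.sqrt(2)]])
bvec = np.array([1.0, 1, 1])

# Back-substitution on the upper triangular system R x = Q^T b
rhs = Q.T @ bvec
x2 = rhs[1] / R[1, 1]
x1 = (rhs[0] - R[0, 1] * x2) / R[0, 0]
print(x1, x2)    # close to 4/3 and 2

# Agrees with ordinary least squares applied to A = QR
x_ref, *_ = np.linalg.lstsq(Q @ R, bvec, rcond=None)
```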
29. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&x_1 \\ \vdots&\vdots \\ 1&x_n \end{bmatrix} = \begin{bmatrix} 1&20 \\ 1&40 \\ 1&48 \\ 1&60 \\ 1&80 \\ 1&100 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 14.5\\31\\36\\45.5\\59\\73.5 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 6&348 \\ 348&24304 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{1519}{1545} & -\frac{29}{2060} \\ -\frac{29}{2060} & \frac{1}{4120} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb \approx \begin{bmatrix} 0.918 \\ 0.730 \end{bmatrix}.$$
Thus the least squares approximation is y = 0.918 + 0.730x.
30. (a) We are looking for a line L = a + bF that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&2 \\ 1&4 \\ 1&6 \\ 1&8 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} 7.4\\9.6\\11.5\\13.6 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 4&20 \\ 20&120 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac32 & -\frac14 \\ -\frac14 & \frac{1}{20} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac32 & -\frac14 \\ -\frac14 & \frac{1}{20} \end{bmatrix}\begin{bmatrix} 1&1&1&1 \\ 2&4&6&8 \end{bmatrix}\begin{bmatrix} 7.4\\9.6\\11.5\\13.6 \end{bmatrix} = \begin{bmatrix} 5.4 \\ 1.025 \end{bmatrix}.$$
Thus the least squares approximation is L = 5.4 + 1.025F. Here a = 5.4 represents the length of the spring when no force is applied (F = 0).
(b) When a weight of 5 ounces is attached, the spring length will be about 5.4 + 1.025 · 5 = 10.525 inches.
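The spring fit and its prediction are easy to reproduce; the following sketch (a numerical check, not part of the original solution, assuming NumPy) does both.

```python
import numpy as np

# Exercise 30 data: spring length L (inches) under force F (ounces)
F = np.array([2.0, 4, 6, 8])
L = np.array([7.4, 9.6, 11.5, 13.6])

A = np.column_stack([np.ones_like(F), F])
a, slope = np.linalg.solve(A.T @ A, A.T @ L)
print(a, slope)        # 5.4 and 1.025
print(a + slope * 5)   # predicted length for a 5 ounce weight: 10.525
```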
31. (a) We are looking for a line y = a + bx that best fits the given points. Letting x = 0 correspond to 1920 and using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&0 \\ 1&10 \\ 1&20 \\ 1&30 \\ 1&40 \\ 1&50 \\ 1&60 \\ 1&70 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} 54.1\\59.7\\62.9\\68.2\\69.7\\70.8\\73.7\\75.4 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 8&280 \\ 280&14000 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{5}{12} & -\frac{1}{120} \\ -\frac{1}{120} & \frac{1}{4200} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb \approx \begin{bmatrix} 56.63 \\ 0.291 \end{bmatrix}.$$
Thus the least squares approximation is y = 56.63 + 0.291x. The model predicts that the life expectancy of someone born in 2000 is y(80) = 56.63 + 0.291 · 80 = 79.91 years.
(b) To see how good the model is, compute the least squares error:
$$e = b - Ax \approx \begin{bmatrix} -2.53 \\ 0.16 \\ 0.45 \\ 2.84 \\ 1.43 \\ -0.38 \\ -0.39 \\ -1.6 \end{bmatrix}, \quad\text{so } \|e\| \approx 4.43.$$
The model is not bad, with an error of under 10% of the initial value.
32. (a) Following the method of Example 7.28, we substitute the given values of t into the equation a + bt + ct² = y to get
a + 0.5b + 0.25c = 11
a + b + c = 17
a + 1.5b + 2.25c = 21
a + 2b + 4c = 23
a + 3b + 9c = 18,
which in matrix form is
$$\begin{bmatrix} 1&0.5&0.25 \\ 1&1&1 \\ 1&1.5&2.25 \\ 1&2&4 \\ 1&3&9 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 11\\17\\21\\23\\18 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 5&8&16.5 \\ 8&16.5&39.5 \\ 16.5&39.5&103.125 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} \approx \begin{bmatrix} 3.330&-4.082&1.031 \\ -4.082&5.735&-1.543 \\ 1.031&-1.543&0.436 \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb \approx \begin{bmatrix} 1.918 \\ 20.306 \\ -4.972 \end{bmatrix}.$$
Thus the least squares approximation is y = 1.918 + 20.306t − 4.972t².
(b) The height at which the object was released is y(0) ≈ 1.918 m. Its initial velocity was y′(0) ≈ 20.306 m/sec, and its acceleration due to gravity is y″(t) ≈ −9.944 m/s².
(c) The object hits the ground when y(t) = 0. Solving with the quadratic formula gives t ≈ −0.092 and t ≈ 4.176 seconds, so the object hits the ground after about 4.176 seconds.
33. (a) If p(t) = ce^{kt}, then p(0) = 150 means that c = 150, so that p(t) = 150e^{kt}. Taking logs gives ln p(t) = ln 150 + kt ≈ 5.011 + kt. We wish to find the value of k which best fits the data. Let t = 0 correspond to 1950 and t = 1 to 1960. Then kt = ln p(t) − ln 150 = ln(p(t)/150), so
$$A = \begin{bmatrix} 1\\2\\3\\4\\5 \end{bmatrix}, \quad x = \begin{bmatrix} k \end{bmatrix}, \quad b = \begin{bmatrix} \ln\frac{179}{150} \\ \ln\frac{203}{150} \\ \ln\frac{227}{150} \\ \ln\frac{250}{150} \\ \ln\frac{281}{150} \end{bmatrix} \approx \begin{bmatrix} 0.177\\0.303\\0.414\\0.511\\0.628 \end{bmatrix}.$$
Since $A^TA = 55$ and thus $(A^TA)^{-1} = \frac{1}{55}$, we get
$$x = (A^TA)^{-1}A^Tb = \frac{1}{55}\begin{bmatrix} 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 0.177\\0.303\\0.414\\0.511\\0.628 \end{bmatrix} \approx 0.131.$$
Thus the least squares approximation is p(t) = 150e^{0.131t}.
(b) 2010 corresponds to t = 6, so the predicted population is p(6) = 150e^{0.131·6} ≈ 329.19 million.
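Because the model has a single parameter, the normal equation here is scalar; the following sketch (a numerical check, not part of the original manual, assuming NumPy) reproduces k and the 2010 prediction.

```python
import numpy as np

# Exercise 33: fit p(t) = 150 e^{kt}; taking logs gives k * t_i = ln(p_i / 150)
t = np.array([1.0, 2, 3, 4, 5])
p = np.array([179.0, 203, 227, 250, 281])
y = np.log(p / 150)

# The normal equation is the scalar equation (t . t) k = t . y
k = (t @ y) / (t @ t)
print(k)                     # close to 0.131
print(150 * np.exp(6 * k))   # prediction for t = 6: about 329 million
```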
34. Let t = 0 correspond to 1970.
(a) For a quadratic approximation, follow the method of Example 7.28 and substitute the given values of t into the equation a + bt + ct² = y to get
a = 29.3
a + 5b + 25c = 44.7
a + 10b + 100c = 143.8
a + 15b + 225c = 371.6
a + 20b + 400c = 597.5
a + 25b + 625c = 1110.8
a + 30b + 900c = 1895.6
a + 35b + 1225c = 2476.6,
which in matrix form is
$$\begin{bmatrix} 1&0&0 \\ 1&5&25 \\ 1&10&100 \\ 1&15&225 \\ 1&20&400 \\ 1&25&625 \\ 1&30&900 \\ 1&35&1225 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 29.3\\44.7\\143.8\\371.6\\597.5\\1110.8\\1895.6\\2476.6 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 8 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 8&140&3500 \\ 140&3500&98000 \\ 3500&98000&2922500 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{17}{24} & -\frac{3}{40} & \frac{1}{600} \\ -\frac{3}{40} & \frac{53}{4200} & -\frac{1}{3000} \\ \frac{1}{600} & -\frac{1}{3000} & \frac{1}{105000} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb \approx \begin{bmatrix} 57.063 \\ -20.335 \\ 2.589 \end{bmatrix}.$$
Thus the least squares approximation is y = 57.063 − 20.335t + 2.589t².
(b) With y(t) = ce^{kt}, since y(0) = 29.3 we have c = 29.3, so that y(t) = 29.3e^{kt}. Taking logs gives ln y(t) = ln 29.3 + kt, or kt = ln(y(t)/29.3). Then
$$A = \begin{bmatrix} 0\\5\\10\\15\\20\\25\\30\\35 \end{bmatrix}, \quad b = \begin{bmatrix} \ln\frac{29.3}{29.3} \\ \ln\frac{44.7}{29.3} \\ \ln\frac{143.8}{29.3} \\ \ln\frac{371.6}{29.3} \\ \ln\frac{597.5}{29.3} \\ \ln\frac{1110.8}{29.3} \\ \ln\frac{1895.6}{29.3} \\ \ln\frac{2476.6}{29.3} \end{bmatrix} \approx \begin{bmatrix} 0\\0.422\\1.591\\2.540\\3.015\\3.635\\4.170\\4.437 \end{bmatrix}.$$
Then $A^TA = 3500$ and $A^Tb \approx 487.696$, so that $x = \frac{487.696}{3500} \approx 0.1393$. Thus y(t) = 29.3e^{0.1393t}.
(c) The models produce the following predictions:
Year         1970   1975   1980   1985   1990    1995    2000    2005
Actual       29.3   44.7  143.8  371.6  597.5  1110.8  1895.6  2476.6
Quadratic    57.1   20.1  112.6  334.6  686.0  1166.8  1777.1  2516.9
Exponential  29.3   58.8  118.0  236.8  475.1   953.5  1913.3  3839.4
The quadratic appears to be a better fit overall, even though it does a poor job in the first 15 years or so. The exponential model simply grows too fast.
(d) Using the quadratic approximation, we get for 2010 and 2015
y(40) = 57.063 − 20.335 · 40 + 2.589 · 40² = 3386.1
y(45) = 57.063 − 20.335 · 45 + 2.589 · 45² = 4384.7.
35. Using Example 7.29, taking logs gives
$$A = \begin{bmatrix} t_1\\t_2\\t_3 \end{bmatrix} = \begin{bmatrix} 30\\60\\90 \end{bmatrix}, \qquad b = \begin{bmatrix} kt_1\\kt_2\\kt_3 \end{bmatrix} = \begin{bmatrix} \ln\frac{172}{200} \\ \ln\frac{148}{200} \\ \ln\frac{128}{200} \end{bmatrix}.$$
Then $A^TA = 12600$ and $A^Tb \approx -62.8$, so that $x \approx -\frac{62.8}{12600} \approx -0.005$. Thus k = −0.005 and p(t) = ce^{−0.005t}. To compute the half-life, we want to solve $\frac12 c = ce^{-0.005t}$, so that $-0.005t = \ln\frac12 = -\ln 2$; then $t = \frac{\ln 2}{0.005} \approx 139$ days.
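The decay constant and half-life can be checked numerically; this sketch (not part of the original solution, assuming NumPy) repeats the scalar least squares computation.

```python
import numpy as np

# Exercise 35: k * t_i = ln(p_i / 200) at t = 30, 60, 90 days
t = np.array([30.0, 60, 90])
p = np.array([172.0, 148, 128])
y = np.log(p / 200)

k = (t @ y) / (t @ t)        # scalar normal equation
half_life = np.log(2) / (-k)
print(k)                     # close to -0.005
print(half_life)             # close to 139 days
```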
36. We want to find the best fit z = a + bx + cy to the given data; the data give the equations
a − 4c = 0
a + 5b = 0
a + 4b − c = 1
a + b − 3c = 1
a − b − 5c = −2,
which in matrix form is
$$\begin{bmatrix} 1&0&-4 \\ 1&5&0 \\ 1&4&-1 \\ 1&1&-3 \\ 1&-1&-5 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 0\\0\\1\\1\\-2 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 5&9&-13 \\ 9&43&-2 \\ -13&-2&51 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{2189}{15} & -\frac{433}{15} & \frac{541}{15} \\ -\frac{433}{15} & \frac{86}{15} & -\frac{107}{15} \\ \frac{541}{15} & -\frac{107}{15} & \frac{134}{15} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{2189}{15} & -\frac{433}{15} & \frac{541}{15} \\ -\frac{433}{15} & \frac{86}{15} & -\frac{107}{15} \\ \frac{541}{15} & -\frac{107}{15} & \frac{134}{15} \end{bmatrix}\begin{bmatrix} 0\\7\\6 \end{bmatrix} = \begin{bmatrix} \frac{43}{3} \\ -\frac83 \\ \frac{11}{3} \end{bmatrix}.$$
Thus the least squares approximation is $z = \frac13(43 - 8x + 11y)$.
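The plane fit generalizes the line fits earlier in the section; this sketch (a numerical check, not part of the original solution, assuming NumPy and the reconstructed data above) reproduces the coefficients.

```python
import numpy as np

# Exercise 36: best-fit plane z = a + b*x + c*y through the five points
A = np.array([[1.0, 0, -4],
              [1, 5, 0],
              [1, 4, -1],
              [1, 1, -3],
              [1, -1, -5]])
z = np.array([0.0, 0, 1, 1, -2])

coef = np.linalg.solve(A.T @ A, A.T @ z)
print(coef)    # close to [43/3, -8/3, 11/3]
```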
37. We have
$$A = \begin{bmatrix} 1\\1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 2 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \tfrac12 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1\\1 \end{bmatrix}\begin{bmatrix} \tfrac12 \end{bmatrix}\begin{bmatrix} 1&1 \end{bmatrix} = \begin{bmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{bmatrix}\begin{bmatrix} 3\\4 \end{bmatrix} = \begin{bmatrix} \frac72 \\ \frac72 \end{bmatrix}.$$
38. We have
$$A = \begin{bmatrix} 1\\-2 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&-2 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 5 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \tfrac15 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1\\-2 \end{bmatrix}\begin{bmatrix} \tfrac15 \end{bmatrix}\begin{bmatrix} 1&-2 \end{bmatrix} = \begin{bmatrix} \frac15 & -\frac25 \\ -\frac25 & \frac45 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac15 & -\frac25 \\ -\frac25 & \frac45 \end{bmatrix}\begin{bmatrix} 1\\1 \end{bmatrix} = \begin{bmatrix} -\frac15 \\ \frac25 \end{bmatrix}.$$
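The projection matrices in these exercises all share two defining properties, symmetry and idempotence, which give a quick sanity check; the sketch below (not part of the original manual, assuming NumPy) verifies them for Exercise 37.

```python
import numpy as np

# Exercise 37: standard matrix of the projection onto W = span{(1, 1)}
A = np.array([[1.0], [1.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T
v = np.array([3.0, 4.0])
print(P)        # [[0.5, 0.5], [0.5, 0.5]]
print(P @ v)    # [3.5, 3.5]

# Projection matrices are symmetric and idempotent
assert np.allclose(P, P.T) and np.allclose(P @ P, P)
```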
39. We have
$$A = \begin{bmatrix} 1\\1\\1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&1&1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 3 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \tfrac13 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1\\1\\1 \end{bmatrix}\begin{bmatrix} \tfrac13 \end{bmatrix}\begin{bmatrix} 1&1&1 \end{bmatrix} = \begin{bmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{bmatrix}\begin{bmatrix} 1\\2\\3 \end{bmatrix} = \begin{bmatrix} 2\\2\\2 \end{bmatrix}.$$
40. We have
$$A = \begin{bmatrix} 2\\2\\-1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 2&2&-1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 9 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \tfrac19 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 2\\2\\-1 \end{bmatrix}\begin{bmatrix} \tfrac19 \end{bmatrix}\begin{bmatrix} 2&2&-1 \end{bmatrix} = \begin{bmatrix} \frac49 & \frac49 & -\frac29 \\ \frac49 & \frac49 & -\frac29 \\ -\frac29 & -\frac29 & \frac19 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac49 & \frac49 & -\frac29 \\ \frac49 & \frac49 & -\frac29 \\ -\frac29 & -\frac29 & \frac19 \end{bmatrix}\begin{bmatrix} 1\\0\\0 \end{bmatrix} = \begin{bmatrix} \frac49 \\ \frac49 \\ -\frac29 \end{bmatrix}.$$
41. We have
$$A = \begin{bmatrix} 1&1 \\ 0&1 \\ -1&1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&0&-1 \\ 1&1&1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 2&0 \\ 0&3 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \frac12&0 \\ 0&\frac13 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1&1 \\ 0&1 \\ -1&1 \end{bmatrix}\begin{bmatrix} \frac12&0 \\ 0&\frac13 \end{bmatrix}\begin{bmatrix} 1&0&-1 \\ 1&1&1 \end{bmatrix} = \begin{bmatrix} \frac56 & \frac13 & -\frac16 \\ \frac13 & \frac13 & \frac13 \\ -\frac16 & \frac13 & \frac56 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac56 & \frac13 & -\frac16 \\ \frac13 & \frac13 & \frac13 \\ -\frac16 & \frac13 & \frac56 \end{bmatrix}\begin{bmatrix} 1\\0\\0 \end{bmatrix} = \begin{bmatrix} \frac56 \\ \frac13 \\ -\frac16 \end{bmatrix}.$$
42. We have
$$A = \begin{bmatrix} 1&1 \\ -2&0 \\ 1&-1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&-2&1 \\ 1&0&-1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 6&0 \\ 0&2 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \frac16&0 \\ 0&\frac12 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1&1 \\ -2&0 \\ 1&-1 \end{bmatrix}\begin{bmatrix} \frac16&0 \\ 0&\frac12 \end{bmatrix}\begin{bmatrix} 1&-2&1 \\ 1&0&-1 \end{bmatrix} = \begin{bmatrix} \frac23 & -\frac13 & -\frac13 \\ -\frac13 & \frac23 & -\frac13 \\ -\frac13 & -\frac13 & \frac23 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac23 & -\frac13 & -\frac13 \\ -\frac13 & \frac23 & -\frac13 \\ -\frac13 & -\frac13 & \frac23 \end{bmatrix}\begin{bmatrix} 1\\2\\3 \end{bmatrix} = \begin{bmatrix} -1\\0\\1 \end{bmatrix}.$$
43. With the new basis, we have
$$A = \begin{bmatrix} 1&1 \\ 1&3 \\ 0&1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&1&0 \\ 1&3&1 \end{bmatrix}$$