Complete Solutions Manual
Linear Algebra
A Modern Introduction
FOURTH EDITION
David Poole
Trent University
Prepared by
Roger Lipsett
Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States
ISBN-13: 978-128586960-5
ISBN-10: 1-28586960-5
© 2015 Cengage Learning
ALL RIGHTS RESERVED. No part of this work covered by the
copyright herein may be reproduced, transmitted, stored, or
used in any form or by any means graphic, electronic, or
mechanical, including but not limited to photocopying,
recording, scanning, digitizing, taping, Web distribution,
information networks, or information storage and retrieval
systems, except as permitted under Section 107 or 108 of the
1976 United States Copyright Act, without the prior written
permission of the publisher except as may be permitted by the
license terms below.
For product information and technology assistance, contact us at
Cengage Learning Customer & Sales Support,
1-800-354-9706.
For permission to use material from this text or product, submit
all requests online at www.cengage.com/permissions
Further permissions questions can be emailed to
permissionrequest@cengage.com.
Cengage Learning
200 First Stamford Place, 4th Floor
Stamford, CT 06902
USA
Cengage Learning is a leading provider of customized
learning solutions with office locations around the globe,
including Singapore, the United Kingdom, Australia,
Mexico, Brazil, and Japan. Locate your local office at:
www.cengage.com/global.
Cengage Learning products are represented in
Canada by Nelson Education, Ltd.
To learn more about Cengage Learning Solutions,
visit www.cengage.com.
Purchase any of our products at your local college
store or at our preferred online store
www.cengagebrain.com.
NOTE: UNDER NO CIRCUMSTANCES MAY THIS MATERIAL OR ANY PORTION THEREOF BE SOLD, LICENSED, AUCTIONED,
OR OTHERWISE REDISTRIBUTED EXCEPT AS MAY BE PERMITTED BY THE LICENSE TERMS HEREIN.
READ IMPORTANT LICENSE INFORMATION
Dear Professor or Other Supplement Recipient:

Cengage Learning has provided you with this product (the “Supplement”) for your review and, to the extent that you adopt the associated textbook for use in connection with your course (the “Course”), you and your students who purchase the textbook may use the Supplement as described below. Cengage Learning has established these use limitations in response to concerns raised by authors, professors, and other users regarding the pedagogical problems stemming from unlimited distribution of Supplements.

Cengage Learning hereby grants you a nontransferable license to use the Supplement in connection with the Course, subject to the following conditions. The Supplement is for your personal, noncommercial use only and may not be reproduced, posted electronically or distributed, except that portions of the Supplement may be provided to your students IN PRINT FORM ONLY in connection with your instruction of the Course, so long as such students are advised that they may not copy or distribute any portion of the Supplement to any third party. You may not sell, license, auction, or otherwise redistribute the Supplement in any form. We ask that you take reasonable steps to protect the Supplement from unauthorized use, reproduction, or distribution. Your use of the Supplement indicates your acceptance of the conditions set forth in this Agreement. If you do not accept these conditions, you must return the Supplement unused within 30 days of receipt.

All rights (including without limitation, copyrights, patents, and trade secrets) in the Supplement are and will remain the sole and exclusive property of Cengage Learning and/or its licensors. The Supplement is furnished by Cengage Learning on an “as is” basis without any warranties, express or implied. This Agreement will be governed by and construed pursuant to the laws of the State of New York, without regard to such State’s conflict of law rules.

Thank you for your assistance in helping to safeguard the integrity of the content contained in this Supplement. We trust you find the Supplement a useful teaching tool.

Printed in the United States of America
1 2 3 4 5 6 7 17 16 15 14 13
Contents
1 Vectors 3
1.1 The Geometry and Algebra of Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Length and Angle: The Dot Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Exploration: Vectors and Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3 Lines and Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Exploration: The Cross Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
1.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2 Systems of Linear Equations 53
2.1 Introduction to Systems of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.2 Direct Methods for Solving Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Exploration: Lies My Computer Told Me . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Exploration: Partial Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Exploration: An Introduction to the Analysis of Algorithms . . . . . . . . . . . . . . . . . . . . . . 77
2.3 Spanning Sets and Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
2.5 Iterative Methods for Solving Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
3 Matrices 129
3.1 Matrix Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
3.2 Matrix Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
3.3 The Inverse of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
3.4 The LU Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
3.5 Subspaces, Basis, Dimension, and Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
3.6 Introduction to Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
3.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
4 Eigenvalues and Eigenvectors 235
4.1 Introduction to Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
4.2 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Exploration: Geometric Applications of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . 263
4.3 Eigenvalues and Eigenvectors of n × n Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 270
4.4 Similarity and Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
4.5 Iterative Methods for Computing Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
4.6 Applications and the Perron-Frobenius Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 326
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
5 Orthogonality 371
5.1 Orthogonality in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
5.2 Orthogonal Complements and Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . 379
5.3 The Gram-Schmidt Process and the QR Factorization . . . . . . . . . . . . . . . . . . . . . . 388
Exploration: The Modified QR Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
Exploration: Approximating Eigenvalues with the QR Algorithm . . . . . . . . . . . . . . . . . . . 402
5.4 Orthogonal Diagonalization of Symmetric Matrices . . . . . . . . . . . . . . . . . . . . . . . . 405
5.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
6 Vector Spaces 451
6.1 Vector Spaces and Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
6.2 Linear Independence, Basis, and Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Exploration: Magic Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
6.3 Change of Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
6.4 Linear Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
6.5 The Kernel and Range of a Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . . 498
6.6 The Matrix of a Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
Exploration: Tiles, Lattices, and the Crystallographic Restriction . . . . . . . . . . . . . . . . . . . 525
6.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
7 Distance and Approximation 537
7.1 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
Exploration: Vectors and Matrices with Complex Entries . . . . . . . . . . . . . . . . . . . . . . . 546
Exploration: Geometric Inequalities and Optimization Problems . . . . . . . . . . . . . . . . . . . 553
7.2 Norms and Distance Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
7.3 Least Squares Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
7.4 The Singular Value Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
7.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614
Chapter Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
8 Codes 633
8.1 Code Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
8.2 Error-Correcting Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
8.3 Dual Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
8.4 Linear Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
8.5 The Minimum Distance of a Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 650
Chapter 1
Vectors

1.1 The Geometry and Algebra of Vectors
1. [Figure: the points (−2, 3), (2, 3), (3, 0), and (3, −2) plotted in the plane.]
2. Since
[2, −3] + [3, 0] = [5, −3],
[2, −3] + [2, 3] = [4, 0],
[2, −3] + [−2, 3] = [0, 0],
[2, −3] + [3, −2] = [5, −5],
plotting those vectors gives
[Figure: the four sums a, b, c, d plotted in standard position.]
3. [Figure: the four vectors a, b, c, d plotted in R³.]
4. Since the heads are all at (3, 2, 1), the tails are at
[3, 2, 1] − [0, 2, 0] = [3, 0, 1],
[3, 2, 1] − [0, 2, 1] = [3, 0, 0],
[3, 2, 1] − [1, −2, 1] = [2, 4, 0],
[3, 2, 1] − [−1, −1, −2] = [4, 3, 3].
5. The four vectors AB are shown below.
[Figure: the four vectors a, b, c, d.]
In standard position, the vectors are
(a) AB = [4 − 1, 2 − (−1)] = [3, 3].
(b) AB = [2 − 0, −1 − (−2)] = [2, 1].
(c) AB = [1/2 − 2, 3 − 3/2] = [−3/2, 3/2].
(d) AB = [1/6 − 1/3, 1/2 − 1/3] = [−1/6, 1/6].
[Figure: the four vectors in standard position.]
     
6. Recall the notation that [a, b] denotes a move of a units horizontally and b units vertically. Then during the first part of the walk, the hiker walks 4 km north, so a = [0, 4]. During the second part of the walk, the hiker walks a distance of 5 km northeast, so
b = [5 cos 45°, 5 sin 45°] = [5√2/2, 5√2/2].
Thus the net displacement vector is
c = a + b = [5√2/2, 4 + 5√2/2].
7. a + b = [3, 0] + [2, 3] = [3 + 2, 0 + 3] = [5, 3].
[Figure: a, b, and a + b.]
8. b − c = [2, 3] − [−2, 3] = [2 − (−2), 3 − 3] = [4, 0].
[Figure: b, −c, and b − c.]
−2
5
9. d − c =
−
=
.
−2
3
−5
2
3
4
1
2
3
4
5
d
-1
-2
d-c
-3
-c
-4
-5
10. a + d = [3, 0] + [3, −2] = [3 + 3, 0 + (−2)] = [6, −2].
[Figure: a, d, and a + d.]
11. 2a + 3c = 2[0, 2, 0] + 3[1, −2, 1] = [2 · 0, 2 · 2, 2 · 0] + [3 · 1, 3 · (−2), 3 · 1] = [3, −2, 3].
12. 3b − 2c + d = 3[3, 2, 1] − 2[1, −2, 1] + [−1, −1, −2] = [9, 6, 3] + [−2, 4, −2] + [−1, −1, −2] = [6, 9, −1].
13. u = [cos 60°, sin 60°] = [1/2, √3/2], and v = [cos 210°, sin 210°] = [−√3/2, −1/2], so that
u + v = [1/2 − √3/2, √3/2 − 1/2],
u − v = [1/2 + √3/2, √3/2 + 1/2].
14. (a) AB = b − a.
(b) Since OC = AB, we have BC = OC − b = (b − a) − b = −a.
(c) AD = −2a.
(d) CF = −2 OC = −2 AB = −2(b − a) = 2(a − b).
(e) AC = AB + BC = (b − a) + (−a) = b − 2a.
(f) Note that FA and OB are equal, and that DE = −AB. Then
BC + DE + FA = −a − AB + OB = −a − (b − a) + b = 0.
15. 2(a − 3b) + 3(2b + a) = (2a − 6b) + (6b + 3a)   [distributivity, property (e)]
    = (2a + 3a) + (−6b + 6b)   [associativity, property (b)]
    = 5a.
16. −3(a − c) + 2(a + 2b) + 3(c − b) = (−3a + 3c) + (2a + 4b) + (3c − 3b)   [distributivity, property (e)]
    = (−3a + 2a) + (4b − 3b) + (3c + 3c)   [associativity, property (b)]
    = −a + b + 6c.
17. x − a = 2(x − 2a) = 2x − 4a ⇒ x − 2x = a − 4a ⇒ −x = −3a ⇒ x = 3a.
18. x + 2a − b = 3(x + a) − 2(2a − b) = 3x + 3a − 4a + 2b ⇒ x − 3x = −a − 2a + 2b + b ⇒ −2x = −3a + 3b ⇒ x = (3/2)a − (3/2)b.
19. We have 2u + 3v = 2[1, −1] + 3[1, 1] = [2 · 1 + 3 · 1, 2 · (−1) + 3 · 1] = [5, 1].
[Figure: u, v, and w = 2u + 3v plotted in the plane.]
20. We have −u − 2v = −[−2, 1] − 2[2, −2] = [−(−2) − 2 · 2, −1 − 2 · (−2)] = [−2, 3].
[Figure: u, v, and w = −u − 2v plotted in the plane.]
21. From the diagram, we see that w = −2u + 4v.
[Figure: w decomposed along u and v.]
22. From the diagram, we see that w = 2u + 3v.
[Figure: w decomposed along u and v.]
23. Property (d) states that u + (−u) = 0. The first diagram below shows u along with −u. Then, as the
diagonal of the parallelogram, the resultant vector is 0.
Property (e) states that c(u + v) = cu + cv. The second figure illustrates this.
[Figures: u and −u as the sides of a degenerate parallelogram whose diagonal is 0; the parallelogram with sides cu and cv whose diagonal is c(u + v) = cu + cv.]
24. Let u = [u1 , u2 , . . . , un ] and v = [v1 , v2 , . . . , vn ], and let c and d be scalars in R.
Property (d):
u + (−u) = [u1 , u2 , . . . , un ] + (−1[u1 , u2 , . . . , un ])
= [u1 , u2 , . . . , un ] + [−u1 , −u2 , . . . , −un ]
= [u1 − u1 , u2 − u2 , . . . , un − un ]
= [0, 0, . . . , 0] = 0.
Property (e):
c(u + v) = c ([u1 , u2 , . . . , un ] + [v1 , v2 , . . . , vn ])
= c ([u1 + v1 , u2 + v2 , . . . , un + vn ])
= [c(u1 + v1 ), c(u2 + v2 ), . . . , c(un + vn )]
= [cu1 + cv1 , cu2 + cv2 , . . . , cun + cvn ]
= [cu1 , cu2 , . . . , cun ] + [cv1 , cv2 , . . . , cvn ]
= c[u1 , u2 , . . . , un ] + c[v1 , v2 , . . . , vn ]
= cu + cv.
Property (f):
(c + d)u = (c + d)[u1 , u2 , . . . , un ]
= [(c + d)u1 , (c + d)u2 , . . . , (c + d)un ]
= [cu1 + du1 , cu2 + du2 , . . . , cun + dun ]
= [cu1 , cu2 , . . . , cun ] + [du1 , du2 , . . . , dun ]
= c[u1 , u2 , . . . , un ] + d[u1 , u2 , . . . , un ]
= cu + du.
Property (g):
c(du) = c(d[u1 , u2 , . . . , un ])
= c[du1 , du2 , . . . , dun ]
= [cdu1 , cdu2 , . . . , cdun ]
= [(cd)u1 , (cd)u2 , . . . , (cd)un ]
= (cd)[u1 , u2 , . . . , un ]
= (cd)u.
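The componentwise proofs above can be spot-checked on concrete vectors; the following is a minimal Python sketch (the sample vectors, scalars, and helper names are my own, not from the text):

```python
# Spot-check properties (d)-(g) on concrete vectors in R^3.
u = [1.0, -2.0, 3.0]
v = [4.0, 0.5, -1.0]
c, d = 2.0, -3.0

def add(x, y):
    # Componentwise vector addition.
    return [a + b for a, b in zip(x, y)]

def scale(k, x):
    # Scalar multiplication.
    return [k * a for a in x]

assert add(u, scale(-1, u)) == [0.0, 0.0, 0.0]               # (d) u + (-u) = 0
assert scale(c, add(u, v)) == add(scale(c, u), scale(c, v))  # (e) c(u + v) = cu + cv
assert scale(c + d, u) == add(scale(c, u), scale(d, u))      # (f) (c + d)u = cu + du
assert scale(c, scale(d, u)) == scale(c * d, u)              # (g) c(du) = (cd)u
print("properties (d)-(g) hold for the sample vectors")
```

A numerical check like this does not prove the properties, of course — the componentwise algebra above does — but it catches transcription errors quickly.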
25. u + v = [0, 1] + [1, 1] = [1, 0].
26. u + v = [1, 1, 0] + [1, 1, 1] = [0, 0, 1].
1.1. THE GEOMETRY AND ALGEBRA OF VECTORS
9
27. u + v = [1, 0, 1, 1] + [1, 1, 1, 1] = [0, 1, 0, 0].
28. u + v = [1, 1, 0, 1, 0] + [0, 1, 1, 1, 0] = [1, 0, 1, 0, 0].
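The sums in Exercises 25–28 are componentwise addition modulo 2. A quick Python sketch, using the exercises above as test cases (the helper name `add_mod` is mine):

```python
def add_mod(u, v, m=2):
    """Add two vectors componentwise modulo m (m = 2 for binary vectors)."""
    return [(a + b) % m for a, b in zip(u, v)]

print(add_mod([0, 1], [1, 1]))                    # [1, 0]        (Exercise 25)
print(add_mod([1, 1, 0], [1, 1, 1]))              # [0, 0, 1]     (Exercise 26)
print(add_mod([1, 0, 1, 1], [1, 1, 1, 1]))        # [0, 1, 0, 0]  (Exercise 27)
print(add_mod([1, 1, 0, 1, 0], [0, 1, 1, 1, 0]))  # [1, 0, 1, 0, 0] (Exercise 28)
```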
29. The addition and multiplication tables for Z4:

    + | 0 1 2 3        · | 0 1 2 3
    --+--------        --+--------
    0 | 0 1 2 3        0 | 0 0 0 0
    1 | 1 2 3 0        1 | 0 1 2 3
    2 | 2 3 0 1        2 | 0 2 0 2
    3 | 3 0 1 2        3 | 0 3 2 1

30. The addition and multiplication tables for Z5:

    + | 0 1 2 3 4      · | 0 1 2 3 4
    --+----------      --+----------
    0 | 0 1 2 3 4      0 | 0 0 0 0 0
    1 | 1 2 3 4 0      1 | 0 1 2 3 4
    2 | 2 3 4 0 1      2 | 0 2 4 1 3
    3 | 3 4 0 1 2      3 | 0 3 1 4 2
    4 | 4 0 1 2 3      4 | 0 4 3 2 1
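Tables like those in Exercises 29 and 30 can be generated mechanically for any modulus. A minimal Python sketch (the function name `cayley_tables` is my own):

```python
def cayley_tables(m):
    """Return the addition and multiplication tables of Z_m as lists of rows."""
    add = [[(i + j) % m for j in range(m)] for i in range(m)]
    mul = [[(i * j) % m for j in range(m)] for i in range(m)]
    return add, mul

add4, mul4 = cayley_tables(4)
print(mul4[2])  # row for 2 in the Z_4 multiplication table: [0, 2, 0, 2]
print(mul4[3])  # row for 3 in the Z_4 multiplication table: [0, 3, 2, 1]
```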
31. 2 + 2 + 2 = 6 = 0 in Z3 .
32. 2 · 2 · 2 = 8 = 2 · 3 + 2 = 2 in Z3 .
33. 2(2 + 1 + 2) = 2 · 2 = 3 · 1 + 1 = 1 in Z3 .
34. 3 + 1 + 2 + 3 = 4 · 2 + 1 = 1 in Z4 .
35. 2 · 3 · 2 = 4 · 3 + 0 = 0 in Z4 .
36. 3(3 + 3 + 2) = 4 · 6 + 0 = 0 in Z4 .
37. 2 + 1 + 2 + 2 + 1 = 2 in Z3 , 2 + 1 + 2 + 2 + 1 = 0 in Z4 , 2 + 1 + 2 + 2 + 1 = 3 in Z5 .
38. (3 + 4)(3 + 2 + 4 + 2) = 2 · 1 = 2 in Z5 .
39. 8(6 + 4 + 3) = 8 · 4 = 5 in Z9 .
40. 2^100 = (2^10)^10 = 1024^10 = 1^10 = 1 in Z11 , since 1024 = 93 · 11 + 1 = 1 in Z11 .
41. [2, 1, 2] + [2, 0, 1] = [1, 1, 0] in Z3^3.
42. 2[2, 2, 1] = [2 · 2, 2 · 2, 2 · 1] = [1, 1, 2] in Z3^3.
43. 2([3, 1, 1, 2] + [3, 3, 2, 1]) = 2[2, 0, 3, 3] = [2 · 2, 2 · 0, 2 · 3, 2 · 3] = [0, 0, 2, 2] in Z4^4, and
2([3, 1, 1, 2] + [3, 3, 2, 1]) = 2[1, 4, 3, 3] = [2 · 1, 2 · 4, 2 · 3, 2 · 3] = [2, 3, 1, 1] in Z5^4.
44. x = 2 + (−3) = 2 + 2 = 4 in Z5 .
45. x = 1 + (−5) = 1 + 1 = 2 in Z6 .
46. x = 2⁻¹ = 2 in Z3 .
47. No solution. 2 times anything is always even, so it cannot leave a remainder of 1 when divided by 4.
48. x = 2⁻¹ = 3 in Z5 .
49. x = 3⁻¹ · 4 = 2 · 4 = 3 in Z5 .
50. No solution. 3 times anything is always a multiple of 3, so it cannot leave a remainder of 4 when divided by 6 (which is also a multiple of 3).
51. No solution. 6 times anything is always even, so it cannot leave an odd number as a remainder when divided by 8.
52. x = 8⁻¹ · 9 = 7 · 9 = 8 in Z11 .
53. x = 2⁻¹(2 + (−3)) = 3(2 + 2) = 2 in Z5 .
54. No solution. This equation is the same as 4x = 2 − 5 = −3 = 3 in Z6 . But 4 times anything is even, so it cannot leave a remainder of 3 when divided by 6 (which is also even).
55. Add 5 to both sides to get 6x = 6, so that x = 1 or x = 5 (since 6 · 1 = 6 and 6 · 5 = 30 = 6 in Z8 ).
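The case analysis in Exercises 44–55 amounts to solving ax = b in Z_m; since Z_m is finite, a brute-force search settles each case. A sketch (the helper name `solve_mod` is mine):

```python
def solve_mod(a, b, m):
    """Return all x in Z_m with a*x = b (mod m); an empty list means no solution."""
    return [x for x in range(m) if (a * x) % m == b % m]

print(solve_mod(2, 1, 4))  # []     -- no solution, as in Exercise 47
print(solve_mod(3, 4, 5))  # [3]    -- matches Exercise 49
print(solve_mod(6, 6, 8))  # [1, 5] -- matches Exercise 55
```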
56. (a) All values. (b) All values. (c) All values.
57. (a) All a ≠ 0 in Z5 have a solution because 5 is a prime number.
(b) a = 1 and a = 5, because they have no common factors with 6 other than 1.
(c) a and m can have no common factors other than 1; that is, the greatest common divisor of a and m is gcd(a, m) = 1.
1.2 Length and Angle: The Dot Product
1. Following Example 1.15, u · v = [−1, 2] · [3, 1] = (−1) · 3 + 2 · 1 = −3 + 2 = −1.
2. Following Example 1.15, u · v = [3, −2] · [4, 6] = 3 · 4 + (−2) · 6 = 12 − 12 = 0.
3. u · v = [1, 2, 3] · [2, 3, 1] = 1 · 2 + 2 · 3 + 3 · 1 = 2 + 6 + 3 = 11.
4. u · v = 3.2 · 1.5 + (−0.6) · 4.1 + (−1.4) · (−0.2) = 4.8 − 2.46 + 0.28 = 2.62.
  

5. u · v = [1, √2, √3, 0] · [4, −√2, 0, −5] = 1 · 4 + √2 · (−√2) + √3 · 0 + 0 · (−5) = 4 − 2 = 2.

 

6. u · v = [1.12, −3.25, 2.07, −1.83] · [−2.29, 1.72, 4.33, −1.54] = 1.12 · (−2.29) + (−3.25) · 1.72 + 2.07 · 4.33 + (−1.83) · (−1.54) = 3.6265.
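The dot products above are just sums of componentwise products, so they are easy to verify mechanically. A one-function Python sketch (the helper name is mine):

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

print(round(dot([3.2, -0.6, -1.4], [1.5, 4.1, -0.2]), 2))   # 2.62 (Exercise 4)
print(round(dot([1.12, -3.25, 2.07, -1.83],
                [-2.29, 1.72, 4.33, -1.54]), 4))            # 3.6265 (Exercise 6)
```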
7. Finding a unit vector v in the same direction as a given vector u is called normalizing the vector u. Proceed as in Example 1.19:
||u|| = √((−1)² + 2²) = √5,
so a unit vector v in the same direction as u is
v = (1/||u||) u = (1/√5)[−1, 2] = [−1/√5, 2/√5].
8. Proceed as in Example 1.19:
||u|| = √(3² + (−2)²) = √(9 + 4) = √13,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/√13)[3, −2] = [3/√13, −2/√13].
9. Proceed as in Example 1.19:
||u|| = √(1² + 2² + 3²) = √14,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/√14)[1, 2, 3] = [1/√14, 2/√14, 3/√14].
10. Proceed as in Example 1.19:
||u|| = √(3.2² + (−0.6)² + (−1.4)²) = √(10.24 + 0.36 + 1.96) = √12.56 ≈ 3.544,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/3.544)[3.2, −0.6, −1.4] ≈ [0.903, −0.169, −0.395].
11. Proceed as in Example 1.19:
||u|| = √(1² + (√2)² + (√3)² + 0²) = √6,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/√6)[1, √2, √3, 0] = [√6/6, √3/3, √2/2, 0].
12. Proceed as in Example 1.19:
||u|| = √(1.12² + (−3.25)² + 2.07² + (−1.83)²) = √(1.2544 + 10.5625 + 4.2849 + 3.3489) = √19.4507 ≈ 4.410,
so a unit vector v in the direction of u is
v = (1/||u||) u = (1/4.410)[1.12, −3.25, 2.07, −1.83] ≈ [0.254, −0.737, 0.469, −0.415].
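Normalizing, as in Exercises 7–12, divides a vector by its own length; the result always has norm 1. A small Python sketch (the function name is mine):

```python
import math

def normalize(u):
    """Return the unit vector in the direction of a nonzero vector u."""
    norm = math.sqrt(sum(x * x for x in u))
    return [x / norm for x in u]

# Exercise 9: u = [1, 2, 3] normalizes to [1/sqrt(14), 2/sqrt(14), 3/sqrt(14)]
v = normalize([1, 2, 3])
print([round(x, 3) for x in v])  # [0.267, 0.535, 0.802]
```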
13. Following Example 1.20, we compute: u − v = [−1, 2] − [3, 1] = [−4, 1], so
d(u, v) = ||u − v|| = √((−4)² + 1²) = √17.
14. Following Example 1.20, we compute: u − v = [3, −2] − [4, 6] = [−1, −8], so
d(u, v) = ||u − v|| = √((−1)² + (−8)²) = √65.
     
15. Following Example 1.20, we compute: u − v = [1, 2, 3] − [2, 3, 1] = [−1, −1, 2], so
d(u, v) = ||u − v|| = √((−1)² + (−1)² + 2²) = √6.

 
 

16. Following Example 1.20, we compute: u − v = [3.2, −0.6, −1.4] − [1.5, 4.1, −0.2] = [1.7, −4.7, −1.2], so
d(u, v) = ||u − v|| = √(1.7² + (−4.7)² + (−1.2)²) = √26.42 ≈ 5.14.
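The distance computations in Exercises 13–16 all follow the same pattern, d(u, v) = ||u − v||; a short Python sketch (the function name is mine):

```python
import math

def distance(u, v):
    """Euclidean distance d(u, v) = ||u - v||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

print(round(distance([-1, 2], [3, 1]), 3))   # 4.123, i.e. sqrt(17) (Exercise 13)
print(round(distance([3.2, -0.6, -1.4],
                     [1.5, 4.1, -0.2]), 2))  # 5.14 (Exercise 16)
```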
17. (a) u · v is a real number, so ku · vk is the norm of a number, which is not defined.
(b) u · v is a scalar, while w is a vector. Thus u · v + w adds a scalar to a vector, which is not a
defined operation.
(c) u is a vector, while v · w is a scalar. Thus u · (v · w) is the dot product of a vector and a scalar,
which is not defined.
(d) c · (u + v) is the dot product of a scalar and a vector, which is not defined.
18. Let θ be the angle between u and v. Then
cos θ = u · v/(||u|| ||v||) = (3 · (−1) + 0 · 1)/(√(3² + 0²) √((−1)² + 1²)) = −3/(3√2) = −√2/2.
Thus cos θ < 0 (in fact, θ = 3π/4), so the angle between u and v is obtuse.
19. Let θ be the angle between u and v. Then
cos θ = u · v/(||u|| ||v||) = (2 · 1 + (−1) · (−2) + 1 · (−1))/(√(2² + (−1)² + 1²) √(1² + (−2)² + (−1)²)) = 3/6 = 1/2.
Thus cos θ > 0 (in fact, θ = π/3), so the angle between u and v is acute.
20. Let θ be the angle between u and v. Then
cos θ = u · v/(||u|| ||v||) = (4 · 1 + 3 · (−1) + (−1) · 1)/(√(4² + 3² + (−1)²) √(1² + (−1)² + 1²)) = 0/(√26 √3) = 0.
Thus the angle between u and v is a right angle.
21. Let θ be the angle between u and v. Note that we can determine whether θ is acute, right, or obtuse by examining the sign of u · v/(||u|| ||v||), which is determined by the sign of u · v. Since
u · v = 0.9 · (−4.5) + 2.1 · 2.6 + 1.2 · (−0.8) = 0.45 > 0,
we have cos θ > 0 so that θ is acute.
22. Let θ be the angle between u and v. Note that we can determine whether θ is acute, right, or obtuse by examining the sign of u · v/(||u|| ||v||), which is determined by the sign of u · v. Since
u · v = 1 · (−3) + 2 · 1 + 3 · 2 + 4 · (−2) = −3,
we have cos θ < 0 so that θ is obtuse.
23. Since the components of both u and v are positive, it is clear that u · v > 0, so the angle between
them is acute since it has a positive cosine.
24. From Exercise 18, cos θ = −√2/2, so that θ = cos⁻¹(−√2/2) = 3π/4 = 135°.
25. From Exercise 19, cos θ = 1/2, so that θ = cos⁻¹(1/2) = π/3 = 60°.
26. From Exercise 20, cos θ = 0, so that θ = π/2 = 90° is a right angle.
27. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors:
u · v = 0.9 · (−4.5) + 2.1 · 2.6 + 1.2 · (−0.8) = 0.45,
||u|| = √(0.9² + 2.1² + 1.2²) = √6.66,
||v|| = √((−4.5)² + 2.6² + (−0.8)²) = √27.65.
So if θ is the angle between u and v, then
cos θ = u · v/(||u|| ||v||) = 0.45/(√6.66 √27.65) = 0.45/√184.149,
so that
θ = cos⁻¹(0.45/√184.149) ≈ 1.5376 ≈ 88.10°.
Note that it is important to maintain as much precision as possible until the last step, or roundoff errors may build up.
28. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors:
u · v = 1 · (−3) + 2 · 1 + 3 · 2 + 4 · (−2) = −3,
||u|| = √(1² + 2² + 3² + 4²) = √30,
||v|| = √((−3)² + 1² + 2² + (−2)²) = √18.
So if θ is the angle between u and v, then
cos θ = u · v/(||u|| ||v||) = −3/(√30 √18) = −1/(2√15),
so that
θ = cos⁻¹(−1/(2√15)) ≈ 1.70 ≈ 97.42°.
29. As in Example 1.21, we begin by calculating u · v and the norms of the two vectors:
u · v = 1 · 5 + 2 · 6 + 3 · 7 + 4 · 8 = 70,
||u|| = √(1² + 2² + 3² + 4²) = √30,
||v|| = √(5² + 6² + 7² + 8²) = √174.
So if θ is the angle between u and v, then
cos θ = u · v/(||u|| ||v||) = 70/(√30 √174) = 35/(3√145),
so that
θ = cos⁻¹(35/(3√145)) ≈ 0.2502 ≈ 14.34°.
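Angle computations like those in Exercises 27–29 follow one recipe: divide the dot product by the product of the norms and take the inverse cosine. A Python sketch (the function name is mine):

```python
import math

def angle_deg(u, v):
    """Angle between u and v in degrees, via cos(theta) = u.v / (||u|| ||v||)."""
    d = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(d / (nu * nv)))

print(angle_deg([1, 2, 3, 4], [-3, 1, 2, -2]))  # about 97.42 (Exercise 28)
print(angle_deg([1, 2, 3, 4], [5, 6, 7, 8]))    # about 14.34 (Exercise 29)
```

As the solution to Exercise 27 notes, rounding should be deferred to the very last step; the function above keeps full floating-point precision throughout.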
30. To show that △ABC is right, we need only show that one pair of its sides meets at a right angle. So let u = AB, v = BC, and w = AC. Then we must show that one of u · v, u · w, or v · w is zero in order to show that one of these pairs is orthogonal. Then
u = AB = [1 − (−3), 0 − 2] = [4, −2],  v = BC = [4 − 1, 6 − 0] = [3, 6],  w = AC = [4 − (−3), 6 − 2] = [7, 4],
and
u · v = 4 · 3 + (−2) · 6 = 12 − 12 = 0.
Since this dot product is zero, these two vectors are orthogonal, so that AB ⊥ BC and thus △ABC is a right triangle. It is unnecessary to test the remaining pairs of sides.
31. To show that △ABC is right, we need only show that one pair of its sides meets at a right angle. So let u = AB, v = BC, and w = AC. Then we must show that one of u · v, u · w, or v · w is zero in order to show that one of these pairs is orthogonal. Then
u = AB = [−3 − 1, 2 − 1, (−2) − (−1)] = [−4, 1, −1],
v = BC = [2 − (−3), 2 − 2, −4 − (−2)] = [5, 0, −2],
w = AC = [2 − 1, 2 − 1, −4 − (−1)] = [1, 1, −3],
and
u · v = −4 · 5 + 1 · 0 − 1 · (−2) = −18,
u · w = −4 · 1 + 1 · 1 − 1 · (−3) = 0.
Since this dot product is zero, these two vectors are orthogonal, so that AB ⊥ AC and thus △ABC is a right triangle. It is unnecessary to test the remaining pair of sides.
32. As in Example 1.22, the dimensions of the cube do not matter, so we work with a cube with side length 1. Since the cube is symmetric, we need only consider one diagonal and adjacent edge. Orient the cube as shown in Figure 1.34; take the diagonal to be [1, 1, 1] and the adjacent edge to be [1, 0, 0]. Then the angle θ between these two vectors satisfies
cos θ = (1 · 1 + 1 · 0 + 1 · 0)/(√3 · 1) = 1/√3,  so  θ = cos⁻¹(1/√3) ≈ 54.74°.
Thus the diagonal and an adjacent edge meet at an angle of 54.74°.
33. As in Example 1.22, the dimensions of the cube do not matter, so we work with a cube with side length
1. Since the cube is symmetric, we need only consider one pair of diagonals. Orient the cube as shown
in Figure 1.34; take the diagonals to be u = [1, 1, 1] and v = [1, 1, 0] − [0, 0, 1] = [1, 1, −1]. Then the
dot product is
u · v = 1 · 1 + 1 · 1 + 1 · (−1) = 1 + 1 − 1 = 1 ≠ 0.
Since the dot product is nonzero, the diagonals are not orthogonal.
34. To show a parallelogram is a rhombus, it suffices to show that its diagonals are perpendicular (Euclid). But
d₁ · d₂ = [2, 2, 0] · [1, −1, 3] = 2 · 1 + 2 · (−1) + 0 · 3 = 0.
To determine its side length, note that since the diagonals are perpendicular, one half of each diagonal forms the legs of a right triangle whose hypotenuse is one side of the rhombus. So we can use the Pythagorean Theorem. Since
||d₁||² = 2² + 2² + 0² = 8,  ||d₂||² = 1² + (−1)² + 3² = 11,
we have for the side length
s² = (||d₁||/2)² + (||d₂||/2)² = 8/4 + 11/4 = 19/4,
so that s = √19/2 ≈ 2.18.
35. Since ABCD is a rectangle, opposite sides BA and CD are parallel and congruent. So we can use the method of Example 1.1 in Section 1.1 to find the coordinates of vertex D: we compute BA = [1 − 3, 2 − 6, 3 − (−2)] = [−2, −4, 5]. If BA is then translated to CD, where C = (0, 5, −4), then
D = (0 + (−2), 5 + (−4), −4 + 5) = (−2, 1, 1).
36. The resultant velocity of the airplane is the sum of the velocity of the airplane and the velocity of the wind:
r = p + w = [200, 0] + [0, −40] = [200, −40].
37. Let the x direction be east, in the direction of the current, and the y direction be north, across the river. The speed of the boat is 4 mph north, and the current is 3 mph east, so the velocity of the boat is
v = [0, 4] + [3, 0] = [3, 4].
38. Let the x direction be the direction across the river, and the y direction be downstream. Since vt = d, use the given information to find v, then solve for t and compute d. Since the speed of the boat is 20 km/h and the speed of the current is 5 km/h, we have v = [20, 5]. The width of the river is 2 km, and the distance downstream is unknown; call it y. Then d = [2, y]. Thus
vt = [20, 5]t = [2, y].
Thus 20t = 2 so that t = 0.1, and then y = 5 · 0.1 = 0.5. Therefore
(a) Ann lands 0.5 km, or half a kilometer, downstream;
(b) it takes Ann 0.1 hours, or six minutes, to cross the river.
Note that the river flow does not increase the time required to cross the river, since its velocity is
perpendicular to the direction of travel.
39. We want to find the angle between Bert's resultant vector, r, and his velocity vector upstream, v. Let the first coordinate of the vector be the direction across the river, and the second be the direction upstream. Bert's velocity vector directly across the river is unknown, say u = [x, 0]. His velocity vector upstream compensates for the downstream flow, so v = [0, 1]. So the resultant vector is
r = u + v = [x, 0] + [0, 1] = [x, 1].
Since Bert's speed is 2 mph, we have ||r|| = 2. Thus
x² + 1 = ||r||² = 4,  so that  x = √3.
If θ is the angle between r and v, then
cos θ = r · v/(||r|| ||v||) = 1/2,
so that
θ = cos⁻¹(1/2) = 60°.
40. We have
proj_u(v) = ((u · v)/(u · u)) u = ((−1) · (−2) + 1 · 4)/((−1) · (−1) + 1 · 1) [−1, 1] = 3[−1, 1] = [−3, 3].
A graph of the situation (with proj_u(v) in gray, and the perpendicular from v to the projection also drawn):
[Figure: u, v, and proj_u(v).]
41. We have
proj_u(v) = ((u · v)/(u · u)) u = ((3/5) · 1 + (−4/5) · 2)/((3/5) · (3/5) + (−4/5) · (−4/5)) u = (−1/1) u = −u = [−3/5, 4/5].
A graph of the situation (with proj_u(v) in gray, and the perpendicular from v to the projection also drawn):
[Figure: u, v, and proj_u(v).]
42. We have
proj_u(v) = ((u · v)/(u · u)) u = ((1/2) · 2 + (−1/4) · 2 + (−1/2) · (−2))/((1/2)² + (−1/4)² + (−1/2)²) u = ((3/2)/(9/16)) u = (8/3)[1/2, −1/4, −1/2] = [4/3, −2/3, −4/3].
43. We have
proj_u(v) = ((u · v)/(u · u)) u = (1 · 2 + (−1) · (−3) + 1 · (−1) + (−1) · (−2))/(1 · 1 + (−1) · (−1) + 1 · 1 + (−1) · (−1)) u = (6/4) u = (3/2) u = [3/2, −3/2, 3/2, −3/2].
44. We have
proj_u(v) = ((u · v)/(u · u)) u = (0.5 · 2.1 + 1.5 · 1.2)/(0.5 · 0.5 + 1.5 · 1.5) [0.5, 1.5] = (2.85/2.5)[0.5, 1.5] = 1.14 u = [0.57, 1.71].
45. We have
proj_u(v) = ((u · v)/(u · u)) u = (3.01 · 1.34 + (−0.33) · 4.25 + 2.52 · (−1.66))/(3.01² + (−0.33)² + 2.52²) u = −(1.5523/15.5194) u ≈ −(1/10)[3.01, −0.33, 2.52] ≈ [−0.301, 0.033, −0.252].
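All of the projections in Exercises 40–45 apply the same formula, proj_u(v) = ((u · v)/(u · u)) u. A Python sketch (the helper name is mine):

```python
def proj(u, v):
    """Projection of v onto u: (u . v / u . u) u."""
    duv = sum(a * b for a, b in zip(u, v))  # u . v
    duu = sum(a * a for a in u)             # u . u
    return [duv / duu * a for a in u]

print(proj([-1, 1], [-2, 4]))        # [-3.0, 3.0] (Exercise 40)
print(proj([0.5, 1.5], [2.1, 1.2]))  # about [0.57, 1.71] (Exercise 44)
```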
46. Let u = AB = [2 − 1, 2 − (−1)] = [1, 3] and v = AC = [4 − 1, 0 − (−1)] = [3, 1].
(a) To compute the projection, we need
u · v = [1, 3] · [3, 1] = 6,  u · u = [1, 3] · [1, 3] = 10.
Thus
proj_u(v) = ((u · v)/(u · u)) u = (3/5)[1, 3] = [3/5, 9/5],
so that
v − proj_u(v) = [3, 1] − [3/5, 9/5] = [12/5, −4/5].
Then
||u|| = √(1² + 3²) = √10,  ||v − proj_u(v)|| = √((12/5)² + (−4/5)²) = 4√10/5,
so that finally
A = (1/2) ||u|| ||v − proj_u(v)|| = (1/2) √10 · 4√10/5 = 4.
(b) We already know u · v = 6 and ||u|| = √10 from part (a). Also, ||v|| = √(3² + 1²) = √10. So
cos θ = u · v/(||u|| ||v||) = 6/(√10 √10) = 3/5,
so that
sin θ = √(1 − cos² θ) = √(1 − (3/5)²) = 4/5.
Thus
A = (1/2) ||u|| ||v|| sin θ = (1/2) √10 √10 · (4/5) = 4.
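The area recipe of part (a) — half the base ||u|| times the height ||v − proj_u(v)|| — works in any dimension. A Python sketch (the function name is mine):

```python
import math

def triangle_area(u, v):
    """Area of the triangle spanned by u and v: (1/2) ||u|| ||v - proj_u(v)||."""
    duv = sum(a * b for a, b in zip(u, v))  # u . v
    duu = sum(a * a for a in u)             # u . u
    perp = [b - duv / duu * a for a, b in zip(u, v)]  # v minus its projection onto u
    norm = lambda w: math.sqrt(sum(x * x for x in w))
    return 0.5 * norm(u) * norm(perp)

print(triangle_area([1, 3], [3, 1]))          # 4.0, up to rounding (Exercise 46)
print(triangle_area([1, -1, 2], [2, 1, -2]))  # about 3.354 = 3*sqrt(5)/2 (Exercise 47)
```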

  

  
47. Let u = AB = [4 − 3, −2 − (−1), 6 − 4] = [1, −1, 2] and v = AC = [5 − 3, 0 − (−1), 2 − 4] = [2, 1, −2].
(a) To compute the projection, we need
u · v = [1, −1, 2] · [2, 1, −2] = −3,  u · u = [1, −1, 2] · [1, −1, 2] = 6.
Thus
proj_u(v) = ((u · v)/(u · u)) u = −(1/2)[1, −1, 2] = [−1/2, 1/2, −1],
so that
v − proj_u(v) = [2, 1, −2] − [−1/2, 1/2, −1] = [5/2, 1/2, −1].
Then
||u|| = √(1² + (−1)² + 2²) = √6,  ||v − proj_u(v)|| = √((5/2)² + (1/2)² + (−1)²) = √30/2,
so that finally
A = (1/2) ||u|| ||v − proj_u(v)|| = (1/2) √6 · √30/2 = 3√5/2.
(b) We already know u · v = −3 and ||u|| = √6 from part (a). Also, ||v|| = √(2² + 1² + (−2)²) = 3. So
cos θ = u · v/(||u|| ||v||) = −3/(3√6) = −√6/6,
so that
sin θ = √(1 − cos² θ) = √(1 − 6/36) = √30/6.
Thus
A = (1/2) ||u|| ||v|| sin θ = (1/2) √6 · 3 · √30/6 = 3√5/2.
48. Two vectors u and v are orthogonal if and only if their dot product u·v = 0. So we set u·v = 0 and solve for k:
u·v = [2, 3]·[k + 1, k − 1] = 0 ⇒ 2(k + 1) + 3(k − 1) = 0 ⇒ 5k − 1 = 0 ⇒ k = 1/5.
Substituting into the formula for v gives
v = [1/5 + 1, 1/5 − 1] = [6/5, −4/5].
As a check, we compute
u·v = [2, 3]·[6/5, −4/5] = 12/5 − 12/5 = 0,
and the vectors are indeed orthogonal.
49. Two vectors u and v are orthogonal if and only if their dot product u·v = 0. So we set u·v = 0 and solve for k:
u·v = [1, −1, 2]·[k², k, −3] = 0 ⇒ k² − k − 6 = 0 ⇒ (k + 2)(k − 3) = 0 ⇒ k = −2, 3.
Substituting into the formula for v gives
k = −2 : v₁ = [(−2)², −2, −3] = [4, −2, −3],
k = 3 : v₂ = [3², 3, −3] = [9, 3, −3].
As a check, we compute
u·v₁ = [1, −1, 2]·[4, −2, −3] = 1·4 − 1·(−2) + 2·(−3) = 0,
u·v₂ = [1, −1, 2]·[9, 3, −3] = 1·9 − 1·3 + 2·(−3) = 0,
and the vectors are indeed orthogonal.
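A quick numeric check of Exercises 48 and 49 (a plain-Python sketch; note that the roots of k² − k − 6 = 0 are k = −2 and k = 3):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Exercise 48: u = [2, 3], v = [k + 1, k - 1]; 5k - 1 = 0 gives k = 1/5
k = 1 / 5
assert abs(dot([2, 3], [k + 1, k - 1])) < 1e-12

# Exercise 49: u = [1, -1, 2], v = [k**2, k, -3]; k**2 - k - 6 = 0 at k = -2, 3
for k in (-2, 3):
    assert dot([1, -1, 2], [k ** 2, k, -3]) == 0

print("orthogonality checks pass")
```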
50. Two vectors u and v are orthogonal if and only if their dot product u·v = 0. So we set u·v = 0 and solve for y in terms of x:
u·v = [3, 1]·[x, y] = 0 ⇒ 3x + y = 0 ⇒ y = −3x.
Substituting y = −3x back into the formula for v gives
v = [x, −3x] = x[1, −3].
Thus any vector orthogonal to [3, 1] is a multiple of [1, −3]. As a check,
[3, 1]·[x, −3x] = 3x − 3x = 0 for any value of x,
so that the vectors are indeed orthogonal.
51. As noted in the remarks just prior to Example 1.16, the zero vector 0 is orthogonal to all vectors in R². So if [a, b] = 0, any vector [x, y] will do. Now assume that [a, b] ≠ 0; that is, that either a or b is nonzero. Two vectors u and v are orthogonal if and only if their dot product u·v = 0. So we set u·v = 0 and solve for y in terms of x:
u·v = [a, b]·[x, y] = 0 ⇒ ax + by = 0.
First assume b ≠ 0. Then y = −(a/b)x, so substituting back into the expression for v we get
v = [x, −(a/b)x] = x[1, −a/b] = (x/b)[b, −a].
Next, if b = 0, then a ≠ 0, so that x = −(b/a)y, and substituting back into the expression for v gives
v = [−(b/a)y, y] = y[−b/a, 1] = −(y/a)[b, −a].
So in either case, a vector orthogonal to [a, b], if it is not the zero vector, is a multiple of [b, −a]. As a check, note that
[a, b]·[rb, −ra] = rab − rab = 0 for all values of r.
52. (a) The geometry of the vectors in Figure 1.26 suggests that if ‖u + v‖ = ‖u‖ + ‖v‖, then u and v point in the same direction. This means that the angle between them must be 0. So we first prove
Lemma 1. For all vectors u and v in R² or R³, u·v = ‖u‖ ‖v‖ if and only if the vectors point in the same direction.
Proof. Let θ be the angle between u and v. Then
cos θ = (u·v)/(‖u‖ ‖v‖),
so that cos θ = 1 if and only if u·v = ‖u‖ ‖v‖. But cos θ = 1 if and only if θ = 0, which means that u and v point in the same direction.
We can now show
Theorem 2. For all vectors u and v in R² or R³, ‖u + v‖ = ‖u‖ + ‖v‖ if and only if u and v point in the same direction.
Proof. First assume that u and v point in the same direction. Then u·v = ‖u‖ ‖v‖, and thus
‖u + v‖² = u·u + 2u·v + v·v        (by Example 1.9)
= ‖u‖² + 2u·v + ‖v‖²        (since w·w = ‖w‖² for any vector w)
= ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²        (by the lemma)
= (‖u‖ + ‖v‖)².
Since ‖u + v‖ and ‖u‖ + ‖v‖ are both nonnegative, taking square roots gives ‖u + v‖ = ‖u‖ + ‖v‖.
For the other direction, if ‖u + v‖ = ‖u‖ + ‖v‖, then their squares are equal, so that
(‖u‖ + ‖v‖)² = ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²  and  ‖u + v‖² = u·u + 2u·v + v·v
are equal. But ‖u‖² = u·u and similarly for v, so that canceling those terms gives 2u·v = 2‖u‖ ‖v‖ and thus u·v = ‖u‖ ‖v‖. Using the lemma again shows that u and v point in the same direction.
(b) The geometry of the vectors in Figure 1.26 suggests that if ‖u + v‖ = ‖u‖ − ‖v‖, then u and v point in opposite directions. In addition, since ‖u + v‖ ≥ 0, we must also have ‖u‖ ≥ ‖v‖. If they point in opposite directions, the angle between them must be π. This entire proof is exactly analogous to the proof in part (a). We first prove
Lemma 3. For all vectors u and v in R² or R³, u·v = −‖u‖ ‖v‖ if and only if the vectors point in opposite directions.
Proof. Let θ be the angle between u and v. Then
cos θ = (u·v)/(‖u‖ ‖v‖),
so that cos θ = −1 if and only if u·v = −‖u‖ ‖v‖. But cos θ = −1 if and only if θ = π, which means that u and v point in opposite directions.
We can now show
Theorem 4. For all vectors u and v in R² or R³, ‖u + v‖ = ‖u‖ − ‖v‖ if and only if u and v point in opposite directions and ‖u‖ ≥ ‖v‖.
Proof. First assume that u and v point in opposite directions and ‖u‖ ≥ ‖v‖. Then u·v = −‖u‖ ‖v‖, and thus
‖u + v‖² = u·u + 2u·v + v·v        (by Example 1.9)
= ‖u‖² + 2u·v + ‖v‖²        (since w·w = ‖w‖² for any vector w)
= ‖u‖² − 2‖u‖ ‖v‖ + ‖v‖²        (by the lemma)
= (‖u‖ − ‖v‖)².
Now, since ‖u‖ ≥ ‖v‖ by assumption, we see that both ‖u + v‖ and ‖u‖ − ‖v‖ are nonnegative, so that taking square roots gives ‖u + v‖ = ‖u‖ − ‖v‖. For the other direction, if ‖u + v‖ = ‖u‖ − ‖v‖, then first of all, since the left-hand side is nonnegative, the right-hand side must be as well, so that ‖u‖ ≥ ‖v‖. Next, we can square both sides of the equality, so that
(‖u‖ − ‖v‖)² = ‖u‖² − 2‖u‖ ‖v‖ + ‖v‖²  and  ‖u + v‖² = u·u + 2u·v + v·v
are equal. But ‖u‖² = u·u and similarly for v, so that canceling those terms gives 2u·v = −2‖u‖ ‖v‖ and thus u·v = −‖u‖ ‖v‖. Using the lemma again shows that u and v point in opposite directions.
53. Prove Theorem 1.2(b) by applying the definition of the dot product:
u·(v + w) = u₁(v₁ + w₁) + u₂(v₂ + w₂) + ··· + uₙ(vₙ + wₙ)
= u₁v₁ + u₁w₁ + u₂v₂ + u₂w₂ + ··· + uₙvₙ + uₙwₙ
= (u₁v₁ + u₂v₂ + ··· + uₙvₙ) + (u₁w₁ + u₂w₂ + ··· + uₙwₙ)
= u·v + u·w.
54. Prove the three parts of Theorem 1.2(d) by applying the definition of the dot product and various properties of real numbers:
Part 1: For any vector u, we must show u·u ≥ 0. But
u·u = u₁u₁ + u₂u₂ + ··· + uₙuₙ = u₁² + u₂² + ··· + uₙ².
Since for any real number x we know that x² ≥ 0, it follows that this sum is also nonnegative, so that u·u ≥ 0.
Part 2: We must show that if u = 0 then u·u = 0. But u = 0 means that uᵢ = 0 for all i, so that
u·u = 0·0 + 0·0 + ··· + 0·0 = 0.
Part 3: We must show that if u·u = 0, then u = 0. From Part 1, we know that
u·u = u₁² + u₂² + ··· + uₙ²,
and that uᵢ² ≥ 0 for all i. So if the dot product is to be zero, each uᵢ² must be zero, which means that uᵢ = 0 for all i and thus u = 0.
55. We must show d(u, v) = ‖u − v‖ = ‖v − u‖ = d(v, u). By definition, d(u, v) = ‖u − v‖. Then by Theorem 1.3(b) with c = −1, we have ‖−w‖ = ‖w‖ for any vector w; applying this to the vector u − v gives
‖u − v‖ = ‖−(u − v)‖ = ‖v − u‖,
which is by definition equal to d(v, u).
56. We must show that for any vectors u, v, and w that d(u, w) ≤ d(u, v) + d(v, w). This is equivalent to showing that ‖u − w‖ ≤ ‖u − v‖ + ‖v − w‖. Now substitute u − v for x and v − w for y in Theorem 1.5, giving
‖u − w‖ = ‖(u − v) + (v − w)‖ ≤ ‖u − v‖ + ‖v − w‖.
57. We must show that d(u, v) = ‖u − v‖ = 0 if and only if u = v. This follows immediately from Theorem 1.3(a), ‖w‖ = 0 if and only if w = 0, upon setting w = u − v.
58. Apply the definitions:
u·(cv) = [u₁, u₂, . . . , uₙ]·[cv₁, cv₂, . . . , cvₙ]
= u₁cv₁ + u₂cv₂ + ··· + uₙcvₙ
= cu₁v₁ + cu₂v₂ + ··· + cuₙvₙ
= c(u₁v₁ + u₂v₂ + ··· + uₙvₙ)
= c(u·v).
59. We want to show that ‖u − v‖ ≥ ‖u‖ − ‖v‖. This is equivalent to showing that ‖u‖ ≤ ‖u − v‖ + ‖v‖. This follows immediately upon setting x = u − v and y = v in Theorem 1.5.
60. If u·v = u·w, it does not follow that v = w. For example, since 0·v = 0 for every vector v in Rⁿ, the zero vector is orthogonal to every vector v. So if u = 0 in the above equality, we know nothing about v and w (as an example, 0·[1, 2] = 0·[−17, 12]). Note, however, that u·v = u·w implies that u·v − u·w = u·(v − w) = 0, so that u is orthogonal to v − w.
61. We must show that (u + v)·(u − v) = ‖u‖² − ‖v‖² for all vectors in Rⁿ. Recall that for any w in Rⁿ we have w·w = ‖w‖², and also that u·v = v·u. Then
(u + v)·(u − v) = u·u + v·u − u·v − v·v = ‖u‖² + u·v − u·v − ‖v‖² = ‖u‖² − ‖v‖².
62. (a) Let u, v ∈ Rⁿ. Then
‖u + v‖² + ‖u − v‖² = (u + v)·(u + v) + (u − v)·(u − v)
= (u·u + v·v + 2u·v) + (u·u + v·v − 2u·v)
= ‖u‖² + ‖v‖² + 2u·v + ‖u‖² + ‖v‖² − 2u·v
= 2‖u‖² + 2‖v‖².
(b) Part (a) tells us that the sum of the squares of the lengths of the diagonals of a parallelogram is equal to the sum of the squares of the lengths of its four sides.
[Figure: parallelogram with sides u, v and diagonals u + v, u − v.]
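The parallelogram law of part (a) can be sanity-checked on arbitrary numeric vectors; the vectors below are just an example.

```python
def norm_sq(u):
    """Squared length of a vector: ||u||^2 = u . u."""
    return sum(x * x for x in u)

u, v = [1.0, 2.0, -3.0], [4.0, 0.5, 2.0]   # arbitrary test vectors
s = [a + b for a, b in zip(u, v)]           # u + v
d = [a - b for a, b in zip(u, v)]           # u - v

lhs = norm_sq(s) + norm_sq(d)
rhs = 2 * norm_sq(u) + 2 * norm_sq(v)
assert abs(lhs - rhs) < 1e-9
print(lhs, rhs)  # equal, as the identity predicts
```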
63. Let u, v ∈ Rⁿ. Then
(1/4)‖u + v‖² − (1/4)‖u − v‖² = (1/4)[(u + v)·(u + v) − (u − v)·(u − v)]
= (1/4)[(u·u + v·v + 2u·v) − (u·u + v·v − 2u·v)]
= (1/4)[‖u‖² − ‖u‖² + ‖v‖² − ‖v‖² + 4u·v]
= u·v.
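This polarization identity recovers u·v from lengths alone; a quick numeric illustration on arbitrary vectors:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v = [2.0, -1.0, 0.5], [1.0, 3.0, -2.0]   # arbitrary test vectors
s = [a + b for a, b in zip(u, v)]            # u + v
d = [a - b for a, b in zip(u, v)]            # u - v

# (1/4)||u + v||^2 - (1/4)||u - v||^2 equals u . v
lhs = 0.25 * dot(s, s) - 0.25 * dot(d, d)
assert abs(lhs - dot(u, v)) < 1e-9
print(lhs, dot(u, v))
```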
64. (a) Let u, v ∈ Rⁿ. Then using the previous exercise,
‖u + v‖ = ‖u − v‖ ⇔ ‖u + v‖² = ‖u − v‖²
⇔ ‖u + v‖² − ‖u − v‖² = 0
⇔ (1/4)‖u + v‖² − (1/4)‖u − v‖² = 0
⇔ u·v = 0
⇔ u and v are orthogonal.
(b) Part (a) tells us that a parallelogram is a rectangle if and only if the lengths of its diagonals are equal.
[Figure: parallelogram with sides u, v and diagonals u + v, u − v.]
65. (a) By Exercise 61, (u + v)·(u − v) = ‖u‖² − ‖v‖². Thus (u + v)·(u − v) = 0 if and only if ‖u‖² = ‖v‖². It follows immediately that u + v and u − v are orthogonal if and only if ‖u‖ = ‖v‖.
(b) Part (a) tells us that the diagonals of a parallelogram are perpendicular if and only if the lengths of its sides are equal, i.e., if and only if it is a rhombus.
[Figure: parallelogram with sides u, v and diagonals u + v, u − v.]
66. From Example 1.9 and the fact that w·w = ‖w‖², we have ‖u + v‖² = ‖u‖² + 2u·v + ‖v‖². Taking the square root of both sides yields ‖u + v‖ = √(‖u‖² + 2u·v + ‖v‖²). Now substitute in the given values ‖u‖ = 2, ‖v‖ = √3, and u·v = 1, giving
‖u + v‖ = √(2² + 2·1 + (√3)²) = √(4 + 2 + 3) = √9 = 3.
67. From Theorem 1.4 (the Cauchy–Schwarz Inequality), we have |u·v| ≤ ‖u‖ ‖v‖. If ‖u‖ = 1 and ‖v‖ = 2, then |u·v| ≤ 2, so we cannot have u·v = 3.
68. (a) If u is orthogonal to both v and w, then u · v = u · w = 0. Then
u · (v + w) = u · v + u · w = 0 + 0 = 0,
so that u is orthogonal to v + w.
(b) If u is orthogonal to both v and w, then u · v = u · w = 0. Then
u · (sv + tw) = u · (sv) + u · (tw) = s(u · v) + t(u · w) = s · 0 + t · 0 = 0,
so that u is orthogonal to sv + tw.
69. We have
u·(v − proj_u(v)) = u·v − u·(((u·v)/(u·u)) u)
= u·v − ((u·v)/(u·u))(u·u)
= u·v − u·v
= 0.
70. (a) proj_u(proj_u(v)) = proj_u(((u·v)/(u·u)) u) = ((u·v)/(u·u)) proj_u(u) = ((u·v)/(u·u)) u = proj_u(v).
(b) Using part (a),
proj_u(v − proj_u(v)) = proj_u(v) − proj_u(((u·v)/(u·u)) u) = ((u·v)/(u·u)) u − ((u·v)/(u·u)) proj_u(u) = ((u·v)/(u·u)) u − ((u·v)/(u·u)) u = 0.
(c) From the diagram, we see that proj_u(v) ∥ u, so that proj_u(proj_u(v)) = proj_u(v). Also, (v − proj_u(v)) ⊥ u, so that proj_u(v − proj_u(v)) = 0.
[Figure: u, v, proj_u(v), v − proj_u(v), and −proj_u(v).]
71. (a) We have
(u₁² + u₂²)(v₁² + v₂²) − (u₁v₁ + u₂v₂)² = u₁²v₁² + u₁²v₂² + u₂²v₁² + u₂²v₂² − u₁²v₁² − 2u₁v₁u₂v₂ − u₂²v₂²
= u₁²v₂² + u₂²v₁² − 2u₁u₂v₁v₂
= (u₁v₂ − u₂v₁)².
But the final expression is nonnegative since it is a square. Thus the original expression is as well, showing that (u₁² + u₂²)(v₁² + v₂²) − (u₁v₁ + u₂v₂)² ≥ 0.
(b) We have
(u₁² + u₂² + u₃²)(v₁² + v₂² + v₃²) − (u₁v₁ + u₂v₂ + u₃v₃)²
= u₁²v₁² + u₁²v₂² + u₁²v₃² + u₂²v₁² + u₂²v₂² + u₂²v₃² + u₃²v₁² + u₃²v₂² + u₃²v₃²
− u₁²v₁² − 2u₁v₁u₂v₂ − u₂²v₂² − 2u₁v₁u₃v₃ − u₃²v₃² − 2u₂v₂u₃v₃
= u₁²v₂² + u₁²v₃² + u₂²v₁² + u₂²v₃² + u₃²v₁² + u₃²v₂² − 2u₁u₂v₁v₂ − 2u₁v₁u₃v₃ − 2u₂v₂u₃v₃
= (u₁v₂ − u₂v₁)² + (u₁v₃ − u₃v₁)² + (u₃v₂ − u₂v₃)².
But the final expression is nonnegative since it is the sum of three squares. Thus the original expression is as well, showing that (u₁² + u₂² + u₃²)(v₁² + v₂² + v₃²) − (u₁v₁ + u₂v₂ + u₃v₃)² ≥ 0.
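Part (b) is Lagrange's identity in R³, the algebraic heart of the Cauchy–Schwarz Inequality; a numeric spot-check on arbitrary integer vectors:

```python
u = [1, -2, 3]   # arbitrary test vectors
v = [4, 0, -5]

dot_uv = sum(a * b for a, b in zip(u, v))
lhs = sum(a * a for a in u) * sum(b * b for b in v) - dot_uv ** 2

# Sum of the three squared 2x2 "cross" differences
rhs = ((u[0] * v[1] - u[1] * v[0]) ** 2
       + (u[0] * v[2] - u[2] * v[0]) ** 2
       + (u[1] * v[2] - u[2] * v[1]) ** 2)

assert lhs == rhs and lhs >= 0
print(lhs)  # 453 for these vectors
```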
72. (a) Since proj_u(v) = ((u·v)/(u·u)) u, we have
proj_u(v) · (v − proj_u(v)) = ((u·v)/(u·u)) u · (v − ((u·v)/(u·u)) u)
= ((u·v)/(u·u)) (u·v − ((u·v)/(u·u))(u·u))
= ((u·v)/(u·u)) (u·v − u·v)
= 0,
so that proj_u(v) is orthogonal to v − proj_u(v). Since their vector sum is v, those three vectors form a right triangle with hypotenuse v, so by Pythagoras' Theorem,
‖proj_u(v)‖² ≤ ‖proj_u(v)‖² + ‖v − proj_u(v)‖² = ‖v‖².
Since norms are always nonnegative, taking square roots gives ‖proj_u(v)‖ ≤ ‖v‖.
(b)
‖proj_u(v)‖ ≤ ‖v‖ ⇔ ‖((u·v)/(u·u)) u‖ ≤ ‖v‖
⇔ ‖((u·v)/‖u‖²) u‖ ≤ ‖v‖
⇔ (|u·v|/‖u‖²) ‖u‖ ≤ ‖v‖
⇔ |u·v|/‖u‖ ≤ ‖v‖
⇔ |u·v| ≤ ‖u‖ ‖v‖,
which is the Cauchy–Schwarz Inequality.
73. Suppose proj_u(v) = cu. From the figure, we see that cos θ = c‖u‖/‖v‖. But also cos θ = (u·v)/(‖u‖ ‖v‖). Thus these two expressions are equal, i.e.,
c‖u‖/‖v‖ = (u·v)/(‖u‖ ‖v‖) ⇒ c‖u‖ = (u·v)/‖u‖ ⇒ c = (u·v)/(‖u‖ ‖u‖) = (u·v)/(u·u).
74. The basis for induction is the cases n = 1 and n = 2. The n = 1 case is the assertion that ‖v₁‖ ≤ ‖v₁‖, which is obviously true. The n = 2 case is the Triangle Inequality, which is also true.
Now assume the statement holds for n = k ≥ 2; that is, for any v₁, v₂, . . . , vₖ,
‖v₁ + v₂ + ··· + vₖ‖ ≤ ‖v₁‖ + ‖v₂‖ + ··· + ‖vₖ‖.
Let v₁, v₂, . . . , vₖ, vₖ₊₁ be any vectors. Then
‖v₁ + v₂ + ··· + vₖ + vₖ₊₁‖ = ‖v₁ + v₂ + ··· + (vₖ + vₖ₊₁)‖ ≤ ‖v₁‖ + ‖v₂‖ + ··· + ‖vₖ + vₖ₊₁‖
using the inductive hypothesis. But then using the Triangle Inequality (or the case n = 2 in this theorem), ‖vₖ + vₖ₊₁‖ ≤ ‖vₖ‖ + ‖vₖ₊₁‖. Substituting into the above gives
‖v₁ + v₂ + ··· + vₖ + vₖ₊₁‖ ≤ ‖v₁‖ + ‖v₂‖ + ··· + ‖vₖ + vₖ₊₁‖ ≤ ‖v₁‖ + ‖v₂‖ + ··· + ‖vₖ‖ + ‖vₖ₊₁‖,
which is what we were trying to prove.
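The generalized Triangle Inequality just proved can be checked numerically for any finite list of vectors; a small plain-Python sketch:

```python
import math

def norm(u):
    return math.sqrt(sum(x * x for x in u))

vectors = [[1, 2], [-3, 0.5], [4, -1], [0.25, 7]]  # arbitrary test vectors
total = [sum(components) for components in zip(*vectors)]

# ||v1 + ... + vn|| <= ||v1|| + ... + ||vn||
assert norm(total) <= sum(norm(v) for v in vectors)
print(norm(total), sum(norm(v) for v in vectors))
```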
Exploration: Vectors and Geometry
1. As in Example 1.25, let p = OP. Then p − a = AP = (1/3)AB = (1/3)(b − a), so that p = a + (1/3)(b − a) = (1/3)(2a + b). More generally, if P is the point 1/n of the way from A to B along AB, then p − a = AP = (1/n)AB = (1/n)(b − a), so that p = a + (1/n)(b − a) = (1/n)((n − 1)a + b).
2. Use the notation that the vector OX is written x. Then from Exercise 1, we have p = (1/2)(a + c) and q = (1/2)(b + c), so that
PQ = q − p = (1/2)(b + c) − (1/2)(a + c) = (1/2)(b − a) = (1/2)AB.
3. Draw AC. Then from Exercise 2, we have PQ = (1/2)AC = SR. Also draw BD. Again from Exercise 2, we have PS = (1/2)BD = QR. Thus opposite sides of the quadrilateral PQRS are equal. They are also parallel: indeed, △BPQ and △BAC are similar, since they share an angle and BP : BA = BQ : BC. Thus ∠BPQ = ∠BAC; since these angles are equal, PQ ∥ AC. Similarly, SR ∥ AC, so that PQ ∥ SR. In a like manner, we see that PS ∥ RQ. Thus PQRS is a parallelogram.
4. Following the hint, we find m, the point that is two-thirds of the distance from A to P. From Exercise 1, we have
p = (1/2)(b + c), so that m = (1/3)(2p + a) = (1/3)(2·(1/2)(b + c) + a) = (1/3)(a + b + c).
Next we find m′, the point that is two-thirds of the distance from B to Q. Again from Exercise 1, we have
q = (1/2)(a + c), so that m′ = (1/3)(2q + b) = (1/3)(2·(1/2)(a + c) + b) = (1/3)(a + b + c).
Finally we find m″, the point that is two-thirds of the distance from C to R. Again from Exercise 1, we have
r = (1/2)(a + b), so that m″ = (1/3)(2r + c) = (1/3)(2·(1/2)(a + b) + c) = (1/3)(a + b + c).
Since m = m′ = m″, all three medians intersect at the centroid, G.
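The median calculation above can be replayed for a concrete triangle; the coordinates below are an arbitrary example, and each two-thirds point lands on the same centroid (a + b + c)/3.

```python
# Vertices of an arbitrary test triangle
a, b, c = [0.0, 0.0], [4.0, 0.0], [1.0, 3.0]

# Midpoints of the opposite sides: P on BC, Q on AC, R on AB
p = [(x + y) / 2 for x, y in zip(b, c)]
q = [(x + y) / 2 for x, y in zip(a, c)]
r = [(x + y) / 2 for x, y in zip(a, b)]

# Point two-thirds of the way along each median: (1/3)(2 * midpoint + vertex)
medians = [[(2 * mi + vi) / 3 for mi, vi in zip(m, vtx)]
           for m, vtx in ((p, a), (q, b), (r, c))]
centroid = [(x + y + z) / 3 for x, y, z in zip(a, b, c)]

for m in medians:
    assert all(abs(mi - gi) < 1e-12 for mi, gi in zip(m, centroid))
print(centroid)
```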
5. With notation as in the figure, we know that AH is orthogonal to BC; that is, AH · BC = 0. Also BH is orthogonal to AC; that is, BH · AC = 0. We must show that CH · AB = 0. But
AH · BC = 0 ⇒ (h − a)·(b − c) = 0 ⇒ h·b − h·c − a·b + a·c = 0
BH · AC = 0 ⇒ (h − b)·(c − a) = 0 ⇒ h·c − h·a − b·c + b·a = 0.
Adding these two equations together and canceling like terms gives
0 = h·b − h·a − c·b + a·c = (h − c)·(b − a) = CH · AB,
so that these two are orthogonal. Thus all the altitudes intersect at the orthocenter H.
6. We are given that QK is orthogonal to AC and that PK is orthogonal to CB, and must show that RK is orthogonal to AB. By Exercise 1, we have q = (1/2)(a + c), p = (1/2)(b + c), and r = (1/2)(a + b). Thus
QK · AC = 0 ⇒ (k − q)·(c − a) = 0 ⇒ (k − (1/2)(a + c))·(c − a) = 0
PK · CB = 0 ⇒ (k − p)·(b − c) = 0 ⇒ (k − (1/2)(b + c))·(b − c) = 0.
Expanding the two dot products gives
k·c − k·a − (1/2)a·c + (1/2)a·a − (1/2)c·c + (1/2)a·c = 0
k·b − k·c − (1/2)b·b + (1/2)b·c − (1/2)c·b + (1/2)c·c = 0.
Add these two together and cancel like terms to get
0 = k·b − k·a − (1/2)b·b + (1/2)a·a = (k − (1/2)(b + a))·(b − a) = (k − r)·(b − a) = RK · AB.
Thus RK and AB are indeed orthogonal, so all the perpendicular bisectors intersect at the circumcenter.
7. Let O, the center of the circle, be the origin. Then b = −a and ‖a‖² = ‖c‖² = r², where r is the radius of the circle. We want to show that AC is orthogonal to BC. But
AC · BC = (c − a)·(c − b)
= (c − a)·(c + a)
= ‖c‖² + c·a − ‖a‖² − a·c
= (a·c − c·a) + (r² − r²) = 0.
Thus the two are orthogonal, so that ∠ACB is a right angle.
8. As in Exercise 5, we first find m, the point that is halfway from P to R. We have p = (1/2)(a + b) and r = (1/2)(c + d), so that
m = (1/2)(p + r) = (1/2)((1/2)(a + b) + (1/2)(c + d)) = (1/4)(a + b + c + d).
Similarly, we find m′, the point that is halfway from Q to S. We have q = (1/2)(b + c) and s = (1/2)(a + d), so that
m′ = (1/2)(q + s) = (1/2)((1/2)(b + c) + (1/2)(a + d)) = (1/4)(a + b + c + d).
Thus m = m′, so that PR and QS intersect at their mutual midpoints; thus, they bisect each other.
1.3 Lines and Planes
1. (a) The normal form is n·(x − p) = 0, or [3, 2]·(x − [0, 0]) = 0.
(b) Letting x = [x, y], we get
[3, 2]·([x, y] − [0, 0]) = [3, 2]·[x, y] = 3x + 2y = 0.
The general form is 3x + 2y = 0.
2. (a) The normal form is n·(x − p) = 0, or [3, −4]·(x − [1, 2]) = 0.
(b) Letting x = [x, y], we get
[3, −4]·([x, y] − [1, 2]) = [3, −4]·[x − 1, y − 2] = 3(x − 1) − 4(y − 2) = 0.
Expanding and simplifying gives the general form 3x − 4y = −5.
3. (a) In vector form, the equation of the line is x = p + td, or x = [1, 0] + t[−1, 3].
(b) Letting x = [x, y] and expanding the vector form from part (a) gives [x, y] = [1 − t, 3t], which yields the parametric form x = 1 − t, y = 3t.
4. (a) In vector form, the equation of the line is x = p + td, or x = [−4, 4] + t[1, 1].
(b) Letting x = [x, y] and expanding the vector form from part (a) gives the parametric form x = −4 + t, y = 4 + t.
5. (a) In vector form, the equation of the line is x = p + td, or x = [0, 0, 0] + t[1, −1, 4].
(b) Letting x = [x, y, z] and expanding the vector form from part (a) gives the parametric form x = t, y = −t, z = 4t.
6. (a) In vector form, the equation of the line is x = p + td, or x = [3, 0, −2] + t[2, 5, 0].
(b) Letting x = [x, y, z] and expanding the vector form from part (a) gives [x, y, z] = [3 + 2t, 5t, −2], which yields the parametric form x = 3 + 2t, y = 5t, z = −2.
7. (a) The normal form is n·(x − p) = 0, or [3, 2, 1]·(x − [0, 1, 0]) = 0.
(b) Letting x = [x, y, z], we get
[3, 2, 1]·([x, y, z] − [0, 1, 0]) = [3, 2, 1]·[x, y − 1, z] = 3x + 2(y − 1) + z = 0.
Expanding and simplifying gives the general form 3x + 2y + z = 2.
8. (a) The normal form is n·(x − p) = 0, or [2, 5, 0]·(x − [3, 0, −2]) = 0.
(b) Letting x = [x, y, z], we get
[2, 5, 0]·([x, y, z] − [3, 0, −2]) = [2, 5, 0]·[x − 3, y, z + 2] = 2(x − 3) + 5y = 0.
Expanding and simplifying gives the general form 2x + 5y = 6.
9. (a) In vector form, the equation of the plane is x = p + su + tv, or
x = [0, 0, 0] + s[2, 1, 2] + t[−3, 2, 1].
(b) Letting x = [x, y, z] and expanding the vector form from part (a) gives
[x, y, z] = [2s − 3t, s + 2t, 2s + t],
which yields the parametric form x = 2s − 3t, y = s + 2t, z = 2s + t.
10. (a) In vector form, the equation of the plane is x = p + su + tv, or
x = [6, −4, −3] + s[0, 1, 1] + t[−1, 1, 1].
(b) Letting x = [x, y, z] and expanding the vector form from part (a) gives the parametric form x = 6 − t, y = −4 + s + t, z = −3 + s + t.
11. Any pair of points on ℓ determines a direction vector, so we use P and Q. We choose P to represent the point on the line. Then a direction vector for the line is d = PQ = (3, 0) − (1, −2) = (2, 2). The vector equation for the line is x = p + td, or x = [1, −2] + t[2, 2].
12. Any pair of points on ℓ determines a direction vector, so we use P and Q. We choose P to represent the point on the line. Then a direction vector for the line is d = PQ = (−2, 1, 3) − (0, 1, −1) = (−2, 0, 4). The vector equation for the line is x = p + td, or x = [0, 1, −1] + t[−2, 0, 4].
13. We must find two direction vectors, u and v. Since P, Q, and R lie in a plane, we compute two direction vectors
u = PQ = q − p = (4, 0, 2) − (1, 1, 1) = (3, −1, 1)
v = PR = r − p = (0, 1, −1) − (1, 1, 1) = (−1, 0, −2).
Since u and v are not scalar multiples of each other, they will serve as direction vectors (if they were parallel to each other, we would have not a plane but a line). So the vector equation for the plane is x = p + su + tv, or x = [1, 1, 1] + s[3, −1, 1] + t[−1, 0, −2].
14. We must find two direction vectors, u and v. Since P, Q, and R lie in a plane, we compute two direction vectors
u = PQ = q − p = (1, 0, 1) − (1, 1, 0) = (0, −1, 1)
v = PR = r − p = (0, 1, 1) − (1, 1, 0) = (−1, 0, 1).
Since u and v are not scalar multiples of each other, they will serve as direction vectors (if they were parallel to each other, we would have not a plane but a line). So the vector equation for the plane is x = p + su + tv, or x = [1, 1, 0] + s[0, −1, 1] + t[−1, 0, 1].
15. The parametric and associated vector forms x = p + td found below are not unique.
(a) As in the remarks prior to Example 1.20, we start by letting x = t. Substituting x = t into y = 3x − 1 gives y = 3t − 1. So we get parametric equations x = t, y = 3t − 1, and corresponding vector form
[x, y] = [0, −1] + t[1, 3].
(b) In this case, since the coefficient of y is 2, we start by letting x = 2t. Substituting x = 2t into 3x + 2y = 5 gives 3·2t + 2y = 5, which gives y = −3t + 5/2. So we get parametric equations x = 2t, y = 5/2 − 3t, with corresponding vector equation
[x, y] = [0, 5/2] + t[2, −3].
Note that the equation was of the form ax + by = c with a = 3, b = 2, and that a direction vector was given by [b, −a]. This is true in general.
16. Note that x = p + t(q − p) is the line that passes through p (when t = 0) and q (when t = 1). We write d = q − p; this is a direction vector for the line through p and q.
(a) As noted above, the line p + td passes through P at t = 0 and through Q at t = 1. So as t varies from 0 to 1, the line describes the line segment PQ.
(b) As shown in Exploration: Vectors and Geometry, to find the midpoint of PQ, we start at P and travel half the length of PQ in the direction of the vector PQ = q − p. That is, the midpoint of PQ is the head of the vector p + (1/2)(q − p). Since x = p + t(q − p), we see that this line passes through the midpoint at t = 1/2, and that the midpoint is in fact p + (1/2)(q − p) = (1/2)(p + q).
(c) From part (b), the midpoint is (1/2)([2, −3] + [0, 1]) = (1/2)[2, −2] = [1, −1].
(d) From part (b), the midpoint is (1/2)([1, 0, 1] + [4, 1, −2]) = (1/2)[5, 1, −1] = [5/2, 1/2, −1/2].
(e) Again from Exploration: Vectors and Geometry, the vector whose head is 1/3 of the way from P to Q along PQ is x₁ = (1/3)(2p + q). Similarly, the vector whose head is 2/3 of the way from P to Q along PQ is also the vector one third of the way from Q to P along QP; applying the same formula gives for this point x₂ = (1/3)(2q + p). When p = [2, −3] and q = [0, 1], we get
x₁ = (1/3)(2[2, −3] + [0, 1]) = (1/3)[4, −5] = [4/3, −5/3]
x₂ = (1/3)(2[0, 1] + [2, −3]) = (1/3)[2, −1] = [2/3, −1/3].
(f) Using the formulas from part (e) with p = [1, 0, −1] and q = [4, 1, −2] gives
x₁ = (1/3)(2[1, 0, −1] + [4, 1, −2]) = (1/3)[6, 1, −4] = [2, 1/3, −4/3]
x₂ = (1/3)(2[4, 1, −2] + [1, 0, −1]) = (1/3)[9, 2, −5] = [3, 2/3, −5/3].
17. A line ℓ₁ with slope m₁ has equation y = m₁x + b₁, or −m₁x + y = b₁. Similarly, a line ℓ₂ with slope m₂ has equation y = m₂x + b₂, or −m₂x + y = b₂. Thus the normal vector for ℓ₁ is n₁ = [−m₁, 1], and the normal vector for ℓ₂ is n₂ = [−m₂, 1]. Now, ℓ₁ and ℓ₂ are perpendicular if and only if their normal vectors are perpendicular, i.e., if and only if n₁·n₂ = 0. But
n₁·n₂ = [−m₁, 1]·[−m₂, 1] = m₁m₂ + 1,
so that the normal vectors are perpendicular if and only if m₁m₂ + 1 = 0, i.e., if and only if m₁m₂ = −1.
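The slope condition can be illustrated numerically; m₁ = 2 and m₂ = −1/2 below are an arbitrary perpendicular pair.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

m1, m2 = 2.0, -0.5   # arbitrary slopes with m1 * m2 = -1
n1 = [-m1, 1]        # normal vector of y = m1 x + b1
n2 = [-m2, 1]        # normal vector of y = m2 x + b2

assert dot(n1, n2) == m1 * m2 + 1 == 0
print(dot(n1, n2))  # 0.0
```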
18. Suppose the line ℓ has direction vector d, and the plane P has normal vector n. If d·n = 0 (d and n are orthogonal), then the line ℓ is parallel to the plane P. If on the other hand d and n are parallel, so that d = cn for some scalar c, then ℓ is perpendicular to P. Here d = [2, 3, −1].
(a) Since the general form of P is 2x + 3y − z = 1, its normal vector is n = [2, 3, −1]. Since d = n, we see that ℓ is perpendicular to P.
(b) Since the general form of P is 4x − y + 5z = 0, its normal vector is n = [4, −1, 5]. Since
d·n = [2, 3, −1]·[4, −1, 5] = 2·4 + 3·(−1) − 1·5 = 0,
ℓ is parallel to P.
(c) Since the general form of P is x − y − z = 3, its normal vector is n = [1, −1, −1]. Since
d·n = [2, 3, −1]·[1, −1, −1] = 2·1 + 3·(−1) − 1·(−1) = 0,
ℓ is parallel to P.
(d) Since the general form of P is 4x + 6y − 2z = 0, its normal vector is n = [4, 6, −2]. Since
d = [2, 3, −1] = (1/2)[4, 6, −2] = (1/2)n,
ℓ is perpendicular to P.
19. Suppose the plane P₁ has normal vector n₁, and the plane P has normal vector n. If n₁·n = 0 (n₁ and n are orthogonal), then P₁ is perpendicular to P. If on the other hand n₁ and n are parallel, so that n₁ = cn, then P₁ is parallel to P. Note that in this exercise, P₁ has the equation 4x − y + 5z = 2, so that n₁ = [4, −1, 5].
(a) Since the general form of P is 2x + 3y − z = 1, its normal vector is n = [2, 3, −1]. Since
n₁·n = [4, −1, 5]·[2, 3, −1] = 4·2 − 1·3 + 5·(−1) = 0,
the normal vectors are perpendicular, and thus P₁ is perpendicular to P.
(b) Since the general form of P is 4x − y + 5z = 0, its normal vector is n = [4, −1, 5]. Since n₁ = n, P₁ is parallel to P.
(c) Since the general form of P is x − y − z = 3, its normal vector is n = [1, −1, −1]. Since
n₁·n = [4, −1, 5]·[1, −1, −1] = 4·1 − 1·(−1) + 5·(−1) = 0,
the normal vectors are perpendicular, and thus P₁ is perpendicular to P.
(d) Since the general form of P is 4x + 6y − 2z = 0, its normal vector is n = [4, 6, −2]. Since
n₁·n = [4, −1, 5]·[4, 6, −2] = 4·4 − 1·6 + 5·(−2) = 0,
the normal vectors are perpendicular, and thus P₁ is perpendicular to P.
20. Since the vector form is x = p + td, we use the given information to determine p and d. The general equation of the given line is 2x − 3y = 1, so its normal vector is n = [2, −3]. Our line is perpendicular to that line, so it has direction vector d = n = [2, −3]. Furthermore, since our line passes through the point P = (2, −1), we have p = [2, −1]. Thus the vector form of the line perpendicular to 2x − 3y = 1 through the point P = (2, −1) is
[x, y] = [2, −1] + t[2, −3].
21. Since the vector form is x = p + td, we use the given information to determine p and d. The general equation of the given line is 2x − 3y = 1, so its normal vector is n = [2, −3]. Our line is parallel to that line, so it has direction vector d = [3, 2] (note that d·n = 0). Since our line passes through the point P = (2, −1), we have p = [2, −1], so that the vector equation of the line parallel to 2x − 3y = 1 through the point P = (2, −1) is
[x, y] = [2, −1] + t[3, 2].
22. Since the vector form is x = p + td, we use the given information to determine p and d. A line is perpendicular to a plane if its direction vector d is the normal vector n of the plane. The general equation of the given plane is x − 3y + 2z = 5, so its normal vector is n = [1, −3, 2]. Thus the direction vector of our line is d = [1, −3, 2]. Furthermore, since our line passes through the point P = (−1, 0, 3), we have p = [−1, 0, 3]. So the vector form of the line perpendicular to x − 3y + 2z = 5 through P = (−1, 0, 3) is
[x, y, z] = [−1, 0, 3] + t[1, −3, 2].
23. Since the vector form is x = p + td, we use the given information to determine p and d. Since the given line has parametric equations x = 1 − t, y = 2 + 3t, z = −2 − t, it has vector form
[x, y, z] = [1, 2, −2] + t[−1, 3, −1].
So its direction vector is [−1, 3, −1], and this must be the direction vector d of the line we want, which is parallel to the given line. Since our line passes through the point P = (−1, 0, 3), we have p = [−1, 0, 3]. So the vector form of the line parallel to the given line through P = (−1, 0, 3) is
[x, y, z] = [−1, 0, 3] + t[−1, 3, −1].
24. Since the normal form is n·x = n·p, we use the given information to determine n and p. Note that a plane is parallel to a given plane if their normal vectors are equal. Since the general form of the given plane is 6x − y + 2z = 3, its normal vector is n = [6, −1, 2], so this must be a normal vector of the desired plane as well. Furthermore, since our plane passes through the point P = (0, −2, 5), we have p = [0, −2, 5]. So the normal form of the plane parallel to 6x − y + 2z = 3 through (0, −2, 5) is
[6, −1, 2]·[x, y, z] = [6, −1, 2]·[0, −2, 5], or [6, −1, 2]·[x, y, z] = 12.
25. Using Figure 1.34 in Section 1.2 for reference, we will find a normal vector n and a point vector p for each of the sides, then substitute into n·x = n·p to get an equation for each plane.
(a) Start with P₁, determined by the face of the cube in the yz-plane. Clearly a normal vector for P₁ is n = [1, 0, 0], or any vector parallel to the x-axis. Also, the plane passes through P = (0, 0, 0), so we set p = [0, 0, 0]. Then substituting gives
[1, 0, 0]·[x, y, z] = [1, 0, 0]·[0, 0, 0], or x = 0.
So the general equation for P₁ is x = 0. Applying the same argument to the plane P₂ determined by the face in the xz-plane gives a general equation of y = 0, and similarly the plane P₃ determined by the face in the xy-plane gives a general equation of z = 0.
Now consider P₄, the plane containing the face parallel to the face in the yz-plane but passing through (1, 1, 1). Since P₄ is parallel to P₁, its normal vector is also [1, 0, 0]; since it passes through (1, 1, 1), we set p = [1, 1, 1]. Then substituting gives
[1, 0, 0]·[x, y, z] = [1, 0, 0]·[1, 1, 1], or x = 1.
So the general equation for P₄ is x = 1. Similarly, the general equations for P₅ and P₆ are y = 1 and z = 1.
(b) Let n = [x, y, z] be a normal vector for the desired plane P. Since P is perpendicular to the xy-plane, their normal vectors must be orthogonal. Thus
[x, y, z]·[0, 0, 1] = x·0 + y·0 + z·1 = z = 0.
Thus z = 0, so the normal vector is of the form n = [x, y, 0]. But the normal vector is also perpendicular to the plane in question, by definition. Since that plane contains both the origin and (1, 1, 1), the normal vector is orthogonal to (1, 1, 1) − (0, 0, 0):
[x, y, 0]·[1, 1, 1] = x·1 + y·1 + 0·1 = x + y = 0.
Thus x + y = 0, so that y = −x. So finally, a normal vector to P is given by n = [x, −x, 0] for any nonzero x. We may as well choose x = 1, giving n = [1, −1, 0]. Since the plane passes through (0, 0, 0), we let p = 0. Then substituting in n·x = n·p gives
[1, −1, 0]·[x, y, z] = [1, −1, 0]·[0, 0, 0], or x − y = 0.
Thus the general equation for the plane perpendicular to the xy-plane and containing the diagonal from the origin to (1, 1, 1) is x − y = 0.
(c) As in Example 1.22 (Figure 1.34) in Section 1.2, use u = [0, 1, 1] and v = [1, 0, 1] as two vectors in the required plane. If n = [x, y, z] is a normal vector to the plane, then n·u = 0 = n·v:
n·u = [x, y, z]·[0, 1, 1] = y + z = 0 ⇒ y = −z,  n·v = [x, y, z]·[1, 0, 1] = x + z = 0 ⇒ x = −z.
Thus the normal vector is of the form n = [−z, −z, z] for any z. Taking z = −1 gives n = [1, 1, −1]. Now, the side diagonals pass through (0, 0, 0), so set p = 0. Then n·x = n·p yields
[1, 1, −1]·[x, y, z] = [1, 1, −1]·[0, 0, 0], or x + y − z = 0.
The general equation for the plane containing the side diagonals is x + y − z = 0.
26. Finding the distance between points A and B is equivalent to finding d(a, b), where a is the vector
from the origin to A, and similarly for b. Given x = [x, y, z], p = [1, 0, −2], and q = [5, 2, 4], we want
to solve d(x, p) = d(x, q); that is,
d(x, p) = √((x − 1)² + (y − 0)² + (z + 2)²) = √((x − 5)² + (y − 2)² + (z − 4)²) = d(x, q).
1.3. LINES AND PLANES
35
Squaring both sides gives

(x − 1)² + (y − 0)² + (z + 2)² = (x − 5)² + (y − 2)² + (z − 4)²
⇒ x² − 2x + 1 + y² + z² + 4z + 4 = x² − 10x + 25 + y² − 4y + 4 + z² − 8z + 16
⇒ 8x + 4y + 12z = 40
⇒ 2x + y + 3z = 10.

Thus all such points (x, y, z) lie on the plane 2x + y + 3z = 10.
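Since this exercise is purely computational, it is easy to sanity-check. The following sketch (plain Python, standard library only; the sample points are my own choices) verifies that several points satisfying 2x + y + 3z = 10 are indeed equidistant from P and Q.

```python
import math

p = (1, 0, -2)
q = (5, 2, 4)

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# A few points chosen to satisfy 2x + y + 3z = 10
for pt in [(5, 0, 0), (0, 10, 0), (1, 2, 2)]:
    assert 2 * pt[0] + pt[1] + 3 * pt[2] == 10
    assert math.isclose(dist(pt, p), dist(pt, q))
```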
27. To calculate d(Q, ℓ) = |ax₀ + by₀ − c| / √(a² + b²), we first put ℓ into general form. With d = [1, −1], we get n = [1, 1], since then n · d = 0. Then we have

n · x = n · p ⇒ [1, 1] · [x, y] = [1, 1] · [−1, 2] = 1.

Thus x + y = 1, and thus a = b = c = 1. Since Q = (2, 2) = (x₀, y₀), we have

d(Q, ℓ) = |1 · 2 + 1 · 2 − 1| / √(1² + 1²) = 3/√2 = 3√2/2.
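The formula used here is easy to package as a helper (the function name is my own; standard library only):

```python
import math

def dist_point_line_2d(q, a, b, c):
    """Distance from the point q = (x0, y0) to the line ax + by = c."""
    x0, y0 = q
    return abs(a * x0 + b * y0 - c) / math.hypot(a, b)

# Exercise 27: Q = (2, 2) and the line x + y = 1
assert math.isclose(dist_point_line_2d((2, 2), 1, 1, 1), 3 * math.sqrt(2) / 2)
```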
 
28. Comparing the given equation to x = p + td, we get P = (1, 1, 1) and d = [−2, 0, 3]. As suggested by Figure 1.63, we need to calculate the length of RQ, where R is the point on the line at the foot of the perpendicular from Q. So if v = PQ, then

PR = proj_d v,    RQ = v − proj_d v.

Now, v = PQ = q − p = [0, 1, 0] − [1, 1, 1] = [−1, 0, −1], so that

proj_d v = (d · v / d · d) d = ((−2) · (−1) + 3 · (−1)) / ((−2)² + 3²) · [−2, 0, 3] = −(1/13)[−2, 0, 3] = [2/13, 0, −3/13].

Thus

v − proj_d v = [−1, 0, −1] − [2/13, 0, −3/13] = [−15/13, 0, −10/13].

Then the distance d(Q, ℓ) from Q to ℓ is

‖v − proj_d v‖ = (5/13)‖[3, 0, 2]‖ = (5/13)√(3² + 2²) = 5√13/13.
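The projection computation above can be checked numerically; here is a minimal sketch (standard library only), where Q = (0, 1, 0) is read off from v = q − p in the solution:

```python
import math

def dist_point_line(q, p, d):
    """Distance from q to the line x = p + t d, computed as ||v - proj_d(v)||."""
    v = [qi - pi for qi, pi in zip(q, p)]
    t = sum(di * vi for di, vi in zip(d, v)) / sum(di * di for di in d)
    return math.sqrt(sum((vi - t * di) ** 2 for vi, di in zip(v, d)))

# Exercise 28: Q = (0, 1, 0), line through (1, 1, 1) with direction (-2, 0, 3)
assert math.isclose(dist_point_line((0, 1, 0), (1, 1, 1), (-2, 0, 3)),
                    5 * math.sqrt(13) / 13)
```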
29. To calculate d(Q, P) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²), we first note that the plane has equation x + y − z = 0, so that a = b = 1, c = −1, and d = 0. Also, Q = (2, 2, 2), so that x₀ = y₀ = z₀ = 2. Hence

d(Q, P) = |1 · 2 + 1 · 2 − 1 · 2 − 0| / √(1² + 1² + (−1)²) = 2/√3 = 2√3/3.
30. To calculate d(Q, P) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²), we first note that the plane has equation x − 2y + 2z = 1, so that a = 1, b = −2, c = 2, and d = 1. Also, Q = (0, 0, 0), so that x₀ = y₀ = z₀ = 0. Hence

d(Q, P) = |1 · 0 − 2 · 0 + 2 · 0 − 1| / √(1² + (−2)² + 2²) = 1/3.
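The point-to-plane distance formula used in Exercises 29 and 30 can be sketched as a short helper (function name my own, standard library only):

```python
import math

def dist_point_plane(q, a, b, c, d):
    """Distance from q = (x0, y0, z0) to the plane ax + by + cz = d."""
    x0, y0, z0 = q
    return abs(a * x0 + b * y0 + c * z0 - d) / math.sqrt(a * a + b * b + c * c)

assert math.isclose(dist_point_plane((2, 2, 2), 1, 1, -1, 0), 2 * math.sqrt(3) / 3)  # Exercise 29
assert math.isclose(dist_point_plane((0, 0, 0), 1, -2, 2, 1), 1 / 3)                 # Exercise 30
```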
31. Figure 1.66 suggests that we let v = PQ; then w = PR = proj_d v. Comparing the given line ℓ to x = p + td, we get P = (−1, 2) and d = [1, −1]. Then v = PQ = q − p = [2, 2] − [−1, 2] = [3, 0]. Next,

w = proj_d v = (d · v / d · d) d = (1 · 3 + (−1) · 0) / (1 · 1 + (−1) · (−1)) · [1, −1] = (3/2)[1, −1].

So

r = p + PR = p + proj_d v = p + w = [−1, 2] + [3/2, −3/2] = [1/2, 1/2].

So the point R on ℓ that is closest to Q is (1/2, 1/2).
32. Figure 1.66 suggests that we let v = PQ; then PR = proj_d v. Comparing the given line ℓ to x = p + td, we get P = (1, 1, 1) and d = [−2, 0, 3]. Then v = PQ = q − p = [0, 1, 0] − [1, 1, 1] = [−1, 0, −1]. Next,

proj_d v = (d · v / d · d) d = ((−2) · (−1) + 3 · (−1)) / ((−2)² + 3²) · [−2, 0, 3] = [2/13, 0, −3/13].

So

r = p + PR = p + proj_d v = [1, 1, 1] + [2/13, 0, −3/13] = [15/13, 1, 10/13].

So the point R on ℓ that is closest to Q is (15/13, 1, 10/13).
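The construction in Exercises 31 and 32 (foot of the perpendicular as p + proj_d v) can be checked exactly with rational arithmetic; a minimal sketch (function name my own):

```python
from fractions import Fraction

def closest_point_on_line(q, p, d):
    """Closest point R on the line x = p + t d to the point q."""
    v = [qi - pi for qi, pi in zip(q, p)]
    t = Fraction(sum(di * vi for di, vi in zip(d, v)),
                 sum(di * di for di in d))
    return [pi + t * di for pi, di in zip(p, d)]

# Exercise 31: line through (-1, 2) with direction (1, -1), Q = (2, 2)
assert closest_point_on_line((2, 2), (-1, 2), (1, -1)) == [Fraction(1, 2), Fraction(1, 2)]
# Exercise 32: line through (1, 1, 1) with direction (-2, 0, 3), Q = (0, 1, 0)
assert closest_point_on_line((0, 1, 0), (1, 1, 1), (-2, 0, 3)) == [Fraction(15, 13), 1, Fraction(10, 13)]
```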
33. Figure 1.67 suggests we let v = PQ, where P is some point on the plane; then QR = proj_n v. The equation of the plane is x + y − z = 0, so n = [1, 1, −1]. Setting y = 0 shows that P = (1, 0, 1) is a point on the plane. Then

v = PQ = q − p = [2, 2, 2] − [1, 0, 1] = [1, 2, 1],

so that

proj_n v = (n · v / n · n) n = ((1 · 1 + 1 · 2 − 1 · 1)/(1² + 1² + (−1)²)) [1, 1, −1] = (2/3)[1, 1, −1] = [2/3, 2/3, −2/3].

Finally,

r = p + v − proj_n v = [2, 2, 2] − [2/3, 2/3, −2/3] = [4/3, 4/3, 8/3].

Therefore, the point R in P that is closest to Q is (4/3, 4/3, 8/3).
34. Figure 1.67 suggests we let v = PQ, where P is some point on the plane; then QR = proj_n v. The equation of the plane is x − 2y + 2z = 1, so n = [1, −2, 2]. Setting y = z = 0 shows that P = (1, 0, 0) is a point on the plane. Then

v = PQ = q − p = [0, 0, 0] − [1, 0, 0] = [−1, 0, 0],

so that

proj_n v = (n · v / n · n) n = (1 · (−1)) / (1² + (−2)² + 2²) · [1, −2, 2] = −(1/9)[1, −2, 2] = [−1/9, 2/9, −2/9].

Finally,

r = p + v − proj_n v = [0, 0, 0] − [−1/9, 2/9, −2/9] = [1/9, −2/9, 2/9].

Therefore, the point R in P that is closest to Q is (1/9, −2/9, 2/9). (Note that this point does lie on the plane: 1/9 − 2(−2/9) + 2(2/9) = 1.)
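The computations in Exercises 33 and 34 can be checked exactly; a small sketch (function name my own) that moves q along n until the plane n · x = d is reached:

```python
from fractions import Fraction

def closest_point_on_plane(q, n, d):
    """Closest point on the plane n . x = d to the point q, namely q + t n."""
    t = Fraction(d - sum(ni * qi for ni, qi in zip(n, q)),
                 sum(ni * ni for ni in n))
    return [qi + t * ni for qi, ni in zip(q, n)]

# Exercise 33: plane x + y - z = 0, Q = (2, 2, 2)
assert closest_point_on_plane((2, 2, 2), (1, 1, -1), 0) == [Fraction(4, 3), Fraction(4, 3), Fraction(8, 3)]
# Exercise 34: plane x - 2y + 2z = 1, Q = (0, 0, 0); the result really lies on the plane
r = closest_point_on_plane((0, 0, 0), (1, -2, 2), 1)
assert r == [Fraction(1, 9), Fraction(-2, 9), Fraction(2, 9)]
assert r[0] - 2 * r[1] + 2 * r[2] == 1
```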
35. Since the given lines ℓ1 and ℓ2 are parallel, choose arbitrary points Q on ℓ1 and P on ℓ2, say Q = (1, 1) and P = (5, 4). The direction vector of ℓ2 is d = [−2, 3]. Then

v = PQ = q − p = [1, 1] − [5, 4] = [−4, −3],

so that

proj_d v = (d · v / d · d) d = (−2 · (−4) + 3 · (−3)) / ((−2)² + 3²) · [−2, 3] = −(1/13)[−2, 3].

Then the distance between the lines is given by

‖v − proj_d v‖ = ‖[−4, −3] + (1/13)[−2, 3]‖ = (1/13)‖[−54, −36]‖ = (18/13)‖[3, 2]‖ = (18/13)√13 = 18√13/13.
36. Since the given lines ℓ1 and ℓ2 are parallel, choose arbitrary points Q on ℓ1 and P on ℓ2, say Q = (1, 0, −1) and P = (0, 1, 1). The direction vector of ℓ2 is d = [1, 1, 1]. Then

v = PQ = q − p = [1, 0, −1] − [0, 1, 1] = [1, −1, −2],

so that

proj_d v = (d · v / d · d) d = (1 · 1 + 1 · (−1) + 1 · (−2)) / (1² + 1² + 1²) · [1, 1, 1] = −(2/3)[1, 1, 1].

Then the distance between the lines is given by

‖v − proj_d v‖ = ‖[1, −1, −2] + (2/3)[1, 1, 1]‖ = ‖[5/3, −1/3, −4/3]‖ = (1/3)√(5² + (−1)² + (−4)²) = √42/3.
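The procedure of Exercises 35 and 36 (pick one point on each parallel line, project the difference onto the common direction) can be sketched as follows (function name my own, standard library only):

```python
import math

def dist_parallel_lines(p1, p2, d):
    """Distance between parallel lines through p1 and p2 with common direction d."""
    v = [a - b for a, b in zip(p1, p2)]
    t = sum(di * vi for di, vi in zip(d, v)) / sum(di * di for di in d)
    return math.sqrt(sum((vi - t * di) ** 2 for vi, di in zip(v, d)))

# Exercise 35: Q = (1, 1) on l1, P = (5, 4) on l2, direction (-2, 3)
assert math.isclose(dist_parallel_lines((1, 1), (5, 4), (-2, 3)), 18 * math.sqrt(13) / 13)
# Exercise 36: Q = (1, 0, -1) on l1, P = (0, 1, 1) on l2, direction (1, 1, 1)
assert math.isclose(dist_parallel_lines((1, 0, -1), (0, 1, 1), (1, 1, 1)), math.sqrt(42) / 3)
```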
37. Since P1 and P2 are parallel, we choose an arbitrary point on P1 , say Q = (0, 0, 0), and compute
d(Q, P2 ). Since the equation of P2 is 2x + y − 2z = 5, we have a = 2, b = 1, c = −2, and d = 5; since
Q = (0, 0, 0), we have x0 = y0 = z0 = 0. Thus the distance is
d(P1, P2) = d(Q, P2) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²) = |2 · 0 + 1 · 0 − 2 · 0 − 5| / √(2² + 1² + 2²) = 5/3.
38. Since P1 and P2 are parallel, we choose an arbitrary point on P1 , say Q = (1, 0, 0), and compute
d(Q, P2 ). Since the equation of P2 is x + y + z = 3, we have a = b = c = 1 and d = 3; since
Q = (1, 0, 0), we have x0 = 1, y0 = 0, and z0 = 0. Thus the distance is
d(P1, P2) = d(Q, P2) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²) = |1 · 1 + 1 · 0 + 1 · 0 − 3| / √(1² + 1² + 1²) = 2/√3 = 2√3/3.
39. We wish to show that d(B, ℓ) = |ax₀ + by₀ − c| / √(a² + b²), where n = [a, b], n · a = c, and B = (x₀, y₀). If v = AB = b − a, then

n · v = n · (b − a) = n · b − n · a = [a, b] · [x₀, y₀] − c = ax₀ + by₀ − c.

Then from Figure 1.65, we see that

d(B, ℓ) = ‖proj_n v‖ = ‖(n · v / n · n) n‖ = |n · v| / ‖n‖ = |ax₀ + by₀ − c| / √(a² + b²).
 
40. We wish to show that d(B, P) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²), where n = [a, b, c], n · a = d, and B = (x₀, y₀, z₀). If v = AB = b − a, then

n · v = n · (b − a) = n · b − n · a = [a, b, c] · [x₀, y₀, z₀] − d = ax₀ + by₀ + cz₀ − d.

Then from Figure 1.65, we see that

d(B, P) = ‖proj_n v‖ = ‖(n · v / n · n) n‖ = |n · v| / ‖n‖ = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²).
41. Choose B = (x₀, y₀) on ℓ1; since ℓ1 and ℓ2 are parallel, the distance between them is d(B, ℓ2). Then since B lies on ℓ1, we have n · b = [a, b] · [x₀, y₀] = ax₀ + by₀ = c₁. Choose A on ℓ2, so that n · a = c₂. Set v = b − a. Then using the formula in Exercise 39, the distance is

d(ℓ1, ℓ2) = d(B, ℓ2) = |n · v| / ‖n‖ = |n · (b − a)| / ‖n‖ = |n · b − n · a| / ‖n‖ = |c₁ − c₂| / ‖n‖.
42. Choose B = (x₀, y₀, z₀) on P1; since P1 and P2 are parallel, the distance between them is d(B, P2). Then since B lies on P1, we have n · b = [a, b, c] · [x₀, y₀, z₀] = ax₀ + by₀ + cz₀ = d₁. Choose A on P2, so that n · a = d₂. Set v = b − a. Then using the formula in Exercise 40, the distance is

d(P1, P2) = d(B, P2) = |n · v| / ‖n‖ = |n · (b − a)| / ‖n‖ = |n · b − n · a| / ‖n‖ = |d₁ − d₂| / ‖n‖.
 
 
43. Since P1 has normal vector n₁ = [1, 1, 1] and P2 has normal vector n₂ = [2, 1, −2], the angle θ between the normal vectors satisfies

cos θ = n₁ · n₂ / (‖n₁‖ ‖n₂‖) = (1 · 2 + 1 · 1 + 1 · (−2)) / (√(1² + 1² + 1²) √(2² + 1² + (−2)²)) = 1/(3√3).

Thus θ = cos⁻¹(1/(3√3)) ≈ 78.9°.


 
44. Since P1 has normal vector n₁ = [3, −1, 2] and P2 has normal vector n₂ = [1, 4, −1], the angle θ between the normal vectors satisfies

cos θ = n₁ · n₂ / (‖n₁‖ ‖n₂‖) = (3 · 1 − 1 · 4 + 2 · (−1)) / (√(3² + (−1)² + 2²) √(1² + 4² + (−1)²)) = −3/(√14 √18) = −1/√28.

This is an obtuse angle, so the acute angle is

π − cos⁻¹(−1/√28) ≈ 79.1°.
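Exercises 43 and 44 both compute the angle between two planes from their normals; taking the absolute value of the dot product (so that an obtuse angle is automatically replaced by its acute supplement) gives a compact sketch (function names my own):

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def angle_between_planes(n1, n2):
    """Acute angle, in degrees, between planes with normal vectors n1 and n2."""
    cosang = abs(sum(a * b for a, b in zip(n1, n2))) / (norm(n1) * norm(n2))
    return math.degrees(math.acos(cosang))

assert round(angle_between_planes([1, 1, 1], [2, 1, -2]), 1) == 78.9   # Exercise 43
assert round(angle_between_planes([3, -1, 2], [1, 4, -1]), 1) == 79.1  # Exercise 44
```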
45. First, to see that P and ℓ intersect, substitute the parametric equations for ℓ into the equation for P, giving

x + y + 2z = (2 + t) + (1 − 2t) + 2(3 + t) = 9 + t = 0,

so that t = −9 represents the point of intersection, which is thus (2 + (−9), 1 − 2(−9), 3 + (−9)) = (−7, 19, −6). Now, the normal to P is n = [1, 1, 2], and a direction vector for ℓ is d = [1, −2, 1]. So if θ is the angle between n and d, then θ satisfies

cos θ = n · d / (‖n‖ ‖d‖) = (1 · 1 + 1 · (−2) + 2 · 1) / (√(1² + 1² + 2²) √(1² + (−2)² + 1²)) = 1/6,

so that θ = cos⁻¹(1/6) ≈ 80.4°. Thus the angle between the line and the plane is 90° − 80.4° ≈ 9.6°.
46. First, to see that P and ℓ intersect, substitute the parametric equations for ℓ into the equation for P, giving

4x − y − z = 4 · t − (1 + 2t) − (2 + 3t) = −t − 3 = 6,

so that t = −9 represents the point of intersection, which is thus (−9, 1 + 2 · (−9), 2 + 3 · (−9)) = (−9, −17, −25). Now, the normal to P is n = [4, −1, −1], and a direction vector for ℓ is d = [1, 2, 3]. So if θ is the angle between n and d, then θ satisfies

cos θ = n · d / (‖n‖ ‖d‖) = (4 · 1 − 1 · 2 − 1 · 3) / (√(4² + 1² + 1²) √(1² + 2² + 3²)) = −1/(√18 √14).

This corresponds to an obtuse angle, so the acute angle between the two is

π − cos⁻¹(−1/(√18 √14)) ≈ 86.4°.

Thus the angle between the line and the plane is 90° − 86.4° ≈ 3.6°.
47. We have p = v − c n, so that c n = v − p. Take the dot product of both sides with n, giving

(c n) · n = (v − p) · n ⇒ c(n · n) = v · n − p · n ⇒ c(n · n) = v · n (since p and n are orthogonal) ⇒ c = n · v / (n · n).

Note that another interpretation of the figure is that c n = proj_n v = (n · v / n · n) n, which also implies that c = n · v / (n · n). Now substitute this value of c into the original equation, giving

p = v − c n = v − (n · v / n · n) n.
 
48. (a) A normal vector to the plane x + y + z = 0 is n = [1, 1, 1]. Then

n · v = [1, 1, 1] · [1, 0, −2] = 1 · 1 + 1 · 0 + 1 · (−2) = −1,
n · n = [1, 1, 1] · [1, 1, 1] = 1 · 1 + 1 · 1 + 1 · 1 = 3,

so that c = −1/3. Then

p = v − (n · v / n · n) n = [1, 0, −2] + (1/3)[1, 1, 1] = [4/3, 1/3, −5/3].

(b) A normal vector to the plane 3x − y + z = 0 is n = [3, −1, 1]. Then

n · v = [3, −1, 1] · [1, 0, −2] = 3 · 1 − 1 · 0 + 1 · (−2) = 1,
n · n = [3, −1, 1] · [3, −1, 1] = 3 · 3 − 1 · (−1) + 1 · 1 = 11,

so that c = 1/11. Then

p = v − (n · v / n · n) n = [1, 0, −2] − (1/11)[3, −1, 1] = [8/11, 1/11, −23/11].

(c) A normal vector to the plane x − 2z = 0 is n = [1, 0, −2]. Then

n · v = [1, 0, −2] · [1, 0, −2] = 1 · 1 + 0 · 0 − 2 · (−2) = 5,
n · n = [1, 0, −2] · [1, 0, −2] = 1 · 1 + 0 · 0 − 2 · (−2) = 5,

so that c = 1. Then

p = v − (n · v / n · n) n = [1, 0, −2] − [1, 0, −2] = 0.

Note that the projection is 0 because the vector is normal to the plane, so its projection onto the plane is a single point.

(d) A normal vector to the plane 2x − 3y + z = 0 is n = [2, −3, 1]. Then

n · v = [2, −3, 1] · [1, 0, −2] = 2 · 1 − 3 · 0 + 1 · (−2) = 0,
n · n = [2, −3, 1] · [2, −3, 1] = 2 · 2 − 3 · (−3) + 1 · 1 = 14,

so that c = 0. Thus p = v = [1, 0, −2]. Note that the projection is the vector itself because the vector is parallel to the plane, so it is orthogonal to the normal vector.
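The formula from Exercise 47 that drives all four parts of Exercise 48 can be checked exactly with rational arithmetic (function name my own):

```python
from fractions import Fraction

def project_onto_plane(v, n):
    """Exercise 47's formula p = v - (n.v / n.n) n, for a plane through the origin with normal n."""
    c = Fraction(sum(ni * vi for ni, vi in zip(n, v)),
                 sum(ni * ni for ni in n))
    return [vi - c * ni for vi, ni in zip(v, n)]

v = [1, 0, -2]
assert project_onto_plane(v, [1, 1, 1]) == [Fraction(4, 3), Fraction(1, 3), Fraction(-5, 3)]  # (a)
assert project_onto_plane(v, [1, 0, -2]) == [0, 0, 0]                                         # (c)
assert project_onto_plane(v, [2, -3, 1]) == v                                                 # (d)
```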

Exploration: The Cross Product
  

 
1. (a) u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = [1 · 2 − 1 · (−1), 1 · 3 − 0 · 2, 0 · (−1) − 1 · 3] = [3, 3, −3].

(b) u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = [−1 · 1 − 2 · 1, 2 · 0 − 3 · 1, 3 · 1 − (−1) · 0] = [−3, −3, 3].

(c) u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = [2 · (−6) − 3 · (−4), 3 · 2 − (−1) · (−6), −1 · (−4) − 2 · 2] = [0, 0, 0].

(d) u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = [1 · 3 − 1 · 2, 1 · 1 − 1 · 3, 1 · 2 − 1 · 1] = [1, −2, 1].
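The componentwise definition used above translates directly into code; a minimal sketch (assuming, as reconstructed above, that part (a) uses u = [0, 1, 1], v = [3, −1, 2] and part (d) uses u = [1, 1, 1], v = [1, 2, 3]):

```python
def cross(u, v):
    """Cross product of two vectors in R^3, componentwise as in Exercise 1."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1]

assert cross([0, 1, 1], [3, -1, 2]) == [3, 3, -3]    # part (a)
assert cross([1, 1, 1], [1, 2, 3]) == [1, -2, 1]     # part (d)
```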
2. We have

e₁ × e₂ = [1, 0, 0] × [0, 1, 0] = [0 · 0 − 0 · 1, 0 · 0 − 1 · 0, 1 · 1 − 0 · 0] = [0, 0, 1] = e₃,
e₂ × e₃ = [0, 1, 0] × [0, 0, 1] = [1 · 1 − 0 · 0, 0 · 0 − 0 · 1, 0 · 0 − 1 · 0] = [1, 0, 0] = e₁,
e₃ × e₁ = [0, 0, 1] × [1, 0, 0] = [0 · 0 − 1 · 0, 1 · 1 − 0 · 0, 0 · 0 − 0 · 1] = [0, 1, 0] = e₂.
3. Two vectors are orthogonal if their dot product equals zero. But

(u × v) · u = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] · [u₁, u₂, u₃]
= (u₂v₃ − u₃v₂)u₁ + (u₃v₁ − u₁v₃)u₂ + (u₁v₂ − u₂v₁)u₃
= (u₂v₃u₁ − u₁v₃u₂) + (u₃v₁u₂ − u₂v₁u₃) + (u₁v₂u₃ − u₃v₂u₁) = 0,

(u × v) · v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] · [v₁, v₂, v₃]
= (u₂v₃ − u₃v₂)v₁ + (u₃v₁ − u₁v₃)v₂ + (u₁v₂ − u₂v₁)v₃
= (u₂v₃v₁ − u₂v₁v₃) + (u₃v₁v₂ − u₃v₂v₁) + (u₁v₂v₃ − u₁v₃v₂) = 0.
4. (a) By Exercise 1, a vector normal to the plane is

n = u × v = [0, 1, 1] × [3, −1, 2] = [1 · 2 − 1 · (−1), 1 · 3 − 0 · 2, 0 · (−1) − 1 · 3] = [3, 3, −3].

So the normal form for the equation of this plane is n · x = n · p, or

[3, 3, −3] · [x, y, z] = [3, 3, −3] · [1, 0, −2] = 9.

This simplifies to 3x + 3y − 3z = 9, or x + y − z = 3.

(b) Two vectors in the plane are u = PQ = [2, 1, 1] and v = PR = [1, 3, −2]. So by Exercise 1, a vector normal to the plane is

n = u × v = [1 · (−2) − 1 · 3, 1 · 1 − 2 · (−2), 2 · 3 − 1 · 1] = [−5, 5, 5].

So the normal form for the equation of this plane is n · x = n · p, or

[−5, 5, 5] · [x, y, z] = [−5, 5, 5] · [0, −1, 1] = 0.

This simplifies to −5x + 5y + 5z = 0, or x − y − z = 0.




5. (a) v × u = [v₂u₃ − v₃u₂, v₃u₁ − v₁u₃, v₁u₂ − v₂u₁] = −[u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = −(u × v).

(b) u × 0 = [u₁, u₂, u₃] × [0, 0, 0] = [u₂ · 0 − u₃ · 0, u₃ · 0 − u₁ · 0, u₁ · 0 − u₂ · 0] = [0, 0, 0] = 0.

(c) u × u = [u₂u₃ − u₃u₂, u₃u₁ − u₁u₃, u₁u₂ − u₂u₁] = [0, 0, 0] = 0.

(d) u × kv = [u₂kv₃ − u₃kv₂, u₃kv₁ − u₁kv₃, u₁kv₂ − u₂kv₁] = k[u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁] = k(u × v).

(e) u × ku = k(u × u) = k0 = 0 by parts (d) and (c).

(f) Compute the cross product:

u × (v + w) = [u₂(v₃ + w₃) − u₃(v₂ + w₂), u₃(v₁ + w₁) − u₁(v₃ + w₃), u₁(v₂ + w₂) − u₂(v₁ + w₁)]
= [(u₂v₃ − u₃v₂) + (u₂w₃ − u₃w₂), (u₃v₁ − u₁v₃) + (u₃w₁ − u₁w₃), (u₁v₂ − u₂v₁) + (u₁w₂ − u₂w₁)]
= u × v + u × w.
6. In each case, simply compute:

(a)

u · (v × w) = [u₁, u₂, u₃] · [v₂w₃ − v₃w₂, v₃w₁ − v₁w₃, v₁w₂ − v₂w₁]
= u₁v₂w₃ − u₁v₃w₂ + u₂v₃w₁ − u₂v₁w₃ + u₃v₁w₂ − u₃v₂w₁
= (u₂v₃ − u₃v₂)w₁ + (u₃v₁ − u₁v₃)w₂ + (u₁v₂ − u₂v₁)w₃
= (u × v) · w.

(b)

u × (v × w) = [u₁, u₂, u₃] × [v₂w₃ − v₃w₂, v₃w₁ − v₁w₃, v₁w₂ − v₂w₁]
= [u₂(v₁w₂ − v₂w₁) − u₃(v₃w₁ − v₁w₃), u₃(v₂w₃ − v₃w₂) − u₁(v₁w₂ − v₂w₁), u₁(v₃w₁ − v₁w₃) − u₂(v₂w₃ − v₃w₂)]
= [(u₁w₁ + u₂w₂ + u₃w₃)v₁ − (u₁v₁ + u₂v₂ + u₃v₃)w₁, (u₁w₁ + u₂w₂ + u₃w₃)v₂ − (u₁v₁ + u₂v₂ + u₃v₃)w₂, (u₁w₁ + u₂w₂ + u₃w₃)v₃ − (u₁v₁ + u₂v₂ + u₃v₃)w₃]
= (u₁w₁ + u₂w₂ + u₃w₃)[v₁, v₂, v₃] − (u₁v₁ + u₂v₂ + u₃v₃)[w₁, w₂, w₃]
= (u · w)v − (u · v)w.

(c)

‖u × v‖² = (u₂v₃ − u₃v₂)² + (u₃v₁ − u₁v₃)² + (u₁v₂ − u₂v₁)²
= (u₁² + u₂² + u₃²)(v₁² + v₂² + v₃²) − (u₁v₁ + u₂v₂ + u₃v₃)²
= ‖u‖² ‖v‖² − (u · v)².
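Part (c) is Lagrange's identity; since both sides are polynomials in the six components, a quick randomized integer check (a sketch, not a proof) is reassuring:

```python
import random

def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

random.seed(0)
for _ in range(100):
    u = [random.randint(-5, 5) for _ in range(3)]
    v = [random.randint(-5, 5) for _ in range(3)]
    # Lagrange's identity: ||u x v||^2 = ||u||^2 ||v||^2 - (u . v)^2, exactly, in integers
    assert dot(cross(u, v), cross(u, v)) == dot(u, u) * dot(v, v) - dot(u, v) ** 2
```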
7. For problem 2, use the computation in the solution above to show that e₁ × e₂ = e₃. Then

e₂ × e₃ = e₂ × (e₁ × e₂) = (e₂ · e₂)e₁ − (e₂ · e₁)e₂   by 6(b)
= 1 · e₁ − 0 · e₂   since eᵢ · eⱼ = 0 if i ≠ j and eᵢ · eᵢ = 1
= e₁.

Similarly,

e₃ × e₁ = e₃ × (e₂ × e₃) = (e₃ · e₃)e₂ − (e₃ · e₂)e₃ = 1 · e₂ − 0 · e₃ = e₂.
For problem 3, we have u · (u × v) = (u × u) · v by 6(a), and then since u × u = 0 by Exercise 5(c), this
reduces to 0 · v = 0. Thus u is orthogonal to u × v. Similarly, v · (u × v) = v · (−v × u) = −v · (v × u)
by Exercise 5(a), and then as before, −v · (v × u) = −(v × v) · u = −0 · u = 0, so that v is also
orthogonal to u × v.
8. (a) We have

‖u × v‖² = ‖u‖² ‖v‖² − (u · v)²
= ‖u‖² ‖v‖² − ‖u‖² ‖v‖² cos² θ
= ‖u‖² ‖v‖² (1 − cos² θ)
= ‖u‖² ‖v‖² sin² θ.

Since the angle between u and v is always between 0 and π, it always has a nonnegative sine, so taking square roots gives ‖u × v‖ = ‖u‖ ‖v‖ sin θ.
(b) Recall that the area of a triangle is A = 21 (base)(height). If the angle between u and v is θ, then
the length of the perpendicular from the head of v to the line determined by u is an altitude of
the triangle; the corresponding base is u. Thus the area is
A = (1/2)‖u‖ · (‖v‖ sin θ) = (1/2)‖u‖ ‖v‖ sin θ = (1/2)‖u × v‖.
 
 
(c) Let u = AB = [1, −1, −1] and v = AC = [4, −3, 2]. Then from part (b), we see that the area is

A = (1/2)‖u × v‖ = (1/2)‖[1, −1, −1] × [4, −3, 2]‖ = (1/2)‖[−5, −6, 1]‖ = (1/2)√62.
1.4 Applications
1. Use the method of Example 1.34. The magnitude of the resultant force r is

‖r‖ = √(‖f₁‖² + ‖f₂‖²) = √(12² + 5²) = 13 N,

while the angle between r and east (the direction of f₂) is

θ = tan⁻¹(12/5) ≈ 67.4°.

Note that the resultant is closer to north than east; the larger force to the north pulls the object more strongly to the north.
2. Use the method of Example 1.34. The magnitude of the resultant force r is

‖r‖ = √(‖f₁‖² + ‖f₂‖²) = √(15² + 20²) = 25 N,

while the angle between r and west (the direction of f₁) is

θ = tan⁻¹(20/15) ≈ 53.1°.

Note that the resultant is closer to south than west; the larger force to the south pulls the object more strongly to the south.
3. Use the method of Example 1.34. If we let f₁ = [8, 0], then f₂ = [8 cos 60°, 8 sin 60°] = [4, 4√3]. So the resultant force is

r = f₁ + f₂ = [8, 0] + [4, 4√3] = [12, 4√3].

The magnitude of r is ‖r‖ = √(12² + (4√3)²) = √192 = 8√3 N, and the angle formed by r and f₁ is

θ = tan⁻¹(4√3/12) = tan⁻¹(1/√3) = 30°.

Note that the resultant also forms a 30° angle with f₂; since the magnitudes of the two forces are the same, the resultant points equally between them.
4. Use the method of Example 1.34. If we let f₁ = [4, 0], then f₂ = [6 cos 135°, 6 sin 135°] = [−3√2, 3√2]. So the resultant force is

r = f₁ + f₂ = [4, 0] + [−3√2, 3√2] = [4 − 3√2, 3√2].

The magnitude of r is

‖r‖ = √((4 − 3√2)² + (3√2)²) = √(52 − 24√2) ≈ 4.24 N,

and the angle formed by r and f₁ is

θ = tan⁻¹(3√2/(4 − 3√2)) ≈ 93.3°.
5. Use the method of Example 1.34. If we let f₁ = [2, 0], then f₂ = [−6, 0], and f₃ = [4 cos 60°, 4 sin 60°] = [2, 2√3]. So the resultant force is

r = f₁ + f₂ + f₃ = [2, 0] + [−6, 0] + [2, 2√3] = [−2, 2√3].

The magnitude of r is

‖r‖ = √((−2)² + (2√3)²) = √16 = 4 N,

and the angle formed by r and f₁ is

θ = tan⁻¹(2√3/(−2)) = tan⁻¹(−√3) = 120°.

(Note that many CAS's will return −60° for tan⁻¹(−√3); by convention we require an angle between 0° and 180°, so we add 180° to that answer, since tan θ = tan(180° + θ).)
6. Use the method of Example 1.34. If we let f₁ = [10, 0], then f₂ = [0, 13], f₃ = [−5, 0], and f₄ = [0, −8]. So the resultant force is

r = f₁ + f₂ + f₃ + f₄ = [10, 0] + [0, 13] + [−5, 0] + [0, −8] = [5, 5].

The magnitude of r is

‖r‖ = √(5² + 5²) = √50 = 5√2 N,

and the angle formed by r and f₁ is

θ = tan⁻¹(5/5) = tan⁻¹ 1 = 45°.
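Resultant-force problems like Exercises 1 through 6 reduce to summing components; a small sketch using `atan2`, which picks the correct quadrant automatically (avoiding the adjustment noted in Exercise 5):

```python
import math

forces = [(10, 0), (0, 13), (-5, 0), (0, -8)]   # Exercise 6, in newtons
rx = sum(f[0] for f in forces)
ry = sum(f[1] for f in forces)
magnitude = math.hypot(rx, ry)
angle = math.degrees(math.atan2(ry, rx))        # angle measured from the direction of f1

assert (rx, ry) == (5, 5)
assert math.isclose(magnitude, 5 * math.sqrt(2))
assert math.isclose(angle, 45.0)
```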
7. Following Example 1.35, we draw a diagram in which f is resolved into a horizontal component f_x and a vertical component f_y, with f making a 60° angle with f_y. Then

‖f_y‖ = ‖f‖ cos 60°,    ‖f_x‖ = ‖f‖ sin 60°.

Thus ‖f_x‖ = 10 · (√3/2) = 5√3 ≈ 8.66 N and ‖f_y‖ = 10 · (1/2) = 5 N. Finally, this gives

f_x = [5√3, 0],    f_y = [0, 5].
8. The force that must be applied parallel to the ramp is the force needed to counteract the component
of the force due to gravity that acts parallel to the ramp. Let f be the force due to gravity; this
acts downwards. We can decompose it into components fp , acting parallel to the ramp, and fr , acting
orthogonally to the ramp. Since the ramp makes an angle of 30◦ with the horizontal, it makes an angle
of 60◦ with f . Thus
‖f_p‖ = ‖f‖ cos 60° = (1/2)‖f‖ = 5 N.
9. The vertical force is the vertical component of the force vector f . Since this acts at an angle of 45◦ to
the horizontal, the magnitude of the component of this force in the vertical direction is
‖f_y‖ = ‖f‖ cos 45° = 1500 · (√2/2) = 750√2 ≈ 1060.66 N.
10. The vertical force is the vertical component of the force vector f , which has magnitude 100 N and acts
at an angle of 45◦ to the horizontal. Since it acts at an angle of 45◦ to the horizontal, the magnitude
of the component of this force in the vertical direction is
‖f_y‖ = ‖f‖ cos 45° = 100 · (√2/2) = 50√2 ≈ 70.7 N.
Note that the mass of the lawnmower itself is irrelevant; we are not considering the gravitational force
in this exercise, only the force imparted by the person mowing the lawn.
1.4. APPLICATIONS
47
11. Use the method of Example 1.36. Let t be the force vector on the cable; then the tension on the cable
is ktk. The force imparted by the hanging sign is the y component of t; call it ty . Since t and ty form
an angle of 60◦ , we have
‖t_y‖ = ‖t‖ cos 60° = (1/2)‖t‖.
The gravitational force on the sign is its mass times the acceleration due to gravity, which is 50·9.8 = 490
N. Thus ktk = 2 kty k = 2 · 490 = 980 N.
12. Use the method of Example 1.36. Let w be the force vector created by the sign. Then kwk = 1·9.8 = 9.8
N, since the sign weighs 1 kg. By symmetry, each string carries half the weight of the sign, since the
angles each string forms with the vertical are the same. Let s be the tension in the left-hand string.
Since the angle between s and w is 45◦ , we have
(1/2)‖w‖ = ‖s‖ cos 45° = (√2/2)‖s‖.

Thus ‖s‖ = (1/√2)‖w‖ = 9.8/√2 = 4.9√2 ≈ 6.9 N.
13. A diagram of the situation shows wires of lengths 15 and 20 running from the 15 kg painting to two points on the ceiling that are 25 apart. The triangle is a right triangle, since 15² + 20² = 625 = 25². Thus if θ₁ is the angle that the left-hand wire makes with the ceiling, then sin θ₁ = 20/25 = 4/5; likewise, if θ₂ is the angle that the right-hand wire makes with the ceiling, then sin θ₂ = 15/25 = 3/5. Let f₁ be the force on the left-hand wire and f₂ the force on the right-hand wire, and let r be the force due to gravity acting on the painting. Then following Example 1.36, we have, using the Law of Sines,

‖f₁‖/sin θ₁ = ‖f₂‖/sin θ₂ = ‖r‖/sin 90° = (15 · 9.8)/1 = 147 N.

Then

‖f₁‖ = 147 sin θ₁ = 147 · (4/5) = 117.6 N,    ‖f₂‖ = 147 sin θ₂ = 147 · (3/5) = 88.2 N.
14. Let r be the force due to gravity acting on the painting, f1 be the tension on the wire opposite the
30◦ angle, and f2 be the tension on the wire opposite the 45◦ angle (don’t these people know to hang
paintings straight?). Then krk = 20 · 9.8 = 196 N. Note that the remaining angle in the triangle is
105◦ . Then using the method of Example 1.36, we have, using the Law of Sines,
‖f₁‖/sin 30° = ‖f₂‖/sin 45° = ‖r‖/sin 105°.

Thus

‖f₁‖ = ‖r‖ sin 30°/sin 105° ≈ (196 · (1/2))/0.9659 ≈ 101.46 N,
‖f₂‖ = ‖r‖ sin 45°/sin 105° ≈ (196 · (√2/2))/0.9659 ≈ 143.48 N.
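The Law of Sines computation in Exercise 14 can be reproduced directly (variable names my own):

```python
import math

m, g = 20, 9.8                       # Exercise 14: 20 kg sign
r = m * g                            # weight, 196 N
a1, a2 = 30, 45                      # angles opposite each wire's tension
a3 = 180 - a1 - a2                   # remaining angle of the triangle, 105 degrees
f1 = r * math.sin(math.radians(a1)) / math.sin(math.radians(a3))
f2 = r * math.sin(math.radians(a2)) / math.sin(math.radians(a3))

assert round(f1, 2) == 101.46
assert round(f2, 2) == 143.48
```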
Chapter Review
1. (a) True. This follows from the properties of Rⁿ listed in Theorem 1.1 in Section 1.1:

u = u + 0   Zero Property, Property (c)
= u + (w + (−w))   Additive Inverse Property, Property (d)
= (u + w) + (−w)   Associativity, Property (b)
= (v + w) + (−w)   by the given condition u + w = v + w
= v + (w + (−w))   Associativity, Property (b)
= v + 0   Additive Inverse Property, Property (d)
= v   Zero Property, Property (c).
(b) False. See Exercise 60 in Section 1.2. For one counterexample, let w = 0 and u and v be arbitrary
vectors. Then u · w = v · w = 0 since the dot product of any vector with the zero vector is zero.
But certainly u and v need not be equal. As a second counterexample, suppose that both u and
v are orthogonal to w; clearly they need not be equal, but u · w = v · w = 0.
(c) False. For example, let u be any nonzero vector, and v any vector orthogonal to u. Let w = u.
Then u is orthogonal to v and v is orthogonal to w = u, but certainly u and w = u are not
orthogonal.
(d) False. When a line is parallel to a plane, then d · n = 0; that is, d is orthogonal to the normal
vector of the plane.
(e) True. Since a normal vector n for P and the line ` are both perpendicular to P, they must be
parallel. See Figure 1.62.
(f ) True. See the remarks following Example 1.31 in Section 1.3.
(g) False. They can be skew lines, which are nonintersecting lines with nonparallel direction vectors. For example, let ℓ1 be the line x = t[1, 0, 0] (the x-axis), and let ℓ2 be the line x = [0, 0, 1] + t[0, 1, 0] (the line through (0, 0, 1) that is parallel to the y-axis). These two lines do not intersect, yet they are not parallel.
   
(h) False. For example, [1, 0, 1] · [1, 0, 1] = 1 + 0 + 1 = 0 in Z₂, even though [1, 0, 1] ≠ 0.
(i) True. If ab = 0 in Z₅, then ab must be a multiple of 5. But 5 is prime, so either a or b must be divisible by 5, so that either a = 0 or b = 0 in Z₅.
(j) False. For example, 2 · 3 = 6 = 0 in Z6 , but neither 2 nor 3 is zero in Z6 .
2. Let w = [10, −10]. Then the head of the resulting vector is at

4u + v + w = 4[−1, 5] + [3, 2] + [10, −10] = [9, 12].

So the coordinates of the point at the head of 4u + v are (9, 12).
3. Since 2x + u = 3(x − v) = 3x − 3v, simplifying gives u + 3v = x. Thus

x = u + 3v = [−1, 5] + 3[3, 2] = [8, 11].
4. Since ABCD is a square, OC = −OA, so that

BC = OC − OB = −OA − OB = −a − b.
5. We have

u · v = [−1, 1, 2] · [2, 1, −1] = −1 · 2 + 1 · 1 + 2 · (−1) = −3,
‖u‖ = √((−1)² + 1² + 2²) = √6,
‖v‖ = √(2² + 1² + (−1)²) = √6.

Then if θ is the angle between u and v, it satisfies

cos θ = u · v / (‖u‖ ‖v‖) = −3/(√6 √6) = −1/2.

Thus θ = cos⁻¹(−1/2) = 120°.
6. We have

proj_u v = (u · v / u · u) u = ((1 · 1 − 2 · 1 + 2 · 1) / (1 · 1 + (−2) · (−2) + 2 · 2)) [1, −2, 2] = (1/9)[1, −2, 2] = [1/9, −2/9, 2/9].
7. We are looking for a vector in the xy-plane; any such vector has a z-coordinate of 0. So the vector we are looking for is u = [a, b, 0] for some a, b. Then we want

u · [1, 2, 3] = [a, b, 0] · [1, 2, 3] = a + 2b = 0,

so that a = −2b. So for example choose b = 1; then a = −2, and the vector u = [−2, 1, 0] is orthogonal to [1, 2, 3]. Finally, to get a unit vector, we must normalize u. We have ‖u‖ = √((−2)² + 1² + 0²) = √5, giving

w = (1/‖u‖) u = [−2/√5, 1/√5, 0],

which is orthogonal to the given vector. We could have chosen any value for b, but we would have gotten either w or −w for the normalized vector.
8. The vector form of the given line is

x = [2, 3, −1] + t[−1, 2, 1],

so the line has a direction vector d = [−1, 2, 1]. Since the plane is perpendicular to this line, d is a normal vector n to the plane. The plane passes through P = (1, 1, 1), so the equation n · x = n · p becomes

[−1, 2, 1] · [x, y, z] = [−1, 2, 1] · [1, 1, 1] = 2.

Expanding gives the general equation −x + 2y + z = 2.


9. Parallel planes have parallel normals, so the vector n = [2, 3, −1], which is a normal to the given plane, is also a normal to the desired plane. The plane we want must pass through P = (3, 2, 5), so the equation n · x = n · p becomes

[2, 3, −1] · [x, y, z] = [2, 3, −1] · [3, 2, 5] = 7.

Expanding gives the general equation 2x + 3y − z = 7.
10. The three points give us two vectors that lie in the plane:

d₁ = AB = [1, 0, 1] − [1, 1, 0] = [0, −1, 1],
d₂ = BC = [0, 1, 2] − [1, 0, 1] = [−1, 1, 1].

A normal vector n = [a, b, c] must be orthogonal to both of these vectors, so

n · d₁ = [a, b, c] · [0, −1, 1] = −b + c = 0 ⇒ b = c,
n · d₂ = [a, b, c] · [−1, 1, 1] = −a + b + c = 0 ⇒ a = b + c.

Since b = c and a = b + c, we get a = 2c, so n = [2c, c, c] for any value of c. Choosing c = 1 gives the vector n = [2, 1, 1]. Let P be the point A = (1, 1, 0) (we could equally well choose B or C), and compute n · x = n · p:

[2, 1, 1] · [x, y, z] = [2, 1, 1] · [1, 1, 0] = 3.

Expanding gives the general equation 2x + y + z = 3.
11. Let

u = AB = [1 − 1, 0 − 1, 1 − 0] = [0, −1, 1],
v = AC = [0 − 1, 1 − 1, 2 − 0] = [−1, 0, 2].

Using the first method from Figure 1.39 (Exercises 46 and 47) in Section 1.2, we have

proj_u v = (u · v / u · u) u = ((0 · (−1) − 1 · 0 + 1 · 2) / (0² + (−1)² + 1²)) [0, −1, 1] = (2/2)[0, −1, 1] = [0, −1, 1],
v − proj_u v = [−1, 0, 2] − [0, −1, 1] = [−1, 1, 1].

Then the area of the triangle is

(1/2)‖u‖ ‖v − proj_u v‖ = (1/2)√(0² + (−1)² + 1²) √((−1)² + 1² + 1²) = (1/2)√2 √3 = (1/2)√6.
12. From the first example in Exploration: Vectors and Geometry, we have a formula for the midpoint:

m = (1/2)(a + b) = (1/2)([5, 1, −2] + [3, −7, 0]) = (1/2)[8, −6, −2] = [4, −3, −1].
13. Suppose that ‖u‖ = 2 and ‖v‖ = 3. Then from the Cauchy-Schwarz inequality, we have

|u · v| ≤ ‖u‖ ‖v‖ = 2 · 3 = 6.

Thus −6 ≤ u · v ≤ 6, so the dot product cannot equal −7.
14. We will apply the formula

d(A, P) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²),

where A = (x₀, y₀, z₀) is a point, and a general equation for the plane is ax + by + cz = d. Here, we have a = 2, b = 3, c = −1, and d = 0; since the point is (3, 2, 5), we have x₀ = 3, y₀ = 2, and z₀ = 5. So the distance from the point to the plane is

d(A, P) = |2 · 3 + 3 · 2 − 1 · 5 − 0| / √(2² + 3² + (−1)²) = 7/√14 = √14/2.
15. As in Example 1.32 in Section 1.3, we have B = (3, 2, 5), and the line ℓ has vector form

x = [0, 1, 2] + t[1, 1, 1],

so that A = (0, 1, 2) lies on the line, and a direction vector for the line is d = [1, 1, 1]. Then

v = AB = [3, 2, 5] − [0, 1, 2] = [3, 1, 3],

and then

proj_d v = (d · v / d · d) d = ((1 · 3 + 1 · 1 + 1 · 3) / (1² + 1² + 1²)) [1, 1, 1] = (7/3)[1, 1, 1].

Then the vector from B that is perpendicular to the line is the vector

v − proj_d v = [3, 1, 3] − [7/3, 7/3, 7/3] = [2/3, −4/3, 2/3].

So the distance from B to ℓ is

‖v − proj_d v‖ = √((2/3)² + (−4/3)² + (2/3)²) = (1/3)√(4 + 16 + 4) = (2/3)√6.
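The steps of Exercise 15 can be replayed numerically (standard library only):

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Review 15: distance from B = (3, 2, 5) to the line x = (0, 1, 2) + t(1, 1, 1)
b, a, d = (3, 2, 5), (0, 1, 2), (1, 1, 1)
v = [bi - ai for bi, ai in zip(b, a)]
t = sum(di * vi for di, vi in zip(d, v)) / sum(di * di for di in d)
perp = [vi - t * di for vi, di in zip(v, d)]

assert math.isclose(norm(perp), 2 * math.sqrt(6) / 3)
```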
16. We have

3 − (2 + 4)³(4 + 3)² = 3 − 1³ · 2² = 3 − 1 · 4 = −1 = 4

in Z₅. Note that 2 + 4 = 6 = 1 in Z₅, 4 + 3 = 7 = 2 in Z₅, and −1 = 4 in Z₅.
17. 3(x + 2) = 5 implies that 5 · 3(x + 2) = 5 · 5 = 25 = 4. But 5 · 3 = 15 = 1 in Z7 , so this is the same as
x + 2 = 4, so that x = 2. To check the answer, we have
3(2 + 2) = 3 · 4 = 12 = 5 in Z7 .
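Modular computations like Exercises 16 and 17 are convenient to check with Python's `%` operator (a brute-force search also confirms that the solution of Exercise 17 is unique):

```python
# Review 16, in Z_5: 3 - (2 + 4)^3 (4 + 3)^2
assert (3 - (2 + 4) ** 3 * (4 + 3) ** 2) % 5 == 4

# Review 17, in Z_7: solve 3(x + 2) = 5 by brute force
assert [x for x in range(7) if (3 * (x + 2)) % 7 == 5] == [2]
```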
18. This has no solutions. For any value of x, the left-hand side is a multiple of 3, so it cannot leave a
remainder of 5 when divided by 9 (which is also a multiple of 3).
19. Compute the dot product in Z₅⁴ (that is, of vectors of length 4 with entries in Z₅):

[2, 1, 3, 3] · [3, 4, 4, 2] = 2 · 3 + 1 · 4 + 3 · 4 + 3 · 2 = 6 + 4 + 12 + 6 = 1 + 4 + 2 + 1 = 8 = 3.
20. Suppose that
[1, 1, 1, 0] · [d1 , d2 , d3 , d4 ] = d1 + d2 + d3 = 0 in Z2 .
Then an even number (either zero or two) of d1 , d2 , and d3 must be 1 and the others must be zero. d4
is arbitrary (either 0 or 1). So the eight possible vectors are
[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 1, 0], [0, 1, 1, 1].
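The counting argument above can be confirmed by enumerating all of Z₂⁴:

```python
from itertools import product

# All vectors d in Z_2^4 with [1, 1, 1, 0] . d = d1 + d2 + d3 = 0 (mod 2)
vecs = [d for d in product([0, 1], repeat=4) if (d[0] + d[1] + d[2]) % 2 == 0]

assert len(vecs) == 8
assert (1, 1, 0, 1) in vecs
assert (1, 0, 0, 0) not in vecs
```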
Chapter 2
Systems of Linear Equations

2.1 Introduction to Systems of Linear Equations
1. This equation is linear, since each of x, y, and z appears to the first power with constant coefficients; 1, π, and ∛5 are constants.
2. This is not linear since each of x, y, and z appears to a power other than 1.
3. This is not linear since x appears to a power other than 1.
4. This is not linear since the equation contains the term xy, which is a product of two of the variables.
5. This is not linear since cos x is a function of x other than a constant times a multiple of x.
6. This is linear, since here cos 3 is a constant, and x, y, and z all appear to the first power with constant coefficients cos 3, −4, and 1, and the remaining term, √3, is also a constant.
7. Put the equation into the general form ax + by = c by adding y to both sides to get 2x + 4y = 7. There
are no restrictions on x and y since there were none in the original equation.
8. In the given equation, the denominator must not be zero, so we must have x ≠ y. With this restriction, we have
(x² − y²)/(x − y) = 1 ⟺ ((x + y)(x − y))/(x − y) = 1 ⟺ x + y = 1, x ≠ y.
The corresponding linear equation is x + y = 1, for x ≠ y.
9. Since x and y each appear in the denominator, neither one can be zero. With this restriction, we have
1/x + 1/y = 4/(xy) ⟺ y/(xy) + x/(xy) = 4/(xy) ⟺ y + x = 4 ⟺ x + y = 4.
The corresponding linear equation is x + y = 4, for x ≠ 0 and y ≠ 0.
10. Since log10 x and log10 y are defined only for x > 0 and y > 0, these are the restrictions on the variables. With these restrictions, we have
log10 x − log10 y = 2 ⟺ log10(x/y) = 2 ⟺ x/y = 10² = 100 ⟺ x = 100y ⟺ x − 100y = 0.
The corresponding linear equation is x − 100y = 0 for x > 0 and y > 0.
11. As in Example 2.2(a), we set x = t and solve for y. Setting x = t in 3x − 6y = 0 gives 3t − 6y = 0, so that 6y = 3t and thus y = (1/2)t. So the set of solutions has the parametric form [t, (1/2)t]. Note that if we had set x = 2t to start with, we would have gotten the simpler (but equivalent) parametric form [2t, t].
12. As in Example 2.2(a), we set x1 = t and solve for x2. Setting x1 = t in 2x1 + 3x2 = 5 gives 2t + 3x2 = 5, so that 3x2 = 5 − 2t and thus x2 = 5/3 − (2/3)t. So the set of solutions has the parametric form [t, 5/3 − (2/3)t]. Note that if we had set x1 = 1 + 3t to start with, we would have gotten the simpler (but equivalent) parametric form [1 + 3t, 1 − 2t].
13. As in Example 2.2(b), we set y = s and z = t and solve for x. (We choose y and z since the coefficient
of the remaining variable, x, is one, so we will not have to do any division). Then x + 2s + 3t = 4, so
that x = 4 − 2s − 3t. Thus a complete set of solutions written in parametric form is [4 − 2s − 3t, s, t].
14. As in Example 2.2(b), we set x1 = s and x2 = t and solve for x3. This substitution yields 4s + 3t + 2x3 = 1, so that 2x3 = 1 − 4s − 3t and x3 = 1/2 − 2s − (3/2)t. Thus a complete set of solutions written in parametric form is [s, t, 1/2 − 2s − (3/2)t].
15. Subtract the first equation from the second to get x = 3; substituting x = 3
back into the first equation gives y =
−3. Thus the unique intersection point
is (3, −3).
[Graph: the lines 2x + y = 3 and x + y = 0, intersecting at (3, −3).]
16. Add the second equation to twice the first equation to get 7x = 21 and thus x = 3. Substitute x = 3 back into the first equation to get y = −2. So the unique intersection point is (3, −2).
[Graph: the lines 3x + y = 7 and x − 2y = 7, intersecting at (3, −2).]
17. Add three times the second equation to
the first equation. This gives 0 = 6,
which is impossible. So this system has
no solutions — the two lines are parallel.
[Graph: the parallel lines 3x − 6y = 3 and −x + 2y = 1.]
18. Add 3/5 times the first equation to the second, giving 0 = 0. Thus the system has an infinite number of solutions: the two lines are the same line (they are coincident).
[Graph: the coincident lines 0.1x − 0.05y = 0.2 and −0.06x + 0.03y = −0.12.]
19. Starting from the second equation, we find y = 3. Substitute y = 3 into x − 2y = 1 to get x − 2 · 3 = 1,
so that x = 7. The unique solution is [x, y] = [7, 3].
20. Starting from the second equation, we find 2v = 6 so that v = 3. Substitute v = 3 into 2u − 3v = 5 to
get 2u − 3 · 3 = 5, or 2u = 14, so that u = 7. The unique solution is [u, v] = [7, 3].
21. Starting from the third equation, we have 3z = −1, so that z = −1/3. Now substitute this value into the second equation:
2y − z = 1 ⇒ 2y − (−1/3) = 1 ⇒ 2y + 1/3 = 1 ⇒ 2y = 2/3 ⇒ y = 1/3.
Finally, substitute these values for y and z into the first equation:
x − y + z = 0 ⇒ x − 1/3 + (−1/3) = 0 ⇒ x − 2/3 = 0 ⇒ x = 2/3.
So the unique solution is [x, y, z] = [2/3, 1/3, −1/3].
22. The third equation gives x3 = 0; substituting into the second equation gives −5x2 = 0, so that
x2 = 0. Substituting these values into the first equation gives x1 = 0. So the unique solution is
[x1 , x2 , x3 ] = [0, 0, 0].
23. Substitute x4 = 1 from the fourth equation into the third equation, giving x3 − 1 = 0, so that x3 = 1.
Substitute these values into the second equation, giving x2 + 1 + 1 = 0 and thus x2 = −2. Finally,
substitute into the first equation:
x1 + x2 − x3 − x4 = 1 ⇒ x1 + (−2) − 1 − 1 = 1 ⇒ x1 − 4 = 1 ⇒ x1 = 5.
The unique solution is [x1 , x2 , x3 , x4 ] = [5, −2, 1, 1].
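The back substitution used in Exercises 21–23 can be written as a short routine. A sketch (the function name is ours; the coefficients below are read off from the substitutions in the Exercise 23 solution):

```python
def back_substitute(U, b):
    """Solve Ux = b for an upper-triangular U with nonzero diagonal entries."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x

# Exercise 23: x1 + x2 - x3 - x4 = 1, x2 + x3 + x4 = 0, x3 - x4 = 0, x4 = 1
U = [[1, 1, -1, -1],
     [0, 1,  1,  1],
     [0, 0,  1, -1],
     [0, 0,  0,  1]]
x = back_substitute(U, [1, 0, 0, 1])  # [5.0, -2.0, 1.0, 1.0]
```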
24. Let z = t, so that y = 2t − 1. Substitute for y and z in the first equation, giving
x − 3y + z = 5 ⇒ x − 3(2t − 1) + t = 5 ⇒ x − 5t + 3 = 5 ⇒ x = 5t + 2.
This system has an infinite number of solutions: [x, y, z] = [5t + 2, 2t − 1, t].
25. Working forward, we start with x = 2. Substitute that into the second equation to get 2 · 2 + y = −3,
or 4 + y = −3, so that y = −7. Finally, substitute those two values into the third equation:
−3x − 4y + z = −10 ⇒ −3 · 2 − 4 · (−7) + z = −10 ⇒ 22 + z = −10 ⇒ z = −32.
The unique solution is [x, y, z] = [2, −7, −32].
26. Working forward, we start with x1 = −1. Substituting into the second equation gives −(1/2) · (−1) + x2 = 5, or 1/2 + x2 = 5. Thus x2 = 9/2. Finally, substitute those two values into the third equation:
(3/2)x1 + 2x2 + x3 = 7 ⇒ (3/2) · (−1) + 2 · (9/2) + x3 = 7 ⇒ 15/2 + x3 = 7 ⇒ x3 = −1/2.
The unique solution is [x1, x2, x3] = [−1, 9/2, −1/2].
27. As in Example 2.6, we create the augmented matrix from the coefficients:
[ 1 −1 | 0 ]
[ 2  1 | 3 ]
28. As in Example 2.6, we create the augmented matrix from the coefficients:
[  2 3 −1 | 1 ]
[  1 0  1 | 0 ]
[ −1 2 −2 | 0 ]
29. As in Example 2.6, we create the augmented matrix from the coefficients:
[  1 5 | −1 ]
[ −1 1 | −5 ]
[  2 4 |  4 ]
30. As in Example 2.6, we create the augmented matrix from the coefficients:
[  1 −2  0  1 | 2 ]
[ −1  1 −1 −3 | 1 ]
31. There are three variables, since there are three columns not counting the last one. Use x, y, and z. Then the system becomes
        y + z = 1
 x  −   y     = 1
2x  −   y + z = 1.
32. There are five variables, since there are five columns not counting the last one. Use x1, x2, x3, x4, and x5. Then the system becomes
x1 − x2       + 3x4 +  x5 = 2
x1 + x2 + 2x3 +  x4 −  x5 = 4
     x2       + 2x4 + 3x5 = 0.
33. Add the first equation to the second, giving 3x = 3, so that x = 1. Substituting into the first equation
gives 1 − y = 0, so that y = 1. The unique solution is [x, y] = [1, 1].
34. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 28) to get it into a form where we can read off the solution.

[  2 3 −1 | 1 ]
[  1 0  1 | 0 ]
[ −1 2 −2 | 0 ]

R1 ↔ R2:
[  1 0  1 | 0 ]
[  2 3 −1 | 1 ]
[ −1 2 −2 | 0 ]

R2 − 2R1, R3 + R1:
[ 1 0  1 | 0 ]
[ 0 3 −3 | 1 ]
[ 0 2 −1 | 0 ]

(1/3)R2, then R3 − 2R2:
[ 1 0  1 | 0    ]
[ 0 1 −1 | 1/3  ]
[ 0 0  1 | −2/3 ]

R1 − R3, R2 + R3:
[ 1 0 0 | 2/3  ]
[ 0 1 0 | −1/3 ]
[ 0 0 1 | −2/3 ]

From this we get the unique solution [x1, x2, x3] = [2/3, −1/3, −2/3].
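The Gauss–Jordan reduction used here mechanizes easily. A sketch (the function name is ours), using exact rational arithmetic and applied to the system of Exercise 28:

```python
from fractions import Fraction

def rref(M):
    """Reduce an augmented matrix to reduced row echelon form, exactly."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]       # bring a nonzero pivot up
        A[r] = [x / A[r][c] for x in A[r]]    # scale the pivot row to a leading 1
        for i in range(rows):
            if i != r and A[i][c] != 0:       # clear the rest of the column
                A[i] = [a - A[i][c] * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return A

result = rref([[2, 3, -1, 1], [1, 0, 1, 0], [-1, 2, -2, 0]])
# last column is the solution [2/3, -1/3, -2/3]
```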
35. Add the first two equations to give 6y = −6, so that y = −1. Substituting back into the second
equation gives −x − 1 = −5, so that x = 4. Further, 2 · 4 + 4 · (−1) = 4, so this solution satisfies the
third equation as well. Thus the unique solution is [x, y] = [4, −1].
36. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 30) to get it into a simpler form.

[  1 −2  0  1 | 2 ]
[ −1  1 −1 −3 | 1 ]

R2 + R1:
[ 1 −2  0  1 | 2 ]
[ 0 −1 −1 −2 | 3 ]

−R2:
[ 1 −2 0 1 |  2 ]
[ 0  1 1 2 | −3 ]

R1 + 2R2:
[ 1 0 2 5 | −4 ]
[ 0 1 1 2 | −3 ]

This augmented matrix corresponds to the linear system
a      + 2c + 5d = −4
     b +  c + 2d = −3
so that, letting c = s and d = t, we get a = −4 − 2s − 5t and b = −3 − s − 2t. So a complete set of solutions is parametrized by [a, b, c, d] = [−4 − 2s − 5t, −3 − s − 2t, s, t].
37. Starting with the linear system given in Exercise 31, add the first equation to the second and to the
third, giving
x+ z=2
2x + 2z = 2 .
Subtracting twice the first of these from the second gives 0 = −2, which is impossible. So this system
is inconsistent and has no solution.
38. As in Example 2.6, we perform row operations on the augmented matrix of this system (from Exercise 32) to get it into a simpler form.

[ 1 −1 0 3  1 | 2 ]
[ 1  1 2 1 −1 | 4 ]
[ 0  1 0 2  3 | 0 ]

R2 − R1, then R2 ↔ R3:
[ 1 −1 0  3  1 | 2 ]
[ 0  1 0  2  3 | 0 ]
[ 0  2 2 −2 −2 | 2 ]

R3 − 2R2, then (1/2)R3:
[ 1 −1 0  3  1 | 2 ]
[ 0  1 0  2  3 | 0 ]
[ 0  0 1 −3 −4 | 1 ]

R1 + R2:
[ 1 0 0  5  4 | 2 ]
[ 0 1 0  2  3 | 0 ]
[ 0 0 1 −3 −4 | 1 ]

Then setting x4 = s and x5 = t, we have
x1 + 5s + 4t = 2,   x2 + 2s + 3t = 0,   x3 − 3s − 4t = 1,
so that x1 = 2 − 5s − 4t, x2 = −2s − 3t, and x3 = 1 + 3s + 4t.
So the general solution to this system has parametric form
[x1, x2, x3, x4, x5] = [2 − 5s − 4t, −2s − 3t, 1 + 3s + 4t, s, t].
39. (a) Since t = x and y = 3 − 2t, we can substitute x for t in the equation for y to get y = 3 − 2x, or
2x + y = 3.
(b) If y = s, then substituting s for y in the general equation above gives 2x + s = 3, or 2x = 3 − s, so that x = 3/2 − (1/2)s. So the parametric solution is [x, y] = [3/2 − (1/2)s, s].
40. (a) Since x1 = t, substitute x1 for t in the second and third equations, giving
x2 = 1 + x1 and x3 = 2 − x1 .
This gives the linear system
−x1 + x2      = 1
 x1      + x3 = 2.
(b) Substitute s for x3 into the second equation, giving x1 + s = 2, so that x1 = 2 − s. Substitute that
value for x1 into the first equation, giving −(2 − s) + x2 = 1, or x2 = 3 − s. So the parametric
form given in this way is
[x1 , x2 , x3 ] = [2 − s, 3 − s, s].
41. Let u = 1/x and v = 1/y. Then the system of equations becomes 2u + 3v = 0 and 3u + 4v = 1. Solving the second equation for v gives v = 1/4 − (3/4)u. Substitute that value into the first equation, giving
2u + 3(1/4 − (3/4)u) = 0, or −(1/4)u = −3/4, so u = 3.
Then v = 1/4 − (3/4) · 3 = −2, so the solution is [u, v] = [3, −2], or [x, y] = [1/u, 1/v] = [1/3, −1/2].
42. Let u = x² and v = y². Then the system of equations becomes u + 2v = 6 and u − v = 3. Subtracting the second of these from the first gives 3v = 3, so that v = 1. Substituting that into the second equation gives u − 1 = 3, so that u = 4. The solution in terms of u and v is thus [u, v] = [x², y²] = [4, 1]. Since 1 and 4 have both a positive and a negative square root, the original system has the four solutions
[x, y] = [2, 1], [−2, 1], [2, −1], or [−2, −1].
43. Let u = tan x, v = sin y, and w = cos z. Then the system becomes
u − 2v     =  2
u −  v + w =  2
     v − w = −1.
Subtract the first equation from the second, giving v + w = 0. Add that to the third equation, so that 2v = −1 and v = −1/2. Thus w = 1/2. Substituting these into the second equation gives u − (−1/2) + 1/2 = 2, so that u = 1. The solution is [u, v, w] = [1, −1/2, 1/2], so that
x = tan⁻¹ u = tan⁻¹ 1 = π/4 + kπ,
y = sin⁻¹ v = sin⁻¹(−1/2) = 7π/6 + 2ℓπ or −π/6 + 2ℓπ,
z = cos⁻¹ w = cos⁻¹(1/2) = ±π/3 + 2nπ,
where k, ℓ, and n are integers.
44. Let r = 2^a and s = 3^b. Then the system becomes −r + 2s = 1 and 3r − 4s = 1. Adding three times the first equation to the second gives 2s = 4, so that s = 2. Substituting that back into the first equation gives −r + 4 = 1, so that r = 3. So the solution is [r, s] = [2^a, 3^b] = [3, 2]. In terms of a and b, this gives [a, b] = [log2 3, log3 2].
2.2
Direct Methods for Solving Linear Systems
1. This matrix is not in row echelon form, since the leading entry in row 2 appears to the right of the
leading entry in row 3, violating condition 2 of the definition.
2. This matrix is in row echelon form, since the row of zeros is at the bottom and the leading entry in
each nonzero row is to the left of the ones in all succeeding rows. It is not in reduced row echelon form
since the leading entry in the first row is not 1.
3. This matrix is in reduced row echelon form: the leading entry in each nonzero row is 1 and is to the left
of the ones in all succeeding rows, and any column containing a leading 1 contains zeros everywhere
else.
4. This matrix is in reduced row echelon form: all zero rows are at the bottom, and all other conditions
are vacuously satisfied (that is, they are true since no rows or columns satisfy the conditions).
5. This matrix is not in row echelon form since it has a zero row, but that row is not at the bottom.
6. This matrix is not in row echelon form since the leading entries in rows 1 and 2 appear to the right of
leading entries in succeeding rows.
7. This matrix is not in row echelon form since the leading entry in row 1 does not appear to the left of
the leading entry in row 2. It appears in the same column, but that is not sufficient for the definition.
8. This matrix is in row echelon form, since the zero row is at the bottom and the leading entry in each
row appears in a column to the left of the leading entries in succeeding rows. However, it is not in
reduced row echelon form since the leading entries in rows 1 and 3 are not 1. (Also, it is not in reduced
row echelon form since the third column contains a leading 1 but also contains other nonzero entries).
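The conditions checked in Exercises 1–8 can be encoded directly; the functions below are a sketch (names ours) of the row echelon and reduced row echelon tests:

```python
def is_row_echelon(A):
    """Zero rows at the bottom; each leading entry strictly right of the one above."""
    lead = -1
    seen_zero_row = False
    for row in A:
        nz = [j for j, x in enumerate(row) if x != 0]
        if not nz:
            seen_zero_row = True
            continue
        if seen_zero_row or nz[0] <= lead:
            return False
        lead = nz[0]
    return True

def is_reduced_row_echelon(A):
    """Additionally: every leading entry is 1 and is alone in its column."""
    if not is_row_echelon(A):
        return False
    for row in A:
        nz = [j for j, x in enumerate(row) if x != 0]
        if nz:
            j = nz[0]
            if row[j] != 1 or any(r[j] != 0 for r in A if r is not row):
                return False
    return True
```

For example, [[2, 1], [0, 3]] is in row echelon form but not reduced (its leading entries are not 1), matching the discussion in Exercise 2.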
9. (a)
[ 0 0 1 ]            [ 1 1 1 ]
[ 0 1 1 ]  R1↔R3 →  [ 0 1 1 ]
[ 1 1 1 ]            [ 0 0 1 ]
(b)
        R1−R3, R2−R3  [ 1 1 0 ]  R1−R2  [ 1 0 0 ]
· · ·        →        [ 0 1 0 ]    →    [ 0 1 0 ]
                      [ 0 0 1 ]         [ 0 0 1 ]
10. (a)
[ 4 3 ]  R2↔R1  [ 2 1 ]  (1/2)R1  [ 1 1/2 ]  R2−4R1  [ 1 1/2 ]
[ 2 1 ]    →    [ 4 3 ]     →     [ 4  3  ]     →     [ 0  1  ]
(b)
· · ·  R1−(1/2)R2  [ 1 0 ]
            →      [ 0 1 ]
11. (a)
[ 3  5 ]            [ 2  4 ]            [ 1  2 ]                    [ 1   2 ]
[ 5 −2 ]  R1↔R3 →  [ 5 −2 ]  (1/2)R1 →  [ 5 −2 ]  R2−5R1, R3−3R1 →  [ 0 −12 ]
[ 2  4 ]            [ 3  5 ]            [ 3  5 ]                    [ 0  −1 ]
(b)
· · ·  −(1/12)R2  [ 1  2 ]  R1−2R2, R3+R2  [ 1 0 ]
           →      [ 0  1 ]        →        [ 0 1 ]
                  [ 0 −1 ]                 [ 0 0 ]
12. (a)
[ 2 −4 −2 6 ]  (1/2)R1  [ 1 −2 −1 3 ]  R2−3R1  [ 1 −2 −1  3 ]
[ 3  1  6 6 ]     →     [ 3  1  6 6 ]     →    [ 0  7  9 −3 ]
(b)
· · ·  (1/7)R2  [ 1 −2  −1    3  ]  R1+2R2  [ 1 0 11/7 15/7 ]
          →     [ 0  1  9/7 −3/7 ]     →    [ 0 1  9/7 −3/7 ]
13. (a)
[ 3 −2 −1 ]  3R2, −3R3  [   3 −2 −1 ]  R2−2R1, R3+4R1  [ 3 −2 −1 ]  R3−R2  [ 3 −2 −1 ]
[ 2 −1 −1 ]      →      [   6 −3 −3 ]        →          [ 0  1 −1 ]    →    [ 0  1 −1 ]
[ 4 −3 −1 ]             [ −12  9  3 ]                   [ 0  1 −1 ]         [ 0  0  0 ]
(b)
· · ·  R1+2R2  [ 3 0 −3 ]  (1/3)R1  [ 1 0 −1 ]
          →    [ 0 1 −1 ]     →     [ 0 1 −1 ]
               [ 0 0  0 ]           [ 0 0  0 ]


2 −3
1
R +3R1
0
−6 10 −−2−−−→
R3 +2R1
−4
7 −
−−−−→ 0
2
0
0


−3
1
0
1
R3 −R2
1 −
−−−−→ 0

R +2R2
3
−−1−−−→
0
···
0
0
1
0
14. (a)

−2
−3
1


−4
7
1
R1 ↔R3 
−6 10 −
−3
−−−−→
2 −3
−2
(b)

R −3R2
1
−−1−−−→
0
···
0
2
0
0

0
1
0
2
0
0

3
1 · · ·
0
15. Undoing the row operations, each in reverse order, recovers the original matrix:

[ 1  2 −4 −4  5 ]            [ 1  2 −4 −4  5 ]         [ 1  2 −4 −4  5 ]
[ 0 −1 10  9 −5 ]  R4+29R3  [ 0 −1 10  9 −5 ]   8R3   [ 0 −1 10  9 −5 ]
[ 0  0  1  1 −1 ]     →     [ 0  0  1  1 −1 ]    →    [ 0  0  8  8 −8 ]
[ 0  0  0  0 24 ]            [ 0  0 29 29 −5 ]         [ 0  0 29 29 −5 ]

R4 − 3R2:                    R2 ↔ R3:
[ 1  2 −4 −4  5 ]            [ 1  2 −4 −4  5 ]
[ 0 −1 10  9 −5 ]            [ 0  0  8  8 −8 ]
[ 0  0  8  8 −8 ]            [ 0 −1 10  9 −5 ]
[ 0  3 −1  2 10 ]            [ 0  3 −1  2 10 ]

R2 + 2R1, R3 + 2R1, R4 − R1:
[  1 2 −4 −4 5 ]
[  2 4  0  0 2 ]
[  2 3  2  1 5 ]
[ −1 1  3  6 5 ]
16. Ri ↔ Rj undoes itself; (1/k)Ri undoes kRi; and Ri − kRj undoes Ri + kRj.
17.
A = [ 1 2 ]  R2−2R1  [ 1 2 ]  −(1/2)R1  [ −1/2 −1 ]  R1+(7/2)R2  [ 3 −1 ]
    [ 3 4 ]    →     [ 1 0 ]     →      [   1   0 ]      →       [ 1  0 ]  = B.
So A and B are row equivalent, and we can convert A to B by the sequence R2 − 2R1, −(1/2)R1, and R1 + (7/2)R2.
18.
A = [  2 0 −1 ]  R1+R2  [  3 1 −1 ]  R2+2R3  [  3 1 −1 ]
    [  1 1  0 ]    →    [  1 1  0 ]     →    [ −1 3  2 ]
    [ −1 1  1 ]         [ −1 1  1 ]          [ −1 1  1 ]

R2+R1, R3+R1  [ 3 1 −1 ]  R2+(1/2)R3  [ 3 1 −1 ]
     →        [ 2 4  1 ]       →      [ 3 5  1 ]  = B.
              [ 2 2  0 ]              [ 2 2  0 ]

Therefore the matrices A and B are row equivalent.
19. Performing R2 + R1 gives a new second row R2′ = R2 + R1. Then performing R1 + R2 gives a new first row with value R1 + R2′ = 2R1 + R2. Only one row operation can be performed at a time (although in a sequence of row operations, we may perform more than one in one step as long as any rows affected do not appear in other transformations in that step, just for efficiency).
20.
[ x1 ]  R2+R1  [   x1    ]  R1−R2  [   −x2   ]  R2+R1  [ −x2 ]  −R1  [ x2 ]
[ x2 ]    →    [ x2 + x1 ]    →    [ x2 + x1 ]    →    [  x1 ]   →   [ x1 ].
The net effect is to interchange the first and second rows. So this sequence of elementary row operations is the same as R1 ↔ R2.
21. Note that 3R2 − 2R1 is not an elementary row operation since it is not of the form Ri ↔ Rj, kRi, or Ri + kRj (it is not of the third form since the coefficient of R2 is 3, not 1). However,
[ 3 1 ]  R2−(2/3)R1  [ 3   1  ]  3R2  [ 3  1 ]
[ 2 4 ]      →       [ 0 10/3 ]   →   [ 0 10 ],
so this is in fact a composition of elementary row operations.
22. We can create a 1 in the top left corner as follows:
[ 3 2 ]  R1↔R2  [ 1 4 ]      [ 3 2 ]  (1/3)R1  [ 1 2/3 ]      [ 3 2 ]  R1−2R2  [ 1 −6 ]
[ 1 4 ]    →    [ 3 2 ],     [ 1 4 ]     →     [ 1  4  ],     [ 1 4 ]    →     [ 1  4 ].
The first method, simply interchanging the two rows, requires the least computation and seems simplest. Our next choice would probably be the third method, since at least it produces integer results.
23. The rank of a matrix is the number of nonzero rows in its row echelon form.
(1) Since A is not in row echelon form, we first put it in row echelon form:
[ 1 0 1 ]            [ 1 0 1 ]
[ 0 0 3 ]  R2↔R3 →  [ 0 1 0 ]
[ 0 1 0 ]            [ 0 0 3 ]
Since the row echelon form has three nonzero rows, rank A = 3.
(2) A is already in row echelon form, and it has two nonzero rows, so rank A = 2.
(3) A is already in row echelon form, and it has two nonzero rows, so rank A = 2.
(4) A is already in row echelon form, and it has no nonzero rows, so rank A = 0.
(5) Since A is not in row echelon form, we first put it in row echelon form:
[ 1 0 3 −4 0 ]            [ 1 0 3 −4 0 ]
[ 0 0 0  0 0 ]  R2↔R3 →  [ 0 1 5  0 1 ]
[ 0 1 5  0 1 ]            [ 0 0 0  0 0 ]
Since the row echelon form has two nonzero rows, rank A = 2.
(6) Since A is not in row echelon form, we first put it in row echelon form:
[ 0 0 1 ]            [ 1 0 0 ]
[ 0 1 0 ]  R1↔R3 →  [ 0 1 0 ]
[ 1 0 0 ]            [ 0 0 1 ]
Since the row echelon form has three nonzero rows, rank A = 3.
(7) Since A is not in row echelon form, we first put it in row echelon form:
[ 1 2 3 ]          [ 1  2  3 ]              [ 1  2   3  ]            [ 1  2   3  ]
[ 1 0 0 ]  R2−R1  [ 0 −2 −3 ]  R3+(1/2)R2  [ 0 −2  −3  ]  R4+2R3 →  [ 0 −2  −3  ]
[ 0 1 1 ]    →    [ 0  1  1 ]      →       [ 0  0 −1/2 ]            [ 0  0 −1/2 ]
[ 0 0 1 ]          [ 0  0  1 ]              [ 0  0   1  ]            [ 0  0   0  ]
Since the row echelon form has three nonzero rows, rank A = 3.
(8) A is already in row echelon form, and it has three nonzero rows, so rank A = 3.
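Counting nonzero rows after forward elimination, as in this exercise, can be sketched as follows (the function name is ours; exact arithmetic avoids rounding issues):

```python
from fractions import Fraction

def rank(M):
    """Rank = number of nonzero rows after forward (Gaussian) elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][c] / A[r][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return r
```

Applied to the matrices of parts (1) and (7) this gives 3, matching the answers above.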
24. Such a matrix has 0, 1, 2, or 3 nonzero rows.
• If it has no nonzero rows, then all rows are zero, so the only such matrix is the zero matrix
[ 0 0 0 ]
[ 0 0 0 ]
[ 0 0 0 ].
• If it has one nonzero row, that must be the first row, and any entries in that row following the 1 are arbitrary, so the possibilities are
[ 1 ∗ ∗ ]     [ 0 1 ∗ ]     [ 0 0 1 ]
[ 0 0 0 ],    [ 0 0 0 ],    [ 0 0 0 ],
[ 0 0 0 ]     [ 0 0 0 ]     [ 0 0 0 ]
where ∗ denotes that any entry is allowed.
• If it has two nonzero rows, then the third row is zero. There are several possibilities for the arrangements of 1s in the first two rows: they could be in columns 1 and 2, 1 and 3, or 2 and 3. In each case, the entries following each 1 are arbitrary, except for the entry in the first row above the 1 in the second row. This gives the possibilities
[ 1 0 ∗ ]     [ 1 ∗ 0 ]     [ 0 1 0 ]
[ 0 1 ∗ ],    [ 0 0 1 ],    [ 0 0 1 ].
[ 0 0 0 ]     [ 0 0 0 ]     [ 0 0 0 ]
• If the matrix has all nonzero rows, then the only possibility is the identity matrix
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ].
Note that all other entries must be zero since each column contains a leading 1.
25. First row-reduce the augmented matrix of the system:

[ 1  2 −3 | 9 ]
[ 2 −1  1 | 0 ]
[ 4 −1  1 | 4 ]

R2 − 2R1, R3 − 4R1:
[ 1  2 −3 |   9 ]
[ 0 −5  7 | −18 ]
[ 0 −9 13 | −32 ]

−(1/5)R2:
[ 1  2  −3  |    9 ]
[ 0  1 −7/5 | 18/5 ]
[ 0 −9  13  |  −32 ]

R3 + 9R2, then (5/2)R3:
[ 1 2  −3  |    9 ]
[ 0 1 −7/5 | 18/5 ]
[ 0 0   1  |    1 ]

R1 + 3R3, R2 + (7/5)R3:
[ 1 2 0 | 12 ]
[ 0 1 0 |  5 ]
[ 0 0 1 |  1 ]

R1 − 2R2:
[ 1 0 0 | 2 ]
[ 0 1 0 | 5 ]
[ 0 0 1 | 1 ]

Thus the solution is [x1, x2, x3] = [2, 5, 1].
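As an independent cross-check of this solution, one can solve the same 3×3 system by Cramer's rule (a different method from the row reduction used here; function names are ours):

```python
def det3(M):
    """3x3 determinant by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer3(A, b):
    """Solve a 3x3 system Ax = b by Cramer's rule (requires det A != 0)."""
    D = det3(A)
    x = []
    for j in range(3):
        Aj = [row[:] for row in A]
        for i in range(3):
            Aj[i][j] = b[i]       # replace column j by the right-hand side
        x.append(det3(Aj) / D)
    return x

A = [[1, 2, -3], [2, -1, 1], [4, -1, 1]]
x = cramer3(A, [9, 0, 4])  # [2.0, 5.0, 1.0]
```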
26. First row-reduce the augmented matrix of the system:

[  1 −1 1 | 0 ]
[ −1  3 1 | 5 ]
[  3  1 7 | 2 ]

R2 + R1, R3 − 3R1:
[ 1 −1 1 | 0 ]
[ 0  2 2 | 5 ]
[ 0  4 4 | 2 ]

(1/2)R2:
[ 1 −1 1 |   0 ]
[ 0  1 1 | 5/2 ]
[ 0  4 4 |   2 ]

R3 − 4R2:
[ 1 −1 1 |   0 ]
[ 0  1 1 | 5/2 ]
[ 0  0 0 |  −8 ]
This is an inconsistent system since the bottom row of the matrix is zero but the constant is nonzero.
Thus the system has no solution.
27. First row-reduce the augmented matrix of the system:

[  1 −3 −2 | 0 ]
[ −1  2  1 | 0 ]
[  2  4  6 | 0 ]

R2 + R1, R3 − 2R1:
[ 1 −3 −2 | 0 ]
[ 0 −1 −1 | 0 ]
[ 0 10 10 | 0 ]

−R2, then R3 − 10R2:
[ 1 −3 −2 | 0 ]
[ 0  1  1 | 0 ]
[ 0  0  0 | 0 ]

R1 + 3R2:
[ 1 0 1 | 0 ]
[ 0 1 1 | 0 ]
[ 0 0 0 | 0 ]
Since there is a row of zeros, and no leading 1 in the third column, we know that x3 is a free variable, say x3 = t. Back substituting gives x2 + t = 0 and x1 + t = 0, so that x1 = x2 = −t. So the solution is
[x1, x2, x3] = [−t, −t, t] = t[−1, −1, 1].
28. First row-reduce the augmented matrix of the system:

[ 2  3 −1  4 | 1 ]
[ 3 −1  0  1 | 1 ]
[ 3 −4  1 −1 | 2 ]

(1/2)R1:
[ 1 3/2 −1/2 2 | 1/2 ]
[ 3 −1    0  1 |  1  ]
[ 3 −4    1 −1 |  2  ]

R2 − 3R1, R3 − 3R1:
[ 1   3/2 −1/2  2 |  1/2 ]
[ 0 −11/2  3/2 −5 | −1/2 ]
[ 0 −17/2  5/2 −7 |  1/2 ]

−(2/11)R2, then R3 + (17/2)R2:
[ 1 3/2  −1/2     2  |  1/2  ]
[ 0  1  −3/11  10/11 |  1/11 ]
[ 0  0   2/11   8/11 | 14/11 ]

(11/2)R3:
[ 1 3/2  −1/2     2  | 1/2  ]
[ 0  1  −3/11  10/11 | 1/11 ]
[ 0  0    1      4   |  7   ]

This matrix is in row echelon form, but we can continue, putting it in reduced row echelon form:

R1 − (3/2)R2:
[ 1 0 −1/11   7/11 | 4/11 ]
[ 0 1 −3/11  10/11 | 1/11 ]
[ 0 0   1      4   |  7   ]

R1 + (1/11)R3, R2 + (3/11)R3:
[ 1 0 0 1 | 1 ]
[ 0 1 0 2 | 2 ]
[ 0 0 1 4 | 7 ]
Using z = t as the parameter gives w = 1 − t, x = 2 − 2t, y = 7 − 4t, z = t for the solution, or
[w, x, y, z] = [1 − t, 2 − 2t, 7 − 4t, t] = [1, 2, 7, 0] + t[−1, −2, −4, 1].
29. First row-reduce the augmented matrix of the system:

[ 2 1 |  3 ]
[ 4 1 |  7 ]
[ 2 5 | −1 ]

(1/2)R1:
[ 1 1/2 | 3/2 ]
[ 4  1  |  7  ]
[ 2  5  | −1  ]

R2 − 4R1, R3 − 2R1:
[ 1 1/2 | 3/2 ]
[ 0 −1  |  1  ]
[ 0  4  | −4  ]

−R2:
[ 1 1/2 | 3/2 ]
[ 0  1  | −1  ]
[ 0  4  | −4  ]

R1 − (1/2)R2, R3 − 4R2:
[ 1 0 |  2 ]
[ 0 1 | −1 ]
[ 0 0 |  0 ]

The solution is [x1, x2] = [2, −1].
30. First row-reduce the augmented matrix of the system:

[ −1  3 −2  4 |  0 ]
[  2 −6  1 −2 | −3 ]
[  1 −3  4 −8 |  2 ]

−R1:
[ 1 −3  2 −4 |  0 ]
[ 2 −6  1 −2 | −3 ]
[ 1 −3  4 −8 |  2 ]

R2 − 2R1, R3 − R1:
[ 1 −3  2 −4 |  0 ]
[ 0  0 −3  6 | −3 ]
[ 0  0  2 −4 |  2 ]

−(1/3)R2, then R3 − 2R2:
[ 1 −3 2 −4 | 0 ]
[ 0  0 1 −2 | 1 ]
[ 0  0 0  0 | 0 ]

R1 − 2R2:
[ 1 −3 0  0 | −2 ]
[ 0  0 1 −2 |  1 ]
[ 0  0 0  0 |  0 ]

The system has been reduced to
x1 − 3x2 = −2
x3 − 2x4 =  1.
x2 and x4 are free variables; setting x2 = s and x4 = t and back-substituting gives for the solution
[x1, x2, x3, x4] = [3s − 2, s, 2t + 1, t] = [−2, 0, 1, 0] + s[3, 1, 0, 0] + t[0, 0, 2, 1].
31. First row-reduce the augmented matrix of the system:

[ 1/2  1  −1 −6   0 |  2 ]
[ 1/6 1/2  0 −3   1 | −1 ]
[ 1/3  0  −2  0  −4 |  8 ]

2R1, 6R2, 3R3 (to clear the fractions):
[ 1 2 −2 −12   0 |  4 ]
[ 1 3  0 −18   6 | −6 ]
[ 1 0 −6   0 −12 | 24 ]

R2 − R1, R3 − R1:
[ 1  2 −2 −12   0 |   4 ]
[ 0  1  2  −6   6 | −10 ]
[ 0 −2 −4  12 −12 |  20 ]

R3 + 2R2:
[ 1 2 −2 −12 0 |   4 ]
[ 0 1  2  −6 6 | −10 ]
[ 0 0  0   0 0 |   0 ]

R1 − 2R2:
[ 1 0 −6  0 −12 |  24 ]
[ 0 1  2 −6   6 | −10 ]
[ 0 0  0  0   0 |   0 ]
Since the matrix has rank 2 but there are 5 variables, there are three free variables, say x3 = r, x4 = s, and x5 = t. Back-substituting gives x1 − 6r − 12t = 24 and x2 + 2r − 6s + 6t = −10, so that x1 = 24 + 6r + 12t and x2 = −10 − 2r + 6s − 6t. Thus the solution is
[x1, x2, x3, x4, x5] = [24, −10, 0, 0, 0] + r[6, −2, 1, 0, 0] + s[0, 6, 0, 1, 0] + t[12, −6, 0, 0, 1].
32. First row-reduce the augmented matrix of the system:

[ √2  1   2 |  1  ]
[  0 √2  −3 | −√2 ]
[  0 −1  √2 |  1  ]

(1/√2)R1, (1/√2)R2:
[ 1 √2/2    √2   | √2/2 ]
[ 0   1  −3√2/2 |  −1  ]
[ 0  −1     √2   |   1  ]

R1 − (√2/2)R2 and R3 + R2:
[ 1 0 √2 + 3/2 | √2 ]
[ 0 1  −3√2/2  | −1 ]
[ 0 0  −√2/2   |  0 ]

−√2 R3, then R1 − (√2 + 3/2)R3 and R2 + (3√2/2)R3:
[ 1 0 0 | √2 ]
[ 0 1 0 | −1 ]
[ 0 0 1 |  0 ]

so the solution is [x, y, z] = [√2, −1, 0].
33. First row-reduce the augmented matrix of the system:

[ 1  1  2 1 |  1 ]
[ 1 −1 −1 1 |  0 ]
[ 0  1  1 0 | −1 ]
[ 1  1  0 1 |  2 ]

R2 − R1, R4 − R1:
[ 1  1  2 1 |  1 ]
[ 0 −2 −3 0 | −1 ]
[ 0  1  1 0 | −1 ]
[ 0  0 −2 0 |  1 ]

R2 ↔ R3:
[ 1  1  2 1 |  1 ]
[ 0  1  1 0 | −1 ]
[ 0 −2 −3 0 | −1 ]
[ 0  0 −2 0 |  1 ]

R3 + 2R2:
[ 1  1  2 1 |  1 ]
[ 0  1  1 0 | −1 ]
[ 0  0 −1 0 | −3 ]
[ 0  0 −2 0 |  1 ]

R4 − 2R3:
[ 1  1  2 1 |  1 ]
[ 0  1  1 0 | −1 ]
[ 0  0 −1 0 | −3 ]
[ 0  0  0 0 |  7 ]
7
We need not proceed any further — since the last row says 0 = 7, we know the system is inconsistent
and has no solution.
34. First row-reduce the augmented matrix of the system:

[ 1 1  1  1 |  4 ]
[ 1 2  3  4 | 10 ]
[ 1 3  6 10 | 20 ]
[ 1 4 10 20 | 35 ]

R2 − R1, R3 − R1, R4 − R1:
[ 1 1 1  1 |  4 ]
[ 0 1 2  3 |  6 ]
[ 0 2 5  9 | 16 ]
[ 0 3 9 19 | 31 ]

R3 − 2R2, R4 − 3R2:
[ 1 1 1  1 |  4 ]
[ 0 1 2  3 |  6 ]
[ 0 0 1  3 |  4 ]
[ 0 0 3 10 | 13 ]

R4 − 3R3:
[ 1 1 1 1 | 4 ]
[ 0 1 2 3 | 6 ]
[ 0 0 1 3 | 4 ]
[ 0 0 0 1 | 1 ]

R1 − R4, R2 − 3R4, R3 − 3R4:
[ 1 1 1 0 | 3 ]
[ 0 1 2 0 | 3 ]
[ 0 0 1 0 | 1 ]
[ 0 0 0 1 | 1 ]

R1 − R3, R2 − 2R3:
[ 1 1 0 0 | 2 ]
[ 0 1 0 0 | 1 ]
[ 0 0 1 0 | 1 ]
[ 0 0 0 1 | 1 ]

R1 − R2:
[ 1 0 0 0 | 1 ]
[ 0 1 0 0 | 1 ]
[ 0 0 1 0 | 1 ]
[ 0 0 0 1 | 1 ]

so that [a, b, c, d] = [1, 1, 1, 1] is the unique solution.
35. If we reverse the first and third rows, we get a system in row echelon form with a leading 1 in every
row, so rank A = 3. Thus this is a system of three equations in three variables with 3 − 3 = 0 free
variables, so it has a unique solution.
36. Note that R3 −2R2 results in the row 0 0 0 0 | 2, so that the system is inconsistent and has no solutions.
37. This system is a homogeneous system with four variables and only three equations, so the rank of the
matrix is at most 3 and thus there is at least one free variable. So this system has infinitely many
solutions.
38. Note that R3 ← R3 − R1 − R2 gives a zero row, and then R2 ← R2 − 6R1 gives a matrix in row echelon
form. So there are three free variables, and this system has infinitely many solutions.
39. It suffices to show that the row-reduced matrix has no zero rows, since it then has rank 2, so there are
no free variables and no possibility of an inconsistent system.
First, if a = 0, then since ad − bc = −bc ≠ 0, we see that both b and c are nonzero. Then row-reduce the augmented matrix:
[ 0 b | r ]  R1↔R2  [ c d | s ]
[ c d | s ]    →    [ 0 b | r ].
Since b and c are both nonzero, this is a consistent system with a unique solution.
Second, if c = 0, then since ad − bc = ad ≠ 0, we see that both a and d are nonzero. Then the augmented matrix is
[ a b | r ]
[ 0 d | s ].
This is row-reduced since both a and d are nonzero, and it clearly has rank 2, so this is a consistent system with a unique solution.
Finally, if a ≠ 0 and c ≠ 0, then row-reducing gives
[ a b | r ]  cR1, aR2  [ ac bc | rc ]  R2−R1  [ ac    bc   |    rc    ]
[ c d | s ]     →      [ ac ad | as ]    →    [  0 ad − bc | as − rc ].
Since ac ≠ 0 and also ad − bc ≠ 0, this is a rank 2 matrix, so it has a unique solution.
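When ad − bc ≠ 0 the unique solution also has the explicit closed form x = (rd − bs)/(ad − bc), y = (as − rc)/(ad − bc). A sketch (the function name is ours):

```python
def solve2(a, b, c, d, r, s):
    """Unique solution of ax + by = r, cx + dy = s when ad - bc != 0."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0: no unique solution")
    return ((r * d - b * s) / det, (a * s - r * c) / det)

x, y = solve2(1, 1, 1, -1, 3, 1)  # x + y = 3, x - y = 1  ->  (2.0, 1.0)
```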
40. First reduce the augmented matrix of the given system to row echelon form:
[ k  2 |  3 ]  R1↔R2  [ 2 −4 | −6 ]  R2−(k/2)R1  [ 2   −4    |   −6   ]
[ 2 −4 | −6 ]    →    [ k  2 |  3 ]      →       [ 0 2 + 2k | 3 + 3k ]
(a) The system has no solution when this matrix has a zero row with a corresponding nonzero constant.
Now, 2 + 2k = 0 for k = −1, in which case 3 + 3k = 0 as well. Thus there is no value of k for
which the system has no solutions.
(b) The system has a unique solution when rank A = 2, which happens when the bottom row is nonzero, i.e., for k ≠ −1.
(c) The system has an infinite number of solutions when A has a zero row with a zero constant. From
part (a), this happens when k = −1.
41. First row-reduce the augmented matrix of the system:
[ 1 k | 1 ]  R2−kR1  [ 1    k   |   1   ]
[ k 1 | 1 ]     →    [ 0 1 − k² | 1 − k ]
(a) If 1 − k² = 0 but 1 − k ≠ 0, this system will be inconsistent and have no solution. 1 − k² = 0 for k = 1 and k = −1; of these, only k = −1 produces 1 − k ≠ 0. Thus the system has no solution for k = −1.
(b) The system has a unique solution when 1 − k² ≠ 0, so for k ≠ ±1.
(c) The system has infinitely many solutions when the last row is zero, so when 1 − k² = 1 − k = 0. This happens for k = 1.
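The case analysis for k can be automated. This sketch (names ours) classifies the system x + ky = 1, kx + y = 1 from the reduced second row [0, 1 − k² | 1 − k]:

```python
from fractions import Fraction

def classify(k):
    """'none', 'unique', or 'infinite' solutions of x + ky = 1, kx + y = 1."""
    # After R2 - k R1 the second row is [0, 1 - k^2 | 1 - k].
    a, b = Fraction(1) - k * k, Fraction(1) - k
    if a == 0:
        return "infinite" if b == 0 else "none"
    return "unique"

results = {k: classify(Fraction(k)) for k in (-1, 0, 1, 2)}
# {-1: 'none', 0: 'unique', 1: 'infinite', 2: 'unique'}
```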
42. First reduce the augmented matrix of the given system to row echelon form:

[ 1 −2 3 |  2 ]
[ 1  1 1 |  k ]
[ 2 −1 4 | k² ]

R2 − R1, R3 − 2R1:
[ 1 −2  3 |   2    ]
[ 0  3 −2 | k − 2  ]
[ 0  3 −2 | k² − 4 ]

R3 − R2:
[ 1 −2  3 |      2      ]
[ 0  3 −2 |    k − 2    ]
[ 0  0  0 | k² − k − 2 ]
(a) The system has no solution when the matrix has a zero row with a corresponding nonzero constant. This occurs for k² − k − 2 ≠ 0, i.e. for k ≠ 2, −1.
(b) Since the augmented matrix has a zero row, the system does not have a unique solution for any value of k; z is a free variable.
(c) The system has an infinite number of solutions when the augmented matrix has a zero row with a zero constant. This occurs for k² − k − 2 = 0, i.e. for k = 2, −1.
43. First reduce the augmented matrix of the given system to row echelon form:

[ 1 1 k |  1 ]
[ 1 k 1 |  1 ]
[ k 1 1 | −2 ]

R2 − R1, R3 − kR1:
[ 1   1     k    |    1   ]
[ 0 k − 1 1 − k  |    0   ]
[ 0 1 − k 1 − k² | −2 − k ]

R3 + R2:
[ 1   1       k      |    1   ]
[ 0 k − 1   1 − k    |    0   ]
[ 0   0   2 − k − k² | −2 − k ]
(a) The system has no solution when the matrix has a zero row with a corresponding nonzero constant. This occurs when 2 − k − k² = (1 − k)(2 + k) = 0 but −2 − k ≠ 0. For k = 1, −2 − k = −3 ≠ 0, but for k = −2, −2 − k = 0. Thus the system has no solution only when k = 1.
(b) The system has a unique solution when 2 − k − k² ≠ 0; that is, when k ≠ 1 and k ≠ −2.
(c) The system has an infinite number of solutions when the augmented matrix has a zero row with a zero constant. This occurs when 2 − k − k² = 0 and −2 − k = 0. From part (a), we see that this is exactly when k = −2.
44. There are many examples. For instance,
(a) The following system of n equations in n variables has infinitely many solutions (for example,
choose k and set x1 = k and x2 = −k, with the other xi = 0):
x1 + x2 + · · · + xn = 0
2x1 + 2x2 + · · · + 2xn = 0
..
.
nx1 + nx2 + · · · + nxn = 0 .
Let m be any integer greater than n. Then the following system of m equations in n variables
has infinitely many solutions (for example, choose k and set x1 = k and x2 = −k, with the other
xi = 0):
x1 + x2 + · · · + xn = 0
2x1 + 2x2 + · · · + 2xn = 0
..
.
mx1 + mx2 + · · · + mxn = 0 .
(b) The following system of n equations in n variables has a unique solution:
x1 = 0
x2 = 0
..
.
xn = 0 .
Let m be any integer greater than n. Then a system whose first n equations are as above, and
which continues with equations x1 + x2 = 0, x1 + 2x2 = 0, and so forth until there are m equations
clearly also has a unique solution.
45. Observe that the normal vectors to the planes, which are [3, 2, 1] and [2, −1, 4], are not parallel, so the
two planes will indeed intersect in a line. The line of intersection is the points in the solution set of
the system
3x + 2y + z = −1
2x −  y + 4z = 5
Reduce the corresponding augmented matrix to row echelon form:

[ 3  2 1 | −1 ]
[ 2 −1 4 |  5 ]

(1/3)R1:
[ 1 2/3 1/3 | −1/3 ]
[ 2 −1   4  |   5  ]

R2 − 2R1:
[ 1  2/3  1/3 | −1/3 ]
[ 0 −7/3 10/3 | 17/3 ]

−(3/7)R2:
[ 1 2/3   1/3  |  −1/3 ]
[ 0  1  −10/7 | −17/7 ]

R1 − (2/3)R2:
[ 1 0   9/7  |   9/7 ]
[ 0 1 −10/7 | −17/7 ]

Thus z is a free variable; we set z = 7t to eliminate some fractions. Then
x = −9t + 9/7,    y = 10t − 17/7,
so that the line of intersection is
[x, y, z] = [9/7, −17/7, 0] + t[−9, 10, 7].
68
CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS
46. Observe that the normal vectors to the planes, which are [4, 1, 1] and [2, −1, 3], are not parallel, so the
two planes will indeed intersect in a line. The line of intersection is the points in the solution set of
the system
4x + y + z = 0
2x − y + 3z = 2
Reduce the corresponding augmented matrix to row echelon form:

    [ 4    1   1 | 0 ]  (1/4)R1   [ 1   1/4   1/4 | 0 ]  R2−2R1   [ 1    1/4   1/4 | 0 ]
    [ 2   −1   3 | 2 ]  −−−−→     [ 2   −1     3  | 2 ]  −−−−→    [ 0   −3/2   5/2 | 2 ]

    (−2/3)R2   [ 1   1/4    1/4 |    0 ]  R1−(1/4)R2   [ 1   0    2/3 |  1/3 ]
    −−−−−→     [ 0    1   −5/3  | −4/3 ]  −−−−−−→      [ 0   1   −5/3 | −4/3 ].

Letting the parameter be z = t, we get the parametric form of the solution:

    x = 1/3 − (2/3)t,    y = −4/3 + (5/3)t,    z = t.
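The two parametrizations just found can be checked by substituting them back into the plane equations of Exercises 45 and 46; a short numerical sketch:

```python
import numpy as np

# Substitute the parametric solutions back into both plane equations for
# several values of the parameter t.
for t in np.linspace(-2.0, 2.0, 9):
    # Exercise 45: x = 9/7 - 9t, y = -17/7 + 10t, z = 7t
    x, y, z = 9/7 - 9*t, -17/7 + 10*t, 7*t
    assert abs(3*x + 2*y + z - (-1)) < 1e-12   # 3x + 2y + z = -1
    assert abs(2*x - y + 4*z - 5) < 1e-12      # 2x -  y + 4z = 5
    # Exercise 46: x = 1/3 - (2/3)t, y = -4/3 + (5/3)t, z = t
    x, y, z = 1/3 - 2*t/3, -4/3 + 5*t/3, t
    assert abs(4*x + y + z) < 1e-12            # 4x + y + z = 0
    assert abs(2*x - y + 3*z - 2) < 1e-12      # 2x - y + 3z = 2
print("both parametrizations lie on their planes")
```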
47. We are looking for examples, so we should keep in mind simple planes that we know, such as x = 0,
y = 0, and z = 0.
(a) Starting with x = 0 and y = 0, these planes intersect when x = y = 0, so they intersect on the
z-axis. So we must find another plane that contains the z-axis. Since the line y = x in R2 passes
through the origin, the plane y = x in R3 will have z as a free variable, and it contains (0, 0, 0),
so it contains the z-axis. Thus the three planes x = 0, y = 0, and x = y intersect in a line, the
z-axis. A sketch of these three planes is:
(b) Again start with x = 0 and y = 0; we are looking for a plane that intersects both of these, but not
on the z-axis. Looking down from the top, it seems that any plane whose projection onto the xy
plane is a line in the first quadrant will suffice. Since x + y = 1 is such a line, the plane x + y = 1
should intersect each of the other planes, but there will be no common points of intersection.
To find the line of intersection of x = 0 and x + y = 1, solve that pair of equations. Using
forward substitution we get y = 1, so the line of intersection is the line x = 0, y = 1, which is
the line [0, 1, t]. To find the line of intersection of y = 0 and x + y = 1, again solve using forward
substitution, giving the line x = 1, y = 0, which is the line [1, 0, t]. Clearly these two lines have
no common points. A sketch of these three planes is:
(c) Clearly x = 0 and x = 1 are parallel, while y = 0 intersects each of them in a line. A sketch of
these three planes is:
(d) The three planes x = 0, y = 0, and z = 0 intersect only at the origin:
48. The two lines have parametric forms p + su and q + tv, where s and t are the parameters. These two
are equal when
p + su = q + tv, or su − tv = q − p.
Substitute the given values of the four vectors to get

      [  1 ]     [ −1 ]   [ 2 ]   [ −1 ]   [  3 ]        s +  t =  3
    s [  2 ] − t [  1 ] = [ 2 ] − [  2 ] = [  0 ]   ⇒   2s −  t =  0
      [ −1 ]     [  0 ]   [ 0 ]   [  1 ]   [ −1 ]       −s      = −1 .

From the third equation, s = 1; substituting into the first equation gives t = 2. Since s = 1, t = 2
satisfies the second equation, this is the solution. Thus the point of intersection is

    [ x ]             [ 0 ]
    [ y ] = p + 1u =  [ 4 ].
    [ z ]             [ 0 ]
49. The two lines have parametric forms p + su and q + tv, where s and t are the parameters. These two
are equal when
p + su = q + tv, or su − tv = q − p.
Substitute the given values of the four vectors to get

      [ 1 ]     [ 2 ]   [ −1 ]   [ 3 ]   [ −4 ]       s − 2t = −4
    s [ 0 ] − t [ 3 ] = [  1 ] − [ 1 ] = [  0 ]   ⇒     − 3t =  0
      [ 1 ]     [ 1 ]   [ −1 ]   [ 0 ]   [ −1 ]       s −  t = −1 .

From the second equation, t = 0; setting t = 0 in the other two equations gives s = −4 and s = −1.
This is impossible, so the system has no solution and the lines do not intersect.
50. Let Q = (a, b, c); then we want to find values of Q such that the line q + sv intersects the line p + tu.
The two lines intersect when
p + su = q + tv, or su − tv = q − p.
Substitute the values of the vectors to get

      [  1 ]     [ 2 ]   [ a ]   [  1 ]   [ a − 1 ]       s − 2t = a − 1
    s [  1 ] − t [ 1 ] = [ b ] − [  2 ] = [ b − 2 ]   ⇒   s −  t = b − 2
      [ −1 ]     [ 0 ]   [ c ]   [ −3 ]   [ c + 3 ]       −s     = c + 3 .

Solving those equations for a, b, and c gives

    Q = (a, b, c) = (s − 2t + 1, s − t + 2, −s − 3).

Thus any pair of numbers s and t determine a point Q at which the two given lines intersect.
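The computations in Exercises 48 and 49 both amount to solving su − tv = q − p and checking consistency; as a sketch, the same test can be automated with numpy's least-squares solver (a zero residual means the lines meet):

```python
import numpy as np

# Decide whether the lines p + su and q + tv intersect by solving
# su - tv = q - p in the least-squares sense and testing the residual.
def intersect(p, u, q, v):
    A = np.column_stack([u, -v])
    st, *_ = np.linalg.lstsq(A, q - p, rcond=None)
    if np.allclose(A @ st, q - p):
        return p + st[0] * u       # a point lying on both lines
    return None                    # su - tv = q - p is inconsistent

# Exercise 48: the lines meet at (0, 4, 0)
print(intersect(np.array([-1.0, 2.0, 1.0]), np.array([1.0, 2.0, -1.0]),
                np.array([2.0, 2.0, 0.0]), np.array([-1.0, 1.0, 0.0])))

# Exercise 49: the lines do not meet
print(intersect(np.array([3.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0]),
                np.array([-1.0, 1.0, -1.0]), np.array([2.0, 3.0, 1.0])))
```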
51. If x = [x1 , x2 , x3 ] satisfies u · x = v · x = 0, then the components of x satisfy the system
u1 x1 + u2 x2 + u3 x3 = 0
v1 x1 + v2 x2 + v3 x3 = 0.
There are several cases. First, if either vector is zero, then the cross product is zero, and the statement
does not hold: any vector that is orthogonal to the other vector is orthogonal to both, but is not a
multiple of the zero vector. Otherwise, both vectors are nonzero. Assume that u is the vector with a
lowest-numbered nonzero component (that is, if u1 = 0 but u2 ≠ 0, while v1 ≠ 0, then reverse u and
v). The augmented matrix of the system is

    [ u1   u2   u3 | 0 ]
    [ v1   v2   v3 | 0 ].

If u1 = u2 = 0 but u3 ≠ 0, then v1 = v2 = 0 and v3 ≠ 0 as well, so that u and v are parallel and thus
u × v = 0. However, any vector of the form

    [ s ]
    [ t ]
    [ 0 ]

is orthogonal to both u and v, so the result does not hold in this case either.
Next, if u1 = 0 but u2 ≠ 0, then v1 = 0 as well, so the augmented matrix has the form

    [ 0   u2   u3 | 0 ]  u2·R2   [ 0    u2      u3   | 0 ]  R2−v2·R1   [ 0   u2        u3       | 0 ]
    [ 0   v2   v3 | 0 ]  −−−→    [ 0   u2v2    u2v3  | 0 ]  −−−−−→     [ 0    0   u2v3 − u3v2   | 0 ].

Then x1 is a free variable. If u2v3 − u3v2 = 0, then u = [0; u2; u3] and v = [0; v2; v3] are again
parallel, so their cross product is zero and again the result does not hold. But if u2v3 − u3v2 ≠ 0,
then x3 = 0, which implies that x2 = 0 as well. So any vector that is orthogonal to both u and v in
this case is of the form

    [ x1 ]   [ s ]
    [ x2 ] = [ 0 ],
    [ x3 ]   [ 0 ]

and

    [ 0  ]   [ 0  ]   [ u2·v3 − u3·v2 ]   [ u2v3 − u3v2 ]
    [ u2 ] × [ v2 ] = [ u3·0 − 0·v3   ] = [      0      ],
    [ u3 ]   [ v3 ]   [ 0·v2 − 0·u2   ]   [      0      ]

and since u2v3 − u3v2 ≠ 0, the orthogonal vectors are multiples of the cross product.
Finally, assume that u1 ≠ 0. Then the augmented matrix reduces as follows:

    [ u1   u2   u3 | 0 ]  u1·R2   [  u1     u2     u3   | 0 ]  R2−v1·R1   [ u1       u2            u3        | 0 ]
    [ v1   v2   v3 | 0 ]  −−−→    [ u1v1   u1v2   u1v3  | 0 ]  −−−−−→     [  0   u1v2 − u2v1   u1v3 − u3v1   | 0 ].
Now consider the last row of this matrix. If it consists of all zeros, then u1v2 = u2v1 and u1v3 = u3v1.
If v1 ≠ 0, then u2/u1 = v2/v1 and also u3/u1 = v3/v1, so that again u and v are parallel and the result
does not hold. Otherwise, if v1 = 0, then u1v2 = u1v3 = 0, and u1 was assumed nonzero, so that
v2 = v3 = 0 and thus v is the zero vector, which it was assumed not to be. So at least one of
u1v2 − u2v1 and u1v3 − u3v1 is nonzero.
If they are both nonzero, then x3 = t is free, and

    (u1v2 − u2v1)x2 = (u3v1 − u1v3)t,    so that    x2 = ((u3v1 − u1v3)/(u1v2 − u2v1)) t.

But then u1x1 + u2x2 + u3x3 = 0, so that

    u1x1 + u2 ((u3v1 − u1v3)/(u1v2 − u2v1)) t + u3t = u1x1 + (u1(u3v2 − u2v3)/(u1v2 − u2v1)) t = 0,

and thus

    x1 = ((u2v3 − u3v2)/(u1v2 − u2v1)) t.
Thus the orthogonal vectors are

    [ x1 ]       [ (u2v3 − u3v2)/(u1v2 − u2v1) ]
    [ x2 ] = t   [ (u3v1 − u1v3)/(u1v2 − u2v1) ],
    [ x3 ]       [              1              ]

or, clearing fractions,

    [ x1 ]       [ u2v3 − u3v2 ]
    [ x2 ] = t   [ u3v1 − u1v3 ],
    [ x3 ]       [ u1v2 − u2v1 ]

so that orthogonal vectors are multiples of the cross product.
If u1v2 − u2v1 ≠ 0 but u1v3 − u3v1 = 0, then again v1 = 0 forces v = 0, which is impossible. So v1 ≠ 0
and thus u3/u1 = v3/v1. Now, x3 = t is free and x2 = 0, so that u1x1 + u3t = 0 and then x1 = −(u3/u1)t.
Let t = (u1v2 − u2v1)s; then

    x1 = −(u3/u1)(u1v2 − u2v1)s = −u3v2 s + (u3/u1)u2v1 s = −u3v2 s + (v3/v1)u2v1 s = (u2v3 − u3v2)s.

Thus orthogonal vectors are of the form

    [ x1 ]       [ u2v3 − u3v2 ]
    [ x2 ] = s   [      0      ],
    [ x3 ]       [ u1v2 − u2v1 ]

and since u1v3 − u3v1 = 0, this column vector is s(u × v), and again the result holds.
The final case is that u1v3 − u3v1 ≠ 0 but u1v2 − u2v1 = 0. Again v1 = 0 forces v = 0, which is
impossible, so that v1 ≠ 0 and thus u2/u1 = v2/v1. Now, x2 = t is free and x3 = 0, so that u1x1 + u2t = 0
and then x1 = −(u2/u1)t. Let t = (u1v3 − u3v1)s; then

    x1 = −(u2/u1)(u1v3 − u3v1)s = −u2v3 s + (u2/u1)u3v1 s = −u2v3 s + (v2/v1)u3v1 s = (u3v2 − u2v3)s.

Thus orthogonal vectors are of the form

    [ x1 ]       [ u3v2 − u2v3 ]
    [ x2 ] = s   [ u1v3 − u3v1 ],
    [ x3 ]       [      0      ]

and since u1v2 − u2v1 = 0, this column vector is −s(u × v), so again the orthogonal vectors are
multiples of the cross product.
Whew.
52. To show that the lines are skew, note first that they have nonparallel direction vectors, so that they
are not parallel. Suppose there is some x such that
x = p + su = q + tv.
Then su − tv = q − p. Substituting the given values into this equation gives

      [  2 ]     [  0 ]   [  0 ]   [ 1 ]   [ −1 ]        2s      = −1
    s [ −3 ] − t [  6 ] = [  1 ] − [ 1 ] = [  0 ]   ⇒   −3s − 6t =  0
      [  1 ]     [ −1 ]   [ −1 ]   [ 0 ]   [ −1 ]        s +  t = −1 .

But the first equation gives s = −1/2, so that from the second equation t = 1/4. But this pair of values
does not satisfy the third equation. So the system has no solution and thus the lines do not intersect.
Since they are not parallel, they are skew.
Next, we wish to find a pair of parallel planes, one containing each line. Since u and v are direction
vectors for the lines, those direction vectors will lie in both planes since the planes are required to be
parallel. So a normal vector is given by

            [ −3 · (−1) − 1 · 6 ]   [ −3 ]
    u × v = [  1 · 0 − 2 · (−1) ] = [  2 ].
            [  2 · 6 − (−3) · 0 ]   [ 12 ]
So the planes we are looking for will have equations of the form −3x + 2y + 12z = d. The plane that
passes through P = (1, 1, 0) must satisfy −3 · 1 + 2 · 1 + 12 · 0 = d, so that d = −1. The plane that
passes through Q = (0, 1, −1) must satisfy −3 · 0 + 2 · 1 + 12 · (−1) = d, so that d = −10. Hence the
two planes are
−3x + 2y + 12z = −1 containing p + su,
−3x + 2y + 12z = −10 containing q + tv.
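A quick numerical check of this answer: the cross product gives the common normal, and dotting it with P and Q recovers the two constants d.

```python
import numpy as np

# Direction vectors of the two lines and a point on each (Exercise 52).
u, v = np.array([2.0, -3.0, 1.0]), np.array([0.0, 6.0, -1.0])
P, Q = np.array([1.0, 1.0, 0.0]), np.array([0.0, 1.0, -1.0])

n = np.cross(u, v)   # normal to both parallel planes
print(n)             # [-3.  2. 12.]
print(n @ P)         # -1.0:  the plane -3x + 2y + 12z = -1
print(n @ Q)         # -10.0: the plane -3x + 2y + 12z = -10
```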
53. Over Z3 ,

    [ 1   2 | 1 ]  R2+2R1   [ 1   2 | 1 ]  2R2   [ 1   2 | 1 ]  R1+R2   [ 1   0 | 0 ]
    [ 1   1 | 2 ]  −−−−→    [ 0   2 | 1 ]  −−→   [ 0   1 | 2 ]  −−−→    [ 0   1 | 2 ].

Thus the solution is

    [ x ]   [ 0 ]
    [ y ] = [ 2 ].
54. Over Z2 ,

    [ 1   1   0 | 1 ]  R3+R1   [ 1   1   0 | 1 ]  R3+R2   [ 1   1   0 | 1 ]  R1+R2   [ 1   0   1 | 1 ]
    [ 0   1   1 | 0 ]  −−−→    [ 0   1   1 | 0 ]  −−−→    [ 0   1   1 | 0 ]  −−−→    [ 0   1   1 | 0 ]
    [ 1   0   1 | 1 ]          [ 0   1   1 | 0 ]          [ 0   0   0 | 0 ]          [ 0   0   0 | 0 ]

Thus z = t is a free variable, and we have x + t = 1, so that x = 1 − t = 1 + t, and also y + t = 0, so
that y = −t = t. Thus the solution is

    [ x ]   [ 1 + t ]    [ 1 ]  [ 0 ]
    [ y ] = [   t   ] ∈ {[ 0 ], [ 1 ]},
    [ z ]   [   t   ]    [ 0 ]  [ 1 ]

since the only possible values for t in Z2 are 0 and 1.
55. Over Z3 ,

    [ 1   1   0 | 1 ]  R3+2R1   [ 1   1   0 | 1 ]  R3+R2   [ 1   1   0 | 1 ]  R1+2R2   [ 1   0   2 | 1 ]
    [ 0   1   1 | 0 ]  −−−−→    [ 0   1   1 | 0 ]  −−−→    [ 0   1   1 | 0 ]  −−−−→    [ 0   1   1 | 0 ]
    [ 1   0   1 | 1 ]           [ 0   2   1 | 0 ]          [ 0   0   2 | 0 ]           [ 0   0   2 | 0 ]

    2R3   [ 1   0   2 | 1 ]  R1+R3    [ 1   0   0 | 1 ]
    −−→   [ 0   1   1 | 0 ]  R2+2R3   [ 0   1   0 | 0 ]
          [ 0   0   1 | 0 ]  −−−−→    [ 0   0   1 | 0 ]

Thus the unique solution is

    [ x ]   [ 1 ]
    [ y ] = [ 0 ].
    [ z ]   [ 0 ]
56. Over Z5 ,

    [ 3   2 | 1 ]  2R1   [ 1   4 | 2 ]  R2+4R1   [ 1   4 | 2 ]
    [ 1   4 | 1 ]  −−→   [ 1   4 | 1 ]  −−−−→    [ 0   0 | 4 ].

Since the bottom row says 0 = 4, this system is inconsistent and has no solutions.
57. Over Z7 ,

    [ 3   2 | 1 ]  5R1   [ 1   3 | 5 ]  R2+6R1   [ 1   3 | 5 ]  R1+4R2   [ 1   0 | 3 ]
    [ 1   4 | 1 ]  −−→   [ 1   4 | 1 ]  −−−−→    [ 0   1 | 3 ]  −−−−→    [ 0   1 | 3 ].

Thus the solution is

    [ x ]   [ 3 ]
    [ y ] = [ 3 ].
58. Over Z5 ,

    [ 1   0   0   4 | 1 ]  R2+4R1   [ 1   0   0   4 | 1 ]  R3+4R2   [ 1   0   0   4 | 1 ]
    [ 1   2   4   0 | 3 ]  R3+3R1   [ 0   2   4   1 | 2 ]  −−−−→    [ 0   2   4   1 | 2 ]
    [ 2   2   0   1 | 1 ]  R4+4R1   [ 0   2   0   3 | 4 ]           [ 0   0   1   2 | 2 ]
    [ 1   0   3   0 | 2 ]  −−−−→    [ 0   0   3   1 | 1 ]           [ 0   0   3   1 | 1 ]

    3R2   [ 1   0   0   4 | 1 ]  R4+2R3   [ 1   0   0   4 | 1 ]  R2+3R3   [ 1   0   0   4 | 1 ]
    −−→   [ 0   1   2   3 | 1 ]  −−−−→    [ 0   1   2   3 | 1 ]  −−−−→    [ 0   1   0   4 | 2 ]
          [ 0   0   1   2 | 2 ]           [ 0   0   1   2 | 2 ]           [ 0   0   1   2 | 2 ]
          [ 0   0   3   1 | 1 ]           [ 0   0   0   0 | 0 ]           [ 0   0   0   0 | 0 ]

Thus x4 = t is a free variable, and

    [ x1 ]   [ 1 − 4t ]   [ 1 + t  ]
    [ x2 ] = [ 2 − 4t ] = [ 2 + t  ]
    [ x3 ]   [ 2 − 2t ]   [ 2 + 3t ].
    [ x4 ]   [    t   ]   [    t   ]

Since t = 0, 1, 2, 3, or 4, there are five solutions:

    [ 1 ]  [ 2 ]  [ 3 ]  [ 4 ]      [ 0 ]
    [ 2 ]  [ 3 ]  [ 4 ]  [ 0 ]      [ 1 ]
    [ 2 ], [ 0 ], [ 3 ], [ 1 ], and [ 4 ].
    [ 0 ]  [ 1 ]  [ 2 ]  [ 3 ]      [ 4 ]
59. The Rank Theorem, which holds over Zp just as over R, says that if A is the matrix of a consistent
system, then the number of free variables is n − rank(A), where n is the number of variables. If there
is one free variable, then it can take on any of the p values in Zp ; since 1 = n − rank(A), the system
has exactly p = p1 = pn−rank(A) solutions. More generally, suppose there are k free variables. Then
k = n − rank(A). Each of the free variables can take any of p possible values, so altogether there are
pk choices for the values of the free variables. Each of these yields a different solution, so the total
number of solutions is pk = pn−rank(A) .
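The count p^(n − rank(A)) is easy to confirm by brute force. The sketch below (assuming the coefficient matrix shown in Exercises 54 and 55, i.e. the same equations read over Z2 and over Z3) row-reduces modulo a prime p to find the rank and then enumerates all p^n candidate vectors:

```python
from itertools import product

# Rank of an integer matrix over Z_p (p prime), by Gauss-Jordan mod p.
def rank_mod_p(A, p):
    A = [[x % p for x in row] for row in A]
    rank = 0
    for col in range(len(A[0])):
        piv = next((r for r in range(rank, len(A)) if A[r][col]), None)
        if piv is None:
            continue
        A[rank], A[piv] = A[piv], A[rank]
        inv = pow(A[rank][col], -1, p)        # multiplicative inverse mod p
        A[rank] = [inv * x % p for x in A[rank]]
        for r in range(len(A)):
            if r != rank and A[r][col]:
                f = A[r][col]
                A[r] = [(x - f * y) % p for x, y in zip(A[r], A[rank])]
        rank += 1
    return rank

# Count solutions of Ax = b over Z_p by trying every vector in (Z_p)^n.
def count_solutions(A, b, p):
    n = len(A[0])
    return sum(all(sum(a * x for a, x in zip(row, xs)) % p == bi % p
                   for row, bi in zip(A, b))
               for xs in product(range(p), repeat=n))

A, b = [[1, 1, 0], [0, 1, 1], [1, 0, 1]], [1, 0, 1]
for p in (2, 3):
    assert count_solutions(A, b, p) == p ** (3 - rank_mod_p(A, p))
print(count_solutions(A, b, 2), count_solutions(A, b, 3))   # 2 1
```

Over Z2 the rank is 2 and there are 2^(3−2) = 2 solutions; over Z3 the rank is 3 and the solution is unique, matching Exercises 54 and 55. (The three-argument pow with exponent −1 requires Python 3.8 or later.)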
60. The first step in row-reducing the augmented matrix of this system would be to multiply by the
appropriate number to get a 1 in the first column. However, neither 2 nor 3 has a multiplicative
inverse in Z6 , so this is impossible. So the best we can do in terms of row-reduction is
    [ 2   3 | 4 ]  R2+R1   [ 2   3 | 4 ]
    [ 4   3 | 2 ]  −−−→    [ 0   0 | 0 ]
Thus y is a free variable, and 2x + 3y = 4. To solve this equation, choose each possible value for y in
turn:
• When y = 0, y = 2, or y = 4 we get 2x = 4, so that x = 2 or x = 5.
• When y = 1 or y = 3, we get 2x + 3 = 4, or 2x = 1, which has no solution.
So altogether there are six solutions:

    [ 2 ]  [ 5 ]  [ 2 ]  [ 5 ]  [ 2 ]  [ 5 ]
    [ 0 ], [ 0 ], [ 2 ], [ 2 ], [ 4 ], [ 4 ].
Exploration: Lies My Computer Told Me
1. Solving the system exactly:

        x +           y = 0        −800x − 800y =   0        x = −800
        x + (801/800) y = 1    ⇒    800x + 801y = 800   ⇒    y =  800 .

2. Rounding the coefficient 801/800 = 1.00125 to five significant digits gives 1.0012, so we get

        x +          y = 0        −x −         y = 0        x = −833.33
        x + 1.0012 y = 1     ⇒     x + 1.0012 y = 1    ⇒    y =  833.33 .

3. At four significant digits, we get

        x +         y = 0        −x −        y = 0        x = −1000
        x + 1.001 y = 1     ⇒     x + 1.001 y = 1    ⇒    y =  1000 ,

   while at three significant digits, we get

        x +        y = 0        −x −       y = 0
        x + 1.00 y = 1     ⇒     x + 1.00 y = 1    ⇒    Inconsistent system; no solution.
4. For the equations we just looked at, the slopes of the two lines were very close (the angle between the
lines was very small). That means that changing one of the slopes even a little bit will result in a
large change in their intersection point. That is the cause of the wild changes in the computed output
values above — rounding to five significant digits is indeed the same as a small change in the slope of
the second line, and it caused a large change in the computed solution.
For the second example, note that the two slopes are

    7.083/4.552 ≈ 1.55602    and    2.693/1.731 ≈ 1.55575,

so that again the slopes are close, so we would expect the same kind of behavior.
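The behavior described above is easy to reproduce in floating-point arithmetic. A small numpy sketch (the 1.0012 entry is the five-significant-digit rounding of 801/800 used in question 2):

```python
import numpy as np

b = np.array([0.0, 1.0])
A_exact   = np.array([[1.0, 1.0], [1.0, 801.0 / 800.0]])
A_rounded = np.array([[1.0, 1.0], [1.0, 1.0012]])   # one coefficient rounded

print(np.linalg.solve(A_exact, b))     # close to [-800, 800]
print(np.linalg.solve(A_rounded, b))   # close to [-833.33, 833.33]
print(np.linalg.cond(A_exact) > 1000)  # True: the system is ill conditioned
```

A large condition number is exactly the "nearly parallel lines" phenomenon: a tiny change in one coefficient moves the intersection point a long way.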
Exploration: Partial Pivoting
1. (a) Solving 0.00021x = 1 to five significant digits gives x = 1/0.00021 ≈ 4761.9.
   (b) If only four significant digits are kept, we have 0.0002x = 1, so that x = 1/0.0002 = 5000. Thus
       the effect of an error of 0.00001 in the input is an error of 238.1 in the result.

2. (a) Start by pivoting on 0.4. Multiply the first row by 1/0.4 = 250 to get [1, 249, 250]. Now we want
       to subtract 75.3 times that row from the second row. Multiplying by 75.3 and keeping three
       significant digits gives

           75.3 · [1, 249, 250] = [75.3, 18749.7, 18825.0] ≈ [75.3, 18700, 18800].

       Then the second row becomes (again keeping only three significant digits)

           [75.3, −45.3, 30] − [75.3, 18700, 18800] = [0, −18745.3, −18770] ≈ [0, −18700, −18800].
This gives y = (−18800)/(−18700) ≈ 1.00535 ≈ 1.01. Substituting back into the modified first row,
which says x + 249y = 250, gives x = 250 − 251.49 ≈ 250 − 251 = −1.
Substituting x = y = 1 into the original system, however, yields two true equations, so this is in
fact the solution. The problem was that the large numbers in the second row, combined with the
fact that we were keeping only three significant digits, overwhelmed the information from adding
in the first row, so all that information was lost.
(b) First interchanging the rows gives the matrix

        [ 75.3   −45.3 |  30 ]
        [  0.4    99.6 | 100 ].

    Multiply the first row by 1/75.3 to get (after reducing to three significant digits) [1, −0.602, 0.398].
    Then subtract 0.4 times that row from the second row:

        [0.4, 99.6, 100] − 0.4 · [1, −0.602, 0.398] ≈ [0, 99.8, 99.8].

    This gives 99.8y = 99.8, so that y = 1; from the first equation, we have x − 0.602y = 0.398, so
    that x − 0.602 · 1 = 0.398 and thus x = 1. We get the correct answer.
3. (a) Without partial pivoting, we pivot on 0.001 to three significant digits, so we divide the first row
       by 0.001, giving [1, 995, 1000], then multiply it by 10.2 and add the result to row 2, giving the
       matrix

           [ 1     995 |  1000 ]
           [ 0   10100 | 10200 ]   ⇒   y ≈ 1.01.

       Back-substituting into the original first equation gives 0.001x = 1.00 − 1.00 = 0, so that x = 0
       (again remembering to keep only three significant digits). This error was introduced because 1.00
       and −50.0 were ignored when reducing to three significant digits in the original row-reduction,
       being overwhelmed by numbers in the range of 10000.

       Using partial pivoting, we pivot on −10.2 to three significant digits: the second row becomes
       [1, −0.098, 4.90]. Then multiply it by 0.001 and subtract it from the first row, giving
       [0, 0.995, 0.995]. Thus 0.995y = 0.995, so that to three significant digits y = 1.00.
       Back-substituting gives x = (−50.0 − 1.00)/(−10.2) = 51.0/10.2 = 5.00. Pivoting on the largest
       absolute value reduced the error in the solution.
(b) Start by pivoting on the 10.0 at upper left:

        [ 10.0   −7.00   0.00 | 7.00 ]      [ 10.0   −7.00   0.00 | 7.00 ]
        [ −3.00   2.09   6.00 | 3.91 ]  →   [  0.0   −0.01   6.00 | 6.01 ].
        [  5.00  −1.00   5.00 | 6.00 ]      [  0.0    2.50   5.00 | 2.50 ]

    If we continue, pivoting on −0.01 to three significant digits, the second row then becomes
    [0, 1, −600, −601]. Multiply it by 2.50 and subtract the result from the third row:

        [ 10.0   −7.00   0.00 | 7.00 ]      [ 10.0   −7.00   0.00 | 7.00 ]
        [  0.0   −0.01   6.00 | 6.01 ]  →   [  0.0   −0.01   6.00 | 6.01 ].
        [  0.0    2.50   5.00 | 2.50 ]      [  0.0    0      1500 | 1500 ]

    This gives z = 1.00, y = −1.00, and x = 0.00.

    If instead we reversed the second and third rows and pivoted on the 2.50, we would first divide
    that row by 2.50, giving [0, 1, 2, 1], then multiply it by 0.01 and add it to the third row:

        [ 10.0   −7.00   0.00 | 7.00 ]      [ 10.0   −7.00   0.00 | 7.00 ]      [ 10.0   −7.00   0.00 | 7.00 ]
        [  0.0    2.50   5.00 | 2.50 ]  →   [  0.0    1      2    | 1    ]  →   [  0.0    1      2    | 1    ].
        [  0.0   −0.01   6.00 | 6.01 ]      [  0.0   −0.01   6.00 | 6.01 ]      [  0.0    0      6.02 | 6.02 ]

    This also gives z = 1.00, y = −1.00, and x = 0.00.
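The whole experiment of question 2 can be simulated by rounding every intermediate result to three significant digits. The sketch below uses a helper sig3 for the rounding; the intermediate values differ slightly from the hand computation above, but the conclusion is the same: without the row interchange the computed x lands far from the true value 1, and with it the method returns (1, 1).

```python
import numpy as np

def sig3(x):
    # round x to three significant digits
    return float(np.format_float_positional(x, precision=3, unique=False,
                                            fractional=False))

def solve2(M):
    # Gaussian elimination on a 2x3 augmented matrix, rounding each
    # product, difference, and quotient to three significant digits.
    M = [row[:] for row in M]
    m = sig3(M[1][0] / M[0][0])
    M[1] = [sig3(b - sig3(m * a)) for a, b in zip(M[0], M[1])]
    y = sig3(M[1][2] / M[1][1])
    x = sig3(sig3(M[0][2] - sig3(M[0][1] * y)) / M[0][0])
    return x, y

no_pivot = solve2([[0.4, 99.6, 100.0], [75.3, -45.3, 30.0]])
pivot    = solve2([[75.3, -45.3, 30.0], [0.4, 99.6, 100.0]])
print(no_pivot)   # far from the true solution (1, 1)
print(pivot)      # (1.0, 1.0)
```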
Exploration: An Introduction to the Analysis of Algorithms
1. We count the number of operations, one step at a time.

    [  2   4    6 |  8 ]  (1/2)R1   [  1   2    3 |  4 ]
    [  3   9    6 | 12 ]  −−−−→     [  3   9    6 | 12 ]
    [ −1   1   −1 |  1 ]            [ −1   1   −1 |  1 ]
There were three operations here: (1/2) · 4 = 2, (1/2) · 6 = 3, and (1/2) · 8 = 4. We don't count
(1/2) · 2 = 1 since we don't actually perform that operation — once we chose 2, we knew we were going
to turn it into 1.

    [  1   2    3 |  4 ]  R2−3R1   [  1   2    3 | 4 ]
    [  3   9    6 | 12 ]  −−−−→    [  0   3   −3 | 0 ]
    [ −1   1   −1 |  1 ]           [ −1   1   −1 | 1 ]
There were three operations here: 3 · 2 = 6, 3 · 3 = 9, and 3 · 4 = 12, for a total of six operations so far.
Note that we are counting only multiplications and divisions, and that again we don't count 3 · 1 = 3
since we didn't actually have to perform it — we just replace the 3 in the second row by a 0.

    [  1   2    3 | 4 ]  R3+R1   [ 1   2    3 | 4 ]
    [  0   3   −3 | 0 ]  −−−→    [ 0   3   −3 | 0 ]
    [ −1   1   −1 | 1 ]          [ 0   3    2 | 5 ]
There were 3 operations here: 1 · 2 = 2, 1 · 3 = 3, and 1 · 4 = 4, for a total of nine operations so far.

    [ 1   2    3 | 4 ]  (1/3)R2   [ 1   2    3 | 4 ]
    [ 0   3   −3 | 0 ]  −−−−→     [ 0   1   −1 | 0 ]
    [ 0   3    2 | 5 ]            [ 0   3    2 | 5 ]
In this step, there were 2 operations: (1/3) · (−3) = −1 and (1/3) · 0 = 0, for a total of eleven so far.

    [ 1   2    3 | 4 ]  R3−3R2   [ 1   2    3 | 4 ]
    [ 0   1   −1 | 0 ]  −−−−→    [ 0   1   −1 | 0 ]
    [ 0   3    2 | 5 ]           [ 0   0    5 | 5 ]
In this step, there were 2 operations: −3 · (−1) = 3 and −3 · 0 = 0, for a total of 13 so far.

    [ 1   2    3 | 4 ]  (1/5)R3   [ 1   2    3 | 4 ]
    [ 0   1   −1 | 0 ]  −−−−→     [ 0   1   −1 | 0 ]
    [ 0   0    5 | 5 ]            [ 0   0    1 | 1 ]
In this step, there was one operation, (1/5) · 5 = 1, for a total of 14.
Finally, to complete the back-substitution, we need three more operations: −1 · x3 to find x2 , and 2 · x2
and 3 · x3 to find x1 . So in total 17 operations were required.
2. To compute the reduced row echelon form, we first compute the row echelon form above, which took
14 operations. Then:

    [ 1   2    3 | 4 ]  R1−2R2   [ 1   0    5 | 4 ]
    [ 0   1   −1 | 0 ]  −−−−→    [ 0   1   −1 | 0 ]
    [ 0   0    1 | 1 ]           [ 0   0    1 | 1 ]
This required two operations, −2 · (−1) and −2 · 0, for a total of 16. Finally,

    [ 1   0    5 | 4 ]  R1−5R3   [ 1   0   0 | −1 ]
    [ 0   1   −1 | 0 ]  R2+R3    [ 0   1   0 |  1 ]
    [ 0   0    1 | 1 ]  −−−→     [ 0   0   1 |  1 ]
This required two operations, both in the constant column: 1 · 1 and −5 · 1, for a total of 18. So for this
example, Gauss-Jordan required one more operation than Gaussian elimination. We would probably
need more data to conclude that it is generally less efficient.
3. (a) There are n operations required to create the first leading 1 because we have to divide every entry
in the first row, except a11 , by a11 (we don't have to divide a11 by itself because the 1 results
from the choice of pivot). Next, there are n operations required to create the first zero in column
1, because we have to multiply every entry in the first row, except the 1 in column one, by a21 .
Again, we don’t have to multiply the 1 by itself, since we know we are simply replacing a21 by
zero. Similarly, there are n operations required to create each zero in column 1. Since there are
n − 1 rows excluding the first row, introducing the 1 in row one and the zeros below it requires
n + (n − 1)n = n2 operations.
(b) We can think of the resulting matrix, below and to the right of a22 , as an (n − 1) × n augmented
matrix. Applying the process in part (a) to this matrix, we see that turning a22 into a 1 and
creating zeros below it requires (n − 1) + (n − 2)(n − 1) = (n − 1)2 operations. Continuing to the
matrix starting at row 3, column 3; this requires (n − 2)2 operations. So altogether, to reduce the
matrix to row echelon form requires n2 + (n − 1)2 + · · · + 22 + 12 operations.
(c) To find xn−1 requires one operation: we must multiply xn by the entry an−1,n . Then there are
two operations required to find xn−2 , and so forth, until we require n − 1 operations to find x1 .
So the total number required for the back-substitution is 1 + 2 + · · · + (n − 1).
(d) From Appendix B, we see that

    1² + 2² + · · · + k² = k(k + 1)(2k + 1)/6,        1 + 2 + · · · + k = k(k + 1)/2.

Thus the total number of operations required is

    (1² + 2² + · · · + n²) + (1 + 2 + · · · + (n − 1)) = n(n + 1)(2n + 1)/6 + n(n − 1)/2
                                                       = (1/3)n³ + (1/2)n² + (1/6)n + (1/2)n² − (1/2)n
                                                       = (1/3)n³ + n² − (1/3)n.

For large values of n, the cubic term dominates, so the total number of operations required is
approximately (1/3)n³.
4. As we saw in Exercise 2 in this section, we get the same n2 + (n − 1)2 + · · · + 22 + 12 operations to get
the matrix into row echelon form. Then to create the 1 zero above row 2 requires 1(n − 1) operations,
since each of the remaining n − 1 entries in row 2 must be multiplied. To create the 2 zeros above row
3 requires 2(n − 2) operations, since each of the remaining entries in row 3 must be multiplied twice.
Continuing, we see that the total number of operations is
(n − 1) + 2(n − 2) + · · · + (n − 1)(n − (n − 1))
    = (1 · n + 2 · n + · · · + (n − 1) · n) − (1² + 2² + · · · + (n − 1)²)
    = n(1 + 2 + · · · + (n − 1)) − (1² + 2² + · · · + (n − 1)²).
Adding that to the operations required to get the matrix into row echelon form, and using the formulas
from Appendix B, we get for the total number of operations

    n² + (n − 1)² + · · · + 2² + 1² + n(1 + 2 + · · · + (n − 1)) − (1² + 2² + · · · + (n − 1)²)
        = n² + n · n(n − 1)/2
        = n² + (1/2)n³ − (1/2)n²
        = (1/2)n³ + (1/2)n².

Again, for large values of n, the cubic term dominates, so the total number of operations required is
approximately (1/2)n³.
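The operation counts above are easy to tabulate programmatically. The sketch below recomputes them for several n and checks the two closed forms; note that n = 3 reproduces the counts 17 and 18 found in questions 1 and 2 of this exploration.

```python
# Multiplications/divisions for Gaussian elimination with back-substitution,
# and for Gauss-Jordan elimination, counted as in the derivations above.
def ops_gauss(n):
    ref  = sum(k * k for k in range(1, n + 1))   # n^2 + (n-1)^2 + ... + 1^2
    back = sum(range(1, n))                      # 1 + 2 + ... + (n-1)
    return ref + back

def ops_gauss_jordan(n):
    ref   = sum(k * k for k in range(1, n + 1))
    above = sum(k * (n - k) for k in range(1, n))  # clearing above the pivots
    return ref + above

for n in (3, 10, 100):
    assert ops_gauss(n) == (n**3 + 3 * n**2 - n) // 3   # n^3/3 + n^2 - n/3
    assert ops_gauss_jordan(n) == (n**3 + n**2) // 2    # n^3/2 + n^2/2
print(ops_gauss(3), ops_gauss_jordan(3))   # 17 18
```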
2.3
Spanning Sets and Linear Independence
1. Yes, it is. We want to write

    x [  1 ] + y [  2 ] = [ 1 ],
      [ −1 ]     [ −1 ]   [ 2 ]

so we want to solve the system of linear equations

     x + 2y = 1
    −x −  y = 2.

Row-reducing the augmented matrix for the system gives

    [  1    2 | 1 ]  R2+R1   [ 1   2 | 1 ]  R1−2R2   [ 1   0 | −5 ]
    [ −1   −1 | 2 ]  −−−→    [ 0   1 | 3 ]  −−−−→    [ 0   1 |  3 ].

So the solution is x = −5 and y = 3, and the linear combination is v = −5u1 + 3u2 .
2. No, it is not. We want to write

    x [  4 ] + y [ −2 ] = [ 2 ],
      [ −2 ]     [  1 ]   [ 1 ]

so we want to solve the system of linear equations

     4x − 2y = 2
    −2x +  y = 1.

Row-reducing the augmented matrix for the system gives

    [  4   −2 | 2 ]  (1/4)R1   [  1   −1/2 | 1/2 ]  R2+2R1   [ 1   −1/2 | 1/2 ]
    [ −2    1 | 1 ]  −−−−→     [ −2    1   |  1  ]  −−−−→    [ 0    0   |  2  ].

So the system is inconsistent, and the answer follows from Theorem 2.4.
3. No, it is not. We want to write

    x [ 1 ] + y [ 0 ] = [ 1 ],
      [ 1 ]     [ 1 ]   [ 2 ]
      [ 0 ]     [ 1 ]   [ 3 ]
so we want to solve the system of linear equations
x
=1
x+y =2
y = 3.
Rather than constructing the matrix and reducing it, note that the first and third equations force x = 1
and y = 3, and that these values of x and y do not satisfy the second equation. Thus the system is
inconsistent, so that v is not a linear combination of u1 and u2 .
4. Yes, it is. We want to write

    x [ 1 ] + y [ 0 ] = [  3 ],
      [ 1 ]     [ 1 ]   [  2 ]
      [ 0 ]     [ 1 ]   [ −1 ]
so we want to solve the system of linear equations
x
=
3
x+y =
2
y = −1
Rather than constructing the matrix and reducing it, note that the first and third equations force x = 3
and y = −1, and that these values of x and y also satisfy the second equation. Thus v = 3u1 − u2 .
5. Yes, it is. We want to write

    x [ 1 ] + y [ 0 ] + z [ 1 ] = [ 1 ],
      [ 1 ]     [ 1 ]     [ 0 ]   [ 2 ]
      [ 0 ]     [ 1 ]     [ 1 ]   [ 3 ]

so we want to solve the system of linear equations

    x     + z = 1
    x + y     = 2
        y + z = 3.

Row-reducing the associated augmented matrix gives

    [ 1   0   1 | 1 ]  R2−R1   [ 1   0    1 | 1 ]  R3−R2   [ 1   0    1 | 1 ]  (1/2)R3
    [ 1   1   0 | 2 ]  −−−→    [ 0   1   −1 | 1 ]  −−−→    [ 0   1   −1 | 1 ]  −−−−→
    [ 0   1   1 | 3 ]          [ 0   1    1 | 3 ]          [ 0   0    2 | 2 ]

    [ 1   0    1 | 1 ]  R1−R3   [ 1   0   0 | 0 ]
    [ 0   1   −1 | 1 ]  R2+R3   [ 0   1   0 | 2 ]
    [ 0   0    1 | 1 ]  −−−→    [ 0   0   1 | 1 ].

This gives the solution x = 0, y = 2, and z = 1, so that v = 2u2 + u3 .
6. Yes, it is. The associated linear system is
1.0x + 3.4y − 1.2z =
3.2
0.4x + 1.4y + 0.2z =
2.0
4.8x − 6.4y − 1.0z = −2.6
Solving this system with a CAS gives x = 1.0, y = 1.0, z = 1.0, so that v = u1 + u2 + u3 .
7. Yes, it is. b is in the span of the columns of A if and only if the system Ax = b has a solution, which
is to say if and only if the linear system
x + 2y = 5
3x + 4y = 6
has a solution. Row-reducing the augmented matrix for the system gives

    [ 1   2 | 5 ]  R2−3R1   [ 1    2 |  5 ]  (−1/2)R2   [ 1   2 |  5  ]  R1−2R2   [ 1   0 |  −4 ]
    [ 3   4 | 6 ]  −−−−→    [ 0   −2 | −9 ]  −−−−−→     [ 0   1 | 9/2 ]  −−−−→    [ 0   1 | 9/2 ].

Thus x = −4 and y = 9/2, so that

    A [ −4  ] = [ 5 ].
      [ 9/2 ]   [ 6 ]
8. Yes, it is. b is in the span of the columns of A if and only if the system Ax = b has a solution, which
is to say if and only if the linear system

     x +  2y +  3z =  4
    5x +  6y +  7z =  8
    9x + 10y + 11z = 12

has a solution. Row-reducing the augmented matrix for the system gives

    [ 1    2    3 |  4 ]  R2−5R1   [ 1    2     3 |   4 ]  (−1/4)R2   [ 1    2    3 |   4 ]
    [ 5    6    7 |  8 ]  R3−9R1   [ 0   −4    −8 | −12 ]  −−−−−→     [ 0    1    2 |   3 ]
    [ 9   10   11 | 12 ]  −−−−→    [ 0   −8   −16 | −24 ]             [ 0   −8  −16 | −24 ]

    R3+8R2   [ 1   2   3 | 4 ]
    −−−−→    [ 0   1   2 | 3 ].
             [ 0   0   0 | 0 ]

This system has an infinite number of solutions, so that b is indeed in the span of the columns of A.
9. We must show that for any vector [ a; b ], we can write

    x [ 1 ] + y [  1 ] = [ a ]
      [ 1 ]     [ −1 ]   [ b ]

for some x, y. Row-reduce the associated augmented matrix:

    [ 1    1 | a ]  R2−R1   [ 1    1 |   a   ]  (−1/2)R2   [ 1   1 |    a     ]  R1−R2   [ 1   0 | (a + b)/2 ]
    [ 1   −1 | b ]  −−−→    [ 0   −2 | b − a ]  −−−−−→     [ 0   1 | (a − b)/2 ]  −−−→   [ 0   1 | (a − b)/2 ].

So given a and b, we have

    (a + b)/2 · [ 1 ] + (a − b)/2 · [  1 ] = [ a ].
                [ 1 ]               [ −1 ]   [ b ]
10. We want to show that for any vector [ a; b ], we can write

    x [  3 ] + y [ 0 ] = [ a ]
      [ −2 ]     [ 1 ]   [ b ]

for some x, y. Row-reduce the associated augmented matrix:

    [  3   0 | a ]  (1/3)R1   [  1   0 | a/3 ]  R2+2R1   [ 1   0 |    a/3    ]
    [ −2   1 | b ]  −−−−→     [ −2   1 |  b  ]  −−−−→    [ 0   1 | b + 2a/3  ],

which shows that the vector can be written

    [ a ] = (a/3) · [  3 ] + (b + 2a/3) · [ 0 ].
    [ b ]           [ −2 ]                [ 1 ]
11. We want to show that any vector can be written as a linear combination of the three given vectors, i.e.
that

    [ a ]     [ 1 ]     [ 1 ]     [ 0 ]
    [ b ] = x [ 0 ] + y [ 1 ] + z [ 1 ]
    [ c ]     [ 1 ]     [ 0 ]     [ 1 ]

for some x, y, z. Row-reduce the associated augmented matrix:

    [ 1   1   0 | a ]  R3−R1   [ 1    1   0 |   a   ]  R3+R2   [ 1   1   0 |     a     ]  (1/2)R3
    [ 0   1   1 | b ]  −−−→    [ 0    1   1 |   b   ]  −−−→    [ 0   1   1 |     b     ]  −−−−→
    [ 1   0   1 | c ]          [ 0   −1   1 | c − a ]          [ 0   0   2 | b + c − a ]

    [ 1   1   0 |       a       ]  R2−R3   [ 1   1   0 |       a       ]  R1−R2   [ 1   0   0 | (a − b + c)/2 ]
    [ 0   1   1 |       b       ]  −−−→    [ 0   1   0 | (a + b − c)/2 ]  −−−→    [ 0   1   0 | (a + b − c)/2 ]
    [ 0   0   1 | (b + c − a)/2 ]          [ 0   0   1 | (b + c − a)/2 ]          [ 0   0   1 | (b + c − a)/2 ].

Thus

    [ a ]                     [ 1 ]                     [ 1 ]                      [ 0 ]
    [ b ] = (1/2)(a − b + c) · [ 0 ] + (1/2)(a + b − c) · [ 1 ] + (1/2)(−a + b + c) · [ 1 ].
    [ c ]                     [ 1 ]                     [ 0 ]                      [ 1 ]
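A spot check of the coefficient formulas of Exercises 9 and 11 on a few randomly chosen targets:

```python
import numpy as np

rng = np.random.default_rng(1)
for a, b, c in rng.standard_normal((5, 3)):
    # Exercise 9: (a+b)/2 [1,1] + (a-b)/2 [1,-1] = [a,b]
    v = (a + b) / 2 * np.array([1.0, 1.0]) + (a - b) / 2 * np.array([1.0, -1.0])
    assert np.allclose(v, [a, b])
    # Exercise 11: the three coefficients found above
    w = ((a - b + c) / 2 * np.array([1.0, 0.0, 1.0])
         + (a + b - c) / 2 * np.array([1.0, 1.0, 0.0])
         + (-a + b + c) / 2 * np.array([0.0, 1.0, 1.0]))
    assert np.allclose(w, [a, b, c])
print("coefficient formulas check out")
```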
12. We want to show that any vector can be written as a linear combination of the three given vectors, i.e.
that

    [ a ]     [ 1 ]     [ 1 ]     [  2 ]
    [ b ] = x [ 1 ] + y [ 2 ] + z [  1 ]
    [ c ]     [ 0 ]     [ 3 ]     [ −1 ]

for some x, y, z. Row-reduce the associated augmented matrix:

    [ 1   1    2 | a ]  R2−R1   [ 1   1    2 |   a   ]  R3−3R2   [ 1   1    2 |       a        ]  (1/2)R3
    [ 1   2    1 | b ]  −−−→    [ 0   1   −1 | b − a ]  −−−−→    [ 0   1   −1 |     b − a      ]  −−−−→
    [ 0   3   −1 | c ]          [ 0   3   −1 |   c   ]           [ 0   0    2 |  c + 3a − 3b   ]

    [ 1   1    2 |        a         ]  R1−2R3   [ 1   1   0 |   3b − 2a − c    ]  R1−R2   [ 1   0   0 | (7b − 5a − 3c)/2 ]
    [ 0   1   −1 |      b − a       ]  R2+R3    [ 0   1   0 |  (c + a − b)/2   ]  −−−→    [ 0   1   0 |  (c + a − b)/2   ]
    [ 0   0    1 | (c + 3a − 3b)/2  ]  −−−→     [ 0   0   1 | (c + 3a − 3b)/2  ]          [ 0   0   1 | (c + 3a − 3b)/2  ].

Thus

    [ a ]                        [ 1 ]                      [ 1 ]                         [  2 ]
    [ b ] = (1/2)(7b − 5a − 3c) · [ 1 ] + (1/2)(c + a − b) · [ 2 ] + (1/2)(c + 3a − 3b) ·  [  1 ].
    [ c ]                        [ 0 ]                      [ 3 ]                         [ −1 ]
13. (a) Since

    [  2 ]      [ −1 ]
    [ −4 ] = −2 [  2 ],

these vectors are parallel, so their span is the line through the origin with direction vector [ −1; 2 ].

(b) The vector equation of this line is

    [ x ]     [ −1 ]
    [ y ] = t [  2 ].

Note that this is just another way of saying that the line is the span of the given vector. To derive
the general equation of the line, expand the above, giving the parametric equations x = −t and
y = 2t. Substitute −x for t in the second equation, giving y = −2x, or 2x + y = 0.
14. (a) Since the span of [ 0; 0 ] is just the origin, and the origin is in the span of [ 3; 4 ], the span of
these two vectors is just the span of [ 3; 4 ], which is the line through the origin with direction
vector [ 3; 4 ].

(b) The vector equation of this line is

    [ x ]     [ 3 ]
    [ y ] = t [ 4 ].

Note that this is just another way of saying that the line is the span of the given vector. To derive
the general equation of the line, expand the above, giving the parametric equations x = 3t and
y = 4t. Substitute (1/3)x for t in the second equation, giving y = (4/3)x. Clearing fractions gives
3y = 4x, or 4x − 3y = 0.
15. (a) Since the given vectors are not parallel, their span is a plane through the origin containing these
two vectors as direction vectors.
(b) The vector equation of the plane is

    [ x ]     [ 1 ]     [  3 ]
    [ y ] = s [ 2 ] + t [  2 ].
    [ z ]     [ 0 ]     [ −1 ]

One way of deriving the general equation of the plane is to solve the system of equations arising
from the above vector equation. Row-reducing the corresponding augmented system gives

    [ 1    3 | x ]  R2−2R1   [ 1    3 |    x   ]  R3−(1/4)R2   [ 1    3 |        x        ]
    [ 2    2 | y ]  −−−−→    [ 0   −4 | y − 2x ]  −−−−−→       [ 0   −4 |     y − 2x      ].
    [ 0   −1 | z ]           [ 0   −1 |    z   ]               [ 0    0 | z + (2x − y)/4  ]

Since we know the system is consistent, the bottom row must consist entirely of zeros. Thus
the points on the plane must satisfy z + (1/4)(2x − y) = 0, or 4z + 2x − y = 0. This is the plane
2x − y + 4z = 0.

An alternative method is to find a normal vector to the plane by taking the cross product of the
two given vectors:

    [  3 ]   [ 1 ]   [ 2 · 0 − (−1) · 2 ]   [  2 ]
    [  2 ] × [ 2 ] = [ (−1) · 1 − 3 · 0 ] = [ −1 ].
    [ −1 ]   [ 0 ]   [ 3 · 2 − 2 · 1    ]   [  4 ]

So the equation of the plane is 2x − y + 4z = d; since it passes through the origin, we again get
2x − y + 4z = 0.
16. (a) First, note that


   
0
1
−1
−1 = −  0 −  1 ,
1
−1
0
so that the third vector is in the span of the first two. So we need onlyconsider
the

 first
 two
1
−1
vectors. Their span is the plane through the origin with direction vectors  0 and  1.
−1
0
(b) The vector equation of this plane is
 
   
1
−1
x
s  0 + t  1 = y  .
−1
0
z
To find the general equation of the plane, we can solve the corresponding system of equations for
s and t. Row-reducing the augmented matrix gives






1 −1
x
1 −1
x
1 −1 x






1 y
y 
y
0 1
0 1

0
R3 +R1
R3 +R2
−1 0 z −−−
0
−1
x
+
z
0
0
x
+
y
+
z
−−→
−−−−−→
84
CHAPTER 2. SYSTEMS OF LINEAR EQUATIONS
Since we know the system is consistent, the bottom row must consist of all zeros, so that points
on the plane satisfy the equation x + y + z = 0.
Alternatively, we could find a normal vector to the plane by taking the cross product of the
direction vectors:
\[
\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \times \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0 \cdot 0 - (-1) \cdot 1 \\ (-1) \cdot (-1) - 1 \cdot 0 \\ 1 \cdot 1 - 0 \cdot (-1) \end{bmatrix}
= \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.
\]
Thus the equation of the plane is x + y + z = d; since it passes through the origin, we again get
x + y + z = 0.
17. Substituting the two points (1, 0, 3) and (−1, 1, −3) into the equation for the plane gives a linear system
in a, b, and c:
a + 3c = 0
−a + b − 3c = 0.
Row-reducing the associated augmented matrix gives
\[
\left[\begin{array}{ccc|c} 1 & 0 & 3 & 0 \\ -1 & 1 & -3 & 0 \end{array}\right]
\xrightarrow{R_2 + R_1}
\left[\begin{array}{ccc|c} 1 & 0 & 3 & 0 \\ 0 & 1 & 0 & 0 \end{array}\right],
so that b = 0 and a = −3c. Thus the equation of the plane is −3cx + cz = 0. Clearly we must have
c ≠ 0, since otherwise we do not get a plane. So dividing through by c gives −3x + z = 0, or 3x − z = 0,
for the general equation of the plane.
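The same computation can be done symbolically (a sketch, assuming SymPy is available); `linsolve` solves the homogeneous system for a, b, c:

```python
from sympy import symbols, linsolve

a, b, c = symbols('a b c')

# The plane ax + by + cz = 0 must contain (1, 0, 3) and (-1, 1, -3).
sol, = linsolve([a + 3*c, -a + b - 3*c], (a, b, c))
print(sol)  # (-3*c, 0, c); taking c = 1 gives -3x + z = 0
```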
18. Note that
u = 1u + 0v + 0w,
v = 0u + 1v + 0w,
w = 0u + 0v + 1w.
Or, if one adopts the usual convention of not writing terms with coefficients of zero, u is in the span
because u = u.
19. Note that
u = u,
v = (u + v) − u,
w = (u + v + w) − (u + v),
and the terms in parentheses on the right, together with u, are the elements of the given set.
20. (a) To prove the statement, we must show that anything that can be written as a linear combination
of the elements of S can also be written as a linear combination of the elements of T . But if x is
a linear combination of elements of S, then
x = a1 u1 + a2 u2 + · · · + ak uk
= a1 u1 + a2 u2 + · · · + ak uk + 0uk+1 + 0uk+2 + · · · + 0um .
This shows that x is also a linear combination of elements of T , proving the result.
(b) We know that span(S) ⊆ span(T ) from part (a). Since T is a set of vectors in Rn , we know that
span(T ) ⊆ Rn . But then
Rn = span(S) ⊆ span(T ) ⊆ Rn ,
so that span(T ) = Rn .
2.3. SPANNING SETS AND LINEAR INDEPENDENCE
85
21. (a) Suppose that
ui = ui1 v1 + ui2 v2 + · · · + uim vm ,
i = 1, 2, . . . , k.
Suppose that w is a linear combination of the ui . Then
w = w1 u1 + w2 u2 + · · · + wk uk
= w1 (u11 v1 + u12 v2 + · · · + u1m vm ) + w2 (u21 v1 + u22 v2 + · · · + u2m vm ) + · · ·
+ wk (uk1 v1 + uk2 v2 + · · · + ukm vm )
= (w1 u11 + w2 u21 + · · · + wk uk1 )v1 + (w1 u12 + w2 u22 + · · · + wk uk2 )v2 + · · ·
+ (w1 u1m + w2 u2m + · · · + wk ukm )vm
= w1′ v1 + w2′ v2 + · · · + wm′ vm .
This shows that w is a linear combination of the vj . Thus span(u1 , . . . , uk ) ⊆ span(v1 , . . . , vm ).
(b) Reversing the roles of the u’s and the v’s in part (a) gives span(v1 , . . . , vk ) ⊆ span(u1 , . . . , uk ).
Combining that with part (a) shows that each of the spans is contained in the other, so they are
equal.
(c) Let
\[
u_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
u_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad
u_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.
\]
We want to show that each ui is a linear combination of e1 , e2 , and e3 , and the reverse. The
first of these is obvious: since e1 , e2 , and e3 span R3 , any vector in R3 is a linear combination of
those, so in particular u1 , u2 , and u3 are. Next, note that
e1 = u1
e2 = u2 − u1
e3 = u3 − u2 ,
so that the ej are linear combinations of the ui . Part (b) then implies that span(u1 , u2 , u3 ) =
span(e1 , e2 , e3 ) = R3 .
 
 
22. The vectors $\begin{bmatrix} -1 \\ -1 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}$ are not scalar multiples of one another, so they are linearly independent.
23. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that
\[
c_1\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + c_3\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 1 & 2 & -1 & 0 \\ 1 & 3 & 2 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 - R_1 \\ R_3 - R_1}]{}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 2 & 1 & 0 \end{array}\right]
\xrightarrow{R_3 - 2R_2}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 5 & 0 \end{array}\right]
\xrightarrow{\frac15 R_3}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right]
\xrightarrow[\substack{R_1 - R_3 \\ R_2 + 2R_3}]{}
\left[\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right]
\xrightarrow{R_1 - R_2}
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right].
\]
Thus the system has only the solution c1 = c2 = c3 = 0, so the vectors are linearly independent.
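The independence conclusion in Exercise 23 can be double-checked numerically (a sketch, assuming NumPy): the three vectors are independent exactly when the matrix having them as columns has full column rank.

```python
import numpy as np

# Columns are the three vectors from Exercise 23.
A = np.column_stack(([1, 1, 1], [1, 2, 3], [1, -1, 2]))

# Full column rank (3) means Ac = 0 has only the trivial solution,
# i.e. the vectors are linearly independent.
print(np.linalg.matrix_rank(A))  # 3
```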
24. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that
\[
c_1\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} + c_3\begin{bmatrix} 1 \\ -5 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{ccc|c} 2 & 3 & 1 & 0 \\ 2 & 1 & -5 & 0 \\ 1 & 2 & 2 & 0 \end{array}\right]
\xrightarrow{R_1 \leftrightarrow R_3}
\left[\begin{array}{ccc|c} 1 & 2 & 2 & 0 \\ 2 & 1 & -5 & 0 \\ 2 & 3 & 1 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 - 2R_1 \\ R_3 - 2R_1}]{}
\left[\begin{array}{ccc|c} 1 & 2 & 2 & 0 \\ 0 & -3 & -9 & 0 \\ 0 & -1 & -3 & 0 \end{array}\right]
\xrightarrow{-\frac13 R_2}
\left[\begin{array}{ccc|c} 1 & 2 & 2 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & -1 & -3 & 0 \end{array}\right]
\xrightarrow{R_3 + R_2}
\left[\begin{array}{ccc|c} 1 & 2 & 2 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\xrightarrow{R_1 - 2R_2}
\left[\begin{array}{ccc|c} 1 & 0 & -4 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus the system has a nontrivial solution, and the vectors are linearly dependent. From the reduced
matrix, the general solution is c1 = 4t, c2 = −3t, c3 = t. Setting t = 1, for example, we get the relation
\[
4\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} - 3\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 1 \\ -5 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
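The same dependence relation can be recovered exactly with SymPy (a sketch; `nullspace` returns a basis for the solutions of Ac = 0, where the columns of A are the given vectors):

```python
from sympy import Matrix

# Columns are the three vectors from Exercise 24.
A = Matrix([[2, 3, 1],
            [2, 1, -5],
            [1, 2, 2]])

# A nonzero null-space vector gives the coefficients of a dependence relation.
(k,) = A.nullspace()
k = k / k[-1]          # scale so that c3 = 1
print(k.T)             # Matrix([[4, -3, 1]]): 4*v1 - 3*v2 + v3 = 0
```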
25. Since by inspection
\[
v_1 - v_2 - v_3 = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} - \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]
the vectors are linearly dependent.
26. Since we are given 4 vectors in R3 , we know they are linearly dependent vectors by Theorem 2.8. To
find a dependence relation, we want to find scalars c1 , c2 , c3 , c4 such that
\[
c_1\begin{bmatrix} -2 \\ 3 \\ 7 \end{bmatrix} + c_2\begin{bmatrix} 4 \\ -1 \\ 5 \end{bmatrix} + c_3\begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} + c_4\begin{bmatrix} 5 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{cccc|c} -2 & 4 & 3 & 5 & 0 \\ 3 & -1 & 1 & 0 & 0 \\ 7 & 5 & 3 & 2 & 0 \end{array}\right]
\xrightarrow{-\frac12 R_1}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 3 & -1 & 1 & 0 & 0 \\ 7 & 5 & 3 & 2 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 - 3R_1 \\ R_3 - 7R_1}]{}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 0 & 5 & \frac{11}{2} & \frac{15}{2} & 0 \\ 0 & 19 & \frac{27}{2} & \frac{39}{2} & 0 \end{array}\right]
\]
\[
\xrightarrow{\frac15 R_2}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 0 & 1 & \frac{11}{10} & \frac32 & 0 \\ 0 & 19 & \frac{27}{2} & \frac{39}{2} & 0 \end{array}\right]
\xrightarrow{R_3 - 19R_2}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 0 & 1 & \frac{11}{10} & \frac32 & 0 \\ 0 & 0 & -\frac{37}{5} & -9 & 0 \end{array}\right]
\xrightarrow{-\frac{5}{37} R_3}
\left[\begin{array}{cccc|c} 1 & -2 & -\frac32 & -\frac52 & 0 \\ 0 & 1 & \frac{11}{10} & \frac32 & 0 \\ 0 & 0 & 1 & \frac{45}{37} & 0 \end{array}\right]
\]
\[
\xrightarrow[\substack{R_1 + \frac32 R_3 \\ R_2 - \frac{11}{10} R_3}]{}
\left[\begin{array}{cccc|c} 1 & -2 & 0 & -\frac{25}{37} & 0 \\ 0 & 1 & 0 & \frac{6}{37} & 0 \\ 0 & 0 & 1 & \frac{45}{37} & 0 \end{array}\right]
\xrightarrow{R_1 + 2R_2}
\left[\begin{array}{cccc|c} 1 & 0 & 0 & -\frac{13}{37} & 0 \\ 0 & 1 & 0 & \frac{6}{37} & 0 \\ 0 & 0 & 1 & \frac{45}{37} & 0 \end{array}\right].
\]
Thus the system has a nontrivial solution, and the vectors are linearly dependent. From the reduced
matrix, the general solution is c1 = (13/37)t, c2 = −(6/37)t, c3 = −(45/37)t, c4 = t. Setting t = 37, for
example, we get the relation
\[
13\begin{bmatrix} -2 \\ 3 \\ 7 \end{bmatrix} - 6\begin{bmatrix} 4 \\ -1 \\ 5 \end{bmatrix} - 45\begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} + 37\begin{bmatrix} 5 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
 
27. Since v3 = 0 is the zero vector, these vectors are linearly dependent. For example,
\[
0 \cdot \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} + 0 \cdot \begin{bmatrix} 6 \\ 7 \\ 8 \end{bmatrix} + 137\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Any set of vectors containing the zero vector is linearly dependent.
28. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 so that
\[
c_1\begin{bmatrix} -1 \\ 1 \\ 2 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 3 \\ 2 \\ 2 \\ 4 \end{bmatrix} + c_3\begin{bmatrix} 2 \\ 3 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{ccc|c} -1 & 3 & 2 & 0 \\ 1 & 2 & 3 & 0 \\ 2 & 2 & 1 & 0 \\ 1 & 4 & -1 & 0 \end{array}\right]
\xrightarrow{-R_1}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 1 & 2 & 3 & 0 \\ 2 & 2 & 1 & 0 \\ 1 & 4 & -1 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 - R_1 \\ R_3 - 2R_1 \\ R_4 - R_1}]{}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 0 & 5 & 5 & 0 \\ 0 & 8 & 5 & 0 \\ 0 & 7 & 1 & 0 \end{array}\right]
\xrightarrow{\frac15 R_2}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 8 & 5 & 0 \\ 0 & 7 & 1 & 0 \end{array}\right]
\]
\[
\xrightarrow[\substack{R_3 - 8R_2 \\ R_4 - 7R_2}]{}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & -3 & 0 \\ 0 & 0 & -6 & 0 \end{array}\right]
\xrightarrow{-\frac13 R_3}
\left[\begin{array}{ccc|c} 1 & -3 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -6 & 0 \end{array}\right]
\xrightarrow[\substack{R_1 + 2R_3 \\ R_2 - R_3 \\ R_4 + 6R_3}]{}
\left[\begin{array}{ccc|c} 1 & -3 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\xrightarrow{R_1 + 3R_2}
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
Since the system has only the trivial solution, the given vectors are linearly independent.
29. There is no obvious dependence relation, so follow Example 2.23. We want scalars c1 , c2 , c3 , c4 so that
\[
c_1\begin{bmatrix} 1 \\ -1 \\ 1 \\ 0 \end{bmatrix} + c_2\begin{bmatrix} -1 \\ 1 \\ 0 \\ 1 \end{bmatrix} + c_3\begin{bmatrix} 1 \\ 0 \\ 1 \\ -1 \end{bmatrix} + c_4\begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Row-reduce the associated augmented matrix:
\[
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ -1 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & -1 & 0 \\ 0 & 1 & -1 & 1 & 0 \end{array}\right]
\xrightarrow[\substack{R_2 + R_1 \\ R_3 - R_1}]{}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 1 & 0 \end{array}\right]
\xrightarrow{R_2 \leftrightarrow R_3}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & -1 & 1 & 0 \end{array}\right]
\]
\[
\xrightarrow{R_4 - R_2}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & -1 & 2 & 0 \end{array}\right]
\xrightarrow{R_4 + R_3}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 3 & 0 \end{array}\right]
\xrightarrow{\frac13 R_4}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right]
\]
\[
\xrightarrow[\substack{R_2 + R_4 \\ R_3 - R_4}]{}
\left[\begin{array}{cccc|c} 1 & -1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right]
\xrightarrow{R_1 + R_2 - R_3}
\left[\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{array}\right].
\]
Since the system has only the trivial solution, the given vectors are linearly independent.
30. The vectors
\[
v_1 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \quad
v_2 = \begin{bmatrix} 0 \\ 0 \\ 2 \\ 1 \end{bmatrix}, \quad
v_3 = \begin{bmatrix} 0 \\ 3 \\ 2 \\ 1 \end{bmatrix}, \quad
v_4 = \begin{bmatrix} 4 \\ 3 \\ 2 \\ 1 \end{bmatrix}
\]
are linearly independent. For suppose c1 v1 + c2 v2 + c3 v3 + c4 v4 = 0. Then c4 must be zero in order for
the first component of the sum to be zero. But then the only vector containing a nonzero entry in the
second component is v3 , so that c3 = 0 as well. Now for the same reason c2 = 0, looking at the third
component. Finally, that implies that c1 = 0. So the vectors are linearly independent.
31. By inspection
\[
v_1 + v_2 - v_3 + v_4 = \begin{bmatrix} 3 \\ -1 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} -1 \\ 3 \\ 1 \\ -1 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},
\]
so the vectors are linearly dependent.
32. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 2 & -1 & 3 \\ -1 & 2 & 3 \end{bmatrix}
\xrightarrow{R_1' = R_1 + R_2}
\begin{bmatrix} 1 & 1 & 6 \\ -1 & 2 & 3 \end{bmatrix}
\xrightarrow{R_2' = R_2 + R_1'}
\begin{bmatrix} 1 & 1 & 6 \\ 0 & 3 & 9 \end{bmatrix}.
\]
This matrix is in row echelon form and has no zero rows, so the matrix has rank 2 and the vectors are
linearly independent by Theorem 2.7.
33. Construct a matrix A with these vectors as its rows and try to produce a zero row:

1
1
1


1 1
1
0
R =R2 −R1
2 3 −−2−−−−−→ 0
−1 2 R30 =R3 −R1 0
−−−−−−−→


1
1 1
0
1 2 00 0
R3 =R3 +2R20
−2 1 −
−−−−−−−→ 0
1
1
0

1
2
5
This matrix is in row echelon form and has no zero rows, so the matrix has rank 3 and the vectors are
linearly independent by Theorem 2.7.
34. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 2 & 2 & 1 \\ 3 & 1 & 2 \\ 1 & -5 & 2 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_3}
\begin{bmatrix} 1 & -5 & 2 \\ 3 & 1 & 2 \\ 2 & 2 & 1 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 - 3R_1 \\ R_3' = R_3 - 2R_1}]{}
\begin{bmatrix} 1 & -5 & 2 \\ 0 & 16 & -4 \\ 0 & 12 & -3 \end{bmatrix}
\xrightarrow{R_3'' = R_3' - \frac34 R_2'}
\begin{bmatrix} 1 & -5 & 2 \\ 0 & 16 & -4 \\ 0 & 0 & 0 \end{bmatrix}.
\]
Since there is a zero row, the matrix has rank less than 3, so the vectors are linearly dependent by
Theorem 2.7. Working backwards, with R1 , R2 , R3 the rows after the swap, we have
\[
[0, 0, 0] = R_3'' = R_3' - \tfrac34 R_2' = (R_3 - 2R_1) - \tfrac34 (R_2 - 3R_1) = \tfrac14 R_1 - \tfrac34 R_2 + R_3
= \tfrac14 \begin{bmatrix} 1 & -5 & 2 \end{bmatrix} - \tfrac34 \begin{bmatrix} 3 & 1 & 2 \end{bmatrix} + \begin{bmatrix} 2 & 2 & 1 \end{bmatrix}.
\]
35. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 0 & 1 & 2 \\ 2 & 1 & 3 \\ 2 & 0 & 1 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_2}
\begin{bmatrix} 2 & 1 & 3 \\ 0 & 1 & 2 \\ 2 & 0 & 1 \end{bmatrix}
\xrightarrow{R_3 - R_1}
\begin{bmatrix} 2 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & -1 & -2 \end{bmatrix}
\xrightarrow{R_3 + R_2}
\begin{bmatrix} 2 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}.
\]
This matrix is in row echelon form and has a zero row, so the vectors are linearly dependent by Theorem
2.7. Working backwards in terms of the original rows, the zero row is R3 − R2 + R1 , so
\[
[0, 0, 0] = R_1 - R_2 + R_3 = \begin{bmatrix} 0 & 1 & 2 \end{bmatrix} - \begin{bmatrix} 2 & 1 & 3 \end{bmatrix} + \begin{bmatrix} 2 & 0 & 1 \end{bmatrix}.
\]
36. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} -2 & 3 & 7 \\ 4 & -1 & 5 \\ 3 & 1 & 3 \\ 5 & 0 & 2 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 + 2R_1 \\ R_3' = R_3 + \frac32 R_1 \\ R_4' = R_4 + \frac52 R_1}]{}
\begin{bmatrix} -2 & 3 & 7 \\ 0 & 5 & 19 \\ 0 & \frac{11}{2} & \frac{27}{2} \\ 0 & \frac{15}{2} & \frac{39}{2} \end{bmatrix}
\xrightarrow[\substack{R_3'' = 2R_3' \\ R_4'' = 2R_4'}]{}
\begin{bmatrix} -2 & 3 & 7 \\ 0 & 5 & 19 \\ 0 & 11 & 27 \\ 0 & 15 & 39 \end{bmatrix}
\]
\[
\xrightarrow[\substack{R_3''' = R_3'' - \frac{11}{5} R_2' \\ R_4''' = R_4'' - 3R_2'}]{}
\begin{bmatrix} -2 & 3 & 7 \\ 0 & 5 & 19 \\ 0 & 0 & -\frac{74}{5} \\ 0 & 0 & -18 \end{bmatrix}
\xrightarrow{R_4'''' = R_4''' - \frac{5 \cdot 18}{74} R_3'''}
\begin{bmatrix} -2 & 3 & 7 \\ 0 & 5 & 19 \\ 0 & 0 & -\frac{74}{5} \\ 0 & 0 & 0 \end{bmatrix}.
\]
Since this matrix has a zero row, it has rank less than 4, so the vectors are linearly dependent. Working
backwards, we have
\[
\begin{aligned}
[0, 0, 0] = R_4'''' &= R_4''' - \tfrac{90}{74} R_3''' \\
&= R_4'' - 3R_2' - \tfrac{45}{37}\bigl(R_3'' - \tfrac{11}{5} R_2'\bigr) \\
&= 2R_4' - 3R_2' - \tfrac{90}{37} R_3' + \tfrac{99}{37} R_2' \\
&= 2\bigl(R_4 + \tfrac52 R_1\bigr) - 3(R_2 + 2R_1) - \tfrac{90}{37}\bigl(R_3 + \tfrac32 R_1\bigr) + \tfrac{99}{37}(R_2 + 2R_1) \\
&= \tfrac{26}{37} R_1 - \tfrac{12}{37} R_2 - \tfrac{90}{37} R_3 + 2R_4 .
\end{aligned}
\]
Clearing fractions gives
\[
[0, 0, 0] = 26R_1 - 12R_2 - 90R_3 + 74R_4 .
\]
Thus
\[
26\begin{bmatrix} -2 \\ 3 \\ 7 \end{bmatrix} - 12\begin{bmatrix} 4 \\ -1 \\ 5 \end{bmatrix} - 90\begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} + 74\begin{bmatrix} 5 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Note that instead of doing all this work, we could just have applied Theorem 2.8.
37. Construct a matrix A with these vectors and try to produce a zero row:
\[
\begin{bmatrix} 3 & 4 & 5 \\ 6 & 7 & 8 \\ 0 & 0 & 0 \end{bmatrix}
\]
This matrix already has a zero row, so we know by Theorem 2.7 that the vectors are linearly dependent,
and we have an obvious dependence relation: let the coefficients of each of the first two vectors be zero,
and the coefficient of the zero vector be any nonzero number.
38. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} -1 & 1 & 2 & 1 \\ 3 & 2 & 2 & 4 \\ 2 & 3 & 1 & -1 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 + 3R_1 \\ R_3' = R_3 + 2R_1}]{}
\begin{bmatrix} -1 & 1 & 2 & 1 \\ 0 & 5 & 8 & 7 \\ 0 & 5 & 5 & 1 \end{bmatrix}
\xrightarrow{R_3'' = R_3' - R_2'}
\begin{bmatrix} -1 & 1 & 2 & 1 \\ 0 & 5 & 8 & 7 \\ 0 & 0 & -3 & -6 \end{bmatrix}.
\]
This matrix is in row echelon form, and it has no zero row, so it has rank 3 and thus the vectors are
linearly independent by Theorem 2.7.
39. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 1 & -1 & 1 & 0 \\ -1 & 1 & 0 & 1 \\ 1 & 0 & 1 & -1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 + R_1 \\ R_3' = R_3 - R_1}]{}
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
\xrightarrow{R_2 \leftrightarrow R_3}
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & -1 & 1 \end{bmatrix}
\]
\[
\xrightarrow{R_4 - R_2}
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & -1 & 2 \end{bmatrix}
\xrightarrow{R_4 + R_3}
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 3 \end{bmatrix}.
\]
This matrix is in row echelon form, and it has no zero row, so it has rank 4 and thus the vectors are
linearly independent by Theorem 2.7.
40. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 1 \\ 0 & 3 & 2 & 1 \\ 4 & 3 & 2 & 1 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 - R_1 \\ R_3' = R_3 - R_1 \\ R_4' = R_4 - R_1}]{}
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 3 & 2 & 0 \\ 4 & 3 & 2 & 0 \end{bmatrix}
\xrightarrow[\substack{R_3'' = R_3' - R_2' \\ R_4'' = R_4' - R_2'}]{}
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 3 & 0 & 0 \\ 4 & 3 & 0 & 0 \end{bmatrix}
\]
\[
\xrightarrow{R_4''' = R_4'' - R_3''}
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 3 & 0 & 0 \\ 4 & 0 & 0 & 0 \end{bmatrix}
\xrightarrow{\text{reorder the rows}}
\begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]
This matrix is in row echelon form, and it has no zero row, so it has rank 4 and thus the vectors are
linearly independent by Theorem 2.7.
41. Construct a matrix A with these vectors as its rows and try to produce a zero row:
\[
\begin{bmatrix} 3 & -1 & 1 & -1 \\ -1 & 3 & 1 & -1 \\ 1 & 1 & 3 & 1 \\ -1 & -1 & 1 & 3 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_3}
\begin{bmatrix} 1 & 1 & 3 & 1 \\ -1 & 3 & 1 & -1 \\ 3 & -1 & 1 & -1 \\ -1 & -1 & 1 & 3 \end{bmatrix}
\xrightarrow[\substack{R_2' = R_2 + R_1 \\ R_3' = R_3 - 3R_1 \\ R_4' = R_4 + R_1}]{}
\begin{bmatrix} 1 & 1 & 3 & 1 \\ 0 & 4 & 4 & 0 \\ 0 & -4 & -8 & -4 \\ 0 & 0 & 4 & 4 \end{bmatrix}
\xrightarrow{R_3'' = R_3' + R_2' + R_4'}
\begin{bmatrix} 1 & 1 & 3 & 1 \\ 0 & 4 & 4 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 4 & 4 \end{bmatrix}.
\]
Since this matrix has a zero row, it has rank less than 4, so the vectors are linearly dependent. Working
backwards, with R1 , . . . , R4 the original rows (so the swap puts R3 on top and R1 third), we have
\[
[0, 0, 0, 0] = R_3'' = R_3' + R_2' + R_4' = (R_1 - 3R_3) + (R_2 + R_3) + (R_4 + R_3) = R_1 + R_2 - R_3 + R_4 .
\]
Thus
\[
\begin{bmatrix} 3 \\ -1 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} -1 \\ 3 \\ 1 \\ -1 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
\]
42. (a) In this case, rank(A) = n. To see this, recall that v1 , v2 , . . . , vn are linearly independent if and
only if the only solution of [A | 0] = [v1 v2 . . . vn | 0] is the trivial solution. Since the trivial
solution is always a solution, it is the only one if and only if the number of free variables in the
associated system is zero. Then the Rank Theorem (Theorem 2.2 in Section 2.2) says that the
number of free variables is n − rank(A), so that n = rank(A).
(b) Let
\[
A = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}.
\]
Then by Theorem 2.7, the vectors, which are the rows of A, are linearly independent if and only
if the rank of A is greater than or equal to n. But A has n rows, so its rank is at most n. Thus,
the rows of A are linearly independent if and only if the rank of A is exactly n.
43. (a) Yes, they are. Suppose that c1 (u + v) + c2 (v + w) + c3 (u + w) = 0. Expanding gives
c1 u + c1 v + c2 v + c2 w + c3 u + c3 w = (c1 + c3 )u + (c1 + c2 )v + (c2 + c3 )w = 0.
Since u, v, and w are linearly independent, all three of these coefficients are zero. So to see what
values of c1 , c2 , and c3 satisfy this equation, we must solve the linear system
c1
+ c3 = 0
c1 + c2
=0
c2 + c3 = 0.
Row-reducing the coefficient matrix gives
\[
\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\longrightarrow
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 2 \end{bmatrix}.
\]
Since the rank of the reduced matrix is 3, the only solution is the trivial solution, so c1 = c2 =
c3 = 0. Thus the original vectors u + v, v + w, and u + w are linearly independent.
(b) No, they are not. Suppose that c1 (u − v) + c2 (v − w) + c3 (u − w) = 0. Expanding gives
c1 u − c1 v + c2 v − c2 w + c3 u − c3 w = (c1 + c3 )u + (−c1 + c2 )v + (−c2 − c3 )w = 0.
Since u, v, and w are linearly independent, all three of these coefficients are zero. So to see what
values of c1 , c2 , and c3 satisfy this equation, we must solve the linear system
c1
+ c3 = 0
−c1 + c2
=0
− c2 − c3 = 0.
Row-reducing the coefficient matrix gives
\[
\begin{bmatrix} 1 & 0 & 1 \\ -1 & 1 & 0 \\ 0 & -1 & -1 \end{bmatrix}
\longrightarrow
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}.
\]
So c3 is a free variable, and a solution to the original system is c1 = c2 = −c3 . Taking c3 = −1,
for example, we have
(u − v) + (v − w) − (u − w) = 0.
44. Let v1 and v2 be the two vectors. Suppose first that at least one of them, say v1 , is zero. Then they
are linearly dependent since v1 + 0v2 = 0 + 0 = 0; further, v1 = 0v2 , so one is a scalar multiple of the
other.
Now suppose that neither vector is the zero vector. We want to show that one is a multiple of the
other if and only if they are linearly dependent. If one is a multiple of the other, say v1 = kv2 , then
clearly 1v1 − kv2 = 0, so that they are linearly dependent. To go the other way, suppose they are
linearly dependent. Then c1 v1 + c2 v2 = 0, where c1 and c2 are not both zero. Suppose that c1 ≠ 0 (if
instead c2 ≠ 0, reverse the roles of the first and second subscripts). Then
\[
c_1 v_1 + c_2 v_2 = 0 \quad\Rightarrow\quad c_1 v_1 = -c_2 v_2 \quad\Rightarrow\quad v_1 = -\frac{c_2}{c_1} v_2 ,
\]
so that v1 is a multiple of v2 , as was to be shown.
45. Suppose we have m row vectors v1 , v2 , . . . , vm in Rn with m > n. Let
\[
A = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{bmatrix}.
\]
By Theorem 2.7, the rows (the vi ) are linearly dependent if and only if rank(A) < m. By definition,
rank(A) is the number of nonzero rows in its row echelon form. But since there can be at most one
leading 1 in each column, the number of nonzero rows is at most n < m. Thus rank(A) ≤ n < m, so
that by Theorem 2.7, the vectors are linearly dependent.
2.4. APPLICATIONS
93
46. Suppose that A = {v1 , v2 , . . . , vn } is a set of linearly independent vectors, and that B is a subset of
A, say B = {vi1 , vi2 , . . . , vik } where 1 ≤ i1 < i2 < · · · < ik ≤ n. We want to show
that B is a linearly independent set. Suppose that
ci1 vi1 + ci2 vi2 + · · · + cik vik = 0.
Set cr = 0 if r is not in {i1 , i2 , . . . , ik }, and consider
c1 v1 + c2 v2 + · · · + cn vn .
This sum is zero since the sum of the i1 , i2 , . . . , ik elements is zero, and the other coefficients are zero.
But A forms a linearly independent set, so all the coefficients must in fact be zero. Thus the linear
combination of the vij above must be the trivial linear combination, so B is a linearly independent set.
47. Since S 0 = {v1 , . . . , vk } ⊂ S = {v1 , . . . , vk , v}, Exercise 20(a) proves that span(S 0 ) ⊆ span(S). Using
Exercise 21(a), to show that span(S) ⊆ span(S 0 ), we must show that each element of S is a linear
combination of elements of S 0 . But clearly each vi is a linear combination of elements of S 0 , since
vi ∈ S 0 . Also, we are given that v is a linear combination of the vi , so it too is a linear combination
of elements of S 0 . Thus span(S) ⊆ span(S 0 ). Since we have proven both inclusions, the two sets are
equal.
48. Suppose bv + b2 v2 + · · · + bk vk = 0. Substitute for v from the given equation, to get
b(c1 v1 + c2 v2 + · · · + ck vk ) + b2 v2 + · · · + bk vk = bc1 v1 + (bc2 + b2 )v2 + · · · + (bck + bk )vk = 0.
Since the vi are a linearly independent set, all of the coefficients must be zero:
bc1 = 0,
bc2 + b2 = 0,
bc3 + b3 = 0,
...
bck + bk = 0.
In particular, bc1 = 0; since we are given that c1 6= 0, it follows that b = 0. Then substituting 0 for b
in the remaining equations above
b2 = 0,
b3 = 0,
...
bk = 0.
Thus b and all of the bi are zero, so the original linear combination was the trivial one. It follows that
{v, v2 , . . . , vk } is a linearly independent set.
2.4
Applications
1. Let x1 , x2 , and x3 be the number of bacteria of strains I, II, and III, respectively. Then from the
consumption of A, B, and C, we get the following system:
\[
\begin{aligned}
x_1 + 2x_2 \phantom{{}+ 2x_3} &= 400 \\
2x_1 + x_2 + x_3 &= 600 \\
x_1 + x_2 + 2x_3 &= 600
\end{aligned}
\quad\Rightarrow\quad
\left[\begin{array}{ccc|c} 1 & 2 & 0 & 400 \\ 2 & 1 & 1 & 600 \\ 1 & 1 & 2 & 600 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 160 \\ 0 & 1 & 0 & 120 \\ 0 & 0 & 1 & 160 \end{array}\right].
\]
So 160 bacteria of strain I, 120 bacteria of strain II, and 160 bacteria of strain III can coexist.
2. Let x1 , x2 , and x3 be the number of bacteria of strains I, II, and III, respectively. Then from the
consumption of A, B, and C, we get the following system:
\[
\begin{aligned}
x_1 + 2x_2 \phantom{{}+ 3x_3} &= 400 \\
2x_1 + x_2 + 3x_3 &= 500 \\
x_1 + x_2 + x_3 &= 600
\end{aligned}
\quad\Rightarrow\quad
\left[\begin{array}{ccc|c} 1 & 2 & 0 & 400 \\ 2 & 1 & 3 & 500 \\ 1 & 1 & 1 & 600 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 2 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right].
\]
This is an inconsistent system, so the bacteria cannot coexist.
3. Let x1 , x2 , and x3 be the number of small, medium, and large arrangements. Then from the
consumption of flowers in each arrangement we get:
\[
\begin{aligned}
x_1 + 2x_2 + 4x_3 &= 24 \\
3x_1 + 4x_2 + 8x_3 &= 50 \\
3x_1 + 6x_2 + 6x_3 &= 48
\end{aligned}
\quad\Rightarrow\quad
\left[\begin{array}{ccc|c} 1 & 2 & 4 & 24 \\ 3 & 4 & 8 & 50 \\ 3 & 6 & 6 & 48 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 4 \end{array}\right].
\]
So 2 small, 3 medium, and 4 large arrangements were sold that day.
4. (a) Let x1 , x2 , and x3 be the number of nickels, dimes, and quarters. Then we get
\[
\begin{aligned}
x_1 + x_2 + x_3 &= 20 \\
2x_1 - x_2 \phantom{{}+ x_3} &= 0 \\
5x_1 + 10x_2 + 25x_3 &= 300
\end{aligned}
\quad\Rightarrow\quad
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 20 \\ 2 & -1 & 0 & 0 \\ 5 & 10 & 25 & 300 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & 8 \end{array}\right].
\]
So there are 4 nickels, 8 dimes, and 8 quarters.
(b) This is similar to part (a), except that the constraint 2x1 − x2 = 0 no longer applies, so we get
\[
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 20 \\ 5 & 10 & 25 & 300 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & -3 & -20 \\ 0 & 1 & 4 & 40 \end{array}\right].
\]
The general solution is x1 = −20 + 3t, x2 = 40 − 4t, and x3 = t. From the first equation, we must
have 3t − 20 ≥ 0, so that t ≥ 7. From the second, we have t ≤ 10. So the possible values for t are
7, 8, 9, and 10, which give the following combinations of nickels, dimes, and quarters:
[x1 , x2 , x3 ] = [1, 12, 7] or [4, 8, 8] or [7, 4, 9] or [10, 0, 10].
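The four combinations can be confirmed by brute force (a short Python sketch):

```python
# Exercise 4(b): 20 coins (nickels, dimes, quarters) worth 300 cents.
solutions = [(n, d, 20 - n - d)
             for n in range(21)
             for d in range(21 - n)
             if 5 * n + 10 * d + 25 * (20 - n - d) == 300]
print(solutions)  # [(1, 12, 7), (4, 8, 8), (7, 4, 9), (10, 0, 10)]
```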
5. Let x1 , x2 , and x3 be the number of house, special, and gourmet blends. Then from the consumption
of beans in each blend we get
300x1 + 200x2 + 100x3 = 30,000
200x2 + 200x3 = 15,000
200x1 + 100x2 + 200x3 = 25,000 .
Forming the augmented matrix and reducing it gives
\[
\left[\begin{array}{ccc|c} 300 & 200 & 100 & 30000 \\ 0 & 200 & 200 & 15000 \\ 200 & 100 & 200 & 25000 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 65 \\ 0 & 1 & 0 & 30 \\ 0 & 0 & 1 & 45 \end{array}\right].
\]
The merchant should make 65 house blend, 30 special blend, and 45 gourmet blend.
6. Let x1 , x2 , and x3 be the number of house, special, and gourmet blends. Then from the consumption
of beans in each blend we get
300x1 + 200x2 + 100x3 = 30,000
50x1 + 200x2 + 350x3 = 15,000
150x1 + 100x2 + 50x3 = 15,000 .
Forming the augmented matrix and reducing it gives
\[
\left[\begin{array}{ccc|c} 300 & 200 & 100 & 30000 \\ 50 & 200 & 350 & 15000 \\ 150 & 100 & 50 & 15000 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & -1 & 60 \\ 0 & 1 & 2 & 60 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus x1 = 60 + t, x2 = 60 − 2t, and x3 = t. From the first equation, we get t ≥ −60; from the second,
we get t ≤ 30, and the third gives us t ≥ 0. So the range of possible values is 0 ≤ t ≤ 30. We wish to
maximize the profit
\[
P = \tfrac12 x_1 + \tfrac32 x_2 + 2x_3 = \tfrac12(60 + t) + \tfrac32(60 - 2t) + 2t = 120 - \tfrac12 t.
\]
Since 0 ≤ t ≤ 30, the profit is maximized if t = 0, in which case x1 = x2 = 60 and x3 = 0. Therefore
the merchant should make 60 house and special blends, and no gourmet blends.
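A quick sketch confirming that the profit is maximized at t = 0 (Python, using the profit formula derived above):

```python
# Exercise 6: x1 = 60 + t, x2 = 60 - 2t, x3 = t with 0 <= t <= 30.
def profit(t):
    # P = (1/2)x1 + (3/2)x2 + 2*x3 = 120 - t/2
    return 0.5 * (60 + t) + 1.5 * (60 - 2 * t) + 2 * t

best = max(range(31), key=profit)
print(best, profit(best))  # 0 120.0
```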
7. Let x, y, z, and w be the number of FeS2 , O2 , Fe2 O3 , and SO2 molecules respectively. Then compare
the number of iron, sulfur, and oxygen atoms in reactants and products:
Iron:    x = 2z
Sulfur:  2x = w
Oxygen:  2y = 3z + 2w,
so we get the matrix
\[
\left[\begin{array}{cccc|c} 1 & 0 & -2 & 0 & 0 \\ 2 & 0 & 0 & -1 & 0 \\ 0 & 2 & -3 & -2 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c} 1 & 0 & 0 & -\frac12 & 0 \\ 0 & 1 & 0 & -\frac{11}{8} & 0 \\ 0 & 0 & 1 & -\frac14 & 0 \end{array}\right].
\]
Thus z = (1/4)w, y = (11/8)w, and x = (1/2)w. The smallest positive value of w that will produce integer values
for all four variables is 8, so letting w = 8 gives x = 4, y = 11, z = 2, and w = 8. The balanced
equation is 4FeS2 + 11O2 −→ 2Fe2 O3 + 8SO2 .
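Balancing by hand, as above, amounts to finding a null-space vector of the atom-balance matrix; a SymPy sketch of Exercise 7 (assuming SymPy is available):

```python
from sympy import Matrix

# Exercise 7: columns are x, y, z, w for FeS2 + O2 -> Fe2O3 + SO2.
# Rows balance iron, sulfur, and oxygen atoms (reactants minus products).
A = Matrix([[1, 0, -2, 0],    # Fe:  x = 2z
            [2, 0, 0, -1],    # S:  2x = w
            [0, 2, -3, -2]])  # O:  2y = 3z + 2w

(k,) = A.nullspace()
coeffs = k * (8 / k[-1])      # scale so that w = 8
print(coeffs.T)               # Matrix([[4, 11, 2, 8]])
```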
8. Let x, y, z, and w be the number of CO2 , H2 O, C6 H12 O6 , and O2 molecules respectively. Then compare
the number of carbon, oxygen, and hydrogen atoms in reactants and products:
Carbon:   x = 6z
Oxygen:   2x + y = 6z + 2w
Hydrogen: 2y = 12z.
This system is easy to solve using back-substitution: the first equation gives x = 6z and the third gives
y = 6z. Substituting 6z for both x and y in the second equation gives 2 · 6z + 6z = 6z + 2w, so that
w = 6z as well. The smallest positive value of z giving integers is z = 1, so that x = y = w = 6 and
z = 1. The balanced equation is 6CO2 + 6H2 O −→ C6 H12 O6 + 6O2 .
9. Let x, y, z, and w be the number of C4 H10 , O2 , CO2 , and H2 O molecules respectively. Then compare
the number of carbon, oxygen, and hydrogen atoms in reactants and products:
Carbon:   4x = z
Oxygen:   2y = 2z + w
Hydrogen: 10x = 2w.
This system is easy to solve using back-substitution: the first equation gives z = 4x and the third gives
w = 5x. Substituting these values for z and w in the second equation gives 2y = 2 · 4x + 5x = 13x, so
that y = (13/2)x. The smallest positive value of x giving integers for all four variables is x = 2, resulting
in x = 2, y = 13, z = 8, and w = 10. The balanced equation is 2C4 H10 + 13O2 −→ 8CO2 + 10H2 O.
10. Let x, y, z, and w be the number of C7 H6 O2 , O2 , H2 O, and CO2 molecules respectively. Then compare
the number of carbon, oxygen, and hydrogen atoms in reactants and products:
Carbon:   7x = w
Oxygen:   2x + 2y = z + 2w
Hydrogen: 6x = 2z.
This system is easy to solve using back-substitution: the first equation gives w = 7x and the third
gives z = 3x. Substituting these values for w and z in the second equation gives 2x + 2y = 3x + 2 · 7x,
so that y = (15/2)x. The smallest positive value for x giving integer values for all four variables is
x = 2, so that the solution is x = 2, y = 15, z = 6, and w = 14. The balanced equation is
2C7 H6 O2 + 15O2 −→ 6H2 O + 14CO2 .
11. Let x, y, z, and w be the number of C5 H11 OH, O2 , H2 O, and CO2 molecules respectively. Then compare
the number of carbon, oxygen, and hydrogen atoms in reactants and products:
Carbon:   5x = w
Oxygen:   x + 2y = z + 2w
Hydrogen: 12x = 2z.
This system is easy to solve using back-substitution: the first equation gives w = 5x and the third
gives z = 6x. Substituting these values for w and z in the second equation gives x + 2y = 6x + 2 · 5x,
so that y = (15/2)x. The smallest positive value for x giving integer values for all four variables is
x = 2, so that the solution is x = 2, y = 15, z = 12, and w = 10. The balanced equation is
2C5 H11 OH + 15O2 −→ 12H2 O + 10CO2 .
12. Let x, y, z, and w be the number of HClO4 , P4 O10 , H3 PO4 , and Cl2 O7 molecules respectively. Then
compare the number of hydrogen, chlorine, oxygen, and phosphorus atoms in reactants and products:
Hydrogen:   x = 3z
Chlorine:   x = 2w
Oxygen:     4x + 10y = 4z + 7w
Phosphorus: 4y = z.
This gives the matrix
\[
\left[\begin{array}{cccc|c} 1 & 0 & -3 & 0 & 0 \\ 1 & 0 & 0 & -2 & 0 \\ 4 & 10 & -4 & -7 & 0 \\ 0 & 4 & -1 & 0 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c} 1 & 0 & 0 & -2 & 0 \\ 0 & 1 & 0 & -\frac16 & 0 \\ 0 & 0 & 1 & -\frac23 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus the general solution is x = 2t, y = (1/6)t, z = (2/3)t, w = t. The smallest positive value of t giving all
integer answers is t = 6, so the desired solution is x = 12, y = 1, z = 4, and w = 6. The balanced
equation is 12HClO4 + P4 O10 −→ 4H3 PO4 + 6Cl2 O7 .
13. Let x1 , x2 , x3 , x4 , and x5 be the number of Na2 CO3 , C, N2 , NaCN, and CO molecules respectively.
Then compare the number of sodium, carbon, oxygen, and nitrogen atoms in reactants and products:
Sodium:   2x1 = x4
Carbon:   x1 + x2 = x4 + x5
Oxygen:   3x1 = x5
Nitrogen: 2x3 = x4 .
This gives the matrix
\[
\left[\begin{array}{ccccc|c} 2 & 0 & 0 & -1 & 0 & 0 \\ 1 & 1 & 0 & -1 & -1 & 0 \\ 3 & 0 & 0 & 0 & -1 & 0 \\ 0 & 0 & 2 & -1 & 0 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & -\frac13 & 0 \\ 0 & 1 & 0 & 0 & -\frac43 & 0 \\ 0 & 0 & 1 & 0 & -\frac13 & 0 \\ 0 & 0 & 0 & 1 & -\frac23 & 0 \end{array}\right].
\]
Thus x5 is a free variable, and
\[
x_1 = \tfrac13 x_5 , \quad x_2 = \tfrac43 x_5 , \quad x_3 = \tfrac13 x_5 , \quad x_4 = \tfrac23 x_5 .
\]
The smallest value of x5 that produces integer solutions is 3, so the desired solution is x1 = 1, x2 = 4,
x3 = 1, x4 = 2, and x5 = 3. The balanced equation is Na2 CO3 + 4C + N2 −→ 2NaCN + 3CO.
14. Let x1 , x2 , x3 , x4 , and x5 be the number of C2 H2 Cl4 , Ca(OH)2 , C2 HCl3 , CaCl2 , and H2 O molecules
respectively. Then compare the number of carbon, hydrogen, chlorine, calcium, and oxygen atoms in
reactants and products:
Carbon:   2x1 = 2x3
Hydrogen: 2x1 + 2x2 = x3 + 2x5
Chlorine: 4x1 = 3x3 + 2x4
Calcium:  x2 = x4
Oxygen:   2x2 = x5 .
This gives the matrix
\[
\left[\begin{array}{ccccc|c} 2 & 0 & -2 & 0 & 0 & 0 \\ 2 & 2 & -1 & 0 & -2 & 0 \\ 4 & 0 & -3 & -2 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 & 0 \\ 0 & 2 & 0 & 0 & -1 & 0 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 & -\frac12 & 0 \\ 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 & -\frac12 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus x5 is a free variable, and
\[
x_1 = x_5 , \quad x_2 = \tfrac12 x_5 , \quad x_3 = x_5 , \quad x_4 = \tfrac12 x_5 .
\]
The smallest value of x5 that produces integer solutions is 2, so the desired solution is x1 = 2, x2 = 1,
x3 = 2, x4 = 1, and x5 = 2. The balanced equation is 2C2 H2 Cl4 + Ca(OH)2 −→ 2C2 HCl3 + CaCl2 +
2H2 O.
15. (a) By applying the conservation of flow rule to each node we obtain the system of equations
f1 + f2 = 20
f2 − f3 = −10
f1 + f3 = 30.
Form the augmented matrix and row-reduce it:
\[
\left[\begin{array}{ccc|c} 1 & 1 & 0 & 20 \\ 0 & 1 & -1 & -10 \\ 1 & 0 & 1 & 30 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 1 & 30 \\ 0 & 1 & -1 & -10 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
Letting f3 = t, the possible flows are f1 = 30 − t, f2 = t − 10, f3 = t.
Letting f3 = t, the possible flows are f1 = 30 − t, f2 = t − 10, f3 = t.
(b) The flow through AB is f2 , so in this case 5 = t − 10 and t = 15. So the other flows are
f1 = f3 = 15.
(c) Since each flow must be nonnegative, the first equation gives t ≤ 30, the second gives t ≥ 10, and
the third gives t ≥ 0. So we have 10 ≤ t ≤ 30, which means that
0 ≤ f1 ≤ 20
0 ≤ f2 ≤ 20
10 ≤ f3 ≤ 30.
(d) If a negative flow were allowed, it would mean a flow in the opposite direction, so that a negative
flow on f2 would mean a flow from B into A.
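The bounds in part (c) can be spot-checked by brute force over integer values of t (a Python sketch; the argument above does not require t to be an integer, this is just a check):

```python
# Exercise 15: f1 = 30 - t, f2 = t - 10, f3 = t, with every flow nonnegative.
feasible = [t for t in range(0, 41)
            if 30 - t >= 0 and t - 10 >= 0]
print(min(feasible), max(feasible))  # 10 30

# Resulting ranges for f1 and f2:
print(min(30 - t for t in feasible), max(30 - t for t in feasible))  # 0 20
print(min(t - 10 for t in feasible), max(t - 10 for t in feasible))  # 0 20
```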
16. (a) By applying the conservation of flow rule to each node we obtain the system of equations
f1 + f2 = 20
f1 + f3 = 25
f2 + f4 = 25
f3 + f4 = 30.
Form the augmented matrix and row-reduce it:
\[
\left[\begin{array}{cccc|c} 1 & 1 & 0 & 0 & 20 \\ 1 & 0 & 1 & 0 & 25 \\ 0 & 1 & 0 & 1 & 25 \\ 0 & 0 & 1 & 1 & 30 \end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c} 1 & 0 & 0 & -1 & -5 \\ 0 & 1 & 0 & 1 & 25 \\ 0 & 0 & 1 & 1 & 30 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Letting f4 = t, the possible flows are f1 = t − 5, f2 = 25 − t, f3 = 30 − t, f4 = t.
(b) If f4 = t = 10, then the average flows on the other streets will be f1 = 10−5 = 5, f2 = 25−10 = 15,
and f3 = 30 − 10 = 20.
(c) Each flow must be nonnegative, so we get the constraints t ≥ 5, t ≤ 25, t ≤ 30, and t ≥ 0. Thus
5 ≤ t ≤ 25. This gives the constraints
0 ≤ f1 ≤ 20
0 ≤ f2 ≤ 20
5 ≤ f3 ≤ 25
5 ≤ f4 ≤ 25.
(d) Reversing all of the directions would have no effect on the solution. This would mean multiplying
all rows of the augmented matrix by −1. However, multiplying a row by −1 is an elementary row
operation, so the new matrix would have the same reduced row echelon form and thus the same
solution.
17. (a) By applying the conservation of flow rule to each node we obtain the system of equations
f1 + f2 = 100
f2 + f3 = f4 + 150
f4 + f5 = 150
f1 + 200 = f3 + f5 .
Rearrange the equations, set up the augmented matrix, and row-reduce it:
\[
\left[\begin{array}{ccccc|c} 1 & 1 & 0 & 0 & 0 & 100 \\ 0 & 1 & 1 & -1 & 0 & 150 \\ 0 & 0 & 0 & 1 & 1 & 150 \\ 1 & 0 & -1 & 0 & -1 & -200 \end{array}\right]
\longrightarrow
\left[\begin{array}{ccccc|c} 1 & 0 & -1 & 0 & -1 & -200 \\ 0 & 1 & 1 & 0 & 1 & 300 \\ 0 & 0 & 0 & 1 & 1 & 150 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Then both f3 = s and f5 = t are free variables, so the flows are
\[
f_1 = -200 + s + t, \quad f_2 = 300 - s - t, \quad f_3 = s, \quad f_4 = 150 - t, \quad f_5 = t.
\]
(b) If DC is closed, then f5 = t = 0, so that f1 = −200 + s and f2 = 300 − s. Thus s, which is the
flow through DB, must satisfy 200 ≤ s ≤ 300 liters per day.
(c) If DB were closed, then f3 = s = 0, so f1 = −200 + t and thus t = f5 ≥ 200. But that would mean
that f4 = 150 − t ≤ −50 < 0, which is impossible.
(d) Since f4 = 150 − t, we must have 0 ≤ t ≤ 150, so that 0 ≤ f4 ≤ 150. Note that all of these values
for t are consistent with the first three equations as well: if t = 0, then s = 250 produces positive
values for all flows, while if t = 150, then s = 75 produces all positive values. So the flow through
DB is between 0 and 150 liters per day.
18. (a) By applying the conservation of flow rule to each node we obtain:
f3 + 200 = f1 + 100
f1 + 150 = f2 + f4
f2 + f5 = 300
f6 + 100 = f3 + 200
f4 + f7 = f6 + 100
f5 + f7 = 250.
Create and row-reduce the augmented matrix for the system:

    [ 1  0 −1  0 0  0 0 |  100 ]
    [ 1 −1  0 −1 0  0 0 | −150 ]
    [ 0  1  0  0 1  0 0 |  300 ]
    [ 0  0  1  0 0 −1 0 | −100 ]
    [ 0  0  0  1 0 −1 1 |  100 ]
    [ 0  0  0  0 1  0 1 |  250 ]

which reduces to

    [ 1 0 0 0 0 −1  0 |    0 ]
    [ 0 1 0 0 0  0 −1 |   50 ]
    [ 0 0 1 0 0 −1  0 | −100 ]
    [ 0 0 0 1 0 −1  1 |  100 ]
    [ 0 0 0 0 1  0  1 |  250 ]
    [ 0 0 0 0 0  0  0 |    0 ]
Let f7 = t and f6 = s, so the flows are
f1 = s
f2 = 50 + t
f3 = −100 + s
f4 = 100 + s − t
f5 = 250 − t
f6 = s
f7 = t.
(b) From the solution in part (a), since f1 = s = f6 , it is not possible for f1 and f6 to have different
values.
2.4. APPLICATIONS
(c) If f4 = 0, then t − s = 100, so that t = 100 + s. Substituting this for t everywhere gives

    f1 = s,  f2 = 150 + s,  f3 = −100 + s,  f4 = 0,  f5 = 150 − s,  f6 = s,  f7 = 100 + s.

The requirement that all of the fi be nonnegative means that 100 ≤ s ≤ 150, so that

    100 ≤ f1 ≤ 150,  250 ≤ f2 ≤ 300,  0 ≤ f3 ≤ 50,  f4 = 0,  0 ≤ f5 ≤ 50,  100 ≤ f6 ≤ 150,  200 ≤ f7 ≤ 250.
19. Applying the current law to node A gives I1 + I3 = I2 or I1 − I2 + I3 = 0. Applying the voltage
law to the top circuit, ABCA, gives −I2 − I1 + 8 = 0. Similarly, for the circuit ABDA we obtain
−I2 + 13 − 4I3 = 0. This gives the system

    I1 − I2 +  I3 = 0
    I1 + I2       = 8
         I2 + 4I3 = 13

with augmented matrix

    [ 1 −1 1 |  0 ]
    [ 1  1 0 |  8 ]
    [ 0  1 4 | 13 ]

which row reduces to

    [ 1 0 0 | 3 ]
    [ 0 1 0 | 5 ]
    [ 0 0 1 | 2 ]
Thus I1 = 3 amps, I2 = 5 amps, and I3 = 2 amps.
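Since the coefficient matrix here is square and invertible, the currents can also be cross-checked with Cramer's rule. The sketch below (our own helpers, not the manual's method) does this for the system of Exercise 19:

```python
# Cross-check of Exercise 19 via Cramer's rule.
def det3(m):
    # 3x3 determinant by cofactor expansion along the first row
    a, b, c = m[0]; d, e, f = m[1]; g, h, i = m[2]
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def replace_col(M, j, v):
    # copy of M with column j replaced by the vector v
    return [row[:j] + [v[k]] + row[j+1:] for k, row in enumerate(M)]

A = [[1, -1, 1],
     [1,  1, 0],
     [0,  1, 4]]
b = [0, 8, 13]
D = det3(A)
I1, I2, I3 = (det3(replace_col(A, j, b)) / D for j in range(3))
print(I1, I2, I3)   # currents in amps
```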
20. Applying the current law to node B and the voltage law to the upper and lower circuits gives

    I1 − I2 +  I3 = 0
    I1 + 2I2      = 5
        2I2 + 4I3 = 8

with augmented matrix

    [ 1 −1 1 | 0 ]
    [ 1  2 0 | 5 ]
    [ 0  2 4 | 8 ]

which row reduces to

    [ 1 0 0 | 1 ]
    [ 0 1 0 | 2 ]
    [ 0 0 1 | 1 ]

Thus I1 = 1 amp, I2 = 2 amps, and I3 = 1 amp.
21. (a) Applying the current and voltage laws to the circuit gives the system:
Node B:
I = I1 + I4
Node C:
I1 = I2 + I3
Node D:
I2 + I5 = I
Node E:
I3 + I4 = I5
Circuit ABEDA: −2I4 − I5 + 14 = 0
Circuit BCEB:
−I1 − I3 + 2I4 = 0
Circuit CDEC: −2I2 + I3 + I5 = 0
Construct the augmented matrix and row-reduce it:

    [  1 −1  0  0 −1  0 |  0 ]
    [  0  1 −1 −1  0  0 |  0 ]
    [ −1  0  1  0  0  1 |  0 ]
    [  0  0  0  1  1 −1 |  0 ]
    [  0  0  0  0  2  1 | 14 ]
    [  0 −1  0 −1  2  0 |  0 ]
    [  0  0 −2  1  0  1 |  0 ]

which reduces to

    [ 1 0 0 0 0 0 | 10 ]
    [ 0 1 0 0 0 0 |  6 ]
    [ 0 0 1 0 0 0 |  4 ]
    [ 0 0 0 1 0 0 |  2 ]
    [ 0 0 0 0 1 0 |  4 ]
    [ 0 0 0 0 0 1 |  6 ]
    [ 0 0 0 0 0 0 |  0 ]
Thus the currents are I = 10, I1 = 6, I2 = 4, I3 = 2, I4 = 4, and I5 = 6 amps.
(b) From Ohm's Law, the effective resistance is Reff = V/I = 14/10 = 7/5 ohms.
(c) To force the current in CE to be zero, we force I3 = 0 and let r be the resistance in BC. This
gives the following system of equations:
    Node B:        I = I1 + I4
    Node C:        I1 = I2
    Node D:        I2 + I5 = I
    Node E:        I4 = I5
    Circuit ABEDA: −2I4 − I5 + 14 = 0
    Circuit BCEB:  −rI1 + 2I4 = 0
    Circuit CDEC:  −2I2 + I5 = 0
It is easier to solve this system by substitution. Substituting I4 = I5 into the circuit equation for
ABEDA gives −3I5 + 14 = 0, so that I4 = I5 = 14/3. Then −2I2 + 14/3 = 0, so that I1 = I2 = 7/3.
The circuit equation for BCEB then gives

    −r · (7/3) + 2 · (14/3) = 0  ⇒  r = 4.
So if we use a 4 ohm resistor in BC, the current in CE will be zero.
22. (a) Applying the voltage law to Figure 2.23(a) gives IR1 +IR2 = E. But from Ohm’s Law, E = IReff ,
so that I(R1 + R2 ) = IReff and thus Reff = R1 + R2 .
(b) Applying the current and voltage laws to Figure 2.23(b) gives the system of equations I = I1 + I2 ,
−I1 R1 + I2 R2 = 0, −I1 R1 + E = 0, and −I2 R2 + E = 0. Gauss-Jordan elimination gives

    [ 1  −1   −1 |  0 ]
    [ 0 −R1   R2 |  0 ]
    [ 0 −R1    0 | −E ]
    [ 0   0  −R2 | −E ]

which reduces to

    [ 1 0 0 | (R1 + R2)E/(R1R2) ]
    [ 0 1 0 | E/R1 ]
    [ 0 0 1 | E/R2 ]
    [ 0 0 0 | 0 ]

Thus I = (R1 + R2)E/(R1R2), I1 = E/R1, and I2 = E/R2. But E = IReff; substituting this value of E into the
equation for I gives

    I = (R1 + R2)IReff/(R1R2)  ⇒  Reff = R1R2/(R1 + R2) = 1/(1/R1 + 1/R2).
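A tiny numeric sanity check of the two formulas just derived, with hypothetical resistor values R1 = 2 and R2 = 4 ohms:

```python
# Series and parallel effective resistance, as in Exercise 22.
from fractions import Fraction

def series(R1, R2):
    # Reff = R1 + R2
    return R1 + R2

def parallel(R1, R2):
    # 1/Reff = 1/R1 + 1/R2
    return 1 / (Fraction(1, 1) / R1 + Fraction(1, 1) / R2)

print(series(2, 4), parallel(2, 4))  # parallel(2, 4) = 8/6 = 4/3 ohms
```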
23. The matrix for this system, with inputs on the left, outputs along the top, is
                   Farming   Manufacturing
    Farming          1/2         1/3
    Manufacturing    1/2         2/3

Let the output of the farm sector be f and the output of the manufacturing sector be m. Then

    (1/2)f + (1/3)m = f
    (1/2)f + (2/3)m = m.

Simplifying gives

    −(1/2)f + (1/3)m = 0
     (1/2)f − (1/3)m = 0

and row-reducing the corresponding augmented matrix gives

    [ −1/2   1/3 | 0 ]      [ 1 −2/3 | 0 ]
    [  1/2  −1/3 | 0 ]  →   [ 0    0 | 0 ]

Thus f = (2/3)m, so that farming and manufacturing must be in a 2 : 3 ratio.
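The closed-exchange balance can be confirmed directly; in this sketch we take m = 3 (any multiple of 3 avoids fractions) and check both balance equations:

```python
# Closed exchange model of Exercise 23, checked with exact fractions.
from fractions import Fraction

half, third, two_thirds = Fraction(1, 2), Fraction(1, 3), Fraction(2, 3)
# From -(1/2)f + (1/3)m = 0 we get f = (2/3)m; take m = 3 to clear fractions.
m = 3
f = two_thirds * m
# Verify both balance equations of the exchange:
assert half * f + third * m == f
assert half * f + two_thirds * m == m
print(f, m)  # outputs in a 2 : 3 ratio
```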
24. The matrix for this system, with needed resources on the left, outputs along the top, is
           Coal  Steel
    Coal    0.3   0.8
    Steel   0.7   0.2

Let the coal output be c and the steel output be s. Then

    0.3c + 0.8s = c
    0.7c + 0.2s = s.

Simplifying and row-reducing the corresponding augmented matrix gives

    −0.7c + 0.8s = 0        [ −0.7  0.8 | 0 ]      [ 1 −8/7 | 0 ]
     0.7c − 0.8s = 0   ⇒    [  0.7 −0.8 | 0 ]  →   [ 0    0 | 0 ]

Thus c = (8/7)s, so that coal and steel must be in an 8 : 7 ratio. For a total output of $20 million, coal
output must be (8/15) · 20 = 32/3 ≈ $10.667 million, while steel output is (7/15) · 20 = 28/3 ≈ $9.333 million.
25. Let x be the painter’s rate, y the plumber’s rate, and z the electrician’s rate. From the given schedule,
we get the linear system
2x + y + 5z = 10x
4x + 5y + z = 10y
4x + 4y + 4z = 10z.
Simplifying gives the homogeneous system

    −8x +  y + 5z = 0
     4x − 5y +  z = 0
     4x + 4y − 6z = 0

whose augmented matrix row reduces as

    [ −8  1  5 | 0 ]      [ 1 0 −13/18 | 0 ]
    [  4 −5  1 | 0 ]  →   [ 0 1  −7/9  | 0 ]
    [  4  4 −6 | 0 ]      [ 0 0    0   | 0 ]

Thus x = (13/18)z and y = (7/9)z. Let z = 54; this is a whole number in the required range such that the
others will be whole numbers as well. Then x = 39 and y = 42. The painter should charge $39 per
hour, the plumber $42 per hour, and the electrician $54 per hour.
26. From the given matrix, we get the linear system
    (1/4)L + (1/8)T + (1/6)Z = B
    (1/2)B + (1/4)L + (1/4)T + (1/6)Z = L
    (1/4)B + (1/4)L + (1/2)T + (1/3)Z = T
    (1/4)B + (1/4)L + (1/8)T + (1/3)Z = Z.

Simplify and row-reduce the associated augmented matrix:

    [ −1    1/4   1/8   1/6 | 0 ]      [ 1 0 0 −2/3 | 0 ]
    [ 1/2  −3/4   1/4   1/6 | 0 ]  →   [ 0 1 0 −6/5 | 0 ]
    [ 1/4   1/4  −1/2   1/3 | 0 ]      [ 0 0 1 −8/5 | 0 ]
    [ 1/4   1/4   1/8  −2/3 | 0 ]      [ 0 0 0   0  | 0 ]

Thus B = (2/3)Z, L = (6/5)Z, and T = (8/5)Z. Furthermore, B is the smallest price of the four, so B = (2/3)Z
must be at least 50, in which case Z = 75. With that value of Z, we get B = 50, L = 90, and T = 120.
27. The matrix for this system, with needed resources on the left, outputs along the top, is
           Coal   Steel
    Coal    0.15   0.25
    Steel   0.2    0.1

(a) Let the coal output be c and the steel output be s. Then since $45 million of external demand is
required for coal and $124 million for steel, we have

    c = 0.15c + 0.25s + 45
    s = 0.2c + 0.1s + 124.

Simplify and row-reduce the associated matrix:

    0.85c − 0.25s = 45        [ 0.85 −0.25 |  45 ]      [ 1 0 | 100 ]
    −0.2c + 0.9s = 124   ⇒    [ −0.2   0.9 | 124 ]  →   [ 0 1 | 160 ]

So the coal industry should produce $100 million and the steel industry should produce $160
million.

(b) The new external demands for coal and steel are $40 and $130 million, respectively. The only thing
that changes in the equations is the constant terms, so we have

    0.85c − 0.25s = 40        [ 0.85 −0.25 |  40 ]      [ 1 0 |  95.8  ]
    −0.2c + 0.9s = 130   ⇒    [ −0.2   0.9 | 130 ]  →   [ 0 1 | 165.73 ]

So the coal industry should reduce production by 100 − 95.8 = $4.2 million, while the steel industry
should increase production by 165.73 − 160 = $5.73 million.
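Part (a) is the open Leontief equation (I − C)x = d; the sketch below solves the 2x2 case by hand-rolled elimination with exact fractions (the helper names are ours):

```python
# Open Leontief model of Exercise 27(a): solve (I - C)x = d exactly.
from fractions import Fraction as F

C = [[F('0.15'), F('0.25')],
     [F('0.2'),  F('0.1')]]
d = [F(45), F(124)]
# Coefficients of (I - C)x = d:
a11, a12 = 1 - C[0][0], -C[0][1]
a21, a22 = -C[1][0], 1 - C[1][1]
det = a11 * a22 - a12 * a21
# 2x2 Cramer solution:
x1 = (d[0] * a22 - a12 * d[1]) / det
x2 = (a11 * d[1] - d[0] * a21) / det
print(x1, x2)  # coal and steel output, in millions of dollars
```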
28. This is an open system with external demand given by the other city departments. From the given
matrix, we get the system
1 + 0.2A + 0.1H + 0.2T = A
1.2 + 0.1A + 0.1H + 0.2T = H
0.8 + 0.2A + 0.4H + 0.3T = T .
Simplifying and row-reducing the associated augmented matrix gives

 
0.8 −0.1 −0.2
0.8A − 0.1H − 0.2T = 1
1
1
−0.1A + 0.9H − 0.2T = 1.2 ⇒ −0.1
0.9 −0.2 1.2 →
− 0
−0.2A − 0.4H + 0.7T = 0.8
−0.2 −0.4
0.7 0.8
0
0
1
0
0
0
1

2.31
2.28 .
3.11
So the Administrative department should produce $2.31 million of services, the Health department
$2.28 million, and the Transportation department $3.11 million.
29. (a) Over Z2, we need to solve x1 a + x2 b + · · · + x5 e = s + t. Row-reducing the augmented matrix
[a b c d e | s + t], where s + t = [0, 1, 0, 1, 0]^T, gives

    [ 1 1 0 0 0 | 0 ]
    [ 1 1 1 0 0 | 1 ]
    [ 0 1 1 1 0 | 0 ]
    [ 0 0 1 1 1 | 1 ]
    [ 0 0 0 1 1 | 0 ]

which reduces to

    [ 1 0 0 0 1 | 1 ]
    [ 0 1 0 0 1 | 1 ]
    [ 0 0 1 0 0 | 1 ]
    [ 0 0 0 1 1 | 0 ]
    [ 0 0 0 0 0 | 0 ]

Hence x5 is a free variable, so there are exactly two solutions (with x5 = 0 and with x5 = 1).
Solving for the other variables, we get x1 = 1 + x5, x2 = 1 + x5, x3 = 1, and x4 = x5. So the two
solutions are

    [x1, x2, x3, x4, x5] = [1, 1, 1, 0, 0]   and   [x1, x2, x3, x4, x5] = [0, 0, 1, 1, 1].
So we can push switches 1, 2, and 3 or switches 3, 4, and 5.
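Over Z2 the same Gauss-Jordan process works, with all addition done mod 2; this sketch (our own `rref_mod2` helper) reduces the augmented matrix from part (a):

```python
# Gaussian elimination over Z2 for the five-light puzzle of Exercise 29(a).
def rref_mod2(M):
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols - 1):          # last column is the RHS
        piv = next((i for i in range(r, rows) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(rows):
            if i != r and M[i][c]:
                # adding rows mod 2 is XOR
                M[i] = [(a + b) % 2 for a, b in zip(M[i], M[r])]
        r += 1
    return M

aug = [[1, 1, 0, 0, 0, 0],
       [1, 1, 1, 0, 0, 1],
       [0, 1, 1, 1, 0, 0],
       [0, 0, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 0]]
R = rref_mod2(aug)
print(R)
```

The result reproduces the reduced matrix above, with x5 free.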
(b) In this case, t = e2 over Z2, so the matrix becomes

    [ 1 1 0 0 0 | 0 ]      [ 1 0 0 0 1 | 0 ]
    [ 1 1 1 0 0 | 1 ]      [ 0 1 0 0 1 | 0 ]
    [ 0 1 1 1 0 | 0 ]  →   [ 0 0 1 0 0 | 0 ]
    [ 0 0 1 1 1 | 0 ]      [ 0 0 0 1 1 | 0 ]
    [ 0 0 0 1 1 | 0 ]      [ 0 0 0 0 0 | 1 ]
This is an inconsistent system, so there is no solution. Thus it is impossible to start with all the
lights off and turn on only the second light.
30. (a) With s = e4 and t = e2 + e4 over Z2, we have s + t = e2, and we get the matrix

    [ 1 1 0 0 0 | 0 ]      [ 1 0 0 0 1 | 0 ]
    [ 1 1 1 0 0 | 1 ]      [ 0 1 0 0 1 | 0 ]
    [ 0 1 1 1 0 | 0 ]  →   [ 0 0 1 0 0 | 0 ]
    [ 0 0 1 1 1 | 0 ]      [ 0 0 0 1 1 | 0 ]
    [ 0 0 0 1 1 | 0 ]      [ 0 0 0 0 0 | 1 ]
Since the system is inconsistent, there is no solution in this case: it is impossible to start with
only the fourth light on and end up with only the second and fourth lights on.
(b) In this case t = e2 over Z2, so that s + t = e2 + e4, and we get

    [ 1 1 0 0 0 | 0 ]      [ 1 0 0 0 1 | 1 ]
    [ 1 1 1 0 0 | 1 ]      [ 0 1 0 0 1 | 1 ]
    [ 0 1 1 1 0 | 0 ]  →   [ 0 0 1 0 0 | 1 ]
    [ 0 0 1 1 1 | 1 ]      [ 0 0 0 1 1 | 0 ]
    [ 0 0 0 1 1 | 0 ]      [ 0 0 0 0 0 | 0 ]

So there are two solutions:

    [x1, x2, x3, x4, x5] = [1, 1, 1, 0, 0]   and   [x1, x2, x3, x4, x5] = [0, 0, 1, 1, 1].
So we can push switches 1, 2, and 3 or switches 3, 4, and 5.
31. Recall from Example 2.35 that it does not matter what order we push the switches in, and that pushing
a switch twice is the same as not pushing it at all. So with

    a = [1, 1, 0, 0, 0]^T,  b = [1, 1, 1, 0, 0]^T,  c = [0, 1, 1, 1, 0]^T,  d = [0, 0, 1, 1, 1]^T,  e = [0, 0, 0, 1, 1]^T,

there are 2^5 = 32 possible choices, since there are 5 buttons and we have the choice of pushing or not
pushing each one. If we make a list of the 32 resulting light states and eliminate duplicates,
we see that the list of possible states is

    [0, 0, 0, 0, 0], [0, 0, 0, 1, 1], [0, 0, 1, 1, 1], [0, 0, 1, 0, 0], [0, 1, 1, 1, 0], [0, 1, 1, 0, 1], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0],
    [1, 1, 1, 0, 0], [1, 1, 1, 1, 1], [1, 1, 0, 1, 1], [1, 1, 0, 0, 0], [1, 0, 0, 1, 0], [1, 0, 0, 0, 1], [1, 0, 1, 0, 1], [1, 0, 1, 1, 0].
32. (a) Over Z3, we need to solve x1 a + x2 b + x3 c = t, where t = [0, 2, 1]^T, so that

    [ 1 1 0 | 0 ]      [ 1 0 0 | 1 ]
    [ 1 1 1 | 2 ]  →   [ 0 1 0 | 2 ]
    [ 0 1 1 | 1 ]      [ 0 0 1 | 2 ]

Thus x1 = 1, x2 = 2, and x3 = 2, so we must push switch A once and switches B and C twice
each.
(b) Over Z3, we need to solve x1 a + x2 b + x3 c = t, where t = [1, 0, 1]^T, so that

    [ 1 1 0 | 1 ]      [ 1 0 0 | 2 ]
    [ 1 1 1 | 0 ]  →   [ 0 1 0 | 2 ]
    [ 0 1 1 | 1 ]      [ 0 0 1 | 2 ]

In other words, we must push each switch twice.
(c) Let x, y, and z be the final states of the three lights. Then we want to solve

    [ 1 1 0 | x ]      [ 1 0 0 | y + 2z ]
    [ 1 1 1 | y ]  →   [ 0 1 0 | x + 2y + z ]
    [ 0 1 1 | z ]      [ 0 0 1 | 2x + y ]

So we can achieve this configuration if we push switch A y + 2z times, switch B x + 2y + z times,
and switch C 2x + y times (adding in Z3).
33. The switches here correspond to the same vectors as in Example 2.35, but now the numbers are
interpreted in Z3 rather than in Z2. So in Z3^5, with s = 0 (all lights initially off), we need to solve
x1 a + x2 b + · · · + x5 e = t, where t = [2, 1, 2, 1, 2]^T, so that

    [ 1 1 0 0 0 | 2 ]      [ 1 0 0 0 1 | 1 ]
    [ 1 1 1 0 0 | 1 ]      [ 0 1 0 0 2 | 1 ]
    [ 0 1 1 1 0 | 2 ]  →   [ 0 0 1 0 0 | 2 ]
    [ 0 0 1 1 1 | 1 ]      [ 0 0 0 1 1 | 2 ]
    [ 0 0 0 1 1 | 2 ]      [ 0 0 0 0 0 | 0 ]

So x5 is a free variable, and there are three solutions, corresponding to x5 = 0, 1, or 2. Solving for the
other variables gives

    x1 = 1 − x5,   x2 = 1 − 2x5,   x3 = 2,   x4 = 2 − x5,

so that the solutions are (remembering that 1 − 2 = 2 and 2 · 2 = 1 in Z3)

    [x1, x2, x3, x4, x5] = [1, 1, 2, 2, 0], [0, 2, 2, 1, 1], or [2, 0, 2, 0, 2].
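The analogous reduction over Z3: every nonzero residue mod 3 is invertible (1 · 1 = 2 · 2 = 1), so Gauss-Jordan goes through unchanged. A sketch with our own helper:

```python
# Gauss-Jordan elimination over Z3 for the augmented matrix of Exercise 33.
def rref_mod3(M):
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols - 1):          # last column is the RHS
        piv = next((i for i in range(r, rows) if M[i][c] % 3), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = M[r][c]                  # in Z3 every nonzero element is its own inverse
        M[r] = [(x * inv) % 3 for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] % 3:
                f = M[i][c]
                M[i] = [(a - f * b) % 3 for a, b in zip(M[i], M[r])]
        r += 1
    return M

aug = [[1, 1, 0, 0, 0, 2],
       [1, 1, 1, 0, 0, 1],
       [0, 1, 1, 1, 0, 2],
       [0, 0, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 2]]
R = rref_mod3(aug)
print(R)
```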
34. The possible configurations are given by

    [ 1 1 0 0 0 ] [ s1 ]   [ s1 + s2 ]
    [ 1 1 1 0 0 ] [ s2 ]   [ s1 + s2 + s3 ]
    [ 0 1 1 1 0 ] [ s3 ] = [ s2 + s3 + s4 ]
    [ 0 0 1 1 1 ] [ s4 ]   [ s3 + s4 + s5 ]
    [ 0 0 0 1 1 ] [ s5 ]   [ s4 + s5 ]

where si ∈ {0, 1, 2} is the number of times that switch i is pushed (and all arithmetic is done in Z3).
35. (a) Let a value of 1 represent white, and 0 represent black. Let ai correspond to touching the number
i; it has a 1 in a given entry if the value of that square changes when i is pressed, and a 0 if it does
not (thus the 1’s correspond to the squares with the blue dots in them in the diagram). Since
each square has two states, white and black, we need to solve, over Z2 , the equation
s + x1 a1 + x2 a2 + · · · + x9 a9 = t.
Here

    s = [0, 1, 1, 1, 0, 1, 1, 1, 0]^T,   t = [0, 0, 0, 0, 0, 0, 0, 0, 0]^T,
so that
x1 a1 + x2 a2 + · · · + x9 a9 = 0 − s = s.
Row-reducing the corresponding augmented matrix gives

    [ 1 1 0 1 0 0 0 0 0 | 0 ]      [ 1 0 0 0 0 0 0 0 0 | 0 ]
    [ 1 1 1 0 1 0 0 0 0 | 1 ]      [ 0 1 0 0 0 0 0 0 0 | 0 ]
    [ 0 1 1 0 0 1 0 0 0 | 1 ]      [ 0 0 1 0 0 0 0 0 0 | 1 ]
    [ 1 0 0 1 1 0 1 0 0 | 1 ]      [ 0 0 0 1 0 0 0 0 0 | 0 ]
    [ 1 0 1 0 1 0 1 0 1 | 0 ]  →   [ 0 0 0 0 1 0 0 0 0 | 0 ]
    [ 0 0 1 0 1 1 0 0 1 | 1 ]      [ 0 0 0 0 0 1 0 0 0 | 0 ]
    [ 0 0 0 1 0 0 1 1 0 | 1 ]      [ 0 0 0 0 0 0 1 0 0 | 1 ]
    [ 0 0 0 0 1 0 1 1 1 | 1 ]      [ 0 0 0 0 0 0 0 1 0 | 0 ]
    [ 0 0 0 0 0 1 0 1 1 | 0 ]      [ 0 0 0 0 0 0 0 0 1 | 0 ]

So touching the third and seventh squares will turn all the squares black.
(b) In part (a), note that the row-reduced matrix consists of the identity matrix and a column vector
corresponding to the constants. Changing the constants cannot produce an inconsistent system,
since the matrix will still reduce to the identity matrix. So any configuration is reachable, since
we can solve x1 a1 + x2 a2 + · · · + x9 a9 = s for any s.
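The answer to part (a) can be checked without any row reduction at all: over Z2, just apply the effect of pressing squares 3 and 7 to the starting state. The matrix rows below are the ones reduced above.

```python
# Direct check of Exercise 35(a): over Z2, pressing squares 3 and 7
# changes the starting state s to the all-black state 0.
A = [[1, 1, 0, 1, 0, 0, 0, 0, 0],   # row i: which presses toggle square i
     [1, 1, 1, 0, 1, 0, 0, 0, 0],
     [0, 1, 1, 0, 0, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0, 1, 0, 0],
     [1, 0, 1, 0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0, 0, 1],
     [0, 0, 0, 1, 0, 0, 1, 1, 0],
     [0, 0, 0, 0, 1, 0, 1, 1, 1],
     [0, 0, 0, 0, 0, 1, 0, 1, 1]]
s = [0, 1, 1, 1, 0, 1, 1, 1, 0]     # starting state (1 = white)
x = [0, 0, 1, 0, 0, 0, 1, 0, 0]     # press squares 3 and 7
final = [(si + sum(a * b for a, b in zip(row, x))) % 2 for si, row in zip(s, A)]
print(final)
```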
36. In this version of the puzzle, the squares have three states, so we work over Z3 , with 2 corresponding
to black, 1 to gray, and 0 to white. Then we need to go from

    s = [2, 2, 0, 1, 1, 0, 2, 0, 1]^T   to   t = [2, 2, 2, 2, 2, 2, 2, 2, 2]^T.
Thus we want to solve

    x1 a1 + x2 a2 + · · · + x9 a9 = t − s = [0, 0, 2, 1, 1, 2, 0, 2, 1]^T.
Row-reducing the corresponding augmented matrix gives

    [ 1 1 0 1 0 0 0 0 0 | 0 ]      [ 1 0 0 0 0 0 0 0 0 | 0 ]
    [ 1 1 1 0 1 0 0 0 0 | 0 ]      [ 0 1 0 0 0 0 0 0 0 | 1 ]
    [ 0 1 1 0 0 1 0 0 0 | 2 ]      [ 0 0 1 0 0 0 0 0 0 | 0 ]
    [ 1 0 0 1 1 0 1 0 0 | 1 ]      [ 0 0 0 1 0 0 0 0 0 | 2 ]
    [ 1 0 1 0 1 0 1 0 1 | 1 ]  →   [ 0 0 0 0 1 0 0 0 0 | 2 ]
    [ 0 0 1 0 1 1 0 0 1 | 2 ]      [ 0 0 0 0 0 1 0 0 0 | 1 ]
    [ 0 0 0 1 0 0 1 1 0 | 0 ]      [ 0 0 0 0 0 0 1 0 0 | 0 ]
    [ 0 0 0 0 1 0 1 1 1 | 2 ]      [ 0 0 0 0 0 0 0 1 0 | 1 ]
    [ 0 0 0 0 0 1 0 1 1 | 1 ]      [ 0 0 0 0 0 0 0 0 1 | 2 ]
showing that a solution exists.
37. Let Grace and Hans’ ages be g and h. Then g = 3h and g + 5 = 2(h + 5), or g = 2(h + 5) − 5 = 2h + 5.
It follows that g = 3h = 2h + 5, so that h = 5 and g = 15. So Hans is 5 and Grace is 15.
38. Let the ages be a, b, and c. Then the first two conditions give us a + b + c = 60 and a − b = b − c.
For the third condition, Bert will be as old as Annie is now in a − b years; at that time, Annie will be
a + (a − b) = 2a − b years, and we are given that 2a − b = 3c. So we get the system

    a + b + c = 60        [ 1  1  1 | 60 ]      [ 1 0 0 | 28 ]
    a − 2b + c = 0   ⇒    [ 1 −2  1 |  0 ]  →   [ 0 1 0 | 20 ]
    2a − b − 3c = 0       [ 2 −1 −3 |  0 ]      [ 0 0 1 | 12 ]

Annie is 28, Bert is 20, and Chris is 12.
39. Let the areas of the fields be a and b. We have the two equations
    a + b = 1800
    (2/3)a + (1/2)b = 1100.

Multiply the second equation by 2 and subtract the first equation from the result to get (1/3)a = 400, so
that a = 1200 square yards and thus b = 1800 − a = 600 square yards.
40. Let x1 , x2 , and x3 be the number of bundles of the first, second, and third types of corn. Then from
the given conditions we get
3x1 + 2x2 + x3 = 39
2x1 + 3x2 + x3 = 34
x1 + 2x2 + 3x3 = 26.
Row-reduce the augmented matrix to get

    [ 3 2 1 | 39 ]      [ 1 0 0 | 37/4 ]
    [ 2 3 1 | 34 ]  →   [ 0 1 0 | 17/4 ]
    [ 1 2 3 | 26 ]      [ 0 0 1 | 11/4 ]

Therefore there are 9.25 measures of corn in a bundle of the first type, 4.25 measures of corn in a
bundle of the second type, and 2.75 measures of corn in a bundle of the third type.
41. (a) The addition table gives a + c = 2, a + d = 4, b + c = 3, b + d = 5. Construct and row-reduce the
corresponding augmented matrix:

    [ 1 0 1 0 | 2 ]      [ 1 0 0  1 |  4 ]
    [ 1 0 0 1 | 4 ]      [ 0 1 0  1 |  5 ]
    [ 0 1 1 0 | 3 ]  →   [ 0 0 1 −1 | −2 ]
    [ 0 1 0 1 | 5 ]      [ 0 0 0  0 |  0 ]
So d is a free variable. Solving for the other variables in terms of d gives a = 4 − d, b = 5 − d, and
c = d − 2. So there are an infinite number of solutions; for example, with d = 1, we get a = 3,
b = 4, and c = −1.
(b) We have a + c = 3, a + d = 4, b + c = 6, b + d = 5. Construct and row-reduce the corresponding
augmented matrix:

    [ 1 0 1 0 | 3 ]      [ 1 0 0  1 | 0 ]
    [ 1 0 0 1 | 4 ]      [ 0 1 0  1 | 0 ]
    [ 0 1 1 0 | 6 ]  →   [ 0 0 1 −1 | 0 ]
    [ 0 1 0 1 | 5 ]      [ 0 0 0  0 | 1 ]
So this is an inconsistent system, and there are no such values of a, b, c, and d.
42. The addition table gives a + c = w, a + d = y, b + c = x, b + d = z. Construct and row-reduce the
corresponding augmented matrix:

    [ 1 0 1 0 | w ]      [ 1 0 0  1 | y ]
    [ 1 0 0 1 | y ]      [ 0 1 0  1 | z ]
    [ 0 1 1 0 | x ]  →   [ 0 0 1 −1 | w − y ]
    [ 0 1 0 1 | z ]      [ 0 0 0  0 | w − x − y + z ]
In order for this to be a consistent system, we must have z − x − y + w = 0, or z − y = x − w. Another
way of phrasing this is x + y = z + w; in other words, the sum of the diagonal entries must equal the
sum of the off-diagonal entries.
43. (a) The addition table gives the following system of equations:
a+d=3
b+d=2
c+d=1
a+e=5
b+e=4
c+e=3
a+f =4
b+f =3
c + f = 1.
Construct and row-reduce the corresponding augmented matrix:

    [ 1 0 0 1 0 0 | 3 ]      [ 1 0 0 0 0  1 | 0 ]
    [ 1 0 0 0 1 0 | 5 ]      [ 0 1 0 0 0  1 | 0 ]
    [ 1 0 0 0 0 1 | 4 ]      [ 0 0 1 0 0  1 | 0 ]
    [ 0 1 0 1 0 0 | 2 ]      [ 0 0 0 1 0 −1 | 0 ]
    [ 0 1 0 0 1 0 | 4 ]  →   [ 0 0 0 0 1 −1 | 0 ]
    [ 0 1 0 0 0 1 | 3 ]      [ 0 0 0 0 0  0 | 1 ]
    [ 0 0 1 1 0 0 | 1 ]      [ 0 0 0 0 0  0 | 0 ]
    [ 0 0 1 0 1 0 | 3 ]      [ 0 0 0 0 0  0 | 0 ]
    [ 0 0 1 0 0 1 | 1 ]      [ 0 0 0 0 0  0 | 0 ]
The sixth row shows that this is an inconsistent system.
(b) The addition table gives the following system of equations:
a+d=1
b+d=2
c+d=3
a+e=3
b+e=4
c+e=5
a+f =4
b+f =5
c + f = 6.
Construct and row-reduce the corresponding augmented matrix:

    [ 1 0 0 1 0 0 | 1 ]      [ 1 0 0 0 0  1 |  4 ]
    [ 1 0 0 0 1 0 | 3 ]      [ 0 1 0 0 0  1 |  5 ]
    [ 1 0 0 0 0 1 | 4 ]      [ 0 0 1 0 0  1 |  6 ]
    [ 0 1 0 1 0 0 | 2 ]      [ 0 0 0 1 0 −1 | −3 ]
    [ 0 1 0 0 1 0 | 4 ]  →   [ 0 0 0 0 1 −1 | −1 ]
    [ 0 1 0 0 0 1 | 5 ]      [ 0 0 0 0 0  0 |  0 ]
    [ 0 0 1 1 0 0 | 3 ]      [ 0 0 0 0 0  0 |  0 ]
    [ 0 0 1 0 1 0 | 5 ]      [ 0 0 0 0 0  0 |  0 ]
    [ 0 0 1 0 0 1 | 6 ]      [ 0 0 0 0 0  0 |  0 ]
So f is a free variable, and there are an infinite number of solutions of the form a = 4 − f,
b = 5 − f, c = 6 − f, d = f − 3, e = f − 1.
44. Generalize the addition table to
    +   d   e   f
    a   m   p   s
    b   n   q   t
    c   o   r   u
The addition table gives the following system of equations:
a+d=m
b+d=n
c+d=o
a+e=p
b+e=q
c+e=r
a+f =s
b+f =t
c + f = u.
Construct and row-reduce the corresponding augmented matrix:

    [ 1 0 0 1 0 0 | m ]      [ 1 0 0 0 0  1 | s ]
    [ 1 0 0 0 1 0 | p ]      [ 0 1 0 0 0  1 | t ]
    [ 1 0 0 0 0 1 | s ]      [ 0 0 1 0 0  1 | u ]
    [ 0 1 0 1 0 0 | n ]      [ 0 0 0 1 0 −1 | m − s ]
    [ 0 1 0 0 1 0 | q ]  →   [ 0 0 0 0 1 −1 | p − s ]
    [ 0 1 0 0 0 1 | t ]      [ 0 0 0 0 0  0 | m − n − p + q ]
    [ 0 0 1 1 0 0 | o ]      [ 0 0 0 0 0  0 | n − o − q + r ]
    [ 0 0 1 0 1 0 | r ]      [ 0 0 0 0 0  0 | p − q − s + t ]
    [ 0 0 1 0 0 1 | u ]      [ 0 0 0 0 0  0 | q − r − t + u ]
For the table to be valid, the last entry in the zero rows must also be zero, so we need the conditions
m + q = n + p, n + r = o + q, p + t = q + s, and q + u = r + t. In terms of the addition table, this means
that for each 2 × 2 submatrix, the sum of the diagonal entries must equal the sum of the off-diagonal
entries (compare with Exercise 42).
45. (a) Since the three points (0, 1), (−1, 4), and (2, 1) must satisfy the equation y = ax2 + bx + c,
substitute those values to get the three equations c = 1, a − b + c = 4, and 4a + 2b + c = 1.
Construct and row-reduce the corresponding augmented matrix:

    [ 0  0 1 | 1 ]      [ 1 0 0 |  1 ]
    [ 1 −1 1 | 4 ]  →   [ 0 1 0 | −2 ]
    [ 4  2 1 | 1 ]      [ 0 0 1 |  1 ]

Thus a = 1, b = −2, and c = 1, so that the equation of the parabola is y = x2 − 2x + 1.
(b) Since the three points (−3, 1), (−2, 2), and (−1, 5) must satisfy the equation y = ax2 + bx + c,
substitute those values to get the three equations 9a − 3b + c = 1, 4a − 2b + c = 2, and a − b + c = 5.
Construct and row-reduce the corresponding augmented matrix:

    [ 9 −3 1 | 1 ]      [ 1 0 0 |  1 ]
    [ 4 −2 1 | 2 ]  →   [ 0 1 0 |  6 ]
    [ 1 −1 1 | 5 ]      [ 0 0 1 | 10 ]

Thus a = 1, b = 6, and c = 10, so that the equation of the parabola is y = x2 + 6x + 10.
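Both parts follow the same recipe: solve a 3x3 Vandermonde-style system for the coefficients. A sketch (the `fit_parabola` helper is ours) using exact fractions:

```python
# Fit y = a*x^2 + b*x + c through three points, as in Exercise 45.
from fractions import Fraction

def fit_parabola(points):
    # Build augmented rows [x^2, x, 1 | y] and run Gauss-Jordan.
    M = [[Fraction(x * x), Fraction(x), Fraction(1), Fraction(y)] for x, y in points]
    for c in range(3):
        piv = next(i for i in range(c, 3) if M[i][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for i in range(3):
            if i != c and M[i][c] != 0:
                M[i] = [u - M[i][c] * v for u, v in zip(M[i], M[c])]
    return [row[3] for row in M]

a, b, c = fit_parabola([(0, 1), (-1, 4), (2, 1)])
print(a, b, c)   # coefficients of y = x^2 - 2x + 1
```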
46. (a) Since the three points (0, 1), (−1, 4), and (2, 1) must satisfy the equation x2 + y 2 + ax + by + c = 0,
substitute those values to get the three equations
1 + b + c = 0
1 + 16 − a + 4b + c = 0
4 + 1 + 2a + b + c = 0
⇒
b + c = −1
a − 4b − c = 17
2a + b + c = −5 .
Construct and row-reduce the corresponding augmented matrix:

    [ 0  1  1 | −1 ]      [ 1 0 0 | −2 ]
    [ 1 −4 −1 | 17 ]  →   [ 0 1 0 | −6 ]
    [ 2  1  1 | −5 ]      [ 0 0 1 |  5 ]

Thus a = −2, b = −6, and c = 5, so that the equation of the circle is x2 + y2 − 2x − 6y + 5 = 0.
Completing the square gives (x − 1)2 + (y − 3)2 + 5 − 1 − 9 = 0, so that (x − 1)2 + (y − 3)2 = 5.
This is a circle whose center is (1, 3), with radius √5. A sketch of the circle follows part (b).
(b) Since the three points (−3, 1), (−2, 2), and (−1, 5) must satisfy the equation x2 +y 2 +ax+by+c = 0,
substitute those values to get the three equations
9 + 1 − 3a + b + c = 0
4 + 4 − 2a + 2b + c = 0
1 + 25 − a + 5b + c = 0
⇒
3a − b − c = 10
2a − 2b − c = 8
a − 5b − c = 26 .
Construct and row-reduce the corresponding augmented matrix:

    [ 3 −1 −1 | 10 ]      [ 1 0 0 |  12 ]
    [ 2 −2 −1 |  8 ]  →   [ 0 1 0 | −10 ]
    [ 1 −5 −1 | 26 ]      [ 0 0 1 |  36 ]

Thus a = 12, b = −10, and c = 36, so that the equation of the circle is x2 + y2 + 12x − 10y + 36 = 0.
Completing the square gives (x + 6)2 + (y − 5)2 + 36 − 36 − 25 = 0, so that (x + 6)2 + (y − 5)2 = 25.
This is a circle whose center is (−6, 5), with radius 5. Sketches of both circles are:
[Sketches of the two circles: the first centered at (1, 3) through the points (0, 1), (−1, 4), and (2, 1);
the second centered at (−6, 5) through the points (−3, 1), (−2, 2), and (−1, 5).]
47. We have (3x + 1)/(x2 + 2x − 3) = (3x + 1)/((x − 1)(x + 3)) = A/(x − 1) + B/(x + 3), so that clearing fractions, we get
(x + 3)A + (x − 1)B = 3x + 1. Collecting terms gives (A + B)x + (3A − B) = 3x + 1. This gives the linear system A + B = 3,
3A − B = 1. Construct and row-reduce the corresponding augmented matrix:

    [ 1  1 | 3 ]      [ 1 0 | 1 ]
    [ 3 −1 | 1 ]  →   [ 0 1 | 2 ]

Thus A = 1 and B = 2, so that (3x + 1)/(x2 + 2x − 3) = 1/(x − 1) + 2/(x + 3).
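Partial-fraction identities like this one are easy to spot-check by evaluating both sides at a few sample points away from the poles:

```python
# Spot-check of the decomposition in Exercise 47:
# (3x + 1)/(x^2 + 2x - 3) = 1/(x - 1) + 2/(x + 3), with exact fractions.
from fractions import Fraction

def lhs(x):
    return Fraction(3 * x + 1, x * x + 2 * x - 3)

def rhs(x):
    return Fraction(1, x - 1) + Fraction(2, x + 3)

# sample points avoiding the poles x = 1 and x = -3
for x in [0, 2, 5, -2, 10]:
    assert lhs(x) == rhs(x)
print("decomposition verified")
```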
48. We have

    (x2 − 3x + 3)/(x3 + 2x2 + x) = (x2 − 3x + 3)/(x(x + 1)2) = A/x + B/(x + 1) + C/(x + 1)2
    ⇒ x2 − 3x + 3 = A(x + 1)2 + Bx(x + 1) + Cx
    ⇒ x2 − 3x + 3 = (A + B)x2 + (2A + B + C)x + A
    ⇒ A + B = 1,  2A + B + C = −3,  A = 3.

This is most easily solved by substitution: A = 3, so that A + B = 1 gives B = −2. Substituting into
the remaining equation gives 2 · 3 − 2 + C = −3, so that C = −7. Thus

    (x2 − 3x + 3)/(x3 + 2x2 + x) = 3/x − 2/(x + 1) − 7/(x + 1)2.
49. We have

    (x − 1)/((x + 1)(x2 + 1)(x2 + 4)) = A/(x + 1) + (Bx + C)/(x2 + 1) + (Dx + E)/(x2 + 4)
    ⇒ x − 1 = A(x2 + 1)(x2 + 4) + (Bx + C)(x + 1)(x2 + 4) + (Dx + E)(x + 1)(x2 + 1).

Expand the right-hand side and collect terms to get

    x − 1 = (A + B + D)x4 + (B + C + D + E)x3 + (5A + 4B + C + D + E)x2
            + (4B + 4C + D + E)x + (4A + 4C + E).
This gives the linear system

    A + B + D = 0
    B + C + D + E = 0
    5A + 4B + C + D + E = 0
    4B + 4C + D + E = 1
    4A + 4C + E = −1.

Construct and row-reduce the corresponding augmented matrix:

    [ 1 1 0 1 0 |  0 ]      [ 1 0 0 0 0 |  −1/5 ]
    [ 0 1 1 1 1 |  0 ]      [ 0 1 0 0 0 |   1/3 ]
    [ 5 4 1 1 1 |  0 ]  →   [ 0 0 1 0 0 |     0 ]
    [ 0 4 4 1 1 |  1 ]      [ 0 0 0 1 0 | −2/15 ]
    [ 4 0 4 0 1 | −1 ]      [ 0 0 0 0 1 |  −1/5 ]

Thus A = −1/5, B = 1/3, C = 0, D = −2/15, and E = −1/5, giving

    (x − 1)/((x + 1)(x2 + 1)(x2 + 4)) = −1/(5(x + 1)) + x/(3(x2 + 1)) + (−(2/15)x − 1/5)/(x2 + 4)
                                      = −1/(5(x + 1)) + x/(3(x2 + 1)) − (2x + 3)/(15(x2 + 4)).
50. We have

    (x3 + x + 1)/(x(x − 1)(x2 + x + 1)(x2 + 1)3)
        = A/x + B/(x − 1) + (Cx + D)/(x2 + x + 1) + (Ex + F)/(x2 + 1) + (Gx + H)/(x2 + 1)2 + (Ix + J)/(x2 + 1)3.
Multiply both sides of the equation by x(x − 1)(x2 + x + 1)(x2 + 1)3 , simplify, and equate coefficients
of xn to get the system
    A + B + C + E = 0
    B − C + D + F = 0
    3A + 4B + 3C − D + 2E + G = 0
    −A + 3B − 3C + 3D − E + 2F + H = 0
    3A + 6B + 3C − 3D + E − F + G + I = 0
    −3A + 3B − 3C + 3D − 2E + F − G + H + J = 0
    A + 4B + C − 3D − 2F − H = 1
    −3A + B − C + D − E − G − I = 0
    B − D − F − H − J = 1
    −A = 1.

Construct and row-reduce the corresponding augmented matrix:

    [  1 1  1  0  1  0  0  0  0  0 | 0 ]
    [  0 1 −1  1  0  1  0  0  0  0 | 0 ]
    [  3 4  3 −1  2  0  1  0  0  0 | 0 ]
    [ −1 3 −3  3 −1  2  0  1  0  0 | 0 ]
    [  3 6  3 −3  1 −1  1  0  1  0 | 0 ]
    [ −3 3 −3  3 −2  1 −1  1  0  1 | 0 ]
    [  1 4  1 −3  0 −2  0 −1  0  0 | 1 ]
    [ −3 1 −1  1 −1  0 −1  0 −1  0 | 0 ]
    [  0 1  0 −1  0 −1  0 −1  0 −1 | 1 ]
    [ −1 0  0  0  0  0  0  0  0  0 | 1 ]

Gauss-Jordan elimination produces the identity matrix together with the solution column

    A = −1,  B = 1/8,  C = −1,  D = 0,  E = 15/8,  F = −9/8,  G = 7/4,  H = −1/4,  I = 1/2,  J = 1/2.
This gives the partial fraction decomposition

    (x3 + x + 1)/(x(x − 1)(x2 + x + 1)(x2 + 1)3)
        = −1/x + 1/(8(x − 1)) − x/(x2 + x + 1) + ((15/8)x − 9/8)/(x2 + 1) + ((7/4)x − 1/4)/(x2 + 1)2 + ((1/2)x + 1/2)/(x2 + 1)3
        = −1/x + 1/(8(x − 1)) − x/(x2 + x + 1) + (15x − 9)/(8(x2 + 1)) + (7x − 1)/(4(x2 + 1)2) + (x + 1)/(2(x2 + 1)3).
51. Suppose that 1 + 2 + · · · + n = an2 + bn + c, and let n = 0, 1, and 2 to get the system
c=0
a+ b+c=1
4a + 2b + c = 3.
Construct and row-reduce the associated augmented matrix:

    [ 0 0 1 | 0 ]      [ 1 0 0 | 1/2 ]
    [ 1 1 1 | 1 ]  →   [ 0 1 0 | 1/2 ]
    [ 4 2 1 | 3 ]      [ 0 0 1 |  0  ]

Thus a = b = 1/2 and c = 0, so that

    1 + 2 + · · · + n = (1/2)n2 + (1/2)n = n(n + 1)/2.
52. Suppose that 12 + 22 + · · · + n2 = an3 + bn2 + cn + d, and let n = 0, 1, 2, and 3 to get the system
d= 0
a+ b+ c+d= 1
8a + 4b + 2c + d = 5
27a + 9b + 3c + d = 14.
Construct and row-reduce the associated augmented matrix:

    [  0 0 0 1 |  0 ]      [ 1 0 0 0 | 1/3 ]
    [  1 1 1 1 |  1 ]  →   [ 0 1 0 0 | 1/2 ]
    [  8 4 2 1 |  5 ]      [ 0 0 1 0 | 1/6 ]
    [ 27 9 3 1 | 14 ]      [ 0 0 0 1 |  0  ]

Thus a = 1/3, b = 1/2, c = 1/6, and d = 0, so that

    12 + 22 + · · · + n2 = (1/3)n3 + (1/2)n2 + (1/6)n = (1/6)(2n3 + 3n2 + n) = n(n + 1)(2n + 1)/6.
53. Suppose that 13 + 23 + · · · + n3 = an4 + bn3 + cn2 + dn + e, and let n = 0, 1, 2, 3, and 4 to get the
system
    e = 0
    a + b + c + d + e = 1
    16a + 8b + 4c + 2d + e = 9
    81a + 27b + 9c + 3d + e = 36
    256a + 64b + 16c + 4d + e = 100.
Construct and row-reduce the associated augmented matrix:

    [   0  0  0 0 1 |   0 ]      [ 1 0 0 0 0 | 1/4 ]
    [   1  1  1 1 1 |   1 ]      [ 0 1 0 0 0 | 1/2 ]
    [  16  8  4 2 1 |   9 ]  →   [ 0 0 1 0 0 | 1/4 ]
    [  81 27  9 3 1 |  36 ]      [ 0 0 0 1 0 |  0  ]
    [ 256 64 16 4 1 | 100 ]      [ 0 0 0 0 1 |  0  ]

Thus a = 1/4, b = 1/2, c = 1/4, and d = e = 0, so that

    13 + 23 + · · · + n3 = (1/4)n4 + (1/2)n3 + (1/4)n2 = (1/4)n2(n2 + 2n + 1) = (1/4)n2(n + 1)2 = (n(n + 1)/2)2.
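The three closed forms derived in Exercises 51-53 can be confirmed against brute-force sums:

```python
# Check the interpolated formulas of Exercises 51-53 for small n.
def sum1(n): return n * (n + 1) // 2
def sum2(n): return n * (n + 1) * (2 * n + 1) // 6
def sum3(n): return (n * (n + 1) // 2) ** 2

for n in range(1, 20):
    assert sum1(n) == sum(range(1, n + 1))
    assert sum2(n) == sum(k * k for k in range(1, n + 1))
    assert sum3(n) == sum(k ** 3 for k in range(1, n + 1))
print("all three closed forms verified")
```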
2.5    Iterative Methods for Solving Linear Systems
1. Start by solving the first equation for x1 and the second for x2:

    x1 = 6/7 + (1/7)x2
    x2 = 4/5 + (1/5)x1.

Using the initial vector [x1, x2] = [0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6
    x1   0   0.857  0.971  0.996  0.999  1.000  1.000
    x2   0   0.800  0.971  0.994  0.999  1.000  1.000
Solving the first equation for x2 gives x2 = 7x1 − 6; substituting into the second equation gives
x1 − 5(7x1 − 6) = −4
so that
− 34x1 = −34
⇒
x1 = 1.
Substituting this value back into either equation gives x2 = 1, so the exact solution is [x1 , x2 ] = [1, 1].
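The iteration just tabulated is a two-line program; this sketch runs the same Jacobi update from [0, 0] and converges to the exact solution [1, 1]:

```python
# Jacobi iteration for the system of Exercise 1.
def jacobi_step(x):
    x1, x2 = x
    # each new component uses only the OLD values
    return (6/7 + x2/7, 4/5 + x1/5)

x = (0.0, 0.0)
for _ in range(20):
    x = jacobi_step(x)
print(x)   # close to the fixed point (1, 1)
```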
2. Solving for x1 and x2 gives

    x1 = 5/2 − (1/2)x2
    x2 = −1 + x1.

Using the initial vector [x1, x2] = [0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6      7      8      9      10     11     12
    x1   0   2.5    3.0    1.75   1.5    2.125  2.25   1.938  1.875  2.031  2.062  1.984  1.969
    x2   0   −1.0   1.5    2.0    0.75   0.5    1.125  1.25   0.938  0.875  1.031  1.062  0.984

    n    13     14     15     16     17     18     19     20     21     22     23     24     25
    x1   2.008  2.016  1.996  1.992  2.002  2.004  1.999  1.998  2.000  2.001  2.000  2.000  2.000
    x2   0.969  1.008  1.016  0.996  0.992  1.002  1.004  0.999  0.998  1.000  1.001  1.000  1.000

Using x2 = x1 − 1 in the first equation gives x1 = 5/2 − (1/2)(x1 − 1), so that x1 = 2 and x2 = 1. So the
exact solution is [x1, x2] = [2, 1].
3. Solving for x1 and x2 gives

    x1 = (1/9)x2 + 2/9
    x2 = (2/7)x1 + 2/7.

Using the initial vector [x1, x2] = [0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6
    x1   0   0.222  0.254  0.261  0.262  0.262  0.262
    x2   0   0.286  0.349  0.358  0.360  0.361  0.361

Solving the first equation for x2 gives x2 = 9x1 − 2. Substituting this into the second equation gives
x1 − 3.5(9x1 − 2) = −1, so that −30.5x1 = −8 and thus x1 = 8/30.5 ≈ 0.262295. This gives x2 = 9 · (8/30.5) − 2 ≈
0.360656.
4. Solving for x1, x2, and x3 gives

    x1 = −(1/20)x2 + (1/20)x3 + 17/20
    x2 = (1/10)x1 + (1/10)x3 − 13/10
    x3 = (1/10)x1 − (1/10)x2 + 9/5.

Using the initial vector [x1, x2, x3] = [0, 0, 0], we get a sequence of approximations:

    n    0   1      2       3       4       5       6
    x1   0   0.85   1.005   1.002   1.000   1.000   1.000
    x2   0   −1.3   −1.035  −0.998  −0.999  −1.000  −1.000
    x3   0   1.8    2.015   2.004   2.000   2.000   2.000

To solve the system directly, row-reduce the augmented matrix:

    [ 20   1  −1 | 17 ]      [ 1 0 0 |  1 ]
    [  1 −10   1 | 13 ]  →   [ 0 1 0 | −1 ]
    [ −1   1  10 | 18 ]      [ 0 0 1 |  2 ]

giving the exact solution [x1, x2, x3] = [1, −1, 2].
5. Solve to get

    x1 = −(1/3)x2 + 1/3
    x2 = −(1/4)x1 − (1/4)x3 + 1/4
    x3 = −(1/3)x2 + 1/3.

Using the initial vector [x1, x2, x3] = [0, 0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6      7      8      9
    x1   0   0.333  0.250  0.306  0.292  0.301  0.299  0.300  0.300  0.300
    x2   0   0.250  0.083  0.125  0.097  0.104  0.100  0.101  0.100  0.100
    x3   0   0.333  0.250  0.306  0.292  0.301  0.299  0.300  0.300  0.300

To solve the system directly, row-reduce the augmented matrix:

    [ 3 1 0 | 1 ]      [ 1 0 0 | 3/10 ]
    [ 1 4 1 | 1 ]  →   [ 0 1 0 | 1/10 ]
    [ 0 1 3 | 1 ]      [ 0 0 1 | 3/10 ]

giving the exact solution [x1, x2, x3] = [3/10, 1/10, 3/10].
6. Solve to get

    x1 = (1/3)x2 + 1/3
    x2 = (1/3)x1 + (1/3)x3
    x3 = (1/3)x2 + (1/3)x4 + 1/3
    x4 = (1/3)x3 + 1/3.

Using the initial vector [x1, x2, x3, x4] = [0, 0, 0, 0], we get a sequence of approximations:

    n    0   1      2      3      4      5      6      7      8      9      10     11     12
    x1   0   0.333  0.333  0.407  0.420  0.440  0.444  0.450  0.452  0.453  0.454  0.454  0.454
    x2   0   0      0.222  0.259  0.321  0.333  0.351  0.355  0.360  0.361  0.363  0.363  0.363
    x3   0   0.333  0.444  0.556  0.580  0.613  0.620  0.630  0.632  0.634  0.635  0.636  0.636
    x4   0   0.333  0.444  0.481  0.519  0.527  0.538  0.540  0.543  0.544  0.545  0.545  0.545

To solve the system directly, row-reduce the augmented matrix:

    [  3 −1  0  0 | 1 ]      [ 1 0 0 0 | 5/11 ]
    [ −1  3 −1  0 | 0 ]      [ 0 1 0 0 | 4/11 ]
    [  0 −1  3 −1 | 1 ]  →   [ 0 0 1 0 | 7/11 ]
    [  0  0 −1  3 | 1 ]      [ 0 0 0 1 | 6/11 ]

giving the exact solution [x1, x2, x3, x4] = [5/11, 4/11, 7/11, 6/11].

7. Using the equations from Exercise 1, we get (to three decimal places, but keeping four during the
calculations)
    n    0   1      2      3      4
    x1   0   0.857  0.996  1.000  1.000
    x2   0   0.971  0.999  1.000  1.000
The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5.
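The difference from Jacobi is only that each freshly computed component is used immediately; a sketch of the Gauss-Seidel update for this same system:

```python
# Gauss-Seidel iteration for the system of Exercises 1 and 7.
def gauss_seidel(x1, x2, iters):
    history = [(x1, x2)]
    for _ in range(iters):
        x1 = 6/7 + x2/7      # uses the old x2
        x2 = 4/5 + x1/5      # uses the freshly updated x1
        history.append((round(x1, 3), round(x2, 3)))
    return history

hist = gauss_seidel(0.0, 0.0, 4)
print(hist)   # matches the table above
```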
8. Using the equations from Exercise 2, we get (to three decimal places, but keeping four during the
calculations)
    n    0   1     2      3      4      5      6      7      8      9      10     11     12
    x1   0   2.5   1.75   2.125  1.938  2.031  1.984  2.008  1.996  2.002  1.999  2.000  2.000
    x2   0   1.5   0.75   1.125  0.938  1.031  0.984  1.008  0.996  1.002  0.999  1.000  1.000
The Gauss-Seidel method takes 11 iterations, while the Jacobi method takes 24.
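Gauss-Seidel differs from Jacobi only in that each newly computed component is used immediately in the updates that follow it. A short sketch (my own illustration; the update formulas are the ones implied by the table above, i.e. x1 = (5 - x2)/2 and x2 = x1 - 1):

```python
def gauss_seidel_step(x1, x2):
    """One Gauss-Seidel sweep: the new x1 is used at once when updating x2."""
    x1 = (5 - x2) / 2   # solve the first equation for x1
    x2 = x1 - 1         # solve the second equation for x2, using the NEW x1
    return x1, x2

x1 = x2 = 0.0
for n in range(11):
    x1, x2 = gauss_seidel_step(x1, x2)
# after 11 sweeps the iterates agree with the exact solution [2, 1] to three decimals
```

Reusing fresh values is what roughly halves the iteration count compared with Jacobi here.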
9. Using the equations from Exercise 3, we get (to three decimal places, but keeping four during the
calculations)
   n    0   1      2      3      4
   x1   0   0.222  0.261  0.262  0.262
   x2   0   0.349  0.360  0.361  0.361
The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5.
10. Using the equations from Exercise 4, we get (to three decimal places, but keeping four during the
calculations)
   n    0   1       2       3       4
   x1   0   0.850   1.011   1.000   1.000
   x2   0   -1.215  -0.998  -1.000  -1.000
   x3   0   2.006   2.001   2.000   2.000
The Gauss-Seidel method takes 3 iterations, while the Jacobi method takes 5.
11. Using the equations from Exercise 5, we get (to three decimal places, but keeping four during the
calculations)
   n    0   1      2      3      4      5      6
   x1   0   0.333  0.278  0.296  0.299  0.300  0.300
   x2   0   0.167  0.111  0.102  0.100  0.100  0.100
   x3   0   0.278  0.296  0.299  0.300  0.300  0.300
The Gauss-Seidel method takes 5 iterations, while the Jacobi method takes 8.
12. Using the equations from Exercise 6, we get (to three decimal places, but keeping four during the
calculations)
   n    0   1      2      3      4      5      6      7
   x1   0   0.333  0.370  0.416  0.433  0.451  0.454  0.454
   x2   0   0.111  0.247  0.328  0.353  0.361  0.363  0.363
   x3   0   0.370  0.568  0.617  0.631  0.635  0.636  0.636
   x4   0   0.457  0.523  0.539  0.544  0.545  0.545  0.545
The Gauss-Seidel method takes 6 iterations, while the Jacobi method takes 11.
13. [Figure: graph of the iterates for this exercise, on axes from 0 to 1; not reproduced here.]

14. [Figure: graph of the iterates for this exercise, on axes from 0.5 to 2.5; not reproduced here.]
15. Applying the Gauss-Seidel method to x1 - 2x2 = 3, 3x1 + 2x2 = 1 with an initial estimate of [0, 0] gives

   n    0   1    2    3    4
   x1   0   3    -5   19   -53
   x2   0   -4   8    -28  80

which evidently diverges. If, however, we reverse the two equations to get 3x1 + 2x2 = 1, x1 - 2x2 = 3, the system is strictly diagonally dominant since |3| > |2| and |-2| > |1|. Applying the Gauss-Seidel method now gives

   n    0   1       2       3       4       5       6       7       8       9
   x1   0   0.333   1.222   0.926   1.025   0.992   1.003   0.999   1.000   1.000
   x2   0   -1.333  -0.889  -1.037  -0.988  -1.004  -0.999  -1.000  -1.000  -1.000

The exact solution is [x1, x2] = [1, -1].
16. Applying the Gauss-Seidel method to x1 - 4x2 + 2x3 = 2, 2x2 + 4x3 = 1, 6x1 - x2 - 2x3 = 1 with an initial estimate of [0, 0, 0] gives

   n    0   1     2     3       4
   x1   0   2     -6.5  -8      203.5
   x2   0   0.5   -10   30.5    80
   x3   0   5.25  -15   -39.75  570

which evidently diverges. If, however, we arrange the equations as

   6x1 -  x2 - 2x3 = 1
    x1 - 4x2 + 2x3 = 2
         2x2 + 4x3 = 1,

we get a strictly diagonally dominant system. Applying the Gauss-Seidel method gives

   n    0   1       2       3       4       5       6       7
   x1   0   0.167   0.250   0.250   0.250   0.250   0.250   0.250
   x2   0   -0.458  -0.198  -0.263  -0.247  -0.251  -0.250  -0.250
   x3   0   0.479   0.349   0.382   0.373   0.375   0.375   0.375

The exact solution is [x1, x2, x3] = [1/4, -1/4, 3/8].

17. [Figure: graph of the iterates for this exercise; not reproduced here.]
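The effect of reordering for strict diagonal dominance can be demonstrated directly. This sketch (my own illustration in Python) runs a generic Gauss-Seidel sweep on both orderings of the Exercise 15 system:

```python
def gs_sweep(eqs, x):
    """One Gauss-Seidel sweep. eqs is a list of (coefficient_row, rhs);
    equation i is solved for x[i], using the newest values immediately."""
    for i, (row, b) in enumerate(eqs):
        s = sum(row[j] * x[j] for j in range(len(x)) if j != i)
        x[i] = (b - s) / row[i]
    return x

# Exercise 15: x1 - 2x2 = 3 and 3x1 + 2x2 = 1
bad  = [([1, -2], 3), ([3, 2], 1)]   # not diagonally dominant
good = [([3, 2], 1), ([1, -2], 3)]   # reordered: strictly diagonally dominant

x = [0.0, 0.0]
for n in range(4):
    x = gs_sweep(bad, x)
print(x)   # the iterates blow up, matching the divergent table above

x = [0.0, 0.0]
for n in range(9):
    x = gs_sweep(good, x)
print(x)   # close to the exact solution [1, -1]
```

Same equations, same method; only the order in which the equations are assigned to the unknowns changes.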
18. Applying the Gauss-Seidel method to the equations

   -4x1 + 5x2 = 14
     x1 - 3x2 = -7

gives

   n    0   1     2      3      4      5      6      7      8
   x1   0   -3.5  -2.04  -1.43  -1.18  -1.08  -1.03  -1.01  -1.01
   x2   0   1.17  1.65   1.86   1.94   1.97   1.99   2.00   2.00

This gives an approximate solution (to 0.01) of x1 = -1.01, x2 = 2.00. The exact solution is [x1, x2] = [-1, 2].
19. Applying the Gauss-Seidel method to the equations

    5x1 - 2x2 + 3x3 = -8
     x1 + 4x2 - 4x3 = 102
   -2x1 - 2x2 + 4x3 = -90

gives

   n    0   1       2      3       4       5       6       7       8       9       10      11      12      13      14
   x1   0   -1.6    14.97  8.55    10.74   9.84    10.12   9.99    10.02   10.00   10.00   10.00   10.00   10.00   10.00
   x2   0   25.9    11.41  14.05   11.62   11.72   11.25   11.19   11.08   11.05   11.03   11.01   11.01   11.00   11.00
   x3   0   -10.35  -9.31  -11.2   -11.32  -11.72  -11.82  -11.90  -11.95  -11.97  -11.98  -11.99  -12.00  -12.00  -12.00

This gives an approximate solution (to 0.01) of x1 = 10.00, x2 = 11.00, x3 = -12.00. The exact solution is [x1, x2, x3] = [10, 11, -12].
20. Redoing the computation from Exercise 18, showing three decimal places, and computing to an accuracy of 0.001, gives

   n    0   1      2       3       4       5       6       7       8       9       10      11      12
   x1   0   -3.5   -2.042  -1.434  -1.181  -1.075  -1.031  -1.013  -1.005  -1.002  -1.001  -1.000  -1.000
   x2   0   1.167  1.653   1.855   1.940   1.975   1.990   1.996   1.998   1.999   2.000   2.000   2.000

This gives an approximate solution (to 0.001) of x1 = -1.000, x2 = 2.000. The exact solution is [x1, x2] = [-1, 2].
21. Redoing the computation from Exercise 19, showing three decimal places, and computing to an accuracy of 0.001, gives

   n    0   1       2       3        4        5        6        7        8        9
   x1   0   -1.6    14.97   8.55     10.74    9.839    10.120   9.989    10.022   10.002
   x2   0   25.9    11.408  14.051   11.615   11.718   11.249   11.187   11.082   11.052
   x3   0   -10.35  -9.311  -11.199  -11.322  -11.721  -11.816  -11.912  -11.948  -11.973

   n    10       11       12       13       14       15       16       17       18
   x1   10.005   10.001   10.001   10.000   10.000   10.000   10.000   10.000   10.000
   x2   11.026   11.015   11.008   11.005   11.002   11.001   11.001   11.000   11.000
   x3   -11.985  -11.992  -11.996  -11.998  -11.999  -11.999  -12.000  -12.000  -12.000

This gives an approximate solution (to 0.001) of x1 = 10.000, x2 = 11.000, x3 = -12.000. The exact solution is [x1, x2, x3] = [10, 11, -12].
22. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1 = (1/4)(0 + 40 + 40 + t2) = 20 + (1/4)t2
   t2 = (1/4)(0 + t1 + t3 + 5) = 5/4 + (1/4)t1 + (1/4)t3
   t3 = (1/4)(t2 + 40 + 40 + 5) = 85/4 + (1/4)t2.

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0]:

   n    0   1       2       3       4       5       6       7       8
   t1   0   20.000  21.562  23.086  23.276  23.300  23.303  23.304  23.304
   t2   0   6.250   12.344  13.105  13.201  13.213  13.214  13.214  13.214
   t3   0   22.812  24.336  24.526  24.550  24.553  24.554  24.554  24.554

This gives equilibrium temperatures (to 0.001) of t1 = 23.304, t2 = 13.214, and t3 = 24.554.
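The averaging system can be iterated directly; a minimal Python sketch (my own illustration) of the three Gauss-Seidel updates from this exercise:

```python
t1 = t2 = t3 = 0.0
for n in range(8):
    # each interior temperature is the average of its four neighbours
    t1 = (0 + 40 + 40 + t2) / 4
    t2 = (0 + t1 + t3 + 5) / 4      # uses the new t1 immediately
    t3 = (t2 + 40 + 40 + 5) / 4     # uses the new t2 immediately
# t1, t2, t3 are now 23.304, 13.214, 24.554 to three decimals
```

Eight sweeps reproduce the table above.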
23. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1 = (1/4)(t2 + t3)
   t2 = (1/4)(t1 + t4)
   t3 = (1/4)(t1 + t4 + 200)
   t4 = (1/4)(t2 + t3 + 200).

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0, 0]:

   n    0   1     2       3       4       5       6       7       8       9       10
   t1   0   0     12.5    21.875  24.219  24.805  24.951  24.988  24.997  24.999  25.000
   t2   0   0     18.75   23.438  24.609  24.902  24.976  24.994  24.998  25.000  25.000
   t3   0   50    68.75   73.438  74.609  74.902  74.976  74.994  74.998  75.000  75.000
   t4   0   62.5  71.875  74.219  74.805  74.951  74.988  74.997  74.998  75.000  75.000

This gives equilibrium temperatures (to 0.001) of t1 = t2 = 25 and t3 = t4 = 75.
24. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1 = (1/4)(t2 + t3)
   t2 = (1/4)(t1 + t4 + 40)
   t3 = (1/4)(t1 + t4 + 80)
   t4 = (1/4)(t2 + t3 + 200).

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0, 0]:

   n    0   1     2       3       4       5       6       7       8       9       10
   t1   0   0     7.5     15.625  17.656  18.164  18.291  18.323  18.331  18.333  18.333
   t2   0   10    26.25   30.312  31.328  31.582  31.646  31.661  31.665  31.666  31.667
   t3   0   20    36.25   40.312  41.328  41.582  41.646  41.661  41.665  41.666  41.667
   t4   0   57.5  65.625  67.656  68.164  68.291  68.323  68.331  68.333  68.333  68.333

This gives equilibrium temperatures (to 0.001) of t1 = 18.333, t2 = 31.667, t3 = 41.667, and t4 = 68.333.
25. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1 = (1/4)(t2 + 80)
   t2 = (1/4)(t1 + t3 + t4)
   t3 = (1/4)(t2 + t5 + 80)
   t4 = (1/4)(t2 + t5 + 5)
   t5 = (1/4)(t3 + t4 + t6 + 5)
   t6 = (1/4)(t5 + 85).

The Gauss-Seidel method gives the following with initial approximation [0, 0, 0, 0, 0, 0]:

   n    0   1       2       3       4       5       6
   t1   0   20      21.25   22.812  23.33   23.66   23.773
   t2   0   5       11.25   13.32   14.639  15.093  15.237
   t3   0   21.25   24.609  26.987  27.730  27.963  28.035
   t4   0   2.5     5.859   8.237   8.980   9.213   9.285
   t5   0   7.188   14.629  16.283  16.758  16.904  16.949
   t6   0   23.047  24.097  25.321  25.439  25.476  25.487

   n    7       8       9       10      11      12
   t1   23.809  23.821  23.824  23.825  23.826  23.826
   t2   15.282  15.297  15.301  15.302  15.303  15.303
   t3   28.058  28.065  28.067  28.068  28.068  28.068
   t4   9.308   9.315   9.317   9.318   9.318   9.318
   t5   16.963  16.968  16.969  16.970  16.970  16.970
   t6   25.491  25.492  25.492  25.492  25.492  25.492

This gives equilibrium temperatures (to 0.001) of t1 = 23.826, t2 = 15.303, t3 = 28.068, t4 = 9.318, t5 = 16.970, and t6 = 25.492.
26. With the interior points labeled as shown, we use the temperature-averaging property to get the following system:

   t1  = (1/4)(t2 + t5)
   t2  = (1/4)(t1 + t3 + t6)
   t3  = (1/4)(t2 + t4 + t7 + 20)
   t4  = (1/4)(t3 + t8 + 40)
   t5  = (1/4)(t1 + t6 + t9)
   t6  = (1/4)(t2 + t5 + t7 + t10)
   t7  = (1/4)(t3 + t6 + t8 + t11)
   t8  = (1/4)(t4 + t7 + t12 + 20)
   t9  = (1/4)(t5 + t10 + t13 + 40)
   t10 = (1/4)(t6 + t9 + t11 + t14)
   t11 = (1/4)(t7 + t10 + t12 + t15)
   t12 = (1/4)(t8 + t11 + t16 + 100)
   t13 = (1/4)(t9 + t14 + 80)
   t14 = (1/4)(t10 + t13 + t15 + 40)
   t15 = (1/4)(t11 + t14 + t16 + 100)
   t16 = (1/4)(t12 + t15 + 200).

Starting from the zero vector, the Gauss-Seidel method gives the iterates below (each row lists t1 through t16 for one iteration):

   n=0:  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
   n=1:  0, 0, 5, 11.25, 0, 0, 1.25, 8.125, 10, 2.5, 0.938, 27.266, 22.5, 16.25, 29.297, 64.141
   n=2:  0, 1.25, 8.438, 14.141, 2.5, 1.875, 4.844, 16.562, 16.875, 8.984, 17.598, 49.575, 28.281, 26.641, 52.095, 75.417
   n=3:  0.938, 2.812, 10.449, 16.753, 4.922, 5.391, 12.5, 24.707, 20.547, 17.544, 32.928, 58.263, 31.797, 35.359, 60.926, 79.797
   n=4:  1.934, 4.443, 13.424, 19.533, 6.968, 10.364, 20.356, 29.538, 24.077, 25.682, 41.307, 62.661, 34.859, 40.367, 65.368, 82.007
   n=5:  2.853, 6.66, 16.637, 21.544, 9.323, 15.505, 25.747, 32.488, 27.466, 31.161, 46.234, 65.182, 36.958, 43.372, 67.903, 83.271
   n=6:  3.996, 9.035, 19.081, 22.892, 11.742, 19.421, 29.306, 34.345, 29.965, 34.748, 49.285, 66.725, 38.334, 45.246, 69.451, 84.044
   n=7:  5.194, 10.924, 20.781, 23.781, 13.645, 22.156, 31.642, 35.537, 31.682, 37.092, 51.227, 67.702, 39.232, 46.444, 70.429, 84.533
   n=8:  6.142, 12.27, 21.923, 24.365, 14.995, 24.0, 33.172, 36.31, 32.83, 38.625, 52.482, 68.331, 39.818, 47.218, 71.058, 84.847
   n=9:  6.816, 13.185, 22.68, 24.748, 15.911, 25.223, 34.174, 36.813, 33.589, 39.628, 53.298, 68.74, 40.202, 47.722, 71.467, 85.052
   n=10: 7.274, 13.794, 23.179, 24.998, 16.522, 26.029, 34.83, 37.142, 34.088, 40.284, 53.83, 69.006, 40.452, 48.051, 71.733, 85.185
   n=11: 7.579, 14.197, 23.506, 25.162, 16.924, 26.559, 35.259, 37.357, 34.415, 40.714, 54.178, 69.18, 40.617, 48.266, 71.907, 85.272
   n=12: 7.78, 14.461, 23.721, 25.269, 17.189, 26.906, 35.54, 37.497, 34.63, 40.995, 54.406, 69.294, 40.724, 48.406, 72.021, 85.329
   n=13: 7.912, 14.635, 23.861, 25.34, 17.362, 27.133, 35.724, 37.589, 34.77, 41.179, 54.554, 69.368, 40.794, 48.498, 72.095, 85.366
   n=14: 7.999, 14.748, 23.953, 25.386, 17.476, 27.282, 35.845, 37.65, 34.862, 41.299, 54.652, 69.417, 40.84, 48.559, 72.144, 85.39
   n=15: 8.056, 14.823, 24.013, 25.416, 17.55, 27.379, 35.923, 37.689, 34.922, 41.378, 54.716, 69.449, 40.87, 48.598, 72.176, 85.406
   n=16: 8.093, 14.871, 24.053, 25.435, 17.599, 27.443, 35.975, 37.715, 34.962, 41.43, 54.757, 69.47, 40.89, 48.624, 72.197, 85.417
   n=17: 8.118, 14.903, 24.078, 25.448, 17.631, 27.485, 36.009, 37.732, 34.988, 41.463, 54.785, 69.483, 40.903, 48.641, 72.21, 85.423
   n=18: 8.133, 14.924, 24.095, 25.457, 17.651, 27.512, 36.031, 37.743, 35.004, 41.485, 54.803, 69.492, 40.911, 48.652, 72.219, 85.428
   n=19: 8.144, 14.938, 24.106, 25.462, 17.665, 27.53, 36.045, 37.75, 35.015, 41.5, 54.814, 69.498, 40.917, 48.659, 72.225, 85.431
   n=20: 8.151, 14.947, 24.114, 25.466, 17.674, 27.541, 36.055, 37.755, 35.023, 41.509, 54.822, 69.502, 40.92, 48.664, 72.229, 85.433
   n=21: 8.155, 14.953, 24.118, 25.468, 17.68, 27.549, 36.061, 37.758, 35.027, 41.516, 54.827, 69.504, 40.923, 48.667, 72.232, 85.434
   n=22: 8.158, 14.956, 24.121, 25.47, 17.684, 27.554, 36.065, 37.76, 35.03, 41.52, 54.83, 69.506, 40.924, 48.669, 72.233, 85.435
   n=23: 8.16, 14.959, 24.123, 25.471, 17.686, 27.557, 36.068, 37.761, 35.033, 41.522, 54.832, 69.507, 40.925, 48.67, 72.234, 85.435
   n=24: 8.161, 14.961, 24.125, 25.471, 17.688, 27.56, 36.069, 37.762, 35.034, 41.524, 54.834, 69.508, 40.926, 48.671, 72.235, 85.436
   n=25: 8.162, 14.962, 24.126, 25.472, 17.689, 27.561, 36.071, 37.763, 35.035, 41.525, 54.835, 69.508, 40.926, 48.672, 72.235, 85.436
   n=26: 8.163, 14.962, 24.126, 25.472, 17.69, 27.562, 36.071, 37.763, 35.035, 41.526, 54.835, 69.509, 40.927, 48.672, 72.236, 85.436
   n=27: 8.163, 14.963, 24.127, 25.472, 17.69, 27.562, 36.072, 37.763, 35.036, 41.526, 54.836, 69.509, 40.927, 48.672, 72.236, 85.436
   n=28: 8.163, 14.963, 24.127, 25.472, 17.69, 27.563, 36.072, 37.763, 35.036, 41.527, 54.836, 69.509, 40.927, 48.672, 72.236, 85.436
   n=29: 8.163, 14.963, 24.127, 25.473, 17.691, 27.563, 36.072, 37.763, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=30: 8.163, 14.963, 24.127, 25.473, 17.691, 27.563, 36.072, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=31: 8.164, 14.963, 24.127, 25.473, 17.691, 27.563, 36.073, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=32: 8.164, 14.964, 24.127, 25.473, 17.691, 27.563, 36.073, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=33: 8.164, 14.964, 24.127, 25.473, 17.691, 27.564, 36.073, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436
   n=34: 8.164, 14.964, 24.127, 25.473, 17.691, 27.564, 36.073, 37.764, 35.036, 41.527, 54.836, 69.509, 40.927, 48.673, 72.236, 85.436

Iteration 34 gives the equilibrium temperatures to an accuracy of 0.001.
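For a mesh this size, the sweep is best written generically. Below is a sketch (my own Python illustration; the helper name `gauss_seidel` is mine) that encodes each of the sixteen averaging equations as a list of interior-neighbour indices plus a summed boundary constant, and sweeps until no component moves by more than a tolerance:

```python
def gauss_seidel(neighbors, const, tol=0.0005, max_iter=100):
    """Sweep t[i] = (sum of listed neighbours + const[i]) / 4,
    Gauss-Seidel style (new values used at once), until converged."""
    t = [0.0] * len(const)
    for _ in range(max_iter):
        delta = 0.0
        for i, nbrs in enumerate(neighbors):
            new = (sum(t[j] for j in nbrs) + const[i]) / 4
            delta = max(delta, abs(new - t[i]))
            t[i] = new
        if delta < tol:
            break
    return t

# The 16 mesh equations above: 0-based indices of interior neighbours,
# and the summed boundary temperatures for each interior point.
neighbors = [[1, 4], [0, 2, 5], [1, 3, 6], [2, 7],
             [0, 5, 8], [1, 4, 6, 9], [2, 5, 7, 10], [3, 6, 11],
             [4, 9, 12], [5, 8, 10, 13], [6, 9, 11, 14], [7, 10, 15],
             [8, 13], [9, 12, 14], [10, 13, 15], [11, 14]]
const = [0, 0, 20, 40, 0, 0, 0, 20, 40, 0, 0, 100, 80, 40, 100, 200]

t = gauss_seidel(neighbors, const)
# t[0] is close to 8.164 and t[15] to 85.436, the limiting values above
```

Tightening `tol` reproduces the table to any desired number of decimals.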
27. (a) Let x1 correspond to the left end of the paper and x2 to the right end, and let n be the number of folds. Then the first six values of [x1, x2] are

   n    0   1    2    3    4     5      6
   x1   0   0    1/4  1/4  5/16  5/16   21/64
   x2   1   1/2  1/2  3/8  3/8   11/32  11/32
The graph below shows these points, together with the two lines determined in part (b):

[Figure: the points above plotted on the unit square, with the two lines from part (b); not reproduced here.]
(b) From the table, we see that x1 = 0 gives x2 = 1/2 in the next iteration, and x1 = 1/4 gives x2 = 3/8 in the next iteration. Similarly, x2 = 1 gives x1 = 0 in the next iteration, and x2 = 1/2 gives x1 = 1/4 in the next iteration. Thus

   x2 - 1/2 = ((3/8 - 1/2)/(1/4 - 0))(x1 - 0),   or   x2 = -(1/2)x1 + 1/2
   x1 - 0   = ((1/4 - 0)/(1/2 - 1))(x2 - 1),     or   x1 = -(1/2)x2 + 1/2.
(c) Switching to decimal representations, we apply Gauss-Seidel to these equations after n = 6, starting with the estimates x1 = 21/64 and x2 = 11/32, to obtain an approximate point of convergence:

   n    0   1    2    3    4     5      6      7      8      9      10
   x1   0   0    1/4  1/4  5/16  5/16   21/64  0.328  0.332  0.333  0.333
   x2   1   1/2  1/2  3/8  3/8   11/32  11/32  0.336  0.334  0.333  0.333
(d) Row-reduce the augmented matrix for this pair of equations, giving

   [  1   1/2 | 1/2 ]      [ 1 0 | 1/3 ]
   [ 1/2   1  | 1/2 ]  ->  [ 0 1 | 1/3 ].

The exact solution to the system of equations is [x1, x2] = [1/3, 1/3]. The ends of the piece of paper converge to 1/3.
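The fixed point can also be found by simply iterating the two lines from part (b); a tiny sketch (my own illustration) of the Gauss-Seidel loop from part (c):

```python
x1, x2 = 21/64, 11/32     # the endpoints after six folds
for n in range(10):
    x1 = -x2/2 + 1/2      # line fitted to the left-endpoint pattern
    x2 = -x1/2 + 1/2      # line fitted to the right-endpoint pattern
# both endpoints approach 1/3; the error shrinks by a factor of 4 per loop
```

The contraction factor 1/4 comes from composing the two slopes of -1/2.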
28. (a) Let x1 record the positions of the left endpoints and x2 the positions of the right endpoints at the end of each walk. Then the first six values of [x1, x2] are

   n    0    1    2    3     4      5      6
   x1   0    1/4  1/4  5/16  5/16   21/64  21/64
   x2   1/2  1/2  5/8  5/8   21/32  21/32  85/128
The graph below shows these points, together with the two lines determined in part (b):

[Figure: the points above plotted on the unit square, with the two lines from part (b); not reproduced here.]

(b)
From the table, we see that x1 = 0 gives x2 = 1/2 in the next iteration, and x1 = 1/4 gives x2 = 5/8 in the next iteration. Similarly, x2 = 1/2 gives x1 = 1/4 in the next iteration, and x2 = 5/8 gives x1 = 5/16 in the next iteration. Thus

   x2 - 1/2 = ((5/8 - 1/2)/(1/4 - 0))(x1 - 0),       or   x2 = (1/2)x1 + 1/2
   x1 - 1/4 = ((5/16 - 1/4)/(5/8 - 1/2))(x2 - 1/2),  or   x1 = (1/2)x2.
(c) Switching to decimal representations, we apply Gauss-Seidel to these equations after n = 6, starting with the estimates x1 = 21/64 and x2 = 85/128, to obtain an approximate point of convergence:

   n    0    1    2    3     4      5      6       7      8      9
   x1   0    1/4  1/4  5/16  5/16   21/64  21/64   0.332  0.333  0.333
   x2   1/2  1/2  5/8  5/8   21/32  21/32  85/128  0.666  0.667  0.667
(d) Row-reduce the augmented matrix for this pair of equations, giving

   [ -1/2   1  | 1/2 ]      [ 1 0 | 1/3 ]
   [   1  -1/2 |  0  ]  ->  [ 0 1 | 2/3 ].

The exact solution to the system of equations is [x1, x2] = [1/3, 2/3]. So the ant eventually oscillates between 1/3 and 2/3.
Chapter Review
1. (a) False. Some linear systems are inconsistent, and have no solutions. See Section 2.1. One example
is x + y = 0 and x + y = 1.
(b) True. See Section 2.2. Since in a homogeneous system, the lines defined by the equations all
pass through the origin, the system has at least one solution; namely, the solution in which all
variables are zero.
(c) False. See Theorem 2.6. This is guaranteed only if the system is homogeneous, but not if it is not (for example, x + y + z = 0, x + y + z = 1 is inconsistent in R3).
(d) False. See the discussion of homogeneous systems in Section 2.2. It can have either a unique
solution, infinitely many solutions, or no solutions.
(e) True. See Theorem 2.4 in Section 2.3.
(f ) False. If u and v are linearly dependent (i.e., one is a multiple of the other), we get a line, not
a plane. And if u and v are both zero, we get only the origin. If the two vectors are linearly
independent, then their span is indeed a plane through the origin.
(g) True. If u and v are parallel, then v = cu, so that −cu + v = 0, so that u and v are linearly
dependent.
(h) True. If they can be drawn that way, then the sum of the vectors is 0, since tracing the vectors
in turn returns you to the starting point.
(i) False. For example, u = [1, 0, 0], v = [0, 1, 0] and w = [1, 1, 0] are linearly dependent since
u + v − w = 0, yet none of these is a multiple of another.
(j) True. See Theorem 2.8 in Section 2.3. m vectors in Rn are linearly dependent if m > n.
2. The rank is the number of nonzero rows in row echelon form:

   [ 1 -2  0  3  2 ]           [ 1 -2  0  3  2 ]                 [ 1 -2  0  3  2 ]
   [ 3 -1  1  3  4 ]  R2<->R4  [ 0 -5 -1  6  2 ]  R3-3R1+2R2     [ 0 -5 -1  6  2 ]
   [ 3  4  2 -3  2 ]  ------>  [ 3  4  2 -3  2 ]  R4-3R1+R2      [ 0  0  0  0  0 ]
   [ 0 -5 -1  6  2 ]           [ 3 -1  1  3  4 ]  ----------->   [ 0  0  0  0  0 ].

This row echelon form of A has two nonzero rows, so that rank(A) = 2.
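The hand computation can be cross-checked with a small exact-arithmetic elimination (my own Python sketch; the helper name `matrix_rank` is mine, and the matrix is the one from this exercise):

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank = number of pivots found by Gaussian elimination,
    done exactly over the rationals to avoid round-off."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = 0
    for col in range(len(m[0])):
        pivot_row = next((r for r in range(pivots, len(m)) if m[r][col] != 0), None)
        if pivot_row is None:
            continue                        # no pivot in this column
        m[pivots], m[pivot_row] = m[pivot_row], m[pivots]
        for r in range(pivots + 1, len(m)):
            f = m[r][col] / m[pivots][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[pivots])]
        pivots += 1
    return pivots

A = [[1, -2, 0, 3, 2],
     [3, -1, 1, 3, 4],
     [3, 4, 2, -3, 2],
     [0, -5, -1, 6, 2]]
print(matrix_rank(A))  # 2
```

Using `Fraction` keeps the elimination exact, so the pivot count is the true rank.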
3. Construct the augmented matrix of the system and row-reduce it:

   [ 1 1 -2 | 4 ]  R2-R1    [ 1  1 -2 |  4 ]  -2R3    [ 1 1 -2 |  4 ]  R3-R2   [ 1 1 -2 |  4 ]
   [ 1 3 -1 | 7 ]  R3-2R1   [ 0  2  1 |  3 ]  ---->   [ 0 2  1 |  3 ]  ---->   [ 0 2  1 |  3 ]
   [ 2 1 -5 | 7 ]  ----->   [ 0 -1 -1 | -1 ]          [ 0 2  2 |  2 ]          [ 0 0  1 | -1 ]

   R1+2R3   [ 1 1 0 |  2 ]  (1/2)R2   [ 1 1 0 |  2 ]  R1-R2   [ 1 0 0 |  0 ]
   R2-R3    [ 0 2 0 |  4 ]  ------>   [ 0 1 0 |  2 ]  ---->   [ 0 1 0 |  2 ]
   ----->   [ 0 0 1 | -1 ]            [ 0 0 1 | -1 ]          [ 0 0 1 | -1 ].

Thus the solution is [x1, x2, x3] = [0, 2, -1].
4. Construct the augmented matrix and row-reduce it (reordering the rows first):

   [ 3 8 -18 1 | 35 ]            [ 1 2  -4 0 | 11 ]          [ 1 2 -4 0 | 11 ]
   [ 1 2  -4 0 | 11 ]  reorder   [ 1 3  -7 1 | 10 ]  R2-R1   [ 0 1 -3 1 | -1 ]
   [ 1 3  -7 1 | 10 ]  ------>   [ 3 8 -18 1 | 35 ]  R3-3R1  [ 0 2 -6 1 |  2 ]
                                                     ---->

   R3-2R2   [ 1 2 -4  0 | 11 ]   -R3   [ 1 2 -4 0 | 11 ]  R2-R3    [ 1 0  2 0 |  5 ]
   ----->   [ 0 1 -3  1 | -1 ]  --->   [ 0 1 -3 1 | -1 ]  R1-2R2   [ 0 1 -3 0 |  3 ]
            [ 0 0  0 -1 |  4 ]         [ 0 0  0 1 | -4 ]  ----->   [ 0 0  0 1 | -4 ].

Then y = t is a free variable, and the general solution is w = 5 - 2t, x = 3 + 3t, y = t, z = -4; that is,

   [w, x, y, z] = [5 - 2t, 3 + 3t, t, -4] = [5, 3, 0, -4] + t[-2, 3, 1, 0].
5. Construct the augmented matrix of the system and row-reduce it over Z7:

   [ 2 3 | 4 ]  4R1   [ 1 5 | 2 ]  R2+6R1   [ 1 5 | 2 ]  R1+4R2   [ 1 0 | 6 ]
   [ 1 2 | 3 ]  --->  [ 1 2 | 3 ]  ------>  [ 0 4 | 1 ]  2R2      [ 0 1 | 2 ].
                                                         ------>

Thus the solution is [x, y] = [6, 2].
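Arithmetic over Z7 mirrors ordinary elimination, with division replaced by multiplication by a modular inverse. A sketch (my own Python illustration; `pow(a, -1, p)` computes modular inverses and requires Python 3.8+):

```python
p = 7
# Solve 2x + 3y = 4 and x + 2y = 3 over Z_7 (the system from this exercise).
inv2 = pow(2, -1, p)              # 4, since 2 * 4 = 8 = 1 (mod 7)
a, b = (3 * inv2) % p, (4 * inv2) % p   # scaled row 1:  x + 5y = 2
c, d = (2 - a) % p, (3 - b) % p         # subtract from row 2:  4y = 1
y = (d * pow(c, -1, p)) % p             # back-substitute mod 7
x = (b - a * y) % p
print(x, y)  # 6 2
```

Because 7 is prime, every nonzero coefficient is invertible, so the elimination never breaks down.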
6. Construct the augmented matrix and row-reduce it over Z5:

   [ 3 2 | 1 ]  R2+3R1   [ 3 2 | 1 ]
   [ 1 4 | 2 ]  ------>  [ 0 0 | 0 ].

So the solutions are all pairs satisfying 3x + 2y = 1. Setting y = 0, 1, 2, 3, 4 in turn, we get

   y = 0  =>  3x = 1               =>  x = 2  =>  [x, y] = [2, 0]
   y = 1  =>  3x + 2 = 1, 3x = 4   =>  x = 3  =>  [x, y] = [3, 1]
   y = 2  =>  3x + 4 = 1, 3x = 2   =>  x = 4  =>  [x, y] = [4, 2]
   y = 3  =>  3x + 1 = 1, 3x = 0   =>  x = 0  =>  [x, y] = [0, 3]
   y = 4  =>  3x + 3 = 1, 3x = 3   =>  x = 1  =>  [x, y] = [1, 4].
7. Construct the augmented matrix and row-reduce it:

   [ k  2 | 1 ]  R2<->R1   [ 1 2k | 1 ]  R2-kR1   [ 1    2k    |   1   ]
   [ 1 2k | 1 ]  ------->  [ k  2 | 1 ]  ------>  [ 0  2-2k^2  | 1 - k ].

The system is inconsistent when 2 - 2k^2 = -2(k + 1)(k - 1) = 0 but 1 - k != 0. The first expression is zero for k = -1 and for k = 1; when k = -1 we have 1 - k = 2 != 0. So the system is inconsistent exactly when k = -1.
8. Form the augmented matrix of the system and row reduce it (see Example 2.14 in Section 2.2):

   [ 1 2 3 | 4 ]  R2-5R1   [ 1  2  3 |   4 ]  -(1/4)R2   [ 1 2 3 | 4 ]  R1-2R2   [ 1 0 -1 | -2 ]
   [ 5 6 7 | 8 ]  ------>  [ 0 -4 -8 | -12 ]  ------->   [ 0 1 2 | 3 ]  ------>  [ 0 1  2 |  3 ].

Thus z = t is a free variable, and x = -2 + t, y = 3 - 2t, so the parametric equation of the line is

   [x, y, z] = [-2 + t, 3 - 2t, t] = [-2, 3, 0] + t[1, -2, 1].
9. As in Example 2.15 in Section 2.2, we wish to find x = [x, y, z] that satisfies both equations simultaneously; that is, x = p + su = q + tv. This means su - tv = q - p. Substituting the given vectors into this equation gives

   s[1, -1, 2] - t[-1, 1, 1] = [5, -2, -4] - [1, 2, 3] = [4, -4, -7],

that is,

    s + t =  4
   -s - t = -4
   2s - t = -7.

Form the augmented matrix of this system and row-reduce it to determine s and t:

   [  1  1 |  4 ]  R2<->R3   [  1  1 |  4 ]  R2-2R1   [ 1  1 |   4 ]  -(1/3)R2   [ 1 1 | 4 ]  R1-R2   [ 1 0 | -1 ]
   [ -1 -1 | -4 ]  ------->  [  2 -1 | -7 ]  R3+R1    [ 0 -3 | -15 ]  ------->   [ 0 1 | 5 ]  ---->   [ 0 1 |  5 ]
   [  2 -1 | -7 ]            [ -1 -1 | -4 ]  ---->    [ 0  0 |   0 ]             [ 0 0 | 0 ]          [ 0 0 |  0 ].

Thus s = -1 and t = 5, so the point of intersection is

   [x, y, z] = [1, 2, 3] - [1, -1, 2] = [0, 3, 1].

(We could equally well have substituted t = 5 into the equation for the other line; check that this gives the same answer.)
10. As in Example 2.18 in Section 2.3, we want to find scalars x and y such that

   x[1, 1, 3] + y[1, 2, -2] = [3, 5, -1],   that is,   x + y = 3,  x + 2y = 5,  3x - 2y = -1.

Form the augmented matrix of this system and row-reduce it to determine x and y:

   [ 1  1 |  3 ]  R2-R1    [ 1  1 |   3 ]  R1-R2    [ 1 0 | 1 ]
   [ 1  2 |  5 ]  R3-3R1   [ 0  1 |   2 ]  R3+5R2   [ 0 1 | 2 ]
   [ 3 -2 | -1 ]  ----->   [ 0 -5 | -10 ]  ----->   [ 0 0 | 0 ].

So this system has a solution, and thus the vector is in the span of the other two vectors:

   [3, 5, -1] = 1[1, 1, 3] + 2[1, 2, -2].
11. As in Example 2.21 in Section 2.3, the equation of the plane we are looking for is

   [x, y, z] = s[1, 1, 1] + t[3, 2, 1],   that is,   s + 3t = x,  s + 2t = y,  s + t = z.

Form the augmented matrix and row-reduce it to find conditions on x, y, and z:

   [ 1 3 | x ]  R2-R1   [ 1  3 | x     ]  -R2    [ 1  3 | x     ]  R3+2R2   [ 1 3 | x          ]
   [ 1 2 | y ]  R3-R1   [ 0 -1 | y - x ]  --->   [ 0  1 | x - y ]  ------>  [ 0 1 | x - y      ]
   [ 1 1 | z ]  ---->   [ 0 -2 | z - x ]         [ 0 -2 | z - x ]           [ 0 0 | x - 2y + z ].

We know the system is consistent, since the two given vectors span some plane. So we must have x - 2y + z = 0, and this is the general equation of the plane we were seeking. (An alternative method would be to take the cross product of the two given vectors, giving [-1, 2, -1], so that the plane is of the form -x + 2y - z = d; plugging in x = y = z = 0 gives d = 0, so -x + 2y - z = 0, or x - 2y + z = 0.)
12. As in Example 2.23 in Section 2.3, we want to find scalars c1, c2, and c3 so that

   c1[2, 1, -3] + c2[1, -1, -2] + c3[3, 9, -2] = [0, 0, 0].

Form the augmented matrix of the resulting system and row-reduce it:

   [  2  1  3 | 0 ]      [ 1 0  4 | 0 ]
   [  1 -1  9 | 0 ]  ->  [ 0 1 -5 | 0 ]
   [ -3 -2 -2 | 0 ]      [ 0 0  0 | 0 ].

Since c1 = -4c3, c2 = 5c3 is a solution for any c3, the vectors are linearly dependent. For example, setting c3 = -1 gives

   4[2, 1, -3] - 5[1, -1, -2] - [3, 9, -2] = [0, 0, 0].
13. By part (a) of Exercise 21 in Section 2.3, it is sufficient to prove that each ei is in span(u, v, w) ⊂ R3, since then we have R3 = span(e1, e2, e3) ⊂ span(u, v, w) ⊂ R3.

(a) Begin with e1. We want to find scalars x, y, and z such that

   x[1, 1, 0] + y[1, 0, 1] + z[0, 1, 1] = e1 = [1, 0, 0].

Form the augmented matrix and row reduce:

   [ 1 1 0 | 1 ]      [ 1 0 0 |  1/2 ]
   [ 1 0 1 | 0 ]  ->  [ 0 1 0 |  1/2 ]   =>   e1 = (1/2)u + (1/2)v - (1/2)w.
   [ 0 1 1 | 0 ]      [ 0 0 1 | -1/2 ]

Similarly, for e2,

   [ 1 1 0 | 0 ]      [ 1 0 0 |  1/2 ]
   [ 1 0 1 | 1 ]  ->  [ 0 1 0 | -1/2 ]   =>   e2 = (1/2)u - (1/2)v + (1/2)w,
   [ 0 1 1 | 0 ]      [ 0 0 1 |  1/2 ]

and for e3,

   [ 1 1 0 | 0 ]      [ 1 0 0 | -1/2 ]
   [ 1 0 1 | 0 ]  ->  [ 0 1 0 |  1/2 ]   =>   e3 = -(1/2)u + (1/2)v + (1/2)w.
   [ 0 1 1 | 1 ]      [ 0 0 1 |  1/2 ]

Thus R3 = span(u, v, w).

(b) Since w = u + v, these vectors are not linearly independent. But any set of vectors that spans R3 must contain 3 linearly independent vectors, so span(u, v, w) != R3.
14. All three statements are true, so (d) is the correct answer. If the vectors are linearly independent, and
b is any vector in R3 , then the augmented matrix [A | b], when reduced to row echelon form, cannot
have a zero row (since otherwise, the portion of the matrix corresponding to A would have a zero row,
so the three vectors would be linearly dependent). Thus [A | b] has a unique solution, which can be
read off from the row-reduced matrix. This proves statement (c). Next, since there is not a zero row,
continuing and putting the system in reduced row-echelon form results in a matrix with 3 leading ones,
so it is the identity matrix. Hence the reduced row echelon form of A is I3 , and thus A has rank 3.
15. Since A is a 3 × 3 matrix, rank(A) ≤ 3. Since the vectors are linearly dependent, rank(A) cannot be
3; since they are not all zero, rank(A) cannot be zero. Thus rank(A) = 1 or rank(A) = 2.
16. By Theorem 2.8 in Section 2.3, the maximum rank of a 5 × 3 matrix is 3: since the rows are vectors in R3, any set of four or more of them is linearly dependent. Thus when we row reduce the matrix, we must introduce at least two zero rows. The minimum rank is zero, achieved when all entries are zero.
17. Suppose that c1 (u + v) + c2 (u − v) = 0. Rearranging gives (c1 + c2 )u + (c1 − c2 )v = 0. Since u and v
are linearly independent, it follows that c1 + c2 = 0 and c1 − c2 = 0. Adding the two equations gives
2c1 = 0, so that c1 = 0; then c2 = 0 as well. It follows that u + v and u − v are linearly independent.
18. Using Exercise 21 in Section 2.3, we show first that u and v are linear combinations of u and u + v.
But this is clear: u = u, and v = (u + v) − u. This shows that span(u, v) ⊆ span(u, u + v).
Next, we show that u and u + v are linear combinations of u and v. But this is also clear. Thus
span(u, u + v) ⊆ span(u, v). Hence the two spans are equal.
19. Note that rank(A) ≤ rank([A | b]), since any zero rows in an echelon form of [A | b] are also zero rows
in a row echelon form of A, so that a row echelon form of A has at least as many zero rows as a row
echelon form of [A | b]. If rank(A) < rank([A | b]), then a row echelon form of [A | b] will contain at least
one row that is zero in the columns corresponding to A but nonzero in the last column on that row,
so that the system will be inconsistent. Hence the system is consistent only if rank(A) = rank([A | b]).
20. Row-reduce both A and B:

   [  1 1  1 ]  R2-2R1   [ 1 1  1 ]  R3-5R2   [ 1 1  1 ]
   [  2 3 -1 ]  R3+R1    [ 0 1 -3 ]  ------>  [ 0 1 -3 ]
   [ -1 4  1 ]  ----->   [ 0 5  2 ]           [ 0 0 17 ]

   [ 1 0 -1 ]  R2-R1   [ 1 0 -1 ]  R3-R2   [ 1 0 -1 ]
   [ 1 1  1 ]  ---->   [ 0 1  2 ]  ---->   [ 0 1  2 ]
   [ 0 1  3 ]          [ 0 1  3 ]          [ 0 0  1 ].

Since in row echelon form each of these matrices has no zero rows, both matrices have rank 3, so that their reduced row echelon forms are both I3. Thus the matrices are row-equivalent. The sequence of row operations required to derive B from A is the sequence of row operations required to reduce A to I3, followed by the sequence of row operations, in reverse, required to reduce B to I3.
Chapter 3
Matrices
3.1 Matrix Operations
1. Since A and D have the same shape, this operation makes sense. (Here and below, matrices are written with rows separated by semicolons.)

   A + 2D = [3 0; -1 5] + 2[0 -3; -2 1] = [3 0; -1 5] + [0 -6; -4 2] = [3+0 0-6; -1-4 5+2] = [3 -6; -5 7].
2. Since D and A have the same shape, this operation makes sense.

   3D - 2A = 3[0 -3; -2 1] - 2[3 0; -1 5] = [0 -9; -6 3] - [6 0; -2 10] = [0-6 -9-0; -6-(-2) 3-10] = [-6 -9; -4 -7].
3. Since B is a 2 × 3 matrix and C is a 3 × 2 matrix, they do not have the same shape, so they cannot
be subtracted.
4. Since both C and B^T are 3 x 2 matrices, this operation makes sense.

   C - B^T = [1 2; 3 4; 5 6] - [4 0; -2 2; 1 3] = [1-4 2-0; 3-(-2) 4-2; 5-1 6-3] = [-3 2; 5 2; 4 3].
5. Since A has two columns and B has two rows, the product makes sense, and

   AB = [3 0; -1 5][4 -2 1; 0 2 3]
      = [3*4+0*0  3*(-2)+0*2  3*1+0*3; -1*4+5*0  -1*(-2)+5*2  -1*1+5*3]
      = [12 -6 3; -4 12 14].
6. Since the number of columns of B (3) does not equal the number of rows of D (2), this matrix product
does not make sense.
7. Since B has three columns and C has three rows, the multiplication BC makes sense, and results in a 2 x 2 matrix (since B has 2 rows and C has 2 columns). Since D is also a 2 x 2 matrix, they can be added, so this formula makes sense. First,

   BC = [4 -2 1; 0 2 3][1 2; 3 4; 5 6] = [4*1-2*3+1*5  4*2-2*4+1*6; 0*1+2*3+3*5  0*2+2*4+3*6] = [3 6; 21 26].

Then

   D + BC = [0 -3; -2 1] + [3 6; 21 26] = [0+3 -3+6; -2+21 1+26] = [3 3; 19 27].
8. Since B has three columns and B^T has three rows, the product makes sense, and

   BB^T = [4 -2 1; 0 2 3][4 0; -2 2; 1 3]
        = [4*4-2*(-2)+1*1  4*0-2*2+1*3; 0*4+2*(-2)+3*1  0*0+2*2+3*3]
        = [21 -1; -1 13].
9. Since A has two columns and F has two rows, AF makes sense; since A has two rows and F has one column, the result is a 2 x 1 matrix. So this matrix has two rows, and E has two columns, so we can multiply by E on the left. Thus this product makes sense. Then

   AF = [3 0; -1 5][-1; 2] = [3*(-1)+0*2; -1*(-1)+5*2] = [-3; 11],

and then

   E(AF) = [4 2][-3; 11] = [4*(-3) + 2*11] = [10].
10. F has shape 2 × 1. D has shape 2 × 2, so DF has shape 2 × 1. Since the number of columns (1) of F
does not equal the number of rows (2) of DF , the product does not make sense.
11. Since F has one column and E has one row, the product makes sense, and

   FE = [-1; 2][4 2] = [-1*4 -1*2; 2*4 2*2] = [-4 -2; 8 4].
12. Since E has two columns and F has two rows, the product makes sense, and

   EF = [4 2][-1; 2] = [4*(-1) + 2*2] = [0].

(Note that FE from Exercise 11 and EF are not the same matrix. In fact, they don't even have the same shape! In general, matrix multiplication is not commutative.)
13. Since B has two rows, B^T has two columns; since C has two columns, C^T has two rows. Thus B^T C^T makes sense, and the result is a 3 x 3 matrix (since B^T has three rows and C^T has three columns). Next, CB is defined (3 x 2 multiplied by 2 x 3) and also yields a 3 x 3 matrix, so its transpose is also a 3 x 3 matrix. Thus B^T C^T - (CB)^T is defined.

   B^T C^T = [4 0; -2 2; 1 3][1 3 5; 2 4 6]
           = [4*1+0*2  4*3+0*4  4*5+0*6; -2*1+2*2  -2*3+2*4  -2*5+2*6; 1*1+3*2  1*3+3*4  1*5+3*6]
           = [4 12 20; 2 2 2; 7 15 23]

   CB = [1 2; 3 4; 5 6][4 -2 1; 0 2 3]
      = [1*4+2*0  1*(-2)+2*2  1*1+2*3; 3*4+4*0  3*(-2)+4*2  3*1+4*3; 5*4+6*0  5*(-2)+6*2  5*1+6*3]
      = [4 2 7; 12 2 15; 20 2 23]

   (CB)^T = [4 2 7; 12 2 15; 20 2 23]^T = [4 12 20; 2 2 2; 7 15 23].

Thus B^T C^T = (CB)^T, so their difference is the zero matrix.
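The reverse-order law illustrated here, (CB)^T = B^T C^T, is easy to confirm numerically. A small self-contained sketch (my own Python illustration, using the B and C of these exercises):

```python
def mat_mul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

B = [[4, -2, 1], [0, 2, 3]]
C = [[1, 2], [3, 4], [5, 6]]

lhs = transpose(mat_mul(C, B))                 # (CB)^T
rhs = mat_mul(transpose(B), transpose(C))      # B^T C^T
print(lhs == rhs)  # True
```

Note the order reversal: transposing a product swaps the factors.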
14. Since A and D are both 2 x 2 matrices, the products in both orders are defined, and each results in a 2 x 2 matrix, so their difference is defined as well.

   DA = [0 -3; -2 1][3 0; -1 5] = [0*3-3*(-1)  0*0-3*5; -2*3+1*(-1)  -2*0+1*5] = [3 -15; -7 5]
   AD = [3 0; -1 5][0 -3; -2 1] = [3*0+0*(-2)  3*(-3)+0*1; -1*0+5*(-2)  -1*(-3)+5*1] = [0 -9; -10 8]

   DA - AD = [3 -15; -7 5] - [0 -9; -10 8] = [3-0  -15-(-9); -7-(-10)  5-8] = [3 -6; 3 -3].
15. Since A is 2 x 2, A^2 is defined and is also a 2 x 2 matrix, so we can multiply by A again to get A^3.

   A^2 = [3 0; -1 5][3 0; -1 5] = [3*3+0*(-1)  3*0+0*5; -1*3+5*(-1)  -1*0+5*5] = [9 0; -8 25]
   A^3 = A A^2 = [3 0; -1 5][9 0; -8 25] = [3*9+0*(-8)  3*0+0*25; -1*9+5*(-8)  -1*0+5*25] = [27 0; -49 125].
16. Since D is a 2 x 2 matrix, I2 - D makes sense and is also a 2 x 2 matrix. Since it has the same number of rows as columns, squaring it (which is multiplying it by itself) also makes sense.

   I2 - D = [1 0; 0 1] - [0 -3; -2 1] = [1 3; 2 0].

Thus

   (I2 - D)^2 = [1 3; 2 0][1 3; 2 0] = [1*1+3*2  1*3+3*0; 2*1+0*2  2*3+0*0] = [7 3; 2 6].

17. Suppose that A = [a b; c d]. Then if A^2 = O,

   A^2 = [a b; c d][a b; c d] = [a^2+bc  ab+bd; ac+cd  bc+d^2] = [0 0; 0 0].

If c = 0, so that the lower left-hand entry is zero, then we must also have a = 0 and d = 0 so that the diagonal entries are zero; then A = [0 b; 0 0] is a solution for any b. Alternatively, if c != 0, then ac = -cd and thus a = -d. This will make the off-diagonal entries zero as well, and the two diagonal entries will both be equal to a^2 + bc. So any values of a, b, and c != 0 with a^2 = -bc will work. For example, a = b = 1, c = -1, d = -1:

   [1 1; -1 -1][1 1; -1 -1] = [1-1  1-1; -1+1  -1+1] = [0 0; 0 0].
18. Note that the first column of A is twice its second column. Thus

   A[1 0; -2 0] = [0 0; 0 0] = A[0 0; 0 0].
19. The cost of shipping one unit of each product is given by $B = \begin{bmatrix} 1.50 & 1.00 & 2.00 \\ 1.75 & 1.50 & 1.00 \end{bmatrix}$, where the first row gives the costs of shipping each product by truck, and the second row the costs when shipping by train. Then BA gives the costs of shipping all the products to each of the warehouses by truck or train:
$$BA = \begin{bmatrix} 1.50 & 1.00 & 2.00 \\ 1.75 & 1.50 & 1.00 \end{bmatrix}\begin{bmatrix} 200 & 75 \\ 150 & 100 \\ 100 & 125 \end{bmatrix} = \begin{bmatrix} 650.00 & 462.50 \\ 675.00 & 406.25 \end{bmatrix}.$$
132
CHAPTER 3. MATRICES
For example, the upper left entry in the product is the sum of 1.50 · 200, the cost of shipping doohickies
to warehouse 1 by truck, 1.00 · 150, the cost of shipping gizmos to warehouse 1 by truck, and 2.00 · 100,
the cost of shipping widgets to warehouse 1 by truck. So 650.00 is the cost of shipping all of the items
to warehouse 1 by truck. Comparing, we see that it is cheaper to ship items to warehouse 1 by truck
($650.00 vs $675.00), but to warehouse 2 by train ($406.25 vs $462.50).
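The product BA above can be reproduced with a short calculation. This is a sketch (not part of the text), using a small hypothetical `matmul` helper:

```python
# Minimal sketch (not from the text): reproduce the shipping-cost
# product BA from Exercise 19 in pure Python.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[1.50, 1.00, 2.00],   # per-unit costs by truck
     [1.75, 1.50, 1.00]]   # per-unit costs by train
A = [[200, 75],            # units of each product bound for
     [150, 100],           # warehouse 1 and warehouse 2
     [100, 125]]

BA = matmul(B, A)
print(BA)  # [[650.0, 462.5], [675.0, 406.25]]
```

The (1,1) entry is exactly the truck-cost sum 1.50·200 + 1.00·150 + 2.00·100 described in the text.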
20. The cost of distributing one unit of the product is given by $C = \begin{bmatrix} 0.75 & 0.75 & 0.75 \\ 1.00 & 1.00 & 1.00 \end{bmatrix}$, where the entry
in row i, column j is the cost of shipping one unit of product j from warehouse i. Then the cost of distribution is
$$CA = \begin{bmatrix} 0.75 & 0.75 & 0.75 \\ 1.00 & 1.00 & 1.00 \end{bmatrix}\begin{bmatrix} 200 & 75 \\ 150 & 100 \\ 100 & 125 \end{bmatrix} = \begin{bmatrix} 337.50 & 225.00 \\ 450.00 & 300.00 \end{bmatrix}.$$
In this product, the entry in row i, column i represents the total cost of shipping all the products from
warehouse i. The off-diagonal entries do not represent anything in particular, since they are a result
of multiplying shipping costs from one warehouse with products from the other warehouse.
21. $\begin{bmatrix} 1 & -2 & 3 \\ 2 & 1 & -5 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 4 \end{bmatrix}$, so that $A\mathbf{x} = \mathbf{b}$, where
$$A = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 1 & -5 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 0 \\ 4 \end{bmatrix}.$$
22. $\begin{bmatrix} -1 & 0 & 2 \\ 1 & -1 & 0 \\ 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix}$, so that $A\mathbf{x} = \mathbf{b}$, where
$$A = \begin{bmatrix} -1 & 0 & 2 \\ 1 & -1 & 0 \\ 0 & 1 & 1 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 1 \\ -2 \\ -1 \end{bmatrix}.$$
23. The column vectors of B are
$$\mathbf{b}_1 = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \quad \mathbf{b}_2 = \begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix}, \quad \mathbf{b}_3 = \begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix},$$
so the matrix-column representation is
$$A\mathbf{b}_1 = 2\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} - 1\begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 4 \\ -6 \\ 5 \end{bmatrix}$$
$$A\mathbf{b}_2 = 3\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} - 1\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 6\begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -9 \\ -4 \\ 0 \end{bmatrix}$$
$$A\mathbf{b}_3 = 0\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 4\begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -8 \\ 5 \\ -4 \end{bmatrix}.$$
Thus
$$AB = \begin{bmatrix} 4 & -9 & -8 \\ -6 & -4 & 5 \\ 5 & 0 & -4 \end{bmatrix}.$$
24. The row vectors of A are
$$A_1 = \begin{bmatrix} 1 & 0 & -2 \end{bmatrix}, \quad A_2 = \begin{bmatrix} -3 & 1 & 1 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 2 & 0 & -1 \end{bmatrix},$$
so the row-matrix representation is
$$A_1B = 1\begin{bmatrix} 2 & 3 & 0 \end{bmatrix} + 0\begin{bmatrix} 1 & -1 & 1 \end{bmatrix} - 2\begin{bmatrix} -1 & 6 & 4 \end{bmatrix} = \begin{bmatrix} 4 & -9 & -8 \end{bmatrix}$$
$$A_2B = -3\begin{bmatrix} 2 & 3 & 0 \end{bmatrix} + 1\begin{bmatrix} 1 & -1 & 1 \end{bmatrix} + 1\begin{bmatrix} -1 & 6 & 4 \end{bmatrix} = \begin{bmatrix} -6 & -4 & 5 \end{bmatrix}$$
$$A_3B = 2\begin{bmatrix} 2 & 3 & 0 \end{bmatrix} + 0\begin{bmatrix} 1 & -1 & 1 \end{bmatrix} - 1\begin{bmatrix} -1 & 6 & 4 \end{bmatrix} = \begin{bmatrix} 5 & 0 & -4 \end{bmatrix}.$$
Thus
$$AB = \begin{bmatrix} 4 & -9 & -8 \\ -6 & -4 & 5 \\ 5 & 0 & -4 \end{bmatrix}.$$
25. The outer product expansion is
$$\mathbf{a}_1B_1 + \mathbf{a}_2B_2 + \mathbf{a}_3B_3 = \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix}\begin{bmatrix} 2 & 3 & 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\begin{bmatrix} 1 & -1 & 1 \end{bmatrix} + \begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix}\begin{bmatrix} -1 & 6 & 4 \end{bmatrix}$$
$$= \begin{bmatrix} 2 & 3 & 0 \\ -6 & -9 & 0 \\ 4 & 6 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 1 & -1 & 1 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 2 & -12 & -8 \\ -1 & 6 & 4 \\ 1 & -6 & -4 \end{bmatrix} = \begin{bmatrix} 4 & -9 & -8 \\ -6 & -4 & 5 \\ 5 & 0 & -4 \end{bmatrix}.$$
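The outer-product expansion of Exercises 23-25 can be checked mechanically: summing the outer products of A's columns with B's rows reproduces AB. This is a sketch (not part of the text); the helpers `matmul`, `outer`, and `madd` are introduced only for illustration.

```python
# Minimal sketch (not from the text): check that the sum of outer
# products of A's columns with B's rows equals AB (Exercises 23-25).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def outer(col, row):
    """Outer product of a column vector with a row vector."""
    return [[c * r for r in row] for c in col]

def madd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[1, 0, -2], [-3, 1, 1], [2, 0, -1]]
B = [[2, 3, 0], [1, -1, 1], [-1, 6, 4]]

cols_A = [[row[j] for row in A] for j in range(3)]  # columns of A
expansion = [[0] * 3 for _ in range(3)]
for j in range(3):
    expansion = madd(expansion, outer(cols_A[j], B[j]))

print(expansion == matmul(A, B))  # True
```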
26. We have
$$B\mathbf{a}_1 = 1\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} - 3\begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix} + 2\begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix} = \begin{bmatrix} -7 \\ 6 \\ -11 \end{bmatrix}$$
$$B\mathbf{a}_2 = 0\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} + 1\begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix} + 0\begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix}$$
$$B\mathbf{a}_3 = -2\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} + 1\begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix} - 1\begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ -4 \\ 4 \end{bmatrix}.$$
Thus
$$BA = \begin{bmatrix} -7 & 3 & -1 \\ 6 & -1 & -4 \\ -11 & 6 & 4 \end{bmatrix}.$$
27. We have
$$B_1A = 2\begin{bmatrix} 1 & 0 & -2 \end{bmatrix} + 3\begin{bmatrix} -3 & 1 & 1 \end{bmatrix} + 0\begin{bmatrix} 2 & 0 & -1 \end{bmatrix} = \begin{bmatrix} -7 & 3 & -1 \end{bmatrix}$$
$$B_2A = 1\begin{bmatrix} 1 & 0 & -2 \end{bmatrix} - 1\begin{bmatrix} -3 & 1 & 1 \end{bmatrix} + 1\begin{bmatrix} 2 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 6 & -1 & -4 \end{bmatrix}$$
$$B_3A = -1\begin{bmatrix} 1 & 0 & -2 \end{bmatrix} + 6\begin{bmatrix} -3 & 1 & 1 \end{bmatrix} + 4\begin{bmatrix} 2 & 0 & -1 \end{bmatrix} = \begin{bmatrix} -11 & 6 & 4 \end{bmatrix}.$$
Thus
$$BA = \begin{bmatrix} -7 & 3 & -1 \\ 6 & -1 & -4 \\ -11 & 6 & 4 \end{bmatrix}.$$
28. The outer product expansion is
$$\mathbf{b}_1A_1 + \mathbf{b}_2A_2 + \mathbf{b}_3A_3 = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}\begin{bmatrix} 1 & 0 & -2 \end{bmatrix} + \begin{bmatrix} 3 \\ -1 \\ 6 \end{bmatrix}\begin{bmatrix} -3 & 1 & 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 4 \end{bmatrix}\begin{bmatrix} 2 & 0 & -1 \end{bmatrix}$$
$$= \begin{bmatrix} 2 & 0 & -4 \\ 1 & 0 & -2 \\ -1 & 0 & 2 \end{bmatrix} + \begin{bmatrix} -9 & 3 & 3 \\ 3 & -1 & -1 \\ -18 & 6 & 6 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 2 & 0 & -1 \\ 8 & 0 & -4 \end{bmatrix} = \begin{bmatrix} -7 & 3 & -1 \\ 6 & -1 & -4 \\ -11 & 6 & 4 \end{bmatrix}.$$
29. Assume that the columns of $B = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix}$ are linearly dependent. Then there exists a solution to $x_1\mathbf{b}_1 + x_2\mathbf{b}_2 + \cdots + x_n\mathbf{b}_n = \mathbf{0}$ with at least one $x_i \neq 0$. Now, we have
$$AB = A\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix} = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_n \end{bmatrix},$$
so that $A(x_1\mathbf{b}_1 + x_2\mathbf{b}_2 + \cdots + x_n\mathbf{b}_n) = x_1(A\mathbf{b}_1) + x_2(A\mathbf{b}_2) + \cdots + x_n(A\mathbf{b}_n) = \mathbf{0}$, so that the columns of AB are linearly dependent.
30. Let
$$A = \begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \\ \vdots \\ \mathbf{a}_n \end{bmatrix}, \quad \text{so that} \quad AB = \begin{bmatrix} \mathbf{a}_1B \\ \mathbf{a}_2B \\ \vdots \\ \mathbf{a}_nB \end{bmatrix}.$$
Since the rows of A are linearly dependent, there are $x_i$, not all zero, such that $x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n = \mathbf{0}$. But then $(x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n)B = x_1(\mathbf{a}_1B) + x_2(\mathbf{a}_2B) + \cdots + x_n(\mathbf{a}_nB) = \mathbf{0}$, so that the rows of AB are linearly dependent.
31. We have the block structure
$$AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.$$
Now, the blocks $A_{12}$, $A_{21}$, $B_{12}$, and $B_{21}$ are all zero blocks, so any products involving those blocks are zero. Thus AB reduces to
$$AB = \begin{bmatrix} A_{11}B_{11} & O \\ O & A_{22}B_{22} \end{bmatrix}.$$
But
$$A_{11}B_{11} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 3 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 1\cdot 2-1\cdot(-1) & 1\cdot 3-1\cdot 1 \\ 0\cdot 2+1\cdot(-1) & 0\cdot 3+1\cdot 1 \end{bmatrix} = \begin{bmatrix} 3 & 2 \\ -1 & 1 \end{bmatrix}, \quad A_{22}B_{22} = \begin{bmatrix} 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2\cdot 1+3\cdot 1 \end{bmatrix} = \begin{bmatrix} 5 \end{bmatrix}.$$
So finally,
$$AB = \begin{bmatrix} A_{11}B_{11} & O \\ O & A_{22}B_{22} \end{bmatrix} = \begin{bmatrix} 3 & 2 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 5 \end{bmatrix}.$$
32. We have
$$AB = \begin{bmatrix} A_1 & A_2 \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_1B_{11} + A_2B_{21} & A_1B_{12} + A_2B_{22} \end{bmatrix}.$$
But $B_{11}$ is the zero matrix, and both $A_2$ and $B_{12}$ are the identity matrix $I_2$, so this expression simplifies to
$$AB = \begin{bmatrix} I_2B_{21} & A_1I_2 + I_2B_{22} \end{bmatrix} = \begin{bmatrix} B_{21} & A_1 + B_{22} \end{bmatrix} = \begin{bmatrix} 1 & 7 & 7 \\ -2 & 7 & 7 \end{bmatrix}.$$
33. We have the block structure
$$AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.$$
Now, $B_{21}$ is the zero matrix, so products involving that block are zero. Further, $A_{12}$, $A_{21}$, and $B_{11}$ are all the identity matrix $I_2$. So this product simplifies to
$$AB = \begin{bmatrix} A_{11}I_2 & A_{11}B_{12} + I_2B_{22} \\ I_2I_2 & I_2B_{12} + A_{22}B_{22} \end{bmatrix} = \begin{bmatrix} A_{11} & A_{11}B_{12} + B_{22} \\ I_2 & B_{12} + A_{22}B_{22} \end{bmatrix}.$$
The remaining products are
$$A_{11}B_{12} + B_{22} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix} + \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 5 & 3 \end{bmatrix}$$
$$B_{12} + A_{22}B_{22} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix}.$$
Thus
$$AB = \begin{bmatrix} 1 & 2 & 2 & 0 \\ 3 & 4 & 5 & 3 \\ 1 & 0 & 1 & 2 \\ 0 & 1 & 0 & -1 \end{bmatrix}.$$
34. We have the block structure
$$AB = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.$$
Now, $A_{11}$ is the identity matrix $I_3$, and $A_{21}$ is the zero matrix. Further, $B_{22}$ is a 1 × 1 matrix, as is $A_{22}$, so they both behave just like scalar multiplication. Thus this product reduces to
$$AB = \begin{bmatrix} I_3B_{11} + A_{12}B_{21} & I_3B_{12} - A_{12} \\ 4B_{21} & 4\cdot(-1) \end{bmatrix} = \begin{bmatrix} B_{11} + A_{12}B_{21} & B_{12} - A_{12} \\ 4B_{21} & -4 \end{bmatrix}.$$
Compute the matrices in the first row:
$$B_{11} + A_{12}B_{21} = B_{11} + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 3 & 3 \end{bmatrix} = \begin{bmatrix} 2 & 3 & 4 \\ 2 & 3 & 6 \\ 3 & 3 & 4 \end{bmatrix}$$
$$B_{12} - A_{12} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \\ -2 \end{bmatrix}.$$
So
$$AB = \begin{bmatrix} 2 & 3 & 4 & 0 \\ 2 & 3 & 6 & -1 \\ 3 & 3 & 4 & -2 \\ 4 & 4 & 4 & -4 \end{bmatrix}.$$
35. (a) Computing the powers of A as required, we get
$$A^2 = \begin{bmatrix} -1 & 1 \\ -1 & 0 \end{bmatrix}, \quad A^3 = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad A^4 = \begin{bmatrix} 0 & -1 \\ 1 & -1 \end{bmatrix},$$
$$A^5 = \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}, \quad A^6 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2, \quad A^7 = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix} = A.$$
Note that $A^6 = I_2$ and $A^7 = A$.
(b) Since $A^6 = I_2$, we know that $A^{6k} = (A^6)^k = I_2^k = I_2$ for any positive integer value of k. Since 2015 = 2010 + 5 = 6 · 335 + 5, we have
$$A^{2015} = A^{6\cdot 335+5} = A^{6\cdot 335}A^5 = I_2A^5 = A^5 = \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}.$$
36. We have
$$B^2 = \begin{bmatrix} -\frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \end{bmatrix}^2 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \quad B^3 = B^2B = \begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix}, \quad B^4 = (B^2)^2 = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad B^8 = (B^4)^2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2.$$
Thus any power of B whose exponent is divisible by 8 is equal to $I_2$. Now, 2015 = 8 · 251 + 7, so that
$$B^{2015} = B^{8\cdot 251}B^7 = B^7 = B^4B^3 = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} = \begin{bmatrix} -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \end{bmatrix}.$$
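The periodicity argument in Exercise 35 can be checked directly by brute force. This is a sketch (not part of the text), assuming $A = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix}$, which reproduces the powers listed in part (a):

```python
# Minimal sketch (not from the text): check the power cycle of
# Exercise 35 -- A^6 = I, hence A^2015 = A^5 -- assuming
# A = [[0, 1], [-1, 1]], consistent with the powers in part (a).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def power(M, n):
    """Repeated multiplication starting from the 2 x 2 identity."""
    R = [[1, 0], [0, 1]]
    for _ in range(n):
        R = matmul(R, M)
    return R

A = [[0, 1], [-1, 1]]
print(power(A, 6))                    # [[1, 0], [0, 1]]
print(power(A, 2015) == power(A, 5))  # True
```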
37. Using induction, we show that $A^n = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}$ for n ≥ 1. The base case, n = 1, is just the definition of $A^1 = A$. Now assume that this equation holds for k. Then by the inductive hypothesis,
$$A^{k+1} = A^1A^k = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & k+1 \\ 0 & 1 \end{bmatrix}.$$
So by induction, the formula holds for all n ≥ 1.
38. (a)
$$A^2 = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta-\sin^2\theta & -2\cos\theta\sin\theta \\ 2\cos\theta\sin\theta & \cos^2\theta-\sin^2\theta \end{bmatrix}.$$
But $\cos^2\theta - \sin^2\theta = \cos 2\theta$ and $2\cos\theta\sin\theta = \sin 2\theta$, so that we get
$$A^2 = \begin{bmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{bmatrix}.$$
(b) We have already proven the cases n = 1 and n = 2, which serve as the basis for the induction. Now assume that the given formula holds for k. Then by the inductive hypothesis,
$$A^{k+1} = A^1A^k = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos k\theta & -\sin k\theta \\ \sin k\theta & \cos k\theta \end{bmatrix} = \begin{bmatrix} \cos\theta\cos k\theta-\sin\theta\sin k\theta & -(\cos\theta\sin k\theta+\sin\theta\cos k\theta) \\ \sin\theta\cos k\theta+\cos\theta\sin k\theta & \cos\theta\cos k\theta-\sin\theta\sin k\theta \end{bmatrix}.$$
But $\cos\theta\cos k\theta - \sin\theta\sin k\theta = \cos(\theta + k\theta) = \cos(k+1)\theta$, and $\sin\theta\cos k\theta + \cos\theta\sin k\theta = \sin(\theta + k\theta) = \sin(k+1)\theta$, so this matrix becomes
$$A^{k+1} = \begin{bmatrix} \cos(k+1)\theta & -\sin(k+1)\theta \\ \sin(k+1)\theta & \cos(k+1)\theta \end{bmatrix}.$$
So by induction, the formula holds for all k ≥ 1.
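The rotation-matrix formula just proved can be spot-checked numerically for a particular angle and exponent. This is a sketch (not part of the text); the angle 0.3 and exponent 7 are arbitrary choices:

```python
# Minimal sketch (not from the text): numerically spot-check the
# formula A^n = [[cos n0, -sin n0], [sin n0, cos n0]] of Exercise 38.
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

theta, n = 0.3, 7
A_n = [[1.0, 0.0], [0.0, 1.0]]       # start from the identity
for _ in range(n):
    A_n = matmul(A_n, rotation(theta))

expected = rotation(n * theta)       # single rotation by n*theta
ok = all(abs(A_n[i][j] - expected[i][j]) < 1e-12
         for i in range(2) for j in range(2))
print(ok)  # True
```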
39. (a) $a_{ij} = (-1)^{i+j}$ means that if i + j is even, then the corresponding entry is 1; otherwise, it is −1.
$$A = \begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix}.$$
(b) Each entry is its row number minus its column number:
$$A = \begin{bmatrix} 0 & -1 & -2 & -3 \\ 1 & 0 & -1 & -2 \\ 2 & 1 & 0 & -1 \\ 3 & 2 & 1 & 0 \end{bmatrix}.$$
(c)
$$A = \begin{bmatrix} (1-1)^1 & (1-1)^2 & (1-1)^3 & (1-1)^4 \\ (2-1)^1 & (2-1)^2 & (2-1)^3 & (2-1)^4 \\ (3-1)^1 & (3-1)^2 & (3-1)^3 & (3-1)^4 \\ (4-1)^1 & (4-1)^2 & (4-1)^3 & (4-1)^4 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 2 & 4 & 8 & 16 \\ 3 & 9 & 27 & 81 \end{bmatrix}.$$
(d) With $a_{ij} = \sin\frac{(i+j-1)\pi}{4}$,
$$A = \begin{bmatrix} \sin\frac{\pi}{4} & \sin\frac{\pi}{2} & \sin\frac{3\pi}{4} & \sin\pi \\ \sin\frac{\pi}{2} & \sin\frac{3\pi}{4} & \sin\pi & \sin\frac{5\pi}{4} \\ \sin\frac{3\pi}{4} & \sin\pi & \sin\frac{5\pi}{4} & \sin\frac{3\pi}{2} \\ \sin\pi & \sin\frac{5\pi}{4} & \sin\frac{3\pi}{2} & \sin\frac{7\pi}{4} \end{bmatrix} = \begin{bmatrix} \frac{\sqrt 2}{2} & 1 & \frac{\sqrt 2}{2} & 0 \\ 1 & \frac{\sqrt 2}{2} & 0 & -\frac{\sqrt 2}{2} \\ \frac{\sqrt 2}{2} & 0 & -\frac{\sqrt 2}{2} & -1 \\ 0 & -\frac{\sqrt 2}{2} & -1 & -\frac{\sqrt 2}{2} \end{bmatrix}.$$
40. (a) This matrix has zeros whenever the row number exceeds the column number, and the nonzero entries are the sum of the row and column numbers:
$$A = \begin{bmatrix} 2 & 3 & 4 & 5 & 6 & 7 \\ 0 & 4 & 5 & 6 & 7 & 8 \\ 0 & 0 & 6 & 7 & 8 & 9 \\ 0 & 0 & 0 & 8 & 9 & 10 \\ 0 & 0 & 0 & 0 & 10 & 11 \\ 0 & 0 & 0 & 0 & 0 & 12 \end{bmatrix}.$$
(b) This matrix has 1's on the main diagonal and one diagonal away from the main diagonal, and zeros elsewhere:
$$A = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}.$$
(c) This matrix is zero except where the row number plus the column number is 6, 7, or 8:
$$A = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
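Matrices defined by an entry formula, as in Exercises 39 and 40, are naturally built with a nested comprehension. This is a sketch (not part of the text); the `build` helper is introduced only for illustration.

```python
# Minimal sketch (not from the text): build the matrices of
# Exercise 39 directly from their entry formulas a_ij.

def build(n, f):
    """n x n matrix with entry f(i, j), using 1-based indices."""
    return [[f(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]

part_a = build(4, lambda i, j: (-1) ** (i + j))
part_b = build(4, lambda i, j: i - j)
part_c = build(4, lambda i, j: (i - 1) ** j)

print(part_b)
# [[0, -1, -2, -3], [1, 0, -1, -2], [2, 1, 0, -1], [3, 2, 1, 0]]
```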
41. Let A be an m × n matrix with rows a1 , a2 , . . . , am . Since ei has a 1 in the ith position and zeros
elsewhere, we get
ei A = 0 · a1 + 0 · a2 + · · · + 1 · ai + · · · + 0 · am = ai ,
which is the ith row of A.
3.2 Matrix Algebra
1. Since X − 2A + 3B = O, we have X = 2A − 3B, so
$$X = 2\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - 3\begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2\cdot 1-3\cdot(-1) & 2\cdot 2-3\cdot 0 \\ 2\cdot 3-3\cdot 1 & 2\cdot 4-3\cdot 1 \end{bmatrix} = \begin{bmatrix} 5 & 4 \\ 3 & 5 \end{bmatrix}.$$
2. Since 2X = A − B, we have $X = \frac{1}{2}(A - B)$, so
$$X = \frac{1}{2}\left(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - \begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix}\right) = \frac{1}{2}\begin{bmatrix} 2 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & \frac{3}{2} \end{bmatrix}.$$
3. Solving for X gives
$$X = \frac{2}{3}(A + 2B) = \frac{2}{3}\left(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + 2\begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix}\right) = \frac{2}{3}\begin{bmatrix} -1 & 2 \\ 5 & 6 \end{bmatrix} = \begin{bmatrix} -\frac{2}{3} & \frac{4}{3} \\ \frac{10}{3} & 4 \end{bmatrix}.$$
4. The given equation is 2A − 2B + 2X = 3X − 3A; solving for X gives X = 5A − 2B, so
$$X = 5\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} - 2\begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 7 & 10 \\ 13 & 18 \end{bmatrix}.$$
5. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & 0 & 2 \\ 2 & 1 & 5 \\ -1 & 2 & 0 \\ 1 & 1 & 3 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus
$$\begin{bmatrix} 2 & 5 \\ 0 & 3 \end{bmatrix} = 2\begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ 2 & 1 \end{bmatrix}.$$
6. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 0 & -1 & 1 \\ 0 & 1 & 0 & -4 \\ 1 & 0 & 1 & 2 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & -4 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$
so that
$$\begin{bmatrix} 2 & 1 \\ -4 & 2 \end{bmatrix} = 3\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - 4\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} - \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}.$$
7. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & -1 & 1 & 3 \\ 0 & 2 & 1 & 1 \\ -1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
Since the last column contains a pivot, the system is inconsistent, and B is not a linear combination of A1, A2, and A3.
8. As in Example 3.16, we form a matrix whose columns consist of the entries of the matrices in question and row-reduce it. The reduction leads to
$$\begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
(together with additional zero rows). Since the linear system is consistent, we conclude that B = A1 + 2A2 + 3A3 + 4A4.
9. As in Example 3.17, let $\begin{bmatrix} w & x \\ y & z \end{bmatrix}$ be an element in the span of the matrices. Then form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & 0 & w \\ 2 & 1 & x \\ -1 & 2 & y \\ 1 & 1 & z \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & w \\ 0 & 1 & x-2w \\ 0 & 0 & y+5w-2x \\ 0 & 0 & z+w-x \end{bmatrix}.$$
This gives the restrictions y + 5w − 2x = 0, so that y = 2x − 5w, and z + w − x = 0, so that z = x − w. So the span consists of matrices of the form
$$\begin{bmatrix} w & x \\ 2x-5w & x-w \end{bmatrix}.$$
10. As in Example 3.17, let $\begin{bmatrix} w & x \\ y & z \end{bmatrix}$ be an element in the span of the matrices. Then form a matrix whose columns consist of the entries of the matrices in question and row-reduce it. The reduction leaves pivots in the first three columns and a final row $\begin{bmatrix} 0 & 0 & 0 & z-w \end{bmatrix}$. This gives the single restriction z − w = 0, so that z = w. So the span consists of matrices of the form
$$\begin{bmatrix} w & x \\ y & w \end{bmatrix}.$$
11. As in Example 3.17, let $\begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$ be an element in the span of the matrices. Then form a matrix whose columns consist of the entries of the matrices in question and row-reduce it:
$$\begin{bmatrix} 1 & -1 & 1 & a \\ 0 & 2 & 1 & b \\ -1 & 0 & 1 & c \\ 0 & 0 & 0 & d \\ 1 & 1 & 0 & e \\ 0 & 0 & 0 & f \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & e \\ 0 & 1 & 0 & c+e \\ 0 & 0 & 1 & b-2c-2e \\ 0 & 0 & 0 & a+3b-4c-5e \\ 0 & 0 & 0 & d \\ 0 & 0 & 0 & f \end{bmatrix}.$$
We get the restrictions a + 3b − 4c − 5e = 0, so that a = −3b + 4c + 5e, as well as d = f = 0. Thus the span consists of matrices of the form
$$\begin{bmatrix} -3b+4c+5e & b & c \\ 0 & e & 0 \end{bmatrix}.$$
12. As in Example 3.17, let $\begin{bmatrix} x_1 & x_2 & x_3 \\ x_4 & x_5 & x_6 \\ x_7 & x_8 & x_9 \end{bmatrix}$ be an element in the span of the matrices. Then form a matrix whose columns consist of the entries of the matrices in question and row-reduce it. We get the restrictions
$$x_6 = x_2, \quad x_9 = x_1, \quad x_4 = x_7 = x_8 = 0.$$
Thus the span consists of matrices of the form
$$\begin{bmatrix} x_1 & x_2 & x_3 \\ 0 & x_5 & x_2 \\ 0 & 0 & x_1 \end{bmatrix}.$$
13. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:
$$\begin{bmatrix} 1 & 4 & 0 \\ 2 & 3 & 0 \\ 3 & 2 & 0 \\ 4 & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Since the only solution is the trivial solution, we conclude that these matrices are linearly independent.
14. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:
$$\begin{bmatrix} 1 & 2 & 1 & 0 \\ 2 & 1 & 1 & 0 \\ 4 & -1 & 1 & 0 \\ 3 & 0 & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & \frac{1}{3} & 0 \\ 0 & 1 & \frac{1}{3} & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Since the system has a nontrivial solution, the original matrices are not linearly independent; in fact, from the solutions we get from the row-reduced matrix,
$$\frac{1}{3}\begin{bmatrix} 1 & 2 \\ 4 & 3 \end{bmatrix} + \frac{1}{3}\begin{bmatrix} 2 & 1 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.$$
15. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it:
$$\begin{bmatrix} 0 & 1 & -2 & -1 & 0 \\ 1 & 0 & -1 & -3 & 0 \\ 5 & 2 & 0 & 1 & 0 \\ 2 & 3 & 1 & 9 & 0 \\ -1 & 1 & 0 & 4 & 0 \\ 0 & 1 & 2 & 5 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
Since the only solution is the trivial solution, we conclude that these matrices are linearly independent.
16. Following Example 3.18, we construct an augmented matrix whose columns are the entries of the given matrices and with zeros in the constant column, and row-reduce it. The row reduction produces a pivot in each of the four coefficient columns, so the only solution is the trivial solution, and the matrices are linearly independent.
17. Let A, B, and C be m × n matrices.
(a) For every i and j, $[A+B]_{ij} = a_{ij} + b_{ij} = b_{ij} + a_{ij} = [B+A]_{ij}$, since addition of real numbers is commutative. Thus A + B = B + A.
(b) For every i and j, $[(A+B)+C]_{ij} = (a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij}) = [A+(B+C)]_{ij}$, since addition of real numbers is associative. Thus (A + B) + C = A + (B + C).
(c) For every i and j, $[A+O]_{ij} = a_{ij} + 0 = a_{ij} = [A]_{ij}$, so A + O = A.
(d) For every i and j, $[A+(-A)]_{ij} = a_{ij} - a_{ij} = 0$, so A + (−A) = O.
18. Let A and B be m × n matrices, and c and d be scalars.
(a) For every i and j, $[c(A+B)]_{ij} = c(a_{ij} + b_{ij}) = ca_{ij} + cb_{ij} = [cA]_{ij} + [cB]_{ij}$, so c(A + B) = cA + cB.
(b) For every i and j, $[(c+d)A]_{ij} = (c+d)a_{ij} = ca_{ij} + da_{ij} = [cA]_{ij} + [dA]_{ij}$, so (c + d)A = cA + dA.
(c) For every i and j, $[c(dA)]_{ij} = c(da_{ij}) = (cd)a_{ij} = [(cd)A]_{ij}$, so c(dA) = (cd)A.
(d) For every i and j, $[1A]_{ij} = 1\cdot a_{ij} = a_{ij} = [A]_{ij}$, so 1A = A.
19. Let A and B be r × n matrices, and C be an n × s matrix, so that the products make sense. Then for every i and j,
$$[(A+B)C]_{ij} = \sum_{k=1}^{n}(a_{ik}+b_{ik})c_{kj} = \sum_{k=1}^{n}a_{ik}c_{kj} + \sum_{k=1}^{n}b_{ik}c_{kj} = [AC]_{ij} + [BC]_{ij},$$
so that (A + B)C = AC + BC.
20. Let A be an r × n matrix and B be an n × s matrix, so that the product makes sense, and let k be a scalar. Then for every i and j,
$$[k(AB)]_{ij} = k\sum_{l=1}^{n}a_{il}b_{lj} = \sum_{l=1}^{n}(ka_{il})b_{lj} = [(kA)B]_{ij},$$
which proves the first equality. Regrouping the middle expression differently, we get
$$[k(AB)]_{ij} = \sum_{l=1}^{n}a_{il}(kb_{lj}) = [A(kB)]_{ij},$$
so that k(AB) = (kA)B = A(kB).
21. If A is m × n, then to prove $I_mA = A$, note that
$$I_m = \begin{bmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \vdots \\ \mathbf{e}_m \end{bmatrix},$$
where $\mathbf{e}_i$ is a standard (row) unit vector; then
$$I_mA = \begin{bmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \vdots \\ \mathbf{e}_m \end{bmatrix}A = \begin{bmatrix} \mathbf{e}_1A \\ \mathbf{e}_2A \\ \vdots \\ \mathbf{e}_mA \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} = A.$$
22. Note that
$$(A-B)(A+B) = A(A+B) - B(A+B) = A^2 + AB - BA - B^2 = (A^2 - B^2) + (AB - BA),$$
so that $(A-B)(A+B) = A^2 - B^2$ if and only if AB − BA = O; that is, if and only if AB = BA.
23. Compute both AB and BA:
$$AB = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a+c & b+d \\ c & d \end{bmatrix}, \quad BA = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & a+b \\ c & c+d \end{bmatrix}.$$
So if AB = BA, then a + c = a, b + d = a + b, c = c, and d = c + d. The first equation gives c = 0 and the second gives d = a. So the matrices that commute with A are matrices of the form
$$\begin{bmatrix} a & b \\ 0 & a \end{bmatrix}.$$
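The conclusion of Exercise 23 can be spot-checked by brute force over small integer entries. This is a sketch (not part of the text); the search range −2..2 is an arbitrary choice:

```python
# Minimal sketch (not from the text): every matrix [[a, b], [0, a]]
# commutes with A = [[1, 1], [0, 1]] (Exercise 23).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 1], [0, 1]]
commuting = []
for a in range(-2, 3):
    for b in range(-2, 3):
        B = [[a, b], [0, a]]
        if matmul(A, B) == matmul(B, A):
            commuting.append((a, b))

# Every matrix of this form commutes with A:
print(len(commuting))  # 25  (all 5 x 5 choices of a and b)
```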
24. Compute both AB and BA:
$$AB = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a-c & b-d \\ -a+c & -b+d \end{bmatrix}, \quad BA = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} a-b & -a+b \\ c-d & -c+d \end{bmatrix}.$$
So if AB = BA, then a − c = a − b, b − d = −a + b, −a + c = c − d, and −b + d = −c + d. The first (or fourth) equation gives b = c and the second (or third) gives d = a. So the matrices that commute with A are matrices of the form
$$\begin{bmatrix} a & b \\ b & a \end{bmatrix}.$$
25. Compute both AB and BA:
$$AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a+2c & b+2d \\ 3a+4c & 3b+4d \end{bmatrix}, \quad BA = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} a+3b & 2a+4b \\ c+3d & 2c+4d \end{bmatrix}.$$
So if AB = BA, then a + 2c = a + 3b, b + 2d = 2a + 4b, 3a + 4c = c + 3d, and 3b + 4d = 2c + 4d. The first (or fourth) equation gives 3b = 2c and the third gives a = d − c. Substituting a = d − c into the second equation gives 3b = 2c again. Thus those are the only two conditions, and the matrices that commute with A are those of the form
$$\begin{bmatrix} d-c & \frac{2}{3}c \\ c & d \end{bmatrix}.$$
26. Let $A_1$ be the first matrix and $A_4$ be the second. Then
$$A_1B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}, \quad BA_1 = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix},$$
$$A_4B = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ c & d \end{bmatrix}, \quad BA_4 = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & b \\ 0 & d \end{bmatrix}.$$
So the conditions imposed by requiring that $A_1B = BA_1$ and $A_4B = BA_4$ are b = c = 0. Thus the matrices that commute with both $A_1$ and $A_4$ are those of the form
$$\begin{bmatrix} a & 0 \\ 0 & d \end{bmatrix}.$$
27. Since we want B to commute with every 2 × 2 matrix, in particular it must commute with $A_1$ and $A_4$ from the previous exercise, so that $B = \begin{bmatrix} a & 0 \\ 0 & d \end{bmatrix}$. In addition, it must commute with
$$A_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad A_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}.$$
Computing the products, we get
$$A_2B = \begin{bmatrix} 0 & d \\ 0 & 0 \end{bmatrix}, \quad BA_2 = \begin{bmatrix} 0 & a \\ 0 & 0 \end{bmatrix}, \quad A_3B = \begin{bmatrix} 0 & 0 \\ a & 0 \end{bmatrix}, \quad BA_3 = \begin{bmatrix} 0 & 0 \\ d & 0 \end{bmatrix}.$$
The condition that arises from these equalities is a = d. Thus
$$B = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix}.$$
Finally, note that for any 2 × 2 matrix M,
$$M = \begin{bmatrix} x & y \\ z & w \end{bmatrix} = xA_1 + yA_2 + zA_3 + wA_4,$$
so that if B commutes with $A_1$, $A_2$, $A_3$, and $A_4$, then it will commute with M. So the matrices B that satisfy this condition are the matrices above: those where the off-diagonal entries are zero and the diagonal entries are the same.
28. Suppose A is an m × n matrix and B is an a × b matrix. Since AB is defined, the number of columns
(n) of A must equal the number of rows (a) of B; since BA is defined, the number of columns (b) of
B must equal the number of rows (m) of A. Since m = b and n = a, we see that A is m × n and B is
n × m. But then AB is m × m and BA is n × n, so both are square.
29. Suppose A = (aij ) and B = (bij ) are both upper triangular n × n matrices. This means that aij =
bij = 0 if i > j. But if C = (cij ) = AB, then
ckl = ak1 b1l + ak2 b2l + · · · + akl bll + ak(l+1) b(l+1)l + · · · + akn bnl .
If k > l, then ak1 b1l + ak2 b2l + · · · + akl bll = 0 since each of the akj is zero. But also ak(l+1) b(l+1)l +
· · · + akn bnl = 0 since each of the bil is zero. Thus ckl = 0 for k > l, so that C is upper triangular.
30. Let A and B be matrices whose sizes permit the indicated operations, and let k be a scalar. Denote the ijth entry of a matrix X by $[X]_{ij}$. Then:
Theorem 3.4(a): For any i and j, $[(A^T)^T]_{ij} = [A^T]_{ji} = [A]_{ij}$, so that $(A^T)^T = A$.
Theorem 3.4(b): For any i and j, $[(A+B)^T]_{ij} = [A+B]_{ji} = [A]_{ji} + [B]_{ji} = [A^T]_{ij} + [B^T]_{ij}$, so that $(A+B)^T = A^T + B^T$.
Theorem 3.4(c): For any i and j, $[(kA)^T]_{ij} = [kA]_{ji} = k[A]_{ji} = k[A^T]_{ij}$, so that $(kA)^T = k(A^T)$.
31. We prove that $(A^r)^T = (A^T)^r$ by induction on r. For r = 1, this says that $(A^1)^T = (A^T)^1$, which is clear. Now assume that $(A^k)^T = (A^T)^k$. Then
$$(A^{k+1})^T = (A\cdot A^k)^T = (A^k)^TA^T = (A^T)^kA^T = (A^T)^{k+1}.$$
(The second equality is from Theorem 3.4(d), and the third is the inductive assumption.)
32. We prove that $(A_1 + A_2 + \cdots + A_n)^T = A_1^T + A_2^T + \cdots + A_n^T$ by induction on n. For n = 1, this says that $A_1^T = A_1^T$, which is clear. Now assume that $(A_1 + A_2 + \cdots + A_k)^T = A_1^T + A_2^T + \cdots + A_k^T$. Then
$$(A_1 + \cdots + A_k + A_{k+1})^T = ((A_1 + \cdots + A_k) + A_{k+1})^T = (A_1 + \cdots + A_k)^T + A_{k+1}^T = A_1^T + \cdots + A_k^T + A_{k+1}^T,$$
using Theorem 3.4(b) and then the inductive assumption.
33. We prove that $(A_1A_2\cdots A_n)^T = A_n^TA_{n-1}^T\cdots A_2^TA_1^T$ by induction on n. For n = 1, this says that $A_1^T = A_1^T$, which is clear. Now assume that $(A_1A_2\cdots A_k)^T = A_k^TA_{k-1}^T\cdots A_2^TA_1^T$. Then
$$(A_1A_2\cdots A_kA_{k+1})^T = ((A_1A_2\cdots A_k)A_{k+1})^T = A_{k+1}^T(A_1A_2\cdots A_k)^T = A_{k+1}^TA_k^T\cdots A_2^TA_1^T,$$
using Theorem 3.4(d) and then the inductive assumption.
34. Using Theorem 3.4, we have (AAT )T = (AT )T AT = AAT and (AT A)T = AT (AT )T = AT A. So each
of AAT and AT A is equal to its own transpose, so both matrices are symmetric.
35. Let A and B be symmetric n × n matrices, and let k be a scalar.
(a) By Theorem 3.4(b), (A + B)T = AT + B T ; since both A and B are symmetric, this is equal to
A + B. Thus A + B is equal to its transpose, so is symmetric.
(b) By Theorem 3.4(c), (kA)T = kAT ; since A is symmetric, this is equal to kA. Thus kA is equal
to its own transpose, so is symmetric
1 1
0 1
36. (a) For example, let n = 2, with A =
and B =
. Then both A and B are symmetric,
1 0
1 0
but
1 1 0 1
1 1
AB =
=
,
1 0 1 0
0 1
which is not symmetric.
(b) Suppose A and B are symmetric. Then if AB is symmetric, we have
AB = (AB)T = B T AT = BA
using Theorem 3.4(d) and the fact that A and B are symmetric. In the other direction, if A and
B are symmetric and AB = BA, then
(AB)T = B T AT = BA = AB,
so that AB is symmetric.
37. (a) $A^T = \begin{bmatrix} 1 & -2 \\ 2 & 3 \end{bmatrix}$ while $-A = \begin{bmatrix} -1 & -2 \\ 2 & -3 \end{bmatrix}$, so $A^T \neq -A$ and A is not skew-symmetric.
(b) $A^T = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} = -A$, so A is skew-symmetric.
(c) $A^T = \begin{bmatrix} 0 & -3 & 1 \\ 3 & 0 & -2 \\ -1 & 2 & 0 \end{bmatrix} = -A$, so A is skew-symmetric.
(d) $A^T = \begin{bmatrix} 0 & -1 & 2 \\ 1 & 0 & 5 \\ 2 & 5 & 0 \end{bmatrix} \neq -A$, so A is not skew-symmetric.
38. A matrix A is skew-symmetric if AT = −A. But this means that for every i and j, (AT )ij = Aji = −Aij .
Thus the components must satisfy aij = −aji for every i and j.
39. By the previous exercise, we must have aij = −aji for every i and j. So when i = j, we get aii = −aii ,
which implies that aii = 0 for all i.
40. If A and B are skew-symmetric, so that AT = −A and B T = −B, then
(A + B)T = AT + B T = −A − B = −(A + B),
so that A + B is also skew-symmetric.
41. Suppose that $A = \begin{bmatrix} 0 & a \\ -a & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & b \\ -b & 0 \end{bmatrix}$ are skew-symmetric. Then
$$AB = \begin{bmatrix} -ab & 0 \\ 0 & -ab \end{bmatrix}, \quad \text{and thus} \quad (AB)^T = \begin{bmatrix} -ab & 0 \\ 0 & -ab \end{bmatrix}.$$
So $AB = -(AB)^T$ only if ab = −ab, which happens only if either a or b is zero. Thus either A or B must be the zero matrix.
42. We must show that A−AT is skew-symmetric, so we must show that (A−AT )T = −(A−AT ) = AT −A.
But
(A − AT )T = AT − (AT )T = AT − A
as desired, proving the result.
43. (a) If A is a square matrix, let $S = A + A^T$ and $S' = A - A^T$. Then clearly $A = \frac{1}{2}S + \frac{1}{2}S'$. S is symmetric by Theorem 3.5, so $\frac{1}{2}S$ is as well. S′ is skew-symmetric by Exercise 42, so $\frac{1}{2}S'$ is as well. Thus A is a sum of a symmetric and a skew-symmetric matrix.
(b) We have
$$S = A + A^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} + \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} = \begin{bmatrix} 2 & 6 & 10 \\ 6 & 10 & 14 \\ 10 & 14 & 18 \end{bmatrix}$$
$$S' = A - A^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} - \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} = \begin{bmatrix} 0 & -2 & -4 \\ 2 & 0 & -2 \\ 4 & 2 & 0 \end{bmatrix}.$$
Thus
$$A = \frac{1}{2}S + \frac{1}{2}S' = \begin{bmatrix} 1 & 3 & 5 \\ 3 & 5 & 7 \\ 5 & 7 & 9 \end{bmatrix} + \begin{bmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \\ 2 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}.$$
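The decomposition of Exercise 43 is easy to verify programmatically. This is a sketch (not part of the text); `transpose`, `combine`, and `scale` are small hypothetical helpers, and `Fraction` keeps the halving exact:

```python
# Minimal sketch (not from the text): the symmetric/skew-symmetric
# decomposition A = (1/2)S + (1/2)S' from Exercise 43.
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def combine(M, N, sign):
    """Entrywise M + sign*N."""
    return [[a + sign * b for a, b in zip(r, s)] for r, s in zip(M, N)]

def scale(k, M):
    return [[k * a for a in row] for row in M]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
S = combine(A, transpose(A), 1)    # A + A^T, symmetric
Sp = combine(A, transpose(A), -1)  # A - A^T, skew-symmetric

half = Fraction(1, 2)
recovered = combine(scale(half, S), scale(half, Sp), 1)
print(recovered == A)  # True
```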
44. Let A and B be n × n matrices, and k be a scalar. Then
(a)
tr(A + B) = (a11 + b11 ) + (a22 + b22 ) + · · · + (ann + bnn )
= (a11 + a22 + · · · + ann ) + (b11 + b22 + · · · + bnn ) = tr(A) + tr(B).
(b) tr(kA) = ka11 + ka22 + · · · + kann = k(a11 + a22 + · · · + ann ) = k tr(A).
45. Let A and B be n × n matrices. Then
$$\operatorname{tr}(AB) = (AB)_{11} + (AB)_{22} + \cdots + (AB)_{nn}$$
$$= (a_{11}b_{11} + a_{12}b_{21} + \cdots + a_{1n}b_{n1}) + (a_{21}b_{12} + a_{22}b_{22} + \cdots + a_{2n}b_{n2}) + \cdots + (a_{n1}b_{1n} + a_{n2}b_{2n} + \cdots + a_{nn}b_{nn})$$
$$= (b_{11}a_{11} + b_{12}a_{21} + \cdots + b_{1n}a_{n1}) + (b_{21}a_{12} + b_{22}a_{22} + \cdots + b_{2n}a_{n2}) + \cdots + (b_{n1}a_{1n} + b_{n2}a_{2n} + \cdots + b_{nn}a_{nn})$$
$$= \operatorname{tr}(BA).$$
46. Note that the ii entry of $AA^T$ is equal to
$$(AA^T)_{ii} = (A)_{i1}(A^T)_{1i} + (A)_{i2}(A^T)_{2i} + \cdots + (A)_{in}(A^T)_{ni} = a_{i1}^2 + a_{i2}^2 + \cdots + a_{in}^2,$$
so that
$$\operatorname{tr}(AA^T) = (a_{11}^2 + a_{12}^2 + \cdots + a_{1n}^2) + (a_{21}^2 + a_{22}^2 + \cdots + a_{2n}^2) + \cdots + (a_{n1}^2 + a_{n2}^2 + \cdots + a_{nn}^2).$$
That is, it is the sum of the squares of the entries of A.
47. Note that by Exercise 45, tr(AB) = tr(BA). By Exercise 44, then,
tr(AB − BA) = tr(AB) − tr(BA) = tr(AB) − tr(AB) = 0.
Since tr(I2 ) = 2, it cannot be the case that AB − BA = I2 .
3.3 The Inverse of a Matrix
1. Using Theorem 3.8, note that
$$\det\begin{bmatrix} 4 & 7 \\ 1 & 2 \end{bmatrix} = ad - bc = 4\cdot 2 - 1\cdot 7 = 1,$$
so that
$$\begin{bmatrix} 4 & 7 \\ 1 & 2 \end{bmatrix}^{-1} = \frac{1}{1}\begin{bmatrix} 2 & -7 \\ -1 & 4 \end{bmatrix} = \begin{bmatrix} 2 & -7 \\ -1 & 4 \end{bmatrix}.$$
As a check,
$$\begin{bmatrix} 4 & 7 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 2 & -7 \\ -1 & 4 \end{bmatrix} = \begin{bmatrix} 4\cdot 2+7\cdot(-1) & 4\cdot(-7)+7\cdot 4 \\ 1\cdot 2+2\cdot(-1) & 1\cdot(-7)+2\cdot 4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
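The 2 × 2 inverse formula of Theorem 3.8 is short enough to implement directly. This is a sketch (not part of the text); the `inv2` helper is introduced only for illustration, and `Fraction` keeps the arithmetic exact:

```python
# Minimal sketch (not from the text): the 2 x 2 inverse formula of
# Theorem 3.8, (1/det) * [[d, -b], [-c, a]], with exact arithmetic.
from fractions import Fraction

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    f = Fraction(1, det)
    return [[f * d, -f * b], [-f * c, f * a]]

A = [[4, 7], [1, 2]]
Ainv = inv2(A)
print(Ainv == [[2, -7], [-1, 4]])  # True
```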
2. Using Theorem 3.8, note that
$$\det\begin{bmatrix} 4 & -2 \\ 2 & 0 \end{bmatrix} = ad - bc = 4\cdot 0 - (-2)\cdot 2 = 4,$$
so that
$$\begin{bmatrix} 4 & -2 \\ 2 & 0 \end{bmatrix}^{-1} = \frac{1}{4}\begin{bmatrix} 0 & 2 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 0 & \frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix}.$$
As a check,
$$\begin{bmatrix} 4 & -2 \\ 2 & 0 \end{bmatrix}\begin{bmatrix} 0 & \frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix} = \begin{bmatrix} 4\cdot 0-2\cdot\left(-\frac{1}{2}\right) & 4\cdot\frac{1}{2}-2\cdot 1 \\ 2\cdot 0+0\cdot\left(-\frac{1}{2}\right) & 2\cdot\frac{1}{2}+0\cdot 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
3. Using Theorem 3.8, note that
$$\det\begin{bmatrix} 3 & 4 \\ 6 & 8 \end{bmatrix} = ad - bc = 3\cdot 8 - 4\cdot 6 = 0.$$
Since the determinant is zero, the matrix is not invertible.
4. Using Theorem 3.8, note that
$$\det\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} = ad - bc = 0\cdot 0 - 1\cdot(-1) = 1,$$
so that
$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}^{-1} = \frac{1}{1}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$
As a check,
$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0\cdot 0+1\cdot 1 & 0\cdot(-1)+1\cdot 0 \\ -1\cdot 0+0\cdot 1 & -1\cdot(-1)+0\cdot 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
5. Using Theorem 3.8, note that
$$\det\begin{bmatrix} \frac{3}{4} & \frac{3}{5} \\ \frac{5}{6} & \frac{2}{3} \end{bmatrix} = ad - bc = \frac{3}{4}\cdot\frac{2}{3} - \frac{3}{5}\cdot\frac{5}{6} = \frac{1}{2} - \frac{1}{2} = 0.$$
Since the determinant is zero, the matrix is not invertible.
6. Using Theorem 3.8, note that
$$\det\begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} = ad - bc = \frac{1}{\sqrt 2}\cdot\frac{1}{\sqrt 2} - \frac{1}{\sqrt 2}\cdot\left(-\frac{1}{\sqrt 2}\right) = \frac{1}{2} + \frac{1}{2} = 1,$$
so that
$$\begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix}^{-1} = \frac{1}{1}\begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix}.$$
As a check,
$$\begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix}\begin{bmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{bmatrix} = \begin{bmatrix} \frac{1}{2}+\frac{1}{2} & -\frac{1}{2}+\frac{1}{2} \\ -\frac{1}{2}+\frac{1}{2} & \frac{1}{2}+\frac{1}{2} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
7. Using Theorem 3.8, note that
det [−1.5 −4.2; 0.5 2.4] = ad − bc = −1.5 · 2.4 − (−4.2) · 0.5 = −3.6 + 2.1 = −1.5.
Then
[−1.5 −4.2; 0.5 2.4]^{-1} = −(1/1.5) [2.4 4.2; −0.5 −1.5] = [−1.6 −2.8; 1/3 1] ≈ [−1.6 −2.8; 0.333 1].
As a check,
[−1.5 −4.2; 0.5 2.4] [−1.6 −2.8; 0.333 1] = [−1.5 · (−1.6) − 4.2 · 0.333  −1.5 · (−2.8) − 4.2 · 1; 0.5 · (−1.6) + 2.4 · 0.333  0.5 · (−2.8) + 2.4 · 1] ≈ [1 0; 0 1].
8. Using Theorem 3.8, note that
det [3.55 0.25; 8.52 0.50] = ad − bc = 3.55 · 0.50 − 0.25 · 8.52 = 1.775 − 2.13 = −0.355.
Then
[3.55 0.25; 8.52 0.50]^{-1} = −(1/0.355) [0.50 −0.25; −8.52 3.55] ≈ [−1.408 0.704; 24 −10].
As a check,
[3.55 0.25; 8.52 0.50] [−1.408 0.704; 24 −10] = [3.55 · (−1.408) + 0.25 · 24  3.55 · 0.704 + 0.25 · (−10); 8.52 · (−1.408) + 0.50 · 24  8.52 · 0.704 + 0.50 · (−10)] ≈ [1 0; 0 1].
9. Using Theorem 3.8, note that
det [a −b; b a] = a · a − (−b) · b = a² + b².
Then
[a −b; b a]^{-1} = (1/(a² + b²)) [a b; −b a] = [a/(a²+b²)  b/(a²+b²); −b/(a²+b²)  a/(a²+b²)].
As a check,
[a −b; b a] [a/(a²+b²)  b/(a²+b²); −b/(a²+b²)  a/(a²+b²)] = [(a²+b²)/(a²+b²)  (ab−ab)/(a²+b²); (ab−ab)/(a²+b²)  (a²+b²)/(a²+b²)] = [1 0; 0 1].
10. Using Theorem 3.8, note that
det [1/a 1/b; 1/c 1/d] = (1/a)(1/d) − (1/b)(1/c) = 1/(ad) − 1/(bc) = (bc − ad)/(abcd).
Then
[1/a 1/b; 1/c 1/d]^{-1} = (abcd/(bc − ad)) [1/d −1/b; −1/c 1/a] = [abc/(bc−ad)  −acd/(bc−ad); −abd/(bc−ad)  bcd/(bc−ad)].
As a check,
[1/a 1/b; 1/c 1/d] [abc/(bc−ad)  −acd/(bc−ad); −abd/(bc−ad)  bcd/(bc−ad)] = [bc/(bc−ad) − ad/(bc−ad)  −cd/(bc−ad) + cd/(bc−ad); ab/(bc−ad) − ab/(bc−ad)  −ad/(bc−ad) + bc/(bc−ad)] = [1 0; 0 1].
11. Following Example 3.25, we first invert the coefficient matrix
[2 1; 5 3].
The determinant of this matrix is 2 · 3 − 1 · 5 = 1, so its inverse is
(1/1) [3 −1; −5 2] = [3 −1; −5 2].
Thus the solution to the system is given by
[x; y] = [2 1; 5 3]^{-1} [−1; 2] = [3 −1; −5 2] [−1; 2] = [3 · (−1) − 1 · 2; −5 · (−1) + 2 · 2] = [−5; 9],
so the solution is x = −5, y = 9.
12. Following Example 3.25, we first invert the coefficient matrix
[1 −1; 2 1].
The determinant of this matrix is 1 · 1 − (−1) · 2 = 3, so its inverse is
(1/3) [1 1; −2 1] = [1/3 1/3; −2/3 1/3],
and thus the solution to the system is given by
[x; y] = [1 −1; 2 1]^{-1} [1; 2] = [1/3 1/3; −2/3 1/3] [1; 2] = [(1/3) · 1 + (1/3) · 2; (−2/3) · 1 + (1/3) · 2] = [1; 0],
and the solution is x = 1, y = 0.
13. (a) Following Example 3.25, we first invert the coefficient matrix
[1 2; 2 6].
The determinant of this matrix is 1 · 6 − 2 · 2 = 2, so its inverse is
(1/2) [6 −2; −2 1] = [3 −1; −1 1/2].
Then the solutions for the three vectors given are
x_1 = A^{-1} b_1 = [3 −1; −1 1/2] [3; 5] = [4; −1/2]
x_2 = A^{-1} b_2 = [3 −1; −1 1/2] [−1; 2] = [−5; 2]
x_3 = A^{-1} b_3 = [3 −1; −1 1/2] [2; 0] = [6; −2].
(b) To solve all three systems at once, form the augmented matrix [A | b_1 b_2 b_3] and row reduce to solve:
[1 2 | 3 −1 2; 2 6 | 5 2 0] → [1 0 | 4 −5 6; 0 1 | −1/2 2 −2].
(c) The method of constructing A^{-1} requires 7 multiplications, while row reduction requires only 6.
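Part (b)'s idea, handling several right-hand sides in one pass, is also how numerical libraries work: `numpy.linalg.solve` accepts a matrix whose columns are the right-hand sides. A quick check of part (a)'s answers (my own sketch, not part of the original manual):

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 6.0]])
# Stack the three right-hand sides b1, b2, b3 as columns of one matrix.
B = np.array([[3.0, -1.0, 2.0],
              [5.0,  2.0, 0.0]])

# One call solves all three systems at once, analogous to row reducing
# the augmented matrix [A | b1 b2 b3].
X = np.linalg.solve(A, B)
print(X)
# Columns should be [4, -1/2], [-5, 2], [6, -2].
```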
14. To show the result, we must show that (cA) · ((1/c)A^{-1}) = I. But
(cA) · ((1/c)A^{-1}) = (c · (1/c))(AA^{-1}) = AA^{-1} = I.
15. To show the result, we must show that A^T (A^{-1})^T = I. But
A^T (A^{-1})^T = (A^{-1}A)^T = I^T = I,
where the first equality follows from Theorem 3.4(d).
16. First we show that I_n^{-1} = I_n. To see that, we need to prove that I_n · I_n = I_n, which is true. Thus I_n^{-1} = I_n, and it also follows that I_n is invertible, since we found its inverse.
17. (a) There are many possible counterexamples. For instance, let
A = [1 0; 2 1],  B = [1 0; 1 −1].
Then
AB = [1 0; 2 1] [1 0; 1 −1] = [1 0; 3 −1].
Then det(AB) = 1 · (−1) − 0 · 3 = −1, det A = 1 · 1 − 0 · 2 = 1, and det B = 1 · (−1) − 0 · 1 = −1, so that
(AB)^{-1} = (1/(−1)) [−1 0; −3 1] = [1 0; 3 −1]
A^{-1} = (1/1) [1 0; −2 1] = [1 0; −2 1]
B^{-1} = (1/(−1)) [−1 0; −1 1] = [1 0; 1 −1].
Finally,
A^{-1} B^{-1} = [1 0; −2 1] [1 0; 1 −1] = [1 0; −1 −1],
and (AB)^{-1} ≠ A^{-1} B^{-1}.
(b) Claim that (AB)^{-1} = A^{-1}B^{-1} if and only if AB = BA. First, if (AB)^{-1} = A^{-1}B^{-1}, then
AB = ((AB)^{-1})^{-1} = (A^{-1}B^{-1})^{-1} = (B^{-1})^{-1}(A^{-1})^{-1} = BA.
To prove the reverse, suppose that AB = BA. Then
(AB)^{-1} = (BA)^{-1} = A^{-1}B^{-1}.
This proves the claim.
18. Use induction on n. For n = 1, the statement says that (A_1)^{-1} = A_1^{-1}, which is clear. Now assume that the statement holds for n = k. Then
(A_1 A_2 · · · A_k A_{k+1})^{-1} = ((A_1 A_2 · · · A_k) A_{k+1})^{-1} = A_{k+1}^{-1} (A_1 A_2 · · · A_k)^{-1}
by Theorem 3.9(c). But then by the inductive hypothesis, this becomes
A_{k+1}^{-1} (A_1 A_2 · · · A_k)^{-1} = A_{k+1}^{-1} (A_k^{-1} · · · A_2^{-1} A_1^{-1}) = A_{k+1}^{-1} A_k^{-1} · · · A_2^{-1} A_1^{-1}.
So by induction, the statement holds for all n ≥ 1.
19. There are many examples. For instance, let
A = [1 0; 0 1],  B = −A = [−1 0; 0 −1].
Then A^{-1} = A and B^{-1} = B = −A, so that A^{-1} + B^{-1} = O. But A + B = O, so (A + B)^{-1} does not exist.
20. XA² = A^{-1} ⇒ (XA²)A^{-2} = A^{-1}A^{-2} ⇒ X(A²A^{-2}) = A^{-1+(−2)} ⇒ XI = A^{-3} ⇒ X = A^{-3}.
21. AXB = (BA)² ⇒ A^{-1}(AXB)B^{-1} = A^{-1}(BA)²B^{-1} ⇒ (A^{-1}A)X(BB^{-1}) = A^{-1}(BA)²B^{-1}. Thus X = A^{-1}(BA)²B^{-1}.
22. By Theorem 3.9, (A^{-1}X)^{-1} = X^{-1}(A^{-1})^{-1} = X^{-1}A and (B^{-2}A)^{-1} = A^{-1}(B^{-2})^{-1} = A^{-1}B², so the original equation becomes X^{-1}A = AA^{-1}B² = B². Then
X^{-1}A = B² ⇒ XX^{-1}A = XB² ⇒ A = XB² ⇒ AB^{-2} = XB²B^{-2} = X,
so that X = AB^{-2}.
23. ABXA^{-1}B^{-1} = I + A, so that
B^{-1}A^{-1}(ABXA^{-1}B^{-1})BA = B^{-1}A^{-1}(I + A)BA = B^{-1}A^{-1}BA + B^{-1}A^{-1}ABA.
But
B^{-1}A^{-1}(ABXA^{-1}B^{-1})BA = (B^{-1}A^{-1}AB)X(A^{-1}B^{-1}BA) = X,
and
B^{-1}A^{-1}BA + B^{-1}A^{-1}ABA = B^{-1}A^{-1}BA + A.
Thus X = B^{-1}A^{-1}BA + A = (AB)^{-1}BA + A.
24. To get from A to B, interchange the first and third rows. Thus E = [0 0 1; 0 1 0; 1 0 0].
25. To get from B to A, interchange the first and third rows. Thus E = [0 0 1; 0 1 0; 1 0 0].
26. To get from A to C, add the first row to the third row. Thus E = [1 0 0; 0 1 0; 1 0 1].
27. To get from C to A, subtract the first row from the third row. Thus E = [1 0 0; 0 1 0; −1 0 1].
28. To get from C to D, subtract twice the third row from the second row. Thus E = [1 0 0; 0 1 −2; 0 0 1].
29. To get from D to C, add twice the third row to the second row. Thus E = [1 0 0; 0 1 2; 0 0 1].
30. The row operations R_2 − 2R_1 − 2R_3 → R_2 and R_3 + R_1 → R_3 will turn matrix A into matrix D. Thus
E = [1 0 0; −2 1 −2; 1 0 1]
satisfies EA = D. However, E is not an elementary matrix, since it encodes several row operations, not just one.
31. This matrix multiplies the first row by 3, so its inverse multiplies the first row by 1/3:
[3 0; 0 1]^{-1} = [1/3 0; 0 1].
32. This matrix adds twice the second row to the first row, so its inverse subtracts twice the second row from the first row:
[1 2; 0 1]^{-1} = [1 −2; 0 1].
33. This matrix interchanges rows 1 and 2, so its inverse does the same thing. Hence this matrix is its own inverse.
34. This matrix adds −1/2 times the first row to the second row, so its inverse adds 1/2 times the first row to the second row:
[1 0; −1/2 1]^{-1} = [1 0; 1/2 1].
35. This matrix subtracts twice the third row from the second row, so its inverse adds twice the third row to the second row:
[1 0 0; 0 1 −2; 0 0 1]^{-1} = [1 0 0; 0 1 2; 0 0 1].
36. This matrix interchanges rows 1 and 3, so its inverse does the same thing. Hence this matrix is its own inverse.
37. This matrix multiplies row 2 by c. Since c ≠ 0, its inverse multiplies the second row by 1/c:
[1 0 0; 0 c 0; 0 0 1]^{-1} = [1 0 0; 0 1/c 0; 0 0 1].
38. This matrix adds c times the third row to the second row, so its inverse subtracts c times the third row from the second row:
[1 0 0; 0 1 c; 0 0 1]^{-1} = [1 0 0; 0 1 −c; 0 0 1].
39. As in Example 3.29, we first row-reduce A:
A = [1 0; −1 −2] →(R_2 + R_1) [1 0; 0 −2] →(−(1/2)R_2) [1 0; 0 1].
The first row operation adds the first row to the second row, and the second row operation multiplies the second row by −1/2. These correspond to the elementary matrices
E_1 = [1 0; 1 1]  and  E_2 = [1 0; 0 −1/2].
So E_2 E_1 A = I and thus A = E_1^{-1} E_2^{-1}. But E_1^{-1} is R_2 − R_1 → R_2, and E_2^{-1} is −2R_2 → R_2, so
E_1^{-1} = [1 0; −1 1]  and  E_2^{-1} = [1 0; 0 −2],
so that
A = E_1^{-1} E_2^{-1} = [1 0; −1 1] [1 0; 0 −2].
Taking the inverse of both sides of the equation A = E_1^{-1} E_2^{-1} gives
A^{-1} = (E_1^{-1} E_2^{-1})^{-1} = (E_2^{-1})^{-1} (E_1^{-1})^{-1} = E_2 E_1 = [1 0; 0 −1/2] [1 0; 1 1] = [1 0; −1/2 −1/2].
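A quick NumPy check of Exercise 39's factorization (my own sketch, not from the manual): multiplying the elementary matrices back together should reproduce A, and E_2 E_1 should be A^{-1}:

```python
import numpy as np

A = np.array([[1.0, 0.0], [-1.0, -2.0]])
E1 = np.array([[1.0, 0.0], [1.0, 1.0]])        # R2 + R1
E2 = np.array([[1.0, 0.0], [0.0, -0.5]])       # -(1/2) R2

E1_inv = np.array([[1.0, 0.0], [-1.0, 1.0]])   # R2 - R1
E2_inv = np.array([[1.0, 0.0], [0.0, -2.0]])   # -2 R2

assert np.allclose(E2 @ E1 @ A, np.eye(2))     # E2 E1 A = I
assert np.allclose(E1_inv @ E2_inv, A)         # A = E1^{-1} E2^{-1}
assert np.allclose(E2 @ E1, np.linalg.inv(A))  # A^{-1} = E2 E1
```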
40. As in Example 3.29, we first row-reduce A:
A = [2 4; 1 1] →(R_1 ↔ R_2) [1 1; 2 4] →(R_2 − 2R_1) [1 1; 0 2] →((1/2)R_2) [1 1; 0 1] →(R_1 − R_2) [1 0; 0 1].
Converting each of these steps to an elementary matrix gives
E_1 = [0 1; 1 0],  E_2 = [1 0; −2 1],  E_3 = [1 0; 0 1/2],  E_4 = [1 −1; 0 1].
So E_4 E_3 E_2 E_1 A = I and thus A = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1}. We can invert each of the elementary matrices above by considering the row operation which reverses its action, giving
E_1^{-1} = [0 1; 1 0],  E_2^{-1} = [1 0; 2 1],  E_3^{-1} = [1 0; 0 2],  E_4^{-1} = [1 1; 0 1],
so that
A = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1} = [0 1; 1 0] [1 0; 2 1] [1 0; 0 2] [1 1; 0 1].
Taking the inverse of both sides of the equation A = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1} gives A^{-1} = E_4 E_3 E_2 E_1, so that
A^{-1} = E_4 E_3 E_2 E_1 = [1 −1; 0 1] [1 0; 0 1/2] [1 0; −2 1] [0 1; 1 0] = [−1/2 2; 1/2 −1].
41. Suppose AB = I and consider the equation Bx = 0. Left-multiplying by A gives ABx = A0 = 0. But AB = I, so ABx = Ix = x. Thus x = 0. So the system represented by Bx = 0 has the unique solution x = 0. From the equivalence of (c) and (a) in the Fundamental Theorem, we know that B is invertible (i.e., B^{-1} exists and satisfies BB^{-1} = I = B^{-1}B). So multiply both sides of the equation AB = I on the right by B^{-1}, giving
ABB^{-1} = IB^{-1} ⇒ AI = IB^{-1} ⇒ A = B^{-1}.
This proves the theorem.
42. (a) Let A be invertible and suppose that AB = O. Left-multiply both sides of this equation by A^{-1}, giving A^{-1}AB = A^{-1}O. But A^{-1}A = I and A^{-1}O = O, so we get IB = B = O. Thus B is the zero matrix.
(b) For example, let
A = [1 2; 2 4],  B = [2 −2; −1 1].
Then
AB = [1 2; 2 4] [2 −2; −1 1] = [0 0; 0 0], but B ≠ O.
43. (a) Since A is invertible, we can multiply both sides of BA = CA on the right by A^{-1}, giving BAA^{-1} = CAA^{-1}, so that BI = CI and thus B = C.
(b) For example, let
A = [1 2; 2 4],  B = [1 1; 1 1],  C = [3 0; 1 1].
Then
BA = [1 1; 1 1] [1 2; 2 4] = [3 6; 3 6]  and  CA = [3 0; 1 1] [1 2; 2 4] = [3 6; 3 6],
so BA = CA, but B ≠ C.
44. (a) There are many possibilities. For example:
A = [1 0; 0 1] = I,  B = [1 0; 1 0],  C = [1 1; 0 0].
Then
A² = I² = I = A
B² = [1 0; 1 0] [1 0; 1 0] = [1 0; 1 0] = B
C² = [1 1; 0 0] [1 1; 0 0] = [1 1; 0 0] = C,
so that A, B, and C are idempotent.
(b) Suppose that A is an n × n matrix that is both invertible and idempotent. Then A² = A since A is idempotent. Multiply both sides of this equation on the right by A^{-1}: the left side gives A²A^{-1} = A(AA^{-1}) = A, and the right side gives AA^{-1} = I, so A = I. Thus A is the identity matrix.
45. To show that A−1 = 2I − A, it suffices to show that A(2I − A) = I. But A2 − 2A + I = O means that
I = 2A − A2 = A(2I − A). This equation says that A multiplied by 2I − A is the identity, which is
what we wanted to prove.
46. Let A be an invertible symmetric matrix. Then AA^{-1} = I. Take the transpose of both sides of this equation:
(AA^{-1})^T = I^T ⇒ (A^{-1})^T A^T = I ⇒ (A^{-1})^T A = I ⇒ (A^{-1})^T AA^{-1} = IA^{-1} = A^{-1}.
Thus (A^{-1})^T AA^{-1} = A^{-1}. But also clearly (A^{-1})^T AA^{-1} = (A^{-1})^T I = (A^{-1})^T, so that (A^{-1})^T = A^{-1}. This says that A^{-1} is its own transpose, so that A^{-1} is symmetric.
47. Let A and B be square matrices and assume that AB is invertible. Then
AB(AB)^{-1} = A(B(AB)^{-1}) = I,
so that A is invertible with A^{-1} = B(AB)^{-1}. But this means that
B((AB)^{-1}A) = (B(AB)^{-1})A = A^{-1}A = I,
so that B is also invertible, with inverse (AB)^{-1}A.
48. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 5 | 1 0; 1 4 | 0 1] →(R_2 − R_1) [1 5 | 1 0; 0 −1 | −1 1] →(−R_2) [1 5 | 1 0; 0 1 | 1 −1] →(R_1 − 5R_2) [1 0 | −4 5; 0 1 | 1 −1].
Thus the inverse of the matrix is
[−4 5; 1 −1].
49. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[−2 3 | 1 0; 4 −1 | 0 1] →(−(1/2)R_1) [1 −3/2 | −1/2 0; 4 −1 | 0 1] →(R_2 − 4R_1) [1 −3/2 | −1/2 0; 0 5 | 2 1] →((1/5)R_2) [1 −3/2 | −1/2 0; 0 1 | 2/5 1/5] →(R_1 + (3/2)R_2) [1 0 | 1/10 3/10; 0 1 | 2/5 1/5].
Thus the inverse of the matrix is
[1/10 3/10; 2/5 1/5].
50. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[4 −2 | 1 0; 2 0 | 0 1] →((1/4)R_1) [1 −1/2 | 1/4 0; 2 0 | 0 1] →(R_2 − 2R_1) [1 −1/2 | 1/4 0; 0 1 | −1/2 1] →(R_1 + (1/2)R_2) [1 0 | 0 1/2; 0 1 | −1/2 1].
Thus the inverse of the matrix is
[0 1/2; −1/2 1].
51. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 a | 1 0; −a 1 | 0 1] →(R_2 + aR_1) [1 a | 1 0; 0 a²+1 | a 1] →((1/(a²+1))R_2) [1 a | 1 0; 0 1 | a/(a²+1) 1/(a²+1)] →(R_1 − aR_2) [1 0 | 1/(a²+1) −a/(a²+1); 0 1 | a/(a²+1) 1/(a²+1)].
Thus the inverse of the matrix is
[1/(a²+1) −a/(a²+1); a/(a²+1) 1/(a²+1)].
52. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[2 3 0 | 1 0 0; 1 −2 −1 | 0 1 0; 2 0 −1 | 0 0 1]
→(R_1 ↔ R_2) [1 −2 −1 | 0 1 0; 2 3 0 | 1 0 0; 2 0 −1 | 0 0 1]
→(R_2 − 2R_1, R_3 − 2R_1) [1 −2 −1 | 0 1 0; 0 7 2 | 1 −2 0; 0 4 1 | 0 −2 1]
→((1/7)R_2) [1 −2 −1 | 0 1 0; 0 1 2/7 | 1/7 −2/7 0; 0 4 1 | 0 −2 1]
→(R_3 − 4R_2) [1 −2 −1 | 0 1 0; 0 1 2/7 | 1/7 −2/7 0; 0 0 −1/7 | −4/7 −6/7 1]
→(−7R_3) [1 −2 −1 | 0 1 0; 0 1 2/7 | 1/7 −2/7 0; 0 0 1 | 4 6 −7]
→(R_1 + R_3, R_2 − (2/7)R_3) [1 −2 0 | 4 7 −7; 0 1 0 | −1 −2 2; 0 0 1 | 4 6 −7]
→(R_1 + 2R_2) [1 0 0 | 2 3 −3; 0 1 0 | −1 −2 2; 0 0 1 | 4 6 −7].
The inverse of the matrix is
[2 3 −3; −1 −2 2; 4 6 −7].
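The adjoin-and-reduce procedure of Example 3.30 used in Exercises 48-59 is easy to automate. A minimal Gauss-Jordan sketch (my own code, with partial pivoting added for numerical safety; not from the manual):

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by row-reducing [A | I] to [I | A^{-1}] (Example 3.30's method)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])           # the augmented matrix [A | I]
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # partial pivoting
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("matrix is not invertible")
        M[[col, pivot]] = M[[pivot, col]]   # row interchange
        M[col] /= M[col, col]               # scale the pivot row
        for r in range(n):                  # clear the rest of the column
            if r != col:
                M[r] -= M[r, col] * M[col]
    return M[:, n:]

# Exercise 52's matrix; expect [[2, 3, -3], [-1, -2, 2], [4, 6, -7]].
print(gauss_jordan_inverse([[2, 3, 0], [1, -2, -1], [2, 0, -1]]))
```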
53. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 −1 2 | 1 0 0; 3 1 2 | 0 1 0; 2 3 −1 | 0 0 1]
→(R_2 − 3R_1, R_3 − 2R_1) [1 −1 2 | 1 0 0; 0 4 −4 | −3 1 0; 0 5 −5 | −2 0 1]
→(4R_3 − 5R_2) [1 −1 2 | 1 0 0; 0 4 −4 | −3 1 0; 0 0 0 | 7 −5 4].
Since the matrix on the left has a zero row, it cannot be reduced to the identity matrix, so the given matrix is not invertible.
54. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 1 0 | 1 0 0; 1 0 1 | 0 1 0; 0 1 1 | 0 0 1]
→(R_2 − R_1) [1 1 0 | 1 0 0; 0 −1 1 | −1 1 0; 0 1 1 | 0 0 1]
→(−R_2) [1 1 0 | 1 0 0; 0 1 −1 | 1 −1 0; 0 1 1 | 0 0 1]
→(R_3 − R_2) [1 1 0 | 1 0 0; 0 1 −1 | 1 −1 0; 0 0 2 | −1 1 1]
→((1/2)R_3) [1 1 0 | 1 0 0; 0 1 −1 | 1 −1 0; 0 0 1 | −1/2 1/2 1/2]
→(R_2 + R_3, then R_1 − R_2) [1 0 0 | 1/2 1/2 −1/2; 0 1 0 | 1/2 −1/2 1/2; 0 0 1 | −1/2 1/2 1/2].
The inverse of the matrix is
[1/2 1/2 −1/2; 1/2 −1/2 1/2; −1/2 1/2 1/2].
55. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[a 0 0 | 1 0 0; 1 a 0 | 0 1 0; 0 1 a | 0 0 1]
→((1/a)R_1, (1/a)R_2, (1/a)R_3) [1 0 0 | 1/a 0 0; 1/a 1 0 | 0 1/a 0; 0 1/a 1 | 0 0 1/a]
→(R_2 − (1/a)R_1) [1 0 0 | 1/a 0 0; 0 1 0 | −1/a² 1/a 0; 0 1/a 1 | 0 0 1/a]
→(R_3 − (1/a)R_2) [1 0 0 | 1/a 0 0; 0 1 0 | −1/a² 1/a 0; 0 0 1 | 1/a³ −1/a² 1/a].
Thus the inverse of the matrix is
[1/a 0 0; −1/a² 1/a 0; 1/a³ −1/a² 1/a].
Note that these calculations assume a ≠ 0. But if a = 0, then the original matrix is not invertible, since its first row is zero.
56. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]. Note that if a = 0, then the first row is zero, so the matrix is clearly not invertible. So assume a ≠ 0. Then
[0 a 0 | 1 0 0; b 0 c | 0 1 0; 0 d 0 | 0 0 1] →(R_3 − (d/a)R_1) [0 a 0 | 1 0 0; b 0 c | 0 1 0; 0 0 0 | −d/a 0 1].
Since the matrix on the left has a zero row, it cannot be reduced to the identity matrix, so the original matrix is not invertible.
57. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[0 −1 1 0 | 1 0 0 0; 2 1 0 2 | 0 1 0 0; 1 −1 3 0 | 0 0 1 0; 0 1 1 −1 | 0 0 0 1] → [1 0 0 0 | −11 −2 5 −4; 0 1 0 0 | 4 1 −2 2; 0 0 1 0 | 5 1 −2 2; 0 0 0 1 | 9 2 −4 3].
Thus the inverse of the matrix is
[−11 −2 5 −4; 4 1 −2 2; 5 1 −2 2; 9 2 −4 3].
58. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[√2 0 2√2 0 | 1 0 0 0; −4√2 √2 0 0 | 0 1 0 0; 0 0 1 0 | 0 0 1 0; 0 0 3 1 | 0 0 0 1] → [1 0 0 0 | √2/2 0 −2 0; 0 1 0 0 | 2√2 √2/2 −8 0; 0 0 1 0 | 0 0 1 0; 0 0 0 1 | 0 0 −3 1].
Thus the inverse of the matrix is
[√2/2 0 −2 0; 2√2 √2/2 −8 0; 0 0 1 0; 0 0 −3 1].
59. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] to [I | A^{-1}]:
[1 0 0 0 | 1 0 0 0; 0 1 0 0 | 0 1 0 0; 0 0 1 0 | 0 0 1 0; a b c d | 0 0 0 1] → [1 0 0 0 | 1 0 0 0; 0 1 0 0 | 0 1 0 0; 0 0 1 0 | 0 0 1 0; 0 0 0 1 | −a/d −b/d −c/d 1/d].
Thus the inverse of the matrix is
[1 0 0 0; 0 1 0 0; 0 0 1 0; −a/d −b/d −c/d 1/d].
60. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] over Z_2 to [I | A^{-1}]:
[0 1 | 1 0; 1 1 | 0 1] →(R_1 + R_2, then R_2 + R_1) [1 0 | 1 1; 0 1 | 1 0].
Thus the inverse of the matrix is
[1 1; 1 0].
61. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] over Z_5 to [I | A^{-1}]:
[4 2 | 1 0; 3 4 | 0 1] →(4R_1) [1 3 | 4 0; 3 4 | 0 1] →(R_2 + 2R_1) [1 3 | 4 0; 0 0 | 3 1].
Since the left-hand matrix has a zero row, it cannot be reduced to the identity matrix, so the original matrix is not invertible.
62. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] over Z_3 to [I | A^{-1}]:
[2 1 0 | 1 0 0; 1 1 2 | 0 1 0; 0 2 1 | 0 0 1]
→(2R_1) [1 2 0 | 2 0 0; 1 1 2 | 0 1 0; 0 2 1 | 0 0 1]
→(R_2 + 2R_1) [1 2 0 | 2 0 0; 0 2 2 | 1 1 0; 0 2 1 | 0 0 1]
→(2R_2) [1 2 0 | 2 0 0; 0 1 1 | 2 2 0; 0 2 1 | 0 0 1]
→(R_3 + R_2) [1 2 0 | 2 0 0; 0 1 1 | 2 2 0; 0 0 2 | 2 2 1]
→(2R_3) [1 2 0 | 2 0 0; 0 1 1 | 2 2 0; 0 0 1 | 1 1 2]
→(R_1 + R_2, R_2 + 2R_3) [1 0 1 | 1 2 0; 0 1 0 | 1 1 1; 0 0 1 | 1 1 2]
→(R_1 + 2R_3) [1 0 0 | 0 1 1; 0 1 0 | 1 1 1; 0 0 1 | 1 1 2].
Thus the inverse of the matrix is
[0 1 1; 1 1 1; 1 1 2].
63. As in Example 3.30, we adjoin the identity matrix to A and then row-reduce [A | I] over Z_7 to [I | A^{-1}]:
[1 5 0 | 1 0 0; 1 2 4 | 0 1 0; 3 6 1 | 0 0 1] → [1 0 0 | 4 6 4; 0 1 0 | 5 3 2; 0 0 1 | 0 6 5].
Thus the inverse of the matrix is
[4 6 4; 5 3 2; 0 6 5].
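Gauss-Jordan elimination over Z_p works exactly as over the reals, except that "dividing by a pivot" means multiplying by its inverse mod p (computable with Fermat's little theorem, `pow(x, p - 2, p)` for prime p). A small sketch of my own, checked against Exercise 63:

```python
def inverse_mod_p(A, p):
    """Invert a matrix over Z_p (p prime) by row-reducing [A | I], or return None."""
    n = len(A)
    # Build the augmented matrix [A | I] with entries reduced mod p.
    M = [[A[i][j] % p for j in range(n)] + [int(i == j) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] % p != 0), None)
        if pivot is None:
            return None                      # no usable pivot: not invertible
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], p - 2, p)     # pivot^{-1} mod p
        M[col] = [x * inv % p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

# Exercise 63, over Z_7; expect [[4, 6, 4], [5, 3, 2], [0, 6, 5]].
print(inverse_mod_p([[1, 5, 0], [1, 2, 4], [3, 6, 1]], 7))
# Exercise 61's matrix over Z_5 is singular:
print(inverse_mod_p([[4, 2], [3, 4]], 5))   # None
```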
64. Using block multiplication gives
[A B; O D] [A^{-1} −A^{-1}BD^{-1}; O D^{-1}] = [AA^{-1} + BO  −AA^{-1}BD^{-1} + BD^{-1}; OA^{-1} + DO  −OA^{-1}BD^{-1} + DD^{-1}] = [AA^{-1}  −BD^{-1} + BD^{-1}; O  DD^{-1}] = [I O; O I].
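The block formula of Exercise 64 is easy to spot-check numerically. The sketch below (mine, not from the manual) builds a block upper-triangular matrix from random blocks and compares the formula against `numpy.linalg.inv`:

```python
import numpy as np

rng = np.random.default_rng(1)
# Nudge the diagonal blocks toward invertibility (random matrices are
# almost surely invertible anyway; this is just a safety margin).
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)
D = rng.standard_normal((3, 3)) + 2 * np.eye(3)
B = rng.standard_normal((2, 3))

M = np.block([[A, B], [np.zeros((3, 2)), D]])

Ai, Di = np.linalg.inv(A), np.linalg.inv(D)
# Exercise 64's formula for the inverse of [A B; O D]:
M_inv = np.block([[Ai, -Ai @ B @ Di], [np.zeros((3, 2)), Di]])

assert np.allclose(M_inv, np.linalg.inv(M))
assert np.allclose(M @ M_inv, np.eye(5))
```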
65. Using block multiplication gives
[O B; C I] [−(BC)^{-1} (BC)^{-1}B; C(BC)^{-1} I − C(BC)^{-1}B]
= [−O(BC)^{-1} + BC(BC)^{-1}  O(BC)^{-1}B + B(I − C(BC)^{-1}B); −C(BC)^{-1} + IC(BC)^{-1}  C(BC)^{-1}B + I(I − C(BC)^{-1}B)]
= [BC(BC)^{-1}  B − BC(BC)^{-1}B; −C(BC)^{-1} + C(BC)^{-1}  C(BC)^{-1}B + I − C(BC)^{-1}B]
= [I O; O I].
66. Using block multiplication gives
[I B; C I] [(I − BC)^{-1} −(I − BC)^{-1}B; −C(I − BC)^{-1} I + C(I − BC)^{-1}B]
= [I(I − BC)^{-1} − BC(I − BC)^{-1}  −I(I − BC)^{-1}B + B(I + C(I − BC)^{-1}B); C(I − BC)^{-1} − I(C(I − BC)^{-1})  −C(I − BC)^{-1}B + I(I + C(I − BC)^{-1}B)]
= [(I − BC)(I − BC)^{-1}  (BC − I)(I − BC)^{-1}B + B; C(I − BC)^{-1} − C(I − BC)^{-1}  −C(I − BC)^{-1}B + I + C(I − BC)^{-1}B]
= [I  −B + B; O  I] = [I O; O I].
67. Using block multiplication gives
[O B; C D] [−(BD^{-1}C)^{-1}  (BD^{-1}C)^{-1}BD^{-1}; D^{-1}C(BD^{-1}C)^{-1}  D^{-1} − D^{-1}C(BD^{-1}C)^{-1}BD^{-1}]
= [BD^{-1}C(BD^{-1}C)^{-1}  BD^{-1} − BD^{-1}C(BD^{-1}C)^{-1}BD^{-1}; −C(BD^{-1}C)^{-1} + DD^{-1}C(BD^{-1}C)^{-1}  C(BD^{-1}C)^{-1}BD^{-1} + DD^{-1} − DD^{-1}C(BD^{-1}C)^{-1}BD^{-1}]
= [I  BD^{-1} − BD^{-1}; −C(BD^{-1}C)^{-1} + C(BD^{-1}C)^{-1}  C(BD^{-1}C)^{-1}BD^{-1} + I − C(BD^{-1}C)^{-1}BD^{-1}]
= [I O; O I].
68. Using block multiplication gives
[A B; C D] [P Q; R S] = [AP + BR  AQ + BS; CP + DR  CQ + DS]
= [AP + B(−D^{-1}CP)  A(−PBD^{-1}) + B(D^{-1} + D^{-1}CPBD^{-1}); CP + D(−D^{-1}CP)  C(−PBD^{-1}) + D(D^{-1} + D^{-1}CPBD^{-1})]
= [(A − BD^{-1}C)P  −(A − BD^{-1}C)PBD^{-1} + BD^{-1}; CP − CP  −CPBD^{-1} + I + CPBD^{-1}]
= [P^{-1}P  −P^{-1}PBD^{-1} + BD^{-1}; O  I]
= [I  −BD^{-1} + BD^{-1}; O  I] = [I O; O I].
69. Partition the matrix
A = [1 0 0 0; 0 1 0 0; 2 3 1 0; 1 2 0 1] = [I B; C I],  where B = O and C = [2 3; 1 2].
Then using Exercise 66, (I − BC)^{-1} = (I − OC)^{-1} = I^{-1} = I, so that
A^{-1} = [(I − BC)^{-1} −(I − BC)^{-1}B; −C(I − BC)^{-1} I + C(I − BC)^{-1}B] = [I −IB; −CI I + CIB] = [I O; −C I].
Substituting back in for C gives
A^{-1} = [1 0 0 0; 0 1 0 0; −2 −3 1 0; −1 −2 0 1].
70. Partition the matrix
M = [√2 0 2√2 0; −4√2 √2 0 0; 0 0 1 0; 0 0 3 1] = [A B; O D]
and use Exercise 64. Then
A^{-1} = [√2 0; −4√2 √2]^{-1} = [√2/2 0; 2√2 √2/2]
D^{-1} = [1 0; 3 1]^{-1} = [1 0; −3 1]
−A^{-1}BD^{-1} = −[√2/2 0; 2√2 √2/2] [2√2 0; 0 0] [1 0; −3 1] = −[2 0; 8 0] = [−2 0; −8 0].
Then
M^{-1} = [A^{-1} −A^{-1}BD^{-1}; O D^{-1}] = [√2/2 0 −2 0; 2√2 √2/2 −8 0; 0 0 1 0; 0 0 −3 1].
71. Partition the matrix
A = [0 0 1 1; 0 0 1 0; 0 −1 1 0; 1 1 0 1] = [O B; C I],
where B = [1 1; 1 0] and C = [0 −1; 1 1], and use Exercise 65. Then
BC = [1 1; 1 0] [0 −1; 1 1] = [1 0; 0 −1], so
−(BC)^{-1} = −[1 0; 0 −1]^{-1} = [−1 0; 0 1]
(BC)^{-1}B = [1 0; 0 −1] [1 1; 1 0] = [1 1; −1 0]
C(BC)^{-1} = [0 −1; 1 1] [1 0; 0 −1] = [0 1; 1 −1]
I − C(BC)^{-1}B = I − [0 1; 1 −1] [1 1; 1 0] = I − [1 0; 0 1] = [0 0; 0 0].
Thus
A^{-1} = [−(BC)^{-1} (BC)^{-1}B; C(BC)^{-1} I − C(BC)^{-1}B] = [−1 0 1 1; 0 1 −1 0; 0 1 0 0; 1 −1 0 0].
72. Partition the matrix
A = [0 1 1; 1 3 1; −1 5 2] = [O B; C D],
where B = [1 1], C = [1; −1], and D = [3 1; 5 2], and use Exercise 67. Then
D^{-1} = [2 −1; −5 3]
BD^{-1} = [1 1] [2 −1; −5 3] = [−3 2]
−(BD^{-1}C)^{-1} = −([−3 2] [1; −1])^{-1} = −[−5]^{-1} = [1/5]
(BD^{-1}C)^{-1}BD^{-1} = −(1/5) [−3 2] = [3/5 −2/5]
D^{-1}C(BD^{-1}C)^{-1} = −(1/5) [2 −1; −5 3] [1; −1] = −(1/5) [3; −8] = [−3/5; 8/5]
D^{-1} − D^{-1}C(BD^{-1}C)^{-1}BD^{-1} = [2 −1; −5 3] − [9/5 −6/5; −24/5 16/5] = [1/5 1/5; −1/5 −1/5].
Thus
A^{-1} = [−(BD^{-1}C)^{-1}  (BD^{-1}C)^{-1}BD^{-1}; D^{-1}C(BD^{-1}C)^{-1}  D^{-1} − D^{-1}C(BD^{-1}C)^{-1}BD^{-1}] = [1/5 3/5 −2/5; −3/5 1/5 1/5; 8/5 −1/5 −1/5].

3.4 The LU Factorization
1. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [−2 1; 2 5] = [1 0; −1 1] [−2 1; 0 6] = LU,  and  b = [5; 1].
Then
Ly = b ⇒ [1 0; −1 1] [y_1; y_2] = [5; 1] ⇒ {y_1 = 5, −y_1 + y_2 = 1} ⇒ {y_1 = 5, y_2 = y_1 + 1 = 6} ⇒ y = [5; 6].
Likewise,
Ux = y ⇒ [−2 1; 0 6] [x_1; x_2] = [5; 6] ⇒ {−2x_1 + x_2 = 5, 6x_2 = 6} ⇒ {x_2 = 1, −2x_1 = 5 − x_2 = 4} ⇒ x = [−2; 1].
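The forward/backward substitution pattern used throughout Exercises 1-6 can be written in a few lines. A minimal sketch of my own (no pivoting; assumes the triangular matrices have nonzero diagonals):

```python
def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    return y

def backward_sub(U, y):
    """Solve U x = y for upper-triangular U."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

# Exercise 1: A = LU, b = [5, 1]; expect y = [5, 6] and x = [-2, 1].
L = [[1.0, 0.0], [-1.0, 1.0]]
U = [[-2.0, 1.0], [0.0, 6.0]]
y = forward_sub(L, [5.0, 1.0])
x = backward_sub(U, y)
print(x)   # [-2.0, 1.0]
```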
2. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [4 −2; 2 3] = [1 0; 1/2 1] [4 −2; 0 4] = LU,  and  b = [0; 8].
Then
Ly = b ⇒ [1 0; 1/2 1] [y_1; y_2] = [0; 8] ⇒ {y_1 = 0, (1/2)y_1 + y_2 = 8} ⇒ y = [0; 8].
Likewise,
Ux = y ⇒ [4 −2; 0 4] [x_1; x_2] = [0; 8] ⇒ {4x_1 − 2x_2 = 0, 4x_2 = 8} ⇒ {x_2 = 2, 4x_1 = 2x_2 = 4} ⇒ x = [1; 2].
3. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [2 1 −2; −2 3 −4; 4 −3 0] = [1 0 0; −1 1 0; 2 −5/4 1] [2 1 −2; 0 4 −6; 0 0 −7/2] = LU,  and  b = [−3; 1; 0].
Then
Ly = b ⇒ {y_1 = −3, −y_1 + y_2 = 1, 2y_1 − (5/4)y_2 + y_3 = 0} ⇒ {y_1 = −3, y_2 = y_1 + 1 = −2, y_3 = −2y_1 + (5/4)y_2 = 7/2} ⇒ y = [−3; −2; 7/2].
Likewise,
Ux = y ⇒ {2x_1 + x_2 − 2x_3 = −3, 4x_2 − 6x_3 = −2, −(7/2)x_3 = 7/2} ⇒ {x_3 = −1, 4x_2 = 6x_3 − 2 = −8, 2x_1 = −x_2 + 2x_3 − 3 = −3} ⇒ x = [−3/2; −2; −1].
4. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [2 −4 0; 3 −1 4; −1 2 2] = [1 0 0; 3/2 1 0; −1/2 0 1] [2 −4 0; 0 5 4; 0 0 2] = LU,  and  b = [2; 0; −5].
Then
Ly = b ⇒ {y_1 = 2, (3/2)y_1 + y_2 = 0, −(1/2)y_1 + y_3 = −5} ⇒ {y_1 = 2, y_2 = −(3/2)y_1 = −3, y_3 = (1/2)y_1 − 5 = −4} ⇒ y = [2; −3; −4].
Likewise,
Ux = y ⇒ {2x_1 − 4x_2 = 2, 5x_2 + 4x_3 = −3, 2x_3 = −4} ⇒ {x_3 = −2, 5x_2 = −4x_3 − 3 = 5, 2x_1 = 4x_2 + 2 = 6} ⇒ x = [3; 1; −2].
5. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [2 −1 0 0; 6 −4 5 −3; 8 −4 1 0; 4 −1 0 7] = [1 0 0 0; 3 1 0 0; 4 0 1 0; 2 −1 5 1] [2 −1 0 0; 0 −1 5 −3; 0 0 1 0; 0 0 0 4] = LU,  and  b = [1; 2; 2; 1].
Then
Ly = b ⇒ {y_1 = 1, 3y_1 + y_2 = 2, 4y_1 + y_3 = 2, 2y_1 − y_2 + 5y_3 + y_4 = 1} ⇒ {y_1 = 1, y_2 = −3y_1 + 2 = −1, y_3 = −4y_1 + 2 = −2, y_4 = −2y_1 + y_2 − 5y_3 + 1 = 8} ⇒ y = [1; −1; −2; 8].
Likewise,
Ux = y ⇒ {2x_1 − x_2 = 1, −x_2 + 5x_3 − 3x_4 = −1, x_3 = −2, 4x_4 = 8} ⇒ {x_4 = 2, x_3 = −2, x_2 = 5x_3 − 3x_4 + 1 = −15, 2x_1 = x_2 + 1 = −14} ⇒ x = [−7; −15; −2; 2].
6. As in the method outlined after Theorem 3.15, we want to solve Ax = (LU)x = b. Let y = Ux and first solve Ly = b for y by forward substitution. Then solve Ux = y by backward substitution. We have
A = [1 4 3 0; −2 −5 −1 2; 3 6 −3 −4; −5 −8 9 9] = [1 0 0 0; −2 1 0 0; 3 −2 1 0; −5 4 −2 1] [1 4 3 0; 0 3 5 2; 0 0 −2 0; 0 0 0 1] = LU,  and  b = [1; −3; −1; 0].
Then
Ly = b ⇒ {y_1 = 1, −2y_1 + y_2 = −3, 3y_1 − 2y_2 + y_3 = −1, −5y_1 + 4y_2 − 2y_3 + y_4 = 0} ⇒ {y_1 = 1, y_2 = 2y_1 − 3 = −1, y_3 = −3y_1 + 2y_2 − 1 = −6, y_4 = 5y_1 − 4y_2 + 2y_3 = −3} ⇒ y = [1; −1; −6; −3].
Likewise,
Ux = y ⇒ {x_1 + 4x_2 + 3x_3 = 1, 3x_2 + 5x_3 + 2x_4 = −1, −2x_3 = −6, x_4 = −3} ⇒ {x_4 = −3, x_3 = 3, 3x_2 = −5x_3 − 2x_4 − 1 = −10, x_1 = −4x_2 − 3x_3 + 1 = 16/3} ⇒ x = [16/3; −10/3; 3; −3].
7. We use the multiplier method from Example 3.35:
A = [1 2; −3 −1] →(R_2 − (−3)R_1) [1 2; 0 5] = U.
The multiplier was −3, so
L = [1 0; −3 1].
8. We use the multiplier method from Example 3.35:
A = [2 −4; 3 1] →(R_2 − (3/2)R_1) [2 −4; 0 7] = U.
The multiplier was 3/2, so
L = [1 0; 3/2 1].
9. We use the multiplier method from Example 3.35:
A = [1 2 3; 4 5 6; 8 7 9] →(R_2 − 4R_1, R_3 − 8R_1) [1 2 3; 0 −3 −6; 0 −9 −15] →(R_3 − 3R_2) [1 2 3; 0 −3 −6; 0 0 3] = U.
The multipliers were ℓ_{21} = 4, ℓ_{31} = 8, and ℓ_{32} = 3, so we get
L = [1 0 0; 4 1 0; 8 3 1].
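The multiplier method of Example 3.35 is the classic Doolittle-style elimination: clear below each pivot and record the multipliers in L. A short sketch of my own (no pivoting, so it assumes no zero pivots are encountered):

```python
def lu_no_pivoting(A):
    """LU factorization by the multiplier method: returns (L, U) with A = LU."""
    n = len(A)
    U = [row[:] for row in A]                 # working copy; becomes U
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]             # multiplier l_ik
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]        # the operation R_i - m R_k
    return L, U

# Exercise 9's matrix: the multipliers 4, 8, 3 should appear in L.
L, U = lu_no_pivoting([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [8.0, 7.0, 9.0]])
print(L)
print(U)
```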
10. We use the multiplier method from Example 3.35:
A = [2 2 −1; 4 0 4; 3 4 4] →(R_2 − 2R_1, R_3 − (3/2)R_1) [2 2 −1; 0 −4 6; 0 1 11/2] →(R_3 − (−1/4)R_2) [2 2 −1; 0 −4 6; 0 0 7] = U.
The multipliers were ℓ_{21} = 2, ℓ_{31} = 3/2, and ℓ_{32} = −1/4, so we get
L = [1 0 0; 2 1 0; 3/2 −1/4 1].
11. We use the multiplier method from Example 3.35:
A = [1 2 3 −1; 2 6 3 0; 0 6 −6 7; −1 −2 −9 0]
→(R_2 − 2R_1, R_3 − 0R_1, R_4 − (−1)R_1) [1 2 3 −1; 0 2 −3 2; 0 6 −6 7; 0 0 −6 −1]
→(R_3 − 3R_2, R_4 − 0R_2) [1 2 3 −1; 0 2 −3 2; 0 0 3 1; 0 0 −6 −1]
→(R_4 − (−2)R_3) [1 2 3 −1; 0 2 −3 2; 0 0 3 1; 0 0 0 1] = U.
The multipliers were ℓ_{21} = 2, ℓ_{31} = 0, ℓ_{41} = −1, ℓ_{32} = 3, ℓ_{42} = 0, and ℓ_{43} = −2, so
L = [1 0 0 0; 2 1 0 0; 0 3 1 0; −1 0 −2 1].
12. We use the multiplier method from Example 3.35:
A = [2 2 2 1; −2 4 −1 2; 4 4 7 3; 6 9 5 8]
→(R_2 − (−1)R_1, R_3 − 2R_1, R_4 − 3R_1) [2 2 2 1; 0 6 1 3; 0 0 3 1; 0 3 −1 5]
→(R_3 − 0R_2, R_4 − (1/2)R_2) [2 2 2 1; 0 6 1 3; 0 0 3 1; 0 0 −3/2 7/2]
→(R_4 − (−1/2)R_3) [2 2 2 1; 0 6 1 3; 0 0 3 1; 0 0 0 4] = U.
The multipliers were ℓ_{21} = −1, ℓ_{31} = 2, ℓ_{41} = 3, ℓ_{32} = 0, ℓ_{42} = 1/2, and ℓ_{43} = −1/2, so
L = [1 0 0 0; −1 1 0 0; 2 0 1 0; 3 1/2 −1/2 1].
13. We can use the multiplier method as before to put the matrix into row echelon form. In this case, however, the matrix is already in row echelon form, so it is equal to U, and L = I_3:
[1 0 1 −2; 0 3 3 1; 0 0 0 5] = [1 0 0; 0 1 0; 0 0 1] [1 0 1 −2; 0 3 3 1; 0 0 0 5].
14. We can use the multiplier method as before to put the matrix into row echelon form:
A = [2 0 −1 1; −4 −3 5 4; 2 −1 2 7; 0 3 −3 −6]
→(R_2 − (−2)R_1, R_3 − 1R_1, R_4 − 0R_1) [2 0 −1 1; 0 −3 3 6; 0 −1 3 6; 0 3 −3 −6]
→(R_3 − (1/3)R_2, R_4 − (−1)R_2) [2 0 −1 1; 0 −3 3 6; 0 0 2 4; 0 0 0 0] = U.
The multipliers were ℓ_{21} = −2, ℓ_{31} = 1, ℓ_{41} = 0, ℓ_{32} = 1/3, and ℓ_{42} = −1, so
L = [1 0 0 0; −2 1 0 0; 1 1/3 1 0; 0 −1 0 1].
15. In Exercise 1, we have
A = [−2 1; 2 5] = LU = [1 0; −1 1] [−2 1; 0 6].
Then
L^{-1} = [1 0; 1 1],  U^{-1} = −(1/12) [6 −1; 0 −2] = [−1/2 1/12; 0 1/6],
so that
A^{-1} = (LU)^{-1} = U^{-1}L^{-1} = [−1/2 1/12; 0 1/6] [1 0; 1 1] = [−5/12 1/12; 1/6 1/6].
16. In Exercise 4, we have
A = [2 −4 0; 3 −1 4; −1 2 2] = LU = [1 0 0; 3/2 1 0; −1/2 0 1] [2 −4 0; 0 5 4; 0 0 2].
To compute L^{-1}, we adjoin the identity matrix to L and row-reduce:
[1 0 0 | 1 0 0; 3/2 1 0 | 0 1 0; −1/2 0 1 | 0 0 1] → [1 0 0 | 1 0 0; 0 1 0 | −3/2 1 0; 0 0 1 | 1/2 0 1]  ⇒  L^{-1} = [1 0 0; −3/2 1 0; 1/2 0 1].
To compute U^{-1}, we adjoin the identity matrix to U and row-reduce:
[2 −4 0 | 1 0 0; 0 5 4 | 0 1 0; 0 0 2 | 0 0 1] → [1 0 0 | 1/2 2/5 −4/5; 0 1 0 | 0 1/5 −2/5; 0 0 1 | 0 0 1/2]  ⇒  U^{-1} = [1/2 2/5 −4/5; 0 1/5 −2/5; 0 0 1/2].
Then
A^{-1} = (LU)^{-1} = U^{-1}L^{-1} = [1/2 2/5 −4/5; 0 1/5 −2/5; 0 0 1/2] [1 0 0; −3/2 1 0; 1/2 0 1] = [−1/2 2/5 −4/5; −1/2 1/5 −2/5; 1/4 0 1/2].
17. Using the LU factorization from Exercise 1, we compute A^{-1} one column at a time using the method outlined in the text. That is, we first solve Ly_i = e_i, and then solve Ux_i = y_i. Column 1 of A^{-1} is found as follows:
Ly_1 = e_1 ⇒ [1 0; −1 1] [y_{11}; y_{21}] = [1; 0] ⇒ {y_{11} = 1, −y_{11} + y_{21} = 0} ⇒ y_1 = [1; 1]
Ux_1 = y_1 ⇒ [−2 1; 0 6] [x_{11}; x_{21}] = [1; 1] ⇒ {−2x_{11} + x_{21} = 1, 6x_{21} = 1} ⇒ x_1 = [−5/12; 1/6].
Use the same procedure to find the second column:
Ly_2 = e_2 ⇒ [1 0; −1 1] [y_{12}; y_{22}] = [0; 1] ⇒ {y_{12} = 0, −y_{12} + y_{22} = 1} ⇒ y_2 = [0; 1]
Ux_2 = y_2 ⇒ [−2 1; 0 6] [x_{12}; x_{22}] = [0; 1] ⇒ {−2x_{12} + x_{22} = 0, 6x_{22} = 1} ⇒ x_2 = [1/12; 1/6].
So
A^{-1} = [x_1 x_2] = [−5/12 1/12; 1/6 1/6].
This matches what we found in Exercise 15.
18. Using the LU factorization from Exercise 4, we compute A^{-1} one column at a time using the method outlined in the text. That is, we first solve Ly_i = e_i, and then solve Ux_i = y_i. Column 1 of A^{-1} is found as follows:
Ly_1 = e_1 ⇒ {y_{11} = 1, (3/2)y_{11} + y_{21} = 0, −(1/2)y_{11} + y_{31} = 0} ⇒ y_1 = [1; −3/2; 1/2]
Ux_1 = y_1 ⇒ {2x_{11} − 4x_{21} = 1, 5x_{21} + 4x_{31} = −3/2, 2x_{31} = 1/2} ⇒ x_1 = [−1/2; −1/2; 1/4].
Repeat this procedure to find the second column:
Ly_2 = e_2 ⇒ y_2 = [0; 1; 0]
Ux_2 = y_2 ⇒ {2x_{12} − 4x_{22} = 0, 5x_{22} + 4x_{32} = 1, 2x_{32} = 0} ⇒ x_2 = [2/5; 1/5; 0].
Do this again to find the third column:
Ly_3 = e_3 ⇒ y_3 = [0; 0; 1]
Ux_3 = y_3 ⇒ {2x_{13} − 4x_{23} = 0, 5x_{23} + 4x_{33} = 0, 2x_{33} = 1} ⇒ x_3 = [−4/5; −2/5; 1/2].
Thus A^{-1} = [x_1 x_2 x_3] = [−1/2 2/5 −4/5; −1/2 1/5 −2/5; 1/4 0 1/2]; this matches what we found in Exercise 16.
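Exercises 17-18's column-by-column strategy reuses one factorization for n triangular solves, which is exactly why numerical libraries factor once and then solve many times. A self-contained sketch of my own using Exercise 4's factorization:

```python
import numpy as np

# Build A^{-1} one column at a time from A = LU: for each standard basis
# vector e_i, solve L y = e_i (forward) then U x = y (backward); the x's
# are the columns of A^{-1}.
L = np.array([[1.0, 0.0, 0.0], [1.5, 1.0, 0.0], [-0.5, 0.0, 1.0]])
U = np.array([[2.0, -4.0, 0.0], [0.0, 5.0, 4.0], [0.0, 0.0, 2.0]])

n = L.shape[0]
cols = []
for i in range(n):
    e = np.zeros(n)
    e[i] = 1.0
    y = np.linalg.solve(L, e)     # forward substitution (L is triangular)
    x = np.linalg.solve(U, y)     # backward substitution
    cols.append(x)
A_inv = np.column_stack(cols)

print(A_inv)   # should match the answer in Exercises 16 and 18
```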
19. This matrix results from the identity matrix by first interchanging rows 1 and 2, and then interchanging
the (new) row 1 with row 3, so it is
E_{13}E_{12} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}.
Other products of elementary matrices are possible.
20. This matrix results from the identity matrix by first interchanging rows 1 and 4, and then interchanging
rows 2 and 3, so it is
E_{14}E_{23} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}

Other products of elementary matrices are possible; for example, since E_{14} and E_{23} swap disjoint pairs of rows, also P = E_{23}E_{14}.
21. To get from the given matrix to the identity matrix, perform the following exchanges:
\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_3} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_2 \leftrightarrow R_3} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_3 \leftrightarrow R_4} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
Thus we can construct the given matrix P by multiplying these elementary matrices together in the
reverse order:
P = E13 E23 E34 .
Other combinations are possible.
22. To get from the given matrix to the identity matrix, perform the following exchanges:
\begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_2 \leftrightarrow R_5} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} \xrightarrow{R_3 \leftrightarrow R_5} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_4 \leftrightarrow R_5} I_5
Thus we can construct the given matrix P by multiplying these elementary matrices together in the
reverse order:
P = E12 E25 E35 E45 .
Other combinations are possible.
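The bookkeeping in Exercises 19–22 is easy to check mechanically: an elementary matrix E_ij is the identity with rows i and j swapped, and the claimed product must reproduce P. A sketch (the helper E is ours; it takes 1-based row indices to match the text's notation):

```python
import numpy as np

def E(i, j, n):
    """Elementary permutation matrix: I_n with rows i and j (1-based) swapped."""
    M = np.eye(n, dtype=int)
    M[[i - 1, j - 1]] = M[[j - 1, i - 1]]
    return M

# Exercise 22: the claimed factorization P = E12 E25 E35 E45.
P = E(1, 2, 5) @ E(2, 5, 5) @ E(3, 5, 5) @ E(4, 5, 5)
```

Since every permutation matrix satisfies P Pᵀ = I, that identity makes a convenient extra sanity check.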
23. To find a P T LU factorization of A, we begin by permuting the rows of A to get it into a form that
will result in a small number of operations to reduce to row echelon form. Since
A = \begin{bmatrix} 0 & 1 & 4 \\ -1 & 2 & 1 \\ 1 & 3 & 3 \end{bmatrix}, \quad \text{we have} \quad PA = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 & 4 \\ -1 & 2 & 1 \\ 1 & 3 & 3 \end{bmatrix} = \begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 1 & 3 & 3 \end{bmatrix}.
172
CHAPTER 3. MATRICES
Now follow the multiplier method from Example 3.35 to find the LU factorization of this matrix:
PA = \begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 1 & 3 & 3 \end{bmatrix} \xrightarrow{R_3 - (-R_1)} \begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 0 & 5 & 4 \end{bmatrix} \xrightarrow{R_3 - 5R_2} \begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 0 & 0 & -16 \end{bmatrix} = U.
Then `21 = 0, `31 = −1, and `32 = 5, so that
L = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 5 & 1 \end{bmatrix}.
Thus
A = P^T LU = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 5 & 1 \end{bmatrix}\begin{bmatrix} -1 & 2 & 1 \\ 0 & 1 & 4 \\ 0 & 0 & -16 \end{bmatrix}.
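A factorization like this is cheap to verify by multiplying the factors back together; the check below uses the matrices of Exercise 23. (scipy.linalg.lu computes an equivalent factorization A = PLU, but its pivoting choices need not reproduce the one done by hand here, so we check ours directly.)

```python
import numpy as np

A = np.array([[0, 1, 4], [-1, 2, 1], [1, 3, 3]])
P = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])   # swaps rows 1 and 2
L = np.array([[1, 0, 0], [0, 1, 0], [-1, 5, 1]])
U = np.array([[-1, 2, 1], [0, 1, 4], [0, 0, -16]])

assert (P @ A == L @ U).all()      # PA = LU
assert (A == P.T @ L @ U).all()    # A = P^T L U
```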
24. To find a P T LU factorization of A, we begin by permuting the rows of A to get it into a form that
will result in a small number of operations to reduce to row echelon form. Since
A = \begin{bmatrix} 0 & 0 & 1 & 2 \\ -1 & 1 & 3 & 2 \\ 0 & 2 & 1 & 1 \\ 1 & 1 & -1 & 0 \end{bmatrix}, \quad \text{we have} \quad PA = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} A = \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ -1 & 1 & 3 & 2 \end{bmatrix}.
Now follow the multiplier method from Example 3.35 to find the LU factorization of this matrix:
PA = \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ -1 & 1 & 3 & 2 \end{bmatrix} \xrightarrow{R_4 - (-R_1)} \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 2 & 2 & 2 \end{bmatrix} \xrightarrow{R_4 - R_2} \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 1 & 1 \end{bmatrix} \xrightarrow{R_4 - R_3} \begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & -1 \end{bmatrix} = U.
Then ℓ_{41} = -1, ℓ_{42} = 1, ℓ_{43} = 1, and the other ℓ_{ij} are zero, so that
L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 1 & 1 & 1 \end{bmatrix}.
Thus
A = P^T LU = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 1 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 & -1 & 0 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & -1 \end{bmatrix}.
25. To find a P T LU factorization of A, we begin by permuting the rows of A to get it into a form that
will result in a small number of operations to reduce to row echelon form. Since
A = \begin{bmatrix} 0 & -1 & 1 & 3 \\ -1 & 1 & 1 & 2 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad \text{we have} \quad PA = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} A = \begin{bmatrix} -1 & 1 & 1 & 2 \\ 0 & -1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & -1 & 1 \end{bmatrix}.
Now follow the multiplier method from Example 3.35 to find the LU factorization of this matrix:
PA = \begin{bmatrix} -1 & 1 & 1 & 2 \\ 0 & -1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & -1 & 1 \end{bmatrix} \xrightarrow{R_4 - (-R_2)} \begin{bmatrix} -1 & 1 & 1 & 2 \\ 0 & -1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 4 \end{bmatrix} = U.
Then `42 = −1 and the other `ij are zero, so that
L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}.
Thus
A = P^T LU = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}\begin{bmatrix} -1 & 1 & 1 & 2 \\ 0 & -1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 4 \end{bmatrix}.
26. From the text, a permutation matrix consists of the rows of the identity matrix written in some order.
So each row and each column of a permutation matrix has exactly one 1 in it, and the remaining entries
in that row or column are zero. Thus there are n choices for the column containing the 1 in the first
row; once that is chosen, there are n − 1 choices for the column to contain the 1 in the second row.
Next, there are n − 2 choices in the third row. Continuing, we decrease the number of choices by one
each time, until there is just one choice in the last row. Multiplying all these together gives the total
number of choices, i.e., the number of n×n permutation matrices, which is n·(n−1)·(n−2) · · · 2·1 = n!.
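The counting argument can be confirmed by brute force for small n: every n × n permutation matrix is the rows of I_n in some order, so enumerating arrangements of the rows must produce exactly n! matrices.

```python
import numpy as np
from itertools import permutations
from math import factorial

def permutation_matrices(n):
    """All n x n permutation matrices: the rows of I_n in every possible order."""
    I = np.eye(n, dtype=int)
    return [I[list(p)] for p in permutations(range(n))]

for n in range(1, 6):
    assert len(permutation_matrices(n)) == factorial(n)
```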
27. To solve Ax = P T LU x = b, multiply both sides by P to get LU x = P b. Now compute P b, which is a
permutation of the rows of b, and solve the resulting equation by the method outlined after Theorem
3.15:
P^T = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} ⇒ P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} ⇒ Pb = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix}.
Then
Ly = Pb ⇒ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1/2 & -1/2 & 1 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} ⇒ \begin{cases} y_1 = 1 \\ y_2 = 1 \\ \frac{1}{2}y_1 - \frac{1}{2}y_2 + y_3 = 5 \end{cases} ⇒ \begin{cases} y_1 = 1 \\ y_2 = 1 \\ y_3 = 5 - \frac{1}{2}y_1 + \frac{1}{2}y_2 \end{cases} ⇒ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix}.
Likewise,
Ux = y ⇒ \begin{bmatrix} 2 & 3 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & -5/2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 5 \end{bmatrix} ⇒ \begin{cases} 2x_1 + 3x_2 + 2x_3 = 1 \\ x_2 - x_3 = 1 \\ -\frac{5}{2}x_3 = 5 \end{cases} ⇒ \begin{cases} x_3 = -2 \\ x_2 = x_3 + 1 = -1 \\ 2x_1 = 1 - 3x_2 - 2x_3 = 8 \end{cases} ⇒ x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 \\ -1 \\ -2 \end{bmatrix}.
28. To solve Ax = P T LU x = b, multiply both sides by P to get LU x = P b. Now compute P b, which is a
permutation of the rows of b, and solve the resulting equation by the method outlined after Theorem
3.15:
P^T = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} ⇒ P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} ⇒ Pb = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} 16 \\ -4 \\ 4 \end{bmatrix} = \begin{bmatrix} -4 \\ 4 \\ 16 \end{bmatrix}.
Then
Ly = Pb ⇒ \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} -4 \\ 4 \\ 16 \end{bmatrix} ⇒ \begin{cases} y_1 = -4 \\ y_1 + y_2 = 4 \\ 2y_1 - y_2 + y_3 = 16 \end{cases} ⇒ \begin{cases} y_1 = -4 \\ y_2 = 4 - y_1 = 8 \\ y_3 = 16 - 2y_1 + y_2 = 32 \end{cases} ⇒ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} -4 \\ 8 \\ 32 \end{bmatrix}.
Likewise,
Ux = y ⇒ \begin{bmatrix} 4 & 1 & 2 \\ 0 & -1 & 1 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -4 \\ 8 \\ 32 \end{bmatrix} ⇒ \begin{cases} 4x_1 + x_2 + 2x_3 = -4 \\ -x_2 + x_3 = 8 \\ 2x_3 = 32 \end{cases} ⇒ \begin{cases} x_3 = 16 \\ x_2 = x_3 - 8 = 8 \\ 4x_1 = -x_2 - 2x_3 - 4 = -44 \end{cases} ⇒ x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -11 \\ 8 \\ 16 \end{bmatrix}.
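The two-stage solve in Exercises 27 and 28 (forward substitution for Ly = Pb, then back substitution for Ux = y) can be sketched numerically with the data of Exercise 28. np.linalg.solve is used as a stand-in for the triangular solves; since L and U are triangular, it produces the same y and x as substitution does.

```python
import numpy as np

Pt = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])   # P^T from Exercise 28
L = np.array([[1, 0, 0], [1, 1, 0], [2, -1, 1]])
U = np.array([[4, 1, 2], [0, -1, 1], [0, 0, 2]])
b = np.array([16, -4, 4])

P = Pt.T
Pb = P @ b                    # permute b first: (-4, 4, 16)
y = np.linalg.solve(L, Pb)    # the forward-substitution stage
x = np.linalg.solve(U, y)     # the back-substitution stage
```

The result x = (-11, 8, 16) agrees with the hand computation, and Pᵀ L U x reproduces b.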
29. Let A = (aij ) and B = (bij ) be unit lower triangular n × n matrices, so that
a_{ij} = b_{ij} = \begin{cases} 0 & i < j \\ 1 & i = j \\ * & i > j \end{cases}
(where of course the ∗'s may be different for A and B). Then if C = (c_{ij}) = AB, we have
cij = ai1 b1j + ai2 b2j + · · · + ai(j−1) b(j−1)j + aij bjj + · · · + ain bnj .
If i < j, then all the terms up through a_{i(j-1)}b_{(j-1)j} are zero since the b's are zero, and all terms from a_{ij}b_{jj} to the end are zero since the a's are zero.
If i = j, then all the terms up through a_{i(j-1)}b_{(j-1)j} are zero since the b's are zero, the term a_{ij}b_{jj} = a_{jj}b_{jj} is 1, and all terms after it are zero since the a's are zero.
Finally, if i > j, multiple terms in the sum may be nonzero.
Thus we get
c_{ij} = \begin{cases} 0 & i < j \\ 1 & i = j \\ * & i > j \end{cases}
so that C is unit lower triangular as well.
30. To invert L, we row-reduce [L | I]:
\left[\begin{array}{cccc|cccc} 1 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\ * & 1 & \cdots & 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ * & * & \cdots & 1 & 0 & 0 & \cdots & 1 \end{array}\right].
To do this, we first add multiples of the first row to each of the lower rows to make the first column
zero except for the 1 in row 1. Note that the result of this is a unit lower triangular matrix on the right
(since we added multiples of the leading 1 in the first row to the lower rows). Now do the same with
row 2: add multiples of that row to each of the lower rows, clearing column 2 on the left except for the
1 in row 2. Again, for the same reason, the result on the right remains unit lower triangular. Continue
this process for each row. The result is the identity matrix on the left, and a unit lower triangular
matrix on the right. Thus L is invertible and its inverse is unit lower triangular.
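Both closure facts — products (Exercise 29) and inverses (Exercise 30) of unit lower triangular matrices are again unit lower triangular — are easy to spot-check numerically on random examples; the helpers below are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit_lower(n):
    """Unit lower triangular: 1s on the diagonal, random entries strictly below."""
    return np.tril(rng.standard_normal((n, n)), k=-1) + np.eye(n)

def is_unit_lower(M, tol=1e-9):
    """True if M has zeros strictly above the diagonal and 1s on it."""
    return (np.allclose(np.triu(M, k=1), 0, atol=tol)
            and np.allclose(np.diag(M), 1, atol=tol))

A, B = random_unit_lower(5), random_unit_lower(5)
assert is_unit_lower(A @ B)              # Exercise 29: closed under products
assert is_unit_lower(np.linalg.inv(A))   # Exercise 30: closed under inverses
```

A numerical check is of course no proof, but it is a quick guard against a slip in the index argument above.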
31. We start by finding D. First, suppose D is a diagonal 2 × 2 matrix and A is any 2 × 2 matrix with 1’s
on the diagonal. Then
DA = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} A
has the effect of multiplying the first row of A by a and the second row of A by b, so results in a matrix
with a and b in the diagonal entries. In our example, we want to write
U = \begin{bmatrix} -2 & 1 \\ 0 & 6 \end{bmatrix}
as unit upper triangular, say U = DU1 . So we must turn the −2 and 6 into 1’s, and thus
U = \begin{bmatrix} -2 & 0 \\ 0 & 6 \end{bmatrix}\begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix} = DU_1.
Thus
A = LU = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} -2 & 1 \\ 0 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} -2 & 0 \\ 0 & 6 \end{bmatrix}\begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix} = LDU_1.
32. We start by finding D. First, suppose D is a diagonal 3 × 3 matrix and A is any 3 × 3 matrix with 1’s
on the diagonal. Then
DA = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix} A
has the effect of multiplying the first row of A by a, the second row of A by b, and the third row of A
by c, so the result is a matrix with a, b, and c on the diagonal. In our example, we want to write
U = \begin{bmatrix} 2 & -4 & 0 \\ 0 & 5 & 4 \\ 0 & 0 & 2 \end{bmatrix}
as unit upper triangular, say U = DU1 . So we must turn the 2, 5, and 2 into 1’s, and thus
U = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 4/5 \\ 0 & 0 & 1 \end{bmatrix} = DU_1.
Thus
A = LU = \begin{bmatrix} 1 & 0 & 0 \\ 3/2 & 1 & 0 \\ -1/2 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & -4 & 0 \\ 0 & 5 & 4 \\ 0 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 3/2 & 1 & 0 \\ -1/2 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 4/5 \\ 0 & 0 & 1 \end{bmatrix} = LDU_1.
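Extracting D from U, as done by hand in Exercises 31 and 32, amounts to reading off the pivots and dividing each row of U by its pivot; a numpy sketch with the matrices of Exercise 32:

```python
import numpy as np

L = np.array([[1.0, 0, 0], [1.5, 1, 0], [-0.5, 0, 1]])
U = np.array([[2.0, -4, 0], [0, 5, 4], [0, 0, 2]])

d = np.diag(U)          # the pivots: 2, 5, 2
D = np.diag(d)
U1 = U / d[:, None]     # divide row i of U by its pivot, giving unit upper triangular U1
```

By construction U = D U1 and A = L D U1, with U1 having 1s on its diagonal.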
33. We must show that if A = LDU is both symmetric and invertible, then U = LT . Since A is symmetric,
L(DU) = LDU = A = A^T = (LDU)^T = U^T D^T L^T = U^T(D^T L^T) = U^T(DL^T)
since the diagonal matrix D is symmetric. Since L is lower triangular, it follows that LT is upper
triangular. Since any diagonal matrix is also upper triangular, Exercise 29 in Section 3.2 tells us that
176
CHAPTER 3. MATRICES
DLT and DU are both upper triangular. Finally, since U is upper triangular, it follows that U T is
lower triangular. Thus we have A = U T (DLT ) = L(DU ), giving two LU -decompositions of A. But
since A is invertible, Theorem 3.16 tells us that the LU -decomposition is unique, so that U T = L.
Taking transposes gives U = LT .
34. Suppose A = L1 D1 LT1 = L2 D2 LT2 where L1 and L2 are unit lower triangular, D1 and D2 are diagonal,
and A is symmetric and invertible. We want to show that L1 = L2 and D1 = D2 . First, since
L2 (D2 LT2 ) = A = L1 (D1 LT1 )
gives two LU -factorizations of an invertible matrix A, it follows from Theorem 3.16 that these two
factorizations are the same, so that L1 = L2 and D2 LT2 = D1 LT1 . Since L1 = L2 , this equation
becomes D_2L_2^T = D_1L_2^T. From Exercise 47 in Section 3.3, since A = L_2(D_2L_2^T) and A is invertible, we see that both L_2 and D_2L_2^T are invertible; thus L_2^T is invertible as well, with (L_2^T)^{-1} = (L_2^{-1})^T. So multiply D_2L_2^T = D_1L_2^T on the right by (L_2^T)^{-1}, giving D_2 = D_1, and we are done.
3.5 Subspaces, Basis, Dimension, and Rank
1. Geometrically, x = 0 is a line through the origin in R^2, so you might conclude from Example 3.37 that this is a subspace. And in fact, the set of vectors \begin{bmatrix} x \\ y \end{bmatrix} with x = 0 is

\begin{bmatrix} 0 \\ y \end{bmatrix} = y \begin{bmatrix} 0 \\ 1 \end{bmatrix}.

Since y is arbitrary, S = span\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right), and this is a subspace of R^2 by Theorem 3.19.
2. This is not a subspace. For example, \begin{bmatrix} 1 \\ 0 \end{bmatrix} is in S, but -\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \end{bmatrix} is not in S. This violates property 3 of the definition of a subspace.
3. Geometrically, y = 2x is a line through the origin in R^2, so you might conclude from Example 3.37 that this is a subspace. And in fact, the set of vectors \begin{bmatrix} x \\ y \end{bmatrix} with y = 2x is

\begin{bmatrix} x \\ 2x \end{bmatrix} = x \begin{bmatrix} 1 \\ 2 \end{bmatrix}.

Since x is arbitrary, S = span\left(\begin{bmatrix} 1 \\ 2 \end{bmatrix}\right), and this is a subspace of R^2 by Theorem 3.19.
4. This is not a subspace. For example, both \begin{bmatrix} -3 \\ -1 \end{bmatrix} and \begin{bmatrix} 2 \\ 2 \end{bmatrix} are in S, since (-3)·(-1) and 2·2 are both nonnegative. But their sum,

\begin{bmatrix} -3 \\ -1 \end{bmatrix} + \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix},

is not in S since (-1)·1 < 0. So S is not a subspace (violates property 2 of the definition).
5. Geometrically, x = y = z is a line through the origin in R^3, so you could conclude from Exercise 9 in this section that this is a subspace. And in fact, the set of vectors \begin{bmatrix} x \\ y \\ z \end{bmatrix} with x = y = z is

\begin{bmatrix} x \\ x \\ x \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.

Since x is arbitrary, S = span\left(\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right), and this is a subspace of R^3 by Theorem 3.19.
6. Geometrically, this is a line in R^3 lying in the xz-plane and passing through the origin, so Exercise 9 in this section implies that it is a subspace. The set of vectors \begin{bmatrix} x \\ y \\ z \end{bmatrix} satisfying these conditions is

\begin{bmatrix} x \\ 0 \\ 2x \end{bmatrix} = x \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}.

Since x is arbitrary, S = span\left(\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}\right), and this is a subspace of R^3 by Theorem 3.19.
7. This is a plane in R3 , but it does not pass through the origin since (0, 0, 0) does not satisfy the equation.
So condition 1 in the definition is violated, and this is not a subspace.
8. This is not a subspace. For example,

\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} ∈ S \quad \text{and} \quad \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} ∈ S,

since for the first vector |1 - 0| = |0 - 1| and for the second |0 - 1| = |1 - 2|. But their sum,

\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix} ∉ S,

since |1 - 1| ≠ |1 - 3|.
9. If a line ` passes through the origin, then it has the equation x = 0 + td = td, so that the set of points
on ` is just span(d). Then Theorem 3.19 implies that ` is a subspace of R3 . (Note that we did not use
the assumption that the vectors lay in R3 ; in fact this conclusion holds in any Rn for n ≥ 1.)
10. This is not a subspace of R2 . For example, (1, 0) ∈ S and (0, 1) ∈ S, but their sum, (1, 1), is not in S
since it does not lie on either axis.
11. As in Example 3.41, b is in col(A) if and only if the augmented matrix
\left[\begin{array}{c|c} A & b \end{array}\right] = \left[\begin{array}{ccc|c} 1 & 0 & -1 & 3 \\ 1 & 1 & 1 & 2 \end{array}\right]
is consistent as a linear system. So we row-reduce that augmented matrix:
\left[\begin{array}{ccc|c} 1 & 0 & -1 & 3 \\ 1 & 1 & 1 & 2 \end{array}\right] \xrightarrow{R_2 - R_1} \left[\begin{array}{ccc|c} 1 & 0 & -1 & 3 \\ 0 & 1 & 2 & -1 \end{array}\right].
This matrix is in row echelon form and has no zero rows, so the system is consistent and thus b ∈ col(A).
Using Example 3.41, w is in row(A) if and only if the matrix
\begin{bmatrix} A \\ w \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ -1 & 1 & 1 \end{bmatrix}
can be row-reduced to a matrix whose last row is zero by operations excluding row interchanges
involving the last row. Thus
\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ -1 & 1 & 1 \end{bmatrix} \xrightarrow{R_2 - R_1,\; R_3 + R_1} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 1 & 0 \end{bmatrix} \xrightarrow{R_3 - R_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & -2 \end{bmatrix}

This matrix is in row echelon form and has no zero rows, so we cannot make the last row zero. Thus w ∉ row(A).
12. Using Example 3.41, b is in col(A) if and only if the augmented matrix
\left[\begin{array}{c|c} A & b \end{array}\right] = \left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 2 & 1 & 1 \\ 1 & -1 & -4 & 0 \end{array}\right]
is consistent as a linear system. So we row-reduce that augmented matrix:
\left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 2 & 1 & 1 \\ 1 & -1 & -4 & 0 \end{array}\right] \xrightarrow{R_3 - R_1} \left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 2 & 1 & 1 \\ 0 & -2 & -1 & -1 \end{array}\right] \xrightarrow{R_3 + R_2} \left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 2 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right] \xrightarrow{\frac{1}{2}R_2} \left[\begin{array}{ccc|c} 1 & 1 & -3 & 1 \\ 0 & 1 & 1/2 & 1/2 \\ 0 & 0 & 0 & 0 \end{array}\right]
The linear system has a solution (in fact, an infinite number of them), so is consistent. Thus b is in
col(A).
Using Example 3.41, w is in row(A) if and only if the matrix
\begin{bmatrix} A \\ w \end{bmatrix} = \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 1 & -1 & -4 \\ 2 & 4 & -5 \end{bmatrix}
can be row-reduced to a matrix whose last row is zero by operations excluding row interchanges
involving the last row. Thus
\begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 1 & -1 & -4 \\ 2 & 4 & -5 \end{bmatrix} \xrightarrow{R_3 - R_1,\; R_4 - 2R_1} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 0 & -2 & -1 \\ 0 & 2 & 1 \end{bmatrix} \xrightarrow{R_4 + R_3} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 0 & -2 & -1 \\ 0 & 0 & 0 \end{bmatrix}
Thus w is in row(A).
13. As in the remarks following Example 3.41, we determine whether w is in row(A) using col(A^T). To say w is in row(A) means that w^T is in col(A^T), so we form the augmented matrix and row-reduce to see if the system has a solution:
\left[\begin{array}{c|c} A^T & w^T \end{array}\right] = \left[\begin{array}{cc|c} 1 & 1 & -1 \\ 0 & 1 & 1 \\ -1 & 1 & 1 \end{array}\right] \xrightarrow{R_3 + R_1} \left[\begin{array}{cc|c} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 0 & 2 & 0 \end{array}\right] \xrightarrow{R_2 - \frac{1}{2}R_3} \left[\begin{array}{cc|c} 1 & 1 & -1 \\ 0 & 0 & 1 \\ 0 & 2 & 0 \end{array}\right] \xrightarrow{R_2 \leftrightarrow R_3} \left[\begin{array}{cc|c} 1 & 1 & -1 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{array}\right]

This matrix has a row that is all zeros except for the final column, so the system is inconsistent. Thus w^T ∉ col(A^T), and thus w ∉ row(A).
14. As in the remarks following Example 3.41, we determine whether w is in row(A) using col(A^T). To say w is in row(A) means that w^T is in col(A^T), so we form the augmented matrix and row-reduce to see if the system has a solution:
\left[\begin{array}{c|c} A^T & w^T \end{array}\right] = \left[\begin{array}{ccc|c} 1 & 0 & 1 & 2 \\ 1 & 2 & -1 & 4 \\ -3 & 1 & -4 & -5 \end{array}\right] \xrightarrow{R_2 - R_1,\; R_3 + 3R_1} \left[\begin{array}{ccc|c} 1 & 0 & 1 & 2 \\ 0 & 2 & -2 & 2 \\ 0 & 1 & -1 & 1 \end{array}\right] \xrightarrow{\frac{1}{2}R_2} \left[\begin{array}{ccc|c} 1 & 0 & 1 & 2 \\ 0 & 1 & -1 & 1 \\ 0 & 1 & -1 & 1 \end{array}\right] \xrightarrow{R_3 - R_2} \left[\begin{array}{ccc|c} 1 & 0 & 1 & 2 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right].
This is in row echelon form, and the final row consists entirely of zeros, so the system is consistent. Thus w^T ∈ col(A^T), and thus w ∈ row(A).
15. To check if v is in null(A), we need only multiply A by v:
 
−1
1 0 −1  
0
3 =
Av =
6= 0,
1 1
1
1
−1
so that v ∈
/ null(A).
16. To check if v is in null(A), we need only multiply A by v:
Av = \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 1 & -1 & -4 \end{bmatrix}\begin{bmatrix} 7 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = 0,
so that v ∈ null(A).
17. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:
\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \end{bmatrix} \xrightarrow{R_2 - R_1} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}

Thus a basis for row(A) is given by \{[1\ 0\ -1], [0\ 1\ 2]\}.
A basis for col(A) is given by the column vectors of the original matrix A corresponding to the leading
1s in the row-reduced matrix, so a basis for col(A) is
\left\{\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}.
Finally, from the row-reduced form, the solutions to Ax = 0 where
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
can be found by noting that x3 is a free variable; setting it to t and then solving for the other two
variables gives x1 = t and x2 = −2t. Thus a basis for the nullspace of A is given by
x = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}.
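The three bases just found can be compared against a CAS: sympy's rref and nullspace reproduce the computation of Exercise 17 exactly (rational arithmetic, no rounding).

```python
from sympy import Matrix

A = Matrix([[1, 0, -1], [1, 1, 1]])

R, pivots = A.rref()      # reduced row echelon form and pivot columns
ns = A.nullspace()        # basis for null(A), free variable set to 1
```

The pivot columns (0 and 1, i.e., the first and second columns) are the ones whose original columns form the basis for col(A) above, and the single nullspace vector matches (1, -2, 1).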
180
CHAPTER 3. MATRICES
18. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:
\begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 1 & -1 & -4 \end{bmatrix} \xrightarrow{R_3 - R_1} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 0 & -2 & -1 \end{bmatrix} \xrightarrow{R_3 + R_2} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix} \xrightarrow{\frac{1}{2}R_2} \begin{bmatrix} 1 & 1 & -3 \\ 0 & 1 & 1/2 \\ 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_1 - R_2} \begin{bmatrix} 1 & 0 & -7/2 \\ 0 & 1 & 1/2 \\ 0 & 0 & 0 \end{bmatrix}

Thus a basis for row(A) is given by \{[1\ 0\ -7/2], [0\ 1\ 1/2]\} or, clearing fractions, \{[2\ 0\ -7], [0\ 2\ 1]\}.
Since the row-reduction did not involve any row exchanges, a basis for row(A) is also given by the rows
of A corresponding to nonzero rows in the reduced matrix:
\{[1\ 1\ -3], [0\ 2\ 1]\}.
A basis for col(A) is then given by the original column vectors of A corresponding to the columns
containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by
\left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}\right\}.
Finally, from the row-reduced form, the solutions of Ax = 0 where
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
can be found by noting that x_3 is a free variable; we set it to t and then solve for the other two variables in terms of t to get x_1 = (7/2)t and x_2 = -(1/2)t. Thus a basis for the nullspace of A is given by
x = \begin{bmatrix} 7/2 \\ -1/2 \\ 1 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 7 \\ -1 \\ 2 \end{bmatrix}.
19. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:
\begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & -1 & 1 \\ 0 & 1 & -1 & -1 \end{bmatrix} \xrightarrow{R_3 - R_2} \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 0 & -2 \end{bmatrix} \xrightarrow{-\frac{1}{2}R_3} \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & -1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_1 - R_3,\; R_2 - R_3} \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_1 - R_2} \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

Thus a basis for row(A) is given by \{[1\ 0\ 1\ 0], [0\ 1\ -1\ 0], [0\ 0\ 0\ 1]\}.
Since the row-reduction did not involve any row exchanges, a basis for row(A) is also given by the rows
of A corresponding to nonzero rows in the reduced matrix; since all the rows are nonzero, the rows of
A form a basis for row(A).
A basis for col(A) is then given by the original column vectors of A corresponding to the columns
containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by
\left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}\right\}.
Finally, from the row-reduced form, the solutions of Ax = 0 where
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
can be found by noting that x3 is a free variable; we set it to t. Also, x4 = 0; solving for the other two
variables in terms of t gives x1 = −t and x2 = t. Thus a basis for the nullspace of A is given by
x = \begin{bmatrix} -1 \\ 1 \\ 1 \\ 0 \end{bmatrix}.
20. We follow the comment after Example 3.48. Start by finding the reduced row echelon form of A:
\begin{bmatrix} 2 & -4 & 0 & 2 & 1 \\ -1 & 2 & 1 & 2 & 3 \\ 1 & -2 & 1 & 4 & 4 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_3} \begin{bmatrix} 1 & -2 & 1 & 4 & 4 \\ -1 & 2 & 1 & 2 & 3 \\ 2 & -4 & 0 & 2 & 1 \end{bmatrix} \xrightarrow{R_2 + R_1,\; R_3 - 2R_1} \begin{bmatrix} 1 & -2 & 1 & 4 & 4 \\ 0 & 0 & 2 & 6 & 7 \\ 0 & 0 & -2 & -6 & -7 \end{bmatrix} \xrightarrow{R_3 + R_2} \begin{bmatrix} 1 & -2 & 1 & 4 & 4 \\ 0 & 0 & 2 & 6 & 7 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{\frac{1}{2}R_2} \begin{bmatrix} 1 & -2 & 1 & 4 & 4 \\ 0 & 0 & 1 & 3 & 7/2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{R_1 - R_2} \begin{bmatrix} 1 & -2 & 0 & 1 & 1/2 \\ 0 & 0 & 1 & 3 & 7/2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

Thus a basis for row(A) is given by the nonzero rows of the reduced matrix, which after clearing fractions are \{[2\ -4\ 0\ 2\ 1], [0\ 0\ 2\ 6\ 7]\}.
A basis for col(A) is then given by the original column vectors of A corresponding to the columns
containing the leading 1s in the row-reduced matrix; thus a basis for col(A) is given by
\left\{\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\right\}.
Finally, from the row-reduced form, the solutions of Ax = 0 where
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
can be found by noting that x_2 = s, x_4 = t, and x_5 = u are free variables. Solving for the other two variables gives x_1 = 2s - t - (1/2)u and x_3 = -3t - (7/2)u. Thus the nullspace of A is given by
\begin{bmatrix} 2s - t - \frac{1}{2}u \\ s \\ -3t - \frac{7}{2}u \\ t \\ u \end{bmatrix} = s\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + t\begin{bmatrix} -1 \\ 0 \\ -3 \\ 1 \\ 0 \end{bmatrix} + u\begin{bmatrix} -1/2 \\ 0 \\ -7/2 \\ 0 \\ 1 \end{bmatrix} = span\left(\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ -3 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1/2 \\ 0 \\ -7/2 \\ 0 \\ 1 \end{bmatrix}\right).
Clearing fractions, we get for a basis for the nullspace of A
\left\{\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ -3 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ -7 \\ 0 \\ 2 \end{bmatrix}\right\}.
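The fractions in Exercise 20 invite arithmetic slips, so a CAS check is worthwhile: sympy confirms the rank, the nullity predicted by the Rank Theorem, and that each basis vector really solves Ax = 0.

```python
from sympy import Matrix

A = Matrix([[2, -4, 0, 2, 1], [-1, 2, 1, 2, 3], [1, -2, 1, 4, 4]])

ns = A.nullspace()
assert A.rank() == 2
assert len(ns) == 5 - A.rank()            # Rank Theorem: nullity = n - rank
for v in ns:
    assert A * v == Matrix([0, 0, 0])     # each basis vector lies in null(A)
```

sympy's particular basis vectors may differ from the hand-cleared ones by scaling, but they span the same nullspace.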
21. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT .
Then
A^T = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ -1 & 1 \end{bmatrix} → \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix},
so that the column space of AT is given by the columns of AT corresponding to columns of this matrix
containing a leading 1, so
col(A^T) = \left\{\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right\} ⇒ row(A) = \{[1\ 0\ -1], [1\ 1\ 1]\}.
The row space of AT is given by the nonzero rows of the reduced matrix, so
row(A^T) = \{[1\ 0], [0\ 1]\} ⇒ col(A) = \left\{\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}.
22. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT .
Then
A^T = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & -1 \\ -3 & 1 & -4 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix},
so that the column space of AT is given by the columns of AT corresponding to columns of the reduced
matrix containing a leading 1, so
col(A^T) = \left\{\begin{bmatrix} 1 \\ 1 \\ -3 \end{bmatrix}, \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}\right\} ⇒ row(A) = \{[1\ 1\ -3], [0\ 2\ 1]\}.
The row space of AT is given by the nonzero rows of the reduced matrix, so
row(A^T) = \{[1\ 0\ 1], [0\ 1\ -1]\} ⇒ col(A) = \left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\right\}.
23. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT .
Then
A^T = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & -1 & -1 \\ 1 & 1 & -1 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},
so that the column space of AT is given by the columns of AT corresponding to columns of the reduced
matrix containing a leading 1, so
col(A^T) = \left\{\begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \\ -1 \end{bmatrix}\right\} ⇒ row(A) = \{[1\ 1\ 0\ 1], [0\ 1\ -1\ 1], [0\ 1\ -1\ -1]\}.
The row space of AT is given by the nonzero rows of the reduced matrix, so
row(A^T) = \{[1\ 0\ 0], [0\ 1\ 0], [0\ 0\ 1]\} ⇒ col(A) = \left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\}.
24. Since row(A) = col(AT ) and col(A) = row(AT ), it suffices to find the row and column spaces of AT .
Then
A^T = \begin{bmatrix} 2 & -1 & 1 \\ -4 & 2 & -2 \\ 0 & 1 & 1 \\ 2 & 2 & 4 \\ 1 & 3 & 4 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
so that the column space of AT is given by the columns of AT corresponding to columns of the reduced
matrix containing a leading 1, so
col(A^T) = \left\{\begin{bmatrix} 2 \\ -4 \\ 0 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 2 \\ 1 \\ 2 \\ 3 \end{bmatrix}\right\} ⇒ row(A) = \{[2\ -4\ 0\ 2\ 1], [-1\ 2\ 1\ 2\ 3]\}.
The row space of AT is given by the nonzero rows of the reduced matrix, so
row(A^T) = \{[1\ 0\ 1], [0\ 1\ 1]\} ⇒ col(A) = \left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\right\}.
25. There is no reason to expect the bases for the row and column spaces found in Exercises 17 and 21 to be the same, since we used different methods to find them. What is important is that they should span the same subspaces. The basis for the row space found in Exercise 17 was \{[1\ 0\ -1], [0\ 1\ 2]\}, and Exercise 21 gave \{[1\ 0\ -1], [1\ 1\ 1]\}. Since the first basis vector is the same, in order to show the subspaces are the same it suffices to see that [0\ 1\ 2] is in the span of the second basis, and that [1\ 1\ 1] is in the span of the first basis. But

[0\ 1\ 2] = -[1\ 0\ -1] + [1\ 1\ 1], \qquad [1\ 1\ 1] = [1\ 0\ -1] + [0\ 1\ 2].

So the two bases span the same row space. Similarly, the bases for the two column spaces were

Exercise 17: \left\{\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}, \qquad Exercise 21: \left\{\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}.

Here the second basis vectors are the same, and since

\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 \end{bmatrix},

each basis vector is a linear combination of the vectors in the other basis, so they span the same column space.
26. There is no reason to expect the bases for the row and column spaces found in Exercises 18 and 22 to be the same, since we used different methods to find them. What is important is that they should span the same subspaces. The basis for the row space found in Exercise 18 was \{[2\ 0\ -7], [0\ 2\ 1]\}, and Exercise 22 gave \{[1\ 1\ -3], [0\ 2\ 1]\}. Since the second basis vector is the same, in order to show the subspaces are the same it suffices to see that [2\ 0\ -7] is in the span of the second basis, and that [1\ 1\ -3] is in the span of the first basis. But

[2\ 0\ -7] = 2[1\ 1\ -3] - [0\ 2\ 1], \qquad [1\ 1\ -3] = \frac{1}{2}[2\ 0\ -7] + \frac{1}{2}[0\ 2\ 1].

So the two bases span the same row space. Similarly, the bases for the two column spaces were

Exercise 18: \left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}\right\}, \qquad Exercise 22: \left\{\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\right\}.

Here the first basis vectors are the same, and since

\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + 2\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \qquad \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix},

each basis vector is a linear combination of the vectors in the other basis, so they span the same column space.
27. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the
row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are [1\ -1\ 0], [-1\ 0\ 1], and [0\ 1\ -1] and row-reduce it:
\begin{bmatrix} 1 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & -1 \end{bmatrix} \xrightarrow{R_2 + R_1} \begin{bmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & -1 \end{bmatrix} \xrightarrow{-R_2} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{bmatrix} \xrightarrow{R_1 + R_2,\; R_3 - R_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}
A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore
given by the nonzero rows of the reduced matrix:
\left\{\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}\right\}.
Note that since no row interchanges were involved, the rows of the original matrix corresponding to
nonzero rows in the reduced matrix also form a basis for the span of the given vectors.
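Stacking the vectors as rows and row-reducing, as in Exercises 27–30, is exactly what a CAS does; the nonzero rows of the reduced row echelon form are a basis for the span. A sympy check for Exercise 27:

```python
from sympy import Matrix

M = Matrix([[1, -1, 0], [-1, 0, 1], [0, 1, -1]])

R, pivots = M.rref()
basis = [R.row(i) for i in range(len(pivots))]   # nonzero rows of the RREF
```

Here rank is 2, and the two basis rows agree with the hand computation above (sympy reduces all the way to RREF, so any echelon-form basis differing by row operations spans the same space).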
28. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the
row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are [1\ -1\ 1], [1\ 2\ 0], [0\ 1\ 1], and [2\ 1\ 2] and row-reduce it:
\begin{bmatrix} 1 & -1 & 1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \\ 2 & 1 & 2 \end{bmatrix} \xrightarrow{R_2 - R_1,\; R_4 - 2R_1} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & 3 & 0 \end{bmatrix} \xrightarrow{R_2 \leftrightarrow R_4} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 3 & 0 \\ 0 & 1 & 1 \\ 0 & 3 & -1 \end{bmatrix} \xrightarrow{\frac{1}{3}R_2} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 3 & -1 \end{bmatrix} \xrightarrow{R_3 - R_2,\; R_4 - 3R_2} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & -1 \end{bmatrix} \xrightarrow{R_4 + R_3} \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}
A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore
given by
\left\{\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\}.
(Note that this matrix could be further reduced, and we would get a different basis).
29. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the
row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are [2\ -3\ 1], [1\ -1\ 0], and [4\ -4\ 1] and row-reduce it:
\begin{bmatrix} 2 & -3 & 1 \\ 1 & -1 & 0 \\ 4 & -4 & 1 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix} 1 & -1 & 0 \\ 2 & -3 & 1 \\ 4 & -4 & 1 \end{bmatrix} \xrightarrow{R_2 - 2R_1,\; R_3 - 4R_1} \begin{bmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{-R_2} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_2 + R_3} \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{R_1 + R_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
A basis for the row space of the matrix, and thus a basis for the span of the given vectors, is therefore
given by
\{[1\ 0\ 0], [0\ 1\ 0], [0\ 0\ 1]\}.
(Note that as soon as we got to the first matrix on the second line above, we could have concluded that
the reduced matrix would have no zero rows; since it was 3 × 3, it would thus be the identity matrix,
so that we could immediately write down the basis above, or simply use the basis given by the rows of
that matrix).
30. As in Example 3.46, a basis for the span of the given set of vectors is the same as a basis for the
row space of the matrix that has rows equal to these vectors. Thus we form a matrix whose rows are [0\ 1\ -2\ 1], [3\ 1\ -1\ 0], and [2\ 1\ 5\ 1] and row-reduce it:
 1





1
1
3 R1
3 1 −1 0 −−
1
0 1 −2 1
−
0
3
3
 −→ 
 R1 ↔R2 


0 1 −2 1
0 1 −2 1
3 1 −1 0 −
−
−
−
−
→






2
1
5
1
2 1
5 1
2 1



1 31 − 31 0
1



0 1 −2 1
0



R3 − 31 R2
R −2R1
17
0 13
1
−−3−−−→
−−−−−−→ 0
3


1 13 − 31
0


0 1 −2
1


3
R3
2
1 19
−19
−−→ 0 0
5
1
1
3
− 13
1
−2
0
19
3

0

1

2
3
Although this matrix could be reduced further, we already have a basis for the row space:
\{[1\ 1/3\ -1/3\ 0], [0\ 1\ -2\ 1], [0\ 0\ 1\ 2/19]\}, \quad \text{or} \quad \{[3\ 1\ -1\ 0], [0\ 1\ -2\ 1], [0\ 0\ 19\ 2]\}.
31. If, as in Exercise 29, we form the matrix whose rows are those vectors and row-reduce it, we get the identity matrix (see Exercise 29). Thus the rank of the matrix is 3, so the three rows are linearly independent, and a basis consists of all three given vectors: \{[2\ -3\ 1], [1\ -1\ 0], [4\ -4\ 1]\}.
32. If, as in Exercise 30, we form the matrix whose rows are those vectors and row-reduce it, we get a
matrix with no zero rows. Thus the rank of the matrix is 3, so the rows are linearly independent, so a
basis consists of all three row vectors:
\{[0\ 1\ -2\ 1], [3\ 1\ -1\ 0], [2\ 1\ 5\ 1]\}.
33. If R is a matrix in echelon form, then row(R) is the span of the rows of R, by definition. But nonzero
rows of R are linearly independent, since their first entries are in different columns, so those nonzero
rows form a basis for row(R) (they span, and are linearly independent).
34. col(A) is the span of the columns of A by definition. If the columns of A are also linearly independent,
then they form a basis for col(A) by definition of “basis”.
35. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From
Exercise 17, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of
columns minus the rank, or 3 − 2 = 1.
36. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From
Exercise 18, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of
columns minus the rank, or 3 − 2 = 1.
37. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From
Exercise 19, row(A) has three vectors in its basis, so rank(A) = 3. Then nullity(A) is the number of
columns minus the rank, or 4 − 3 = 1.
38. The rank of A is the number of vectors in a basis for either row(A) or col(A), by definition. From
Exercise 20, row(A) has two vectors in its basis, so rank(A) = 2. Then nullity(A) is the number of
columns minus the rank, or 5 − 2 = 3.
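The bookkeeping in Exercises 35–38 is just the Rank Theorem, nullity(A) = n - rank(A) where n is the number of columns; a quick numeric check for the matrix of Exercises 17 and 35:

```python
import numpy as np

A = np.array([[1, 0, -1], [1, 1, 1]])   # matrix of Exercises 17 and 35

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank             # Rank Theorem: nullity = n - rank
```

Here rank is 2 and nullity is 1, matching Exercise 35. (matrix_rank works in floating point; for matrices with small entries like these it agrees with the exact rank.)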
39. Since nullity(A) > 0, the equation Ax = 0 has a nontrivial solution x = \begin{bmatrix} c_1 & c_2 & \cdots & c_n \end{bmatrix}^T, where at least one c_i ≠ 0. Then Ax = c_1a_1 + c_2a_2 + \cdots + c_na_n = 0. Since at least one c_i ≠ 0, this shows that the columns of A are linearly dependent.
40. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in
this case rank(A) ≤ 2. Thus any row-reduced form of A will have at least 4 − 2 = 2 zero rows, which
means that some row of A is a linear combination of the other rows, so that the rows of A are linearly
dependent.
41. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in
this case rank(A) ≤ 5 and rank(A) ≤ 3, so that rank(A) ≤ 3 and its possible values are 0, 1, 2, and 3.
Using the Rank Theorem, we know that
nullity(A) = n − rank(A) = 5 − rank(A), so that nullity(A) = 5, 4, 3, or 2.
42. Since rank(A) is at most the number of columns of A and also at most the number of rows of A, in
this case rank(A) ≤ 2 and rank(A) ≤ 4, so that rank(A) ≤ 2 and its possible values are 0, 1, and 2.
Using the Rank Theorem, we know that
nullity(A) = n − rank(A) = 2 − rank(A), so that nullity(A) = 2, 1, or 0.
43. First row-reduce A:

    [1 2 a; −2 4a 2; a −2 1] → (R2 + 2R1, R3 − aR1) → [1 2 a; 0 4a+4 2a+2; 0 −2a−2 1−a²]
    → (R3 + (1/2)R2) → [1 2 a; 0 4a+4 2a+2; 0 0 −a²+a+2].

If a = −1, then we have [1 2 −1; 0 0 0; 0 0 0], so that rank(A) = 1.

If a = 2, then we have [1 2 2; 0 12 6; 0 0 0], so that rank(A) = 2.

Otherwise, the row-reduced matrix has all rows nonzero, since −a² + a + 2 = −(a − 2)(a + 1) ≠ 0, so that rank(A) = 3.
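The case analysis above can be spot-checked numerically. A minimal sketch (assuming NumPy is available) computing the rank of the matrix for several values of the parameter a:

```python
import numpy as np

def rank_of_A(a):
    """Rank of the matrix from Exercise 43 for a given parameter a."""
    A = np.array([[1, 2, a],
                  [-2, 4 * a, 2],
                  [a, -2, 1]], dtype=float)
    return np.linalg.matrix_rank(A)

# a = -1 and a = 2 are the zeros of -a^2 + a + 2 = -(a - 2)(a + 1);
# any other value (e.g. a = 0) gives full rank.
print(rank_of_A(-1), rank_of_A(2), rank_of_A(0))  # 1 2 3
```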
3.5. SUBSPACES, BASIS, DIMENSION, AND RANK
44. First row-reduce A:

    [a 2 −1; 3 3 −2; −2 −1 a] → (R1 ↔ R3, then R1 + R2) → [1 2 a−2; 3 3 −2; a 2 −1]
    → (R2 − 3R1, R3 − aR1) → [1 2 a−2; 0 −3 4−3a; 0 2−2a −a²+2a−1]
    → (−(1/3)R2) → [1 2 a−2; 0 1 a−4/3; 0 2−2a −a²+2a−1]
    → (R3 − (2−2a)R2) → [1 2 a−2; 0 1 a−4/3; 0 0 (a−1)(a−5/3)].

The first two rows are always nonzero. If a = 1 or a = 5/3, then the bottom row is zero, so that rank(A) = 2. Otherwise, all three rows are nonzero and rank(A) = 3.
45. As in Example 3.52, [1; 1; 0], [1; 0; 1], [0; 1; 1] form a basis for R³ if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix:

    [1 1 0; 1 0 1; 0 1 1] → (R2 − R1) → [1 1 0; 0 −1 1; 0 1 1] → (R3 + R2) → [1 1 0; 0 −1 1; 0 0 2].

Since the matrix has rank 3 (it is in echelon form and has no zero rows), the vectors are linearly independent; since there are three of them, they form a basis for R³.
46. As in Example 3.52, [1; −1; 3], [−1; 5; 1], [1; −3; 1] form a basis for R³ if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix:

    [1 −1 3; −1 5 1; 1 −3 1] → (R2 + R1, R3 − R1) → [1 −1 3; 0 4 4; 0 −2 −2] → (R2 + 2R3) → [1 −1 3; 0 0 0; 0 −2 −2].

Since the matrix does not have rank 3, it follows that these vectors do not form a basis for R³. (In fact,

    −[1; −1; 3] + [−1; 5; 1] + 2[1; −3; 1] = [0; 0; 0],

so the vectors are linearly dependent.)
47. As in Example 3.52, [1; 1; 1; 0], [1; 1; 0; 1], [1; 0; 1; 1], [0; 1; 1; 1] form a basis for R⁴ if and only if the matrix with those vectors as its rows has rank 4. Row-reduce that matrix:

    [1 1 1 0; 1 1 0 1; 1 0 1 1; 0 1 1 1] → (R2 − R1, R3 − R1) → [1 1 1 0; 0 0 −1 1; 0 −1 0 1; 0 1 1 1]
    → (R2 ↔ R4) → [1 1 1 0; 0 1 1 1; 0 −1 0 1; 0 0 −1 1]
    → (R3 + R2) → [1 1 1 0; 0 1 1 1; 0 0 1 2; 0 0 −1 1]
    → (R4 + R3) → [1 1 1 0; 0 1 1 1; 0 0 1 2; 0 0 0 3].
This matrix is in echelon form and has no zero rows, so it has rank 4 and thus these vectors form a
basis for R4 .
48. As in Example 3.52, [1; −1; 0; 0], [0; 1; 0; −1], [0; 0; −1; 1], [−1; 0; 1; 0] form a basis for R⁴ if and only if the matrix with those vectors as its rows has rank 4. Row-reduce that matrix:

    [1 −1 0 0; 0 1 0 −1; 0 0 −1 1; −1 0 1 0] → (R4 + R1) → [1 −1 0 0; 0 1 0 −1; 0 0 −1 1; 0 −1 1 0]
    → (R4 + R2) → [1 −1 0 0; 0 1 0 −1; 0 0 −1 1; 0 0 1 −1]
    → (R4 + R3) → [1 −1 0 0; 0 1 0 −1; 0 0 −1 1; 0 0 0 0].
Since the matrix does not have rank 4, these vectors do not form a basis for R4 . (In fact, the sum of
the four vectors is zero, so they are linearly dependent.)
49. As in Example 3.52, [1; 1; 0], [0; 1; 1], [1; 0; 1] form a basis for Z₂³ if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix over Z₂:

    [1 1 0; 0 1 1; 1 0 1] → (R3 + R1) → [1 1 0; 0 1 1; 0 1 1] → (R3 + R2) → [1 1 0; 0 1 1; 0 0 0].

Since the matrix does not have rank 3, these vectors do not form a basis for Z₂³.
50. As in Example 3.52, [1; 1; 0], [0; 1; 1], [1; 0; 1] form a basis for Z₃³ if and only if the matrix with those vectors as its rows has rank 3. Row-reduce that matrix over Z₃:

    [1 1 0; 0 1 1; 1 0 1] → (R3 + 2R1) → [1 1 0; 0 1 1; 0 2 1] → (R3 + R2) → [1 1 0; 0 1 1; 0 0 2].

Since the matrix has rank 3, these vectors form a basis for Z₃³.
51. We want to solve

    c1[1; 2; 0] + c2[1; 0; −1] = [1; 6; 2],

so set up the augmented matrix of the linear system and row-reduce to solve it:

    [1 1 | 1; 2 0 | 6; 0 −1 | 2] → (R2 − 2R1) → [1 1 | 1; 0 −2 | 4; 0 −1 | 2]
    → (−(1/2)R2) → [1 1 | 1; 0 1 | −2; 0 −1 | 2] → (R3 + R2, R1 − R2) → [1 0 | 3; 0 1 | −2; 0 0 | 0].

Thus c1 = 3 and c2 = −2, so that

    3[1; 2; 0] − 2[1; 0; −1] = [1; 6; 2].

Then the coordinate vector is [w]B = [3; −2].
52. We want to solve

    c1[3; 1; 4] + c2[5; 1; 6] = [1; 3; 4],

so set up the augmented matrix of the linear system and row-reduce to solve it:

    [3 5 | 1; 1 1 | 3; 4 6 | 4] → (R1 ↔ R2) → [1 1 | 3; 3 5 | 1; 4 6 | 4]
    → (R2 − 3R1, R3 − 4R1) → [1 1 | 3; 0 2 | −8; 0 2 | −8]
    → ((1/2)R2) → [1 1 | 3; 0 1 | −4; 0 2 | −8] → (R3 − 2R2, R1 − R2) → [1 0 | 7; 0 1 | −4; 0 0 | 0].

Thus c1 = 7 and c2 = −4, so that

    7[3; 1; 4] − 4[5; 1; 6] = [1; 3; 4].

Then the coordinate vector is [w]B = [7; −4].
53. Row-reduce A over Z₂:

    A = [1 1 0; 0 1 1; 1 0 1] → (R3 + R1) → [1 1 0; 0 1 1; 0 1 1] → (R3 + R2) → [1 1 0; 0 1 1; 0 0 0].
This matrix is in echelon form with one zero row, so it has rank 2. Thus nullity(A) = 3 − rank(A) =
3 − 2 = 1.
54. Row-reduce A over Z₃:

    A = [1 1 2; 2 1 2; 2 0 0] → (R2 + R1, R3 + R1) → [1 1 2; 0 2 1; 0 1 2] → (R3 + R2) → [1 1 2; 0 2 1; 0 0 0].
This matrix is in echelon form with one zero row, so it has rank 2. Thus nullity(A) = 3 − rank(A) =
3 − 2 = 1.
55. Row-reduce A over Z₅:

    A = [1 3 1 4; 2 3 0 1; 1 0 4 0] → (R2 + 3R1, R3 + 4R1) → [1 3 1 4; 0 2 3 3; 0 2 3 1]
    → (R3 + 4R2) → [1 3 1 4; 0 2 3 3; 0 0 0 3].
This matrix is in echelon form with no zero rows, so it has rank 3. Thus nullity(A) = 4 − rank(A) =
4 − 3 = 1.
56. Row-reduce A over Z₇:

    A = [2 4 0 0 1; 6 3 5 1 0; 1 0 2 2 5; 1 1 1 1 1]
    → (R2 + 4R1, R3 + 3R1, R4 + 3R1) → [2 4 0 0 1; 0 5 5 1 4; 0 5 2 2 1; 0 6 1 1 4]
    → (R3 + 6R2, R4 + 3R2) → [2 4 0 0 1; 0 5 5 1 4; 0 0 4 1 4; 0 0 2 4 2]
    → (R4 + 3R3) → [2 4 0 0 1; 0 5 5 1 4; 0 0 4 1 4; 0 0 0 0 0].
This matrix is in echelon form with one zero row, so it has rank 3. Thus nullity(A) = 5 − rank(A) =
5 − 3 = 2.
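The ranks over Zₚ in Exercises 53–56 can be computed with a small Gaussian-elimination routine in modular arithmetic. A minimal sketch in pure Python (no libraries; `pow(x, -1, p)` gives the modular inverse for prime p):

```python
def rank_mod_p(rows, p):
    """Rank of a matrix over Z_p via Gaussian elimination mod p (p prime)."""
    A = [[x % p for x in row] for row in rows]
    m, n = len(A), len(A[0])
    rank, col = 0, 0
    while rank < m and col < n:
        # Find a row with a nonzero pivot in this column.
        pivot = next((r for r in range(rank, m) if A[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        inv = pow(A[rank][col], -1, p)          # modular inverse of the pivot
        A[rank] = [(x * inv) % p for x in A[rank]]
        for r in range(m):                       # eliminate the column elsewhere
            if r != rank and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % p for a, b in zip(A[r], A[rank])]
        rank += 1
        col += 1
    return rank

# Exercise 55: rank 3 over Z_5, so nullity = 4 - 3 = 1.
print(rank_mod_p([[1, 3, 1, 4], [2, 3, 0, 1], [1, 0, 4, 0]], 5))  # 3
```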
57. Since x is in null(A), we know that Ax = 0. But if Ai is the ith row of A, then

    Ax = [a11 x1 + a12 x2 + · · · + a1n xn; a21 x1 + a22 x2 + · · · + a2n xn; . . . ; am1 x1 + am2 x2 + · · · + amn xn]
       = [A1 · x; A2 · x; . . . ; Am · x] = [0; 0; . . . ; 0].

Thus Ai · x = 0 for all i, so that every row of A is orthogonal to x.
58. By the Fundamental Theorem, if A and B have rank n, then they are both invertible. Therefore, by
Theorem 3.9(c), AB is invertible as well. Applying the Fundamental Theorem again shows that AB
also has rank n.
59. (a) Let bk be the k th column of B. Then Abk is the k th column of AB. Now suppose that some set
of columns of AB, say Abk1 , Abk2 , . . . , Abkr , are linearly independent. Then
    ck1 bk1 + ck2 bk2 + · · · + ckr bkr = 0 ⇒
    A(ck1 bk1 + ck2 bk2 + · · · + ckr bkr) = 0 ⇒
    ck1 (Abk1) + ck2 (Abk2) + · · · + ckr (Abkr) = 0.
But the Abki are linearly independent, so all the cki are zero and thus the bki are linearly
independent. So the number of linearly independent columns in B is at least as large as the
number of linearly independent columns of AB; it follows that rank(AB) ≤ rank(B).
(b) There are many examples. For instance, let

    A = [1 1; 0 0],  B = [1 0; 1 1].

Then B has rank 2, and AB = [2 1; 0 0] has rank 1.
60. (a) Recall that by Theorem 3.25, if C is any matrix, then rank(C T ) = rank(C). Thus
rank(AB) = rank((AB)T ) = rank(B T AT ) ≤ rank(AT ) = rank(A),
using Exercise 59(a) with AT in place of B and B T in place of A.
(b) For example, we can use the transposes of the matrices in part (b) of Exercise 59:

    A = [1 1; 0 1],  B = [1 0; 1 0].

Then A has rank 2, and AB = [2 0; 1 0] has rank 1.
61. (a) Since U is invertible, we have A = U⁻¹(UA), so that using Exercise 59(a) twice,

    rank(A) = rank(U⁻¹(UA)) ≤ rank(UA) ≤ rank(A).

Thus rank(A) = rank(UA).

(b) Since V is invertible, we have A = (AV)V⁻¹, so that using Exercise 60(a) twice,

    rank(A) = rank((AV)V⁻¹) ≤ rank(AV) ≤ rank(A).

Thus rank(A) = rank(AV).
62. Suppose first that A has rank 1. Then col(A) = span(u) for some vector u. It follows that if ai is the ith column of A, then for each i we have ai = ci u. Let vᵀ = [c1 c2 · · · cn]. Then A = uvᵀ. For the converse, suppose that A = uvᵀ, where vᵀ = [c1 c2 · · · cn]. Then

    A = uvᵀ = [c1 u  c2 u  · · ·  cn u].

Thus every column of A is a scalar multiple of u, so that col(A) = span(u) and thus rank(A) = 1.
63. If rank(A) = r, then a basis for col(A) consists of r vectors, say u1, u2, . . . , ur ∈ Rᵐ. So if ai is the ith column of A, then for each i we have

    ai = ci1 u1 + ci2 u2 + · · · + cir ur,  i = 1, 2, . . . , n.

Now let vjᵀ = [c1j c2j · · · cnj] for j = 1, 2, . . . , r. Then

    A = [a1 a2 · · · an] = [c11 u1 + · · · + c1r ur  · · ·  cn1 u1 + · · · + cnr ur]
      = [c11 u1 · · · cn1 u1] + [c12 u2 · · · cn2 u2] + · · · + [c1r ur · · · cnr ur]
      = u1 v1ᵀ + u2 v2ᵀ + · · · + ur vrᵀ.

By Exercise 62, each uj vjᵀ is a matrix of rank 1, so A is a sum of r matrices of rank 1.
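One concrete way to produce such a rank-one decomposition is the singular value decomposition, which writes A as a sum of rank(A) rank-one outer products. A minimal sketch (assuming NumPy is available); the specific matrix is an illustrative choice, not one from the text:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],      # twice row 1, so A has rank 2
              [1.0, 1.0, 1.0]])
U, s, Vt = np.linalg.svd(A)
r = np.linalg.matrix_rank(A)

# A equals the sum of r rank-one matrices  s_i * u_i v_i^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))
print(r, np.allclose(A, A_rebuilt))  # 2 True
```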
64. Let A = {u1 , . . . , ur } be a basis for col(A) and B = {v1 , . . . , vs } be a basis for col(B). Then rank(A) =
r and rank(B) = s. Let ai and bi be the columns of A and B respectively, for i = 1, 2, . . . , n. Then if
x ∈ col(A + B),


!
!
n
n
r
s
r
n
s
n
X
X
X
X
X
X
X
X

x=
c1 (ai + bi ) =
αij uj +
βik vk  =
ci αij uj +
ci βik vk ,
i=1
i=1
j=1
k=1
j=1
i=1
k=1
i=1
so that x is a linear combination of A and B. Thus A ∪ B spans col(A + B). But this implies that
col(A + B) ⊆ span(A ∪ B) = span(A) + span(B) = col(A) + col(B).
Since rank(A + B) is the dimension of col(A + B) and rank(A) + rank(B) is the dimension of col(A)
plus the dimension of col(B), we are done.
65. We show that col(A) ⊆ null(A) and then use the Rank Theorem. Suppose x = Σ ci ai ∈ col(A), where ai is the ith column of A; then ai = Aei, where the ei are the standard basis vectors for Rⁿ. Then, since A² = O,

    Ax = A Σ ci ai = Σ ci (Aai) = Σ ci (A(Aei)) = Σ ci A² ei = Σ ci (Oei) = 0.

Thus any vector in the column space of A is taken to zero by A, so it is in the null space of A. Thus col(A) ⊆ null(A). But then

    rank(A) = dim(col(A)) ≤ dim(null(A)) = nullity(A),

so by the Rank Theorem,

    rank(A) + rank(A) ≤ rank(A) + nullity(A) = n  ⇒  2 rank(A) ≤ n  ⇒  rank(A) ≤ n/2.
66. (a) Note first that since x is n × 1, xᵀ is 1 × n and thus xᵀAx is 1 × 1, so that (xᵀAx)ᵀ = xᵀAx. Now since A is skew-symmetric, Aᵀ = −A, so

    xᵀAx = (xᵀAx)ᵀ = xᵀAᵀ(xᵀ)ᵀ = xᵀAᵀx = xᵀ(−A)x = −xᵀAx.

So xᵀAx is equal to its own negation, so it must be zero.
(b) By Theorem 3.27, to show I + A is invertible it suffices to show that null(I + A) = {0}. Choose x with (I + A)x = 0. Then xᵀ(I + A)x = xᵀ0 = 0, so that

    0 = xᵀ(I + A)x = xᵀIx + xᵀAx = xᵀx + 0 = x · x,

using part (a). But x · x = 0 implies that x = 0. So any vector in null(I + A) is the zero vector and thus I + A has a zero null space, so it is invertible.
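Both parts of Exercise 66 can be sanity-checked numerically. A minimal sketch (assuming NumPy is available) using a randomly generated skew-symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M - M.T                          # A is skew-symmetric: A^T = -A

x = rng.standard_normal(4)
print(abs(x @ A @ x) < 1e-9)         # part (a): x^T A x = 0 up to rounding

# Part (b): I + A is invertible, so its determinant is nonzero.
print(abs(np.linalg.det(np.eye(4) + A)) > 0.5)
```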
3.6 Introduction to Linear Transformations
1. We must compute T(u) and T(v), so we take the product of the matrix A of T with each of these two vectors:

    T(u) = Au = [2 −1; 3 4][1; 2] = [0; 11],
    T(v) = Av = [2 −1; 3 4][3; −2] = [8; 1].
2. We must compute T(u) and T(v), so we take the product of the matrix A of T with each of these two vectors:

    T(u) = Au = [3 −1; 1 2; 1 4][1; 2] = [3·1 − 1·2; 1·1 + 2·2; 1·1 + 4·2] = [1; 5; 9],
    T(v) = Av = [3 −1; 1 2; 1 4][3; −2] = [3·3 − 1·(−2); 1·3 + 2·(−2); 1·3 + 4·(−2)] = [11; −1; −5].
3. Use the remark following Example 3.55: T is a linear transformation if and only if T(c1 v1 + c2 v2) = c1 T(v1) + c2 T(v2), where v1 = [x1; y1] and v2 = [x2; y2]. Then

    T(c1 v1 + c2 v2) = T([c1 x1 + c2 x2; c1 y1 + c2 y2])
      = [(c1 x1 + c2 x2) + (c1 y1 + c2 y2); (c1 x1 + c2 x2) − (c1 y1 + c2 y2)]
      = [(c1 x1 + c1 y1) + (c2 x2 + c2 y2); (c1 x1 − c1 y1) + (c2 x2 − c2 y2)]
      = [c1 x1 + c1 y1; c1 x1 − c1 y1] + [c2 x2 + c2 y2; c2 x2 − c2 y2]
      = c1 [x1 + y1; x1 − y1] + c2 [x2 + y2; x2 − y2]
      = c1 T(v1) + c2 T(v2),

and thus T is linear.
4. Use the remark following Example 3.55: T is a linear transformation if and only if T(c1 v1 + c2 v2) = c1 T(v1) + c2 T(v2), where v1 = [x1; y1] and v2 = [x2; y2]. Then

    T(c1 v1 + c2 v2) = T([c1 x1 + c2 x2; c1 y1 + c2 y2])
      = [−(c1 y1 + c2 y2); (c1 x1 + c2 x2) + 2(c1 y1 + c2 y2); 3(c1 x1 + c2 x2) − 4(c1 y1 + c2 y2)]
      = [−c1 y1; c1 x1 + 2c1 y1; 3c1 x1 − 4c1 y1] + [−c2 y2; c2 x2 + 2c2 y2; 3c2 x2 − 4c2 y2]
      = c1 [−y1; x1 + 2y1; 3x1 − 4y1] + c2 [−y2; x2 + 2y2; 3x2 − 4y2]
      = c1 T(v1) + c2 T(v2),

and thus T is linear.
5. Use the remark following Example 3.55: T is a linear transformation if and only if T(c1 v1 + c2 v2) = c1 T(v1) + c2 T(v2), where v1 = [x1; y1; z1] and v2 = [x2; y2; z2]. Then

    T(c1 v1 + c2 v2) = T([c1 x1 + c2 x2; c1 y1 + c2 y2; c1 z1 + c2 z2])
      = [(c1 x1 + c2 x2) − (c1 y1 + c2 y2) + (c1 z1 + c2 z2); 2(c1 x1 + c2 x2) + (c1 y1 + c2 y2) − 3(c1 z1 + c2 z2)]
      = [c1 x1 − c1 y1 + c1 z1; 2c1 x1 + c1 y1 − 3c1 z1] + [c2 x2 − c2 y2 + c2 z2; 2c2 x2 + c2 y2 − 3c2 z2]
      = c1 [x1 − y1 + z1; 2x1 + y1 − 3z1] + c2 [x2 − y2 + z2; 2x2 + y2 − 3z2]
      = c1 T(v1) + c2 T(v2),

and thus T is linear.
6. Use the remark following Example 3.55: T is a linear transformation if and only if T(c1 v1 + c2 v2) = c1 T(v1) + c2 T(v2), where v1 = [x1; y1; z1] and v2 = [x2; y2; z2]. Then

    T(c1 v1 + c2 v2) = T([c1 x1 + c2 x2; c1 y1 + c2 y2; c1 z1 + c2 z2])
      = [(c1 x1 + c2 x2) + (c1 z1 + c2 z2); (c1 y1 + c2 y2) + (c1 z1 + c2 z2); (c1 x1 + c2 x2) + (c1 y1 + c2 y2)]
      = [c1 x1 + c1 z1; c1 y1 + c1 z1; c1 x1 + c1 y1] + [c2 x2 + c2 z2; c2 y2 + c2 z2; c2 x2 + c2 y2]
      = c1 [x1 + z1; y1 + z1; x1 + y1] + c2 [x2 + z2; y2 + z2; x2 + y2]
      = c1 T(v1) + c2 T(v2),

and thus T is linear.
7. For example,

    T([1; 0]) = [0; 1²] = [0; 1],  T([2; 0]) = [0; 2²] = [0; 4],

but

    T([1; 0] + [2; 0]) = T([3; 0]) = [0; 3²] = [0; 9] ≠ [0; 1] + [0; 4],

so T is not linear.
8. For example,

    T([−1; 0]) = [|−1|; 0] = [1; 0],  T([1; 0]) = [|1|; 0] = [1; 0],

but

    T([−1; 0] + [1; 0]) = T([0; 0]) = [|0|; 0] = [0; 0] ≠ [1; 0] + [1; 0],

so T is not linear.
9. For example,

    T([2; 2]) = [2 · 2; 2 + 2] = [4; 4],

but

    T([2; 2] + [2; 2]) = T([4; 4]) = [4 · 4; 4 + 4] = [16; 8] ≠ [4; 4] + [4; 4],

so T is not linear.
10. For example,

    T([1; 0]) = [1 + 1; 0 − 1] = [2; −1],

but

    T([1; 0] + [1; 0]) = T([2; 0]) = [2 + 1; 0 − 1] = [3; −1] ≠ [2; −1] + [2; −1],

so T is not linear.
11. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T(ei). Since

    T(e1) = T([1; 0]) = [1 + 0; 1 − 0] = [1; 1],  T(e2) = T([0; 1]) = [0 + 1; 0 − 1] = [1; −1],

the matrix of T is A = [1 1; 1 −1].
12. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T(ei). Since

    T(e1) = T([1; 0]) = [−0; 1 + 2 · 0; 3 · 1 − 4 · 0] = [0; 1; 3],
    T(e2) = T([0; 1]) = [−1; 0 + 2 · 1; 3 · 0 − 4 · 1] = [−1; 2; −4],

the matrix of T is A = [0 −1; 1 2; 3 −4].
13. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T(ei). Since

    T(e1) = T([1; 0; 0]) = [1 − 0 + 0; 2 · 1 + 0 − 3 · 0] = [1; 2],
    T(e2) = T([0; 1; 0]) = [0 − 1 + 0; 2 · 0 + 1 − 3 · 0] = [−1; 1],
    T(e3) = T([0; 0; 1]) = [0 − 0 + 1; 2 · 0 + 0 − 3 · 1] = [1; −3],

the matrix of T is A = [1 −1 1; 2 1 −3].
14. By Theorem 3.31, the standard matrix of T is the matrix A whose ith column is T(ei). Since

    T(e1) = T([1; 0; 0]) = [1 + 0; 0 + 0; 1 + 0] = [1; 0; 1],
    T(e2) = T([0; 1; 0]) = [0 + 0; 1 + 0; 0 + 1] = [0; 1; 1],
    T(e3) = T([0; 0; 1]) = [0 + 1; 0 + 1; 0 + 0] = [1; 1; 0],

the matrix of T is A = [1 0 1; 0 1 1; 1 1 0].
0
1
1
15. Reflecting a vector in the x-axis means negating the x-coordinate. So using the method of Example
3.56,
x
−x
−1
0
,
F
=
=x
+y
y
y
0
1
and thus F is a matrix transformation with matrix
−1
F =
0
0
.
1
It follows that F is a linear transformation.
16. Example 3.58 shows that a rotation of 45° about the origin sends

    [1; 0] to [cos 45°; sin 45°] = [√2/2; √2/2]  and  [0; 1] to [cos 135°; sin 135°] = [−√2/2; √2/2],

so that R is a matrix transformation with matrix R = [√2/2 −√2/2; √2/2 √2/2]. It follows that R is a linear transformation.
17. Since D stretches by a factor of 2 in the x-direction and a factor of 3 in the y-direction, we see that

    D([x; y]) = [2x; 3y] = [2 0; 0 3][x; y].

Thus D is a matrix transformation with matrix [2 0; 0 3]. It follows that D is a linear transformation.
18. Since P = (x, y) projects onto the line whose direction vector is (1, 1), the projection of P onto that line is

    proj₍₁,₁₎(x, y) = ((x, y) · (1, 1))/((1, 1) · (1, 1)) (1, 1) = ((x + y)/2, (x + y)/2).

Thus

    T([x; y]) = [(x + y)/2; (x + y)/2] = [1/2 1/2; 1/2 1/2][x; y].

Thus T is a matrix transformation with the matrix above, and it follows that T is a linear transformation.
19. Let

    A1 = [k 0; 0 1],  A2 = [1 0; 0 k],  B = [0 1; 1 0],  C1 = [1 k; 0 1],  C2 = [1 0; k 1].

Then

    A1[x; y] = [kx; y]:    A1 stretches vectors horizontally by a factor of k.
    A2[x; y] = [x; ky]:    A2 stretches vectors vertically by a factor of k.
    B[x; y]  = [y; x]:     B reflects vectors in the line y = x.
    C1[x; y] = [x + ky; y]: C1 extends vectors horizontally by ky.
    C2[x; y] = [x; y + kx]: C2 extends vectors vertically by kx.

(C1 and C2 are called shears.) [Figures: the images of the unit square under each of these transformations with k = 2.] Note that reflection across the line y = x leaves the square unchanged, since it is symmetric about that line.
20. Using the formula from Example 3.58, we compute the matrix for a counterclockwise rotation of 120°:

    R120°(e1) = [cos 120°; sin 120°] = [−1/2; √3/2]  and  R120°(e2) = [−sin 120°; cos 120°] = [−√3/2; −1/2].

So by Theorem 3.31, we have

    A = [R120°(e1)  R120°(e2)] = [−1/2 −√3/2; √3/2 −1/2].
21. A clockwise rotation of 30° is a rotation of −30°. Then using the formula from Example 3.58, we compute the matrix for that rotation:

    R−30°(e1) = [cos(−30°); sin(−30°)] = [√3/2; −1/2]  and  R−30°(e2) = [−sin(−30°); cos(−30°)] = [1/2; √3/2].

So by Theorem 3.31, we have

    A = [R−30°(e1)  R−30°(e2)] = [√3/2 1/2; −1/2 √3/2].

22. A direction vector of the line y = 2x is d = [1; 2], so

    P`(e1) = ((d · e1)/(d · d)) d = (1/5)[1; 2] = [1/5; 2/5]  and  P`(e2) = ((d · e2)/(d · d)) d = (2/5)[1; 2] = [2/5; 4/5].

Thus by Theorem 3.31, we have

    A = [P`(e1)  P`(e2)] = [1/5 2/5; 2/5 4/5].
23. A direction vector of the line y = −x is d = [1; −1], so

    P`(e1) = ((d · e1)/(d · d)) d = (1/2)[1; −1] = [1/2; −1/2]  and  P`(e2) = ((d · e2)/(d · d)) d = (−1/2)[1; −1] = [−1/2; 1/2].

Thus by Theorem 3.31, we have

    A = [P`(e1)  P`(e2)] = [1/2 −1/2; −1/2 1/2].
24. Reflection in the line y = x is given by

    R([x; y]) = [y; x] = x[0; 1] + y[1; 0] = [0 1; 1 0][x; y],

so that the matrix of R is [0 1; 1 0].
25. Reflection in the line y = −x is given by

    R([x; y]) = [−y; −x] = x[0; −1] + y[−1; 0] = [0 −1; −1 0][x; y],

so that the matrix of R is [0 −1; −1 0].
26. (a) Consider the following diagrams [figures: scalar multiples a, cb and sums a + b together with their reflections in `]. From the left-hand graph, we see that F`(cb) = cF`(b), and the right-hand graph shows that F`(a + b) = F`(a) + F`(b).
(b) Let P`(x) be the projection of x on the line `; this is the image of x under the linear transformation P`. Then from the figure in the text, since the diagonals of the parallelogram shown bisect each other, and the point furthest from the origin on ` is x + F`(x), we have

    x + F`(x) = 2P`(x), so that F`(x) = 2P`(x) − x.

Now, Example 3.59 gives us the matrix of P`, where ` has direction vector d = [d1; d2], so we have

    F`(x) = 2P`(x) − x
      = ( (2/(d1² + d2²)) [d1² d1d2; d1d2 d2²] − [1 0; 0 1] ) x
      = (1/(d1² + d2²)) [2d1² − (d1² + d2²), 2d1d2; 2d1d2, 2d2² − (d1² + d2²)] x
      = (1/(d1² + d2²)) [d1² − d2², 2d1d2; 2d1d2, −d1² + d2²] x,

which shows that the matrix of F` is as claimed in the exercise statement.
(c) If θ is the angle that ` forms with the positive x-axis, we may take

    d = [d1; d2] = [cos θ; sin θ]

as the direction vector. Substituting those values into the formula for F` from part (b) gives

    F` = (1/(d1² + d2²)) [d1² − d2², 2d1d2; 2d1d2, −d1² + d2²]
       = [cos²θ − sin²θ, 2 cos θ sin θ; 2 cos θ sin θ, −cos²θ + sin²θ],

since cos²θ + sin²θ = 1. Now, cos²θ − sin²θ = cos 2θ and 2 cos θ sin θ = sin 2θ, so the matrix above simplifies to

    F` = [cos 2θ sin 2θ; sin 2θ −cos 2θ].
27. By part (b) of the previous exercise, since the line y = 2x has direction vector d = [1; 2], the standard matrix of this reflection is

    F` = (1/(1 + 4)) [1 − 4, 2 · 2; 2 · 2, −1 + 4] = [−3/5 4/5; 4/5 3/5].
28. The direction vector of this line is [1; √3], so that tan θ = √3 and thus θ = π/3 = 60°. So by part (c) of the previous exercise, the standard matrix of this reflection is

    F` = [cos 2θ sin 2θ; sin 2θ −cos 2θ] = [cos 120° sin 120°; sin 120° −cos 120°] = [−1/2 √3/2; √3/2 1/2].
29. Since

    T([x1; x2]) = [x1; 2x1 − x2; 3x1 + 4x2]  and  S([y1; y2; y3]) = [2y1 + y3; 3y2 − y3; y1 − y2; y1 + y2 + y3],

we substitute the result of T into the equation for S:

    (S ∘ T)([x1; x2]) = S([x1; 2x1 − x2; 3x1 + 4x2])
      = [2x1 + (3x1 + 4x2); 3(2x1 − x2) − (3x1 + 4x2); x1 − (2x1 − x2); x1 + (2x1 − x2) + (3x1 + 4x2)]
      = [5x1 + 4x2; 3x1 − 7x2; −x1 + x2; 6x1 + 3x2]
      = [5 4; 3 −7; −1 1; 6 3][x1; x2].

Thus the standard matrix of S ∘ T is A = [5 4; 3 −7; −1 1; 6 3].
30. (a) By direct substitution, we have

    (S ∘ T)([x1; x2]) = S(T([x1; x2])) = S([x1 − x2; x1 + x2]) = [2(x1 − x2); −(x1 + x2)] = [2x1 − 2x2; −x1 − x2].

Thus [S ∘ T] = [2 −2; −1 −1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [2 0; 0 −1] and [T] = [1 −1; 1 1]. Then

    [S][T] = [2 0; 0 −1][1 −1; 1 1] = [2 −2; −1 −1].

Thus [S ∘ T] = [S][T].
31. (a) By direct substitution, we have

    (S ∘ T)([x1; x2]) = S(T([x1; x2])) = S([x1 + 2x2; −3x1 + x2])
      = [(x1 + 2x2) + 3(−3x1 + x2); (x1 + 2x2) − (−3x1 + x2)] = [−8x1 + 5x2; 4x1 + x2].

Thus [S ∘ T] = [−8 5; 4 1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [1 3; 1 −1] and [T] = [1 2; −3 1]. Then

    [S][T] = [1 3; 1 −1][1 2; −3 1] = [−8 5; 4 1].

Thus [S ∘ T] = [S][T].
32. (a) By direct substitution, we have

    (S ∘ T)([x1; x2]) = S(T([x1; x2])) = S([x2; −x1])
      = [x2 + 3(−x1); 2x2 + (−x1); x2 − (−x1)] = [−3x1 + x2; −x1 + 2x2; x1 + x2].

Thus [S ∘ T] = [−3 1; −1 2; 1 1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [1 3; 2 1; 1 −1] and [T] = [0 1; −1 0]. Then

    [S][T] = [1 3; 2 1; 1 −1][0 1; −1 0] = [−3 1; −1 2; 1 1].

Thus [S ∘ T] = [S][T].
33. (a) By direct substitution, we have

    (S ∘ T)([x1; x2; x3]) = S(T([x1; x2; x3])) = S([x1 + x2 − x3; 2x1 − x2 + x3])
      = [4(x1 + x2 − x3) − 2(2x1 − x2 + x3); −(x1 + x2 − x3) + (2x1 − x2 + x3)]
      = [6x2 − 6x3; x1 − 2x2 + 2x3].

Thus [S ∘ T] = [0 6 −6; 1 −2 2].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [4 −2; −1 1] and [T] = [1 1 −1; 2 −1 1]. Then

    [S][T] = [4 −2; −1 1][1 1 −1; 2 −1 1] = [0 6 −6; 1 −2 2].

Thus [S ∘ T] = [S][T].
34. (a) By direct substitution, we have

    (S ∘ T)([x1; x2; x3]) = S(T([x1; x2; x3])) = S([x1 + 2x2; 2x2 − x3])
      = [(x1 + 2x2) − (2x2 − x3); (x1 + 2x2) + (2x2 − x3); −(x1 + 2x2) + (2x2 − x3)]
      = [x1 + x3; x1 + 4x2 − x3; −x1 − x3].

Thus [S ∘ T] = [1 0 1; 1 4 −1; −1 0 −1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [1 −1; 1 1; −1 1] and [T] = [1 2 0; 0 2 −1]. Then

    [S][T] = [1 −1; 1 1; −1 1][1 2 0; 0 2 −1] = [1 0 1; 1 4 −1; −1 0 −1].

Thus [S ∘ T] = [S][T].
35. (a) By direct substitution, we have

    (S ∘ T)([x1; x2; x3]) = S(T([x1; x2; x3])) = S([x1 + x2; x2 + x3; x1 + x3])
      = [(x1 + x2) − (x2 + x3); (x2 + x3) − (x1 + x3); −(x1 + x2) + (x1 + x3)]
      = [x1 − x3; −x1 + x2; −x2 + x3].

Thus [S ∘ T] = [1 0 −1; −1 1 0; 0 −1 1].

(b) Using matrix multiplication, we first find the standard matrices for S and T, which are [S] = [1 −1 0; 0 1 −1; −1 0 1] and [T] = [1 1 0; 0 1 1; 1 0 1]. Then

    [S][T] = [1 −1 0; 0 1 −1; −1 0 1][1 1 0; 0 1 1; 1 0 1] = [1 0 −1; −1 1 0; 0 −1 1].

Thus [S ∘ T] = [S][T].
36. A counterclockwise rotation T through 60° is given by (see Example 3.58)

    [T] = [cos 60° −sin 60°; sin 60° cos 60°] = [1/2 −√3/2; √3/2 1/2],

and reflection S in y = x is reflection in a line with direction vector d = [1; 1], which has the matrix (see Exercise 26(b))

    [S] = (1/2)[0 2; 2 0] = [0 1; 1 0].

Thus by Theorem 3.32, the composite transformation is given by

    [S ∘ T] = [S][T] = [0 1; 1 0][1/2 −√3/2; √3/2 1/2] = [√3/2 1/2; 1/2 −√3/2].
37. Reflection T in the y-axis negates the x-coordinate, so it has matrix

    [T] = [−1 0; 0 1],

while clockwise rotation S through 30° has matrix (see Example 3.58)

    [S] = [cos(−30°) −sin(−30°); sin(−30°) cos(−30°)] = [√3/2 1/2; −1/2 √3/2].

Thus by Theorem 3.32, the composite transformation is given by

    [S ∘ T] = [S][T] = [√3/2 1/2; −1/2 √3/2][−1 0; 0 1] = [−√3/2 1/2; 1/2 √3/2].
38. Clockwise rotation T through 45° has matrix (see Example 3.58)

    [T] = [cos 45° sin 45°; −sin 45° cos 45°] = [√2/2 √2/2; −√2/2 √2/2],

while projection S onto the y-axis has matrix (see Example 3.59, or note that this transformation simply sets the x-coordinate to zero)

    [S] = [0 0; 0 1].

Thus by Theorem 3.32, the composite transformation is given by

    [T ∘ S ∘ T] = [T][S][T] = [√2/2 √2/2; −√2/2 √2/2][0 0; 0 1][√2/2 √2/2; −√2/2 √2/2] = [−1/2 1/2; −1/2 1/2].

39. Reflection T in the line y = x, with direction vector d = [1; 1], is given by (see Exercise 26(b))

    [T] = (1/2)[0 2; 2 0] = [0 1; 1 0].

Clockwise rotation S through 30° has matrix (see Example 3.58)

    [S] = [cos 30° sin 30°; −sin 30° cos 30°] = [√3/2 1/2; −1/2 √3/2].

Finally, reflection R in the line y = −x, with direction vector d = [1; −1], is given by (see Exercise 26(b))

    [R] = (1/2)[0 −2; −2 0] = [0 −1; −1 0].

Then by Theorem 3.32, the composite transformation is given by

    [R ∘ S ∘ T] = [R][S][T] = [0 −1; −1 0][√3/2 1/2; −1/2 √3/2][0 1; 1 0] = [−√3/2 1/2; −1/2 −√3/2].
40. Using the formula for rotation through an angle θ from Example 3.58, we have

    [Rα ∘ Rβ] = [Rα][Rβ] = [cos α −sin α; sin α cos α][cos β −sin β; sin β cos β]
      = [cos α cos β − sin α sin β, −cos α sin β − sin α cos β; sin α cos β + cos α sin β, −sin α sin β + cos α cos β]
      = [cos(α + β) −sin(α + β); sin(α + β) cos(α + β)]
      = [Rα+β].

41. Assume line ` has direction vector [cos α; sin α] and line m has direction vector [cos β; sin β]. Let θ = β − α be the angle from ` to m (note that if α > β, then θ is negative). Then by Exercise 26(c),

    Fm ∘ F` = [cos 2β sin 2β; sin 2β −cos 2β][cos 2α sin 2α; sin 2α −cos 2α]
      = [cos 2β cos 2α + sin 2β sin 2α, cos 2β sin 2α − sin 2β cos 2α; sin 2β cos 2α − cos 2β sin 2α, sin 2β sin 2α + cos 2β cos 2α]
      = [cos(2(β − α)) −sin(2(β − α)); sin(2(β − α)) cos(2(β − α))]
      = [cos 2θ −sin 2θ; sin 2θ cos 2θ]
      = R2θ.
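Exercise 41's identity — two reflections compose to a rotation through twice the angle between the lines — can be checked numerically for particular angles. A minimal sketch (assuming NumPy is available; the angles are arbitrary illustrative choices):

```python
import numpy as np

def refl(phi):
    """Reflection in the line through the origin at angle phi (Exercise 26(c))."""
    return np.array([[np.cos(2 * phi), np.sin(2 * phi)],
                     [np.sin(2 * phi), -np.cos(2 * phi)]])

def rot(phi):
    """Counterclockwise rotation through angle phi (Example 3.58)."""
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi), np.cos(phi)]])

alpha, beta = np.pi / 7, np.pi / 3       # angles of lines l and m
theta = beta - alpha
print(np.allclose(refl(beta) @ refl(alpha), rot(2 * theta)))  # True
```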
42. (a) Let P be the projection onto a line with direction vector d = [d1; d2]. Then the standard matrix of P is

    P = (1/(d1² + d2²)) [d1² d1d2; d1d2 d2²],

and thus

    P ∘ P = P² = (1/(d1² + d2²))² [d1⁴ + d1²d2², d1³d2 + d1d2³; d1³d2 + d1d2³, d1²d2² + d2⁴]
      = (1/(d1² + d2²))² [d1²(d1² + d2²), d1d2(d1² + d2²); d1d2(d1² + d2²), d2²(d1² + d2²)]
      = (1/(d1² + d2²)) [d1² d1d2; d1d2 d2²]
      = P.

Note that we have proven this fact before, in a different form. Since P(v) = proj_d v, then

    (P ∘ P)(v) = proj_d(proj_d v) = proj_d v = P(v)

by Exercise 70(a) in Section 1.2.
(b) Let P be any projection matrix. Using the form of P from part (a), we first prove
that P 6= I. Suppose it were. Then d1 d2 = 0, so that either d1 = 0 or d2 = 0. But then either d21
or d22 would be zero, so that P could not be the identity matrix. So P is a non-identity matrix
such that P 2 = P . Then P 2 − P = P (P − I) = O. Since P 6= I, we see that P − I 6= O, so
there is some k such that (P − I)ek 6= 0 (otherwise, every row of P − I would be orthogonal
to every standard basis vector, so would be zero, so that P − I would be the zero matrix). Let
x = (P − I)ek . But then
P x = P (P − I)ek = Oek = 0,
so that x ∈ null(P ). Since the null space of P contains a nonzero vector, P is not invertible by
Theorem 3.27.
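Both facts about projection matrices — idempotence from part (a) and non-invertibility from part (b) — show up directly in a numerical check. A minimal sketch (assuming NumPy is available), with the direction d = (1, 2) used in earlier exercises:

```python
import numpy as np

d = np.array([1.0, 2.0])
P = np.outer(d, d) / (d @ d)      # projection onto span(d): [[1/5, 2/5], [2/5, 4/5]]

print(np.allclose(P @ P, P))              # P is idempotent: P^2 = P
print(abs(np.linalg.det(P)) < 1e-12)      # det(P) = 0, so P is not invertible
```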
43. Let `, m, and n be lines through the origin with angles from the x-axis of θ, α, and β respectively. Then, using Exercise 26(c) and the computation in Exercise 41,

    Fn ∘ Fm ∘ F` = [cos 2β sin 2β; sin 2β −cos 2β][cos 2α sin 2α; sin 2α −cos 2α][cos 2θ sin 2θ; sin 2θ −cos 2θ]
      = [cos 2(β − α) −sin 2(β − α); sin 2(β − α) cos 2(β − α)][cos 2θ sin 2θ; sin 2θ −cos 2θ]
      = [cos 2(θ − α + β) sin 2(θ − α + β); sin 2(θ − α + β) −cos 2(θ − α + β)].

This is a reflection through a line making an angle of θ − α + β with the x-axis.
44. If ` is a line with direction vector d through the point P, then it has equation x = p + td. Since T is linear, we have

    T(p + td) = T(p) + tT(d),

so the image of ` consists of the points x = T(p) + tT(d). If T(d) = 0, then the image of ` is the single point T(p). Otherwise, the image of ` is the line `′ with direction vector T(d) passing through T(p).
45. Suppose that we start with two parallel lines
` : x = p + td,
m : x = q + sd.
Then the images of these lines have equations
`0 : T (p) + tT (d),
m0 : T (q) + sT (d).
Then
• If T(d) ≠ 0 and T(p) ≠ T(q), then `′ and m′ are distinct parallel lines.
• If T(d) ≠ 0 and T(p) = T(q), then `′ and m′ are the same line.
• If T(d) = 0 and T(p) ≠ T(q), then `′ and m′ are the distinct points T(p) and T(q).
• If T(d) = 0 and T(p) = T(q), then `′ and m′ are the single point T(p).
46. Using the linear transformation T from Exercise 3, we have

    T([−1; 1]) = [0; −2],  T([1; 1]) = [2; 0],  T([1; −1]) = [0; 2],  T([−1; −1]) = [−2; 0].

The image is the square with vertices (0, −2), (2, 0), (0, 2), and (−2, 0).
47. Using the linear transformation D from Exercise 17, we have

    D([−1; 1]) = [−2; 3],  D([1; 1]) = [2; 3],  D([1; −1]) = [2; −3],  D([−1; −1]) = [−2; −3].

The image is the rectangle with vertices (±2, ±3).
48. Using the linear transformation P from Exercise 18, we have
−1
0
P
=
,
1
0
The image is the line below:
1
1
P
=
,
1
1
1
0
P
=
,
−1
0
−1
−1
P
=
.
−1
−1
3.6. INTRODUCTION TO LINEAR TRANSFORMATIONS
[Figure: the image is the segment of the line y = x from D(−1, −1) to B(1, 1); A and C both map to the origin.]
49. Using the projection P from Exercise 22, we have
$$P\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}3/5\\6/5\end{bmatrix},\quad P\begin{bmatrix}-1\\1\end{bmatrix} = \begin{bmatrix}1/5\\2/5\end{bmatrix},\quad P\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}-1/5\\-2/5\end{bmatrix},\quad P\begin{bmatrix}-1\\-1\end{bmatrix} = \begin{bmatrix}-3/5\\-6/5\end{bmatrix}.$$
The image is the line below:
[Figure: the image is a segment of the line through the origin with direction (1, 2), running from D(−3/5, −6/5) up through C, A, and B(3/5, 6/5).]
50. Using the transformation T from Exercise 31, with matrix $\begin{bmatrix} 1 & 2 \\ -3 & 1 \end{bmatrix}$, we have
$$T\begin{bmatrix}-1\\1\end{bmatrix} = \begin{bmatrix}1\\4\end{bmatrix},\quad T\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}3\\-2\end{bmatrix},\quad T\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}-1\\-4\end{bmatrix},\quad T\begin{bmatrix}-1\\-1\end{bmatrix} = \begin{bmatrix}-3\\2\end{bmatrix}.$$
The image is the parallelogram with vertices A(1, 4), B(3, −2), C(−1, −4), and D(−3, 2).
[Figure: the image parallelogram, with vertices labeled A, B, C, D.]
51. Using the transformation resulting from Exercise 37, which we call Q, we have
$$Q\begin{bmatrix}-1\\1\end{bmatrix} = \begin{bmatrix}\frac{\sqrt{3}+1}{2}\\ \frac{\sqrt{3}-1}{2}\end{bmatrix},\quad Q\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}\frac{1-\sqrt{3}}{2}\\ \frac{1+\sqrt{3}}{2}\end{bmatrix},\quad Q\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}-\frac{\sqrt{3}+1}{2}\\ \frac{1-\sqrt{3}}{2}\end{bmatrix},\quad Q\begin{bmatrix}-1\\-1\end{bmatrix} = \begin{bmatrix}\frac{\sqrt{3}-1}{2}\\ -\frac{1+\sqrt{3}}{2}\end{bmatrix}.$$
The image is the parallelogram with vertices at these four points.
[Figure: the image parallelogram, with vertices labeled A, B, C, D.]
52. If ℓ has direction vector d, then using standard properties of the dot product and of scalar multiplication, we get
$$P_\ell(c\mathbf{v}) = \operatorname{proj}_{\mathbf{d}}(c\mathbf{v}) = \frac{\mathbf{d}\cdot c\mathbf{v}}{\mathbf{d}\cdot\mathbf{d}}\,\mathbf{d} = \frac{c(\mathbf{d}\cdot\mathbf{v})}{\mathbf{d}\cdot\mathbf{d}}\,\mathbf{d} = c\,\frac{\mathbf{d}\cdot\mathbf{v}}{\mathbf{d}\cdot\mathbf{d}}\,\mathbf{d} = cP_\ell(\mathbf{v}).$$
53. First suppose that T is linear. Then applying property 1 first, followed by property 2, we have
T (c1 v1 + c2 v2 ) = T (c1 v1 ) + T (c2 v2 ) = c1 T (v1 ) + c2 T (v2 ).
For the reverse direction, assume that T (c1 v1 + c2 v2 ) = c1 T (v1 ) + c2 T (v2 ). To prove property 1, set
c1 = c2 = 1; this gives
T (v1 + v2 ) = T (v1 ) + T (v2 ).
For property 2, set c1 = c, v1 = v, and c2 = 0; this gives
T (cv) = T (cv + 0v2 ) = cT (v) + 0T (v2 ) = cT (v).
Thus T is linear.
54. Let range(T) be the range of T and col([T]) be the column space of [T]. We show that range(T) = col([T]) by showing that they are both equal to span({T(eᵢ)}).
To see that range(T) = span({T(eᵢ)}), note that x ∈ range(T) if and only if x = T(v) for some v = Σᵢ cᵢeᵢ, so that
$$\mathbf{x} = T(\mathbf{v}) = T\Big(\sum_i c_i\mathbf{e}_i\Big) = \sum_i c_iT(\mathbf{e}_i),$$
which is the case if and only if x ∈ span({T(eᵢ)}). Thus range(T) = span({T(eᵢ)}).
Next, we show that col([T]) = span({T(eᵢ)}). By Theorem 2, [T] = [T(e₁) T(e₂) ··· T(eₙ)]. Thus if vᵀ = [v₁ v₂ ··· vₙ], then
$$[T]\mathbf{v} = v_1T(\mathbf{e}_1) + v_2T(\mathbf{e}_2) + \cdots + v_nT(\mathbf{e}_n).$$
Then x ∈ col([T]) if and only if x = [T]v for some v, which holds if and only if x = Σᵢ vᵢT(eᵢ) for some scalars vᵢ, so that col([T]) = span({T(eᵢ)}).
55. The Fundamental Theorem of Invertible Matrices implies that any invertible 2 × 2 matrix is a product
of elementary matrices. The matrices in Exercise 19 represent the three types of elementary 2 × 2
matrices, so TA must be a composition of the three types of transformations represented in Exercise
19.
3.7 Applications
1. $x_1 = Px_0 = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}\begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} = \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}$, $\quad x_2 = Px_1 = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}\begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix} = \begin{bmatrix} 0.38 \\ 0.62 \end{bmatrix}$.
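These two iterations are easy to check numerically. A minimal sketch (not part of the original solution), assuming NumPy is available:

```python
import numpy as np

# Column-stochastic transition matrix and initial state vector from Exercise 1
P = np.array([[0.5, 0.3],
              [0.5, 0.7]])
x0 = np.array([0.5, 0.5])

x1 = P @ x0   # state after one step
x2 = P @ x1   # state after two steps
```

Each product keeps the entries summing to 1, as it must for a probability vector.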
2. Since there are two steps involved, the proportion we seek will be [P²]₂₁ — the probability of a state 1 population member being in state 2 after two steps:
$$P^2 = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}^2 = \begin{bmatrix} 0.40 & 0.36 \\ 0.60 & 0.64 \end{bmatrix}.$$
The proportion of the state 1 population in state 2 after two steps is 0.60.
3. Since there are two steps involved, the proportion we seek will be [P²]₂₂ — the probability of a state 2 population member remaining in state 2 after two steps:
$$P^2 = \begin{bmatrix} 0.5 & 0.3 \\ 0.5 & 0.7 \end{bmatrix}^2 = \begin{bmatrix} 0.40 & 0.36 \\ 0.60 & 0.64 \end{bmatrix}.$$
The proportion of the state 2 population in state 2 after two steps is 0.64.
4. The steady-state vector $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ has the property that Px = x, or (I − P)x = 0. So we form the augmented matrix of that system and row-reduce it:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 0.5 & -0.3 & 0 \\ -0.5 & 0.3 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & -\frac{3}{5} & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus x₂ = t is free and x₁ = (3/5)t. Since we also want x to be a probability vector, we also have x₁ + x₂ = (3/5)t + t = (8/5)t = 1, so that t = 5/8. Thus the steady-state vector is
$$\mathbf{x} = \begin{bmatrix} 3/8 \\ 5/8 \end{bmatrix}.$$
5. $x_1 = Px_0 = \begin{bmatrix} 1/2 & 1/3 & 1/3 \\ 0 & 1/3 & 2/3 \\ 1/2 & 1/3 & 0 \end{bmatrix}\begin{bmatrix} 120 \\ 180 \\ 90 \end{bmatrix} = \begin{bmatrix} 150 \\ 120 \\ 120 \end{bmatrix}$, $\quad x_2 = Px_1 = \begin{bmatrix} 1/2 & 1/3 & 1/3 \\ 0 & 1/3 & 2/3 \\ 1/2 & 1/3 & 0 \end{bmatrix}\begin{bmatrix} 150 \\ 120 \\ 120 \end{bmatrix} = \begin{bmatrix} 155 \\ 120 \\ 115 \end{bmatrix}$.
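The steady-state computation in Exercise 4 can also be done as an eigenvector problem, since Px = x says x is an eigenvector of P for eigenvalue 1. A sketch (an addition, not from the text), assuming NumPy:

```python
import numpy as np

# Transition matrix of Exercise 4
P = np.array([[0.5, 0.3],
              [0.5, 0.7]])

# Eigenvector for the eigenvalue closest to 1, scaled to a probability vector
w, V = np.linalg.eig(P)
v = V[:, np.argmin(np.abs(w - 1))].real
x = v / v.sum()
```

Normalizing the eigenvector so its entries sum to 1 reproduces the row-reduction answer [3/8, 5/8].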
6. Since there are two steps involved, the proportion we seek will be [P²]₁₁ — the probability of a state 1 population member being in state 1 after two steps:
$$P^2 = \begin{bmatrix} 1/2 & 1/3 & 1/3 \\ 0 & 1/3 & 2/3 \\ 1/2 & 1/3 & 0 \end{bmatrix}^2 = \begin{bmatrix} 5/12 & 7/18 & 7/18 \\ 1/3 & 1/3 & 2/9 \\ 1/4 & 5/18 & 7/18 \end{bmatrix}.$$
The proportion of the state 1 population in state 1 after two steps is 5/12.
7. Since there are two steps involved, the proportion we seek will be [P²]₃₂ — the probability of a state 2 population member being in state 3 after two steps:
$$P^2 = \begin{bmatrix} 1/2 & 1/3 & 1/3 \\ 0 & 1/3 & 2/3 \\ 1/2 & 1/3 & 0 \end{bmatrix}^2 = \begin{bmatrix} 5/12 & 7/18 & 7/18 \\ 1/3 & 1/3 & 2/9 \\ 1/4 & 5/18 & 7/18 \end{bmatrix}.$$
The proportion of the state 2 population in state 3 after two steps is 5/18.
 
8. The steady-state vector $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ has the property that Px = x, or (I − P)x = 0. So we form the augmented matrix of that system and row-reduce it:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 1/2 & -1/3 & -1/3 & 0 \\ 0 & 2/3 & -2/3 & 0 \\ -1/2 & -1/3 & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -4/3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Thus x₃ = t is free and x₁ = (4/3)t, x₂ = t. Since we want x to be a probability vector, we also have
$$x_1 + x_2 + x_3 = \tfrac{4}{3}t + t + t = \tfrac{10}{3}t = 1, \quad\text{so that } t = \tfrac{3}{10}.$$
Thus the steady-state vector is
$$\mathbf{x} = \begin{bmatrix} 2/5 \\ 3/10 \\ 3/10 \end{bmatrix}.$$
9. Let the first column and row represent dry days and the second column and row represent wet days.
(a) Given the data in the exercise, the transition matrix is
$$P = \begin{bmatrix} 0.750 & 0.338 \\ 0.250 & 0.662 \end{bmatrix}, \quad\text{with } \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},$$
where x₁ and x₂ denote the probabilities of dry and wet days respectively.
(b) Since Monday is dry, we have $x_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Since Wednesday is two days later, the probability vector for Wednesday is given by
$$\mathbf{x}_2 = P(P\mathbf{x}_0) = \begin{bmatrix} 0.750 & 0.338 \\ 0.250 & 0.662 \end{bmatrix}\begin{bmatrix} 0.750 & 0.338 \\ 0.250 & 0.662 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0.647 \\ 0.353 \end{bmatrix}.$$
Thus the probability that Wednesday is wet is 0.353.
(c) We wish to find the steady state probability vector; that is, over the long run, the probability that a given day will be dry or wet. Thus we need to solve Px = x, or (I − P)x = 0. So form the augmented matrix and row-reduce:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 0.250 & -0.338 & 0 \\ -0.250 & 0.338 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & -1.352 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Let x₂ = t, so that x₁ = 1.352t; then we also want x₁ + x₂ = 1, so that
$$x_1 + x_2 = 2.352t = 1, \quad\text{so that } t = \frac{1}{2.352} \approx 0.425.$$
Thus the steady state probability vector is
$$\mathbf{x} = \begin{bmatrix} 1.352 \cdot 0.425 \\ 0.425 \end{bmatrix} \approx \begin{bmatrix} 0.575 \\ 0.425 \end{bmatrix}.$$
So in the long run, about 57.5% of the days will be dry and 42.5% will be wet.
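Both parts of this weather computation can be verified in a few lines. A sketch (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

# Dry/wet transition matrix from Exercise 9
P = np.array([[0.750, 0.338],
              [0.250, 0.662]])
x0 = np.array([1.0, 0.0])   # Monday is dry

x2 = P @ (P @ x0)           # Wednesday's distribution

# Long-run distribution: eigenvector for eigenvalue 1, normalized
w, V = np.linalg.eig(P)
v = V[:, np.argmin(np.abs(w - 1))].real
steady = v / v.sum()
```

The second entry of `x2` is the probability that Wednesday is wet, and `steady` matches the 57.5%/42.5% split.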
10. Let the three rows (and columns) correspond to, in order, tall, medium, or short.
(a) From the given data, the transition matrix is
$$P = \begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}.$$
(b) Starting with a short person, we have $x_0 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$. That person's grandchild is two generations later, so that
$$\mathbf{x}_2 = P(P\mathbf{x}_0) = \begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}\begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.24 \\ 0.48 \\ 0.28 \end{bmatrix}.$$
So the probability that a grandchild will be tall is 0.24.
(c) The current distribution of heights is $x_0 = \begin{bmatrix} 0.2 \\ 0.5 \\ 0.3 \end{bmatrix}$, and we want to find x₃:
$$\mathbf{x}_3 = P^3\mathbf{x}_0 = \begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}^3\begin{bmatrix} 0.2 \\ 0.5 \\ 0.3 \end{bmatrix} = \begin{bmatrix} 0.2457 \\ 0.5039 \\ 0.2504 \end{bmatrix}.$$
So roughly 24.6% of the population is tall, 50.4% medium, and 25.0% short.
(d) We want to solve (I − P)x = 0, so form the augmented matrix and row-reduce:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 0.4 & -0.1 & -0.2 & 0 \\ -0.2 & 0.3 & -0.4 & 0 \\ -0.2 & -0.2 & 0.6 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
So x₃ = t is free, and x₁ = t, x₂ = 2t. Since
$$x_1 + x_2 + x_3 = t + 2t + t = 4t = 1, \text{ we get } t = \tfrac{1}{4},$$
so that the steady state vector is
$$\mathbf{x} = \begin{bmatrix} 1/4 \\ 1/2 \\ 1/4 \end{bmatrix}.$$
So in the long run, half the population is medium and one quarter each are tall and short.
11. Let the rows (and columns) represent good, fair, and poor crops, in order.
(a) From the data given, the transition matrix is
$$P = \begin{bmatrix} 0.08 & 0.09 & 0.11 \\ 0.07 & 0.11 & 0.05 \\ 0.85 & 0.80 & 0.84 \end{bmatrix}.$$
(b) We want to find the probability that starting from a good crop in 1940 we end up with a good crop in each of the five succeeding years, so we want (Pⁱ)₁₁ for i = 1, 2, 3, 4, 5. Computing Pⁱ and looking at the upper left entries, we get
1941: P₁₁ = 0.0800; 1942: (P²)₁₁ = 0.1062; 1943: (P³)₁₁ ≈ 0.1057; 1944: (P⁴)₁₁ ≈ 0.1057; 1945: (P⁵)₁₁ ≈ 0.1057.
(c) To find the long-run proportion, we want to solve Px = x. So row-reduce [I − P | 0]:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 0.92 & -0.09 & -0.11 & 0 \\ -0.07 & 0.89 & -0.05 & 0 \\ -0.85 & -0.80 & 0.16 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -0.126 & 0 \\ 0 & 1 & -0.066 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
So x₃ = t is free, and x₁ = 0.126t, x₂ = 0.066t. Since x₁ + x₂ + x₃ = 1, we have
$$x_1 + x_2 + x_3 = 0.126t + 0.066t + t = 1.192t = 1, \quad\text{so that } t = \frac{1}{1.192} \approx 0.839.$$
So the steady state vector is
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \approx \begin{bmatrix} 0.126 \cdot 0.839 \\ 0.066 \cdot 0.839 \\ 0.839 \end{bmatrix} \approx \begin{bmatrix} 0.106 \\ 0.055 \\ 0.839 \end{bmatrix}.$$
In the long run, about 10.6% of the crops will be good, 5.5% will be fair, and 83.9% will be poor.
12. (a) Looking at the connectivity of the graph, we get for the transition matrix
$$P = \begin{bmatrix} 0 & 1/3 & 1/4 & 0 \\ 1/2 & 0 & 1/4 & 1/3 \\ 1/2 & 1/3 & 0 & 2/3 \\ 0 & 1/3 & 1/2 & 0 \end{bmatrix}.$$
Thus for example from state 2, there is a 1/3 probability of getting to state 3, since there are three paths out of state 2, one of which leads to state 3. Thus P₃₂ = 1/3.
(b) First find the steady-state probability distribution by solving Px = x:
$$[\,I-P \mid 0\,] = \begin{bmatrix} 1 & -1/3 & -1/4 & 0 & 0 \\ -1/2 & 1 & -1/4 & -1/3 & 0 \\ -1/2 & -1/3 & 1 & -2/3 & 0 \\ 0 & -1/3 & -1/2 & 1 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & 0 & -2/3 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & -4/3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
So x₄ is free; setting x₄ = 3t to avoid fractions gives x₁ = 2t, x₂ = 3t, x₃ = 4t, and x₄ = 3t. Since this is a probability vector, we also have
$$x_1 + x_2 + x_3 + x_4 = 2t + 3t + 4t + 3t = 12t = 1, \quad\text{so that } t = \tfrac{1}{12}.$$
So the long run probability vector is
$$\mathbf{x} = \begin{bmatrix} 1/6 \\ 1/4 \\ 1/3 \\ 1/4 \end{bmatrix}.$$
Since we initially started with 60 total robots, the steady-state distribution will be
$$60\mathbf{x} = \begin{bmatrix} 10 \\ 15 \\ 20 \\ 15 \end{bmatrix}.$$
13. First suppose that P = [p₁ ··· pₙ] is a stochastic matrix (so that the entries of each pᵢ sum to 1). Let j be a row vector consisting entirely of 1's. Then
$$\mathbf{j}P = \begin{bmatrix} \mathbf{j}\cdot\mathbf{p}_1 & \cdots & \mathbf{j}\cdot\mathbf{p}_n \end{bmatrix} = \begin{bmatrix} \sum_i(\mathbf{p}_1)_i & \cdots & \sum_i(\mathbf{p}_n)_i \end{bmatrix} = \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} = \mathbf{j}.$$
Conversely, suppose that jP = j. Then j · pᵢ = 1 for each i, so that Σⱼ(pᵢ)ⱼ = 1 for all i, showing that the column sums of P are 1 so that P is stochastic.
14. Let $P = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \end{bmatrix}$ and $Q = \begin{bmatrix} w_1 & z_1 \\ w_2 & z_2 \end{bmatrix}$ be stochastic matrices. Then
$$PQ = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \end{bmatrix}\begin{bmatrix} w_1 & z_1 \\ w_2 & z_2 \end{bmatrix} = \begin{bmatrix} x_1w_1 + y_1w_2 & x_1z_1 + y_1z_2 \\ x_2w_1 + y_2w_2 & x_2z_1 + y_2z_2 \end{bmatrix}.$$
Now, both P and Q are stochastic, so that x1 + x2 = y1 + y2 = z1 + z2 = w1 + w2 = 1. Then summing
each column of P Q gives
x1 w1 + y1 w2 + x2 w1 + y2 w2 = (x1 + x2 )w1 + (y1 + y2 )w2 = w1 + w2 = 1
x1 z1 + y1 z2 + x2 z1 + y2 z2 = (x1 + x2 )z1 + (y1 + y2 )z2 = z1 + z2 = 1.
Thus each column of P Q sums to 1 so that P Q is also stochastic.
15. The transition matrix from Exercise 9 is
$$P = \begin{bmatrix} 0.750 & 0.338 \\ 0.250 & 0.662 \end{bmatrix}.$$
To find the expected number of days until a wet day, we delete the second row and column of P (corresponding to wet days), giving Q = [0.750]. Then the expected number of days until a wet day is
$$(I - Q)^{-1} = (1 - 0.750)^{-1} = 4.$$
We expect four days until the next wet day.
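This "delete the absorbing state, invert I − Q, sum the column" recipe is mechanical enough to script. A sketch (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

def expected_steps(P, absorb):
    """Expected number of steps to first reach state `absorb`,
    starting from each of the remaining states of chain P."""
    keep = [i for i in range(P.shape[0]) if i != absorb]
    Q = P[np.ix_(keep, keep)]                # transitions among non-target states
    N = np.linalg.inv(np.eye(len(keep)) - Q) # fundamental matrix
    return N.sum(axis=0)                     # column sums = expected step counts

P = np.array([[0.750, 0.338],
              [0.250, 0.662]])
days_until_wet = expected_steps(P, absorb=1)[0]
```

For the dry/wet chain this gives the four days computed above.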
16. The transition matrix from Exercise 10 is
$$P = \begin{bmatrix} 0.6 & 0.1 & 0.2 \\ 0.2 & 0.7 & 0.4 \\ 0.2 & 0.2 & 0.4 \end{bmatrix}.$$
We delete the row and column of this matrix corresponding to "tall", which is the first row and column, giving
$$Q = \begin{bmatrix} 0.7 & 0.4 \\ 0.2 & 0.4 \end{bmatrix}.$$
Then
$$(I - Q)^{-1} = \begin{bmatrix} 0.3 & -0.4 \\ -0.2 & 0.6 \end{bmatrix}^{-1} = \begin{bmatrix} 6.0 & 4.0 \\ 2.0 & 3.0 \end{bmatrix}.$$
The column of this matrix corresponding to "short" is the second column, since this was the third column in the original matrix. Summing the values in the second column shows that the expected number of generations before a short person has a tall descendant is 7.
17. The transition matrix from Exercise 11 is
$$P = \begin{bmatrix} 0.08 & 0.09 & 0.11 \\ 0.07 & 0.11 & 0.05 \\ 0.85 & 0.80 & 0.84 \end{bmatrix}.$$
Since we want to know how long we expect it to be before a good crop, we remove the first row and column, leaving
$$Q = \begin{bmatrix} 0.11 & 0.05 \\ 0.80 & 0.84 \end{bmatrix}.$$
Then
$$(I - Q)^{-1} = \begin{bmatrix} 0.89 & -0.05 \\ -0.80 & 0.16 \end{bmatrix}^{-1} = \begin{bmatrix} 1.562 & 0.488 \\ 7.812 & 8.691 \end{bmatrix}.$$
Since we are starting from a fair crop, we sum the entries in the first column (which was the "fair" column in the original matrix), giving 1.562 + 7.812 ≈ 9.4, or 9 to 10 years expected until the next good crop.
18. The transition matrix from Exercise 12 is
$$P = \begin{bmatrix} 0 & 1/3 & 1/4 & 0 \\ 1/2 & 0 & 1/4 & 1/3 \\ 1/2 & 1/3 & 0 & 2/3 \\ 0 & 1/3 & 1/2 & 0 \end{bmatrix}.$$
The expected number of moves from any junction other than 4 to reach junction 4 is found by removing the fourth row and column of P, giving
$$Q = \begin{bmatrix} 0 & 1/3 & 1/4 \\ 1/2 & 0 & 1/4 \\ 1/2 & 1/3 & 0 \end{bmatrix}.$$
Then
$$(I - Q)^{-1} = \begin{bmatrix} 1 & -1/3 & -1/4 \\ -1/2 & 1 & -1/4 \\ -1/2 & -1/3 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 22/13 & 10/13 & 8/13 \\ 15/13 & 21/13 & 9/13 \\ 16/13 & 12/13 & 20/13 \end{bmatrix},$$
so that a robot is expected to reach junction 4 starting from junction 1 in 22/13 + 15/13 + 16/13 = 53/13 ≈ 4.08 moves, from junction 2 in 10/13 + 21/13 + 12/13 = 43/13 ≈ 3.31 moves, and from junction 3 in 8/13 + 9/13 + 20/13 = 37/13 ≈ 2.85 moves.
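The same fundamental-matrix calculation for the robot maze, as a quick numerical check (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

# Transitions among junctions 1-3 after deleting the target junction 4
Q = np.array([[0.0, 1/3, 1/4],
              [1/2, 0.0, 1/4],
              [1/2, 1/3, 0.0]])

N = np.linalg.inv(np.eye(3) - Q)   # fundamental matrix (I - Q)^(-1)
expected_moves = N.sum(axis=0)     # column j = expected moves from junction j+1
```

The three column sums agree with 53/13, 43/13, and 37/13.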
19. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -1/2 & 1/4 & 0 \\ 1/2 & -1/4 & 0 \end{bmatrix} \xrightarrow[R_2 + R_1]{-2R_1} \begin{bmatrix} 1 & -1/2 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus the solutions are of the form x₁ = (1/2)t, x₂ = t. Set t = 2 to avoid fractions, giving a price vector $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$. Of course, other values of t lead to equally valid price vectors.
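A price vector is just a vector in the null space of E − I, so it can be found numerically too. A sketch (an addition; the entries of E are reconstructed from the E − I shown in the solution, so treat them as an assumption), using SVD for the null space:

```python
import numpy as np

# Exchange matrix with E - I = [[-1/2, 1/4], [1/2, -1/4]]
E = np.array([[0.5, 0.25],
              [0.5, 0.75]])

# Null space of E - I: the right-singular vector for the zero singular value
_, _, Vt = np.linalg.svd(E - np.eye(2))
p = Vt[-1]
p = p / p[0]   # scale so the first price is 1
```

Scaling the null vector reproduces the price vector (1, 2) up to the usual free multiple.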
20. Since neither column sum is 1, this is not an exchange matrix.
21. While the first column sums to 1, the second does not, so this is not an exchange matrix.
22. Since the sums of the columns of the matrix E are both 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -0.9 & 0.6 & 0 \\ 0.9 & -0.6 & 0 \end{bmatrix} \xrightarrow[R_2 + R_1]{-\frac{10}{9}R_1} \begin{bmatrix} 1 & -2/3 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus the solutions are of the form x₁ = (2/3)t, x₂ = t. Set t = 3 to avoid fractions, giving a price vector $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$. Of course, other values of t lead to equally valid price vectors.
23. Since the matrix contains a negative entry, it is not an exchange matrix.
24. Since the sums of the columns of the matrix E are all 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -1/2 & 1 & 0 & 0 \\ 0 & -1 & 1/3 & 0 \\ 1/2 & 0 & -1/3 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -2/3 & 0 \\ 0 & 1 & -1/3 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Thus the solutions are of the form x₁ = (2/3)t, x₂ = (1/3)t, x₃ = t. Set t = 3 to avoid fractions, giving a price vector $\begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$. Of course, other values of t lead to equally valid price vectors.
25. Since the sums of the columns of the matrix E are all 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -0.7 & 0 & 0.2 & 0 \\ 0.3 & -0.5 & 0.3 & 0 \\ 0.4 & 0.5 & -0.5 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -2/7 & 0 \\ 0 & 1 & -27/35 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Thus the solutions are of the form x₁ = (2/7)t, x₂ = (27/35)t, x₃ = t. Set t = 35 to avoid fractions, giving a price vector $\begin{bmatrix} 10 \\ 27 \\ 35 \end{bmatrix}$. Of course, other values of t lead to equally valid price vectors.
26. Since the sums of the columns of the matrix E are all 1, and all numbers are nonnegative, this matrix is an exchange matrix. To find a solution to Ex = x, or equivalently (E − I)x = 0, we row-reduce the augmented matrix of that system:
$$[\,E-I \mid 0\,] = \begin{bmatrix} -0.50 & 0.70 & 0.35 & 0 \\ 0.25 & -0.70 & 0.25 & 0 \\ 0.25 & 0 & -0.60 & 0 \end{bmatrix} \longrightarrow \begin{bmatrix} 1 & 0 & -2.40 & 0 \\ 0 & 1 & -1.214 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
So one possible solution is $\begin{bmatrix} 2.4 \\ 1.214 \\ 1 \end{bmatrix}$. Multiplying through by 70 will roughly clear fractions, giving $\approx \begin{bmatrix} 168 \\ 85 \\ 70 \end{bmatrix}$ if an integral solution is desired.
27. This matrix is a consumption matrix since all of its entries are nonnegative. By Corollary 3.36, it is
productive since the sum of the entries in each column is strictly less than 1. (Note that Corollary 3.35
does not apply since the sum of the entries in the second row exceeds 1).
28. This matrix is a consumption matrix since all of its entries are nonnegative. By Corollary 3.35, it is
productive since the sum of the entries in each row is strictly less than 1. (Note that Corollary 3.36
does not apply since the sum of the entries in the third column exceeds 1).
29. This matrix is a consumption matrix since all of its entries are nonnegative. Since the entries in the second row as well as those in the second column sum to a number greater than 1, neither Corollary 3.35 nor 3.36 applies. So to determine if C is productive, we must use the definition, which involves seeing if I − C is invertible and if its inverse has all positive entries.
$$I - C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.35 & 0.25 & 0 \\ 0.15 & 0.55 & 0.35 \\ 0.45 & 0.30 & 0.60 \end{bmatrix} = \begin{bmatrix} 0.65 & -0.25 & 0 \\ -0.15 & 0.45 & -0.35 \\ -0.45 & -0.30 & 0.40 \end{bmatrix}.$$
Using technology,
$$(I - C)^{-1} \approx \begin{bmatrix} -13.33 & -17.78 & -15.56 \\ -38.67 & -46.22 & -40.44 \\ -44.00 & -54.67 & -45.33 \end{bmatrix}.$$
Since this is not a positive matrix, it follows that C is not productive.
30. This matrix is a consumption matrix since all of its entries are nonnegative. Since the entries in the first row, as well as those in each column, sum to a number at least equal to 1, neither Corollary 3.35 nor 3.36 applies. So to determine if C is productive, we must use the definition, which involves seeing if I − C is invertible and if its inverse has all positive entries.
$$I - C = I_4 - \begin{bmatrix} 0.2 & 0.4 & 0.1 & 0.4 \\ 0.3 & 0.2 & 0.2 & 0.1 \\ 0 & 0.4 & 0.5 & 0.3 \\ 0.5 & 0 & 0.2 & 0.2 \end{bmatrix} = \begin{bmatrix} 0.8 & -0.4 & -0.1 & -0.4 \\ -0.3 & 0.8 & -0.2 & -0.1 \\ 0 & -0.4 & 0.5 & -0.3 \\ -0.5 & 0 & -0.2 & 0.8 \end{bmatrix}.$$
Using technology reveals that I − C is not invertible, so that C is not productive.
31. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)⁻¹d. Here
$$I - C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 1/2 & 1/4 \\ 1/2 & 1/2 \end{bmatrix} = \begin{bmatrix} 1/2 & -1/4 \\ -1/2 & 1/2 \end{bmatrix}.$$
Then det(I − C) = (1/2)(1/2) − (−1/4)(−1/2) = 1/8, so that
$$(I - C)^{-1} = \frac{1}{1/8}\begin{bmatrix} 1/2 & 1/4 \\ 1/2 & 1/2 \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ 4 & 4 \end{bmatrix}.$$
Thus a feasible production vector is
$$\mathbf{x} = (I - C)^{-1}\mathbf{d} = \begin{bmatrix} 4 & 2 \\ 4 & 4 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 10 \\ 16 \end{bmatrix}.$$
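In practice there is no need to invert: (I − C)x = d is a linear system. A sketch of the same Leontief calculation (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

# Consumption matrix and demand vector from Exercise 31
C = np.array([[0.5, 0.25],
              [0.5, 0.5]])
d = np.array([1.0, 3.0])

# Solve (I - C) x = d directly instead of forming the inverse
x = np.linalg.solve(np.eye(2) - C, d)
```

`solve` is both faster and numerically safer than computing `inv(I - C) @ d`, and it returns the production vector (10, 16) found above.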
32. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)⁻¹d. Here
$$I - C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.1 & 0.4 \\ 0.3 & 0.2 \end{bmatrix} = \begin{bmatrix} 0.9 & -0.4 \\ -0.3 & 0.8 \end{bmatrix}.$$
Then det(I − C) = 0.9 · 0.8 − (−0.4)(−0.3) = 0.60, so that
$$(I - C)^{-1} = \frac{1}{0.60}\begin{bmatrix} 0.8 & 0.4 \\ 0.3 & 0.9 \end{bmatrix} = \begin{bmatrix} 4/3 & 2/3 \\ 1/2 & 3/2 \end{bmatrix}.$$
Thus a feasible production vector is
$$\mathbf{x} = (I - C)^{-1}\mathbf{d} = \begin{bmatrix} 4/3 & 2/3 \\ 1/2 & 3/2 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 10/3 \\ 5/2 \end{bmatrix}.$$
33. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)⁻¹d. Here
$$I - C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.5 & 0.2 & 0.1 \\ 0 & 0.4 & 0.2 \\ 0 & 0 & 0.5 \end{bmatrix} = \begin{bmatrix} 0.5 & -0.2 & -0.1 \\ 0 & 0.6 & -0.2 \\ 0 & 0 & 0.5 \end{bmatrix}.$$
Then row-reduce [I − C | I] to compute (I − C)⁻¹:
$$\left[\,I-C \mid I\,\right] \longrightarrow \left[\,I \;\middle|\; \begin{matrix} 2 & 2/3 & 2/3 \\ 0 & 5/3 & 2/3 \\ 0 & 0 & 2 \end{matrix}\,\right].$$
Thus a feasible production vector is
$$\mathbf{x} = (I - C)^{-1}\mathbf{d} = \begin{bmatrix} 2 & 2/3 & 2/3 \\ 0 & 5/3 & 2/3 \\ 0 & 0 & 2 \end{bmatrix}\begin{bmatrix} 3 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 10 \\ 6 \\ 8 \end{bmatrix}.$$
34. To find a solution x to (I − C)x = d, we invert I − C and compute x = (I − C)⁻¹d. Here
$$I - C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 0.1 & 0.4 & 0.1 \\ 0 & 0.2 & 0.2 \\ 0.3 & 0.2 & 0.3 \end{bmatrix} = \begin{bmatrix} 0.9 & -0.4 & -0.1 \\ 0 & 0.8 & -0.2 \\ -0.3 & -0.2 & 0.7 \end{bmatrix}.$$
Compute (I − C)⁻¹ using technology to get
$$(I - C)^{-1} \approx \begin{bmatrix} 1.238 & 0.714 & 0.381 \\ 0.143 & 1.429 & 0.429 \\ 0.571 & 0.714 & 1.714 \end{bmatrix}.$$
Thus a feasible production vector is
$$\mathbf{x} = (I - C)^{-1}\mathbf{d} \approx \begin{bmatrix} 1.238 & 0.714 & 0.381 \\ 0.143 & 1.429 & 0.429 \\ 0.571 & 0.714 & 1.714 \end{bmatrix}\begin{bmatrix} 1.1 \\ 3.5 \\ 2.0 \end{bmatrix} \approx \begin{bmatrix} 4.624 \\ 6.014 \\ 6.557 \end{bmatrix}.$$
35. Since x ≥ 0 and A ≥ O, then Ax has all nonnegative entries as well, so that Ax ≥ Ox = 0. But then
x > Ax ≥ Ox = 0,
so that x > 0.
36. (a) Since all entries of all matrices are positive, it is clear that AC ≥ O and BD ≥ O. To see that AC ≥ BD, let A = (aᵢⱼ), B = (bᵢⱼ), C = (cᵢⱼ), and D = (dᵢⱼ). Then each aᵢⱼ ≥ bᵢⱼ and each cᵢⱼ ≥ dᵢⱼ. So
$$(AC)_{ij} = \sum_{k=1}^n a_{ik}c_{kj} \ge \sum_{k=1}^n b_{ik}c_{kj} \ge \sum_{k=1}^n b_{ik}d_{kj} = (BD)_{ij}.$$
So every entry of AC is greater than or equal to the corresponding entry of BD, and thus AC ≥ BD.
(b) Let xᵀ = [x₁ x₂ ··· xₙ]. Then for any i and j, we know that aᵢⱼxⱼ ≥ bᵢⱼxⱼ since aᵢⱼ > bᵢⱼ and xⱼ ≥ 0. Since x ≠ 0, we know that some component of x, say xₖ, is strictly positive. Then aᵢₖxₖ > bᵢₖxₖ since xₖ > 0. But then
$$(A\mathbf{x})_i = a_{i1}x_1 + \cdots + a_{ik}x_k + \cdots + a_{in}x_n > b_{i1}x_1 + \cdots + b_{ik}x_k + \cdots + b_{in}x_n = (B\mathbf{x})_i.$$
37. Multiplying, we get
$$\mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 2 & 5 \\ 0.6 & 0 \end{bmatrix}\begin{bmatrix} 10 \\ 5 \end{bmatrix} = \begin{bmatrix} 45 \\ 6 \end{bmatrix}, \quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 2 & 5 \\ 0.6 & 0 \end{bmatrix}\begin{bmatrix} 45 \\ 6 \end{bmatrix} = \begin{bmatrix} 120 \\ 27 \end{bmatrix}, \quad \mathbf{x}_3 = L\mathbf{x}_2 = \begin{bmatrix} 2 & 5 \\ 0.6 & 0 \end{bmatrix}\begin{bmatrix} 120 \\ 27 \end{bmatrix} = \begin{bmatrix} 375 \\ 72 \end{bmatrix}.$$
38. Multiplying, we get
$$\mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 0 & 1 & 2 \\ 0.2 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 10 \\ 4 \\ 3 \end{bmatrix} = \begin{bmatrix} 10 \\ 2 \\ 2 \end{bmatrix}, \quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 0 & 1 & 2 \\ 0.2 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 10 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 6 \\ 2 \\ 1 \end{bmatrix}, \quad \mathbf{x}_3 = L\mathbf{x}_2 = \begin{bmatrix} 0 & 1 & 2 \\ 0.2 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 6 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 1.2 \\ 1 \end{bmatrix}.$$
39. Multiplying, we get
$$\mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 1 & 1 & 3 \\ 0.7 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 100 \\ 100 \\ 100 \end{bmatrix} = \begin{bmatrix} 500 \\ 70 \\ 50 \end{bmatrix}, \quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 1 & 1 & 3 \\ 0.7 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 500 \\ 70 \\ 50 \end{bmatrix} = \begin{bmatrix} 720 \\ 350 \\ 35 \end{bmatrix}, \quad \mathbf{x}_3 = L\mathbf{x}_2 = \begin{bmatrix} 1 & 1 & 3 \\ 0.7 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 720 \\ 350 \\ 35 \end{bmatrix} = \begin{bmatrix} 1175 \\ 504 \\ 175 \end{bmatrix}.$$
40. Multiplying, we get
$$\mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 0 & 1 & 2 & 5 \\ 0.5 & 0 & 0 & 0 \\ 0 & 0.7 & 0 & 0 \\ 0 & 0 & 0.3 & 0 \end{bmatrix}\begin{bmatrix} 10 \\ 10 \\ 10 \\ 10 \end{bmatrix} = \begin{bmatrix} 80 \\ 5 \\ 7 \\ 3 \end{bmatrix}, \quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 34 \\ 40 \\ 3.5 \\ 2.1 \end{bmatrix}, \quad \mathbf{x}_3 = L\mathbf{x}_2 = \begin{bmatrix} 57.5 \\ 17 \\ 28 \\ 1.05 \end{bmatrix}.$$
41. (a) The first ten generations of the species using each of the two models are as follows (entries are class 1, class 2):

            Model L1           Model L2
    x0      10, 10             10, 10
    x1      50, 8              50, 8
    x2      40, 40             208, 40
    x3      200, 32            872, 166.4
    x4      160, 160           3654.4, 697.6
    x5      800, 128           15315.2, 2923.52
    x6      640, 640           64184.3, 12252.2
    x7      3200, 512          268989, 51347.5
    x8      2560, 2560         1.127 × 10⁶, 215192
    x9      12800, 2048        4.724 × 10⁶, 901844
    x10     10240, 10240       1.980 × 10⁷, 3.780 × 10⁶

(b) [Figure: two plots of population percentage versus time (years 0–10), one per model, each showing the Class 1 and Class 2 percentages. Under model L1 the percentages oscillate; under model L2 they quickly level off.]
The graphs suggest that the first model does not have a steady state, while the second Leslie matrix quickly approaches a steady state ratio of class 1 to class 2 population.
 
42. Start with $x_0 = \begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix}$. Then
$$\mathbf{x}_1 = \begin{bmatrix} 0 & 0 & 20 \\ 0.1 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix} = \begin{bmatrix} 400 \\ 2 \\ 10 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} 0 & 0 & 20 \\ 0.1 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 400 \\ 2 \\ 10 \end{bmatrix} = \begin{bmatrix} 200 \\ 40 \\ 1 \end{bmatrix}, \quad \mathbf{x}_3 = \begin{bmatrix} 0 & 0 & 20 \\ 0.1 & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 200 \\ 40 \\ 1 \end{bmatrix} = \begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix}.$$
Thus after three generations the population is back to where it started, so that the population is cyclic.
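The three-year cycle is equivalent to saying L³ = I for this Leslie matrix, which is easy to confirm (an addition, not from the manual), assuming NumPy:

```python
import numpy as np

L = np.array([[0.0, 0.0, 20.0],
              [0.1, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
x0 = np.array([20.0, 20.0, 20.0])

L3 = np.linalg.matrix_power(L, 3)   # cube of the Leslie matrix
x3 = L3 @ x0                        # population after three generations
```

Here the product of the survival and fecundity terms around the cycle is 20 · 0.1 · 0.5 = 1, which is exactly why L³ is the identity.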
 
43. Start with $x_0 = \begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix}$. Then
$$\mathbf{x}_1 = \begin{bmatrix} 0 & 0 & 20 \\ s & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 20 \\ 20 \\ 20 \end{bmatrix} = \begin{bmatrix} 400 \\ 20s \\ 10 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} 0 & 0 & 20 \\ s & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 400 \\ 20s \\ 10 \end{bmatrix} = \begin{bmatrix} 200 \\ 400s \\ 10s \end{bmatrix}, \quad \mathbf{x}_3 = \begin{bmatrix} 0 & 0 & 20 \\ s & 0 & 0 \\ 0 & 0.5 & 0 \end{bmatrix}\begin{bmatrix} 200 \\ 400s \\ 10s \end{bmatrix} = \begin{bmatrix} 200s \\ 200s \\ 200s \end{bmatrix}.$$
Thus after three generations the population is still evenly distributed among the three classes, but the total population has been multiplied by 10s. So
• If s < 0.1, then the overall population declines after three years;
• If s = 0.1, then the overall population is cyclic every three years;
• If s > 0.1, then the overall population increases after three years.
44. The table gives us seven separate age classes, each of duration two years; from the table, the Leslie matrix of the system is
$$L = \begin{bmatrix} 0 & 0.4 & 1.8 & 1.8 & 1.8 & 1.6 & 0.6 \\ 0.3 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0.7 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.9 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.9 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0.9 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0.6 & 0 \end{bmatrix}.$$
Since in 1990,
$$\mathbf{x}_0 = \begin{bmatrix} 10 \\ 2 \\ 8 \\ 5 \\ 12 \\ 0 \\ 1 \end{bmatrix}, \quad\text{so that}\quad \mathbf{x}_1 = L\mathbf{x}_0 = \begin{bmatrix} 46.4 \\ 3 \\ 1.4 \\ 7.2 \\ 4.5 \\ 10.8 \\ 0 \end{bmatrix}, \quad\text{and}\quad \mathbf{x}_2 = L\mathbf{x}_1 = \begin{bmatrix} 42.06 \\ 13.92 \\ 2.1 \\ 1.26 \\ 6.48 \\ 4.05 \\ 6.48 \end{bmatrix}.$$
Then x₁ and x₂ represent the population distribution in 1991 and 1992. To project the population for the years 2000 and 2010, note that xₙ = Lⁿx₀. Thus
$$\mathbf{x}_{10} = L^{10}\mathbf{x}_0 \approx \begin{bmatrix} 167.80 \\ 46.08 \\ 29.54 \\ 24.35 \\ 20.06 \\ 16.48 \\ 9.05 \end{bmatrix}, \quad \mathbf{x}_{20} = L^{20}\mathbf{x}_0 \approx \begin{bmatrix} 69.12 \\ 18.61 \\ 12.24 \\ 10.59 \\ 8.29 \\ 6.51 \\ 3.57 \end{bmatrix}.$$
45. The adjacency matrix contains a 1 in row i, column j if there is an edge between vᵢ and vⱼ:
$$\begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}.$$
46. The adjacency matrix contains a 1 in row i, column j if there is an edge between vᵢ and vⱼ:
$$\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$
47. The adjacency matrix contains a 1 in row i, column j if there is an edge between vᵢ and vⱼ:
$$\begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 \end{bmatrix}.$$
48. The adjacency matrix contains a 1 in row i, column j if there is an edge between vᵢ and vⱼ:
$$\begin{bmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
49. [Figure: the graph on vertices v1, v2, v3, v4 drawn from the given adjacency matrix.]
50. [Figure: the graph on vertices v1, v2, v3, v4 drawn from the given adjacency matrix.]
51. [Figure: the graph on vertices v1, v2, v3, v4, v5 drawn from the given adjacency matrix.]
52. [Figure: the graph on vertices v1, v2, v3, v4, v5 drawn from the given adjacency matrix.]
53. The adjacency matrix contains a 1 in row i, column j if there is an edge from vᵢ to vⱼ:
$$\begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$
54. The adjacency matrix contains a 1 in row i, column j if there is an edge from vᵢ to vⱼ:
$$\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \end{bmatrix}.$$
55. The adjacency matrix contains a 1 in row i, column j if there is an edge from vᵢ to vⱼ:
$$\begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
56. The adjacency matrix contains a 1 in row i, column j if there is an edge from vᵢ to vⱼ:
$$\begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}.$$
57. [Figure: the digraph on vertices v1, v2, v3, v4 drawn from the given adjacency matrix.]
58. [Figure: the digraph on vertices v1, v2, v3, v4 drawn from the given adjacency matrix.]
59. [Figure: the digraph on vertices v1, v2, v3, v4, v5 drawn from the given adjacency matrix.]
60. [Figure: the digraph on vertices v1, v2, v3, v4, v5 drawn from the given adjacency matrix.]
61. In Exercise 50,
$$A = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}, \quad\text{so}\quad A^2 = \begin{bmatrix} 2 & 2 & 2 & 1 \\ 2 & 4 & 2 & 3 \\ 2 & 2 & 2 & 1 \\ 1 & 3 & 1 & 3 \end{bmatrix}.$$
The number of paths of length 2 between v1 and v2 is given by (A²)₁₂ = 2. So there are two paths of length 2 between v1 and v2.
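The path-counting rule behind these exercises — (Aᵏ)ᵢⱼ counts walks of length k from vᵢ to vⱼ — is easy to check numerically. A sketch (an addition; the adjacency matrix is as read from this solution), assuming NumPy:

```python
import numpy as np

# Adjacency matrix of Exercise 50's graph, as read from the solution
A = np.array([[0, 1, 0, 1],
              [1, 1, 1, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 0]])

A2 = A @ A
paths_v1_v2 = A2[0, 1]   # walks of length 2 from v1 to v2 (0-based indices)
```

The (1, 2) entry of A² gives the two length-2 paths found above.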
62. In Exercise 52,
$$A = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}, \quad\text{so}\quad A^2 = \begin{bmatrix} 2 & 2 & 2 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 3 & 3 \end{bmatrix}.$$
The number of paths of length 2 between v1 and v2 is given by (A²)₁₂ = 2. So there are two paths of length 2 between v1 and v2.
63. In Exercise 50,
$$A = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}, \quad\text{so}\quad A^3 = \begin{bmatrix} 3 & 7 & 3 & 6 \\ 7 & 11 & 7 & 8 \\ 3 & 7 & 3 & 6 \\ 6 & 8 & 6 & 5 \end{bmatrix}.$$
The number of paths of length 3 between v1 and v3 is given by (A³)₁₃ = 3. So there are three paths of length 3 between v1 and v3.
64. In Exercise 52,
$$A = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}, \quad\text{so}\quad A^4 = \begin{bmatrix} 12 & 12 & 12 & 0 & 0 \\ 12 & 12 & 12 & 0 & 0 \\ 12 & 12 & 12 & 0 & 0 \\ 0 & 0 & 0 & 18 & 18 \\ 0 & 0 & 0 & 18 & 18 \end{bmatrix}.$$
The number of paths of length 4 between v2 and v2 is given by (A⁴)₂₂ = 12. So there are twelve paths of length 4 between v2 and itself.
65. In Exercise 57,
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \end{bmatrix}, \quad\text{so}\quad A^2 = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 1 & 2 & 1 & 1 \end{bmatrix}.$$
The number of paths of length 2 from v1 to v3 is (A²)₁₃ = 0. There are no paths of length 2 from v1 to v3.
66. In Exercise 57,
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \end{bmatrix}, \quad\text{so}\quad A^3 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \\ 3 & 2 & 1 & 3 \end{bmatrix}.$$
The number of paths of length 3 from v4 to v1 is (A³)₄₁ = 3. There are three paths of length 3 from v4 to v1.
67. In Exercise 60,
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \end{bmatrix}, \quad\text{so}\quad A^3 = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 2 \\ 2 & 3 & 0 & 3 & 3 \\ 3 & 3 & 1 & 1 & 1 \\ 2 & 1 & 1 & 1 & 0 \end{bmatrix}.$$
The number of paths of length 3 from v4 to v1 is (A³)₄₁ = 3. There are three paths of length 3 from v4 to v1.
68. In Exercise 60,
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \end{bmatrix}, \quad\text{so}\quad A^4 = \begin{bmatrix} 3 & 2 & 1 & 2 & 2 \\ 3 & 3 & 1 & 1 & 1 \\ 6 & 5 & 3 & 3 & 2 \\ 3 & 4 & 1 & 4 & 4 \\ 2 & 2 & 1 & 2 & 3 \end{bmatrix}.$$
The number of paths of length 4 from v1 to v4 is (A⁴)₁₄ = 2. There are two paths of length 4 from v1 to v4.
69. (a) If there are no ones in row i, then no edges join vertex i to any of the other vertices. Thus G is a
disconnected graph, and vertex i is an isolated vertex.
(b) If there are no ones in column j, then no edges join vertex j to any of the other vertices. Thus G
is a disconnected graph, and vertex j is an isolated vertex.
70. (a) If row i of A2 is all zeros, then there are no paths of length 2 from vertex i to any other vertex.
(b) If column j of A2 is all zeros, then there are no paths of length 2 from any vertex to vertex j.
71. The adjacency matrix for the digraph is
$$A = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 & 0 \end{bmatrix}.$$
The number of wins that each player had is given by
$$A\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 4 \\ 3 \\ 3 \\ 1 \\ 3 \end{bmatrix},$$
so the ranking is: first, P2; second, P3, P4, P6 (tie); third, P1, P5 (tie). Ranking instead by combined wins and indirect wins, we compute
$$(A + A^2)\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 \\ 3 & 0 & 2 & 2 & 3 & 2 \\ 2 & 1 & 0 & 1 & 3 & 1 \\ 2 & 1 & 1 & 0 & 3 & 2 \\ 1 & 0 & 1 & 1 & 0 & 1 \\ 3 & 1 & 1 & 2 & 3 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 12 \\ 8 \\ 9 \\ 4 \\ 10 \end{bmatrix},$$
so the ranking is: first, P2; second, P6; third, P4; fourth, P3; fifth, P5; sixth, P1.
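This win/indirect-win ranking is a tiny matrix computation. A sketch (an addition; the adjacency matrix is as read from this solution), assuming NumPy:

```python
import numpy as np

# Tournament adjacency matrix: A[i, j] = 1 if player i+1 beat player j+1
A = np.array([[0, 0, 0, 0, 1, 0],
              [1, 0, 1, 0, 1, 1],
              [1, 0, 0, 1, 1, 0],
              [1, 1, 0, 0, 1, 0],
              [0, 0, 0, 0, 0, 1],
              [1, 0, 1, 1, 0, 0]])

wins = A.sum(axis=1)                    # direct wins per player
combined = (A + A @ A).sum(axis=1)      # direct plus indirect wins
ranking = np.argsort(-combined) + 1     # player numbers, best first
```

Sorting by the combined scores breaks the three-way tie among P3, P4, and P6.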
72. Let vertices 1 through 7 correspond to, in order, rodent, plant, insect, bird, fish, fox, and bear. Then
the adjacency matrix is


0 1 0 0 0 0 0
0 0 0 0 0 0 0


0 1 0 0 0 0 0



A=
0 1 1 0 1 0 0 .
0 1 1 0 0 0 0


1 0 0 1 0 0 0
1 0 0 0 1 1 0
(a) The number of direct sources of food is the number of ones in each row, so it is
   
1
1
1 0
   
1 1
   
  
A
1 = 3 .
1 2
   
1 2
3
1
Thus birds and bears have the most direct sources of food, 3.
(b) The number of species for which a given species is a direct source of food is the number of ones in
its column. Clearly plants, with a value of 4, are a direct source of food for the greatest number
of species. Note the analogy of parts (a) and (b) with the previous exercise, where eating a given
species is equivalent to winning that head-to-head match, and being eaten is equivalent to losing
it.
(c) Since A2 ij gives the number of paths of length 2 from vertex i to vertex j, it follows that A2 ij
is the number of ways in which species j is an indirect food source for species i. So summing the
entries of the ith row of A2 gives the total number of indirect food sources:
   
1
0
1 0
   
1 0
   
2 

A 1 = 
3 .
1 1
   
1 4
1
5
3.7. APPLICATIONS
227
Thus bears have the most indirect food sources. Combined direct and indirect food sources are
given by
   
1
1
1 0
   
1 1
   
2  

(A + A ) 1 = 
6 .
1 3
   
1 6
8
1
(d) The matrix A∗ is the matrix A after removing row 2 and column 2:


0 0 0 0 0 0
0 0 0 0 0 0


0 1 0 1 0 0

A∗ = 
0 1 0 0 0 0 .


1 0 1 0 0 0
1 0 0 1 1 0
Then we have
   
0
1
1 0
   
  2
∗ 1

A  =
 ,
1 1
1 2
3
1
so the species with the most direct food sources is the bear, with three. The species that is a
direct food source for the most species is the species for which the column sum is the largest;
three species, rodents, insects, and birds, each are prey for two species. Note that the column
sums can be computed as
 
1
1
 
1

1 1 1 1 1 1 A∗ or (A∗ )T 
1 .
 
1
1
The number of indirect food sources each species has is
   
1
0
1 0
   
 
2 1
 1
(A∗ ) 
1 = 0 ,
   
1 2
1
3
and the number of combined direct and indirect food sources is
   
1
0
1 0

  
 
2 1
 = 3 .
A∗ + (A∗ ) 
1 1
   
1 4
1
6
Thus birds have lost the most combined direct and indirect sources of food (3). The bear, fox,
and fish species have each lost two combined sources of food, and rodents and insects have each
lost one combined food source.
228
CHAPTER 3. MATRICES
(e) Since (A^*)^5 is the zero matrix, after a while no species will have a direct food source, so that the
ecosystem will be wiped out.
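These counts can be verified mechanically. The following is a minimal pure-Python check (our own helper functions, no libraries), using the matrix A^* given in part (d):

```python
# Numerical check of the food-web computations: direct, indirect, and
# combined food-source counts for A*, plus the nilpotency (A*)^5 = O.

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def row_sums(X):
    return [sum(row) for row in X]

A_star = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
]

A2 = matmul(A_star, A_star)
direct = row_sums(A_star)    # direct food sources per species
indirect = row_sums(A2)      # indirect food sources per species
combined = [d + i for d, i in zip(direct, indirect)]
print(direct)    # [0, 0, 2, 1, 2, 3]
print(indirect)  # [0, 0, 1, 0, 2, 3]
print(combined)  # [0, 0, 3, 1, 4, 6]

# (A*)^5 is the zero matrix, as claimed in part (e)
A5 = matmul(A2, matmul(A2, A_star))
print(all(v == 0 for row in A5 for v in row))  # True
```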
73. (a) Using the relationships given, we get the digraph with vertices Ann, Bert, Carla, Dana, and Ehaz. (Figure omitted.)
Letting vertices 1 through 5 correspond to the people in alphabetical order, the adjacency matrix
is
\[
A = \begin{bmatrix}
0 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}.
\]
(b) The element [A^k]_{ij} is positive if and only if there is a path of length k from i to j, so there is a path of length at
most k from i to j if [A + A^2 + \cdots + A^k]_{ij} ≥ 1. If Bert hears a rumor and we want to know how
many steps it takes before everyone else has heard the rumor, then we must calculate A, A + A^2,
A + A^2 + A^3, and so forth, until every entry in row 2 (except for the one in column 2, which
corresponds to Bert himself) is nonzero. This is not true for A, since for example [A]_{21} = 0. Next,


\[
A + A^2 = \begin{bmatrix}
0 & 1 & 1 & 0 & 2 \\
1 & 0 & 2 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 \\
1 & 0 & 2 & 0 & 2 \\
0 & 1 & 1 & 1 & 0
\end{bmatrix},
\]
so everyone else will have heard the rumor, either directly or indirectly, from Bert after two steps
(Carla will have heard it twice).
(c) Using the same logic as in part (b), we must compute successive sums of powers of A until every
entry in the first row, except for the entry in the first column corresponding to Ann herself, is
nonzero. From part (b), we see that this is not the case for A + A^2, since [A + A^2]_{14} = 0. Next,


\[
A + A^2 + A^3 = \begin{bmatrix}
0 & 2 & 2 & 1 & 2 \\
1 & 1 & 3 & 1 & 3 \\
0 & 1 & 1 & 1 & 1 \\
1 & 2 & 2 & 0 & 3 \\
1 & 1 & 2 & 1 & 1
\end{bmatrix},
\]
so everyone else will have heard the rumor from Ann after three steps. In fact, everyone but Dana
will have heard it twice.
(d) Vertex j is reachable from vertex i by a path of length k if [A^k]_{ij} is nonzero. Now, if j is reachable from i at
all, it is reachable by a path of length at most n, where n is the number of vertices in the digraph,
so we must compute
\[
[A + A^2 + A^3 + \cdots + A^n]_{ij}
\]
and see if it is nonzero. If it is, the two vertices are connected.
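The step-counting logic of parts (b) and (c) can be sketched in a few lines of Python (our own helpers; vertices are 0-indexed, so Ann is vertex 0 and Bert is vertex 1):

```python
# For each person, find the smallest k such that every other entry of
# their row of A + A^2 + ... + A^k is positive.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
]

S = A   # running sum A + A^2 + ... + A^k
P = A   # running power A^k
steps = {}
for k in range(2, 6):
    P = matmul(P, A)
    S = matadd(S, P)
    for person in (0, 1):   # Ann = vertex 0, Bert = vertex 1
        if person not in steps and all(
                S[person][j] > 0 for j in range(5) if j != person):
            steps[person] = k
print(steps)  # Bert's row fills in at k = 2, Ann's at k = 3
```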
74. Let G be a graph with m vertices, and A = (aij ) its adjacency matrix.
(a) The basis for the induction is n = 1, where the statement says that aij is the number of direct
paths between vertices i and j. But this is just the definition of the adjacency matrix, since
aij = 1 if i and j have an edge between them and is zero otherwise.
Next, assume [A^k]_{ij} is the number of paths of length k between i and j. Then
\[
[A^{k+1}]_{ij} = \sum_{l=1}^{m} [A^k]_{il}\,[A]_{lj}.
\]
For a given vertex l, [A^k]_{il} gives the number of paths of length k from i to l, and [A]_{lj} gives the number of paths
of length 1 from l to j (this is either zero or one). So the product gives the number of paths of
length k + 1 from i to j passing through l. Summing over all vertices l in the graph gives all paths
of length k + 1 from i to j, proving the inductive step.
(b) If G is a digraph, the statement of the result must be modified to say that the (i, j) entry of An
is equal to the number of n-paths from i to j; the proof is valid since the adjacency matrix of a
digraph takes arrow directions into account.
75. [AA^T]_{ij} = \sum_{k=1}^{m} a_{ik} a_{jk}. So this matrix entry is the number of vertices k that have edges from both
vertex i and vertex j.
76. From the diagram in Exercise 49, this graph is bipartite, with U = {v1 } and V = {v2 , v3 , v4 }.
77. From the diagram in Exercise 52, this graph is bipartite, with U = {v1 , v2 , v3 } and V = {v4 , v5 }.
78. The graph in Exercise 51 is not bipartite. For suppose v1 ∈ U . Then since there are edges from v1 to
both v3 and v4 , we must have v3 , v4 ∈ V . Next, there are edges from v3 to v5 , and from v4 to v2 , so
that v2 , v5 ∈ U . But there is an edge from v2 to v5 . So this graph cannot be bipartite.
79. Examining the matrix, we see that vertices 1, 2, and 4 are connected to vertices 3, 5, and 6 only, so
that if we define U = {v1 , v2 , v4 } and V = {v3 , v5 , v6 }, we have shown that this graph is bipartite.
80. Suppose that G has n vertices.
(a) First suppose that the adjacency matrix of G is of the form shown. Suppose that the zero blocks
have p and q = n − p rows respectively. Then any edges at vertices v1 , v2 , . . . , vp have their other
vertex in vertices vp+1 , vp+2 , . . . , vn , since the upper left p × p matrix is O. Similarly, any edges
at vertices vp+1 , vp+2 , . . . , vn have their other vertex in vertices v1 , v2 , . . . , vp , since the lower right
q × q matrix is O. So G is bipartite, with U = {v1 , v2 , . . . , vp } and V = {vp+1 , vp+2 , . . . , vn }.
Next suppose that G is bipartite. Label the vertices so that U = {v1 , v2 , . . . , vp } and V =
{vp+1 , vp+2 , . . . , vn }. Now consider the adjacency matrix A of G. aij = 0 if i, j ≤ p, since then
both vi and vj are in U ; similarly, aij = 0 if i, j > p, since then both vi and vj are in V . So the
upper left p × p submatrix as well as the bottom right q × q submatrix are the zero matrix. The
remaining two submatrices are transposes of each other since this is an undirected graph, so that
aij = 1 if and only if aji = 1. Thus A is of the form shown.
(b) Suppose G is bipartite and A its adjacency matrix, of the form given in part (a). It suffices to
show that odd powers of A have zero blocks at upper left of size p × p and at lower right of size
q × q, since then there are no paths of odd length between any vertex and itself. We do this by
induction. Clearly A^1 = A is of that form. Now,
\[
A^2 = \begin{bmatrix} O & B \\ B^T & O \end{bmatrix}^2
= \begin{bmatrix} OO + BB^T & OB + BO \\ B^T O + OB^T & B^T B + OO \end{bmatrix}
= \begin{bmatrix} BB^T & O \\ O & B^T B \end{bmatrix}
= \begin{bmatrix} D_1 & O \\ O & D_2 \end{bmatrix},
\]
where D_1 = BB^T and D_2 = B^T B. Suppose that A^{2k-1} has the required zero blocks, say
\[
A^{2k-1} = \begin{bmatrix} O & C_1 \\ C_2 & O \end{bmatrix}.
\]
Then
\[
A^{2k+1} = A^{2k-1} A^2
= \begin{bmatrix} O & C_1 \\ C_2 & O \end{bmatrix} \begin{bmatrix} D_1 & O \\ O & D_2 \end{bmatrix}
= \begin{bmatrix} OD_1 + C_1 O & OO + C_1 D_2 \\ C_2 D_1 + OO & C_2 O + OD_2 \end{bmatrix}
= \begin{bmatrix} O & C_1 D_2 \\ C_2 D_1 & O \end{bmatrix}.
\]
Thus A2k+1 is of this form as well. So no odd power of A has any entries in the upper left or
lower right block, so there are no odd paths beginning and ending at the same vertex. Thus all
circuits must be of even length. (Note that in the matrix A2k+1 above, we must in fact have
(C1 D2 )T = C2 D1 since powers of a symmetric matrix are symmetric, but we do not need this
fact.)
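A small numerical illustration of this block argument (the 2 × 3 matrix B below is an arbitrary choice of ours, not from the exercise):

```python
# For A in the block form [[O, B], [B^T, O]], every odd power of A has
# zero diagonal blocks, so no circuit has odd length; even powers need not.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[1, 0, 1],
     [1, 1, 0]]
p, q = 2, 3
n = p + q
# Assemble A = [[O, B], [B^T, O]] as a list of rows.
A = [[0] * p + B[i] for i in range(p)] + \
    [[B[i][j] for i in range(p)] + [0] * q for j in range(q)]

def diag_blocks_zero(M):
    return (all(M[i][j] == 0 for i in range(p) for j in range(p)) and
            all(M[i][j] == 0 for i in range(p, n) for j in range(p, n)))

P = A
powers = {1: A}
for k in range(2, 6):
    P = matmul(P, A)
    powers[k] = P

print([diag_blocks_zero(powers[k]) for k in (1, 3, 5)])  # [True, True, True]
print(diag_blocks_zero(powers[2]))                       # False
```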
Chapter Review
1. (a) True. See Exercise 34 in Section 3.2. If A is an m × n matrix, then AT is an n × m matrix. Then
AAT is defined and is an m × m matrix, and AT A is also defined and is an n × n matrix.
(b) False. This is true only when A is invertible. For example, if
\[
A = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad\text{then}\quad
AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},
\]
but A ≠ O.
(c) False. What is true is that XAA−1 = BA−1 so that X = BA−1 . Matrix multiplication is not
commutative, so that BA−1 6= A−1 B in general.
(d) True. This is Theorem 3.11 in Section 3.3.
(e) True. If E performs kRi , then E is a matrix with all zero entries off the diagonal, so it is its
own transpose and thus E T is also the elementary matrix that performs kRi . If E performs
Ri + kRj , then the only off-diagonal entry in E is [E]ij = k, so the only off-diagonal entry of E T
is [E T ]ji = k and thus E T is the elementary matrix performing Rj + kRi . Finally, if E performs
Ri ↔ Rj , then [E]ii = [E]jj = 0 and [E]ij = [E]ji = 1, so that E T = E and thus E T is also an
elementary matrix exchanging Ri and Rj .
(f ) False. Elementary matrices perform a single row operation, while their products can perform multiple row operations. There are many examples. For instance, consider the elementary matrices
corresponding to R1 ↔ R2 and 2R1 :
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 2 & 0 \end{bmatrix},
\]
which is not an elementary matrix.
(g) True. See Theorem 3.21 in Section 3.5.
(h) False. For this to be true, the plane must pass through the origin.
(i) True, since
T (c1 u + c2 v) = −(c1 u + c2 v) = −c1 u − c2 v = c1 (−u) + c2 (−v) = c1 T (u) + c2 T (v).
(j) False; the matrix is 5 × 4, not 4 × 5. See Theorems 3.30 and 3.31 in Section 3.6.
2. Since A is 2 × 2, so is A^2. Since B is 2 × 3, the product A^2 B makes sense. Then
\[
A^2 = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}
= \begin{bmatrix} 7 & 12 \\ 18 & 31 \end{bmatrix},
\quad\text{so that}\quad
A^2 B = \begin{bmatrix} 7 & 12 \\ 18 & 31 \end{bmatrix}
\begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} 50 & -36 & 41 \\ 129 & -93 & 106 \end{bmatrix}.
\]
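A quick arithmetic check of this product with plain nested-list matrices:

```python
# Verify A^2 and A^2 B for Exercise 2.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2], [3, 5]]
B = [[2, 0, -1], [3, -3, 4]]
A2 = matmul(A, A)
A2B = matmul(A2, B)
print(A2)   # [[7, 12], [18, 31]]
print(A2B)  # [[50, -36, 41], [129, -93, 106]]
```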
3. B^2 is not defined, since the number of columns of B is not equal to the number of rows of B. So this
matrix product does not make sense.
4. Since B^T is 3 × 2, A^{-1} is 2 × 2, and B is 2 × 3, this matrix product makes sense as long as A is
invertible. Since det A = 1 · 5 − 2 · 3 = −1, we have
\[
A^{-1} = \frac{1}{-1} \begin{bmatrix} 5 & -2 \\ -3 & 1 \end{bmatrix}
= \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix},
\]
so that
\[
B^T A^{-1} B
= \begin{bmatrix} 2 & 3 \\ 0 & -3 \\ -1 & 4 \end{bmatrix}
\begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}
\begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} -1 & 1 \\ -9 & 3 \\ 17 & -6 \end{bmatrix}
\begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} 1 & -3 & 5 \\ -9 & -9 & 21 \\ 16 & 18 & -41 \end{bmatrix}.
\]
5. Since BB^T is always defined (see Exercise 34 in Section 3.2) and is square, (BB^T)^{-1} makes sense if
BB^T is invertible. We have
\[
BB^T = \begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
\begin{bmatrix} 2 & 3 \\ 0 & -3 \\ -1 & 4 \end{bmatrix}
= \begin{bmatrix} 5 & 2 \\ 2 & 34 \end{bmatrix}.
\]
Then det(BB^T) = 5 · 34 − 2^2 = 166, so that
\[
(BB^T)^{-1} = \frac{1}{166} \begin{bmatrix} 34 & -2 \\ -2 & 5 \end{bmatrix}.
\]
6. Since B^T B is always defined (see Exercise 34 in Section 3.2) and is square, (B^T B)^{-1} makes sense if
B^T B is invertible. We have
\[
B^T B = \begin{bmatrix} 2 & 3 \\ 0 & -3 \\ -1 & 4 \end{bmatrix}
\begin{bmatrix} 2 & 0 & -1 \\ 3 & -3 & 4 \end{bmatrix}
= \begin{bmatrix} 13 & -9 & 10 \\ -9 & 9 & -12 \\ 10 & -12 & 17 \end{bmatrix}.
\]
To compute the inverse, we row-reduce [B^T B | I]:
\[
\left[\begin{array}{rrr|rrr} 13 & -9 & 10 & 1 & 0 & 0 \\ -9 & 9 & -12 & 0 & 1 & 0 \\ 10 & -12 & 17 & 0 & 0 & 1 \end{array}\right]
\xrightarrow{\substack{R_2 + \frac{9}{13}R_1 \\ R_3 - \frac{10}{13}R_1}}
\left[\begin{array}{rrr|rrr} 13 & -9 & 10 & 1 & 0 & 0 \\ 0 & \frac{36}{13} & -\frac{66}{13} & \frac{9}{13} & 1 & 0 \\ 0 & -\frac{66}{13} & \frac{121}{13} & -\frac{10}{13} & 0 & 1 \end{array}\right]
\xrightarrow{R_3 + \frac{66}{36}R_2}
\left[\begin{array}{rrr|rrr} 13 & -9 & 10 & 1 & 0 & 0 \\ 0 & \frac{36}{13} & -\frac{66}{13} & \frac{9}{13} & 1 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & \frac{11}{6} & 1 \end{array}\right].
\]
Since the matrix on the left has a zero row, B^T B is not invertible, so (B^T B)^{-1} does not exist.
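The two determinant computations in Exercises 5 and 6 can be checked together (pure Python; `det3` is our own cofactor-expansion helper):

```python
# BB^T is invertible (det 166), while B^T B, a 3x3 matrix of rank at
# most 2, has determinant 0.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

B = [[2, 0, -1], [3, -3, 4]]
BBt = matmul(B, transpose(B))
BtB = matmul(transpose(B), B)
det2 = BBt[0][0]*BBt[1][1] - BBt[0][1]*BBt[1][0]
print(BBt)       # [[5, 2], [2, 34]]
print(det2)      # 166
print(BtB)       # [[13, -9, 10], [-9, 9, -12], [10, -12, 17]]
print(det3(BtB)) # 0
```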
7. The outer product expansion of AA^T is
\[
AA^T = \mathbf{a}_1 A_1 + \mathbf{a}_2 A_2
= \begin{bmatrix} 1 \\ 3 \end{bmatrix} \begin{bmatrix} 1 & 3 \end{bmatrix}
+ \begin{bmatrix} 2 \\ 5 \end{bmatrix} \begin{bmatrix} 2 & 5 \end{bmatrix}
= \begin{bmatrix} 1 & 3 \\ 3 & 9 \end{bmatrix}
+ \begin{bmatrix} 4 & 10 \\ 10 & 25 \end{bmatrix}
= \begin{bmatrix} 5 & 13 \\ 13 & 34 \end{bmatrix}.
\]
8. By Theorem 3.9 in Section 3.3, we know that (A^{-1})^{-1} = A, so we must find the inverse of the given
matrix. Now, det A^{-1} = \frac12 \cdot 4 - (-1)\left(-\frac32\right) = \frac12, so that
\[
A = (A^{-1})^{-1} = \frac{1}{1/2} \begin{bmatrix} 4 & 1 \\ \frac32 & \frac12 \end{bmatrix}
= \begin{bmatrix} 8 & 2 \\ 3 & 1 \end{bmatrix}.
\]

9. Since
\[
AX = B = \begin{bmatrix} -1 & -3 \\ 5 & 0 \\ 3 & -2 \end{bmatrix},
\]
we have X = A^{-1}B. To find A^{-1}, we compute
\[
\left[\begin{array}{rrr|rrr} 1 & 0 & -1 & 1 & 0 & 0 \\ 2 & 3 & -1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{array}\right]
\to
\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 2 & -\frac12 & \frac32 \\ 0 & 1 & 0 & -1 & \frac12 & -\frac12 \\ 0 & 0 & 1 & 1 & -\frac12 & \frac32 \end{array}\right].
\]
Then
\[
X = A^{-1}B
= \begin{bmatrix} 2 & -\frac12 & \frac32 \\ -1 & \frac12 & -\frac12 \\ 1 & -\frac12 & \frac32 \end{bmatrix}
\begin{bmatrix} -1 & -3 \\ 5 & 0 \\ 3 & -2 \end{bmatrix}
= \begin{bmatrix} 0 & -9 \\ 2 & 4 \\ 1 & -6 \end{bmatrix}.
\]
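As a check of this solution (restating A, X, and B as reconstructed above), multiplying A by X should reproduce B:

```python
# Check of Exercise 9: AX = B for the computed X.
A = [[1, 0, -1], [2, 3, -1], [0, 1, 1]]
X = [[0, -9], [2, 4], [1, -6]]
B = [[-1, -3], [5, 0], [3, -2]]
AX = [[sum(A[i][k] * X[k][j] for k in range(3)) for j in range(2)]
      for i in range(3)]
print(AX == B)  # True
```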
10. Since det A = 1 · 6 − 2 · 4 = −2 ≠ 0, A is invertible, so by Theorem 3.12 in Section 3.3, it can be written
as a product of elementary matrices. As in Example 3.29 in Section 3.3, we start by row-reducing A:
\[
A = \begin{bmatrix} 1 & 2 \\ 4 & 6 \end{bmatrix}
\xrightarrow{R_2 - 4R_1} \begin{bmatrix} 1 & 2 \\ 0 & -2 \end{bmatrix}
\xrightarrow{-\frac12 R_2} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}
\xrightarrow{R_1 - 2R_2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]
So by applying
\[
E_1 = \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix}, \quad
E_2 = \begin{bmatrix} 1 & 0 \\ 0 & -\frac12 \end{bmatrix}, \quad
E_3 = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}
\]
in that order to A we get the identity matrix, so that E_3 E_2 E_1 A = I. Since elementary matrices are
invertible, this means that A = (E_3 E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} E_3^{-1}. Since E_1 is R_2 − 4R_1, it follows that
E_1^{-1} is R_2 + 4R_1. Since E_2 is −\frac12 R_2, it follows that E_2^{-1} is −2R_2. Since E_3 is R_1 − 2R_2, it follows that
E_3^{-1} is R_1 + 2R_2. Thus
\[
A = E_1^{-1} E_2^{-1} E_3^{-1}
= \begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}.
\]
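A short check that the product of the inverse elementary matrices recovers A:

```python
# Check of Exercise 10: E1^{-1} E2^{-1} E3^{-1} = A = [[1, 2], [4, 6]].

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

E1_inv = [[1, 0], [4, 1]]   # undoes R2 - 4R1
E2_inv = [[1, 0], [0, -2]]  # undoes -(1/2)R2
E3_inv = [[1, 2], [0, 1]]   # undoes R1 - 2R2
A = matmul(E1_inv, matmul(E2_inv, E3_inv))
print(A)  # [[1, 2], [4, 6]]
```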
11. To show that (I − A)^{-1} = I + A + A^2, it suffices to show that (I + A + A^2)(I − A) = I. But
(I + A + A^2)(I − A) = II − IA + AI − A^2 + A^2 I − A^3 = I − A + A − A^2 + A^2 − A^3 = I − A^3 = I − O = I.
12. Use the multiplier method. We want to reduce A to an upper triangular matrix:
\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 3 & 1 & 1 \\ 2 & -1 & 1 \end{bmatrix}
\xrightarrow{\substack{R_2 - 3R_1 \\ R_3 - 2R_1}}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -2 & -2 \\ 0 & -3 & -1 \end{bmatrix}
\xrightarrow{R_3 - \frac32 R_2}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -2 & -2 \\ 0 & 0 & 2 \end{bmatrix} = U.
\]
Then \ell_{21} = 3, \ell_{31} = 2, and \ell_{32} = \frac32, so that
\[
L = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & \frac32 & 1 \end{bmatrix}.
\]
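A quick check that LU = A with these multipliers (using `fractions.Fraction` so the 3/2 stays exact):

```python
# Check of Exercise 12: the LU factorization reproduces A.
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

L = [[1, 0, 0], [3, 1, 0], [2, Fraction(3, 2), 1]]
U = [[1, 1, 1], [0, -2, -2], [0, 0, 2]]
A = [[1, 1, 1], [3, 1, 1], [2, -1, 1]]
print(matmul(L, U) == A)  # True
```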
13. We follow the comment after Example 3.48 in Section 3.5. Start by finding the reduced row echelon
form of A:
\[
\begin{bmatrix} 2 & -4 & 5 & 8 & 5 \\ 1 & -2 & 2 & 3 & 1 \\ 4 & -8 & 3 & 2 & 6 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_2}
\begin{bmatrix} 1 & -2 & 2 & 3 & 1 \\ 2 & -4 & 5 & 8 & 5 \\ 4 & -8 & 3 & 2 & 6 \end{bmatrix}
\xrightarrow{\substack{R_2 - 2R_1 \\ R_3 - 4R_1}}
\begin{bmatrix} 1 & -2 & 2 & 3 & 1 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & -5 & -10 & 2 \end{bmatrix}
\xrightarrow{R_3 + 5R_2}
\begin{bmatrix} 1 & -2 & 2 & 3 & 1 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 17 \end{bmatrix}
\]
\[
\xrightarrow{\frac{1}{17} R_3}
\begin{bmatrix} 1 & -2 & 2 & 3 & 1 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\xrightarrow{\substack{R_1 - R_3 \\ R_2 - 3R_3}}
\begin{bmatrix} 1 & -2 & 2 & 3 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\xrightarrow{R_1 - 2R_2}
\begin{bmatrix} 1 & -2 & 0 & -1 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.
\]
Thus a basis for row(A) is given by the nonzero rows of the reduced matrix, or
\[
\left\{ \begin{bmatrix} 1 & -2 & 0 & -1 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 1 & 2 & 0 \end{bmatrix},
\begin{bmatrix} 0 & 0 & 0 & 0 & 1 \end{bmatrix} \right\}.
\]
Note that since the matrix has rank 3, all three rows of A are linearly independent, so the rows of the original matrix also
provide a basis for row(A).
A basis for col(A) is given by the column vectors of the original matrix A corresponding to the leading
1s in the row-reduced matrix, so a basis for col(A) is
\[
\left\{ \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix},
\begin{bmatrix} 5 \\ 2 \\ 3 \end{bmatrix},
\begin{bmatrix} 5 \\ 1 \\ 6 \end{bmatrix} \right\}.
\]
Finally, from the row-reduced form, the solutions to Ax = 0, where
\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix},
\]
can be found by noting that x_2 = s and x_4 = t are free variables and that x_1 = 2s + t, x_3 = −2t, and
x_5 = 0. Thus the null space of A consists of the vectors
\[
\begin{bmatrix} 2s + t \\ s \\ -2t \\ t \\ 0 \end{bmatrix}
= s \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ t \begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \\ 0 \end{bmatrix},
\]
so that a basis for the null space is given by
\[
\left\{ \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix} 1 \\ 0 \\ -2 \\ 1 \\ 0 \end{bmatrix} \right\}.
\]
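A quick check that the null-space basis vectors really satisfy Ax = 0:

```python
# Check of Exercise 13: both basis vectors of null(A) are killed by A.
A = [[2, -4, 5, 8, 5],
     [1, -2, 2, 3, 1],
     [4, -8, 3, 2, 6]]
basis = [[2, 1, 0, 0, 0], [1, 0, -2, 1, 0]]
for v in basis:
    print([sum(a * x for a, x in zip(row, v)) for row in A])  # [0, 0, 0]
```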
14. The row spaces will be the same since by Theorem 3.20 in Section 3.5, if A and B are row-equivalent,
they have the same row spaces. However, the column spaces need not be the same. For example, let
\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\xrightarrow{R_3 - R_1}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} = B.
\]
Then A and B are row-equivalent, but
\[
\operatorname{col}(A) = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right)
\quad\text{while}\quad
\operatorname{col}(B) = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right).
\]
15. If A is invertible, then so is AT . But by Theorem 3.27 in Section 3.5, invertible matrices have trivial
null spaces, so that null(A) = null(AT ) = 0. If A is not invertible, then the null spaces need not be
the same. For example, let
\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\quad\text{so that}\quad
A^T = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}.
\]
Then A^T row reduces to
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\quad\text{so that}\quad
\operatorname{null}(A^T) = \operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right).
\]
But
\[
\operatorname{null}(A) = \operatorname{span}\left( \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right).
\]
Since every vector in null(A^T) has a zero first component, neither of the basis vectors for null(A) is in
null(A^T), so the spaces cannot be equal.
16. If the rows of A add up to zero, then the rows are linearly dependent, so by Theorem 3.27 in Section
3.5, A is not invertible.
17. Since A has n linearly independent columns, rank(A) = n. Theorem 3.28 of Section 3.5 tells us that
rank(AT A) = rank(A) = n, so AT A is invertible since it is an n × n matrix. For the case of AAT ,
since m may be greater than n, there is no reason to believe that this matrix must be invertible. One
counterexample is
\[
A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
A^T = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad
AA^T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\to \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
Thus AA^T is not invertible, since it row-reduces to a matrix with a zero row.
18. The matrix of T is [T] = [T(e_1) \ T(e_2)]. From the given data, we compute
\[
T(\mathbf{e}_1) = T\left( \tfrac12 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \tfrac12 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right)
= \tfrac12 \begin{bmatrix} 2 \\ 3 \end{bmatrix} + \tfrac12 \begin{bmatrix} 0 \\ 5 \end{bmatrix}
= \begin{bmatrix} 1 \\ 4 \end{bmatrix},
\]
\[
T(\mathbf{e}_2) = T\left( \tfrac12 \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \tfrac12 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right)
= \tfrac12 \begin{bmatrix} 2 \\ 3 \end{bmatrix} - \tfrac12 \begin{bmatrix} 0 \\ 5 \end{bmatrix}
= \begin{bmatrix} 1 \\ -1 \end{bmatrix}.
\]
Thus
\[
[T] = \begin{bmatrix} 1 & 1 \\ 4 & -1 \end{bmatrix}.
\]
19. Let R be the rotation and P the projection. The given line has direction vector \begin{bmatrix} 1 \\ -2 \end{bmatrix}. Then from
Examples 3.58 and 3.59 in Section 3.6,
\[
[R] = \begin{bmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{bmatrix}
= \begin{bmatrix} \frac{\sqrt2}{2} & -\frac{\sqrt2}{2} \\ \frac{\sqrt2}{2} & \frac{\sqrt2}{2} \end{bmatrix},
\]
\[
[P] = \frac{1}{1^2 + (-2)^2} \begin{bmatrix} 1 \cdot 1 & 1 \cdot (-2) \\ 1 \cdot (-2) & (-2) \cdot (-2) \end{bmatrix}
= \begin{bmatrix} \frac15 & -\frac25 \\ -\frac25 & \frac45 \end{bmatrix}.
\]
So R followed by P is
\[
[P \circ R] = [P][R]
= \begin{bmatrix} \frac15 & -\frac25 \\ -\frac25 & \frac45 \end{bmatrix}
\begin{bmatrix} \frac{\sqrt2}{2} & -\frac{\sqrt2}{2} \\ \frac{\sqrt2}{2} & \frac{\sqrt2}{2} \end{bmatrix}
= \begin{bmatrix} -\frac{\sqrt2}{10} & -\frac{3\sqrt2}{10} \\ \frac{\sqrt2}{5} & \frac{3\sqrt2}{5} \end{bmatrix}.
\]
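A numerical check of this composition (floating point, so we compare within a tolerance):

```python
# Check of Exercise 19: [P][R] for a 45-degree rotation followed by
# projection onto the line with direction (1, -2).
import math

c = s = math.sqrt(2) / 2
R = [[c, -s], [s, c]]
d = (1, -2)
dd = d[0] ** 2 + d[1] ** 2
P = [[d[i] * d[j] / dd for j in range(2)] for i in range(2)]

PR = [[sum(P[i][k] * R[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
expected = [[-math.sqrt(2) / 10, -3 * math.sqrt(2) / 10],
            [math.sqrt(2) / 5, 3 * math.sqrt(2) / 5]]
print(all(abs(PR[i][j] - expected[i][j]) < 1e-12
          for i in range(2) for j in range(2)))  # True
```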
20. Suppose that cv + dT (v) = 0. We want to show that c = d = 0. Since T is linear and T 2 (v) = 0,
applying T to both sides of this equation gives
T (cv + dT (v)) = cT (v) + dT (T (v)) = cT (v) + dT 2 (v) = cT (v) = 0.
Since T (v) 6= 0, it follows that c = 0. But then cv + dT (v) = dT (v) = 0. Again since T (v) 6= 0, it
follows that d = 0. Thus c = d = 0 and v and T (v) are linearly independent.
Chapter 4
Eigenvalues and Eigenvectors
4.1
Introduction to Eigenvalues and Eigenvectors
1. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 0 & 3 \\ 3 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \cdot 1 + 3 \cdot 1 \\ 3 \cdot 1 + 0 \cdot 1 \end{bmatrix}
= \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = 3\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue 3.
2. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ -3 \end{bmatrix}
= \begin{bmatrix} 1 \cdot 3 + 2 \cdot (-3) \\ 2 \cdot 3 + 1 \cdot (-3) \end{bmatrix}
= \begin{bmatrix} -3 \\ 3 \end{bmatrix} = -\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue −1.
3. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} -1 & 1 \\ 6 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix}
= \begin{bmatrix} -1 \cdot 1 + 1 \cdot (-2) \\ 6 \cdot 1 + 0 \cdot (-2) \end{bmatrix}
= \begin{bmatrix} -3 \\ 6 \end{bmatrix} = -3 \begin{bmatrix} 1 \\ -2 \end{bmatrix} = -3\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue −3.
4. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 4 & -2 \\ 5 & -7 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \end{bmatrix}
= \begin{bmatrix} 4 \cdot 4 - 2 \cdot 2 \\ 5 \cdot 4 - 7 \cdot 2 \end{bmatrix}
= \begin{bmatrix} 12 \\ 6 \end{bmatrix} = 3\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue 3.
5. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & -2 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 3 \cdot 2 + 0 \cdot (-1) + 0 \cdot 1 \\ 0 \cdot 2 + 1 \cdot (-1) - 2 \cdot 1 \\ 1 \cdot 2 + 0 \cdot (-1) + 1 \cdot 1 \end{bmatrix}
= \begin{bmatrix} 6 \\ -3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} = 3\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue 3.
6. Following Example 4.1, to see that v is an eigenvector of A, we show that Av is a multiple of v:
\[
A\mathbf{v} = \begin{bmatrix} 0 & 1 & -1 \\ 1 & 1 & 1 \\ 1 & 2 & 0 \end{bmatrix} \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \cdot (-2) + 1 \cdot 1 - 1 \cdot 1 \\ 1 \cdot (-2) + 1 \cdot 1 + 1 \cdot 1 \\ 1 \cdot (-2) + 2 \cdot 1 + 0 \cdot 1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = 0\mathbf{v}.
\]
Thus v is an eigenvector of A with eigenvalue 0.
7. We want to show that λ = 3 is an eigenvalue of
\[
A = \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = 3v, i.e., with (A − 3I)v = 0.
So we show that (A − 3I)v = 0 has nontrivial solutions.
\[
A - 3I = \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix} - \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}
= \begin{bmatrix} -1 & 2 \\ 2 & -4 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} -1 & 2 \\ 2 & -4 \end{bmatrix}
\xrightarrow{-R_1} \begin{bmatrix} 1 & -2 \\ 2 & -4 \end{bmatrix}
\xrightarrow{R_2 - 2R_1} \begin{bmatrix} 1 & -2 \\ 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \end{bmatrix} with x = 2y is an eigenvector, so for example one eigenvector corresponding to λ = 3 is \begin{bmatrix} 2 \\ 1 \end{bmatrix}.
8. We want to show that λ = −1 is an eigenvalue of
\[
A = \begin{bmatrix} 2 & 3 \\ 3 & 2 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = −v, i.e., with (A − (−I))v =
(A + I)v = 0. So we show that (A + I)v = 0 has nontrivial solutions.
\[
A + I = \begin{bmatrix} 2 & 3 \\ 3 & 2 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix}
\xrightarrow{R_2 - R_1} \begin{bmatrix} 3 & 3 \\ 0 & 0 \end{bmatrix}
\xrightarrow{\frac13 R_1} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \end{bmatrix} with x = −y is an eigenvector, so for example one eigenvector corresponding to λ = −1 is \begin{bmatrix} 1 \\ -1 \end{bmatrix}.
9. We want to show that λ = 1 is an eigenvalue of
\[
A = \begin{bmatrix} 0 & 4 \\ -1 & 5 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = 1v, i.e., with (A − I)v = 0.
So we show that (A − I)v = 0 has nontrivial solutions.
\[
A - I = \begin{bmatrix} 0 & 4 \\ -1 & 5 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} -1 & 4 \\ -1 & 4 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} -1 & 4 \\ -1 & 4 \end{bmatrix}
\xrightarrow{-R_1} \begin{bmatrix} 1 & -4 \\ -1 & 4 \end{bmatrix}
\xrightarrow{R_2 + R_1} \begin{bmatrix} 1 & -4 \\ 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \end{bmatrix} with x = 4y is an eigenvector, so for example one eigenvector corresponding to λ = 1 is \begin{bmatrix} 4 \\ 1 \end{bmatrix}.
10. We want to show that λ = −6 is an eigenvalue of
\[
A = \begin{bmatrix} 4 & -2 \\ 5 & -7 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = −6v, i.e., with (A − (−6I))v =
(A + 6I)v = 0. So we show that (A + 6I)v = 0 has nontrivial solutions.
\[
A + 6I = \begin{bmatrix} 4 & -2 \\ 5 & -7 \end{bmatrix} + \begin{bmatrix} 6 & 0 \\ 0 & 6 \end{bmatrix}
= \begin{bmatrix} 10 & -2 \\ 5 & -1 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} 10 & -2 \\ 5 & -1 \end{bmatrix}
\xrightarrow{\frac{1}{10} R_1} \begin{bmatrix} 1 & -\frac15 \\ 5 & -1 \end{bmatrix}
\xrightarrow{R_2 - 5R_1} \begin{bmatrix} 1 & -\frac15 \\ 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \end{bmatrix} with x = \frac15 y is an eigenvector, so for example one eigenvector corresponding to λ = −6 is \begin{bmatrix} 1 \\ 5 \end{bmatrix}.
11. We want to show that λ = −1 is an eigenvalue of
\[
A = \begin{bmatrix} 1 & 0 & 2 \\ -1 & 1 & 1 \\ 2 & 0 & 1 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = −1v, i.e., with (A + I)v = 0.
So we show that (A + I)v = 0 has nontrivial solutions.
\[
A + I = \begin{bmatrix} 1 & 0 & 2 \\ -1 & 1 & 1 \\ 2 & 0 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 2 \\ -1 & 2 & 1 \\ 2 & 0 & 2 \end{bmatrix}.
\]
Row-reducing, we get
\[
\begin{bmatrix} 2 & 0 & 2 \\ -1 & 2 & 1 \\ 2 & 0 & 2 \end{bmatrix}
\xrightarrow{R_3 - R_1} \begin{bmatrix} 2 & 0 & 2 \\ -1 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix} -1 & 2 & 1 \\ 2 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{-R_1} \begin{bmatrix} 1 & -2 & -1 \\ 2 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{R_2 - 2R_1} \begin{bmatrix} 1 & -2 & -1 \\ 0 & 4 & 4 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{\frac14 R_2} \begin{bmatrix} 1 & -2 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{R_1 + 2R_2} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \\ z \end{bmatrix} with x = y = −z is an eigenvector, so for example one eigenvector
corresponding to λ = −1 is \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}.
12. We want to show that λ = 2 is an eigenvalue of
\[
A = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 1 & 1 \\ 4 & 2 & 0 \end{bmatrix}.
\]
Following Example 4.2, that means we wish to find a vector v with Av = 2v, i.e., with (A − 2I)v = 0.
So we show that (A − 2I)v = 0 has nontrivial solutions.
\[
A - 2I = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 1 & 1 \\ 4 & 2 & 0 \end{bmatrix} - \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 1 & -1 \\ 1 & -1 & 1 \\ 4 & 2 & -2 \end{bmatrix}.
\]
Row-reducing, we have
\[
\begin{bmatrix} 1 & 1 & -1 \\ 1 & -1 & 1 \\ 4 & 2 & -2 \end{bmatrix}
\xrightarrow{\substack{R_2 - R_1 \\ R_3 - 4R_1}} \begin{bmatrix} 1 & 1 & -1 \\ 0 & -2 & 2 \\ 0 & -2 & 2 \end{bmatrix}
\xrightarrow{R_3 - R_2} \begin{bmatrix} 1 & 1 & -1 \\ 0 & -2 & 2 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{-\frac12 R_2} \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}
\xrightarrow{R_1 - R_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}.
\]
It follows that any vector \begin{bmatrix} x \\ y \\ z \end{bmatrix} with x = 0 and y = z is an eigenvector. Thus for example \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} is an
eigenvector for λ = 2.
13. Since A is the reflection F in the y-axis, the only vectors that F maps parallel to themselves are vectors
parallel to the x-axis (multiples of \begin{bmatrix} 1 \\ 0 \end{bmatrix}), which are reversed (eigenvalue λ = −1), and vectors parallel
to the y-axis (multiples of \begin{bmatrix} 0 \\ 1 \end{bmatrix}), which are sent to themselves (eigenvalue λ = 1). So λ = ±1 are the
eigenvalues of A, with eigenspaces
\[
E_{-1} = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right),
\qquad
E_1 = \operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right).
\]
(Figure omitted.)
14. The only vectors that the transformation F corresponding to A maps parallel to themselves are vectors perpendicular
to y = x, which are reversed (so correspond to the eigenvalue λ = −1), and vectors parallel
to y = x, which are sent to themselves (so correspond to the eigenvalue λ = 1). Vectors perpendicular
to y = x are multiples of \begin{bmatrix} 1 \\ -1 \end{bmatrix}; vectors parallel to y = x are multiples of \begin{bmatrix} 1 \\ 1 \end{bmatrix}, so the eigenspaces are
\[
E_{-1} = \operatorname{span}\left( \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right),
\qquad
E_1 = \operatorname{span}\left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right).
\]
(Figure omitted.)
15. If x is an eigenvector of A, then Ax is a multiple of x:
\[
A\mathbf{x} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ 0 \end{bmatrix}.
\]
Then clearly any vector parallel to the x-axis, i.e., of the form \begin{bmatrix} x \\ 0 \end{bmatrix}, is an eigenvector for A with
eigenvalue 1. But this projection also transforms vectors parallel to the y-axis to the zero vector, since
the projection of such a vector on the x-axis is 0. So vectors of the form \begin{bmatrix} 0 \\ y \end{bmatrix} are eigenvectors for A
with eigenvalue 0. Finally, a vector of the form \begin{bmatrix} x \\ y \end{bmatrix} where neither x nor y is zero is not an eigenvector:
it is transformed into \begin{bmatrix} x \\ 0 \end{bmatrix}, and comparing the x-coordinates shows that λ would have to be 1, while
comparing y-coordinates shows that it would have to be zero. Thus there is no such λ. So in summary,
λ = 1 is an eigenvalue with eigenspace E_1 = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right), and
λ = 0 is an eigenvalue with eigenspace E_0 = \operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right).
16. This projection sends a vector to its projection on the given line. A vector parallel to the direction
vector of that line will be sent to itself, since its projection on the line is the vector itself. A vector
orthogonal to the direction vector of the line will be sent to the zero vector, since its projection on the
line is the zero vector. Since the direction vector is \begin{bmatrix} \frac45 \\ \frac35 \end{bmatrix}, or \begin{bmatrix} 4 \\ 3 \end{bmatrix}, there are two eigenspaces:
\[
\operatorname{span}\left( \begin{bmatrix} 4 \\ 3 \end{bmatrix} \right),\ \lambda = 1;
\qquad
\operatorname{span}\left( \begin{bmatrix} -3 \\ 4 \end{bmatrix} \right),\ \lambda = 0.
\]
If a vector v is neither parallel to nor orthogonal to the direction vector of the line, then the projection
transforms v to a vector parallel to the direction vector, thus not parallel to v. Thus the image of v
cannot be a multiple of v. Hence the eigenvalues and eigenspaces above are the only ones.
17. Consider the action of A on a vector \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}. If y = 0, then A multiplies x by 2, so this gives an
eigenspace for λ = 2. If x = 0, then A multiplies x by 3, so this gives an eigenspace for λ = 3. If neither x
nor y is zero, then \begin{bmatrix} x \\ y \end{bmatrix} is mapped to \begin{bmatrix} 2x \\ 3y \end{bmatrix}, which is not a multiple of \begin{bmatrix} x \\ y \end{bmatrix}. So the only eigenspaces are
\[
\operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right),\ \lambda = 2;
\qquad
\operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right),\ \lambda = 3.
\]
18. Since A rotates the plane 90◦ counterclockwise around the origin, any nonzero vector is taken to
a vector orthogonal to itself. So there are no nonzero vectors that are taken to vectors parallel to
themselves (that is, to multiples of themselves). The only vector taken to a multiple of itself is the
zero vector; this is not an eigenvector since eigenvectors are by definition nonzero. So this matrix has
no eigenvectors and no eigenvalues. Analytically, the matrix of A is
\[
A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
= \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},
\]
so that
\[
A\mathbf{x} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} -y \\ x \end{bmatrix}.
\]
If this is a multiple a of the original vector, then −y = ax and x = ay. Substituting ay for x in the
first equation gives −y = a^2 y, which is impossible unless y = 0. But then the second equation forces
x = 0. So again we see that there are no eigenvectors.
19. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves,
by A, we look for unit vectors whose image under A is in the same direction. It appears that these vectors
lie along the x- and y-axes, so the eigenspaces are E_1 = \operatorname{span}\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) and E_2 = \operatorname{span}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right).
The resultant vectors in the first eigenspace appear to have the same length as the original vector, so that λ_1 = 1. The resultant vectors in the second eigenspace are twice the original
length, so that λ_2 = 2.
20. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves,
by A, we look for unit vectors whose image under A is in the same direction. The only vectors satisfying
this criterion are those parallel to the line y = x, i.e., \operatorname{span}\left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right), so this is the only eigenspace. The
terminal point of the line in the direction \begin{bmatrix} 1 \\ 1 \end{bmatrix} appears to be the point (2, 5), so that it has a length of
\sqrt{29}. Since the original vector is a unit vector, the eigenvalue is \sqrt{29}.
21. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves,
by A, we look for unit vectors whose image under A is in the same direction. The vectors parallel to
the line y = x, i.e., \operatorname{span}\left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right), satisfy this criterion, so this is an eigenspace E_1. The vectors parallel
to the line y = −x, i.e., \operatorname{span}\left( \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right), also satisfy this criterion, so this is another eigenspace E_2. For
E_1, the terminal point of the image along y = x appears to be (2, 2), so its length is 2\sqrt{2}. Since the
original vector is a unit vector, the corresponding eigenvalue is λ_1 = 2\sqrt{2}. For E_2, the resultant vectors
have zero length, so λ_2 = 0.
22. Since eigenvectors x are those taken to a multiple of themselves, i.e., to a vector parallel to themselves,
by A, we look for unit vectors whose image under A is in the same direction. There are none — every
vector is “bent” to the left by the transformation. So this transformation has no eigenvectors.
23. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det\left( \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \right)
= \det \begin{bmatrix} 4-\lambda & -1 \\ 2 & 1-\lambda \end{bmatrix}
= (4-\lambda)(1-\lambda) - 2\cdot(-1) = \lambda^2 - 5\lambda + 6 = (\lambda-2)(\lambda-3).
\]
Thus the eigenvalues are 2 and 3.
To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I:
\[
\bigl[\,A - 2I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 2 & -1 & 0 \\ 2 & -1 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & -\frac12 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form
\[
\begin{bmatrix} \frac12 t \\ t \end{bmatrix}, \quad\text{or}\quad \begin{bmatrix} t \\ 2t \end{bmatrix}.
\]
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 2 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1 \\ 2 \end{bmatrix}
= \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix}
= \begin{bmatrix} 4\cdot1 - 1\cdot2 \\ 2\cdot1 + 1\cdot2 \end{bmatrix}
= \begin{bmatrix} 2 \\ 4 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
\]
To find the eigenspace corresponding to λ = 3, we must find the null space of A − 3I:
\[
\bigl[\,A - 3I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 2 & -2 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 1 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 4\cdot1 - 1\cdot1 \\ 2\cdot1 + 1\cdot1 \end{bmatrix}
= \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The plot (omitted) shows the effect of A on the eigenvectors v = \begin{bmatrix} 1 \\ 2 \end{bmatrix} and w = \begin{bmatrix} 1 \\ 1 \end{bmatrix}: v is taken to 2v and w to 3w.
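Since A is 2 × 2, its characteristic polynomial is λ² − (tr A)λ + det A; a short check of this and of the two eigenvector equations:

```python
# Check of Exercise 23: coefficients of the characteristic polynomial
# and the eigenvector equations Av = 2v and Aw = 3w.
A = [[4, -1], [2, 1]]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(trace, det)  # 5 6, i.e. lambda^2 - 5 lambda + 6

def apply(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

print(apply(A, [1, 2]))  # [2, 4] = 2 * [1, 2]
print(apply(A, [1, 1]))  # [3, 3] = 3 * [1, 1]
```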
242
CHAPTER 4. EIGENVALUES AND EIGENVECTORS
24. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 2-\lambda & 4 \\ 6 & -\lambda \end{bmatrix}
= (2-\lambda)(-\lambda) - 4 \cdot 6 = \lambda^2 - 2\lambda - 24 = (\lambda - 6)(\lambda + 4).
\]
Thus the eigenvalues are −4 and 6.
To find the eigenspace corresponding to λ = −4, we must find the null space of A + 4I:
\[
\bigl[\,A + 4I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 6 & 4 & 0 \\ 6 & 4 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & \frac23 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form
\[
\begin{bmatrix} -\frac23 t \\ t \end{bmatrix}, \quad\text{or}\quad \begin{bmatrix} -2t \\ 3t \end{bmatrix}.
\]
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} -2 \\ 3 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} -2 \\ 3 \end{bmatrix}
= \begin{bmatrix} 2 & 4 \\ 6 & 0 \end{bmatrix} \begin{bmatrix} -2 \\ 3 \end{bmatrix}
= \begin{bmatrix} 2\cdot(-2) + 4\cdot3 \\ 6\cdot(-2) + 0\cdot3 \end{bmatrix}
= \begin{bmatrix} 8 \\ -12 \end{bmatrix} = -4 \begin{bmatrix} -2 \\ 3 \end{bmatrix}.
\]
To find the eigenspace corresponding to λ = 6, we must find the null space of A − 6I:
\[
\bigl[\,A - 6I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} -4 & 4 & 0 \\ 6 & -6 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 1 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2\cdot1 + 4\cdot1 \\ 6\cdot1 + 0\cdot1 \end{bmatrix}
= \begin{bmatrix} 6 \\ 6 \end{bmatrix} = 6 \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The plot (omitted) shows the effect of A on the eigenvectors v = \begin{bmatrix} -2 \\ 3 \end{bmatrix} and w = \begin{bmatrix} 1 \\ 1 \end{bmatrix}: v is taken to −4v and w to 6w.
25. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 2-\lambda & 5 \\ 0 & 2-\lambda \end{bmatrix}
= (2-\lambda)(2-\lambda) - 5 \cdot 0 = (\lambda-2)^2.
\]
Thus the only eigenvalue is λ = 2.
To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I:
\[
\bigl[\,A - 2I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 0 & 5 & 0 \\ 0 & 0 & 0 \end{array}\right]
\to \left[\begin{array}{rr|r} 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} t \\ 0 \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 0 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 2 & 5 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 2\cdot1 + 5\cdot0 \\ 0\cdot1 + 2\cdot0 \end{bmatrix}
= \begin{bmatrix} 2 \\ 0 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
\]
The plot (omitted) shows the effect of A on the eigenvector v = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, which is taken to 2v.
26. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 1-\lambda & 2 \\ -2 & 3-\lambda \end{bmatrix}
= (1-\lambda)(3-\lambda) + 4 = \lambda^2 - 4\lambda + 7.
\]
Since this quadratic has no (real) roots, A has no (real) eigenvalues.
27. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 1-\lambda & 1 \\ -1 & 1-\lambda \end{bmatrix}
= (1-\lambda)(1-\lambda) - 1\cdot(-1) = \lambda^2 - 2\lambda + 2.
\]
Using the quadratic formula, this quadratic has roots
\[
\lambda = \frac{2 \pm \sqrt{2^2 - 4 \cdot 2}}{2} = \frac{2 \pm 2i}{2} = 1 \pm i.
\]
Thus the eigenvalues are 1 + i and 1 − i.
To find the eigenspace corresponding to λ = 1 + i, we must find the null space of A − (1 + i)I:
\[
\bigl[\,A - (1+i)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} -i & 1 & 0 \\ -1 & -i & 0 \end{array}\right]
\xrightarrow{iR_1} \left[\begin{array}{rr|r} 1 & i & 0 \\ -1 & -i & 0 \end{array}\right]
\xrightarrow{R_2 + R_1} \left[\begin{array}{rr|r} 1 & i & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} -it \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} -i \\ 1 \end{bmatrix}.
To find the eigenspace corresponding to λ = 1 − i, we must find the null space of A − (1 − i)I:
\[
\bigl[\,A - (1-i)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} i & 1 & 0 \\ -1 & i & 0 \end{array}\right]
\xrightarrow{-iR_1} \left[\begin{array}{rr|r} 1 & -i & 0 \\ -1 & i & 0 \end{array}\right]
\xrightarrow{R_2 + R_1} \left[\begin{array}{rr|r} 1 & -i & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} it \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} i \\ 1 \end{bmatrix}.
28. We wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 2-\lambda & -3 \\ 1 & -\lambda \end{bmatrix}
= (2-\lambda)(-\lambda) - (-3) \cdot 1 = \lambda^2 - 2\lambda + 3.
\]
Using the quadratic formula, this has roots \lambda = \frac{2 \pm \sqrt{4 - 4 \cdot 3}}{2} = 1 \pm i\sqrt{2}, so these are the eigenvalues.
To find the eigenspace corresponding to λ = 1 + i\sqrt{2}, we must find the null space of A − (1 + i\sqrt{2})I:
\[
\bigl[\,A - (1+i\sqrt2)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 1-i\sqrt2 & -3 & 0 \\ 1 & -1-i\sqrt2 & 0 \end{array}\right]
\xrightarrow{R_1 \leftrightarrow R_2}
\left[\begin{array}{rr|r} 1 & -1-i\sqrt2 & 0 \\ 1-i\sqrt2 & -3 & 0 \end{array}\right]
\xrightarrow{R_2 - (1-i\sqrt2)R_1}
\left[\begin{array}{rr|r} 1 & -1-i\sqrt2 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} (1+i\sqrt2)t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1+i\sqrt2 \\ 1 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1+i\sqrt2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2(1+i\sqrt2) - 3 \cdot 1 \\ (1+i\sqrt2) + 0 \cdot 1 \end{bmatrix}
= \begin{bmatrix} -1+2i\sqrt2 \\ 1+i\sqrt2 \end{bmatrix}
= (1+i\sqrt2) \begin{bmatrix} 1+i\sqrt2 \\ 1 \end{bmatrix}.
\]
To find the eigenspace corresponding to λ = 1 − i\sqrt{2}, we must find the null space of A − (1 − i\sqrt{2})I:
\[
\bigl[\,A - (1-i\sqrt2)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} 1+i\sqrt2 & -3 & 0 \\ 1 & -1+i\sqrt2 & 0 \end{array}\right]
\xrightarrow{R_1 \leftrightarrow R_2}
\left[\begin{array}{rr|r} 1 & -1+i\sqrt2 & 0 \\ 1+i\sqrt2 & -3 & 0 \end{array}\right]
\xrightarrow{R_2 - (1+i\sqrt2)R_1}
\left[\begin{array}{rr|r} 1 & -1+i\sqrt2 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} (1-i\sqrt2)t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1-i\sqrt2 \\ 1 \end{bmatrix}. As a check,
\[
A \begin{bmatrix} 1-i\sqrt2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2(1-i\sqrt2) - 3 \cdot 1 \\ (1-i\sqrt2) + 0 \cdot 1 \end{bmatrix}
= \begin{bmatrix} -1-2i\sqrt2 \\ 1-i\sqrt2 \end{bmatrix}
= (1-i\sqrt2) \begin{bmatrix} 1-i\sqrt2 \\ 1 \end{bmatrix}.
\]
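The complex eigenpair can be checked directly with Python's built-in complex arithmetic:

```python
# Check of Exercise 28: for lambda = 1 + i*sqrt(2) and
# v = (1 + i*sqrt(2), 1), we verify Av = lambda * v numerically.
import cmath

A = [[2, -3], [1, 0]]
lam = 1 + cmath.sqrt(2) * 1j
v = (1 + cmath.sqrt(2) * 1j, 1)
Av = (A[0][0] * v[0] + A[0][1] * v[1],
      A[1][0] * v[0] + A[1][1] * v[1])
print(all(abs(Av[i] - lam * v[i]) < 1e-12 for i in range(2)))  # True
```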
29. Using the method of Example 4.5, we wish to find solutions to the equation
\[
0 = \det(A - \lambda I)
= \det \begin{bmatrix} 1-\lambda & i \\ i & 1-\lambda \end{bmatrix}
= (1-\lambda)(1-\lambda) - i \cdot i = \lambda^2 - 2\lambda + 2.
\]
Using the quadratic formula, this quadratic has roots
\[
\lambda = \frac{2 \pm \sqrt{2^2 - 4 \cdot 2}}{2} = \frac{2 \pm 2i}{2} = 1 \pm i.
\]
Thus the eigenvalues are 1 + i and 1 − i.
To find the eigenspace corresponding to λ = 1 + i, we must find the null space of A − (1 + i)I:
\[
\bigl[\,A - (1+i)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} -i & i & 0 \\ i & -i & 0 \end{array}\right]
\xrightarrow{iR_1} \left[\begin{array}{rr|r} 1 & -1 & 0 \\ i & -i & 0 \end{array}\right]
\xrightarrow{R_2 - iR_1} \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
To find the eigenspace corresponding to λ = 1 − i, we must find the null space of A − (1 − i)I:
\[
\bigl[\,A - (1-i)I \mid \mathbf{0}\,\bigr]
= \left[\begin{array}{rr|r} i & i & 0 \\ i & i & 0 \end{array}\right]
\xrightarrow{-iR_1} \left[\begin{array}{rr|r} 1 & 1 & 0 \\ i & i & 0 \end{array}\right]
\xrightarrow{R_2 - iR_1} \left[\begin{array}{rr|r} 1 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right],
\]
so that eigenvectors corresponding to this eigenvalue have the form \begin{bmatrix} -t \\ t \end{bmatrix}.
A basis for this eigenspace is given by choosing for example t = 1 to get \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
246
CHAPTER 4. EIGENVALUES AND EIGENVECTORS
30. Using the method of Example 4.5, we wish to find solutions to the equation

0 = det(A − λI) = det([0 1+i; 1−i 1] − λ[1 0; 0 1]) = det [−λ 1+i; 1−i 1−λ] = (−λ)(1 − λ) − (1 + i)(1 − i) = λ² − λ − 2 = (λ − 2)(λ + 1).

Thus the eigenvalues are 2 and −1.

To find the eigenspace corresponding to λ = 2, we must find the null space of A − 2I:

[A − 2I | 0] = [−2 1+i | 0; 1−i −1 | 0] →(−(1/2)R1) [1 −(1+i)/2 | 0; 1−i −1 | 0] →(R2 − (1 − i)R1) [1 −(1+i)/2 | 0; 0 0 | 0],

so that eigenvectors corresponding to this eigenvalue have the form

[(1/2)(1 + i)t; t], or, clearing fractions, [(1 + i)t; 2t].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [1 + i; 2].

To find the eigenspace corresponding to λ = −1, we must find the null space of A − (−I) = A + I:

[A + I | 0] = [1 1+i | 0; 1−i 2 | 0] →(R2 − (1 − i)R1) [1 1+i | 0; 0 0 | 0],

so that eigenvectors corresponding to this eigenvalue have the form

[−(1 + i)t; t].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [−1 − i; 1].
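As a quick numerical check (not part of the printed solution; the entries of A are those inferred from the exercise's arithmetic), the trace, determinant, and the eigenpair for λ = 2 can be verified directly:

```python
# A from Exercise 30.
A = [[0, 1 + 1j], [1 - 1j, 1]]
tr = A[0][0] + A[1][1]                        # should equal lam1 + lam2 = 2 + (-1) = 1
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # should equal lam1 * lam2 = -2
v = (1 + 1j, 2)                               # claimed eigenvector for lam = 2
Av = (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])
err = max(abs(Av[0] - 2 * v[0]), abs(Av[1] - 2 * v[1]))
```

Both invariants match the factorization (λ − 2)(λ + 1), and `err` is zero.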
31. Using the method of Example 4.5, we wish to find solutions in Z3 to the equation

0 = det(A − λI) = det([1 0; 1 2] − λ[1 0; 0 1]) = det [1−λ 0; 1 2−λ] = (1 − λ)(2 − λ) − 0 · 1 = (λ − 1)(λ − 2).

Thus the eigenvalues are 1 and 2.
32. Using the method of Example 4.5, we wish to find solutions in Z3 to the equation

0 = det(A − λI) = det([2 1; 1 2] − λ[1 0; 0 1]) = det [2−λ 1; 1 2−λ] = (2 − λ)(2 − λ) − 1 · 1 = λ² − 4λ + 4 − 1 = λ² + 2λ = λ(λ + 2) = λ(λ − 1).

Thus the eigenvalues are 0 and 1.
33. Using the method of Example 4.5, we wish to find solutions in Z5 to the equation

0 = det(A − λI) = det([3 1; 4 0] − λ[1 0; 0 1]) = det [3−λ 1; 4 −λ] = (3 − λ)(−λ) − 1 · 4 = λ² − 3λ − 4 = λ² + 2λ + 1 = (λ + 1)² = (λ − 4)².

Thus the only eigenvalue is 4.
34. Using the method of Example 4.5, we wish to find solutions in Z5 to the equation

0 = det(A − λI) = det([1 4; 4 0] − λ[1 0; 0 1]) = det [1−λ 4; 4 −λ] = (1 − λ)(−λ) − 4 · 4 = λ² − λ − 16 = λ² + 4λ + 4 = (λ + 2)² = (λ − 3)².
Thus the only eigenvalue is 3.
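Since the fields in Exercises 31–34 are finite, the eigenvalue claims can be spot-checked by brute force. The sketch below (an optional check, not part of the printed solution) tests every λ in Z_p against the characteristic polynomial:

```python
def eigenvalues_mod_p(A, p):
    """Eigenvalues of a 2x2 matrix over Z_p, found by testing every lambda in Z_p."""
    (a, b), (c, d) = A
    return [lam for lam in range(p)
            if ((a - lam) * (d - lam) - b * c) % p == 0]

evs33 = eigenvalues_mod_p([[3, 1], [4, 0]], 5)   # Exercise 33: expect only 4
evs34 = eigenvalues_mod_p([[1, 4], [4, 0]], 5)   # Exercise 34: expect only 3
```

Exhaustive search is a reasonable design here because Z_p has only p candidate eigenvalues.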
35. (a) To find the eigenvalues of A, find solutions to the equation

0 = det(A − λI) = det([a b; c d] − λ[1 0; 0 1]) = det [a−λ b; c d−λ] = (a − λ)(d − λ) − bc = λ² − (a + d)λ + (ad − bc) = λ² − tr(A)λ + det A.

(b) Using the quadratic formula to find the roots of this quadratic gives

λ = (tr(A) ± √((tr(A))² − 4 det A))/2
  = (a + d ± √(a² + 2ad + d² − 4ad + 4bc))/2
  = (1/2)(a + d ± √(a² − 2ad + d² + 4bc))
  = (1/2)(a + d ± √((a − d)² + 4bc)).

(c) Let

λ1 = (1/2)(a + d + √((a − d)² + 4bc)),   λ2 = (1/2)(a + d − √((a − d)² + 4bc))

be the two eigenvalues. Then

λ1 + λ2 = (1/2)(a + d + √((a − d)² + 4bc)) + (1/2)(a + d − √((a − d)² + 4bc)) = (1/2)(a + d + a + d) = a + d = tr(A),

λ1 λ2 = (1/2)(a + d + √((a − d)² + 4bc)) · (1/2)(a + d − √((a − d)² + 4bc))
      = (1/4)((a + d)² − ((a − d)² + 4bc))
      = (1/4)(a² + 2ad + d² − a² + 2ad − d² − 4bc)
      = (1/4)(4ad − 4bc) = det A.
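The formula in part (b) and the identities in part (c) can be sanity-checked numerically. The sketch below (an optional check with arbitrary sample entries, not part of the printed solution) computes both eigenvalues and confirms that their sum is the trace and their product the determinant:

```python
import cmath

def eigs_2x2(a, b, c, d):
    """Roots of lambda^2 - (a+d)*lambda + (ad - bc), per part (b) of Exercise 35."""
    s = cmath.sqrt((a - d) ** 2 + 4 * b * c)
    return (a + d + s) / 2, (a + d - s) / 2

l1, l2 = eigs_2x2(3.0, 2.0, -1.0, 5.0)   # sample matrix [[3, 2], [-1, 5]]
trace_err = abs((l1 + l2) - (3.0 + 5.0))
det_err = abs(l1 * l2 - (3.0 * 5.0 - 2.0 * (-1.0)))
```

Using `cmath.sqrt` keeps the formula valid even when the discriminant (a − d)² + 4bc is negative.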
λ1 + λ2 =
36. A quadratic equation will have two distinct real roots when its discriminant (the quantity under the square root sign in the quadratic formula) is positive, one real root when it is zero, and no real roots when it is negative. In Exercise 35, the discriminant is (a − d)² + 4bc.

(a) For A to have two distinct real eigenvalues, we must have (a − d)² + 4bc > 0.

(b) For A to have a single real eigenvalue, it must be the case that (a − d)² + 4bc = 0, so that (a − d)² = −4bc. Note that this implies that either a = d and at least one of b or c is zero, or that a ≠ d and either b or c, but not both, is negative.

(c) For A to have no real eigenvalues, we must have (a − d)² + 4bc < 0.
37. From Exercise 35, since c = 0, the eigenvalues are

(1/2)(a + d ± √((a − d)² + 4b · 0)) = (1/2)(a + d ± |a − d|),

and (1/2)(a + d + (a − d)) = a while (1/2)(a + d − (a − d)) = d, so the eigenvalues are a and d.

To find the eigenspace corresponding to λ = a, we must find the null space of A − aI:

[A − aI | 0] = [0 b | 0; 0 d−a | 0].

If b = 0 and a = d, then this is the zero matrix, so that any vector is an eigenvector; a basis for the eigenspace is

{[1; 0], [0; 1]}.

The reason for this is that under these conditions, the matrix A is a multiple of the identity matrix, so it simply stretches any vector by a factor of a = d. Otherwise, either b ≠ 0 or d − a ≠ 0, so the above matrix row-reduces to

[0 1 | 0; 0 0 | 0],

so that eigenvectors corresponding to this eigenvalue have the form

[t; 0].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [1; 0]. As a check,

A[1; 0] = [a · 1 + b · 0; 0 · 1 + d · 0] = [a; 0] = a[1; 0].

To find the eigenspace corresponding to λ = d, we must find the null space of A − dI:

[A − dI | 0] = [a−d b | 0; 0 0 | 0].

If b = 0 and a = d, then this is again the zero matrix, which we dealt with above. Otherwise, either b ≠ 0 or d − a ≠ 0, so that eigenvectors corresponding to this eigenvalue have the form

[−bt; (a − d)t], or [bt; (d − a)t].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [b; d−a]. As a check,

A[b; d−a] = [a · b + b · (d − a); 0 · b + d · (d − a)] = [bd; d(d − a)] = d[b; d−a].
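The two eigenpairs of the triangular matrix [a b; 0 d] found above can be checked mechanically. The sketch below (an optional check with sample values b ≠ 0 and a ≠ d, not part of the printed solution) verifies both claimed eigenvectors:

```python
# Sample upper-triangular matrix [[a, b], [0, d]] with b != 0 and a != d.
a, b, d = 2.0, 3.0, 7.0
# lambda = a with eigenvector (1, 0):
e1 = (a * 1 + b * 0, 0 * 1 + d * 0)   # A(1, 0) = (a, 0) = a*(1, 0)
# lambda = d with eigenvector (b, d - a):
v = (b, d - a)
e2 = (a * v[0] + b * v[1], 0 * v[0] + d * v[1])   # should equal d*v
```

Both products come out as exact scalar multiples of the claimed eigenvectors.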
38. Using the method of Example 4.5, we wish to find solutions to the equation

0 = det(A − λI) = det([a b; −b a] − λ[1 0; 0 1]) = det [a−λ b; −b a−λ] = (a − λ)(a − λ) − b · (−b) = λ² − 2aλ + (a² + b²).

Using Exercise 35, the eigenvalues are

(1/2)(a + a ± √((a − a)² + 4b · (−b))) = a ± (1/2)√(−4b²) = a ± bi.

To find the eigenspace corresponding to λ = a + bi, we must find the null space of A − (a + bi)I:

[A − (a + bi)I | 0] = [−bi b | 0; −b −bi | 0].

If b = 0 then this is the zero matrix, so that any vector is an eigenvector; a basis for the eigenspace is

{[1; 0], [0; 1]}.

The reason for this is that under these conditions, the matrix A is a times the identity matrix, so it simply stretches any vector by a factor of a. Otherwise, b ≠ 0, so we can row-reduce the matrix:

[−bi b | 0; −b −bi | 0] →((i/b)R1) [1 i | 0; −b −bi | 0] →(R2 + bR1) [1 i | 0; 0 0 | 0].

Thus eigenvectors corresponding to this eigenvalue have the form

[−it; t], or, multiplying by i, [t; it].

A basis for this eigenspace is given by choosing t = 1 in the second form, to get [1; i]. As a check,

A[1; i] = [a + bi; −b + ai] = (a + bi)[1; i].

To find the eigenspace corresponding to λ = a − bi, we must find the null space of A − (a − bi)I:

[A − (a − bi)I | 0] = [bi b | 0; −b bi | 0].

If b = 0 then this is the zero matrix, which we dealt with above. Otherwise, b ≠ 0, so we can row-reduce the matrix:

[bi b | 0; −b bi | 0] →(−(i/b)R1) [1 −i | 0; −b bi | 0] →(R2 + bR1) [1 −i | 0; 0 0 | 0].

Thus eigenvectors corresponding to this eigenvalue have the form

[it; t].

A basis for this eigenspace is given by choosing, for example, t = 1, to get [i; 1]. As a check,

A[i; 1] = [ai + b; −bi + a] = [b + ai; a − bi] = (a − bi)[i; 1].
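The eigenpair (a + bi, [1; i]) of the rotation-scaling matrix [a b; −b a] can be checked numerically. The sketch below (an optional check with sample values of a and b, not part of the printed solution) verifies the eigenvector equation over the complex numbers:

```python
# A = [[a, b], [-b, a]] with sample values; claimed eigenpair (a + bi, (1, i)).
a, b = 2.0, 3.0
lam = complex(a, b)
v = (1, 1j)
Av = (a * v[0] + b * v[1], -b * v[0] + a * v[1])
err = max(abs(Av[0] - lam * v[0]), abs(Av[1] - lam * v[1]))
```

`err` is exactly zero here, since every product involved is exact in floating point for these sample values.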
4.2 Determinants
1. Expanding along the first row gives

det [1 0 3; 5 1 1; 0 1 2] = 1 · det [1 1; 1 2] − 0 · det [5 1; 0 2] + 3 · det [5 1; 0 1] = 1(1 · 2 − 1 · 1) − 0 + 3(5 · 1 − 1 · 0) = 16.

Expanding along the first column gives

det [1 0 3; 5 1 1; 0 1 2] = 1 · det [1 1; 1 2] − 5 · det [0 3; 1 2] + 0 · det [0 3; 1 1] = 1(1 · 2 − 1 · 1) − 5(0 · 2 − 3 · 1) + 0 = 16.
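Cofactor expansion along the first row, as used throughout Exercises 1–6, is easy to mechanize. The short recursive sketch below (an optional check, not part of the printed solution) implements it and recomputes the determinant of the matrix in Exercise 1:

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 1 and column j + 1.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

d1 = det([[1, 0, 3], [5, 1, 1], [0, 1, 2]])   # the matrix of Exercise 1
```

Recursion mirrors the inductive definition of the determinant, though it is exponential in n and only practical for small matrices.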
2. Expanding along the first row gives

det [0 1 −1; 2 3 −2; −1 3 0] = 0 · det [3 −2; 3 0] − 1 · det [2 −2; −1 0] + (−1) · det [2 3; −1 3] = 0 − 1(2 · 0 − (−2) · (−1)) − 1(2 · 3 − 3 · (−1)) = −7.

Expanding along the first column gives

det [0 1 −1; 2 3 −2; −1 3 0] = 0 · det [3 −2; 3 0] − 2 · det [1 −1; 3 0] + (−1) · det [1 −1; 3 −2] = 0 − 2(1 · 0 − (−1) · 3) − 1(1 · (−2) − (−1) · 3) = −7.
3. Expanding along the first row gives

det [1 −1 0; −1 0 1; 0 1 −1] = 1 · det [0 1; 1 −1] − (−1) · det [−1 1; 0 −1] + 0 · det [−1 0; 0 1] = 1(0 · (−1) − 1 · 1) + 1((−1) · (−1) − 1 · 0) + 0 = 0.

Expanding along the first column gives

det [1 −1 0; −1 0 1; 0 1 −1] = 1 · det [0 1; 1 −1] − (−1) · det [−1 0; 1 −1] + 0 · det [−1 0; 0 1] = 1(0 · (−1) − 1 · 1) + 1((−1) · (−1) − 0 · 1) + 0 = 0.
4. Expanding along the first row gives

det [1 1 0; 1 0 1; 0 1 1] = 1 · det [0 1; 1 1] − 1 · det [1 1; 0 1] + 0 · det [1 0; 0 1] = 1(0 · 1 − 1 · 1) − 1(1 · 1 − 1 · 0) + 0 = −2.

Expanding along the first column gives

det [1 1 0; 1 0 1; 0 1 1] = 1 · det [0 1; 1 1] − 1 · det [1 0; 1 1] + 0 · det [1 0; 0 1] = 1(0 · 1 − 1 · 1) − 1(1 · 1 − 0 · 1) + 0 = −2.
5. Expanding along the first row gives

det [1 2 3; 2 3 1; 3 1 2] = 1 · det [3 1; 1 2] − 2 · det [2 1; 3 2] + 3 · det [2 3; 3 1] = 1(3 · 2 − 1 · 1) − 2(2 · 2 − 1 · 3) + 3(2 · 1 − 3 · 3) = −18.

Expanding along the first column gives

det [1 2 3; 2 3 1; 3 1 2] = 1 · det [3 1; 1 2] − 2 · det [2 3; 1 2] + 3 · det [2 3; 3 1] = 1(3 · 2 − 1 · 1) − 2(2 · 2 − 3 · 1) + 3(2 · 1 − 3 · 3) = −18.
6. Expanding along the first row gives

det [1 2 3; 4 5 6; 7 8 9] = 1 · det [5 6; 8 9] − 2 · det [4 6; 7 9] + 3 · det [4 5; 7 8] = 1(5 · 9 − 6 · 8) − 2(4 · 9 − 6 · 7) + 3(4 · 8 − 5 · 7) = 0.

Expanding along the first column gives

det [1 2 3; 4 5 6; 7 8 9] = 1 · det [5 6; 8 9] − 4 · det [2 3; 8 9] + 7 · det [2 3; 5 6] = 1(5 · 9 − 6 · 8) − 4(2 · 9 − 3 · 8) + 7(2 · 6 − 3 · 5) = 0.
7. Noting the pair of zeros in the third row, use cofactor expansion along the third row to get

det [5 2 2; −1 1 2; 3 0 0] = 3 · det [2 2; 1 2] − 0 · det [5 2; −1 2] + 0 · det [5 2; −1 1] = 3(2 · 2 − 2 · 1) − 0 + 0 = 6.
8. Noting the zero in the second row, use cofactor expansion along the second row to get

det [1 1 −1; 2 0 1; 3 −2 1] = −2 · det [1 −1; −2 1] + 0 · det [1 −1; 3 1] − 1 · det [1 1; 3 −2] = −2(1 · 1 − (−1) · (−2)) − 1(1 · (−2) − 1 · 3) = 7.
9. Noting the zero in the bottom right, use the third row, since the multipliers there are smaller:

det [−4 1 3; 2 −2 4; 1 −1 0] = 1 · det [1 3; −2 4] − (−1) · det [−4 3; 2 4] + 0 = 1(1 · 4 − 3 · (−2)) + 1((−4) · 4 − 3 · 2) = −12.
10. Noting that the first column has only one nonzero entry, we expand along the first column:

det [cos θ sin θ tan θ; 0 cos θ −sin θ; 0 sin θ cos θ] = cos θ(cos θ · cos θ − (−sin θ) · sin θ) = cos θ(cos² θ + sin² θ) = cos θ.
11. Use the first column:

det [a b 0; 0 a b; a 0 b] = a · det [a b; 0 b] − 0 · det [b 0; 0 b] + a · det [b 0; a b] = a(a · b − b · 0) + a(b · b − 0 · a) = a²b + ab².
12. Use the first row (other choices are equally easy):

det [0 a 0; b c d; 0 e 0] = −a · det [b d; 0 0] = −a(b · 0 − d · 0) = 0.
13. Use the third row, since it contains only one nonzero entry:

det [1 −1 0 3; 2 5 2 6; 0 1 0 0; 1 4 2 1] = (−1)^(3+2) · 1 · det [1 0 3; 2 2 6; 1 2 1] = −det [1 0 3; 2 2 6; 1 2 1].

To compute this determinant, use cofactor expansion along the first row:

−det [1 0 3; 2 2 6; 1 2 1] = −(1 · det [2 6; 2 1] − 0 + 3 · det [2 2; 1 2]) = −1(2 · 1 − 6 · 2) − 3(2 · 2 − 2 · 1) = 4.
14. Noting that the second column has only one nonzero entry, we expand along that column. The nonzero entry is in row 3, column 2, so its cofactor carries a minus sign, since (−1)^(3+2) = −1:

det [2 0 3 −1; 1 0 2 2; 0 −1 1 4; 2 0 1 −3] = −(−1) · det [2 3 −1; 1 2 2; 2 1 −3] = det [2 3 −1; 1 2 2; 2 1 −3].

To evaluate the remaining determinant, we use expansion along the first row:

det [2 3 −1; 1 2 2; 2 1 −3] = 2 · det [2 2; 1 −3] − 3 · det [1 2; 2 −3] + (−1) · det [1 2; 2 1] = 2(2 · (−3) − 2 · 1) − 3(1 · (−3) − 2 · 2) − 1(1 · 1 − 2 · 2) = 8.
15. Expand along the first row:

det [0 0 0 a; 0 0 b c; 0 d e f; g h i j] = −a · det [0 0 b; 0 d e; g h i].

Now expand along the first row again:

−a · det [0 0 b; 0 d e; g h i] = −a(b · det [0 d; g h]) = −ab(0 · h − dg) = abdg.
16. Using the method of Example 4.9, we copy the first two columns to the right of the matrix:

1 2 3 | 1 2
4 5 6 | 4 5
7 8 9 | 7 8

The upward diagonals have products 7 · 5 · 3 = 105, 8 · 6 · 1 = 48, and 9 · 4 · 2 = 72, while the downward diagonals have products 1 · 5 · 9 = 45, 2 · 6 · 7 = 84, and 3 · 4 · 8 = 96, so that the determinant is −(105 + 48 + 72) + (45 + 84 + 96) = 0.
17. Using the method of Example 4.9,

1 1 −1 | 1 1
2 0 1 | 2 0
3 −2 1 | 3 −2

The upward diagonals have products 3 · 0 · (−1) = 0, (−2) · 1 · 1 = −2, and 1 · 2 · 1 = 2, while the downward diagonals have products 1 · 0 · 1 = 0, 1 · 1 · 3 = 3, and (−1) · 2 · (−2) = 4, so that the determinant is −(0 + (−2) + 2) + (0 + 3 + 4) = 7.
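The diagonal-products rule used in Exercises 16–18 (the rule of Sarrus) is also easy to automate. The sketch below (an optional check, not part of the printed solution) recomputes both determinants:

```python
def sarrus(M):
    """3x3 determinant by the rule of Sarrus (the method of Example 4.9)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    down = a * e * i + b * f * g + c * d * h   # downward-pointing diagonals
    up = g * e * c + h * f * a + i * d * b     # upward-pointing diagonals
    return down - up

d16 = sarrus([[1, 2, 3], [4, 5, 6], [7, 8, 9]])     # Exercise 16
d17 = sarrus([[1, 1, -1], [2, 0, 1], [3, -2, 1]])   # Exercise 17
```

Note that this shortcut is specific to 3 × 3 matrices; it does not generalize to larger sizes.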
18. Using the method of Example 4.9, we get

a b 0 | a b
0 a b | 0 a
a 0 b | a 0

The upward diagonals have products a · a · 0 = 0, 0 · b · a = 0, and b · 0 · b = 0, while the downward diagonals have products a · a · b = a²b, b · b · a = ab², and 0 · 0 · 0 = 0, so that the determinant is −(0 + 0 + 0) + (a²b + ab² + 0) = a²b + ab².
19. Consider (2) in the text. The three upward-pointing diagonals have product a31 a22 a13 , a32 a23 a11 , and
a33 a21 a12 , while the downward-pointing diagonals have product a11 a22 a33 , a12 a23 a31 , and a13 a21 a32 ,
so the determinant by that method is
det A = −(a31 a22 a13 + a32 a23 a11 + a33 a21 a12 ) + (a11 a22 a33 + a12 a23 a31 + a13 a21 a32 ).
Expanding along the first row gives

det A = a11 · det [a22 a23; a32 a33] − a12 · det [a21 a23; a31 a33] + a13 · det [a21 a22; a31 a32]
      = a11 a22 a33 − a11 a32 a23 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 ,
which, after rearranging, is the same as the first answer.
20. Consider the 2 × 2 matrix A = [a11 a12; a21 a22]. Then

C11 = (−1)^(1+1) det[a22] = a22 ,   C12 = (−1)^(1+2) det[a21] = −a21 ,

so that from definition (4)

det A = Σ_{j=1}^{2} a1j C1j = a11 C11 + a12 C12 = a11 a22 + a12 · (−a21) = a11 a22 − a12 a21 ,

which agrees with the definition of a 2 × 2 determinant.
21. If A is lower triangular, then AT is upper triangular. Since by Theorem 4.10 det A = det AT , if we
prove the result for upper triangular matrices, we will have proven it for lower triangular matrices as
well. So assume A is upper triangular. The basis for the induction is the case n = 1, where A = [a11 ]
and det A = a11 , which is clearly the “product” of the entries on the diagonal. Now assume that the
statement holds for n × n upper triangular matrices, and let A be an (n + 1) × (n + 1) upper triangular
matrix. Using cofactor expansion along the last row, we get
det A = (−1)^((n+1)+(n+1)) an+1,n+1 det An+1,n+1 = an+1,n+1 det An+1,n+1 ,

where An+1,n+1 is the submatrix obtained by deleting the last row and column. But An+1,n+1 is an n × n upper triangular matrix with diagonal entries a11 , a22 , . . . , ann , so applying the inductive hypothesis we get

det A = an+1,n+1 det An+1,n+1 = an+1,n+1 (a11 a22 · · · ann ) = a11 a22 · · · an+1,n+1 ,
proving the result for (n + 1) × (n + 1) matrices. So by induction, the statement is true for all n ≥ 1.
22. Row-reducing the matrix in Exercise 1, and modifying the determinant as required by Theorem 4.3, gives

det [1 0 3; 5 1 1; 0 1 2] = det [1 0 3; 0 1 −14; 0 1 2]   (R2 − 5R1)
                          = det [1 0 3; 0 1 −14; 0 0 16]   (R3 − R2).

Note that we did not introduce any factors in front of the determinant, since all of the row operations were of the form Ri + kRj, which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · 1 · 16 = 16.
23. Row-reducing the matrix in Exercise 9, and modifying the determinant as required by Theorem 4.3, gives

det [−4 1 3; 2 −2 4; 1 −1 0] = −det [1 −1 0; 2 −2 4; −4 1 3]   (R1 ↔ R3)
                             = −det [1 −1 0; 0 0 4; 0 −3 3]    (R2 − 2R1, R3 + 4R1)
                             = det [1 −1 0; 0 −3 3; 0 0 4]     (R2 ↔ R3).

There were two row interchanges, each of which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj, which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · (−3) · 4 = −12.
24. Row-reducing the matrix in Exercise 13, and modifying the determinant as required by Theorem 4.3, gives

det [1 −1 0 3; 2 5 2 6; 0 1 0 0; 1 4 2 1] = det [1 −1 0 3; 0 7 2 0; 0 1 0 0; 0 5 2 −2]    (R2 − 2R1, R4 − R1)
                                          = −det [1 −1 0 3; 0 1 0 0; 0 7 2 0; 0 5 2 −2]   (R2 ↔ R3)
                                          = −det [1 −1 0 3; 0 1 0 0; 0 0 2 0; 0 0 2 −2]   (R3 − 7R2, R4 − 5R2)
                                          = −det [1 −1 0 3; 0 1 0 0; 0 0 2 0; 0 0 0 −2]   (R4 − R3).

There was one row interchange, which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj, which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is −(1 · 1 · 2 · (−2)) = 4.
25. Row-reducing the matrix in Exercise 14, and modifying the determinant as required by Theorem 4.3, gives

det [2 0 3 −1; 1 0 2 2; 0 −1 1 4; 2 0 1 −3] = −det [1 0 2 2; 2 0 3 −1; 0 −1 1 4; 2 0 1 −3]    (R1 ↔ R2)
                                            = −det [1 0 2 2; 0 0 −1 −5; 0 −1 1 4; 0 0 −3 −7]  (R2 − 2R1, R4 − 2R1)
                                            = det [1 0 2 2; 0 −1 1 4; 0 0 −1 −5; 0 0 −3 −7]   (R2 ↔ R3)
                                            = det [1 0 2 2; 0 −1 1 4; 0 0 −1 −5; 0 0 0 8]     (R4 − 3R3).

There were two row interchanges, each of which introduced a factor of −1 in the determinant. The other row operations were of the form Ri + kRj, which does not change the determinant. The final determinant is easy to evaluate since the matrix is upper triangular; the determinant of the original matrix is 1 · (−1) · (−1) · 8 = 8.
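The row-reduction method of Exercises 22–25 translates directly into code. The sketch below (an optional check, not part of the printed solution; it uses exact rational arithmetic so no rounding occurs) reduces to upper triangular form, flips the sign on each swap, and multiplies the pivots:

```python
from fractions import Fraction

def det_by_elimination(M):
    """Determinant via row reduction, tracking sign changes from row swaps."""
    A = [[Fraction(x) for x in row] for row in M]
    n, sign, d = len(A), 1, Fraction(1)
    for k in range(n):
        piv = next((i for i in range(k, n) if A[i][k] != 0), None)
        if piv is None:
            return Fraction(0)            # zero column below: determinant is 0
        if piv != k:
            A[k], A[piv] = A[piv], A[k]   # each swap multiplies det by -1
            sign = -sign
        d *= A[k][k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            A[i] = [A[i][j] - m * A[k][j] for j in range(n)]   # Ri - m*Rk
    return sign * d

d24 = det_by_elimination([[1, -1, 0, 3], [2, 5, 2, 6], [0, 1, 0, 0], [1, 4, 2, 1]])     # Ex. 24
d25 = det_by_elimination([[2, 0, 3, -1], [1, 0, 2, 2], [0, -1, 1, 4], [2, 0, 1, -3]])   # Ex. 25
```

Both values agree with the hand computations above.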
26. Since the third row is a multiple of the first row, the determinant is zero since we can use the row
operation R3 − 2R1 to zero the third row, and doing so does not change the determinant.
27. This matrix is upper triangular, so its determinant is the product of the diagonal entries, which is
3 · (−2) · 4 = −24.
28. If we interchange the first and third rows, the result is an upper triangular matrix with diagonal
entries 3, 5, and 1. Since interchanging the rows negates the determinant, the determinant of the
original matrix is −(3 · 5 · 1) = −15.
29. The third column is −2 times the first column, so the columns are not linearly independent and thus
the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its determinant is zero.
30. Since the third row is the sum of the first and second rows, the rows are not linearly independent and
thus the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its determinant is zero.
31. Since the first column is the sum of the second and third columns, the columns are not linearly
independent and thus the matrix does not have rank 3, so is not invertible. By Theorem 4.6, its
determinant is zero.
32. Interchanging the second and third rows gives the identity matrix, which has determinant 1. Since
interchanging the rows introduces a minus sign, the determinant of the original matrix is −1.
33. Interchange the first and second rows, and also the third and fourth rows, to give a diagonal matrix with
diagonal entries −3, 2, 1, and 4. Performing two row interchanges does not change the determinant,
so the determinant of the original matrix is −3 · 2 · 1 · 4 = −24.
34. The sum of the first and second rows is equal to the sum of the third and fourth rows, so the rows are
not linearly independent and thus the matrix does not have rank 4, so is not invertible. By Theorem
4.6, its determinant is zero.
35. This matrix results from the original matrix by multiplying the first row by 2, so its determinant is
2 · 4 = 8.
36. This matrix results from the original matrix by multiplying the first column by 2, the second column by 1/3, and the third column by −1. Thus the determinant of this matrix is 2 · (1/3) · (−1) = −2/3 times the original determinant, so its value is (−2/3) · 4 = −8/3.
37. This matrix results from the original matrix by interchanging the first and second rows. This multiplies
the determinant by −1, so the determinant of this matrix is −4.
38. This matrix results from the original matrix by subtracting the third column from the first. This
operation does not change the determinant, so the determinant of this matrix is also 4.
39. This matrix results from the original matrix by exchanging the first and third columns (which multiplies
the determinant by −1) and then multiplying the first column by 2 (which multiplies the determinant
by 2). So the determinant of this matrix is 2 · (−1) · 4 = −8.
40. To get from the original matrix to this one, row 2 was multiplied by 3, and then twice row 3 was
added to each of rows 1 and 2. The first of these operations multiplies the determinant by 3, while the
additions do not change the determinant. Thus the determinant of this matrix is 3 · 4 = 12.
41. Suppose first that A has a zero row, say the ith row is zero. Using cofactor expansion along that row gives

det A = Σ_{j=1}^{n} aij Cij = Σ_{j=1}^{n} 0 · Cij = 0,

so the determinant is zero. Similarly, if A has a zero column, say the jth column, then using cofactor expansion along that column gives

det A = Σ_{i=1}^{n} aij Cij = Σ_{i=1}^{n} 0 · Cij = 0,

and again the determinant is zero. Alternatively, the column case can be proved from the row case by noting that if A has a zero column, then Aᵀ has a zero row, and det A = det Aᵀ by Theorem 4.10.
42. Dealing first with rows, we want to show that if A →(Ri + kRj) B, where i ≠ j, then det B = det A. Let Ai be the ith row of A. Then

B = [A1; A2; . . . ; Ai + kAj; . . . ; An],   C = [A1; A2; . . . ; kAj; . . . ; An],

where in each matrix the displayed row is row i. Then A, B, and C are identical except that the ith row of B is the sum of the ith rows of A and C, so that by part (a), det B = det A + det C. However, since i ≠ j, the ith row of C, which is kAj, is a multiple of the jth row, which is Aj. By Theorem 4.3(d), we may factor k out of the ith row, multiplying the determinant by k; the result is a matrix whose ith and jth rows are both Aj. So det C = 0 by Theorem 4.3(c). Hence det B = det A, as desired.

If instead A →(Ci + kCj) B, where i ≠ j, then Aᵀ →(Ri + kRj) Bᵀ, so that by the first part and Theorem 4.10,

det A = det Aᵀ = det Bᵀ = det B.
43. If E interchanges two rows, then det E = −1 by Theorem 4.4. Also, EB is the same as B but with two
rows interchanged, so by Theorem 4.3(b), det(EB) = − det B = det(E) det(B). Next, if E results from
multiplying a row by a constant k, then det E = k by Theorem 4.4. But EB is the same as B except that
one row has been multiplied by k, so that by Theorem 4.3(d), det(EB) = k det(B) = det(E) det(B).
Finally, if E results from adding a multiple of one row to another row, then det E = 1 by Theorem 4.4.
But in this case EB is obtained from B by adding a multiple of one row to another, so by Theorem
4.3(f), det(EB) = det(B) = det(E) det(B).
44. Let the row vectors of A be A1, A2, . . . , An. Then the row vectors of kA are kA1, kA2, . . . , kAn. Using Theorem 4.3(d) n times,

det(kA) = det [kA1; kA2; . . . ; kAn] = k det [A1; kA2; . . . ; kAn] = k² det [A1; A2; kA3; . . . ; kAn] = · · · = kⁿ det [A1; A2; . . . ; An] = kⁿ det A.
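The identity det(kA) = kⁿ det A can be confirmed on a concrete instance. The sketch below (an optional check with an arbitrary sample 3 × 3 matrix, not part of the printed solution) compares the two sides:

```python
def det3(M):
    """3x3 determinant by first-row cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

M = [[1, 2, 0], [3, 1, 4], [0, 2, 5]]
k, n = 3, 3
lhs = det3([[k * x for x in row] for row in M])   # det(kA)
rhs = k ** n * det3(M)                            # k^n * det(A)
```

For this sample, det M = −33, and both sides equal 27 · (−33) = −891.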
45. By Theorem 4.6, A is invertible if and only if det A ≠ 0. To find det A, expand along the first column:

det A = k · det [k+1 1; −8 k−1] − 0 + k · det [−k 3; k+1 1]
      = k((k + 1)(k − 1) − 1 · (−8)) + k(−k · 1 − 3(k + 1)) = k³ − 4k² + 4k = k(k − 2)².

Thus det A = 0 when k = 0 or k = 2, so A is invertible for all k except k = 0 and k = 2.
46. By Theorem 4.6, A is invertible if and only if det A ≠ 0. To find det A, expand along the first row:

det [k k 0; k² 2 k; 0 k k] = k · det [2 k; k k] − k · det [k² k; 0 k] + 0
                           = k(2k − k²) − k(k³ − 0) = k(2k − k² − k³)... 

wait, grouping instead as a polynomial: = 2k² − k³ − k⁴ = k²(2 − k − k²) = −k²(k + 2)(k − 1).

Since the matrix is invertible precisely when its determinant is nonzero, and since the determinant is zero when k = 0, k = 1, or k = −2, we see that the matrix is invertible for all values of k except 0, 1, and −2.
47. By Theorem 4.8, det(AB) = det(A) det(B) = 3 · (−2) = −6.
48. By Theorem 4.8, det(A2 ) = det(A) det(A) = 3 · 3 = 9.
49. By Theorem 4.9, det(B⁻¹) = 1/det(B), so that by Theorem 4.8

det(B⁻¹A) = det(B⁻¹) det(A) = det(A)/det(B) = −3/2.
50. By Theorem 4.7, det(2A) = 2ⁿ det(A) = 3 · 2ⁿ.

51. By Theorem 4.10, det(Bᵀ) = det(B), so that by Theorem 4.7, det(3Bᵀ) = 3ⁿ det(Bᵀ) = −2 · 3ⁿ.

52. By Theorem 4.10, det(Aᵀ) = det(A), so that by Theorem 4.8, det(AAᵀ) = det(A) det(Aᵀ) = det(A)² = 9.
53. By Theorem 4.8, det(AB) = det(A) det(B) = det(B) det(A) = det(BA).
54. If B is invertible, then det(B) ≠ 0 and det(B⁻¹) = 1/det(B), so that using Theorem 4.8

det(B⁻¹AB) = det(B⁻¹(AB)) = det(B⁻¹) det(AB) = det(B⁻¹) det(A) det(B) = (1/det(B)) det(A) det(B) = det(A).
55. If A² = A, then det(A²) = det(A). But det(A²) = det(A) det(A) = det(A)², so that det(A) = det(A)². Thus det(A) is either 0 or 1.

56. We first show that Theorem 4.8 extends to prove that if m ≥ 1, then det(A^m) = det(A)^m. The case m = 1 is obvious, and m = 2 is Theorem 4.8. Now assume the statement is true for m = k. Then using Theorem 4.8 and the truth of the statement for m = k,

det(A^(k+1)) = det(A · A^k) = det(A) det(A^k) = det(A) det(A)^k = det(A)^(k+1).

So the statement holds for all m ≥ 1. Now, if A^m = O, then det(A)^m = det(A^m) = det(O) = 0, so that det(A) = 0.
57. The coefficient matrix and constant vector are

A = [1 1; 1 −1],   b = [1; 2].

To apply Cramer's Rule, we compute

det A = det [1 1; 1 −1] = −2,   det A1(b) = det [1 1; 2 −1] = −3,   det A2(b) = det [1 1; 1 2] = 1.

Then by Cramer's Rule,

x = det A1(b)/det A = 3/2,   y = det A2(b)/det A = −1/2.

58. The coefficient matrix and constant vector are

A = [2 −1; 1 3],   b = [5; −1].

To apply Cramer's Rule, we compute

det A = det [2 −1; 1 3] = 7,   det A1(b) = det [5 −1; −1 3] = 14,   det A2(b) = det [2 5; 1 −1] = −7.

Then by Cramer's Rule,

x = det A1(b)/det A = 2,   y = det A2(b)/det A = −1.
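Cramer's Rule for 2 × 2 systems, as used in Exercises 57 and 58, is a few lines of code. The sketch below (an optional check, not part of the printed solution; exact rationals avoid rounding) solves both systems:

```python
from fractions import Fraction

def cramer_2x2(A, b):
    """Solve the 2x2 system Ax = b by Cramer's Rule."""
    det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    det1 = b[0] * A[1][1] - A[0][1] * b[1]   # det A1(b): column 1 replaced by b
    det2 = A[0][0] * b[1] - b[0] * A[1][0]   # det A2(b): column 2 replaced by b
    return Fraction(det1, det_a), Fraction(det2, det_a)

sol57 = cramer_2x2([[1, 1], [1, -1]], [1, 2])    # Exercise 57
sol58 = cramer_2x2([[2, -1], [1, 3]], [5, -1])   # Exercise 58
```

The results match the hand computations: (3/2, −1/2) and (2, −1).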
59. The coefficient matrix and constant vector are

A = [2 1 3; 0 1 1; 0 0 1],   b = [1; 1; 1].

To apply Cramer's Rule, we compute

det A = det [2 1 3; 0 1 1; 0 0 1] = 2 · 1 · 1 = 2,

det A1(b) = det [1 1 3; 1 1 1; 1 0 1] = 1 · det [1 1; 0 1] − 1 · det [1 3; 0 1] + 1 · det [1 3; 1 1] = 1 − 1 − 2 = −2,

det A2(b) = det [2 1 3; 0 1 1; 0 1 1] = 2 · det [1 1; 1 1] = 0,

det A3(b) = det [2 1 1; 0 1 1; 0 0 1] = 2 · det [1 1; 0 1] = 2.

Then by Cramer's Rule,

x = det A1(b)/det A = −1,   y = det A2(b)/det A = 0,
z = det A3(b)/det A = 1.

60. The coefficient matrix and constant vector are

A = [1 1 −1; 1 1 1; 1 −1 0],   b = [1; 2; 3].

To apply Cramer's Rule, we compute

det A = det [1 1 −1; 1 1 1; 1 −1 0] = 1 · det [1 1; −1 0] − 1 · det [1 1; 1 0] + (−1) · det [1 1; 1 −1] = 1 + 1 + 2 = 4,

det A1(b) = det [1 1 −1; 2 1 1; 3 −1 0] = 1 · det [1 1; −1 0] − 1 · det [2 1; 3 0] + (−1) · det [2 1; 3 −1] = 1 + 3 + 5 = 9,

det A2(b) = det [1 1 −1; 1 2 1; 1 3 0] = 1 · det [2 1; 3 0] − 1 · det [1 1; 1 0] + (−1) · det [1 2; 1 3] = −3 + 1 − 1 = −3,

det A3(b) = det [1 1 1; 1 1 2; 1 −1 3] = 1 · det [1 2; −1 3] − 1 · det [1 2; 1 3] + 1 · det [1 1; 1 −1] = 5 − 1 − 2 = 2.
Then by Cramer’s Rule,
61. We have det A =
x=
det A1 (b)
9
= ,
det A
4
1
1
1
= −2. The four cofactors are
−1
y=
det A2 (b)
3
=− ,
det A
4
C11 = (−1)1+1 · (−1) = −1,
C21 = (−1)2+1 · 1 = −1,
z=
det A3 (b)
1
= .
det A
2
C12 = (−1)1+2 · 1 = −1,
C22 = (−1)2+2 · 1 = 1.
Then
"
−1
adj A =
−1
62. We have det A =
2
1
#T "
−1
−1
=
1
−1
#
"
−1
1
1 −1
−1
, so A =
adj A = −
det A
2 −1
1
# "
1
−1
= 21
1
2
1
2
− 12
−1
= 7. The four cofactors are
3
C11 = (−1)1+1 · 3 = 3,
C21 = (−1)2+1 · (−1) = 1,
C12 = (−1)1+2 · 1 = −1,
C22 = (−1)2+2 · 2 = 2.
Then
"
3
adj A =
1
#T "
−1
3
=
2
−1
#
"
1
3
1
1
−1
, so A =
adj A =
det A
7 −1
2
# "
3
1
7
=
1
2
−7
1
7
2
7
#
63. We have det A = 2 from Exercise 59. The nine cofactors are

C11 = (−1)^(1+1) · 1 = 1,        C12 = (−1)^(1+2) · 0 = 0,    C13 = (−1)^(1+3) · 0 = 0,
C21 = (−1)^(2+1) · 1 = −1,       C22 = (−1)^(2+2) · 2 = 2,    C23 = (−1)^(2+3) · 0 = 0,
C31 = (−1)^(3+1) · (−2) = −2,    C32 = (−1)^(3+2) · 2 = −2,   C33 = (−1)^(3+3) · 2 = 2.

Then

adj A = [1 0 0; −1 2 0; −2 −2 2]ᵀ = [1 −1 −2; 0 2 −2; 0 0 2],

so

A⁻¹ = (1/det A) adj A = (1/2)[1 −1 −2; 0 2 −2; 0 0 2] = [1/2 −1/2 −1; 0 1 −1; 0 0 1].
64. We have det A = 4 from Exercise 60. The nine cofactors are

C11 = (−1)^(1+1) · 1 = 1,       C12 = (−1)^(1+2) · (−1) = 1,   C13 = (−1)^(1+3) · (−2) = −2,
C21 = (−1)^(2+1) · (−1) = 1,    C22 = (−1)^(2+2) · 1 = 1,      C23 = (−1)^(2+3) · (−2) = 2,
C31 = (−1)^(3+1) · 2 = 2,       C32 = (−1)^(3+2) · 2 = −2,     C33 = (−1)^(3+3) · 0 = 0.

Then

adj A = [1 1 −2; 1 1 2; 2 −2 0]ᵀ = [1 1 2; 1 1 −2; −2 2 0],

so

A⁻¹ = (1/det A) adj A = (1/4)[1 1 2; 1 1 −2; −2 2 0] = [1/4 1/4 1/2; 1/4 1/4 −1/2; −1/2 1/2 0].
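The adjugate construction of Exercises 61–64 generalizes mechanically: build the cofactor matrix, transpose it, and divide by the determinant. The sketch below (an optional check for the 3 × 3 case, not part of the printed solution) recomputes the inverse from Exercise 63:

```python
from fractions import Fraction

def adj_inverse_3x3(A):
    """Inverse via A^{-1} = (1/det A) * adj A, adj A being the transposed cofactor matrix."""
    def minor(i, j):
        # 2x2 determinant of A with row i and column j deleted.
        rows = [A[r] for r in range(3) if r != i]
        m = [[row[c] for c in range(3) if c != j] for row in rows]
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    C = [[(-1) ** (i + j) * minor(i, j) for j in range(3)] for i in range(3)]
    d = sum(A[0][j] * C[0][j] for j in range(3))   # det A by first-row expansion
    # Entry (i, j) of the inverse is C[j][i] / d (note the transpose).
    return [[Fraction(C[j][i], d) for j in range(3)] for i in range(3)]

Ainv = adj_inverse_3x3([[2, 1, 3], [0, 1, 1], [0, 0, 1]])   # Exercise 63
```

The result matches the matrix [1/2 −1/2 −1; 0 1 −1; 0 0 1] computed above.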
65. First, since A⁻¹ = (1/det A) adj A, we have adj A = (det A)A⁻¹. Then

(adj A) · (1/det A)A = (det A)A⁻¹ · (1/det A)A = (det A · 1/det A)A⁻¹A = I.

Thus adj A times (1/det A)A is the identity, so that adj A is invertible and

(adj A)⁻¹ = (1/det A)A.

This proves the first equality. Finally, substitute A⁻¹ for A in the equation adj A = (det A)A⁻¹; this gives adj(A⁻¹) = (det A⁻¹)A = (1/det A)A, proving the second equality.

66. By Exercise 65, det(adj A) = det((det A)A⁻¹); by Theorem 4.7, this is equal to (det A)ⁿ det(A⁻¹). But det(A⁻¹) = 1/det A, so that det(adj A) = (det A)ⁿ · (1/det A) = (det A)ⁿ⁻¹.
67. Note that exchanging Ri−1 and Ri results in a matrix in which row i has been moved “up” one row
and all other rows remain the same. Refer to the diagram at the end of the solution when reading
what follows. Starting with Rs ↔ Rs−1 , perform s − r adjacent interchanges, each one moving the row
that started as row s up one row. At the end of these interchanges, row s will have been moved “up”
to row r, and all other rows will remain in the same order. The original row r is therefore at row r + 1.
We wish to move that row down to row s. Note that exchanging Ri and Ri+1 results in a matrix in
which row i has been moved “down” one row and all other rows remain the same. So starting with
Rr+1 ↔ Rr+2 , perform s − r − 1 adjacent interchanges, each one moving the row that started as row
r + 1 “down” one row. At the end of these interchanges, row r + 1 will have been moved down s − r − 1
rows, so it will be at row s − r − 1 + r + 1 = s, and all other rows will be as they were. Now, row r + 1
originally started life as row r, so the end result is that row s has been moved to row r, and row r to
row s. We used s − r adjacent interchanges to move row s, and another s − r − 1 adjacent interchanges
to move row r, for a total of 2(s − r) − 1 adjacent interchanges.


R1
 R2 


 .. 
 . 


Rr−1 


 Rr 


Rr+1 


Original:  . 
 .. 


Rs−1 


 Rs 


Rs+1 


 . 
 .. 
Rn




R1
R1
 R2 
 R2 




 .. 
 .. 
 . 
 . 




Rr−1 
Rr−1 




 Rs 
 Rs 




 Rr 
Rr+1 




After initial s − r exchanges: Rr+1  After next s − r − 1 exchanges:  .  .

 .. 

 .. 


 . 
Rs−1 




Rs−1 
 Rr 




Rs+1 
Rs+1 




 . 
 . 
 .. 
 .. 
Rn
Rn
68. The proof is very similar to the proof given in the text for the case of expansion along a row. Let B be the matrix obtained by moving column j of the matrix A to column 1:

A = [a11 a12 · · · a1,j−1 a1j a1,j+1 · · · a1n; a21 a22 · · · a2,j−1 a2j a2,j+1 · · · a2n; . . . ; an1 an2 · · · an,j−1 anj an,j+1 · · · ann],

B = [a1j a11 a12 · · · a1,j−1 a1,j+1 · · · a1n; a2j a21 a22 · · · a2,j−1 a2,j+1 · · · a2n; . . . ; anj an1 an2 · · · an,j−1 an,j+1 · · · ann].

By the discussion in Exercise 67, it requires j − 1 adjacent column interchanges to move column j to column 1, so by Lemma 4.14, det B = (−1)^(j−1) det A, so that det A = (−1)^(j−1) det B. Then by Lemma 4.13, evaluating det B using cofactor expansion along the first column gives

det A = (−1)^(j−1) det B = (−1)^(j−1) Σ_{i=1}^{n} bi1 Bi1 = (−1)^(j−1) Σ_{i=1}^{n} aij Bi1,

where Aij and Bij are the cofactors of A and B. It remains only to understand how the cofactors of A and B are related. Clearly the matrix formed by removing row i and column j of A is the same matrix as is formed by removing row i and column 1 of B, so the only difference is the sign that we put on the determinant of that matrix. In A, the sign is (−1)^(i+j); in B, it is (−1)^(i+1). Thus

Bi1 = (−1)^(j−1) Aij.

Substituting that in the equation for det A above gives

det A = (−1)^(j−1) Σ_{i=1}^{n} aij Bi1 = (−1)^(j−1) Σ_{i=1}^{n} (−1)^(j−1) aij Aij = (−1)^(2j−2) Σ_{i=1}^{n} aij Aij = Σ_{i=1}^{n} aij Aij,

proving Laplace expansion along columns.
69. Since P and S are both square, suppose P is n × n and S is m × m. Then O is m × n and Q is n × m. Proceed by induction on n. If n = 1, then

A = [p11 Q; O S],

and Q is 1 × m while O is m × 1. Compute the determinant by cofactor expansion along the first column. The only nonzero entry in the first column is p11, so that det A = p11 A11. Since O has one column and Q one row, removing the first row and column of A leaves only S, so that A11 = det S and thus det A = p11 det S = (det P)(det S). This proves the case n = 1 and provides the basis for the induction.

Now suppose that the statement is true when P is k × k; we want to prove it when P is (k + 1) × (k + 1). So let P be a (k + 1) × (k + 1) matrix. Again compute det A using column 1:

det A = Σ_{i=1}^{k+1+m} ai1 (−1)^(i+1) det Ai1.

Now, ai1 = pi1 for i ≤ k + 1, and ai1 = 0 for i > k + 1, so that

det A = Σ_{i=1}^{k+1} pi1 (−1)^(i+1) det Ai1.
Ai1 is the submatrix formed by removing row i and column 1 of A; since i ranges from 1 to k + 1, we
see that this process removes row i and column 1 of P , but does not change S. So we are left with a
matrix
A_{i1} = [P_{i1} Q_i; O S].
Now Pi1 is k × k, so we can apply the inductive hypothesis to get
det Ai1 = (det Pi1 )(det S).
Substitute that into the equation for det A above to get
det A = Σ_{i=1}^{k+1} p_{i1} (−1)^(i+1) det A_{i1}
      = Σ_{i=1}^{k+1} p_{i1} (−1)^(i+1) (det P_{i1})(det S)
      = (det S) (Σ_{i=1}^{k+1} p_{i1} (−1)^(i+1) det P_{i1}).
But the sum is just cofactor expansion of P along the first column, so its value is det P , and finally we
get det A = (det S)(det P ) = (det P )(det S) as desired.
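This block-determinant identity is easy to sanity-check numerically. The sketch below (the block sizes and random entries are arbitrary choices, not from the text) assembles a block upper triangular matrix with NumPy and compares determinants:

```python
import numpy as np

# Illustrative sizes only: P is 2x2, S is 3x3, Q is 2x3, O is the 3x2 zero block.
rng = np.random.default_rng(0)
P = rng.standard_normal((2, 2))
Q = rng.standard_normal((2, 3))
S = rng.standard_normal((3, 3))
O = np.zeros((3, 2))

A = np.block([[P, Q], [O, S]])  # block upper triangular

# det A = (det P)(det S), as proved in Exercise 69.
assert np.isclose(np.linalg.det(A), np.linalg.det(P) * np.linalg.det(S))
```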
70. (a) There are many examples. For instance, let
P = S = [1 0; 0 1],   Q = [1 0; 1 0],   R = [0 1; 0 1].
Then
A = [P Q; R S] = [1 0 1 0; 0 1 1 0; 0 1 1 0; 0 1 0 1].
Since the second and third rows of A are identical, det A = 0, but
(det P )(det S) − (det Q)(det R) = 1 · 1 − 0 · 0 = 1,
and the two are unequal.
(b) We have
BA = [P⁻¹ O; −RP⁻¹ I] [P Q; R S]
= [P⁻¹P + OR   P⁻¹Q + OS; −RP⁻¹P + IR   −RP⁻¹Q + IS]
= [I   P⁻¹Q; −R + R   −RP⁻¹Q + S]
= [I   P⁻¹Q; O   S − RP⁻¹Q].
It is clear that the proof in Exercise 69 can easily be modified to prove the same result for a block
lower triangular matrix. That is, if A, P , and S are square and
A = [P O; Q S],
then det A = (det P )(det S). Applying this to the matrix B gives
det B = (det P⁻¹)(det I) = det P⁻¹ = 1/(det P).
Further, applying Exercise 69 to BA gives
det(BA) = (det I)(det(S − RP⁻¹Q)) = det(S − RP⁻¹Q).
So finally
det A = (1/(det B)) det(BA) = (det P)(det(S − RP⁻¹Q)).
(c) If P R = RP , then
det A = (det P)(det(S − RP⁻¹Q)) = det(P(S − RP⁻¹Q))
= det(PS − (PR)P⁻¹Q) = det(PS − (RP)P⁻¹Q) = det(PS − RQ).
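Parts (b) and (c) can also be spot-checked numerically; the random 2 × 2 blocks below are illustrative assumptions, and S − RP⁻¹Q is the Schur complement of P in A:

```python
import numpy as np

rng = np.random.default_rng(1)
P, Q, R, S = (rng.standard_normal((2, 2)) for _ in range(4))
A = np.block([[P, Q], [R, S]])

# Part (b): det A = det P * det(S - R P^{-1} Q), for invertible P.
schur = S - R @ np.linalg.inv(P) @ Q
assert np.isclose(np.linalg.det(A), np.linalg.det(P) * np.linalg.det(schur))

# Part (c): when PR = RP, this collapses to det(PS - RQ); choosing R = P commutes.
R2 = P.copy()
A2 = np.block([[P, Q], [R2, S]])
assert np.isclose(np.linalg.det(A2), np.linalg.det(P @ S - R2 @ Q))
```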
Exploration: Geometric Applications of Determinants
1. (a) We have
u × v = det[i j k; 0 1 1; 3 −1 2] = i det[1 1; −1 2] − j det[0 1; 3 2] + k det[0 1; 3 −1] = 3i + 3j − 3k.
(b) We have
u × v = det[i j k; 3 −1 2; 0 1 1] = i det[−1 2; 1 1] − j det[3 2; 0 1] + k det[3 −1; 0 1] = −3i − 3j + 3k.
(c) We have
u × v = det[i j k; −1 2 3; 2 −4 −6] = i det[2 3; −4 −6] − j det[−1 3; 2 −6] + k det[−1 2; 2 −4] = 0.
(d) We have
i
u×v = 1
1
j
1
2
k
1
1 =i
2
3
1
1
−j
1
3
1
1
+k
1
3
1
= i − 2j + k.
2
2. First,
u · (v × w) = u · det[i j k; v1 v2 v3; w1 w2 w3]
= (u1 i + u2 j + u3 k) · (i(v2 w3 − v3 w2) − j(v1 w3 − v3 w1) + k(v1 w2 − v2 w1))
= (u1 v2 w3 − u1 v3 w2) − (u2 v1 w3 − u2 v3 w1) + (u3 v1 w2 − u3 v2 w1).
Computing the determinant, we get
det[u1 u2 u3; v1 v2 v3; w1 w2 w3] = u1 det[v2 v3; w2 w3] − u2 det[v1 v3; w1 w3] + u3 det[v1 v2; w1 w2]
= u1(v2 w3 − v3 w2) − u2(v1 w3 − v3 w1) + u3(v1 w2 − v2 w1)
= (u1 v2 w3 − u1 v3 w2) − (u2 v1 w3 − u2 v3 w1) + (u3 v1 w2 − u3 v2 w1),
and the two expressions are equal.
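The identity u · (v × w) = det of the matrix with rows u, v, w can be spot-checked with NumPy; the three vectors below are arbitrary:

```python
import numpy as np

u = np.array([2.0, -1.0, 3.0])
v = np.array([1.0, 4.0, 0.0])
w = np.array([-2.0, 1.0, 5.0])

triple = u @ np.cross(v, w)               # u . (v x w)
det = np.linalg.det(np.array([u, v, w]))  # determinant with rows u, v, w
assert np.isclose(triple, det)
```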
3. With u, v, and w as in the previous exercise,
(a) We have
v × u = det[i j k; v1 v2 v3; u1 u2 u3] = −det[i j k; u1 u2 u3; v1 v2 v3] = −(u × v),
since interchanging rows 2 and 3 multiplies the determinant by −1.
(b) We have
u × 0 = det[i j k; u1 u2 u3; 0 0 0].
This matrix has a zero row, so its determinant is zero by Theorem 4.3. Thus u × 0 = 0.
(c) We have
u × u = det[i j k; u1 u2 u3; u1 u2 u3].
This matrix has two equal rows, so its determinant is zero by Theorem 4.3. Thus u × u = 0.
(d) We have by Theorem 4.3
u × kv = det[i j k; u1 u2 u3; kv1 kv2 kv3] = k det[i j k; u1 u2 u3; v1 v2 v3] = k(u × v).
(e) We have by Theorem 4.3
u × (v + w) = det[i j k; u1 u2 u3; v1 + w1 v2 + w2 v3 + w3]
= det[i j k; u1 u2 u3; v1 v2 v3] + det[i j k; u1 u2 u3; w1 w2 w3] = u × v + u × w.
(f ) From Exercise 2 above,
u · (u × v) = det[u1 u2 u3; u1 u2 u3; v1 v2 v3],   v · (u × v) = det[v1 v2 v3; u1 u2 u3; v1 v2 v3].
But each of these matrices has two equal rows, so each has zero determinant. Thus
u · (u × v) = v · (u × v) = 0.
(g) From Exercise 2 above,
u · (v × w) = det[u1 u2 u3; v1 v2 v3; w1 w2 w3],   (u × v) · w = w · (u × v) = det[w1 w2 w3; u1 u2 u3; v1 v2 v3].
The second matrix is obtained from the first by exchanging rows 1 and 2 and then exchanging
rows 1 and 3. Since each row exchange multiplies the determinant by −1, it remains unchanged
so that the two products are equal.
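All seven properties in this exercise can be verified numerically at sample inputs; the vectors u, v, w and the scalar k below are arbitrary choices:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([0.0, 1.0, 4.0])
w = np.array([2.0, -1.0, 1.0])
k = 5.0

assert np.allclose(np.cross(v, u), -np.cross(u, v))                      # (a)
assert np.allclose(np.cross(u, np.zeros(3)), np.zeros(3))                # (b)
assert np.allclose(np.cross(u, u), np.zeros(3))                          # (c)
assert np.allclose(np.cross(u, k * v), k * np.cross(u, v))               # (d)
assert np.allclose(np.cross(u, v + w), np.cross(u, v) + np.cross(u, w))  # (e)
assert np.isclose(u @ np.cross(u, v), 0)                                 # (f)
assert np.isclose(v @ np.cross(u, v), 0)                                 # (f)
assert np.isclose(np.cross(u, v) @ w, u @ np.cross(v, w))                # (g)
```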
4. The vectors u and v, considered as vectors in R3 , have their third coordinates equal to zero, so the
area of the parallelogram with these edges is
A = ‖u × v‖ = ‖det[i j k; u1 u2 0; v1 v2 0]‖ = ‖k det[u1 u2; v1 v2]‖ = |det[u1 u2; v1 v2]| = |det[u1 v1; u2 v2]|.
5. The area of the rectangle is (a + c)(b + d). The small rectangles at top left and bottom right each
have area bc. The triangles at top and bottom each have base a and height b, so the area of each is
(1/2)ab; altogether they have area ab. Finally, the smaller triangles at left and right each have base d
and height c, so the area of each is (1/2)cd; altogether they have area cd. So in total, the area in the big
rectangle that is not in the parallelogram is 2bc + ab + cd, so the area of the parallelogram is
(a + c)(b + d) − 2bc − ab − cd = ab + ad + bc + cd − 2bc − ab − cd = ad − bc.
For this parallelogram, u = [a; b] and v = [c; d], and indeed
det[u v] = ad − bc.
As for the absolute value sign, note that the formula A = ku × vk does not depend on which of the
two sides we choose for u and which for v, but the sign of the determinant does. In our particular
case, when we choose u and v as above, we don’t need the absolute value sign, as the area works out
properly. It is present in the norm formula to represent the fact that we need not make particular
choices for u and v.
6. (a) By Exercise 5, the area is
A = |det[2 −1; 3 4]| = |2 · 4 − (−1) · 3| = 11.
(b) By Exercise 5, the area is
A = |det[3 5; 4 5]| = |3 · 5 − 4 · 5| = 5.
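A short numerical check of both areas, using the side vectors read off above:

```python
import numpy as np

# Columns are the side vectors: (a) u = (2, 3), v = (-1, 4); (b) u = (3, 4), v = (5, 5).
area_a = abs(np.linalg.det(np.array([[2.0, -1.0], [3.0, 4.0]])))
area_b = abs(np.linalg.det(np.array([[3.0, 5.0], [4.0, 5.0]])))
assert np.isclose(area_a, 11) and np.isclose(area_b, 5)
```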
7. A parallelepiped of height h with base area a has volume ha. This can be seen using calculus; the
volume is ∫_0^h a dz = [az]_0^h = ah. Alternatively, it can be seen intuitively, since the parallelepiped
consists of “slices” each of area a, and the height of those slices is h. Now, for the parallelepiped in
Figure 4.11, the height is clearly ‖u‖ cos θ. The base is a parallelogram with sides v and w, so its area
is ‖v × w‖. Thus the volume of the parallelepiped is ‖u‖ ‖v × w‖ cos θ. But
u · (v × w) = ‖u‖ ‖v × w‖ cos θ,
so that the volume is |u · (v × w)|, which, by Exercise 2, equals the absolute value of the determinant
of the matrix [u; v; w] whose rows are u, v, and w; this in turn equals the absolute value of the
determinant of its transpose [u v w].
8. Compare Figure 4.12 to Figure 4.11. The base of the tetrahedron is a triangle whose area is one half
the area of the parallelogram in Figure 4.11. So from the hint, we have
V = (1/3)(area of the base)(height)
  = (1/6)(area of the parallelogram)(height)
  = (1/6)(volume of the parallelepiped)
  = (1/6)|u · (v × w)|.
9. By Problem 4, the area of the parallelogram determined by Au and Av is
TA(P) = |det[Au Av]| = |det(A[u v])| = |det A| |det[u v]| = |det A|(area of P).
10. By Problem 7, the volume of the parallelepiped determined by Au, Av, and Aw is
TA(P) = |det[Au Av Aw]| = |det(A[u v w])| = |det A| |det[u v w]| = |det A|(volume of P).
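Problems 9 and 10 say that TA scales area and volume by |det A|; here is a numerical spot-check of the volume version, with random edge vectors and a random matrix as illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)
u, v, w = rng.standard_normal((3, 3))  # three random edge vectors (rows)
A = rng.standard_normal((3, 3))        # matrix of the transformation T_A

vol = abs(np.linalg.det(np.column_stack([u, v, w])))
vol_image = abs(np.linalg.det(np.column_stack([A @ u, A @ v, A @ w])))
assert np.isclose(vol_image, abs(np.linalg.det(A)) * vol)
```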
11. (a) The line through (2, 3) and (−1, 0) is given by
det[x y 1; 2 3 1; −1 0 1] = 0.
Evaluate by cofactor expansion along the third row:
−1 · det[y 1; 3 1] + 1 · det[x y; 2 3] = −1(y − 3) + (3x − 2y) = 3x − 3y + 3.
The equation of the line is 3x − 3y + 3 = 0, or x − y + 1 = 0.
(b) The line through (1, 2) and (4, 3) is given by
det[x y 1; 1 2 1; 4 3 1] = 0.
Evaluate by cofactor expansion along the first row:
x det[2 1; 3 1] − y det[1 1; 4 1] + 1 · det[1 2; 4 3] = −x + 3y − 5.
The equation of the line is −x + 3y − 5 = 0, or x − 3y + 5 = 0.
12. Suppose the three points satisfy
ax1 + by1 + c = 0
ax2 + by2 + c = 0
ax3 + by3 + c = 0,
which can be written as the matrix equation
XA = [x1 y1 1; x2 y2 1; x3 y3 1][a; b; c] = O.
If det X ≠ 0, then X is invertible, so that A = X⁻¹XA = X⁻¹O = O. Thus a nontrivial solution
A ≠ O exists if and only if det X = 0; this says precisely that the three points lie on the line
ax + by + c = 0 with a and b not both zero if and only if det X = 0.
13. Following the discussion at the beginning of the section, since the three points are noncollinear, they
determine a unique plane, say ax + by + cz + d = 0. Since all three points lie on the plane, their
coordinates satisfy this equation, so that
ax1 + by1 + cz1 + d = 0
ax2 + by2 + cz2 + d = 0
ax3 + by3 + cz3 + d = 0.
These three equations together with the equation of the plane then form a system of linear equations
in the variables a, b, c, and d. Since we know that the plane exists, it follows that there are values
of a, b, c, and d, not all zero, satisfying these equations. So the system has a nontrivial solution, and
therefore its coefficient matrix, which is
[x y z 1; x1 y1 z1 1; x2 y2 z2 1; x3 y3 z3 1],
cannot be invertible, by the Fundamental Theorem of Invertible Matrices. Thus its determinant must
be zero by Theorem 4.6.
To see what happens if the three points are collinear, try row reducing the above coefficient matrix:
[x y z 1; x1 y1 z1 1; x2 y2 z2 1; x3 y3 z3 1]
→ (R3 − R2, R4 − R2) →
[x y z 1; x1 y1 z1 1; x2 − x1 y2 − y1 z2 − z1 0; x3 − x1 y3 − y1 z3 − z1 0].
If the points are collinear, then
[x3 − x1; y3 − y1; z3 − z1] = k[x2 − x1; y2 − y1; z2 − z1]
for some k ≠ 0. Then performing R4 − kR3 gives a matrix with a zero row. So its determinant is
trivially zero and the system has an infinite number of solutions. This corresponds to the fact that
there are an infinite number of planes passing through the line determined by the three collinear points.
14. Suppose the four points satisfy
ax1 + by1 + cz1 + d = 0
ax2 + by2 + cz2 + d = 0
ax3 + by3 + cz3 + d = 0
ax4 + by4 + cz4 + d = 0,
which can be written as the matrix equation
XA = [x1 y1 z1 1; x2 y2 z2 1; x3 y3 z3 1; x4 y4 z4 1][a; b; c; d] = O.
If det X ≠ 0, then X is invertible, so that A = X⁻¹XA = X⁻¹O = O. Thus a nontrivial solution
A ≠ O exists if and only if det X = 0; this says precisely that the four points lie on the plane
ax + by + cz + d = 0 with a, b, and c not all zero if and only if det X = 0.
15. Substituting the values (x, y) = (−1, 10), (0, 5) and (3, 2) in turn into the equation gives the linear
system in a, b, and c
a − b + c = 10
a = 5
a + 3b + 9c = 2.
The determinant of the coefficient matrix of this system is
det X = det[1 −1 1; 1 0 0; 1 3 9] = −det[−1 1; 3 9] = −(−9 − 3) = 12 ≠ 0,
expanding along the second row.
Since X has nonzero determinant, it is invertible by Theorem 4.6, so we have
X[a; b; c] = [10; 5; 2]  ⇒  [a; b; c] = X⁻¹[10; 5; 2],
so that the system has a unique solution. To find that solution, we can either invert X or we can
row-reduce the augmented matrix. Using the second method, and writing the second equation first in
the augmented matrix, we have
[1 0 0 | 5; 1 −1 1 | 10; 1 3 9 | 2] → (R2 − R1, R3 − R1) → [1 0 0 | 5; 0 −1 1 | 5; 0 3 9 | −3]
→ (−R2, then R3 − 3R2) → [1 0 0 | 5; 0 1 −1 | −5; 0 0 12 | 12]
→ ((1/12)R3, then R2 + R3) → [1 0 0 | 5; 0 1 0 | −4; 0 0 1 | 1].
So the system has the unique solution a = 5, b = −4, c = 1 and the parabola through the three points
is y = 5 − 4x + x^2.
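The same solution drops out of a linear solver; X below is the coefficient matrix built above, with rows (1, x, x^2):

```python
import numpy as np

X = np.array([[1.0, -1.0, 1.0],
              [1.0,  0.0, 0.0],
              [1.0,  3.0, 9.0]])
y = np.array([10.0, 5.0, 2.0])

a, b, c = np.linalg.solve(X, y)
assert np.allclose([a, b, c], [5, -4, 1])  # y = 5 - 4x + x^2
```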
16. (a) The linear system in a, b, and c is
a + b + c = −1
a + 2b + 4c = 4
a + 3b + 9c = 3,
so row-reducing the augmented matrix gives
[1 1 1 | −1; 1 2 4 | 4; 1 3 9 | 3] → (R2 − R1, R3 − R1) → [1 1 1 | −1; 0 1 3 | 5; 0 2 8 | 4]
→ (R3 − 2R2, then (1/2)R3) → [1 1 1 | −1; 0 1 3 | 5; 0 0 1 | −3]
→ (R1 − R3, R2 − 3R3) → [1 1 0 | 2; 0 1 0 | 14; 0 0 1 | −3]
→ (R1 − R2) → [1 0 0 | −12; 0 1 0 | 14; 0 0 1 | −3].
So the system has the unique solution a = −12, b = 14, c = −3 and the parabola through the
three points is y = −12 + 14x − 3x^2.
(b) The linear system in a, b, and c is
a − b + c = −3
a + b + c = −1
a + 3b + 9c = 1,
so row-reducing the augmented matrix gives
[1 −1 1 | −3; 1 1 1 | −1; 1 3 9 | 1] → (R2 − R1, R3 − R1) → [1 −1 1 | −3; 0 2 0 | 2; 0 4 8 | 4]
→ ((1/2)R2, (1/4)R3) → [1 −1 1 | −3; 0 1 0 | 1; 0 1 2 | 1]
→ (R3 − R2, then (1/2)R3) → [1 −1 1 | −3; 0 1 0 | 1; 0 0 1 | 0]
→ (R1 + R2 − R3) → [1 0 0 | −2; 0 1 0 | 1; 0 0 1 | 0].
So the system has the unique solution a = −2, b = 1, c = 0. Thus the equation of the curve
passing through these points is in fact a line, since c = 0, and its equation is y = −2 + x.
17. Computing the determinant, we get
det[1 a1 a1^2; 1 a2 a2^2; 1 a3 a3^2] = det[a2 a2^2; a3 a3^2] − det[a1 a1^2; a3 a3^2] + det[a1 a1^2; a2 a2^2]
= a2 a3^2 − a2^2 a3 − a1 a3^2 + a1^2 a3 + a1 a2^2 − a1^2 a2.
Now compute the product on the right:
(a2 − a1)(a3 − a1)(a3 − a2) = (a2 a3 − a1 a3 − a1 a2 + a1^2)(a3 − a2)
= a2 a3^2 − a2^2 a3 − a1 a3^2 + a1 a2 a3 − a1 a2 a3 + a1 a2^2 + a1^2 a3 − a1^2 a2
= a2 a3^2 − a2^2 a3 − a1 a3^2 + a1 a2^2 + a1^2 a3 − a1^2 a2.
The two expressions are equal. Furthermore, the determinant is nonzero since it is the product of the
three differences a2 − a1 , a3 − a1 , and a3 − a2 , which are all nonzero because we are assuming that the
ai are distinct real numbers.
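A numerical spot-check of this 3 × 3 Vandermonde formula at arbitrary distinct values:

```python
import numpy as np

a1, a2, a3 = 1.0, 2.0, 4.0  # any three distinct values
V = np.array([[1, a1, a1**2],
              [1, a2, a2**2],
              [1, a3, a3**2]])
assert np.isclose(np.linalg.det(V), (a2 - a1) * (a3 - a1) * (a3 - a2))
```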
18. Recall that adding a multiple of one column to another does not change the determinant (see Theorem
4.3). So given the matrix
det[1 a1 a1^2 a1^3; 1 a2 a2^2 a2^3; 1 a3 a3^2 a3^3; 1 a4 a4^2 a4^3],
apply in order the column operations C4 − a1 C3 , C3 − a1 C2 , C2 − a1 C1 to get
det[1 0 0 0; 1 a2 − a1 (a2 − a1)a2 (a2 − a1)a2^2; 1 a3 − a1 (a3 − a1)a3 (a3 − a1)a3^2; 1 a4 − a1 (a4 − a1)a4 (a4 − a1)a4^2].
Then expand along the first row, giving
(a2 − a1)(a3 − a1)(a4 − a1) det[1 a2 a2^2; 1 a3 a3^2; 1 a4 a4^2],
after factoring (a_i − a1) out of each of the three remaining rows.
But we know how to evaluate this determinant from Exercise 17; doing so, we get for the original
determinant
(a4 − a1 )(a3 − a1 )(a2 − a1 )(a3 − a2 )(a4 − a2 )(a4 − a3 )
which is the same as the expression given in the problem statement.
For the second part, we can interpret the matrix above as the coefficient matrix formed by starting
with the cubic y = a + bx + cx2 + dx3 , substituting the four given points for x, and regarding the
resulting equations as a linear system in a, b, c, and d, just as in Exercise 15. Since the determinant
is nonzero if the ai are all distinct, it follows that the system has a unique solution, so that there is a
unique polynomial of degree at most 3 (it may be less than a cubic; for example, see Exercise 16(b))
passing through the four points.
19. For the n × n determinant given, use induction on n (note that Exercise 18 was essentially an inductive
computation, since it invoked Exercise 17 at the end). We have already proven the cases n = 1, n = 2,
and n = 3. So assume that the statement holds for n = k, and consider the determinant
det[1 a_1 ⋯ a_1^(k−1) a_1^k; 1 a_2 ⋯ a_2^(k−1) a_2^k; ⋮; 1 a_k ⋯ a_k^(k−1) a_k^k; 1 a_{k+1} ⋯ a_{k+1}^(k−1) a_{k+1}^k];
apply in order the column operations Ck+1 − a1 Ck , Ck − a1 Ck−1 , . . . , C2 − a1 C1 to get
det[1 0 0 ⋯ 0; 1 a_2 − a_1 (a_2 − a_1)a_2 ⋯ (a_2 − a_1)a_2^(k−1); ⋮; 1 a_{k+1} − a_1 (a_{k+1} − a_1)a_{k+1} ⋯ (a_{k+1} − a_1)a_{k+1}^(k−1)].
Now expand along the first row, giving
∏_{j=2}^{k+1} (a_j − a_1) · det[1 a_2 ⋯ a_2^(k−1); 1 a_3 ⋯ a_3^(k−1); ⋮; 1 a_{k+1} ⋯ a_{k+1}^(k−1)],
where each factor (a_j − a_1) has been pulled out of the row corresponding to a_j.
.
But the remaining determinant is again a Vandermonde determinant in the k variables a2 , a3 , . . . , ak+1 ,
so we know its value from the inductive hypothesis. Substituting that value, we get for the determinant
of the original matrix
k+1
Y
Y
j=2
2≤i<j≤k+1
(aj − a1 )
(aj − ai ) =
Y
(aj − ai )
1≤i<j≤k+1
as desired.
For the second part, we can interpret the matrix above as the coefficient matrix formed by starting
with the (n − 1)st degree polynomial y = c0 + c1 x + · · · + cn−1 xn−1 , substituting the n given points for
x, and regarding the resulting equations as a linear system in the ci , just as in Exercises 15 and 18.
Since the determinant is nonzero if the x-coordinates of the points are all distinct (being a product of
their differences), it follows that the system has a unique solution, so that there is a unique polynomial
of degree at most n − 1 (it may be less; for example, see Exercise 16(b)) passing through the n points.
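The general formula can be spot-checked with NumPy's built-in Vandermonde constructor; the nodes below are arbitrary distinct values:

```python
import numpy as np
from itertools import combinations

a = np.array([1.0, 2.0, 3.5, 5.0, 7.0])  # distinct a_i
V = np.vander(a, increasing=True)        # row i is (1, a_i, ..., a_i^{n-1})
product = np.prod([a[j] - a[i] for i, j in combinations(range(len(a)), 2)])
assert np.isclose(np.linalg.det(V), product)
```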
4.3
Eigenvalues and Eigenvectors of n × n Matrices
1. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 3; −2 6−λ] = (1 − λ)(6 − λ) − (−2) · 3 = λ^2 − 7λ + 12 = (λ − 3)(λ − 4).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ = 3 and λ = 4.
(c) To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of
A − 3I = [−2 3; −2 3].
Row reduce this matrix:
[A − 3I | 0] = [−2 3 0; −2 3 0] → [2 −3 0; 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form [3t; 2t], so that E3 = span([3; 2]).
To find the eigenspace corresponding to the eigenvalue λ = 4 we must find the null space of
A − 4I = [−3 3; −2 2].
Row reduce this matrix:
[A − 4I | 0] = [−3 3 0; −2 2 0] → [1 −1 0; 0 0 0].
Thus eigenvectors corresponding to λ = 4 are of the form [t; t], so that E4 = span([1; 1]).
(d) Since λ = 3 and λ = 4 are single roots of the characteristic polynomial, the algebraic multiplicity
of each is 1. Since each eigenspace is one-dimensional, the geometric multiplicity of each eigenvalue
is also 1.
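The hand computation can be cross-checked with NumPy (A = [1 3; −2 6] is the matrix recovered from the work above):

```python
import numpy as np

A = np.array([[1.0, 3.0], [-2.0, 6.0]])
evals, evecs = np.linalg.eig(A)
assert np.allclose(sorted(evals), [3, 4])

# Every returned column satisfies A v = lambda v.
for lam, vec in zip(evals, evecs.T):
    assert np.allclose(A @ vec, lam * vec)
```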
2. (a) The characteristic polynomial of A is
det(A − λI) = det[2−λ 1; −1 −λ] = (2 − λ)(−λ) − 1 · (−1) = λ^2 − 2λ + 1 = (λ − 1)^2.
(b) The eigenvalues of A are the roots of the characteristic polynomial; this polynomial has as its
only root λ = 1.
(c) To find the eigenspace corresponding to the eigenvalue λ = 1 we must find the null space of
A − I = [1 1; −1 −1].
Row reduce this matrix:
[A − I | 0] = [1 1 0; −1 −1 0] → [1 1 0; 0 0 0].
Thus eigenvectors corresponding to λ = 1 are of the form [−t; t], which is span([−1; 1]).
(d) Since λ = 1 is a double root of the characteristic polynomial, its algebraic multiplicity is 2.
However, the eigenspace is one-dimensional, so its geometric multiplicity is 1.
3. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 1 0; 0 −2−λ 1; 0 0 3−λ] = (1 − λ)(−2 − λ)(3 − λ) = −(λ − 1)(λ + 2)(λ − 3).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ = 1, λ = −2, and
λ = 3.
(c) To find the eigenspace corresponding to the eigenvalue λ = 1 we must find the null space of
A − I = [0 1 0; 0 −3 1; 0 0 2].
Row reduce this matrix:
[A − I | 0] = [0 1 0 0; 0 −3 1 0; 0 0 2 0] → [0 1 0 0; 0 0 1 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 1 are of the form [t; 0; 0], so that E1 = span([1; 0; 0]).
To find the eigenspace corresponding to the eigenvalue λ = −2 we must find the null space of
A + 2I = [3 1 0; 0 0 3; 0 0 5].
Row reduce this matrix:

3 1 0

0 = 0 0 3
0 0 5
i
h
A + 2I


1
0


0 −→ 0
0
0
1
3
0
0
0
1
0

0

0
0
 1 
 
−1
−3t
Thus eigenvectors corresponding to λ = −2 are of the form  t , so that E−2 = span  3.
0
0
To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of


−2
1 0
A − 3I =  0 −5 1
0
0 0
Row reduce this matrix:
[A − 3I | 0] = [−2 1 0 0; 0 −5 1 0; 0 0 0 0] → [1 0 −1/10 0; 0 1 −1/5 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form [(1/10)t; (1/5)t; t], or, clearing fractions,
[1; 2; 10], so that E3 = span([1; 2; 10]).
(d) Since all three eigenvalues are single roots of the characteristic polynomial, the algebraic multiplicity of each is 1. Since each eigenspace is one-dimensional, the geometric multiplicity of each
eigenvalue is also 1.
4. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 0 1; 0 1−λ 1; 1 1 −λ] = (1 − λ) det[1−λ 1; 1 −λ] + det[0 1−λ; 1 1]
= (1 − λ)(λ^2 − λ − 1) + (−1 + λ) = −λ^3 + 2λ^2 + λ − 2
= −(λ + 1)(λ − 1)(λ − 2).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = −1, λ2 = 1,
and λ3 = 2.
(c) To find the eigenspace corresponding to λ1 = −1 we must find the null space of
A + I = [2 0 1; 0 2 1; 1 1 1].
Row reduce this matrix:
[A + I | 0] = [2 0 1 0; 0 2 1 0; 1 1 1 0] → [1 0 1/2 0; 0 1 1/2 0; 0 0 0 0].
Thus eigenvectors corresponding to λ1 are of the form [−(1/2)t; −(1/2)t; t], or [−t; −t; 2t], which is
span([−1; −1; 2]).
To find the eigenspace corresponding to λ2 = 1 we must find the null space of
A − I = [0 0 1; 0 0 1; 1 1 −1].
Row reduce this matrix:
[A − I | 0] = [0 0 1 0; 0 0 1 0; 1 1 −1 0] → [1 1 0 0; 0 0 1 0; 0 0 0 0].
Thus eigenvectors corresponding to λ2 are of the form [−t; t; 0], which is span([−1; 1; 0]).
To find the eigenspace corresponding to λ3 = 2 we must find the null space of
A − 2I = [−1 0 1; 0 −1 1; 1 1 −2].
Row reduce this matrix:
[A − 2I | 0] = [−1 0 1 0; 0 −1 1 0; 1 1 −2 0] → [1 0 −1 0; 0 1 −1 0; 0 0 0 0].
Thus eigenvectors corresponding to λ3 are of the form [t; t; t], which is span([1; 1; 1]).
(d) Since each eigenvalue is a single root of the characteristic polynomial, each has algebraic multiplicity 1. Since each eigenspace is one-dimensional, each eigenvalue also has geometric multiplicity
1.
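A NumPy cross-check of this exercise (A = [1 0 1; 0 1 1; 1 1 0] as reconstructed above):

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
assert np.allclose(sorted(np.linalg.eigvals(A)), [-1, 1, 2])

v = np.array([1.0, 1.0, 1.0])  # the basis vector found above for the eigenspace of lambda = 2
assert np.allclose(A @ v, 2 * v)
```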
5. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 2 0; −1 −1−λ 1; 0 1 1−λ] = −λ^3 + λ^2 = −λ^2(λ − 1).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ = 0 and λ = 1.
(c) To find the eigenspace corresponding to the eigenvalue λ = 0 we must find the null space of A.
Row reduce this matrix:
[A | 0] = [1 2 0 0; −1 −1 1 0; 0 1 1 0] → [1 0 −2 0; 0 1 1 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 0 are of the form
[2t; −t; t] = t[2; −1; 1],
so that the eigenspace is E0 = span([2; −1; 1]).
To find the eigenspace corresponding to the eigenvalue λ = 1 we must find the null space of
A − I = [0 2 0; −1 −2 1; 0 1 0].
Row reduce this matrix:
[A − I | 0] = [0 2 0 0; −1 −2 1 0; 0 1 0 0] → [1 0 −1 0; 0 1 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 1 are of the form [t; 0; t], so that E1 = span([1; 0; 1]).
(d) Since λ = 0 is a double root of the characteristic polynomial, it has algebraic multiplicity 2; since
its eigenspace is one-dimensional, it has geometric multiplicity 1. Since λ = 1 is a single root of the
characteristic polynomial, it has algebraic multiplicity 1; since its eigenspace is one-dimensional,
it also has geometric multiplicity 1.
6. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 0 2; 3 −1−λ 3; 2 0 1−λ]
= (1 − λ) det[−1−λ 3; 0 1−λ] + 2 det[3 −1−λ; 2 0]
= (1 − λ)(−(1 + λ)(1 − λ) − 3 · 0) + 2(3 · 0 + 2(1 + λ))
= (1 − λ)(λ^2 − 1) + 4 + 4λ = −λ^3 + λ^2 + 5λ + 3 = −(λ + 1)^2(λ − 3).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = −1 and λ2 = 3.
(c) To find the eigenspace corresponding to λ1 = −1 we must find the null space of
A + I = [2 0 2; 3 0 3; 2 0 2].
Row reduce this matrix:
[A + I | 0] = [2 0 2 0; 3 0 3 0; 2 0 2 0] → [1 0 1 0; 0 0 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ1 are of the form
[−s; t; s] = s[−1; 0; 1] + t[0; 1; 0],
so that E−1 = span([−1; 0; 1], [0; 1; 0]).
To find the eigenspace corresponding to λ2 = 3 we must find the null space of
A − 3I = [−2 0 2; 3 −4 3; 2 0 −2].
Row reduce this matrix:
[A − 3I | 0] = [−2 0 2 0; 3 −4 3 0; 2 0 −2 0] → [1 0 −1 0; 0 1 −3/2 0; 0 0 0 0].
Thus eigenvectors corresponding to λ2 are of the form [t; (3/2)t; t], or, clearing fractions, [2; 3; 2],
so that E3 = span([2; 3; 2]).
(d) λ1 is a double root of the characteristic polynomial, and its eigenspace has dimension 2, so its
algebraic and geometric multiplicities are both equal to 2. λ2 is a single root of the characteristic
polynomial, and its eigenspace has dimension 1, so its algebraic and geometric multiplicities are
both equal to 1.
7. (a) The characteristic polynomial of A is
det(A − λI) = det[4−λ 0 1; 2 3−λ 2; −1 0 2−λ] = −λ^3 + 9λ^2 − 27λ + 27 = −(λ − 3)^3.
(b) The eigenvalues of A are the roots of the characteristic polynomial, so the only eigenvalue is λ = 3.
(c) To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of
A − 3I = [1 0 1; 2 0 2; −1 0 −1].
Row reduce this matrix:
[A − 3I | 0] = [1 0 1 0; 2 0 2 0; −1 0 −1 0] → [1 0 1 0; 0 0 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form
[−t; s; t] = s[0; 1; 0] + t[−1; 0; 1],
so that E3 = span([0; 1; 0], [−1; 0; 1]).
(d) Since λ = 3 is a triple root of the characteristic polynomial, it has algebraic multiplicity 3; since
its eigenspace is two-dimensional, it has geometric multiplicity 2.
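The gap between algebraic and geometric multiplicity here can be seen numerically (A as reconstructed above): the characteristic polynomial has a triple root at 3, while A − 3I has rank 1, so its null space is only two-dimensional.

```python
import numpy as np

A = np.array([[4.0, 0.0, 1.0],
              [2.0, 3.0, 2.0],
              [-1.0, 0.0, 2.0]])

# Algebraic multiplicity 3: det(lambda I - A) = (lambda - 3)^3.
assert np.allclose(np.poly(A), [1, -9, 27, -27])

# Geometric multiplicity: dim null(A - 3I) = 3 - rank(A - 3I) = 2.
assert 3 - np.linalg.matrix_rank(A - 3 * np.eye(3)) == 2
```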
8. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ −1 −1; 0 2−λ 0; −1 −1 1−λ] = (2 − λ) det[1−λ −1; −1 1−λ]
= (2 − λ)((1 − λ)^2 − (−1)(−1)) = (2 − λ)(λ^2 − 2λ) = −λ(λ − 2)^2.
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 0 and λ2 = 2.
(c) To find the eigenspace corresponding to λ1 = 0 we must find the null space of
A + 0I = [1 −1 −1; 0 2 0; −1 −1 1].
Row reduce this matrix:
[A | 0] = [1 −1 −1 0; 0 2 0 0; −1 −1 1 0] → [1 0 −1 0; 0 1 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ1 are of the form [s; 0; s] = s[1; 0; 1], so that E0 = span([1; 0; 1]).
To find the eigenspace corresponding to λ2 = 2 we must find the null space of
A − 2I = [−1 −1 −1; 0 0 0; −1 −1 −1].
Row reduce this matrix:
[A − 2I | 0] = [−1 −1 −1 0; 0 0 0 0; −1 −1 −1 0] → [1 1 1 0; 0 0 0 0; 0 0 0 0].
Thus eigenvectors corresponding to λ2 are of the form
[−s − t; s; t] = s[−1; 1; 0] + t[−1; 0; 1],
so that E2 = span([−1; 1; 0], [−1; 0; 1]).
(d) Since λ1 = 0 is a single root of the characteristic polynomial and its eigenspace has dimension
1, its algebraic and geometric multiplicities are both equal to 1. Since λ2 = 2 is a double root
of the characteristic polynomial and its eigenspace has dimension 2, its algebraic and geometric
multiplicities are both equal to 2.
9. (a) The characteristic polynomial of A is
det(A − λI) = det[3−λ 1 0 0; −1 1−λ 0 0; 0 0 1−λ 4; 0 0 1 1−λ] = λ^4 − 6λ^3 + 9λ^2 + 4λ − 12 = (λ + 1)(λ − 2)^2(λ − 3).
(b) The eigenvalues of A are the roots of the characteristic polynomial, so the eigenvalues are λ = −1,
λ = 2, and λ = 3.
(c) To find the eigenspace corresponding to the eigenvalue λ = −1 we must find the null space of
A + I = [4 1 0 0; −1 2 0 0; 0 0 2 4; 0 0 1 2].
Row reduce this matrix:
[A + I | 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 2 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = −1 are of the form [0; 0; −2t; t], so that E−1 = span([0; 0; −2; 1]).
To find the eigenspace corresponding to the eigenvalue λ = 2 we must find the null space of
A − 2I = [1 1 0 0; −1 −1 0 0; 0 0 −1 4; 0 0 1 −1].
Row reduce this matrix:
[A − 2I | 0] → [1 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = 2 are of the form [−s; s; 0; 0], so that E2 = span([−1; 1; 0; 0]).
To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of
A − 3I = [0 1 0 0; −1 −2 0 0; 0 0 −2 4; 0 0 1 −2].
Row reduce this matrix:
[A − 3I | 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 −2 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form [0; 0; 2t; t], so that E3 = span([0; 0; 2; 1]).
(d) Since λ = 3 and λ = −1 are single roots of the characteristic polynomial, they each have algebraic
multiplicity 1; their eigenspaces are each one-dimensional, so those two eigenvalues have geometric
multiplicity 1 as well. Since λ = 2 is a double root of the characteristic polynomial, it has algebraic
multiplicity 2; since its eigenspace is one-dimensional, it only has geometric multiplicity 1.
10. (a) The characteristic polynomial of A is
det(A − λI) = det[2−λ 1 1 0; 0 1−λ 4 5; 0 0 3−λ 1; 0 0 0 2−λ].
Since this is a triangular matrix, its determinant is the product of the diagonal entries by Theorem
4.2, so the characteristic polynomial is (2 − λ)^2(1 − λ)(3 − λ).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 1, λ2 = 2,
and λ3 = 3.
(c) To find the eigenspace corresponding to λ1 = 1 we must find the null space of
A − I = [1 1 1 0; 0 0 4 5; 0 0 2 1; 0 0 0 1].
Row reduce this matrix:
[A − I | 0] → [1 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ1 are of the form [−t; t; 0; 0] = t[−1; 1; 0; 0], so that
E1 = span([−1; 1; 0; 0]).
To find the eigenspace corresponding to λ2 = 2 we must find the null space of
A − 2I = [0 1 1 0; 0 −1 4 5; 0 0 1 1; 0 0 0 0].
Row reduce this matrix:
[A − 2I | 0] → [0 1 0 −1 0; 0 0 1 1 0; 0 0 0 0 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ2 are of the form
[s; t; −t; t] = s[1; 0; 0; 0] + t[0; 1; −1; 1],
so that E2 = span([1; 0; 0; 0], [0; 1; −1; 1]).
To find the eigenspace corresponding to λ3 = 3 we must find the null space of
A − 3I = [−1 1 1 0; 0 −2 4 5; 0 0 0 1; 0 0 0 0].
Row reduce this matrix:
[A − 3I | 0] → [1 0 −3 0 0; 0 1 −2 0 0; 0 0 0 1 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ3 are of the form [3t; 2t; t; 0] = t[3; 2; 1; 0], so that
E3 = span([3; 2; 1; 0]).
(d) Since λ1 = 1 is a single root of the characteristic polynomial and its eigenspace has dimension
1, its algebraic and geometric multiplicities are both equal to 1. Since λ2 = 2 is a double root
of the characteristic polynomial and its eigenspace has dimension 2, its algebraic and geometric
multiplicities are both equal to 2. Since λ3 = 3 is a single root of the characteristic polynomial
and its eigenspace has dimension 1, its algebraic and geometric multiplicities are both equal to 1.
11. (a) The characteristic polynomial of A is
det(A − λI) = det[1−λ 0 0 0; 0 1−λ 0 0; 1 1 3−λ 0; −2 1 2 −1−λ] = (λ − 1)^2(λ − 3)(λ + 1).
(b) The eigenvalues of A are the roots of the characteristic polynomial, so the eigenvalues are λ = −1,
λ = 1, and λ = 3.
(c) To find the eigenspace corresponding to the eigenvalue λ = −1 we must find the null space of
A + I = [2 0 0 0; 0 2 0 0; 1 1 4 0; −2 1 2 0].
Row reduce this matrix:
[A + I | 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = −1 are of the form [0; 0; 0; t], so that E−1 = span([0; 0; 0; 1]).
To find the eigenspace corresponding to the eigenvalue λ = 1 we must find the null space of
A − I = [0 0 0 0; 0 0 0 0; 1 1 2 0; −2 1 2 −2].
Row reduce this matrix:
[A − I | 0] → [1 0 0 2/3 0; 0 1 2 −2/3 0; 0 0 0 0 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = 1 are of the form
[−(2/3)t; −2s + (2/3)t; s; t] = s[0; −2; 1; 0] + t[−2/3; 2/3; 0; 1],
so that, clearing fractions,
E1 = span([0; −2; 1; 0], [−2; 2; 0; 3]).
To find the eigenspace corresponding to the eigenvalue λ = 3 we must find the null space of
A − 3I = [−2 0 0 0; 0 −2 0 0; 1 1 0 0; −2 1 2 −4].
Row reduce this matrix:
[A − 3I | 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 −2 0; 0 0 0 0 0].
Thus eigenvectors corresponding to λ = 3 are of the form [0; 0; 2t; t], so that
E3 = span([0; 0; 2; 1]).
(d) Since λ = 3 and λ = −1 are single roots of the characteristic polynomial, they each have algebraic
multiplicity 1; their eigenspaces are each one-dimensional, so those two eigenvalues have geometric
multiplicity 1 as well. Since λ = 1 is a double root of the characteristic polynomial, it has algebraic
multiplicity 2; since its eigenspace is two-dimensional, it also has geometric multiplicity 2.
4.3. EIGENVALUES AND EIGENVECTORS OF N × N MATRICES
12. (a) The characteristic polynomial of A is
$$\det(A-\lambda I)=\begin{vmatrix}4-\lambda&0&1&0\\0&4-\lambda&1&1\\0&0&1-\lambda&2\\0&0&3&-\lambda\end{vmatrix}=-3\begin{vmatrix}4-\lambda&0&0\\0&4-\lambda&1\\0&0&2\end{vmatrix}+(-\lambda)\begin{vmatrix}4-\lambda&0&1\\0&4-\lambda&1\\0&0&1-\lambda\end{vmatrix}$$
$$=-6(4-\lambda)^2-\lambda(1-\lambda)(4-\lambda)^2=(4-\lambda)^2(\lambda^2-\lambda-6)=(4-\lambda)^2(\lambda-3)(\lambda+2),$$
where we have expanded along the last row; since each of the 3 × 3 matrices is upper triangular, by Theorem 4.2 their determinants are the product of their diagonal elements.
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 4, λ2 = 3,
and λ3 = −2.
(c) To find the eigenspace corresponding to λ1 = 4 we must find the null space of
$$A-4I=\begin{bmatrix}0&0&1&0\\0&0&1&1\\0&0&-3&2\\0&0&3&-4\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A-4I\mid\mathbf{0}\,]=\left[\begin{array}{cccc|c}0&0&1&0&0\\0&0&1&1&0\\0&0&-3&2&0\\0&0&3&-4&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}0&0&1&0&0\\0&0&0&1&0\\0&0&0&0&0\\0&0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ1 are of the form
$$\begin{bmatrix}s\\t\\0\\0\end{bmatrix}=s\begin{bmatrix}1\\0\\0\\0\end{bmatrix}+t\begin{bmatrix}0\\1\\0\\0\end{bmatrix},\qquad\text{so that}\qquad E_4=\operatorname{span}\left(\begin{bmatrix}1\\0\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\\0\end{bmatrix}\right).$$
To find the eigenspace corresponding to λ2 = 3 we must find the null space of
$$A-3I=\begin{bmatrix}1&0&1&0\\0&1&1&1\\0&0&-2&2\\0&0&3&-3\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A-3I\mid\mathbf{0}\,]=\left[\begin{array}{cccc|c}1&0&1&0&0\\0&1&1&1&0\\0&0&-2&2&0\\0&0&3&-3&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}1&0&0&1&0\\0&1&0&2&0\\0&0&1&-1&0\\0&0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ2 are of the form
$$\begin{bmatrix}-t\\-2t\\t\\t\end{bmatrix}=t\begin{bmatrix}-1\\-2\\1\\1\end{bmatrix},\qquad\text{so that}\qquad E_3=\operatorname{span}\left(\begin{bmatrix}-1\\-2\\1\\1\end{bmatrix}\right).$$
To find the eigenspace corresponding to λ3 = −2 we must find the null space of
$$A+2I=\begin{bmatrix}6&0&1&0\\0&6&1&1\\0&0&3&2\\0&0&3&2\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A+2I\mid\mathbf{0}\,]=\left[\begin{array}{cccc|c}6&0&1&0&0\\0&6&1&1&0\\0&0&3&2&0\\0&0&3&2&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}1&0&0&-\frac19&0\\0&1&0&\frac{1}{18}&0\\0&0&1&\frac23&0\\0&0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ3 are of the form
$$\begin{bmatrix}\frac19 t\\[2pt]-\frac{1}{18}t\\[2pt]-\frac23 t\\[2pt]t\end{bmatrix}=\begin{bmatrix}2s\\-s\\-12s\\18s\end{bmatrix}=s\begin{bmatrix}2\\-1\\-12\\18\end{bmatrix},\qquad\text{so that}\qquad E_{-2}=\operatorname{span}\left(\begin{bmatrix}2\\-1\\-12\\18\end{bmatrix}\right),$$
where we have set t = 18s to clear fractions.
(d) Since λ1 = 4 is a double root of the characteristic polynomial and its eigenspace has dimension
2, its algebraic and geometric multiplicities are both equal to 2. Since λ2 = 3 is a single root
of the characteristic polynomial and its eigenspace has dimension 1, its algebraic and geometric
multiplicities are both equal to 1. Since λ3 = −2 is a single root of the characteristic polynomial
and its eigenspace has dimension 1, its algebraic and geometric multiplicities are both equal to 1.
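As a supplementary sanity check (not part of the manual's solution), the four eigenpairs above can be verified with plain Python, assuming the matrix A reconstructed here from the shifted matrices in part (c).

```python
def matvec(A, v):
    # multiply a matrix (list of rows) by a vector
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# A inferred from A - 4I, A - 3I, and A + 2I above (an assumption of this check)
A = [[4, 0, 1, 0], [0, 4, 1, 1], [0, 0, 1, 2], [0, 0, 3, 0]]

pairs = [(4, [1, 0, 0, 0]), (4, [0, 1, 0, 0]),
         (3, [-1, -2, 1, 1]), (-2, [2, -1, -12, 18])]
for lam, v in pairs:
    assert matvec(A, v) == [lam * x for x in v]
```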
13. We must show that if A is invertible and Ax = λx, then $A^{-1}\mathbf{x} = \frac{1}{\lambda}\mathbf{x} = \lambda^{-1}\mathbf{x}$. First note that since A is invertible, 0 is not an eigenvalue, so that λ ≠ 0 and this statement makes sense. Now,
$$\mathbf{x} = (A^{-1}A)\mathbf{x} = A^{-1}(A\mathbf{x}) = A^{-1}(\lambda\mathbf{x}) = \lambda(A^{-1}\mathbf{x}).$$
Dividing through by λ gives $A^{-1}\mathbf{x} = \lambda^{-1}\mathbf{x}$ as required.
14. The remark referred to in Section 3.3 tells us that if A is invertible and n > 0 is an integer, then $A^{-n} = (A^{-1})^n = (A^n)^{-1}$. Now, part (a) gives us the result in part (c) when n > 0 is an integer. When n = 0, the statement says that λ⁰ = 1 is an eigenvalue of A⁰ = I, which is true. So it remains to prove the statement for n < 0. We use induction on n. For n = −1, this is the statement that if Ax = λx, then $A^{-1}\mathbf{x} = \lambda^{-1}\mathbf{x}$, which is Theorem 4.18(b). Now assume the statement holds for n = −k, and prove it for n = −(k + 1). Note that by the remark in Section 3.3,
$$A^{-(k+1)} = (A^{k+1})^{-1} = (A^kA)^{-1} = A^{-1}(A^k)^{-1} = A^{-1}A^{-k}.$$
Now
$$A^{-(k+1)}\mathbf{x} = A^{-1}(A^{-k}\mathbf{x}) = A^{-1}(\lambda^{-k}\mathbf{x}) = \lambda^{-k}A^{-1}\mathbf{x} = \lambda^{-k}\lambda^{-1}\mathbf{x} = \lambda^{-(k+1)}\mathbf{x},$$
where the second equality uses the inductive hypothesis and the fourth uses the case n = −1, and we are done.
15. If we can express x as a linear combination of the eigenvectors, it will be easy to compute A¹⁰x since we know what A does to the eigenvectors. To so express x, we solve the system
$$\begin{bmatrix}\mathbf{v}_1&\mathbf{v}_2\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}=\begin{bmatrix}5\\1\end{bmatrix},\qquad\text{which is}\qquad\begin{aligned}c_1+c_2&=5\\-c_1+c_2&=1.\end{aligned}$$
This has the solution c₁ = 2 and c₂ = 3, so that x = 2v₁ + 3v₂. But by Theorem 4.4(a), v₁ and v₂ are eigenvectors of A¹⁰ with eigenvalues $\lambda_1^{10}$ and $\lambda_2^{10}$. Thus
$$A^{10}\mathbf{x} = A^{10}(2\mathbf{v}_1+3\mathbf{v}_2) = 2A^{10}\mathbf{v}_1+3A^{10}\mathbf{v}_2 = 2\lambda_1^{10}\mathbf{v}_1+3\lambda_2^{10}\mathbf{v}_2 = \frac{1}{512}\mathbf{v}_1+3072\,\mathbf{v}_2.$$
Substituting for v₁ and v₂ gives
$$A^{10}\mathbf{x} = \begin{bmatrix}3072+\frac{1}{512}\\[2pt]3072-\frac{1}{512}\end{bmatrix}.$$
16. As in Exercise 15,
$$A^k\mathbf{x} = 2\cdot\frac{1}{2^k}\,\mathbf{v}_1+3\cdot 2^k\,\mathbf{v}_2 = \begin{bmatrix}3\cdot 2^k+2^{1-k}\\3\cdot 2^k-2^{1-k}\end{bmatrix}.$$
As k → ∞, the $2^{1-k}$ terms approach zero, so that $A^k\mathbf{x} \approx 3\cdot 2^k\,\mathbf{v}_2$.
17. We want to express x as a linear combination of the eigenvectors, say x = c₁v₁ + c₂v₂ + c₃v₃, so we want a solution of the linear system
$$\begin{aligned}c_1+c_2+c_3&=2\\c_2+c_3&=1\\c_3&=2.\end{aligned}$$
Working from the bottom and substituting gives c₃ = 2, c₂ = −1, and c₁ = 1, so that x = v₁ − v₂ + 2v₃. Then
$$A^{20}\mathbf{x} = A^{20}(\mathbf{v}_1-\mathbf{v}_2+2\mathbf{v}_3) = A^{20}\mathbf{v}_1-A^{20}\mathbf{v}_2+2A^{20}\mathbf{v}_3 = \lambda_1^{20}\mathbf{v}_1-\lambda_2^{20}\mathbf{v}_2+2\lambda_3^{20}\mathbf{v}_3$$
$$=\left(-\frac13\right)^{20}\begin{bmatrix}1\\0\\0\end{bmatrix}-\left(\frac13\right)^{20}\begin{bmatrix}1\\1\\0\end{bmatrix}+2\cdot 1^{20}\begin{bmatrix}1\\1\\1\end{bmatrix} = \begin{bmatrix}2\\[2pt]2-\frac{1}{3^{20}}\\[2pt]2\end{bmatrix}.$$
18. As in Exercise 17,
$$A^k\mathbf{x} = A^k(\mathbf{v}_1-\mathbf{v}_2+2\mathbf{v}_3) = A^k\mathbf{v}_1-A^k\mathbf{v}_2+2A^k\mathbf{v}_3 = \lambda_1^k\mathbf{v}_1-\lambda_2^k\mathbf{v}_2+2\lambda_3^k\mathbf{v}_3$$
$$=\left(-\frac13\right)^{k}\begin{bmatrix}1\\0\\0\end{bmatrix}-\left(\frac13\right)^{k}\begin{bmatrix}1\\1\\0\end{bmatrix}+2\cdot 1^{k}\begin{bmatrix}1\\1\\1\end{bmatrix} = \begin{bmatrix}\pm\frac{1}{3^k}-\frac{1}{3^k}+2\\[2pt]-\frac{1}{3^k}+2\\[2pt]2\end{bmatrix}.$$
As k → ∞, the $\frac{1}{3^k}$ terms approach zero, and $A^k\mathbf{x}\to\begin{bmatrix}2\\2\\2\end{bmatrix}$.
19. (a) Since $(\lambda I)^T = \lambda I^T = \lambda I$, we have
$$A^T-\lambda I = A^T-(\lambda I)^T = (A-\lambda I)^T,$$
so that using Theorem 4.10,
$$\det(A^T-\lambda I) = \det\left((A-\lambda I)^T\right) = \det(A-\lambda I).$$
But the left-hand side is the characteristic polynomial of $A^T$, and the right-hand side is the characteristic polynomial of A. Thus the two matrices have the same characteristic polynomial and thus the same eigenvalues.
(b) There are many examples. For instance,
$$A = \begin{bmatrix}1&0\\1&2\end{bmatrix}$$
has eigenvalues λ₁ = 1 and λ₂ = 2, with eigenspaces
$$E_1 = \operatorname{span}\left(\begin{bmatrix}-1\\1\end{bmatrix}\right),\qquad E_2 = \operatorname{span}\left(\begin{bmatrix}0\\1\end{bmatrix}\right).$$
$A^T$ has the same eigenvalues by part (a), but its eigenspaces are
$$E_1 = \operatorname{span}\left(\begin{bmatrix}1\\0\end{bmatrix}\right),\qquad E_2 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right).$$
20. By Theorem 4.18(b), if λ is an eigenvalue of A, then $\lambda^m$ is an eigenvalue of $A^m = O$. But the only eigenvalue of O is zero, since $A^m = O$ takes any vector to zero. Thus $\lambda^m = 0$, so that λ = 0.
21. By Theorem 4.18(a), if λ is an eigenvalue of A with eigenvector x, then λ² is an eigenvalue of A² = A with eigenvector x. Since different eigenvalues have linearly independent eigenvectors by Theorem 4.20, we must have λ² = λ, so that λ = 0 or λ = 1.
22. If Av = λv, then Av − cIv = λv − cIv, so that (A − cI)v = (λ − c)v. But this equation means exactly that v is an eigenvector of A − cI with eigenvalue λ − c.
23. (a) The characteristic polynomial of A is
$$\det(A-\lambda I) = \begin{vmatrix}3-\lambda&2\\5&-\lambda\end{vmatrix} = \lambda^2-3\lambda-10 = (\lambda-5)(\lambda+2).$$
So the eigenvalues of A are λ₁ = 5 and λ₂ = −2. For λ₁, we have
$$A-5I = \begin{bmatrix}-2&2\\5&-5\end{bmatrix}\longrightarrow\begin{bmatrix}1&-1\\0&0\end{bmatrix},$$
and for λ₂, we have
$$A+2I = \begin{bmatrix}5&2\\5&2\end{bmatrix}\longrightarrow\begin{bmatrix}1&\frac25\\0&0\end{bmatrix}.$$
It follows that
$$E_5 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad E_{-2} = \operatorname{span}\left(\begin{bmatrix}-\frac25\\1\end{bmatrix}\right) = \operatorname{span}\left(\begin{bmatrix}-2\\5\end{bmatrix}\right).$$
(b) By Theorem 4.4(b), $A^{-1}$ has eigenvalues $\frac15$ and $-\frac12$, with
$$E_{1/5} = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad E_{-1/2} = \operatorname{span}\left(\begin{bmatrix}-2\\5\end{bmatrix}\right).$$
By Exercise 22, A − 2I has eigenvalues 5 − 2 = 3 and −2 − 2 = −4, with
$$E_3 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad E_{-4} = \operatorname{span}\left(\begin{bmatrix}-2\\5\end{bmatrix}\right).$$
Also by Exercise 22, A + 2I = A − (−2I) has eigenvalues 5 + 2 = 7 and −2 + 2 = 0, with
$$E_7 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad E_0 = \operatorname{span}\left(\begin{bmatrix}-2\\5\end{bmatrix}\right).$$
24. (a) There are many examples. For instance, let
$$A = \begin{bmatrix}0&1\\1&0\end{bmatrix},\qquad B = \begin{bmatrix}1&1\\0&1\end{bmatrix}.$$
Then the characteristic equation of A is λ² − 1, so that A has eigenvalues λ = ±1. The characteristic equation of B is (µ − 1)², so that B has eigenvalue µ = 1. But
$$A+B = \begin{bmatrix}1&2\\1&1\end{bmatrix}$$
has characteristic polynomial λ² − 2λ − 1, so it has eigenvalues $1\pm\sqrt2$. The possible values of λ + µ are 0 and 2.
(b) Use the same A and B as above. Then
$$AB = \begin{bmatrix}0&1\\1&1\end{bmatrix},$$
with characteristic polynomial λ² − λ − 1 and eigenvalues $\frac12(1\pm\sqrt5)$, which again do not match the possible values of λµ, namely 1 and −1.
(c) If λ and µ both have eigenvector x, then
(A + B)x = Ax + Bx = λx + µx = (λ + µ)x,
which shows that λ + µ is an eigenvalue of A + B with eigenvector x. For the product, we have
(AB)x = A(Bx) = A(µx) = µAx = µλx = λµx,
showing that λµ is an eigenvalue of AB with eigenvector x.
25. No, they do not. For example, take B = I₂ and let A be any 2 × 2 matrix of rank 2 that has an eigenvalue other than 1, for example
$$A = \begin{bmatrix}0&2\\2&0\end{bmatrix},$$
which has eigenvalues 2 and −2. Since A has rank 2, it row-reduces to I₂, so A and I₂ are row-equivalent but do not have the same eigenvalues.
26. By definition, the companion matrix of p(x) = x² − 7x + 12 is
$$C(p) = \begin{bmatrix}-(-7)&-12\\1&0\end{bmatrix} = \begin{bmatrix}7&-12\\1&0\end{bmatrix},$$
which has characteristic polynomial
$$\begin{vmatrix}7-\lambda&-12\\1&-\lambda\end{vmatrix} = (7-\lambda)(-\lambda)+12 = \lambda^2-7\lambda+12.$$
27. By definition, the companion matrix of p(x) = x³ + 3x² − 4x + 12 is
$$C(p) = \begin{bmatrix}-3&-(-4)&-12\\1&0&0\\0&1&0\end{bmatrix} = \begin{bmatrix}-3&4&-12\\1&0&0\\0&1&0\end{bmatrix},$$
which has characteristic polynomial (expanding along the third column)
$$\begin{vmatrix}-3-\lambda&4&-12\\1&-\lambda&0\\0&1&-\lambda\end{vmatrix} = -12\begin{vmatrix}1&-\lambda\\0&1\end{vmatrix}+(-\lambda)\begin{vmatrix}-3-\lambda&4\\1&-\lambda\end{vmatrix}$$
$$= -12-\lambda\left((-3-\lambda)(-\lambda)-4\right) = -\lambda^3-3\lambda^2+4\lambda-12.$$
28. (a) If p(x) = x² + ax + b, then
$$C(p) = \begin{bmatrix}-a&-b\\1&0\end{bmatrix},$$
so the characteristic polynomial of C(p) is
$$\begin{vmatrix}-a-\lambda&-b\\1&-\lambda\end{vmatrix} = (-a-\lambda)(-\lambda)+b = \lambda^2+a\lambda+b.$$
(b) Suppose that λ is an eigenvalue of C(p), and that a corresponding eigenvector is $\mathbf{x} = \begin{bmatrix}x_1\\x_2\end{bmatrix}$. Then $\lambda\begin{bmatrix}x_1\\x_2\end{bmatrix} = C(p)\begin{bmatrix}x_1\\x_2\end{bmatrix}$ means that
$$\begin{bmatrix}\lambda x_1\\\lambda x_2\end{bmatrix} = \begin{bmatrix}-a&-b\\1&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}-ax_1-bx_2\\x_1\end{bmatrix},$$
so that $x_1 = \lambda x_2$. Setting $x_2 = 1$ gives an eigenvector $\begin{bmatrix}\lambda\\1\end{bmatrix}$.
29. (a) If p(x) = x³ + ax² + bx + c, then
$$C(p) = \begin{bmatrix}-a&-b&-c\\1&0&0\\0&1&0\end{bmatrix},$$
so the characteristic polynomial of C(p) is (expanding along the third column)
$$\begin{vmatrix}-a-\lambda&-b&-c\\1&-\lambda&0\\0&1&-\lambda\end{vmatrix} = -c\begin{vmatrix}1&-\lambda\\0&1\end{vmatrix}+(-\lambda)\begin{vmatrix}-a-\lambda&-b\\1&-\lambda\end{vmatrix}$$
$$= -c-\lambda\left((-a-\lambda)(-\lambda)+b\right) = -\lambda^3-a\lambda^2-b\lambda-c.$$
(b) Suppose that λ is an eigenvalue of C(p), and that a corresponding eigenvector is $\mathbf{x} = \begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}$. Then $\lambda\mathbf{x} = C(p)\mathbf{x}$ means that
$$\begin{bmatrix}\lambda x_1\\\lambda x_2\\\lambda x_3\end{bmatrix} = \begin{bmatrix}-a&-b&-c\\1&0&0\\0&1&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}-ax_1-bx_2-cx_3\\x_1\\x_2\end{bmatrix},$$
so that $x_1 = \lambda x_2$ and $x_2 = \lambda x_3$. Setting $x_3 = 1$ gives an eigenvector $\begin{bmatrix}\lambda^2\\\lambda\\1\end{bmatrix}$.
30. Using Exercise 28, the characteristic polynomial of the matrix
$$C(p) = \begin{bmatrix}-a&-b\\1&0\end{bmatrix}$$
is λ² + aλ + b. If we want this matrix to have eigenvalues 2 and 5, then λ² + aλ + b = (λ − 2)(λ − 5) = λ² − 7λ + 10, so we can take
$$A = \begin{bmatrix}7&-10\\1&0\end{bmatrix}.$$
31. Using Exercise 29, the characteristic polynomial of the matrix
$$C(p) = \begin{bmatrix}-a&-b&-c\\1&0&0\\0&1&0\end{bmatrix}$$
is −(λ³ + aλ² + bλ + c). If we want this matrix to have eigenvalues −2, 1, and 3, then
$$\lambda^3+a\lambda^2+b\lambda+c = (\lambda+2)(\lambda-1)(\lambda-3) = \lambda^3-2\lambda^2-5\lambda+6,$$
so we can take
$$A = \begin{bmatrix}2&5&-6\\1&0&0\\0&1&0\end{bmatrix}.$$
32. (a) Exercises 28 and 29 provide a basis for the induction, proving the cases n = 2 and n = 3. So assume that the statement holds for n = k; that is, that
$$\begin{vmatrix}-a_{k-1}-\lambda&-a_{k-2}&\cdots&-a_1&-a_0\\1&-\lambda&\cdots&0&0\\0&1&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&-\lambda\end{vmatrix} = (-1)^k p(\lambda),$$
where $p(x) = x^k+a_{k-1}x^{k-1}+\cdots+a_1x+a_0$. Now suppose that $q(x) = x^{k+1}+b_kx^k+\cdots+b_1x+b_0$ has degree k + 1; then
$$\det(C(q)-\lambda I) = \begin{vmatrix}-b_k-\lambda&-b_{k-1}&\cdots&-b_1&-b_0\\1&-\lambda&\cdots&0&0\\0&1&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&-\lambda\end{vmatrix}.$$
Evaluate this determinant by expanding along the last column, giving
$$\det(C(q)-\lambda I) = (-1)^k(-b_0)\begin{vmatrix}1&-\lambda&\cdots&0\\0&1&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&1\end{vmatrix}+(-\lambda)\begin{vmatrix}-b_k-\lambda&-b_{k-1}&\cdots&-b_2&-b_1\\1&-\lambda&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&-\lambda\end{vmatrix}.$$
The first determinant is of an upper triangular matrix, so it is the product of its diagonal entries, which is 1. The second determinant is the characteristic polynomial of the companion matrix of $x^k+b_kx^{k-1}+\cdots+b_2x+b_1$, which is a kth degree polynomial. So by the inductive hypothesis, this determinant is $(-1)^k(\lambda^k+b_k\lambda^{k-1}+\cdots+b_2\lambda+b_1)$. So the entire expression becomes
$$\det(C(q)-\lambda I) = (-1)^k(-b_0)-\lambda(-1)^k(\lambda^k+b_k\lambda^{k-1}+\cdots+b_2\lambda+b_1)$$
$$= (-1)^{k+1}b_0+(-1)^{k+1}(\lambda^{k+1}+b_k\lambda^k+\cdots+b_2\lambda^2+b_1\lambda) = (-1)^{k+1}q(\lambda).$$
(b) The method is quite similar to Exercises 28(b) and 29(b). Suppose that λ is an eigenvalue of C(p) with eigenvector
$$\mathbf{x} = \begin{bmatrix}x_1\\x_2\\\vdots\\x_{n-1}\\x_n\end{bmatrix}.$$
Then λx = C(p)x, so that
$$\begin{bmatrix}\lambda x_1\\\lambda x_2\\\vdots\\\lambda x_{n-1}\\\lambda x_n\end{bmatrix} = \begin{bmatrix}-a_{n-1}&-a_{n-2}&\cdots&-a_1&-a_0\\1&0&\cdots&0&0\\0&1&\cdots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\\\vdots\\x_{n-1}\\x_n\end{bmatrix} = \begin{bmatrix}-a_{n-1}x_1-a_{n-2}x_2-\cdots-a_1x_{n-1}-a_0x_n\\x_1\\x_2\\\vdots\\x_{n-1}\end{bmatrix}.$$
But this means that $x_i = \lambda x_{i+1}$ for i = 1, 2, . . . , n − 1. Setting $x_n = 1$ gives an eigenvector
$$\mathbf{x} = \begin{bmatrix}\lambda^{n-1}\\\lambda^{n-2}\\\vdots\\\lambda\\1\end{bmatrix}.$$
33. The characteristic polynomial of A is
$$c_A(\lambda) = \det(A-\lambda I) = \begin{vmatrix}1-\lambda&-1\\2&3-\lambda\end{vmatrix} = (1-\lambda)(3-\lambda)+2 = \lambda^2-4\lambda+5.$$
Now,
$$A^2 = \begin{bmatrix}1&-1\\2&3\end{bmatrix}\begin{bmatrix}1&-1\\2&3\end{bmatrix} = \begin{bmatrix}-1&-4\\8&7\end{bmatrix},$$
so that
$$c_A(A) = A^2-4A+5I_2 = \begin{bmatrix}-1&-4\\8&7\end{bmatrix}-4\begin{bmatrix}1&-1\\2&3\end{bmatrix}+5\begin{bmatrix}1&0\\0&1\end{bmatrix} = O.$$
34. The characteristic polynomial of A is
$$c_A(\lambda) = \det(A-\lambda I) = \begin{vmatrix}1-\lambda&1&0\\1&-\lambda&1\\0&1&1-\lambda\end{vmatrix} = -\lambda^3+2\lambda^2+\lambda-2.$$
Now,
$$A^2 = \begin{bmatrix}1&1&0\\1&0&1\\0&1&1\end{bmatrix}^2 = \begin{bmatrix}2&1&1\\1&2&1\\1&1&2\end{bmatrix},\qquad A^3 = \begin{bmatrix}1&1&0\\1&0&1\\0&1&1\end{bmatrix}^3 = \begin{bmatrix}3&3&2\\3&2&3\\2&3&3\end{bmatrix},$$
so that
$$c_A(A) = -A^3+2A^2+A-2I_3 = -\begin{bmatrix}3&3&2\\3&2&3\\2&3&3\end{bmatrix}+2\begin{bmatrix}2&1&1\\1&2&1\\1&1&2\end{bmatrix}+\begin{bmatrix}1&1&0\\1&0&1\\0&1&1\end{bmatrix}-2\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}$$
$$= \begin{bmatrix}-3+4+1-2&-3+2+1-0&-2+2+0-0\\-3+2+1-0&-2+4+0-2&-3+2+1-0\\-2+2+0-0&-3+2+1-0&-3+4+1-2\end{bmatrix} = O.$$
35. Since A² − 4A + 5I = O, we have A² = 4A − 5I, and
$$A^3 = A(4A-5I) = 4A^2-5A = 4(4A-5I)-5A = 11A-20I,$$
$$A^4 = (A^2)(A^2) = (4A-5I)(4A-5I) = 16A^2-40A+25I = 16(4A-5I)-40A+25I = 24A-55I.$$
36. Since −A³ + 2A² + A − 2I = O, we have A³ = 2A² + A − 2I, and
$$A^4 = A(A^3) = A(2A^2+A-2I) = 2A^3+A^2-2A = 2(2A^2+A-2I)+A^2-2A = 5A^2-4I.$$
37. Since A² − 4A + 5I = O, we get A(A − 4I) = −5I. Then solving for $A^{-1}$ yields
$$A^{-1} = -\frac15(A-4I) = -\frac15A+\frac45I,$$
$$A^{-2} = \left(A^{-1}\right)^2 = \left(-\frac15A+\frac45I\right)^2 = \frac{1}{25}A^2-\frac{8}{25}A+\frac{16}{25}I = \frac{1}{25}(4A-5I)-\frac{8}{25}A+\frac{16}{25}I = -\frac{4}{25}A+\frac{11}{25}I.$$
38. Since −A³ + 2A² + A − 2I = O, rearranging terms gives A(−A² + 2A + I) = 2I. Then solving for $A^{-1}$ yields (using Exercise 36)
$$A^{-1} = \frac12(-A^2+2A+I) = -\frac12A^2+A+\frac12I,$$
$$A^{-2} = \left(A^{-1}\right)^2 = \left(-\frac12A^2+A+\frac12I\right)^2 = \frac14A^4-A^3+\frac12A^2+A+\frac14I$$
$$= \frac14(5A^2-4I)-(2A^2+A-2I)+\frac12A^2+A+\frac14I = -\frac14A^2+\frac54I.$$
39. Apply Exercise 69 in Section 4.2 to the matrix A − λI:
$$c_A(\lambda) = \det(A-\lambda I) = \det\begin{bmatrix}P-\lambda I&Q\\O&S-\lambda I\end{bmatrix} = \det(P-\lambda I)\det(S-\lambda I) = c_P(\lambda)\,c_S(\lambda).$$
(Note that neither O nor Q is affected by subtracting an identity matrix, since all the 1's are along the diagonal, in either P or S.)
40. Start with the equation given in the problem statement:
$$\det(A-\lambda I) = (-1)^n(\lambda-\lambda_1)(\lambda-\lambda_2)\cdots(\lambda-\lambda_n).$$
This is an identity in λ, so in particular it holds for λ = 0, where it becomes
$$\det A = \det(A-0I) = (-1)^n(-\lambda_1)(-\lambda_2)\cdots(-\lambda_n) = (-1)^n(-1)^n\lambda_1\lambda_2\cdots\lambda_n = \lambda_1\lambda_2\cdots\lambda_n,$$
proving the first result. For the second result, we will find the coefficients of $\lambda^{n-1}$ on the left and right sides. On the left side, the easiest method given the tools we have is induction on n. For n = 2,
$$\det(A-\lambda I) = \det\begin{bmatrix}a_{11}-\lambda&a_{12}\\a_{21}&a_{22}-\lambda\end{bmatrix} = (a_{11}-\lambda)(a_{22}-\lambda)-a_{12}a_{21} = \lambda^2-(a_{11}+a_{22})\lambda+(a_{11}a_{22}-a_{12}a_{21}),$$
so that the coefficient of $\lambda^{n-1} = \lambda$ is $-\operatorname{tr}(A)$. We claim that in general, for n ≥ 2, the coefficient of $\lambda^{n-1}$ in det(A − λI) is $(-1)^{n-1}\operatorname{tr}(A)$. Suppose this statement holds for n = k. Then if A is (k + 1) × (k + 1), we have (expanding along the first column of A − λI)
$$\det(A-\lambda I) = (a_{11}-\lambda)\det(A_{11}-\lambda I)+\sum_{i=2}^{k+1}(-1)^{i+1}a_{i1}\det(A_{i1}-\lambda I).$$
Now, the terms in the sum on the right each consist of a constant times the determinant of a k × k matrix obtained by removing row i and column 1 from A − λI. Since i > 1, this removes both the diagonal entry $a_{11}-\lambda$ and the diagonal entry in row i, so that only k − 1 entries containing λ remain. So the sum on the right cannot produce any terms with $\lambda^k$. For the other term, note that $A_{11}-\lambda I$ is a k × k matrix; using the inductive hypothesis we see that the coefficient of $\lambda^{k-1}$ in $\det(A_{11}-\lambda I)$ is $(-1)^{k-1}\operatorname{tr}(A_{11})$. Now
$$\det(A-\lambda I) = (a_{11}-\lambda)\left((-1)^k\lambda^k+(-1)^{k-1}\operatorname{tr}(A_{11})\lambda^{k-1}+\cdots\right)+\cdots,$$
so that the coefficient of $\lambda^k$ in det(A − λI) is
$$(-1)^ka_{11}-(-1)^{k-1}\operatorname{tr}(A_{11}) = (-1)^k\left(a_{11}+\operatorname{tr}(A_{11})\right) = (-1)^k(a_{11}+a_{22}+\cdots+a_{k+1,k+1}) = (-1)^k\operatorname{tr}(A).$$
Now look at the product on the right in the original expression
$$\det(A-\lambda I) = (-1)^n(\lambda-\lambda_1)(\lambda-\lambda_2)\cdots(\lambda-\lambda_n).$$
We get a multiple of $\lambda^{n-1}$ on the right-hand side when exactly n − 1 of the factors contribute λ and one contributes $-\lambda_i$ for some i; that contribution to the $\lambda^{n-1}$ term is $-\lambda_i\lambda^{n-1}$. Summing over all possible choices of i shows that the coefficient of the $\lambda^{n-1}$ term is
$$(-1)^n(-\lambda_1-\lambda_2-\cdots-\lambda_n) = (-1)^{n+1}(\lambda_1+\lambda_2+\cdots+\lambda_n) = (-1)^{n-1}(\lambda_1+\lambda_2+\cdots+\lambda_n).$$
Since the coefficient of $\lambda^{n-1}$ on the left must be the same as that on the right, we have
$$(-1)^{n-1}\operatorname{tr}(A) = (-1)^{n-1}(\lambda_1+\lambda_2+\cdots+\lambda_n),\qquad\text{or}\qquad\operatorname{tr}(A) = \lambda_1+\lambda_2+\cdots+\lambda_n.$$
41. Suppose that the eigenvalues of A are α₁, α₂, . . . , αₙ (repetitions included) and those of B are β₁, β₂, . . . , βₙ. Let the eigenvalues of A + B be λ₁, λ₂, . . . , λₙ. Then by Exercise 40,
$$\operatorname{tr}(A+B) = \lambda_1+\lambda_2+\cdots+\lambda_n,\qquad\operatorname{tr}(A)+\operatorname{tr}(B) = \alpha_1+\cdots+\alpha_n+\beta_1+\cdots+\beta_n.$$
Since tr(A + B) = tr(A) + tr(B) by Exercise 44(a) in Section 3.2, we get
$$\lambda_1+\lambda_2+\cdots+\lambda_n = \alpha_1+\cdots+\alpha_n+\beta_1+\cdots+\beta_n,$$
as desired. Let µ₁, µ₂, . . . , µₙ be the eigenvalues of AB. Then again by Exercise 40 we have
$$\mu_1\mu_2\cdots\mu_n = \det(AB) = \det(A)\det(B) = \alpha_1\alpha_2\cdots\alpha_n\,\beta_1\beta_2\cdots\beta_n,$$
as desired.
42. Suppose $\mathbf{x} = \sum_{i=1}^n c_i\mathbf{v}_i$, where the $\mathbf{v}_i$ are eigenvectors of A with corresponding eigenvalues $\lambda_i$. Then Theorem 4.18 tells us that the $\mathbf{v}_i$ are eigenvectors of $A^k$ with corresponding eigenvalues $\lambda_i^k$, so that
$$A^k\mathbf{x} = A^k\left(\sum_{i=1}^n c_i\mathbf{v}_i\right) = \sum_{i=1}^n c_iA^k\mathbf{v}_i = \sum_{i=1}^n c_i\lambda_i^k\mathbf{v}_i.$$

4.4 Similarity and Diagonalization
1. The two characteristic polynomials are
$$c_A(\lambda) = \begin{vmatrix}4-\lambda&1\\3&1-\lambda\end{vmatrix} = (4-\lambda)(1-\lambda)-3 = \lambda^2-5\lambda+1,$$
$$c_B(\lambda) = \begin{vmatrix}1-\lambda&0\\0&1-\lambda\end{vmatrix} = (1-\lambda)^2 = \lambda^2-2\lambda+1.$$
Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22.
2. The two characteristic polynomials are
$$c_A(\lambda) = \begin{vmatrix}2-\lambda&1\\-4&6-\lambda\end{vmatrix} = (2-\lambda)(6-\lambda)+4 = \lambda^2-8\lambda+16,$$
$$c_B(\lambda) = \begin{vmatrix}3-\lambda&-1\\-5&7-\lambda\end{vmatrix} = (3-\lambda)(7-\lambda)-5 = \lambda^2-10\lambda+16.$$
Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22.
3. Both A and B are triangular matrices, so by Theorem 4.15, A has eigenvalues 2 and 4, while B has
eigenvalues 1 and 4. By Theorem 4.22, the two matrices are not similar.
4. The two characteristic polynomials are
$$c_A(\lambda) = \begin{vmatrix}1-\lambda&0&0\\2&1-\lambda&-1\\0&-1&1-\lambda\end{vmatrix} = (1-\lambda)\begin{vmatrix}1-\lambda&-1\\-1&1-\lambda\end{vmatrix} = (1-\lambda)\left((1-\lambda)^2-1\right) = (1-\lambda)(\lambda^2-2\lambda) = -\lambda^3+3\lambda^2-2\lambda,$$
$$c_B(\lambda) = \begin{vmatrix}2-\lambda&0&2\\0&1-\lambda&0\\1&0&1-\lambda\end{vmatrix} = (1-\lambda)\begin{vmatrix}2-\lambda&2\\1&1-\lambda\end{vmatrix} = (1-\lambda)\left((2-\lambda)(1-\lambda)-2\right) = (1-\lambda)(\lambda^2-3\lambda) = -\lambda^3+4\lambda^2-3\lambda.$$
Since the two characteristic polynomials are unequal, the two matrices are not similar by Theorem 4.22.
5. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus
$$\lambda_1 = 4\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right),\qquad\lambda_2 = 3\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}1\\2\end{bmatrix}\right).$$
6. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus
$$\lambda_1 = 2\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}3\\1\\2\end{bmatrix}\right),\qquad\lambda_2 = 0\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}1\\-1\\0\end{bmatrix}\right),\qquad\lambda_3 = -1\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}0\\1\\-1\end{bmatrix}\right).$$
7. By Theorem 4.23, the columns of P are the eigenvectors of A corresponding to the eigenvalues, which are the diagonal entries of D. Thus
$$\lambda_1 = 6\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}3\\2\\3\end{bmatrix}\right),\qquad\lambda_2 = -2\ \text{with eigenspace}\ \operatorname{span}\left(\begin{bmatrix}0\\1\\-1\end{bmatrix},\begin{bmatrix}1\\0\\-1\end{bmatrix}\right).$$
8. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}5-\lambda&2\\2&5-\lambda\end{vmatrix} = (5-\lambda)^2-4 = \lambda^2-10\lambda+21 = (\lambda-3)(\lambda-7).$$
Since A is a 2 × 2 matrix with 2 distinct eigenvalues, it is diagonalizable by Theorem 4.25. To find the diagonalizing matrix P, we find the eigenvectors of A. To find the eigenspace corresponding to λ₁ = 3 we must find the null space of
$$A-3I = \begin{bmatrix}2&2\\2&2\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A-3I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}2&2&0\\2&2&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&1&0\\0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ₁ are of the form
$$\begin{bmatrix}-t\\t\end{bmatrix} = t\begin{bmatrix}-1\\1\end{bmatrix},\qquad\text{so}\qquad E_3 = \operatorname{span}\left(\begin{bmatrix}-1\\1\end{bmatrix}\right).$$
To find the eigenspace corresponding to λ₂ = 7 we must find the null space of
$$A-7I = \begin{bmatrix}-2&2\\2&-2\end{bmatrix}.$$
Row reduce this matrix:
$$[\,A-7I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-2&2&0\\2&-2&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-1&0\\0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ₂ are of the form
$$\begin{bmatrix}t\\t\end{bmatrix} = t\begin{bmatrix}1\\1\end{bmatrix},\qquad\text{so}\qquad E_7 = \operatorname{span}\left(\begin{bmatrix}1\\1\end{bmatrix}\right).$$
Thus
$$P = \begin{bmatrix}-1&1\\1&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}-\frac12&\frac12\\[2pt]\frac12&\frac12\end{bmatrix},$$
and so by Theorem 4.23,
$$P^{-1}AP = \begin{bmatrix}-\frac12&\frac12\\[2pt]\frac12&\frac12\end{bmatrix}\begin{bmatrix}5&2\\2&5\end{bmatrix}\begin{bmatrix}-1&1\\1&1\end{bmatrix} = \begin{bmatrix}3&0\\0&7\end{bmatrix}.$$
9. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}-3-\lambda&4\\-1&1-\lambda\end{vmatrix} = (-3-\lambda)(1-\lambda)-4\cdot(-1) = \lambda^2+2\lambda+1 = (\lambda+1)^2.$$
So the only eigenvalue is λ = −1, with algebraic multiplicity 2. To see if A is diagonalizable, we must see if its geometric multiplicity is also 2 by finding its eigenvectors. To find the eigenspace corresponding to λ = −1, we must find the null space of
$$A+I = \begin{bmatrix}-2&4\\-1&2\end{bmatrix}$$
by row-reducing it:
$$[\,A+I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-2&4&0\\-1&2&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-2&0\\0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ = −1 are of the form $\begin{bmatrix}2t\\t\end{bmatrix}$, so a basis for the eigenspace is $\begin{bmatrix}2\\1\end{bmatrix}$. Thus λ = −1 has geometric multiplicity 1, which is less than the algebraic multiplicity, so that A is not diagonalizable by Theorem 4.27.
10. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}3-\lambda&1&0\\0&3-\lambda&1\\0&0&3-\lambda\end{vmatrix} = (3-\lambda)^3$$
since the matrix is a triangular matrix. Thus the only eigenvalue is λ = 3, with algebraic multiplicity 3. To see if A is diagonalizable, we must compute the eigenspace corresponding to this eigenvalue. To find the eigenspace corresponding to λ = 3 we row-reduce [A − 3I | 0]:
$$[\,A-3I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}0&1&0&0\\0&0&1&0\\0&0&0&0\end{array}\right].$$
But this matrix is already row-reduced, and thus the eigenspace is
$$\left\{\begin{bmatrix}s\\0\\0\end{bmatrix}\right\} = \operatorname{span}\left(\begin{bmatrix}1\\0\\0\end{bmatrix}\right).$$
Thus the geometric multiplicity is only 1 while the algebraic multiplicity is 3. So by Theorem 4.27, the matrix is not diagonalizable.
11. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}1-\lambda&0&1\\0&1-\lambda&1\\1&1&-\lambda\end{vmatrix} = -\lambda^3+2\lambda^2+\lambda-2 = -(\lambda+1)(\lambda-1)(\lambda-2).$$
Since A is a 3 × 3 matrix with 3 distinct eigenvalues, it is diagonalizable by Theorem 4.25. To find the diagonalizing matrix P, we find the eigenvectors of A. To find the eigenspace corresponding to λ = −1, we must find the null space of
$$A+I = \begin{bmatrix}2&0&1\\0&2&1\\1&1&1\end{bmatrix}$$
by row-reducing it:
$$[\,A+I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}2&0&1&0\\0&2&1&0\\1&1&1&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&\frac12&0\\0&1&\frac12&0\\0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ = −1 are of the form
$$\begin{bmatrix}-\frac12t\\[2pt]-\frac12t\\[2pt]t\end{bmatrix}\qquad\text{or, clearing fractions,}\qquad\begin{bmatrix}-t\\-t\\2t\end{bmatrix}.$$
Thus a basis for this eigenspace is $\begin{bmatrix}-1\\-1\\2\end{bmatrix}$.
To find the eigenspace corresponding to λ = 1, we must find the null space of
$$A-I = \begin{bmatrix}0&0&1\\0&0&1\\1&1&-1\end{bmatrix}$$
by row-reducing it:
$$[\,A-I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}0&0&1&0\\0&0&1&0\\1&1&-1&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&1&0&0\\0&0&1&0\\0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ = 1 are of the form $\begin{bmatrix}-t\\t\\0\end{bmatrix}$, so a basis for the eigenspace is $\begin{bmatrix}-1\\1\\0\end{bmatrix}$.
To find the eigenspace corresponding to λ = 2, we must find the null space of
$$A-2I = \begin{bmatrix}-1&0&1\\0&-1&1\\1&1&-2\end{bmatrix}$$
by row-reducing it:
$$[\,A-2I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}-1&0&1&0\\0&-1&1&0\\1&1&-2&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&-1&0\\0&1&-1&0\\0&0&0&0\end{array}\right].$$
Thus eigenvectors corresponding to λ = 2 are of the form $\begin{bmatrix}t\\t\\t\end{bmatrix}$, so a basis for the eigenspace is $\begin{bmatrix}1\\1\\1\end{bmatrix}$.
Then a diagonalizing matrix P is the matrix whose columns are the eigenvectors of A:
$$P = \begin{bmatrix}-1&-1&1\\-1&1&1\\2&0&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}-\frac16&-\frac16&\frac13\\[2pt]-\frac12&\frac12&0\\[2pt]\frac13&\frac13&\frac13\end{bmatrix},$$
and so by Theorem 4.23,
$$P^{-1}AP = \begin{bmatrix}-\frac16&-\frac16&\frac13\\[2pt]-\frac12&\frac12&0\\[2pt]\frac13&\frac13&\frac13\end{bmatrix}\begin{bmatrix}1&0&1\\0&1&1\\1&1&0\end{bmatrix}\begin{bmatrix}-1&-1&1\\-1&1&1\\2&0&1\end{bmatrix} = \begin{bmatrix}-1&0&0\\0&1&0\\0&0&2\end{bmatrix}.$$
12. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}1-\lambda&0&0\\2&2-\lambda&1\\3&0&1-\lambda\end{vmatrix} = (1-\lambda)\begin{vmatrix}2-\lambda&1\\0&1-\lambda\end{vmatrix} = (1-\lambda)^2(2-\lambda).$$
So the eigenvalues are λ₁ = 1 with algebraic multiplicity 2, and λ₂ = 2 with algebraic multiplicity 1. To find the eigenspace corresponding to λ₁ = 1 we row-reduce [A − I | 0]:
$$[\,A-I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}0&0&0&0\\2&1&1&0\\3&0&0&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&0&0\\0&1&1&0\\0&0&0&0\end{array}\right].$$
Thus the eigenspace is
$$\left\{\begin{bmatrix}0\\-t\\t\end{bmatrix}\right\} = \operatorname{span}\left(\begin{bmatrix}0\\-1\\1\end{bmatrix}\right).$$
Thus the geometric multiplicity of this eigenvalue is only 1 while the algebraic multiplicity is 2. So by Theorem 4.27, the matrix is not diagonalizable.
13. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}1-\lambda&2&1\\-1&-\lambda&1\\1&1&-\lambda\end{vmatrix} = \lambda^2-\lambda^3 = -\lambda^2(\lambda-1).$$
So A has eigenvalues 0 and 1. Considering λ = 0, we row-reduce [A − 0I | 0] = [A | 0]:
$$\left[\begin{array}{ccc|c}1&2&1&0\\-1&0&1&0\\1&1&0&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&-1&0\\0&1&1&0\\0&0&0&0\end{array}\right],$$
so that the eigenspace is
$$\operatorname{span}\left(\begin{bmatrix}1\\-1\\1\end{bmatrix}\right).$$
Since λ = 0 has algebraic multiplicity 2 but geometric multiplicity 1, the matrix is not diagonalizable.
14. To determine whether A is diagonalizable, we first compute its eigenvalues:
$$\det(A-\lambda I) = \begin{vmatrix}2-\lambda&0&0&2\\0&3-\lambda&2&1\\0&0&3-\lambda&0\\0&0&0&1-\lambda\end{vmatrix} = (1-\lambda)(2-\lambda)(3-\lambda)^2$$
since the matrix is upper triangular. So the eigenvalues are λ₁ = 1 with algebraic multiplicity 1, λ₂ = 2 with algebraic multiplicity 1, and λ₃ = 3 with algebraic multiplicity 2. To find the eigenspace corresponding to λ₃ = 3 we row-reduce [A − 3I | 0]:
$$[\,A-3I\mid\mathbf{0}\,] = \left[\begin{array}{cccc|c}-1&0&0&2&0\\0&0&2&1&0\\0&0&0&0&0\\0&0&0&-2&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}1&0&0&0&0\\0&0&1&0&0\\0&0&0&1&0\\0&0&0&0&0\end{array}\right].$$
Thus the eigenspace is
$$\left\{\begin{bmatrix}0\\t\\0\\0\end{bmatrix}\right\} = \operatorname{span}\left(\begin{bmatrix}0\\1\\0\\0\end{bmatrix}\right).$$
Thus the geometric multiplicity of this eigenvalue is only 1 while the algebraic multiplicity is 2. So by Theorem 4.27, the matrix is not diagonalizable.
15. Since A is triangular, we can read off its eigenvalues, which are λ₁ = 2 and λ₂ = −2, each of which has algebraic multiplicity 2. Computing eigenspaces, we have
$$[\,A-2I\mid\mathbf{0}\,] = \left[\begin{array}{cccc|c}0&0&0&4&0\\0&0&0&0&0\\0&0&-4&0&0\\0&0&0&-4&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}0&0&1&0&0\\0&0&0&1&0\\0&0&0&0&0\\0&0&0&0&0\end{array}\right],\quad\text{so } E_2\text{ has basis }\begin{bmatrix}1\\0\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\\0\end{bmatrix},$$
$$[\,A+2I\mid\mathbf{0}\,] = \left[\begin{array}{cccc|c}4&0&0&4&0\\0&4&0&0&0\\0&0&0&0&0\\0&0&0&0&0\end{array}\right]\longrightarrow\left[\begin{array}{cccc|c}1&0&0&1&0\\0&1&0&0&0\\0&0&0&0&0\\0&0&0&0&0\end{array}\right],\quad\text{so } E_{-2}\text{ has basis }\begin{bmatrix}-1\\0\\0\\1\end{bmatrix},\begin{bmatrix}0\\0\\1\\0\end{bmatrix}.$$
Since each eigenvalue also has geometric multiplicity 2, the matrix is diagonalizable, and
$$P = \begin{bmatrix}1&0&-1&0\\0&1&0&0\\0&0&0&1\\0&0&1&0\end{bmatrix},\qquad D = \begin{bmatrix}2&0&0&0\\0&2&0&0\\0&0&-2&0\\0&0&0&-2\end{bmatrix}$$
provide a solution to $P^{-1}AP = D$.
16. As in Example 4.29, we must find a matrix P such that
$$P^{-1}AP = P^{-1}\begin{bmatrix}-4&6\\-3&5\end{bmatrix}P$$
is diagonal. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
$$\det(A-\lambda I) = \begin{vmatrix}-4-\lambda&6\\-3&5-\lambda\end{vmatrix} = (-4-\lambda)(5-\lambda)+18 = \lambda^2-\lambda-2.$$
This polynomial factors as (λ − 2)(λ + 1), so the eigenvalues are λ₁ = −1 and λ₂ = 2. The eigenspace associated with λ₁ can be found by finding the null space of A + I:
$$[\,A+I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-3&6&0\\-3&6&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-2&0\\0&0&0\end{array}\right].$$
Thus a basis for this eigenspace is given by $\begin{bmatrix}2\\1\end{bmatrix}$. The eigenspace associated with λ₂ can be found by finding the null space of A − 2I:
$$[\,A-2I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-6&6&0\\-3&3&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-1&0\\0&0&0\end{array}\right].$$
Thus a basis for this eigenspace is given by $\begin{bmatrix}1\\1\end{bmatrix}$. Set
$$P = \begin{bmatrix}1&2\\1&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}-1&2\\1&-1\end{bmatrix}.$$
Then
$$D = P^{-1}AP = \begin{bmatrix}2&0\\0&-1\end{bmatrix},\qquad\text{so that}\qquad A^5 = PD^5P^{-1} = \begin{bmatrix}1&2\\1&1\end{bmatrix}\begin{bmatrix}32&0\\0&-1\end{bmatrix}\begin{bmatrix}-1&2\\1&-1\end{bmatrix} = \begin{bmatrix}-34&66\\-33&65\end{bmatrix}.$$
17. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
$$\det(A-\lambda I) = \begin{vmatrix}-1-\lambda&6\\1&-\lambda\end{vmatrix} = (-1-\lambda)(-\lambda)-6 = \lambda^2+\lambda-6 = (\lambda+3)(\lambda-2),$$
so the eigenvalues are λ₁ = −3 and λ₂ = 2. To compute the eigenspaces, we row-reduce [A + 3I | 0] and [A − 2I | 0]:
$$[\,A+3I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}2&6&0\\1&3&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&3&0\\0&0&0\end{array}\right],\quad\text{so } E_{-3}\text{ has basis }\begin{bmatrix}-3\\1\end{bmatrix},$$
$$[\,A-2I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-3&6&0\\1&-2&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-2&0\\0&0&0\end{array}\right],\quad\text{so } E_2\text{ has basis }\begin{bmatrix}2\\1\end{bmatrix}.$$
Thus
$$P = \begin{bmatrix}-3&2\\1&1\end{bmatrix},\qquad P^{-1} = \begin{bmatrix}-\frac15&\frac25\\[2pt]\frac15&\frac35\end{bmatrix},\qquad D = \begin{bmatrix}-3&0\\0&2\end{bmatrix}$$
satisfy $P^{-1}AP = D$, so that
$$A^{10} = (PDP^{-1})^{10} = PD^{10}P^{-1} = \begin{bmatrix}-3&2\\1&1\end{bmatrix}\begin{bmatrix}(-3)^{10}&0\\0&2^{10}\end{bmatrix}\begin{bmatrix}-\frac15&\frac25\\[2pt]\frac15&\frac35\end{bmatrix} = \begin{bmatrix}35{,}839&-69{,}630\\-11{,}605&24{,}234\end{bmatrix}.$$
1
18. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for
the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
4−λ
−1
det(A − λI) =
−3
= (4 − λ)(2 − λ) − 3 = λ2 − 6λ + 5 = (λ − 5)(λ − 1),
2−λ
A−I
so
the
eigenvalues
are
λ
=
1
and
λ
=
5.
To
compute
the
eigenspaces,
we
row-reduce
1
2
A − 5I 0 :
A−I
A − 5I
0
1
→
0
0
0
1
→
0
0
−3
1
3
0 =
−1
−1
0 =
−1
−3
−3
−1
0
3
0
0 and
0
1
, so that E1 has basis
,
0
1
0
−3
, so that E5 has basis
.
0
1
Set
"
P =
1
#
−3
1
1
−6
−1
"
,
#
so that P −1 =
1
4
− 14
"
#"
3
4
1
4,
"
,
D=
1
#
0
0
5
.
Then
A
−6
= PD
P
=
1
#"
−3 1
0
1
1
5−6
0
1
4
− 14
3
4
1
4
#
"
=
3907
56
3906
56
11718
56
11719
56
#
.
19. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
$$\det(A-\lambda I) = \begin{vmatrix}-\lambda&3\\1&2-\lambda\end{vmatrix} = (-\lambda)(2-\lambda)-3 = \lambda^2-2\lambda-3 = (\lambda-3)(\lambda+1),$$
so the eigenvalues are λ₁ = 3 and λ₂ = −1. To compute the eigenspaces, we row-reduce [A − 3I | 0] and [A + I | 0]:
$$[\,A-3I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}-3&3&0\\1&-1&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&-1&0\\0&0&0\end{array}\right],\quad\text{so } E_3\text{ has basis }\begin{bmatrix}1\\1\end{bmatrix},$$
$$[\,A+I\mid\mathbf{0}\,] = \left[\begin{array}{cc|c}1&3&0\\1&3&0\end{array}\right]\longrightarrow\left[\begin{array}{cc|c}1&3&0\\0&0&0\end{array}\right],\quad\text{so } E_{-1}\text{ has basis }\begin{bmatrix}-3\\1\end{bmatrix}.$$
Set
$$P = \begin{bmatrix}1&-3\\1&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}\frac14&\frac34\\[2pt]-\frac14&\frac14\end{bmatrix},\qquad D = \begin{bmatrix}3&0\\0&-1\end{bmatrix}.$$
Then
$$A^k = PD^kP^{-1} = \begin{bmatrix}1&-3\\1&1\end{bmatrix}\begin{bmatrix}3^k&0\\0&(-1)^k\end{bmatrix}\begin{bmatrix}\frac14&\frac34\\[2pt]-\frac14&\frac14\end{bmatrix} = \begin{bmatrix}\frac{3^k+3(-1)^k}{4}&\frac{3^{k+1}-3(-1)^k}{4}\\[2pt]\frac{3^k-(-1)^k}{4}&\frac{3^{k+1}+(-1)^k}{4}\end{bmatrix}.$$
20. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial
$$\det(A-\lambda I) = \begin{vmatrix}2-\lambda&1&2\\2&1-\lambda&2\\2&1&2-\lambda\end{vmatrix} = 5\lambda^2-\lambda^3 = -\lambda^2(\lambda-5),$$
so the eigenvalues are λ₁ = 0 and λ₂ = 5. To compute the eigenspaces we row-reduce [A − 0I | 0] = [A | 0] and [A − 5I | 0]:
$$[\,A\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}2&1&2&0\\2&1&2&0\\2&1&2&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&\frac12&1&0\\0&0&0&0\\0&0&0&0\end{array}\right],\quad\text{so } E_0\text{ has basis }\begin{bmatrix}-1\\2\\0\end{bmatrix},\begin{bmatrix}-1\\0\\1\end{bmatrix},$$
$$[\,A-5I\mid\mathbf{0}\,] = \left[\begin{array}{ccc|c}-3&1&2&0\\2&-4&2&0\\2&1&-3&0\end{array}\right]\longrightarrow\left[\begin{array}{ccc|c}1&0&-1&0\\0&1&-1&0\\0&0&0&0\end{array}\right],\quad\text{so } E_5\text{ has basis }\begin{bmatrix}1\\1\\1\end{bmatrix}.$$
Set
$$P = \begin{bmatrix}-1&-1&1\\2&0&1\\0&1&1\end{bmatrix},\qquad\text{so that}\qquad P^{-1} = \begin{bmatrix}-\frac15&\frac25&-\frac15\\[2pt]-\frac25&-\frac15&\frac35\\[2pt]\frac25&\frac15&\frac25\end{bmatrix}\qquad\text{and}\qquad D = P^{-1}AP = \begin{bmatrix}0&0&0\\0&0&0\\0&0&5\end{bmatrix},$$
so that
$$A^8 = PD^8P^{-1} = \begin{bmatrix}156250&78125&156250\\156250&78125&156250\\156250&78125&156250\end{bmatrix}.$$
21. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. Since A is upper triangular, its eigenvalues are the diagonal entries, which are $\lambda_1 = -1$ and $\lambda_2 = 1$. To compute the eigenspaces, we row-reduce $[A + I \mid 0]$ and $[A - I \mid 0]$:

$[A + I \mid 0] = \begin{bmatrix} 2 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & \frac12 & \frac12 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_{-1}$ has basis $\{(-1, 2, 0), (-1, 0, 2)\}$,

$[A - I \mid 0] = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & -2 & 0 & 0 \\ 0 & 0 & -2 & 0 \end{bmatrix} \to \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_1$ has basis $\{(1, 0, 0)\}$.

CHAPTER 4. EIGENVALUES AND EIGENVECTORS

Set

$P = \begin{bmatrix} -1 & -1 & 1 \\ 2 & 0 & 0 \\ 0 & 2 & 0 \end{bmatrix}$, so that $P^{-1} = \begin{bmatrix} 0 & \frac12 & 0 \\ 0 & 0 & \frac12 \\ 1 & \frac12 & \frac12 \end{bmatrix}$.

Then

$D = P^{-1} A P = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$

so that, since $(-1)^{2015} = -1$,

$A^{2015} = P D^{2015} P^{-1} = P \begin{bmatrix} (-1)^{2015} & 0 & 0 \\ 0 & (-1)^{2015} & 0 \\ 0 & 0 & 1^{2015} \end{bmatrix} P^{-1} = P D P^{-1} = A.$
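A short numerical confirmation (a sketch, not part of the manual, assuming from the solution above that A is the upper triangular matrix with rows (1, 1, 1), (0, -1, 0), (0, 0, -1)):

```python
import numpy as np

# Assumed matrix from the solution above.
A = np.array([[1, 1, 1], [0, -1, 0], [0, 0, -1]])

# A^2 = I, so every odd power of A equals A; in particular A^2015 = A.
assert (np.linalg.matrix_power(A, 2) == np.eye(3, dtype=int)).all()
assert (np.linalg.matrix_power(A, 2015) == A).all()
```

The integer entries stay at 0 and ±1 throughout, so there is no overflow even at the 2015th power.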
22. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial

$\det(A - \lambda I) = \begin{vmatrix} 2-\lambda & 0 & 1 \\ 1 & 1-\lambda & 1 \\ 1 & 0 & 2-\lambda \end{vmatrix} = (1-\lambda)\begin{vmatrix} 2-\lambda & 1 \\ 1 & 2-\lambda \end{vmatrix} = (1-\lambda)((2-\lambda)^2 - 1) = (1-\lambda)(\lambda^2 - 4\lambda + 3) = -(\lambda-1)^2(\lambda-3).$

Thus the eigenvalues are $\lambda_1 = 1$ and $\lambda_2 = 3$. To compute the eigenspaces, we row-reduce $[A - I \mid 0]$ and $[A - 3I \mid 0]$:

$[A - I \mid 0] = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_1$ has basis $\{(-1, 0, 1), (0, 1, 0)\}$,

$[A - 3I \mid 0] = \begin{bmatrix} -1 & 0 & 1 & 0 \\ 1 & -2 & 1 & 0 \\ 1 & 0 & -1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_3$ has basis $\{(1, 1, 1)\}$.

Set

$P = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$, so that $P^{-1} = \begin{bmatrix} \frac12 & 0 & \frac12 \\ -\frac12 & 0 & \frac12 \\ -\frac12 & 1 & -\frac12 \end{bmatrix}$ and $D = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$

so that

$A^k = P D^k P^{-1} = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 3^k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \frac12 & 0 & \frac12 \\ -\frac12 & 0 & \frac12 \\ -\frac12 & 1 & -\frac12 \end{bmatrix} = \begin{bmatrix} \frac12(3^k+1) & 0 & \frac12(3^k-1) \\ \frac12(3^k-1) & 1 & \frac12(3^k-1) \\ \frac12(3^k-1) & 0 & \frac12(3^k+1) \end{bmatrix}.$
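The closed form can again be checked numerically (a sketch, not part of the manual, assuming from the reductions above that A = [[2,0,1],[1,1,1],[1,0,2]]):

```python
import numpy as np

# Assumed matrix from the row reductions above.
A = np.array([[2, 0, 1], [1, 1, 1], [1, 0, 2]])

def A_power(k):
    p, m = (3**k + 1) / 2, (3**k - 1) / 2  # the two recurring entries
    return np.array([[p, 0, m],
                     [m, 1, m],
                     [m, 0, p]])

for k in range(7):
    assert np.allclose(np.linalg.matrix_power(A, k), A_power(k))
```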
23. As in Example 4.29, we must first diagonalize A. To do this, we find the eigenvalues and a basis for the eigenspaces of A. The eigenvalues of A are the roots of its characteristic polynomial

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 1 & 0 \\ 2 & -2-\lambda & 2 \\ 0 & 1 & 1-\lambda \end{vmatrix} = -(\lambda-1)(\lambda-2)(\lambda+3).$

Thus the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = 2$, and $\lambda_3 = -3$. To compute the eigenspaces, we row-reduce $[A - I \mid 0]$, $[A - 2I \mid 0]$, and $[A + 3I \mid 0]$:

$[A - I \mid 0] = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 2 & -3 & 2 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_1$ has basis $\{(-1, 0, 1)\}$,

$[A - 2I \mid 0] = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 2 & -4 & 2 & 0 \\ 0 & 1 & -1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_2$ has basis $\{(1, 1, 1)\}$,

$[A + 3I \mid 0] = \begin{bmatrix} 4 & 1 & 0 & 0 \\ 2 & 1 & 2 & 0 \\ 0 & 1 & 4 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, so $E_{-3}$ has basis $\{(1, -4, 1)\}$.

Set

$P = \begin{bmatrix} -1 & 1 & 1 \\ 0 & 1 & -4 \\ 1 & 1 & 1 \end{bmatrix}$, so that $P^{-1} = \begin{bmatrix} -\frac12 & 0 & \frac12 \\ \frac25 & \frac15 & \frac25 \\ \frac1{10} & -\frac15 & \frac1{10} \end{bmatrix}$ and $D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -3 \end{bmatrix},$

so that

$A^k = P D^k P^{-1} = \begin{bmatrix} -1 & 1 & 1 \\ 0 & 1 & -4 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2^k & 0 \\ 0 & 0 & (-3)^k \end{bmatrix} \begin{bmatrix} -\frac12 & 0 & \frac12 \\ \frac25 & \frac15 & \frac25 \\ \frac1{10} & -\frac15 & \frac1{10} \end{bmatrix}$

$= \begin{bmatrix} \frac1{10}(5 + (-3)^k + 2^{k+2}) & \frac15(2^k - (-3)^k) & \frac1{10}(-5 + (-3)^k + 2^{k+2}) \\ -\frac25((-3)^k - 2^k) & \frac15(4(-3)^k + 2^k) & -\frac25((-3)^k - 2^k) \\ \frac1{10}(-5 + (-3)^k + 2^{k+2}) & \frac15(2^k - (-3)^k) & \frac1{10}(5 + (-3)^k + 2^{k+2}) \end{bmatrix}.$
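A numerical check of this closed form (a sketch, not part of the manual, assuming from the reductions above that A = [[1,1,0],[2,-2,2],[0,1,1]]):

```python
import numpy as np

# Assumed matrix from the row reductions above; eigenvalues 1, 2, -3.
A = np.array([[1, 1, 0], [2, -2, 2], [0, 1, 1]])

def A_power(k):
    a, b = 2**k, (-3)**k
    d = (5 + b + 4*a) / 10    # (5 + (-3)^k + 2^(k+2)) / 10
    e = (-5 + b + 4*a) / 10
    f = (a - b) / 5
    g = -2 * (b - a) / 5
    h = (4*b + a) / 5
    return np.array([[d, f, e],
                     [g, h, g],
                     [e, f, d]])

for k in range(6):
    assert np.allclose(np.linalg.matrix_power(A, k), A_power(k))
```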
24. The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 1 \\ 0 & k-\lambda \end{vmatrix} = (\lambda - 1)(\lambda - k),$

so that if $k \neq 1$, then A is a $2 \times 2$ matrix with two distinct eigenvalues, so it is diagonalizable. If $k = 1$, then

$A - I = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},$

which gives a one-dimensional eigenspace with basis $\{(1, 0)\}$. Since $\lambda = 1$ has algebraic multiplicity 2 but geometric multiplicity 1, A is not diagonalizable. So A is diagonalizable exactly when $k \neq 1$.
25. The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & k \\ 0 & 1-\lambda \end{vmatrix} = (\lambda - 1)^2.$

The only eigenvalue is 1, and then

$A - I = \begin{bmatrix} 0 & k \\ 0 & 0 \end{bmatrix}.$

If $k = 0$, then A is diagonal to start with, while if $k \neq 0$, then we get a one-dimensional eigenspace with basis $\{(1, 0)\}$. Since $\lambda = 1$ then has algebraic multiplicity 2 but geometric multiplicity 1, A is not diagonalizable. So A is diagonalizable exactly when $k = 0$.
26. The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} k-\lambda & 1 \\ 1 & -\lambda \end{vmatrix} = \lambda(\lambda - k) - 1 = \lambda^2 - k\lambda - 1,$

so that A has the eigenvalues $\lambda = \frac{k \pm \sqrt{k^2 + 4}}{2}$. Since $k^2 + 4$ is always positive, the eigenvalues of A are real and distinct, so A is diagonalizable for all k.
27. Since A is triangular, the only eigenvalue is 1, with algebraic multiplicity 3. Then

$A - I = \begin{bmatrix} 0 & 0 & k \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$

If $k = 0$, then A is diagonal to start with, while if $k \neq 0$, then we get a two-dimensional eigenspace with basis $\{(1, 0, 0), (0, 1, 0)\}$. Since $\lambda = 1$ then has algebraic multiplicity 3 but geometric multiplicity 2, A is not diagonalizable. So A is diagonalizable exactly when $k = 0$.
28. Since A is triangular, the eigenvalues are 1, with algebraic multiplicity 2, and 2, with algebraic multiplicity 1. Then

$A - I = \begin{bmatrix} 0 & k & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$

regardless of the value of k, so $\lambda = 1$ has a two-dimensional eigenspace with basis $\{(1, 0, 0), (0, 0, 1)\}$, i.e., with geometric multiplicity 2. Since this is equal to the algebraic multiplicity of $\lambda$, we see that A is diagonalizable for all k.
29. The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 1 & k \\ 1 & 1-\lambda & k \\ 1 & 1 & k-\lambda \end{vmatrix} = -\lambda^3 + 2\lambda^2 + k\lambda^2 = -\lambda^2(\lambda - (k+2)).$

If $k = -2$, then A has only the eigenvalue 0, with algebraic multiplicity 3, and then

$A = \begin{bmatrix} 1 & 1 & -2 \\ 1 & 1 & -2 \\ 1 & 1 & -2 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & -2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$

so the eigenspace is two-dimensional and therefore the geometric multiplicity is not equal to the algebraic multiplicity, so that A is not diagonalizable. If $k \neq -2$, then 0 has algebraic multiplicity 2, and then

$A = \begin{bmatrix} 1 & 1 & k \\ 1 & 1 & k \\ 1 & 1 & k \end{bmatrix} \to \begin{bmatrix} 1 & 1 & k \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},$

and the eigenspace is again two-dimensional, so the geometric multiplicity equals the algebraic multiplicity and A is diagonalizable. Thus A is diagonalizable exactly when $k \neq -2$.
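The multiplicity comparison above can be checked numerically for sample values of k (a sketch, not part of the manual; the function name `diagonalizable` is mine):

```python
import numpy as np

def diagonalizable(k):
    # A has rank 1, so the eigenvalue 0 always has geometric multiplicity 2;
    # A is diagonalizable exactly when that matches the algebraic multiplicity,
    # i.e. when the remaining eigenvalue k + 2 is nonzero.
    A = np.array([[1.0, 1.0, k]] * 3)
    geom_mult_0 = 3 - np.linalg.matrix_rank(A)
    alg_mult_0 = 3 if k == -2 else 2
    return geom_mult_0 == alg_mult_0

assert not diagonalizable(-2)
assert diagonalizable(0) and diagonalizable(5) and diagonalizable(-1)
```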
30. We are given that A ∼ B, so that B = P −1 AP for some invertible matrix P , and that B ∼ C, so that
C = Q−1 BQ for some invertible matrix Q. But then
C = Q−1 BQ = Q−1 P −1 AP Q = (P Q)−1 AP Q,
so that C is R−1 AR for the invertible matrix R = P Q.
31. A is invertible if and only if $\det A \neq 0$. But since A and B are similar, $\det A = \det B$ by part (a) of this theorem. Thus A is invertible if and only if $\det B \neq 0$, which happens if and only if B is invertible.
32. By Exercise 61 in Section 3.5, if U is invertible then rank A = rank(U A) = rank(AU ). Now, since A
and B are similar, we have B = P −1 AP for some invertible matrix P , so that
rank B = rank(P −1 (AP )) = rank(AP ) = rank A,
using the result above twice.
33. Suppose that B = P −1 AP where P is invertible. This can be rewritten as BP −1 = P −1 A. Let λ be
an eigenvalue of A with eigenvector x. Then
B(P −1 x) = (BP −1 )x = (P −1 A)x = P −1 (Ax) = P −1 (λx) = λ(P −1 x).
Thus P −1 x is an eigenvector for B, corresponding to λ. So every eigenvalue of A is also an eigenvalue
of B. By writing the similarity equation as A = Q−1 BQ, we see in exactly the same way that every
eigenvalue of B is an eigenvalue of A. So A and B have the same eigenvalues. (Note that we have also
proven a relationship between the eigenvectors corresponding to a given eigenvalue).
34. Suppose that $A \sim B$, say $A = P^{-1}BP$. Then if $m = 0$, we have $A^0 = B^0 = I$, so that $A^0 \sim B^0$. If $m > 0$, then

$A^m = (P^{-1}BP)^m = \underbrace{(P^{-1}BP)(P^{-1}BP) \cdots (P^{-1}BP)}_{m \text{ times}} = P^{-1}B^mP.$

But $A^m = P^{-1}B^mP$ means that $A^m \sim B^m$.
35. From part (f), $A^m \sim B^m$ for $m \geq 0$. If $m < 0$, then

$A^m = (A^{-m})^{-1} = (P^{-1}B^{-m}P)^{-1},$

since $A^{-m}$ and $B^{-m}$ are similar because $-m > 0$. But

$(P^{-1}B^{-m}P)^{-1} = P^{-1}(B^{-m})^{-1}(P^{-1})^{-1} = P^{-1}B^mP.$

Thus $A^m \sim B^m$ for $m < 0$, so they are similar for all integers m.
36. Let P = B −1 . Then
BA = BABB −1 = B(AB)B −1 = P −1 (AB)P,
showing that AB ∼ BA.
37. Exercise 45 in Section 3.2 shows that if A and B are n × n, then tr(AB) = tr(BA). But then if
B = P −1 AP , we have
tr B = tr(P −1 AP ) = tr((P −1 A)P ) = tr(P (P −1 A)) = tr((P P −1 )A) = tr A.
38. A is triangular, so it has eigenvalues $-1$ and 3. For B, we have

$\det(B - \lambda I) = \begin{vmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{vmatrix} = (1-\lambda)^2 - 4 = \lambda^2 - 2\lambda - 3 = (\lambda+1)(\lambda-3).$

So the eigenvalues of B are also $-1$ and 3, and thus

$A \sim B \sim \begin{bmatrix} -1 & 0 \\ 0 & 3 \end{bmatrix} = D.$

Reducing $A + I$ and $A - 3I$ shows that $E_{-1} = \operatorname{span}\{(1, -4)\}$ and $E_3 = \operatorname{span}\{(1, 0)\}$. Reducing $B + I$ and $B - 3I$ shows that $E_{-1} = \operatorname{span}\{(1, -1)\}$ and $E_3 = \operatorname{span}\{(1, 1)\}$. Thus

$D = Q^{-1}AQ = \begin{bmatrix} 0 & -\frac14 \\ 1 & \frac14 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -4 & 0 \end{bmatrix}, \qquad D = R^{-1}BR = \begin{bmatrix} \frac12 & -\frac12 \\ \frac12 & \frac12 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},$

so that finally

$B = RDR^{-1} = RQ^{-1}AQR^{-1} = (QR^{-1})^{-1}A(QR^{-1}).$

Thus

$P = QR^{-1} = \begin{bmatrix} 1 & 0 \\ -2 & 2 \end{bmatrix}$

is an invertible matrix such that $B = P^{-1}AP$.
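The change-of-basis matrix can be verified without computing an inverse, since B = P⁻¹AP is equivalent to PB = AP (a sketch, not part of the manual, assuming the matrices reconstructed above):

```python
import numpy as np

# Assumed from the solution above.
A = np.array([[3, 1], [0, -1]])
B = np.array([[1, 2], [2, 1]])
P = np.array([[1, 0], [-2, 2]])

# B = P^{-1} A P  <=>  P B = A P (avoids forming P^{-1}).
assert np.allclose(P @ B, A @ P)
```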
39. We have

$\det(A - \lambda I) = \begin{vmatrix} 5-\lambda & -3 \\ 4 & -2-\lambda \end{vmatrix} = (5-\lambda)(-2-\lambda) + 12 = \lambda^2 - 3\lambda + 2 = (\lambda-1)(\lambda-2),$

$\det(B - \lambda I) = \begin{vmatrix} -1-\lambda & 1 \\ -6 & 4-\lambda \end{vmatrix} = (-1-\lambda)(4-\lambda) + 6 = \lambda^2 - 3\lambda + 2 = (\lambda-1)(\lambda-2).$

So A and B both have eigenvalues 1 and 2, so that

$A \sim B \sim \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} = D.$

Reducing $A - I$ and $A - 2I$ shows that $E_1 = \operatorname{span}\{(3, 4)\}$ and $E_2 = \operatorname{span}\{(1, 1)\}$. Reducing $B - I$ and $B - 2I$ shows that $E_1 = \operatorname{span}\{(1, 2)\}$ and $E_2 = \operatorname{span}\{(1, 3)\}$. Thus

$D = Q^{-1}AQ = \begin{bmatrix} -1 & 1 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} 5 & -3 \\ 4 & -2 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 4 & 1 \end{bmatrix}, \qquad D = R^{-1}BR = \begin{bmatrix} 3 & -1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} -1 & 1 \\ -6 & 4 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix},$

so that finally

$B = RDR^{-1} = RQ^{-1}AQR^{-1} = (QR^{-1})^{-1}A(QR^{-1}).$

Thus

$P = QR^{-1} = \begin{bmatrix} 7 & -2 \\ 10 & -3 \end{bmatrix}$

is an invertible matrix such that $B = P^{-1}AP$.
40. A is upper triangular, so it has eigenvalues $-2$, 1, and 2. For B, we have

$\det(B - \lambda I) = \begin{vmatrix} 3-\lambda & 2 & -5 \\ 1 & 2-\lambda & -1 \\ 2 & 2 & -4-\lambda \end{vmatrix} = -(\lambda+2)(\lambda-1)(\lambda-2),$

so B also has eigenvalues $-2$, 1, and 2, so that

$A \sim B \sim \begin{bmatrix} -2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} = D.$

Reducing $A + 2I$, $A - I$, and $A - 2I$ gives

$E_{-2} = \operatorname{span}\{(1, -4, 0)\}, \quad E_1 = \operatorname{span}\{(1, -1, -3)\}, \quad E_2 = \operatorname{span}\{(1, 0, 0)\}.$

Reducing $B + 2I$, $B - I$, and $B - 2I$ gives

$E_{-2} = \operatorname{span}\{(1, 0, 1)\}, \quad E_1 = \operatorname{span}\{(1, -1, 0)\}, \quad E_2 = \operatorname{span}\{(1, 2, 1)\}.$

Therefore

$D = Q^{-1}AQ = \begin{bmatrix} 0 & -\frac14 & \frac1{12} \\ 0 & 0 & -\frac13 \\ 1 & \frac14 & \frac14 \end{bmatrix} \begin{bmatrix} 2 & 1 & 0 \\ 0 & -2 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ -4 & -1 & 0 \\ 0 & -3 & 0 \end{bmatrix},$

$D = R^{-1}BR = \begin{bmatrix} -\frac12 & -\frac12 & \frac32 \\ 1 & 0 & -1 \\ \frac12 & \frac12 & -\frac12 \end{bmatrix} \begin{bmatrix} 3 & 2 & -5 \\ 1 & 2 & -1 \\ 2 & 2 & -4 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & 2 \\ 1 & 0 & 1 \end{bmatrix},$

so that finally

$B = RDR^{-1} = RQ^{-1}AQR^{-1} = (QR^{-1})^{-1}A(QR^{-1}).$

Thus

$P = QR^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 2 & -5 \\ -3 & 0 & 3 \end{bmatrix}$

is an invertible matrix such that $B = P^{-1}AP$.
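As in the 2 × 2 case, the identity B = P⁻¹AP is equivalent to PB = AP, which gives a quick check (a sketch, not part of the manual, assuming the matrices reconstructed above):

```python
import numpy as np

# Assumed from the solution above.
A = np.array([[2, 1, 0], [0, -2, 1], [0, 0, 1]])
B = np.array([[3, 2, -5], [1, 2, -1], [2, 2, -4]])
P = np.array([[1, 0, 0], [1, 2, -5], [-3, 0, 3]])

# B = P^{-1} A P  <=>  P B = A P.
assert np.allclose(P @ B, A @ P)
# Both matrices should share the eigenvalues -2, 1, 2.
assert np.allclose(sorted(np.linalg.eigvals(A)), sorted(np.linalg.eigvals(B)))
```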
41. We have

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 0 & 2 \\ 1 & -1-\lambda & 1 \\ 2 & 0 & 1-\lambda \end{vmatrix} = -(\lambda+1)^2(\lambda-3),$

$\det(B - \lambda I) = \begin{vmatrix} -3-\lambda & -2 & 0 \\ 6 & 5-\lambda & 0 \\ 4 & 4 & -1-\lambda \end{vmatrix} = -(\lambda+1)^2(\lambda-3),$

so that A and B both have eigenvalues $-1$ (of algebraic multiplicity 2) and 3. Therefore

$A \sim B \sim \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 3 \end{bmatrix} = D.$

Reducing $A + I$ and $A - 3I$ gives

$E_{-1} = \operatorname{span}\{(1, 0, -1), (0, 1, 0)\}, \qquad E_3 = \operatorname{span}\{(2, 1, 2)\}.$

Reducing $B + I$ and $B - 3I$ gives

$E_{-1} = \operatorname{span}\{(1, -1, 0), (0, 0, 1)\}, \qquad E_3 = \operatorname{span}\{(1, -3, -2)\}.$

Therefore

$D = Q^{-1}AQ = \begin{bmatrix} \frac12 & 0 & -\frac12 \\ -\frac14 & 1 & -\frac14 \\ \frac14 & 0 & \frac14 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 1 & -1 & 1 \\ 2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ -1 & 0 & 2 \end{bmatrix},$

$D = R^{-1}BR = \begin{bmatrix} \frac32 & \frac12 & 0 \\ -1 & -1 & 1 \\ -\frac12 & -\frac12 & 0 \end{bmatrix} \begin{bmatrix} -3 & -2 & 0 \\ 6 & 5 & 0 \\ 4 & 4 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ -1 & 0 & -3 \\ 0 & 1 & -2 \end{bmatrix},$

so that finally

$B = RDR^{-1} = RQ^{-1}AQR^{-1} = (QR^{-1})^{-1}A(QR^{-1}).$

Thus

$P = QR^{-1} = \begin{bmatrix} \frac12 & -\frac12 & 0 \\ -\frac32 & -\frac32 & 1 \\ -\frac52 & -\frac32 & 0 \end{bmatrix}$

is an invertible matrix such that $B = P^{-1}AP$.
42. Suppose that B = P −1 AP . This gives P B = AP , so that (P B)T = (AP )T . Expanding yields
B T P T = P T AT , so that B T ∼ AT and thus AT ∼ B T .
43. Suppose that P −1 AP = D, where D is diagonal. Then
D = DT = (P −1 AP )T = P T AT (P T )−1 .
Now let Q = (P T )−1 ; the equation above becomes D = Q−1 AT Q, so that AT is diagonalizable.
44. If A is invertible, it has no zero eigenvalues. So if it is diagonalizable, the diagonal matrix D is also invertible since none of the diagonal entries are zero. Then $D^{-1}$ is also diagonal, consisting of the inverses of each of the diagonal entries of D. Suppose that $D = P^{-1}AP$. Then

$D^{-1} = (P^{-1}AP)^{-1} = P^{-1}A^{-1}(P^{-1})^{-1} = P^{-1}A^{-1}P,$

showing that $A^{-1}$ is also diagonalizable (and, incidentally, showing again that the eigenvalues of $A^{-1}$ are the inverses of the eigenvalues of A, by looking at the entries of $D^{-1}$).
45. Suppose that A is diagonalizable, and that λ is its only eigenvalue. Then if D = P −1 AP is diagonal,
all of its diagonal entries must equal λ, so that D = λI. Then λI = P −1 AP implies that P (λI)P −1 =
P P −1 AP P −1 = A. However, P (λI)P −1 = λP IP −1 = λI. Thus A = λI.
46. Since A and B each have n distinct eigenvalues, they are both diagonalizable by Theorem 4.25. Recall
that diagonal matrices commute: if D and E are both diagonal, then DE = ED. Now, if A and B have
the same eigenvectors, then since the diagonalizing matrix P has columns equal to the eigenvectors,
we see that for the same invertible matrix P , both DA = P −1 AP and DB = P −1 BP are diagonal.
But then
AB = (P DA P −1 )(P DB P −1 ) = P DA DB P −1 = P DB DA P −1 = (P DB P −1 )(P DA P −1 ) = BA.
Conversely, if AB = BA, let α1 , α2 , . . . , αn be the distinct eigenvalues of A, and pi the corresponding
eigenvectors, which are also distinct since they are linearly independent. Then
A(Bpi ) = (AB)pi = (BA)pi = B(Api ) = B(αi pi ) = αi Bpi .
Thus Bpi is also an eigenvector of A corresponding to αi , so it must be a multiple of pi . That is,
Bpi = βi pi for all i, so that the eigenvectors of B are also pi . So every eigenvector of A is an
eigenvector of B. Since B also has n distinct eigenvectors, the set of eigenvectors must be the same.
47. If A ∼ B, then A and B have the same characteristic polynomial, so the algebraic multiplicities of
their eigenvalues are the same.
48. From the solution to Exercise 33, we know that if $B = P^{-1}AP$ and x is an eigenvector of A with eigenvalue $\lambda$, then $P^{-1}x$ is an eigenvector of B with eigenvalue $\lambda$. Therefore, if $\lambda$ is an eigenvalue of A with geometric multiplicity m, its eigenspace has a basis consisting of m vectors $v_1, v_2, \dots, v_m$, so that the vectors $P^{-1}v_1, P^{-1}v_2, \dots, P^{-1}v_m$ are in the eigenspace of B corresponding to $\lambda$. These vectors are linearly independent since $P^{-1}$ is an invertible linear transformation, so the eigenspace of $\lambda$ for B has dimension at least m. Writing the similarity as $A = PBP^{-1}$ and applying the same argument in reverse gives the opposite inequality, so the dimension of the eigenspace of $\lambda$ for B is exactly m. So the geometric multiplicities of the eigenvalues of A and B are the same.
49. If $D = P^{-1}AP$ and D is diagonal, then every entry of D is either 0 or 1, since every eigenvalue of A is either 0 or 1 by assumption. Since $D^2$ is computed by squaring each diagonal entry, we see that $D^2 = D$. But then $A = PDP^{-1}$, and

$A^2 = (PDP^{-1})^2 = (PDP^{-1})(PDP^{-1}) = PD^2P^{-1} = PDP^{-1} = A,$

so that A is idempotent.
50. Suppose that A is nilpotent and diagonalizable, so that D = P −1 AP (so that A = P DP −1 ) and
Am = O for some m > 1. Then
Dm = (P −1 AP )m = P −1 Am P = P −1 OP = O.
Since D is diagonal, Dm is found by taking the mth power of each diagonal entry. If this is the zero
matrix, then all the diagonal entries must be zero, so that D = O. But then A = P OP −1 = O, so that
A is the zero matrix.
51. (a) Suppose we have three vectors $v_1$, $v_2$, and $v_3$ with

$Av_1 = v_1, \quad Av_2 = v_2, \quad Av_3 = v_3.$

Then all three vectors must lie in the eigenspace $E_1$ corresponding to $\lambda = 1$, since all three vectors are multiplied by 1 when A is applied. However, since the algebraic multiplicity of $\lambda = 1$ is 2, the geometric multiplicity is no greater than 2, by Lemma 4.26. But any three vectors in a two-dimensional space must be linearly dependent, so that $v_1$, $v_2$, and $v_3$ are linearly dependent.
(b) By Theorem 4.27, A is diagonalizable if and only if the algebraic multiplicity of each eigenvalue
equals its geometric multiplicity. So the dimensions of the eigenspaces must be dim E−1 = 1,
dim E1 = 2, dim E2 = 3, since the algebraic multiplicity of −1 is 1, that of 1 is 2, and that of 2 is
3.
52. (a) The characteristic polynomial of A is

$\det(A - \lambda I) = \begin{vmatrix} a-\lambda & b \\ c & d-\lambda \end{vmatrix} = \lambda^2 - (a+d)\lambda + (ad - bc) = \lambda^2 - \operatorname{tr}(A)\lambda + \det(A).$

Therefore the eigenvalues are

$\lambda = \frac{a+d \pm \sqrt{(a+d)^2 - 4(ad-bc)}}{2} = \frac{a+d \pm \sqrt{(a-d)^2 + 4bc}}{2}.$

So if $(a-d)^2 + 4bc > 0$, then the two eigenvalues of A are real and distinct and therefore A is diagonalizable. If $(a-d)^2 + 4bc < 0$, then A has no real eigenvalues and so is not diagonalizable.

(b) If $(a-d)^2 + 4bc = 0$, then A has one real eigenvalue, and A may or may not be diagonalizable depending on the geometric multiplicity of that eigenvalue. For example, the zero matrix O has $(a-d)^2 + 4bc = 0$, and it is already a diagonal matrix, so is diagonalizable. However,

$A = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}$

satisfies $(a-d)^2 + 4bc = 0$; from the formula above, its only real eigenvalue is $\lambda = 0$, and it is easy to see by row-reducing A that $E_0 = \operatorname{span}\{(1, 1)\}$. So the geometric multiplicity of $\lambda = 0$ is only 1 and thus A is not diagonalizable, by Theorem 4.27.
4.5 Iterative Methods for Computing Eigenvalues
1. (a) For the given value of $x_5$, we estimate a dominant eigenvector of A to be

$\begin{bmatrix} 1 \\ 11109/4443 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 2.500 \end{bmatrix}.$

The corresponding dominant eigenvalue is computed by finding

$x_6 = Ax_5 = \begin{bmatrix} 1 & 2 \\ 5 & 4 \end{bmatrix} \begin{bmatrix} 4443 \\ 11109 \end{bmatrix} = \begin{bmatrix} 26661 \\ 66651 \end{bmatrix},$

and $\frac{26661}{4443} \approx 6.001$, $\frac{66651}{11109} \approx 6.000$.

(b) $\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 2 \\ 5 & 4-\lambda \end{vmatrix} = (1-\lambda)(4-\lambda) - 10 = \lambda^2 - 5\lambda - 6 = (\lambda+1)(\lambda-6)$, so the dominant eigenvalue is 6.
2. (a) For the given value of $x_5$, we estimate a dominant eigenvector of A to be

$\begin{bmatrix} 1 \\ -3904/7811 \end{bmatrix} \approx \begin{bmatrix} 1 \\ -0.500 \end{bmatrix}.$

The corresponding dominant eigenvalue is computed by finding

$x_6 = Ax_5 = \begin{bmatrix} 7 & 4 \\ -3 & -1 \end{bmatrix} \begin{bmatrix} 7811 \\ -3904 \end{bmatrix} = \begin{bmatrix} 39061 \\ -19529 \end{bmatrix},$

and $\frac{39061}{7811} \approx 5.001$, $\frac{-19529}{-3904} \approx 5.002$.

(b) $\det(A - \lambda I) = \begin{vmatrix} 7-\lambda & 4 \\ -3 & -1-\lambda \end{vmatrix} = (7-\lambda)(-1-\lambda) + 12 = \lambda^2 - 6\lambda + 5 = (\lambda-1)(\lambda-5)$, so the dominant eigenvalue is 5.
3. (a) For the given value of $x_5$, we estimate a dominant eigenvector of A to be

$\begin{bmatrix} 1 \\ 89/144 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 0.618 \end{bmatrix}.$

The corresponding dominant eigenvalue is computed by finding

$x_6 = Ax_5 = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 144 \\ 89 \end{bmatrix} = \begin{bmatrix} 377 \\ 233 \end{bmatrix},$

and $\frac{377}{144} \approx 2.618$, $\frac{233}{89} \approx 2.618$.

(b) $\det(A - \lambda I) = \begin{vmatrix} 2-\lambda & 1 \\ 1 & 1-\lambda \end{vmatrix} = (2-\lambda)(1-\lambda) - 1 = \lambda^2 - 3\lambda + 1$, which has roots $\lambda = \frac12(3 \pm \sqrt5)$. So the dominant eigenvalue is $\frac12(3 + \sqrt5) \approx 2.618$.
4. (a) For the given value of $x_5$, we estimate a dominant eigenvector of A to be

$\begin{bmatrix} 1 \\ 239.500/60.625 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 3.951 \end{bmatrix}.$

The corresponding dominant eigenvalue is computed by finding

$x_6 = Ax_5 = \begin{bmatrix} 1.5 & 0.5 \\ 2.0 & 3.0 \end{bmatrix} \begin{bmatrix} 60.625 \\ 239.500 \end{bmatrix} = \begin{bmatrix} 210.688 \\ 839.75 \end{bmatrix},$

and $\frac{210.688}{60.625} \approx 3.475$, $\frac{839.75}{239.500} \approx 3.506$.

(b) $\det(A - \lambda I) = \begin{vmatrix} 1.5-\lambda & 0.5 \\ 2.0 & 3.0-\lambda \end{vmatrix} = (1.5-\lambda)(3.0-\lambda) - 1.0 = \lambda^2 - 4.5\lambda + 3.5$, which has roots $\lambda = 1$ and $\lambda = 3.5$. So the dominant eigenvalue is 3.5.
5. (a) The entry of largest magnitude in $x_5$ is $m_5 = 11.001$, so that

$y_5 = \frac{1}{11.001}x_5 = \begin{bmatrix} -3.667/11.001 \\ 1 \end{bmatrix} = \begin{bmatrix} -0.333 \\ 1 \end{bmatrix}.$

So the dominant eigenvalue is about 11 and $y_5$ approximates the dominant eigenvector.

(b) We have

$Ay_5 = \begin{bmatrix} 2 & -3 \\ -3 & 10 \end{bmatrix} \begin{bmatrix} -0.333 \\ 1 \end{bmatrix} = \begin{bmatrix} -3.666 \\ 10.999 \end{bmatrix}, \qquad \lambda_1 y_5 = 11.001 \begin{bmatrix} -0.333 \\ 1 \end{bmatrix} = \begin{bmatrix} -3.663 \\ 11.001 \end{bmatrix},$

so we have indeed approximated an eigenvalue and an eigenvector of A.
6. (a) The entry of largest magnitude in $x_{10}$ is $m_{10} = 5.530$, so that

$y_{10} = \frac{1}{5.530}x_{10} = \begin{bmatrix} 1 \\ 1.470/5.530 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 0.266 \end{bmatrix}.$

So the dominant eigenvalue is about 5.53 and $y_{10}$ approximates the dominant eigenvector.

(b) We have

$Ay_{10} = \begin{bmatrix} 5 & 2 \\ 2 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ 0.266 \end{bmatrix} = \begin{bmatrix} 5.532 \\ 1.468 \end{bmatrix}, \qquad \lambda_1 y_{10} = 5.530 \begin{bmatrix} 1 \\ 0.266 \end{bmatrix} = \begin{bmatrix} 5.530 \\ 1.471 \end{bmatrix},$

so we have indeed approximated an eigenvalue and an eigenvector of A.
7. (a) The entry of largest magnitude in $x_8$ is $m_8 = 10$, so that

$y_8 = \frac{1}{10}x_8 = \frac{1}{10}\begin{bmatrix} 10.0 \\ 0.001 \\ 10.0 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 0.0001 \\ 1 \end{bmatrix}.$

So the dominant eigenvalue is about 10 and $y_8$ approximates the dominant eigenvector.

(b) We have

$Ay_8 = \begin{bmatrix} 4 & 0 & 6 \\ -1 & 3 & 1 \\ 6 & 0 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 0.0001 \\ 1 \end{bmatrix} = \begin{bmatrix} 10 \\ 0.0003 \\ 10 \end{bmatrix}, \qquad \lambda_1 y_8 = 10 \begin{bmatrix} 1 \\ 0.0001 \\ 1 \end{bmatrix} = \begin{bmatrix} 10 \\ 0.001 \\ 10 \end{bmatrix},$

so we have indeed approximated an eigenvalue and an eigenvector of A, though the approximation in the middle component is not great.
8. (a) The entry of largest magnitude in $x_{10}$ is $m_{10} = 3.415$, so that

$y_{10} = \frac{1}{3.415}x_{10} = \frac{1}{3.415}\begin{bmatrix} 3.415 \\ 2.914 \\ -1.207 \end{bmatrix} \approx \begin{bmatrix} 1 \\ 0.853 \\ -0.353 \end{bmatrix}.$

So the dominant eigenvalue is about 3.415 and $y_{10}$ approximates the dominant eigenvector.

(b) We have

$Ay_{10} = \begin{bmatrix} 1 & 2 & -2 \\ 1 & 1 & -3 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0.853 \\ -0.353 \end{bmatrix} = \begin{bmatrix} 3.412 \\ 2.912 \\ -1.206 \end{bmatrix}, \qquad \lambda_1 y_{10} = 3.415 \begin{bmatrix} 1 \\ 0.853 \\ -0.353 \end{bmatrix} = \begin{bmatrix} 3.415 \\ 2.913 \\ -1.205 \end{bmatrix},$

so we have indeed approximated an eigenvalue and an eigenvector of A.
9. With the given A and $x_0$, we get the values below:

k:    0       1               2                 3                 4                 5
xk:  (1, 1)  (26, 8)         (17.692, 5.923)   (18.017, 6.004)   (17.999, 6.000)   (18.000, 6.000)
yk:  (1, 1)  (1.000, 0.308)  (1.000, 0.335)    (1.000, 0.333)    (1.000, 0.333)    (1.000, 0.333)
mk:   1      26              17.692            18.017            17.999            18.000

We conclude that the dominant eigenvalue of A is 18, with eigenvector $(3, 1)$.
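The iteration in the table can be reproduced in a few lines (a sketch, not part of the manual; it assumes, consistent with Exercise 29 below, that the matrix here is A = [[14,12],[5,3]]):

```python
import numpy as np

def power_method(A, x0, steps):
    """Normalized power iteration: returns the final scaling factor m_k
    (the eigenvalue estimate) and the normalized vector y_k."""
    y = np.asarray(x0, dtype=float)
    m = 1.0
    for _ in range(steps):
        x = A @ y
        m = x[np.argmax(np.abs(x))]   # entry of largest magnitude, keeping its sign
        y = x / m
    return m, y

# Assumed matrix: eigenvalues 18 and -1, dominant eigenvector (3, 1).
A = np.array([[14.0, 12.0], [5.0, 3.0]])
m, y = power_method(A, [1, 1], 10)
assert abs(m - 18) < 1e-6
assert np.allclose(y, [1, 1/3], atol=1e-6)   # direction of (3, 1)
```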
10. With the given A and $x_0$, we get the values below:

k:    0       1              2                 3                 4                 5                 6
xk:  (1, 0)  (−6, 8)        (8.500, −8.000)   (−9.765, 9.882)   (9.929, −9.905)   (−9.990, 9.995)   (9.997, −9.996)
yk:  (1, 0)  (−0.75, 1.00)  (1.000, −0.941)   (−0.988, 1.000)   (1.000, −0.998)   (−1.000, 1.000)   (1.000, −1.000)
mk:   1      8              8.5               9.882             9.929             9.995             9.997

We conclude that the dominant eigenvalue of A is −10 (not 10, since at each step the sign of each component of $y_k$ is reversed), with eigenvector $(1, -1)$.
11. With the given A and $x_0$, we get the values below:

k:    0       1           2               3               4               5               6
xk:  (1, 0)  (7, 2)      (7.571, 2.857)  (7.755, 3.132)  (7.808, 3.212)  (7.823, 3.234)  (7.827, 3.240)
yk:  (1, 0)  (1, 0.286)  (1, 0.377)      (1, 0.404)      (1, 0.411)      (1, 0.413)      (1, 0.414)
mk:   1      7           7.571           7.755           7.808           7.823           7.827

We conclude that the dominant eigenvalue of A is about 7.83, with eigenvector about $(1, 0.414)$. The true values are $\lambda = 5 + 2\sqrt2$ with eigenvector $(1, \sqrt2 - 1)$.
12. With the given A and $x_0$, we get the values below:

k:    0       1           2               3               4               5               6
xk:  (1, 0)  (3.5, 1.5)  (4.143, 1.286)  (3.966, 1.345)  (4.009, 1.330)  (3.998, 1.334)  (4.001, 1.333)
yk:  (1, 0)  (1, 0.429)  (1, 0.310)      (1, 0.339)      (1, 0.332)      (1, 0.334)      (1, 0.333)
mk:   1      3.5         4.143           3.966           4.009           3.998           4.001

We conclude that the dominant eigenvalue of A is 4, with eigenvector $(3, 1)$.
13. With the given A and $x_0$, we get the values below:

k:    0          1             2                         3                         4                         5
xk:  (1, 1, 1)  (21, 15, 13)  (16.810, 12.238, 10.714)  (17.011, 12.371, 10.824)  (16.999, 12.363, 10.818)  (17.000, 12.364, 10.818)
yk:  (1, 1, 1)  (1, 0.714, 0.619)  (1, 0.728, 0.637)    (1, 0.727, 0.636)         (1, 0.727, 0.636)         (1, 0.727, 0.636)
mk:   1         21            16.810                    17.011                    16.999                    17.000

We conclude that the dominant eigenvalue of A is 17, with corresponding eigenvector about $(1, 0.727, 0.636)$. In fact the eigenspace corresponding to this eigenvalue is $\operatorname{span}\{(1, 2, 0), (0, -2, 1)\}$, and we have found one eigenvector in this space.
14. With the given A and $x_0$, we get the values below:

k:    0          1          2                3                      4                      5                      6
xk:  (1, 1, 1)  (4, 5, 4)  (3.4, 4.6, 3.4)  (3.217, 4.478, 3.217)  (3.155, 4.437, 3.155)  (3.133, 4.422, 3.133)  (3.126, 4.417, 3.126)
yk:  (1, 1, 1)  (0.8, 1.0, 0.8)  (0.739, 1, 0.739)  (0.718, 1, 0.718)  (0.711, 1, 0.711)  (0.709, 1, 0.709)      (0.708, 1, 0.708)
mk:   1         5          4.6              4.478                  4.437                  4.422                  4.417

We conclude that the dominant eigenvalue of A is $\approx 4.4$, with corresponding eigenvector $\approx (0.708, 1, 0.708)$. In fact the dominant eigenvalue is $3 + \sqrt2$ with eigenspace spanned by $(\frac{\sqrt2}{2}, 1, \frac{\sqrt2}{2})$.
 
15. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1          2                  3                  4                  5                  6                  7               8
xk:  (1, 1, 1)  (8, 2, 4)  (5.75, 0.50, 2.25) (5.26, 0.17, 1.87) (5.10, 0.07, 1.74) (5.04, 0.03, 1.70) (5.02, 0.01, 1.68) (5.01, 0, 1.67) (5, 0, 1.67)
yk:  (1, 1, 1)  (1, 0.25, 0.50) (1, 0.09, 0.39) (1, 0.03, 0.36)  (1, 0.01, 0.34)    (1, 0.01, 0.34)    (1, 0, 0.33)       (1, 0, 0.33)    (1, 0, 0.33)
mk:   1         8          5.75               5.26               5.10               5.04               5.02               5.01            5

We conclude that the dominant eigenvalue of A is 5, with corresponding eigenvector $(3, 0, 1)$.
 
16. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1           2                 3                    4                    5
xk:  (1, 1, 1)  (24, 2, 6)  (11.0, 1.5, −2.5) (14.18, 2.45, −7.91) (16.38, 3.12, −11.65) (17.41, 3.48, −13.41)
yk:  (1, 1, 1)  (1, 0.08, 0.25) (1, 0.14, −0.23) (1, 0.17, −0.56)  (1, 0.19, −0.71)     (1, 0.20, −0.77)
mk:   1         24          11                14.18                16.38                17.41

k:    6                     7                     8                     9                     10                   11
xk:  (17.80, 3.54, −14.05)  (17.93, 3.58, −14.28) (17.98, 3.59, −14.36) (17.99, 3.60, −14.39) (18.00, 3.60, −14.40) (18.00, 3.60, −14.40)
yk:  (1, 0.20, −0.79)       (1, 0.2, −0.8)        (1, 0.2, −0.8)        (1, 0.2, −0.8)        (1, 0.2, −0.8)        (1, 0.2, −0.8)
mk:  17.80                  17.93                 17.98                 17.99                 18                    18

We conclude that the dominant eigenvalue of A is 18, with corresponding eigenvector $(1, 0.2, -0.8)$, or $(5, 1, -4)$.
17. With the given A and $x_0$, we get the values below:

k:      0       1        2               3
xk:    (1, 0)  (7, 2)   (7.571, 2.857)  (7.755, 3.132)
R(xk):  7      7.755    7.823           7.828

The Rayleigh quotient converges to the dominant eigenvalue, to three decimal places, after three iterations.
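The Rayleigh quotient estimate can be sketched directly (not part of the manual; it assumes, consistent with Exercise 11, that the matrix here is A = [[7,2],[2,3]], whose dominant eigenvalue is 5 + 2√2):

```python
import numpy as np

def rayleigh(A, x):
    # R(x) = (x . Ax) / (x . x)
    return x @ A @ x / (x @ x)

# Assumed symmetric matrix; dominant eigenvalue 5 + 2*sqrt(2) ≈ 7.828.
A = np.array([[7.0, 2.0], [2.0, 3.0]])
x = np.array([1.0, 0.0])
for _ in range(5):
    x = A @ x
    x /= np.abs(x).max()          # normalize by the largest-magnitude entry

assert abs(rayleigh(A, x) - (5 + 2 * np.sqrt(2))) < 1e-3
```

For a symmetric matrix the Rayleigh quotient converges roughly twice as fast (in digits) as the scaling factors m_k, which is why the table above stabilizes after only three iterations.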
18. With the given A and $x_0$, we get the values below:

k:      0       1           2               3
xk:    (1, 0)  (3.5, 1.5)  (4.143, 1.286)  (3.966, 1.345)
R(xk):  3.5    3.966       3.998           4

The Rayleigh quotient converges to the dominant eigenvalue after three iterations.
19. With the given A and $x_0$, we get the values below:

k:      0          1             2
xk:    (1, 1, 1)  (21, 15, 13)  (16.810, 12.238, 10.714)
R(xk):  16.333    16.998        17

The Rayleigh quotient converges to the dominant eigenvalue after two iterations.
20. With the given A and $x_0$, we get the values below:

k:      0          1          2                3
xk:    (1, 1, 1)  (4, 5, 4)  (3.4, 4.6, 3.4)  (3.217, 4.478, 3.217)
R(xk):  4.333     4.404      4.413            4.414

The Rayleigh quotient converges to the dominant eigenvalue, to three decimal places, after three iterations.
21. With the given matrix, and using $x_0 = (1, 1)$, we get

k:    0       1         2           3             4             5             6             7             8
xk:  (1, 1)  (5, 4)    (4.8, 3.2)  (4.67, 2.67)  (4.57, 2.29)  (4.50, 2.00)  (4.44, 1.78)  (4.40, 1.60)  (4.36, 1.45)
yk:  (1, 1)  (1, 0.8)  (1, 0.67)   (1, 0.57)     (1, 0.5)      (1, 0.44)     (1, 0.40)     (1, 0.36)     (1, 0.33)
mk:   1      5         4.8         4.67          4.57          4.50          4.44          4.40          4.36

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} 4-\lambda & 1 \\ 0 & 4-\lambda \end{vmatrix} = (\lambda - 4)^2,$

so that A has the double eigenvalue 4. To find the corresponding eigenspace,

$[A - 4I \mid 0] = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix},$

so that the eigenspace has basis $\{(1, 0)\}$. The values $m_k$ appear to be converging to 4, but very slowly; the vectors $y_k$ may also be converging to the eigenvector, but very slowly if at all.
22. With the given matrix, and using $x_0 = (1, 1)$, we get

k:    0       1       2           3              4            5            6              7              8
xk:  (1, 1)  (4, 0)  (3, −1)     (2.67, −1.33)  (2.5, −1.5)  (2.4, −1.6)  (2.33, −1.67)  (2.29, −1.71)  (2.25, −1.75)
yk:  (1, 1)  (1, 0)  (1, −0.33)  (1, −0.5)      (1, −0.6)    (1, −0.67)   (1, −0.71)     (1, −0.75)     (1, −0.78)
mk:   1      4       3           2.67           2.5          2.4          2.33           2.29           2.25

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} 3-\lambda & 1 \\ -1 & 1-\lambda \end{vmatrix} = \lambda^2 - 4\lambda + 4 = (\lambda - 2)^2,$

so that A has the double eigenvalue 2. To find the corresponding eigenspace,

$[A - 2I \mid 0] = \begin{bmatrix} 1 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix},$

so that the eigenspace has basis $\{(1, -1)\}$. The values $m_k$ appear to be converging to 2, but very slowly; the vectors $y_k$ may also be converging to the eigenvector, but very slowly.
23. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1             2                  3                   4                   5–8
xk:  (1, 1, 1)  (5, 4, 1)     (4.2, 3.2, 0.2)    (4.05, 3.05, 0.05)  (4.01, 3.01, 0.01)  (4, 3, 0)
yk:  (1, 1, 1)  (1, 0.8, 0.2) (1, 0.76, 0.05)    (1, 0.75, 0.01)     (1, 0.75, 0)        (1, 0.75, 0)
mk:   1         5             4.2                4.05                4.01                4

The matrix is upper triangular, so that A has an eigenvalue 1 and a double eigenvalue 4. To find the eigenspace for $\lambda = 4$,

$[A - 4I \mid 0] = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -3 & 0 \end{bmatrix} \to \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$

so that the eigenspace has basis $\{(1, 0, 0), (0, 1, 0)\}$. The values $m_k$ converge to 4, while the $y_k$ converge apparently to $(1, 0.75, 0)$, which is in the eigenspace.
 
24. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1             2                3                4                5
xk:  (1, 1, 1)  (0, 6, 5)     (0, 5.83, 4.17)  (0, 5.71, 3.57)  (0, 5.62, 3.12)  (0, 5.56, 2.78)
yk:  (1, 1, 1)  (0, 1, 0.83)  (0, 1, 0.71)     (0, 1, 0.62)     (0, 1, 0.56)     (0, 1, 0.50)
mk:   1         6             5.83             5.71             5.62             5.56

k:    6                7                8
xk:  (0, 5.50, 2.50)  (0, 5.45, 2.27)  (0, 5.42, 2.08)
yk:  (0, 1, 0.45)     (0, 1, 0.42)     (0, 1, 0.38)
mk:  5.50             5.45             5.42

The matrix is upper triangular, so that A has an eigenvalue 0 and a double eigenvalue 5. To find the eigenspace for $\lambda = 5$,

$[A - 5I \mid 0] = \begin{bmatrix} -5 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$

so that the eigenspace has basis $\{(0, 1, 0)\}$. The values $m_k$ appear to converge slowly to 5, while the $y_k$ also appear to converge slowly to $(0, 1, 0)$.
25. With the given matrix, and using $x_0 = (1, 1)$, we get

k:    0       1       2         3       4
xk:  (1, 1)  (1, 0)  (−1, −1)  (1, 0)  (−1, −1)
yk:  (1, 1)  (1, 0)  (1, 1)    (1, 0)  (1, 1)
mk:   1      1       −1        1       −1

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} -1-\lambda & 2 \\ -1 & 1-\lambda \end{vmatrix} = \lambda^2 + 1,$

so that A has no real eigenvalues, so the power method cannot succeed.
26. With the given matrix, and using $x_0 = (1, 1)$, we get

k:    0       1       2       3
xk:  (1, 1)  (3, 3)  (3, 3)  (3, 3)
yk:  (1, 1)  (1, 1)  (1, 1)  (1, 1)
mk:   1      3       3       3

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} 2-\lambda & 1 \\ -2 & 5-\lambda \end{vmatrix} = \lambda^2 - 7\lambda + 12 = (\lambda-3)(\lambda-4),$

so that the power method converges to a non-dominant eigenvalue. This happened because we chose $(1, 1)$ as the initial value, which is in fact an eigenvector for $\lambda = 3$:

$\begin{bmatrix} 2 & 1 \\ -2 & 5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 1 \end{bmatrix}.$
27. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1                2                3                4                5
xk:  (1, 1, 1)  (3, 4, 3)        (2.5, 4.0, 2.5)  (2.25, 4, 2.25)  (2.12, 4, 2.12)  (2.06, 4, 2.06)
yk:  (1, 1, 1)  (0.75, 1, 0.75)  (0.62, 1, 0.62)  (0.56, 1, 0.56)  (0.53, 1, 0.53)  (0.52, 1, 0.52)
mk:   1         4                4                4                4                4

k:    6                7                8
xk:  (2.03, 4, 2.03)  (2.02, 4, 2.02)  (2.01, 4, 2.01)
yk:  (0.51, 1, 0.51)  (0.5, 1, 0.5)    (0.5, 1, 0.5)
mk:  4                4                4

The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} -5-\lambda & 1 & 7 \\ 0 & 4-\lambda & 0 \\ 7 & 1 & -5-\lambda \end{vmatrix} = -(\lambda + 12)(\lambda - 4)(\lambda - 2),$

so that the power method converges to a non-dominant eigenvalue. This happened because we chose $(1, 1, 1)$ as the initial value. This vector is orthogonal to the eigenvector corresponding to the dominant eigenvalue $\lambda = -12$, which is $(1, 0, -1)$:

$\begin{bmatrix} -5 & 1 & 7 \\ 0 & 4 & 0 \\ 7 & 1 & -5 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} -12 \\ 0 \\ 12 \end{bmatrix} = -12\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}.$
28. With the given matrix, and using $x_0 = (1, 1, 1)$, we get

k:    0          1            2              3              4                5
xk:  (1, 1, 1)  (0, 2, 1)    (−1, 1, −0.5)  (−2, 0, −2.5)  (0.8, 0.8, 1.8)  (0, 0.89, 1)
yk:  (1, 1, 1)  (0, 1, 0.5)  (−1, 1, −0.5)  (0.8, 0, 1)    (0.44, 0.44, 1)  (0, 0.89, 1)
mk:   1         2            1              −2.5           1.8              1

k:    6                     7              8
xk:  (−0.89, 0.89, 0.11)   (−2, 0, −1.88) (1, 1, 1.94)
yk:  (−1, 1, 0.12)         (1, 0, 0.94)   (0.52, 0.52, 1)
mk:  0.89                  −2             1.94

Neither $m_k$ nor $y_k$ shows clear signs of converging to anything. The characteristic polynomial is

$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & -1 & 0 \\ 1 & 1-\lambda & 0 \\ 1 & -1 & 1-\lambda \end{vmatrix} = (1-\lambda)(\lambda^2 - 2\lambda + 2),$

so that A has eigenvalues 1 and $1 \pm i$. The absolute value of each of the complex eigenvalues is $\sqrt2 > 1$, so 1 is not the eigenvalue of largest magnitude, and the power method cannot succeed.
29. In Exercise 9, we found that $\lambda_1 = 18$. To find $\lambda_2$, we apply the power method to

$A - 18I = \begin{bmatrix} -4 & 12 \\ 5 & -15 \end{bmatrix}$ with $x_0 = (1, 1)$.

k:    0       1          2              3
xk:  (1, 1)  (8, −10)   (15.2, −19.0)  (15.2, −19.0)
yk:  (1, 1)  (−0.8, 1)  (−0.8, 1)      (−0.8, 1)
mk:   1      −10        −19            −19

Thus $\lambda_2 - \lambda_1 = -19$, so that $\lambda_2 = -1$ is the second eigenvalue of A.
30. In Exercise 10, we found that $\lambda_1 = -10$, so we apply the power method to

$A + 10I = \begin{bmatrix} 4 & 4 \\ 8 & 8 \end{bmatrix}$ with $x_0 = (1, 1)$:

k:    0       1         2         3
xk:  (1, 1)  (8, 16)   (6, 12)   (6, 12)
yk:  (1, 1)  (0.5, 1)  (0.5, 1)  (0.5, 1)
mk:   1      16        12        12

Thus $\lambda_2 - \lambda_1 = 12$, so that $\lambda_2 = 2$ is the second eigenvalue of A.
31. In Exercise 13, we found that $\lambda_1 = 17$, so we apply the power method to

$A - 17I = \begin{bmatrix} -8 & 4 & 8 \\ 4 & -2 & -4 \\ 8 & -4 & -8 \end{bmatrix}$ with $x_0 = (1, 1, 1)$.

k:    0          1             2             3
xk:  (1, 1, 1)  (4, −2, −4)   (18, −9, −18) (18, −9, −18)
yk:  (1, 1, 1)  (−1, 0.5, 1)  (−1, 0.5, 1)  (−1, 0.5, 1)
mk:   1         −4            −18           −18

Thus $\lambda_2 - \lambda_1 = -18$, so that $\lambda_2 = -1$ is the second eigenvalue of A.
32. In Exercise 14, we found that $\lambda_1 \approx 4.4$, so we apply the power method to

$A - 4.4I = \begin{bmatrix} -1.4 & 1 & 0 \\ 1 & -1.4 & 1 \\ 0 & 1 & -1.4 \end{bmatrix}$ with $x_0 = (1, 1, 1)$.

k:    0          1                   2                       3                       4                       5
xk:  (1, 1, 1)  (−0.4, 0.6, −0.4)   (1.933, −2.733, 1.933)  (1.990, −2.814, 1.990)  (1.990, −2.815, 1.990)  (1.990, −2.814, 1.990)
yk:  (1, 1, 1)  (−0.667, 1, −0.667) (−0.707, 1, −0.707)     (−0.707, 1, −0.707)     (−0.707, 1, −0.707)     (−0.707, 1, −0.707)
mk:   1         0.6                 −2.733                  −2.814                  −2.815                  −2.814

Thus $\lambda_2 - \lambda_1 \approx -2.81$, so that $\lambda_2 \approx 1.59$ is the second eigenvalue of A (in fact, $\lambda_2 = 3 - \sqrt2$).
33. With A and x0 as in Exercise 9, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get

k    xk                  mk        yk
0    (1, 1)              1         (1, 1)
1    (0.5, −0.5)         −0.5      (−1, 1)
2    (0.833, −1.056)     −1.056    (−0.789, 1)
3    (0.798, −0.997)     −0.997    (−0.801, 1)
4    (0.8, −1)           −1        (−0.8, 1)
5    (0.8, −1)           −1        (−0.8, 1)

We conclude that the eigenvalue of A of smallest magnitude is 1/(−1) = −1.
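The inverse power method above never inverts A explicitly; each step solves Axk = yk−1. The sketch below is a hedged illustration of that idea, assuming the Exercise 9 matrix is A = [14 12; 5 3] (consistent with A − 18I in Exercise 29); `solve2` and `inverse_power` are hypothetical helper names.

```python
# A hedged sketch of the inverse power method of Exercise 33.
def solve2(A, b):
    """Solve a 2x2 system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def inverse_power(A, y, steps=60):
    m = 1.0
    for _ in range(steps):
        x = solve2(A, y)             # solve A x_k = y_{k-1}
        m = max(x, key=abs)          # mk: entry of largest magnitude
        y = [xi / m for xi in x]
    return m

A = [[14.0, 12.0], [5.0, 3.0]]       # assumed Exercise 9 matrix
m = inverse_power(A, [1.0, 1.0])
smallest = 1 / m                     # m -> -1, so the smallest eigenvalue is -1
```

Since A has eigenvalues 18 and −1, the dominant eigenvalue of A⁻¹ is −1, and mk converges to it quickly.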
34. With A and x0 as in Exercise 10, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. The iterates converge quickly: by k = 6 we have xk = (0.25, 0.50), yk = (0.5, 1), and mk = 0.5. We conclude that the eigenvalue of A of smallest magnitude is 1/0.5 = 2.
4.5. ITERATIVE METHODS FOR COMPUTING EIGENVALUES
319
35. With A as in Exercise 7, and x0 = (1, 1, −1) as given, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. We get

k    xk                        mk      yk
0    (1, 1, −1)                1       (1, 1, −1)
1    (−0.5, 0, 0.5)            −0.5    (1, 0, −1)
2    (0.500, 0.333, −0.500)    −0.5    (−1, −0.667, 1)
3    (0.500, 0.111, −0.500)    −0.5    (−1, −0.222, 1)
4    (0.500, 0.259, −0.500)    −0.5    (−1, −0.519, 1)
5    (0.50, 0.16, −0.50)       −0.5    (−1, −0.321, 1)

The mk stabilize at −0.5 immediately, even though the middle components converge more slowly. We conclude that the eigenvalue of A of smallest magnitude is 1/(−0.5) = −2.
36. With A and x0 as in Exercise 14, proceed to calculate xk by solving Axk = yk−1 at each step to apply the inverse power method. The iterates converge with mk → ≈ 0.645 and yk → ≈ (−0.725, 1, −0.725). We conclude that the eigenvalue of A of smallest magnitude is ≈ 1/0.645 ≈ 1.55. In fact the eigenvalue is 3 − √2.
37. With α = 0, we are just using the inverse power method, so the solution to Exercise 33 provides the
eigenvalue closest to zero.
38. Since α = 0, there is no shift required, so this is the inverse power method. Take x0 = y0 = (1, 1, 1), and proceed to calculate xk by solving Axk = yk−1 at each step. The mk are

k:  0, 1, 2, 3, 4, 5, 6;   mk:  1, 0.375, −0.75, −1.083, −0.981, −1.005, −0.999,

so mk → −1. We conclude that the eigenvalue of A closest to 0 (i.e., the eigenvalue smallest in magnitude) is 1/(−1) = −1.
39. We first shift A:

A − 5I = [−1 0 6; −1 −2 1; 6 0 −1].

Now take x0 = y0 = (1, 1, 1), and proceed to calculate xk by solving (A − 5I)xk = yk−1 at each step to apply the inverse power method. We get

k    xk                       mk      yk
0    (1, 1, 1)                1       (1, 1, 1)
1    (0.2, −0.5, 0.2)         −0.5    (−0.4, 1, −0.4)
2    (−0.08, −0.5, −0.08)     −0.5    (0.16, 1, 0.16)
3    (0.032, −0.5, 0.032)     −0.5    (−0.064, 1, −0.064)

We conclude that the eigenvalue of A closest to 5 is 5 + 1/(−0.5) = 3.
40. We first shift A:

A + 2I = [11 4 8; 4 17 −4; 8 −4 11].

Now take x0 = y0 = (1, 1, 1), and proceed to calculate xk by solving (A + 2I)xk = yk−1 at each step to apply the inverse power method. We get

k    xk                          mk       yk
0    (1, 1, 1)                   1        (1, 1, 1)
1    (−0.158, 0.158, 0.263)      0.263    (−0.6, 0.6, 1)
2    (−0.832, 0.432, 0.853)      0.853    (−0.975, 0.506, 1)
3    (−0.999, 0.496, 1.000)      1.000    (−0.999, 0.496, 1)
4    (−0.990, 0.500, 0.991)      0.991    (−1, 0.5, 1)
5    (−1, 0.5, 1)                1        (−1, 0.5, 1)

We conclude that the eigenvalue of A closest to −2 is −2 + 1/1 = −1.
41. The companion matrix of p(x) = x² + 2x − 2 is

C(p) = [−2 2; 1 0],

so we apply the inverse power method with x0 = y0 = (1, 1), calculating xk by solving C(p)xk = yk−1 at each step. The mk are

k:  1, 2, 3, 4, 5, 6;   mk:  1.5, 1.333, 1.375, 1.364, 1.367, 1.366,

so mk → 1.366. So the root of p(x) closest to 0 is ≈ 1/1.366 ≈ 0.732. The exact value is −1 + √3.
42. The companion matrix of p(x) = x² − x − 3 is

C(p) = [1 3; 1 0],
so we apply the inverse power method to

C(p) − 2I = [−1 3; 1 −2]  with  x0 = y0 = (1, 1),

calculating xk by solving (C(p) − 2I)xk = yk−1 at each step. The mk are

k:  1, 2, 3, 4, 5, 6;   mk:  5, 3.2, 3.312, 3.302, 3.303, 3.303,

so mk → 3.303. So the root of p(x) closest to 2 is ≈ 2 + 1/3.303 ≈ 2.303. The exact value is (1 + √13)/2.
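The companion-matrix root-finding in Exercises 41 and 42 can be sketched as follows. This is a hedged illustration under the companion-matrix convention C(p) = [−a1 −a0; 1 0] for p(x) = x² + a1x + a0; `root_near` is a hypothetical helper name.

```python
# A hedged sketch of Exercises 41-42: find the root of a quadratic closest
# to alpha by inverse power iteration on C(p) - alpha*I.
def root_near(a1, a0, alpha, steps=100):
    A = [[-a1 - alpha, -a0], [1.0, -alpha]]      # C(p) - alpha*I
    y = [1.0, 1.0]
    m = 1.0
    for _ in range(steps):
        det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
        x = [(y[0] * A[1][1] - A[0][1] * y[1]) / det,   # solve A x = y
             (A[0][0] * y[1] - y[0] * A[1][0]) / det]
        m = max(x, key=abs)
        y = [xi / m for xi in x]
    return alpha + 1 / m             # m -> 1/(root - alpha)

r41 = root_near(2.0, -2.0, 0.0)    # x^2 + 2x - 2, root near 0: -1 + sqrt(3)
r42 = root_near(-1.0, -3.0, 2.0)   # x^2 - x - 3, root near 2: (1 + sqrt(13))/2
```

The mk here converge to 1.366 and 3.303 respectively, matching the tables in both exercises.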
43. The companion matrix of p(x) = x³ − 2x² + 1 is

C(p) = [2 0 −1; 1 0 0; 0 1 0],

so we apply the inverse power method to C(p). If we first start with y0 = (1, 1, 1), we find that this is an eigenvector of C(p) with eigenvalue 1, but this may not be the root closest to 0, so we start instead with x0 = y0 = (1, 0, 1), calculating xk by solving C(p)xk = yk−1 at each step. We get
k    xk                       mk
0    (1, 0, 1)                1
1    (0, 1, −1)               −1
2    (−1, 1, −2)              −2
3    (−0.5, 1, −1.5)          −1.5
4    (−0.667, 1, −1.667)      −1.667
5    (−0.6, 1, −1.6)          −1.6
6    (−0.625, 1, −1.625)      −1.625
7    (−0.615, 1, −1.615)      −1.615
8    (−0.619, 1, −1.619)      −1.619
9    (−0.618, 1, −1.618)      −1.618

(with yk = xk/mk at each step). We conclude that the root of p(x) closest to zero is 1/(−1.618) ≈ −0.618. The actual root is (1 − √5)/2.
44. The companion matrix of p(x) = x³ − 5x² + x + 1 is

C(p) = [5 −1 −1; 1 0 0; 0 1 0],

so we apply the inverse power method to C(p) − 5I, starting with x0 = y0 = (1, 1, 1) and calculating xk by solving (C(p) − 5I)xk = yk−1 at each step. The mk are

k:  1, 2, 3, 4, 5, 6;   mk:  −2.333, −3.762, −3.909, −3.918, −3.919, −3.919,

with yk → (1, 0.211, 0.044). We conclude that the root of p(x) closest to 5 is 5 + 1/(−3.919) ≈ 4.745.
45. First, A − αI is invertible, since α is an eigenvalue of A precisely when A − αI is not invertible. Let x be an eigenvector for λ. Then Ax = λx, so that Ax − αx = λx − αx, and therefore (A − αI)x = (λ − α)x. Now left multiply both sides of this equation by (A − αI)⁻¹, to get

x = (λ − α)(A − αI)⁻¹x,  so that  1/(λ − α) x = (A − αI)⁻¹x.

But the last equation says exactly that 1/(λ − α) is an eigenvalue of (A − αI)⁻¹ with eigenvector x.
46. If λ is dominant, then |λ| > |γ| for all other eigenvalues γ. But this means that the algebraic multiplicity
of λ is 1, since it appears only once in the list of eigenvalues listed with multiplicity, so its geometric
multiplicity is 1 and thus its eigenspace is one-dimensional.
47. Using rows, the Gerschgorin disks are:

Center 1, radius 1 + 0 = 1,
Center 4, radius 1/2 + 1/2 = 1,
Center 5, radius 1 + 0 = 1.

Using columns, they are

Center 1, radius 1/2 + 1 = 3/2,
Center 4, radius 1 + 0 = 1,
Center 5, radius 0 + 1/2 = 1/2.
From this diagram, we see that A has a real eigenvalue between 0 and 2.
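The disk computation used here and in Exercises 48-50 is mechanical and can be sketched in code. The matrix below is an assumption reconstructed to reproduce the radii listed in Exercise 47 (the actual matrix appears in the textbook); `gerschgorin` is a hypothetical helper name.

```python
# A hedged sketch of the Gerschgorin disk computation.
def gerschgorin(A):
    """Return (center, radius) per row: center a_ii, radius sum of |a_ij|, j != i."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

A = [[1.0, 1.0, 0.0], [0.5, 4.0, 0.5], [1.0, 0.0, 5.0]]   # assumed matrix
rows = gerschgorin(A)                             # row disks
cols = gerschgorin([list(c) for c in zip(*A)])    # column disks: transpose first
```

Row disks come out as centers 1, 4, 5 each with radius 1, and column disks as centers 1, 4, 5 with radii 3/2, 1, 1/2, matching the lists above.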
48. Using rows, the Gerschgorin disks are:

Center 2, radius |−i| + 0 = 1,
Center 2i, radius |1| + |1 + i| = 1 + √2,
Center −2i, radius 0 + 1 = 1.

Using columns, they are

Center 2, radius 1 + 0 = 1,
Center 2i, radius |−i| + 1 = 2,
Center −2i, radius 0 + |1 + i| = √2.
From this diagram, we conclude that A has an eigenvalue within one unit of −2i.
49. Using rows, the Gerschgorin disks are:

Center 4 − 3i, radius |i| + 2 + 2 = 5,
Center −1 + i, radius |i| + 0 + 0 = 1,
Center 5 + 6i, radius |1 + i| + |−i| + |2i| = 3 + √2,
Center −5 − 5i, radius 1 + |−2i| + |2i| = 5.

Using columns, they are

Center 4 − 3i, radius |i| + |1 + i| + 1 = 2 + √2,
Center −1 + i, radius |i| + |−i| + |−2i| = 4,
Center 5 + 6i, radius 2 + 0 + |2i| = 4,
Center −5 − 5i, radius |−2| + 0 + |2i| = 4.
From this diagram, we conclude that A has an eigenvalue within 1 unit of −1 + i.
50. Using rows, the Gerschgorin disks are:

Center 2, radius 1/2,
Center 4, radius 1/4 + 1/4 = 1/2,
Center 6, radius 1/6 + 1/6 = 1/3,
Center 8, radius 1/8.

Using columns, they are

Center 2, radius 1/4,
Center 4, radius 1/2 + 1/6 = 2/3,
Center 6, radius 1/4 + 1/8 = 3/8,
Center 8, radius 1/6.

Since the circles are all disjoint, we conclude that A has four real eigenvalues, near 2, 4, 6, and 8.
51. Let A be a strictly diagonally dominant square matrix. Then in particular, none of the diagonal entries of A can be zero, since each row has the property that |aii| > Σ_{j≠i} |aij|. Therefore each Gerschgorin disk is centered at a nonzero point, and its radius is less than the absolute value of its center, so the disk does not contain the origin. Since every eigenvalue is contained in some Gerschgorin disk, it follows that 0 is not an eigenvalue, so that the matrix is invertible by Theorem 4.17.
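The dominance condition in this argument is easy to check mechanically. Below is a hedged sketch with a hypothetical example matrix; `strictly_diagonally_dominant` is a helper name not in the text.

```python
# A hedged sketch of the strict diagonal dominance test behind Exercise 51.
def strictly_diagonally_dominant(A):
    """Check |a_ii| > sum of |a_ij| over j != i for every row i."""
    n = len(A)
    return all(abs(A[i][i]) > sum(abs(A[i][j]) for j in range(n) if j != i)
               for i in range(n))

# Hypothetical example: each Gerschgorin disk has |center| > radius, so no
# disk contains 0 and the matrix is invertible, as argued above.
A = [[4.0, 1.0, 1.0], [0.5, 3.0, 1.0], [1.0, 1.0, 5.0]]
```

By the argument of Exercise 51, whenever this function returns True, the matrix is invertible.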
52. Let λ be any eigenvalue of A. Then by Theorem 4.29, λ is contained in the Gerschgorin disk for some row i, so that

|λ − aii| ≤ Σ_{j≠i} |aij|.

By the Triangle Inequality, |λ| ≤ |λ − aii| + |aii|, so that

|λ| ≤ |λ − aii| + |aii| ≤ |aii| + Σ_{j≠i} |aij| = Σ_{j=1}^{n} |aij| ≤ ‖A‖,

since ‖A‖ is the maximum of the possible row sums.
53. A stochastic matrix is a square matrix whose columns are probability vectors, which means that the sum of the entries in any column of A is 1, so the sum of the entries of any row in Aᵀ is 1, so that ‖Aᵀ‖ = 1. By Exercise 52, any eigenvalue of Aᵀ must satisfy |λ| ≤ ‖Aᵀ‖ = 1. But by Exercise 19 in Section 4.3, the eigenvalues of A and of Aᵀ are the same, so that if λ is any eigenvalue of A, then |λ| ≤ 1.
54. Using rows, the Gerschgorin disks are:

Center 0, radius 1,
Center 5, radius 2,
Center 3, radius 1/2 + 1/2 = 1,
Center 7, radius 3/4.

Using columns, they are

Center 0, radius 2 + 1/2 = 5/2,
Center 5, radius 1,
Center 3, radius 3/4,
Center 7, radius 1/2.

From the row diagram, we see that there is exactly one eigenvalue in the circle of radius 1 around 0, so it must be real and therefore there is exactly one eigenvalue in the interval [−1, 1]. From the column diagram, we similarly conclude that there is exactly one eigenvalue in the interval [4, 6] and exactly one in [13/2, 15/2].

Since the matrix has real entries, the fourth eigenvalue must be real as well. From the row diagram, since the circle around zero contains exactly one eigenvalue, the fourth eigenvalue must be in the interval [2, 7 3/4]. But it cannot be larger than 3 3/4 since then, from the column diagram, it would either not be in a disk or would be in an isolated disk already containing an eigenvalue. So it must lie in the range [2, 3 3/4].

In summary, the four eigenvalues are each in a different one of the four intervals

[−1, 1],  [2, 3.75],  [4, 6],  [6.5, 7.5].

The actual values are ≈ −0.372, ≈ 2.908, ≈ 5.372, and ≈ 7.092.
4.6
Applications and the Perron-Frobenius Theorem
1. The square of the given matrix has zero entries in the same positions as the matrix itself, and it follows that any power of the given matrix contains zeroes. So this matrix is not regular.
2. Since the given matrix is upper triangular, any of its powers is upper triangular as well, so will contain
a zero. Thus this matrix is not regular.
3. Since

[1/3 2/3; 1 0]² = [7/9 2/9; 1/3 2/3],

the square of the given matrix has all positive entries, so this matrix is regular.
4. With P = [1/2 0 1; 1/2 0 0; 0 1 0], we have

P² = [1/4 1 1/2; 1/4 0 1/2; 1/2 0 0],
P³ = [5/8 1/2 1/4; 1/8 1/2 1/4; 1/4 0 1/2],
P⁴ = [9/16 1/4 5/8; 5/16 1/4 1/8; 1/8 1/2 1/4],

so that the fourth power of this matrix contains all positive entries, so this matrix is regular.
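The power-until-positive test used in Exercises 1-6 can be sketched as follows; this is a hedged illustration, and the matrix P is the one reconstructed above for Exercise 4 (an assumption about the textbook's matrix).

```python
# A hedged sketch of the regularity test: power the matrix and look for a
# power with all positive entries.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_regular(P, max_power=20):
    """A stochastic matrix is regular if some power has all positive entries."""
    Q = P
    for _ in range(max_power):
        if all(entry > 0 for row in Q for entry in row):
            return True
        Q = matmul(Q, P)
    return False

P = [[0.5, 0.0, 1.0], [0.5, 0.0, 0.0], [0.0, 1.0, 0.0]]   # P^4 > O, so regular
```

For the permutation-like matrices of Exercises 1 and 2, the loop never finds a positive power, so the function returns False.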
5. If A and B are any 3 × 3 matrices with a12 = a32 = b12 = b32 = 0, then
(AB)12 = a11 b12 + a12 b22 + a13 b32 = 0 + 0 + 0 = 0,
(AB)32 = a31 b12 + a32 b22 + a33 b32 = 0 + 0 + 0 = 0,
so that AB has the same property. Thus any power of the given matrix A has this property, so no
power of A consists of positive entries, so A is not regular.
6. Since the third row of the matrix is all zero, the third row of any power will also be all zero, so this
matrix is not regular.
7. The transition matrix P has characteristic equation

det(P − λI) = |1/3 − λ  1/6; 2/3  5/6 − λ| = 1/6 (λ − 1)(6λ − 1),

so its eigenvalues are λ1 = 1 and λ2 = 1/6. Thus P has two distinct eigenvalues, so is diagonalizable. Now, this matrix is regular, so the remark at the end of the proof of Theorem 4.33 shows that the vector x which forms the columns of L is a probability vector parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I:

[P − I | 0] = [−2/3 1/6 | 0; 2/3 −1/6 | 0] → [1 −1/4 | 0; 0 0 | 0],

so that the corresponding eigenspace is E1 = span((1, 4)). Thus the columns of L are of the form (t, 4t) where t + 4t = 5t = 1, so that t = 1/5. Therefore

L = [1/5 1/5; 4/5 4/5].
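The steady-state vector found by row reduction above can also be approximated numerically by iterating P on any probability vector; this is a hedged sketch of that alternative, not the book's method.

```python
# A hedged sketch: approximate the steady-state vector of the Exercise 7
# transition matrix by repeated multiplication instead of row-reducing P - I.
P = [[1/3, 1/6], [2/3, 5/6]]
v = [1.0, 0.0]                       # any probability vector works
for _ in range(100):
    v = [sum(P[i][j] * v[j] for j in range(2)) for i in range(2)]
# v approaches (1/5, 4/5), the column of L found above.
```

Convergence is fast because the second eigenvalue, 1/6, has small magnitude.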
8. The transition matrix P has characteristic equation

det(P − λI) = |1/2 − λ  1/3  1/6; 1/2  1/2 − λ  1/3; 0  1/6  1/2 − λ| = −1/36 (λ − 1)(36λ² − 18λ + 1).

So one eigenvalue is 1; the roots of the quadratic are distinct, so that P has three distinct eigenvalues, so is diagonalizable. Now, this matrix is regular (its square has all positive entries), so the remark at the end of the proof of Theorem 4.33 shows that the vector x which forms the columns of L is a probability vector parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I:

[P − I | 0] = [−1/2 1/3 1/6 | 0; 1/2 −1/2 1/3 | 0; 0 1/6 −1/2 | 0] → [1 0 −7/3 | 0; 0 1 −3 | 0; 0 0 0 | 0],

so that the corresponding eigenspace is E1 = span((7, 9, 3)). Thus the columns of L are of the form (7t, 9t, 3t) where 7t + 9t + 3t = 19t = 1, so that t = 1/19. Therefore

L = 1/19 [7 7 7; 9 9 9; 3 3 3].
9. The transition matrix P has characteristic equation

det(P − λI) = |0.2 − λ  0.3  0.4; 0.6  0.1 − λ  0.4; 0.2  0.6  0.2 − λ| = −(λ − 1)(λ² + 0.5λ + 0.08).

So one eigenvalue is 1; the roots of the quadratic are distinct (and complex), so that P has three distinct eigenvalues, so is diagonalizable. Now, this matrix is regular, so the remark at the end of the proof of Theorem 4.33 shows that the vector x which forms the columns of L is a probability vector parallel to an eigenvector for λ = 1. To compute this, we row-reduce P − I:

[P − I | 0] = [−0.8 0.3 0.4 | 0; 0.6 −0.9 0.4 | 0; 0.2 0.6 −0.8 | 0] → [1 0 −0.889 | 0; 0 1 −1.037 | 0; 0 0 0 | 0],

so that the corresponding eigenspace is E1 = span((0.889, 1.037, 1)). Thus the columns of L are of the form (0.889t, 1.037t, t) where 0.889t + 1.037t + t = 2.926t = 1, so that t = 1/2.926. Therefore

L = [0.304 0.304 0.304; 0.354 0.354 0.354; 0.341 0.341 0.341].

(If the matrix is converted to fractions and the computation done that way, the resulting eigenvector will be (24, 28, 27), which must then be scaled to a probability vector.)
10. Suppose that the sequence of iterates xk approaches both u and v, and choose ε > 0. Choose k such that |xk − u| < ε/2 and |xk − v| < ε/2. Then

|u − v| = |(u − xk) + (xk − v)| ≤ |u − xk| + |xk − v| < ε,

so that |u − v| is less than any positive ε. So it must be zero; that is, we must have u = v.
11. The characteristic polynomial is

det(L − λI) = |−λ 2; 0.5 −λ| = λ² − 1 = (λ + 1)(λ − 1),

so the positive eigenvalue is 1. To find a corresponding eigenvector, row-reduce

[L − I | 0] = [−1 2 | 0; 0.5 −1 | 0] → [1 −2 | 0; 0 0 | 0].

Thus E1 = span((2, 1)).
12. The characteristic polynomial is

det(L − λI) = |1 − λ 1.5; 0.5 −λ| = λ² − λ − 0.75 = (λ − 1.5)(λ + 0.5),

so the positive eigenvalue is 1.5. To find a corresponding eigenvector, row-reduce

[L − 1.5I | 0] = [−0.5 1.5 | 0; 0.5 −1.5 | 0] → [1 −3 | 0; 0 0 | 0].

Thus E1.5 = span((3, 1)).
13. The characteristic polynomial is

det(L − λI) = |−λ 7 4; 0.5 −λ 0; 0 0.5 −λ| = −λ³ + 3.5λ + 1 = −1/2 (λ − 2)(2λ² + 4λ + 1),

so 2 is a positive eigenvalue. The other two roots are (−4 ± √8)/4, both of which are negative. To find an eigenvector for 2, row-reduce

[L − 2I | 0] = [−2 7 4 | 0; 0.5 −2 0 | 0; 0 0.5 −2 | 0] → [1 0 −16 | 0; 0 1 −4 | 0; 0 0 0 | 0].

Thus E2 = span((16, 4, 1)).
14. The characteristic polynomial is

det(L − λI) = |1 − λ 5 3; 1/3 −λ 0; 0 2/3 −λ| = −λ³ + λ² + 5/3 λ + 2/3 = −1/3 (λ − 2)(3λ² + 3λ + 1),

so 2 is a positive eigenvalue. The other two eigenvalues are complex. To find an eigenvector for 2, row-reduce

[L − 2I | 0] = [−1 5 3 | 0; 1/3 −2 0 | 0; 0 2/3 −2 | 0] → [1 0 −18 | 0; 0 1 −3 | 0; 0 0 0 | 0].

Thus E2 = span((18, 3, 1)).
15. Since the eigenvector corresponding to the positive eigenvalue λ1 of the Leslie matrix represents a stable ratio of population classes, once that proportion is reached, if λ1 > 1 the population will grow without bound; if λ1 = 1, it will be stable; and if λ1 < 1 it will decrease and eventually die out.
16. From the Leslie matrix in Equation (3), det(L − λI) is the determinant of the n × n matrix whose first row is (b1 − λ, b2, b3, …, bn), whose subdiagonal entries are s1, s2, …, sn−1, whose remaining diagonal entries are −λ, and whose other entries are zero. For n = 1, this is just b1 − λ, which is equal to

cL(λ) = (−1)¹(λ¹ − b1λ⁰) = b1 − λ.

This forms the basis for the induction. Now assume the statement holds for n − 1, and we prove it for n. Expanding the determinant along the last column gives two terms: (−1)^(n+1) bn times the determinant of the upper triangular minor whose diagonal entries are s1, s2, …, sn−1, plus (−λ) times the corresponding (n − 1) × (n − 1) Leslie determinant built from b1, …, bn−1 and s1, …, sn−2. The first determinant is the product of its diagonal entries; for the second, we use the inductive hypothesis. So

det(L − λI) = (−1)^(n+1) bn(s1s2⋯sn−1) − λ(−1)^(n−1)(λ^(n−1) − b1λ^(n−2) − b2s1λ^(n−3) − ⋯ − bn−2s1s2⋯sn−3λ − bn−1s1s2⋯sn−2)
            = (−1)^(n+1) bn(s1s2⋯sn−1) + (−1)^n (λ^n − b1λ^(n−1) − b2s1λ^(n−2) − ⋯ − bn−2s1s2⋯sn−3λ² − bn−1s1s2⋯sn−2λ)
            = (−1)^n (λ^n − b1λ^(n−1) − b2s1λ^(n−2) − ⋯ − bn−1s1s2⋯sn−2λ − bns1s2⋯sn−1),

proving the result.
17. P = diag(1, s1, s1s2, …, s1s2⋯sn−1) is a diagonal matrix, so inverting it consists of inverting each of the diagonal entries. Now, multiplying any matrix on the left by a diagonal matrix multiplies each row by the corresponding diagonal entry, and multiplying on the right by a diagonal matrix multiplies each column by the corresponding diagonal entry. So P⁻¹L has first row (b1, b2, …, bn) and subdiagonal entries 1, 1/s1, 1/(s1s2), …, 1/(s1s2⋯sn−2), and then multiplying on the right by P rescales each column so that

P⁻¹LP = [b1  b2s1  b3s1s2  ⋯  bn−1s1s2⋯sn−2  bns1s2⋯sn−1;
         1   0     0       ⋯  0              0;
         0   1     0       ⋯  0              0;
         ⋮                  ⋱                 ⋮;
         0   0     0       ⋯  1              0].

But this is the companion matrix of the polynomial

p(x) = x^n − b1x^(n−1) − b2s1x^(n−2) − ⋯ − bn−1s1s2⋯sn−2x − bns1s2⋯sn−1,

and by Exercise 32 in Section 4.3, the characteristic polynomial of P⁻¹LP is (−1)^n p(λ). Finally, the characteristic polynomials of L and P⁻¹LP are the same since similar matrices have the same eigenvalues, so the characteristic polynomial of L is

det(L − λI) = det(P⁻¹LP − λI) = (−1)^n p(λ) = (−1)^n (λ^n − b1λ^(n−1) − b2s1λ^(n−2) − ⋯ − bn−1s1s2⋯sn−2λ − bns1s2⋯sn−1).
18. By Exercise 46 in Section 4.4, the eigenvectors of P⁻¹LP are of the form P⁻¹x where x is an eigenvector of L, and by Exercise 32 in Section 4.3, an eigenvector of P⁻¹LP for λ1 is

(λ1^(n−1), λ1^(n−2), λ1^(n−3), …, λ1, 1)ᵀ.

So an eigenvector of L corresponding to λ1 is

P(λ1^(n−1), λ1^(n−2), …, λ1, 1)ᵀ = (λ1^(n−1), s1λ1^(n−2), s1s2λ1^(n−3), …, s1s2⋯sn−2λ1, s1s2⋯sn−2sn−1)ᵀ.

Finally, we can scale this vector by dividing each entry by λ1^(n−1) to get

(1, s1/λ1, s1s2/λ1², …, s1s2⋯sn−2/λ1^(n−2), s1s2⋯sn−2sn−1/λ1^(n−1))ᵀ.
19. Using a CAS, we find the positive eigenvalue of

L = [1 1 3; 0.7 0 0; 0 0.5 0]

to be λ ≈ 1.7456, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is

(1, 0.7/1.7456, 0.7 · 0.5/1.7456²) ≈ (1, 0.401, 0.115).

So the long-term proportions of the age classes are in the ratios 1 : 0.401 : 0.115.
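In place of a CAS, the growth rate can be estimated by power iteration, which is natural here because a Leslie matrix acts very simply on an age-class vector. This is a hedged sketch under that observation; `leslie_apply` is a hypothetical helper name.

```python
# A hedged sketch of Exercise 19 without a CAS: power iteration on the
# Leslie matrix L = [1 1 3; 0.7 0 0; 0 0.5 0], stored by its birth rates b
# and survival rates s.
def leslie_apply(b, s, x):
    top = sum(bi * xi for bi, xi in zip(b, x))        # births into class 1
    return [top] + [si * xi for si, xi in zip(s, x)]  # survivors age one class

b = [1.0, 1.0, 3.0]
s = [0.7, 0.5]
x = [1.0, 1.0, 1.0]
lam = 1.0
for _ in range(200):
    y = leslie_apply(b, s, x)
    lam = max(y, key=abs)            # growth-rate estimate
    x = [yi / lam for yi in y]
# lam approaches 1.7456 and x the stable ratios 1 : 0.401 : 0.115.
```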
20. Using a CAS, we find the positive eigenvalue of

L = [0 1 2 5; 0.5 0 0 0; 0 0.7 0 0; 0 0 0.3 0]

to be λ ≈ 1.2023, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is

(1, 0.5/1.2023, 0.5 · 0.7/1.2023², 0.5 · 0.7 · 0.3/1.2023³) ≈ (1, 0.416, 0.242, 0.060).

So the long-term proportions of the age classes are in the ratios 1 : 0.416 : 0.242 : 0.06.
21. Using a CAS, we find the positive eigenvalue of the 7 × 7 Leslie matrix L with first row

(0, 0.4, 1.8, 1.8, 1.8, 1.6, 0.6)

and subdiagonal (survival) entries 0.3, 0.7, 0.9, 0.9, 0.9, 0.6, all other entries zero, to be λ ≈ 1.0924, so this is the steady state growth rate of the population with this Leslie matrix. From Exercise 18, a corresponding eigenvector is

(1, 0.3/1.0924, 0.3 · 0.7/1.0924², 0.3 · 0.7 · 0.9/1.0924³, 0.3 · 0.7 · 0.9 · 0.9/1.0924⁴, 0.3 · 0.7 · 0.9 · 0.9 · 0.9/1.0924⁵, 0.3 · 0.7 · 0.9 · 0.9 · 0.9 · 0.6/1.0924⁶)
≈ (1, 0.275, 0.176, 0.145, 0.119, 0.098, 0.054).

The entries of this vector give the long-term proportions of the age classes.
22. (a) The table gives us 13 separate age classes, each of duration two years; from the table, the Leslie matrix of the system is the 13 × 13 matrix with first row (birth rates)

(0, 0.02, 0.70, 1.53, 1.67, 1.65, 1.56, 1.45, 1.22, 0.91, 0.70, 0.22, 0)

and subdiagonal (survival) entries

0.91, 0.88, 0.85, 0.80, 0.74, 0.67, 0.59, 0.49, 0.38, 0.27, 0.17, 0.15,

all other entries being zero. Using a CAS, this matrix has positive eigenvalue ≈ 1.333, with a corresponding eigenvector (normalized so its entries add to 100)

(36.1, 24.65, 16.27, 10.38, 6.23, 3.46, 1.74, 0.77, 0.28, 0.08, 0.016, 0.0021, 0.00023).
(b) In the long run, the percentage of seals in each age group is given by the entries in the eigenvector
in part (a), and the growth rate will be about 1.333.
23. (a) Each term in r is the number of daughters born to a female in a particular age class times the
probability of a given individual surviving to be in that age class, so it represents the expected
number of daughters produced by a given individual while she is in that age class. So the sum
of the terms gives the expected number of daughters ever born to a particular female over her
lifetime.
(b) Define g(λ) as suggested. Then g(λ) = 1 if and only if

λ^n g(λ) = b1λ^(n−1) + b2s1λ^(n−2) + ⋯ + bns1s2⋯sn−1 = λ^n,

that is, if and only if

λ^n − b1λ^(n−1) − b2s1λ^(n−2) − ⋯ − bns1s2⋯sn−1 = 0.
But from Exercise 16, this is true if and only if λ is a zero of the characteristic polynomial of the
Leslie matrix, indicating that λ is an eigenvalue of L. So r = g(1) = 1 if and only if λ = 1 is an
eigenvalue of L, as desired.
(c) For given values of bi and si , it is clear that g(λ) is a decreasing function of λ. Also, r = g(1) and
g(λ1 ) = 1. So if λ1 > 1, then r = g(1) > g(λ1 ) = 1 so that r > 1. On the other hand, if λ1 < 1,
then r = g(1) < g(λ1 ) = 1, so that r < 1. So we have shown that if the population is decreasing,
then r < 1; if the population is increasing, then r > 1, and (part (b)) if the population is stable,
then r = 1. Since this exhausts the possibilities, we have proven the converses as well. Thus r < 1
if and only if the population is decreasing and r > 1 if and only if the population is increasing.
Note that this corresponds to our intuition, since if the net reproduction rate is less than one, we
would expect the population in the next generation to be smaller.
24. If λ1 is the unique positive eigenvalue, then λ1x = Lx for the corresponding eigenvector x. If h is the sustainable harvest ratio, then (1 − h)Lx = x, which means that Lx = 1/(1 − h) x. Thus λ1 = 1/(1 − h), so that h = 1 − 1/λ1.
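The harvest relation just derived is a one-line formula; the sketch below simply applies it to the growth rates found in Exercises 21 and 22, as the next two exercises do. The variable names are hypothetical.

```python
# A hedged sketch of the sustainable harvest relation h = 1 - 1/lambda1
# from Exercise 24.
def sustainable_harvest(lambda1):
    """Fraction of the population that can be harvested each period."""
    return 1 - 1 / lambda1

h25 = sustainable_harvest(1.0924)   # Exercise 25: about 0.0846
h26 = sustainable_harvest(1.333)    # Exercise 26: about 0.25
```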
25. (a) Exercise 21 found the positive eigenvalue of L to be λ1 ≈ 1.0924, so from Exercise 24, h = 1 − 1/λ1 ≈ 1 − 1/1.0924 ≈ 0.0846.
(b) Reducing the initial population levels by the fraction 0.0846 gives

x0′ = (1 − 0.0846)(10, 2, 8, 5, 12, 0, 1) = (9.15, 1.831, 7.323, 4.577, 10.985, 0, 0.915).

Then

x1 = Lx0′ = (42.47, 2.75, 1.28, 6.59, 4.12, 9.89, 0).

The population has still grown substantially, because we did not start with a stable population vector. If we repeat with the stable population ratios computed in Exercise 21, we get

x0′′ = (1 − 0.0846)(1, 0.275, 0.176, 0.145, 0.119, 0.098, 0.054) = (0.915, 0.252, 0.161, 0.133, 0.109, 0.090, 0.049),

and

x1′ = Lx0′′ = (0.999, 0.275, 0.176, 0.145, 0.119, 0.098, 0.053),

which is essentially x0.
26. Since λ1 ≈ 1.333, we have h = 1 − 1/λ1 = 1 − 1/1.333 ≈ 0.25, which is about one quarter of the population.
27. Write λ1 = r1(cos 0 + i sin 0); then r1 = λ1 > 0 since λ1 is a positive eigenvalue. Let λ = r(cos θ + i sin θ) be some other eigenvalue. Then g(λ) = g(λ1) = 1. Computing g(λ) gives

1 = g(λ) = b1/(r(cos θ + i sin θ)) + b2s1/(r²(cos θ + i sin θ)²) + ⋯ + bns1s2⋯sn−1/(r^n (cos θ + i sin θ)^n)
         = (b1/r)(cos θ − i sin θ) + (b2s1/r²)(cos θ − i sin θ)² + ⋯ + (bns1s2⋯sn−1/r^n)(cos θ − i sin θ)^n,

where we get to the final equation by multiplying each fraction by (cos θ − i sin θ)^k / (cos θ − i sin θ)^k = 1 and using the identity (cos θ + i sin θ)(cos θ − i sin θ) = cos²θ + sin²θ = 1. Now apply DeMoivre's theorem to get

1 = g(λ) = (b1/r)(cos θ − i sin θ) + (b2s1/r²)(cos 2θ − i sin 2θ) + ⋯ + (bns1s2⋯sn−1/r^n)(cos nθ − i sin nθ).

Since g(λ) = 1 is real, the imaginary parts must all cancel, so we are left with

1 = g(λ) = (b1/r) cos θ + (b2s1/r²) cos 2θ + ⋯ + (bns1s2⋯sn−1/r^n) cos nθ.

But cos kθ ≤ 1 for all k, and the bi, si, and r are all nonnegative, so

1 = g(λ) ≤ b1/r + b2s1/r² + ⋯ + bns1s2⋯sn−1/r^n.

Since g(λ1) = 1 as well, we get

b1/r1 + b2s1/r1² + ⋯ + bns1s2⋯sn−1/r1^n ≤ b1/r + b2s1/r² + ⋯ + bns1s2⋯sn−1/r^n.

Since each term on either side is a decreasing function of the positive variable in its denominator, and r1 > 0, we get r1 ≥ r.
28. The characteristic polynomial of A is

det(A − λI) = |2 − λ 0; 1 1 − λ| = (λ − 2)(λ − 1),

so that the Perron root is λ1 = 2. To find its Perron eigenvector,

[A − 2I | 0] = [0 0 | 0; 1 −1 | 0] → [1 −1 | 0; 0 0 | 0],

so that (1, 1) is an eigenvector for λ1 = 2. The entries of the Perron eigenvector must sum to 1, however, so dividing through by 2 gives the Perron eigenvector (1/2, 1/2).
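The Perron root and eigenvector in Exercises 28-31 can also be approximated numerically; the sketch below is a hedged power-iteration alternative to the row reductions in the text, and `perron` is a hypothetical helper name.

```python
# A hedged sketch: Perron root and Perron eigenvector (entries summing to 1)
# of a nonnegative matrix via power iteration.
def perron(A, steps=200):
    n = len(A)
    x = [1.0] * n
    lam = 1.0
    for _ in range(steps):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(y)                  # entries stay positive for these matrices
        x = [yi / lam for yi in y]
    s = sum(x)
    return lam, [xi / s for xi in x]  # scale so entries sum to 1

lam, v = perron([[2.0, 0.0], [1.0, 1.0]])   # Exercise 28: lam = 2, v = (1/2, 1/2)
```

Applying the same function to the Exercise 29 matrix [1 3; 2 0] gives the Perron root 3 and Perron eigenvector (3/5, 2/5).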
29. The characteristic polynomial of A is

det(A − λI) = |1 − λ 3; 2 −λ| = (1 − λ)(−λ) − 6 = λ² − λ − 6 = (λ − 3)(λ + 2),

so that the Perron root is λ1 = 3. To find its Perron eigenvector,

[A − 3I | 0] = [−2 3 | 0; 2 −3 | 0] → [1 −3/2 | 0; 0 0 | 0],

so that (3, 2) is an eigenvector for λ1 = 3. The entries of the Perron eigenvector must sum to 1, however, so dividing through by 5 gives the Perron eigenvector (3/5, 2/5).
30. The characteristic polynomial of A is

det(A − λI) = |−λ 1 1; 1 −λ 1; 1 1 −λ| = −(λ + 1)²(λ − 2),

so that the Perron root is λ1 = 2. To find its Perron eigenvector,

[A − 2I | 0] = [−2 1 1 | 0; 1 −2 1 | 0; 1 1 −2 | 0] → [1 0 −1 | 0; 0 1 −1 | 0; 0 0 0 | 0],

so that (1, 1, 1) is an eigenvector for λ1 = 2. The entries of the Perron eigenvector must sum to 1, however, so dividing through by 3 gives the Perron eigenvector (1/3, 1/3, 1/3).
31. The characteristic polynomial of A is

det(A − λI) = |2 − λ 1 1; 1 1 − λ 0; 1 0 1 − λ| = −λ(λ − 1)(λ − 3),

so that the Perron root is λ1 = 3. To find its Perron eigenvector,

[A − 3I | 0] = [−1 1 1 | 0; 1 −2 0 | 0; 1 0 −2 | 0] → [1 0 −2 | 0; 0 1 −1 | 0; 0 0 0 | 0],

so that (2, 1, 1) is an eigenvector for λ1 = 3. The entries of the Perron eigenvector must sum to 1, however, so dividing through by 4 gives the Perron eigenvector (1/2, 1/4, 1/4).
32. Here n = 4, and every entry of (I + A)³ is positive (its entries are 1s and 3s), so (I + A)^(n−1) > O and A is irreducible.
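The test applied in Exercises 32-35 can be sketched directly; this is a hedged illustration with a hypothetical example graph, and `is_irreducible` is a helper name not in the text.

```python
# A hedged sketch of the irreducibility test: an n x n nonnegative matrix A
# is irreducible exactly when every entry of (I + A)^(n-1) is positive.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_irreducible(A):
    n = len(A)
    M = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    P = M
    for _ in range(n - 2):           # P becomes (I + A)^(n-1)
        P = matmul(P, M)
    return all(entry > 0 for row in P for entry in row)

# Example: the adjacency matrix of a connected 4-cycle is irreducible.
C4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
```

A block-diagonal adjacency matrix (a disconnected graph) fails the test, matching the reducible cases in Exercises 33 and 34.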
33. Here n = 4, and (I + A)³ contains zero entries (its entries are 0s, 4s, and 6s), so A is reducible. Interchanging rows 1 and 4 and then columns 1 and 4 produces a matrix in the required block form.
34. Here n = 5, and (I + A)⁴ contains zero entries, so A is reducible. Exchanging rows 1 and 4 and then exchanging columns 1 and 4 produces a matrix in the required form, where B is a 1 × 1 matrix and D is 4 × 4.
35. Here n = 5, and every entry of (I + A)⁴ is positive, so that A is irreducible.
36. (a) Let G be a graph and A its adjacency matrix. Let G′ be the graph obtained from G by adding an edge from each vertex to itself if no such edge exists in G. Then G′ is connected if and only if G is, since every vertex is always connected to itself. The adjacency matrix A′ of G′ results from A by replacing every 0 on the diagonal by a 1. Now consider the matrix A + I. This matrix is identical to A′ except that some diagonal entries of A′ equal to 1 may be equal to 2 in A + I (if the vertex in question had an edge from it to itself in G). But since all entries in A + I and A′ are nonnegative, (A′)^(n−1) > O if and only if (A + I)^(n−1) > O (whether those diagonal entries are 1 or 2 does not affect whether a particular entry in the power is zero or not). So G is connected if and only if G′ is connected, which is the case if and only if (A′)^(n−1) > O, so that there is a path between any two vertices of length n − 1 (note that any shorter path can be made into a path of length n − 1 by adding traverses of the edge from the vertex to itself). But (A′)^(n−1) > O then means that (A + I)^(n−1) > O, so that A is irreducible.
(b) Since all the graphs in Section 4.0 are connected, they all have irreducible adjacency matrices.
The graph in Figure 4.1 has a primitive adjacency matrix, since

        [0 1 1 1]^2   [3 2 2 2]
  A^2 = [1 0 1 1]   = [2 3 2 2] > O.
        [1 1 0 1]     [2 2 3 2]
        [1 1 1 0]     [2 2 2 3]
So does the graph in Figure 4.2: computing A^4 for its adjacency matrix shows that every entry is positive (each diagonal entry of A^4 is 15, and each off-diagonal entry is 4 or 9), so A^4 > O.
The graph in Figure 4.3 has a primitive adjacency matrix:

        [0 1 0 0 1]^4   [6 1 4 4 1]
  A^4 = [1 0 1 0 0]   = [1 6 1 4 4] > O.
        [0 1 0 1 0]     [4 1 6 1 4]
        [0 0 1 0 1]     [4 4 1 6 1]
        [1 0 0 1 0]     [1 4 4 1 6]
However, the adjacency matrix for the graph in Figure 4.4 is

      [0 0 0 1 1 1]
      [0 0 0 1 1 1]
  A = [0 0 0 1 1 1] ,
      [1 1 1 0 0 0]
      [1 1 1 0 0 0]
      [1 1 1 0 0 0]
and any power of A will always have a 3 × 3 block of zeroes in either the upper left or upper right,
so A is not primitive.
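The primitivity test used above can be checked mechanically: power up the adjacency matrix and look for a strictly positive power. A minimal sketch (function and variable names are ours, not the text's):

```python
# Check primitivity of an adjacency matrix by testing whether some power
# A^k is strictly positive. Pure-Python matrix multiplication, no libraries.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_primitive(A, max_power=50):
    """Return True if some power A^k (k <= max_power) has all entries > 0."""
    P = A
    for _ in range(max_power):
        if all(entry > 0 for row in P for entry in row):
            return True
        P = mat_mul(P, A)
    return False

# Adjacency matrix of the 5-cycle (Figure 4.3): primitive, since A^4 > O.
C5 = [[0, 1, 0, 0, 1], [1, 0, 1, 0, 0], [0, 1, 0, 1, 0],
      [0, 0, 1, 0, 1], [1, 0, 0, 1, 0]]
# Adjacency matrix of the bipartite graph of Figure 4.4: never primitive,
# since every power keeps a 3 x 3 block of zeros.
K33 = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1],
       [1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0]]
```

The cutoff `max_power=50` is an arbitrary safety bound for this sketch; for an n × n primitive matrix a positive power appears well before n² steps.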
37. (a) Suppose G is bipartite with disjoint vertex sets U and V and A is its adjacency matrix. Recall
that (Ak )ij gives the number of paths between vertices i and j. Also, from Exercise 80 in Section
3.7, if u ∈ U and v ∈ V , then any path from u to itself will have even length, and any path from
CHAPTER 4. EIGENVALUES AND EIGENVECTORS
u to v will have odd length, since in any path, each successive edge moves you from U to V and
back. So odd powers of A will have zero entries along the diagonal, while even powers of A will
have zero entries in any cell corresponding to a vertex in U and a vertex in V . Since every power
of A has a zero entry, A is not primitive.
(b) Number the vertices so that v1, v2, ..., vk are in U and vk+1, vk+2, ..., vn are in V. Then the
adjacency matrix of G is

  A = [ O   B ]
      [ B^T O ] .

Write the eigenvector v corresponding to λ as [v1; v2], where v1 is a k-element column vector and
v2 is an (n − k)-element column vector. Then

  Av = λv  ⟺  [B v2; B^T v1] = [λ v1; λ v2],

so that B v2 = λ v1 and B^T v1 = λ v2. Then

  A [−v1; v2] = [B v2; −B^T v1] = [λ v1; −λ v2] = −λ [−v1; v2].

This shows that −λ is an eigenvalue of A with corresponding eigenvector [−v1; v2].
38. (a) Since each vertex of G connects to k other vertices, we see that each column of the adjacency
matrix A of G sums to k. Thus A = kP , where P is a matrix each of whose columns sums to
1. So P is a transition matrix, and by Theorem 4.30, 1 is an eigenvalue of P , so that k is an
eigenvalue of kP = A.
(b) If A is primitive, then Ar > O for some r, and therefore P r > O as well. Therefore P is regular,
so by Theorem 4.31, every other eigenvalue of P has absolute value strictly less than 1, so that
every other eigenvalue of A = kP has absolute value strictly less than k.
39. This exercise is left to the reader after doing the exploration suggested in Section 4.0.
40. (a) Let A = [aij ]. Then cA = [caij ], so that
|cA| = [|caij |] = [|c| · |aij |] = |c| [|aij |] = |c| · |A| .
(b) Using the Triangle Inequality,
|A + B| = [|aij + bij |] ≤ [|aij | + |bij |] = [|aij |] + [|bij |] = |A| + |B| .
(c) Ax is the n × 1 matrix whose ith entry is
ai1 x1 + ai2 x2 + · · · + ain xn .
Thus the ith entry of |Ax| is
|(Ax)i | = |ai1 x1 + ai2 x2 + · · · + ain xn |
≤ |ai1 x1 | + |ai2 x2 | + · · · + |ain xn |
= |ai1 | |x1 | + |ai2 | |x2 | + · · · + |ain | |xn | .
But this is just the ith entry of |A| |x|.
(d) The ij entry of |AB| is
|AB|ij = |ai1 b1j + ai2 b2j + · · · + ain bnj |
≤ |ai1 b1j | + |ai2 b2j | + · · · + |ain bnj |
= |ai1 | |b1j | + |ai2 | |b2j | + · · · + |ain | |bnj | ,
which is the ij entry of |A| |B|.
41. A matrix is reducible if, after permuting rows and columns according to the same permutation, it can
be written

  [B C]
  [O D] ,

where B and D are square and O is the zero matrix. If A is 2 × 2, then B, C, O, and D must all be
1 × 1 matrices (which are therefore just the entries of A). If A is in the above form after applying no
permutation (leaving A alone), then a21 = 0. If A is in this form after applying the only other possible
permutation, exchanging the two rows and the two columns, then the resulting matrix is

  [a22 a21]
  [a12 a11] ,

so that a12 = 0. This proves the result.
42. (a) If λ1 and v1 are the Perron eigenvalue and eigenvector respectively, then by Theorem 4.37, λ1 > 0
and v1 > 0. By Exercise 22 in Section 4.3, λ1 − 1 is an eigenvalue, with eigenvector v1, for
A − I, and therefore 1 − λ1 is an eigenvalue for I − A, again with eigenvector v1. Since I − A is
invertible, Theorem 4.18 says that 1/(1 − λ1) is an eigenvalue for (I − A)^{-1}, again with eigenvector v1.
But (I − A)^{-1} ≥ O and v1 ≥ 0, so that

  (1/(1 − λ1)) v1 = (I − A)^{-1} v1 ≥ 0.

It follows that 1/(1 − λ1) ≥ 0 and thus λ1 < 1. Since we also have λ1 > 0, the result follows: 0 < λ1 < 1.
(b) Since 0 < λ1 < 1, v1 > λ1 v1 = Av1.
43. x0 = 1, x1 = 2x0 = 2, x2 = 2x1 = 4, x3 = 2x2 = 8, x4 = 2x3 = 16, x5 = 2x4 = 32.
44. a1 = 128, a2 = a1/2 = 64, a3 = a2/2 = 32, a4 = a3/2 = 16, a5 = a4/2 = 8.
45. y0 = 0, y1 = 1, y2 = y1 − y0 = 1, y3 = y2 − y1 = 0, y4 = y3 − y2 = −1, y5 = y4 − y3 = −1.
46. b0 = 1, b1 = 1, b2 = 2b1 + b0 = 3, b3 = 2b2 + b1 = 7, b4 = 2b3 + b2 = 17, b5 = 2b4 + b3 = 41.
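The four sequences in Exercises 43-46 can be regenerated by iterating each recurrence directly. A quick check (helper name is ours):

```python
# Iterate a recurrence given as a function of the history so far,
# and compare against the hand-computed terms of Exercises 43-46.
def iterate(step, initial, n):
    """Return the first n terms of a sequence defined by a one-step rule."""
    seq = list(initial)
    while len(seq) < n:
        seq.append(step(seq))
    return seq

xs = iterate(lambda s: 2 * s[-1], [1], 6)              # Exercise 43: x_n = 2x_{n-1}
as_ = iterate(lambda s: s[-1] // 2, [128], 6)          # Exercise 44: a_n = a_{n-1}/2 ("as" is a keyword)
ys = iterate(lambda s: s[-1] - s[-2], [0, 1], 6)       # Exercise 45: y_n = y_{n-1} - y_{n-2}
bs = iterate(lambda s: 2 * s[-1] + s[-2], [1, 1], 6)   # Exercise 46: b_n = 2b_{n-1} + b_{n-2}
```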
47. The recurrence is xn − 3xn−1 − 4xn−2 = 0, so the characteristic equation is λ^2 − 3λ − 4 = (λ − 4)(λ + 1) = 0.
Thus the system has eigenvalues −1 and 4, so xn = c1(−1)^n + c2 · 4^n, where c1 and c2 are constants.
Using the initial conditions, we have

  0 = x0 = c1(−1)^0 + c2 · 4^0 = c1 + c2,
  5 = x1 = c1(−1)^1 + c2 · 4^1 = −c1 + 4c2.

Add the two equations to solve for c2, then substitute back in to get c1 = −1 and c2 = 1, so that
xn = −(−1)^n + 4^n = 4^n + (−1)^{n−1}.
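The closed form just derived can be verified against the recurrence itself (a sanity check of ours, not part of the solution):

```python
# Exercise 47: the recurrence x_n = 3x_{n-1} + 4x_{n-2}, x_0 = 0, x_1 = 5,
# should agree with the closed form x_n = 4^n + (-1)^(n-1) = 4^n - (-1)^n.
x = [0, 5]
for n in range(2, 12):
    x.append(3 * x[n - 1] + 4 * x[n - 2])

closed = [4 ** n - (-1) ** n for n in range(12)]
```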
48. The recurrence is xn − 4xn−1 + 3xn−2 = 0, so the characteristic equation is λ^2 − 4λ + 3 = (λ − 3)(λ − 1) = 0.
Thus the system has eigenvalues 1 and 3, so xn = c1 · 1^n + c2 · 3^n, where c1 and c2 are constants. Using
the initial conditions, we have

  0 = x0 = c1 · 1^0 + c2 · 3^0 = c1 + c2,
  1 = x1 = c1 · 1^1 + c2 · 3^1 = c1 + 3c2.

Subtract the two equations to solve for c2, then substitute back in to get c1 = −1/2 and c2 = 1/2, so that
xn = −(1/2) · 1^n + (1/2) · 3^n = (3^n − 1)/2.
49. The recurrence is yn − 4yn−1 + 4yn−2 = 0, so the characteristic equation is λ^2 − 4λ + 4 = (λ − 2)^2 = 0.
Thus the system has the double eigenvalue 2, so yn = c1 · 2^n + c2 · n2^n, where c1 and c2 are constants.
Using the initial conditions, we have

  1 = y1 = c1 · 2^1 + c2 · 1 · 2^1 = 2c1 + 2c2,
  6 = y2 = c1 · 2^2 + c2 · 2 · 2^2 = 4c1 + 8c2.

Solving the system gives c1 = −1/2 and c2 = 1, so that yn = −(1/2) · 2^n + n · 2^n = n2^n − 2^{n−1}.
50. The recurrence is an − an−1 + (1/4)an−2 = 0, so the characteristic equation is (1/4)(4λ^2 − 4λ + 1) = (1/4)(2λ − 1)^2 =
0. Thus the system has the double eigenvalue 1/2 = 2^{−1}, so an = c1 · 2^{−n} + c2 · n2^{−n}, where c1 and c2
are constants. Using the initial conditions, we have

  4 = a0 = c1 · 2^{−0} + c2 · 0 · 2^{−0} = c1,
  1 = a1 = c1 · 2^{−1} + c2 · 1 · 2^{−1} = (1/2)c1 + (1/2)c2.

Solving the system gives c1 = 4 and c2 = −2, so that an = 4 · 2^{−n} − 2n · 2^{−n} = 2^{2−n} − n2^{1−n}.
51. The recurrence is bn − 2bn−1 − 2bn−2 = 0, so the characteristic equation is λ^2 − 2λ − 2 = 0. Using
the quadratic formula, this equation has roots λ = 1 ± √3, which are therefore the eigenvalues. So
bn = c1 · (1 + √3)^n + c2 · (1 − √3)^n, where c1 and c2 are constants. Using the initial conditions, we have

  0 = b0 = c1 · (1 + √3)^0 + c2 · (1 − √3)^0 = c1 + c2,
  1 = b1 = c1 · (1 + √3)^1 + c2 · (1 − √3)^1 = c1 + c2 + (c1 − c2)√3.

From the first equation, c1 + c2 = 0, so the second becomes 1 = (c1 − c2)√3, so that c1 − c2 = √3/3.
Since c2 = −c1, we get c1 = √3/6 and c2 = −√3/6, so that the recurrence is

  bn = (√3/6)(1 + √3)^n − (√3/6)(1 − √3)^n = (√3/6)[(1 + √3)^n − (1 − √3)^n].
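Even though the closed form involves √3, it must produce the integer sequence defined by the recurrence. A numerical check of ours:

```python
# Exercise 51: compare the integer recurrence b_n = 2b_{n-1} + 2b_{n-2},
# b_0 = 0, b_1 = 1, with the closed form
# b_n = (sqrt(3)/6) * ((1 + sqrt(3))^n - (1 - sqrt(3))^n).
import math

b = [0, 1]
for n in range(2, 15):
    b.append(2 * b[n - 1] + 2 * b[n - 2])

s = math.sqrt(3)
closed = [(s / 6) * ((1 + s) ** n - (1 - s) ** n) for n in range(15)]
errors = [abs(b[n] - closed[n]) for n in range(15)]
```

The floating-point evaluation agrees with the exact integers to well below rounding noise at these sizes.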
52. The recurrence is yn − yn−1 + yn−2 = 0, so the characteristic equation is λ^2 − λ + 1 = 0. Using the
quadratic formula, this equation has roots λ = (1 ± √3 i)/2, which are therefore the eigenvalues. So
yn = c1 · ((1 + √3 i)/2)^n + c2 · ((1 − √3 i)/2)^n, where c1 and c2 are constants. Using the initial conditions,
we have

  0 = y0 = c1 · ((1 + √3 i)/2)^0 + c2 · ((1 − √3 i)/2)^0 = c1 + c2,
  1 = y1 = c1 · ((1 + √3 i)/2)^1 + c2 · ((1 − √3 i)/2)^1 = (1/2)c1(1 + √3 i) + (1/2)c2(1 − √3 i).

From the first equation, c1 + c2 = 0, so the second becomes 2 = (c1 − c2)√3 i, so that c1 − c2 = −(2√3/3) i.
Since c2 = −c1, we get c1 = −(√3/3) i and c2 = (√3/3) i, so that the recurrence is

  yn = −(√3/3) i ((1 + √3 i)/2)^n + (√3/3) i ((1 − √3 i)/2)^n.
53. Working with the matrix A from the Theorem, since it has distinct eigenvalues λ1 ≠ λ2, it can be
diagonalized, so that there is some invertible

  P = [e f]   with   P^{-1}AP = D = [λ1  0]
      [g h]                         [ 0 λ2] .

First consider the case where one of the eigenvalues, say λ2, is zero. In this case the matrix must be
of the form

  A = [a 0]
      [1 0] ,

so that the other eigenvalue is a. In this case, the recurrence relation is xn = axn−1 + 0xn−2 = axn−1,
so if we are given xj and xj−1 as initial conditions, we have xn = a^{n−j} xj. So take c1 = xj a^{−j} and
c2 = 0; then xn = c1 λ1^n + c2 λ2^n as desired. Now suppose that neither eigenvalue is zero. Then

  A^k = P D^k P^{-1} = [e f] [λ1^k   0 ] · (1/(eh − fg)) [ h −f]
                       [g h] [ 0   λ2^k]                 [−g  e]

      = (1/(eh − fg)) [eh λ1^k − fg λ2^k   −ef λ1^k + ef λ2^k]
                      [gh λ1^k − gh λ2^k   −fg λ1^k + eh λ2^k] .
If we are given two terms of the sequence, say xj−1 and xj, then for any n we have

  [xn  ] = A^{n−j} [xj  ] = (1/(eh − fg)) [eh λ1^{n−j} − fg λ2^{n−j}   −ef λ1^{n−j} + ef λ2^{n−j}] [xj  ]
  [xn−1]           [xj−1]                 [gh λ1^{n−j} − gh λ2^{n−j}   −fg λ1^{n−j} + eh λ2^{n−j}] [xj−1] .

Looking at the first row, we see that

  xn = (1/(eh − fg)) [(eh xj − ef xj−1) λ1^{−j} · λ1^n + (−fg xj + ef xj−1) λ2^{−j} · λ2^n],

so that for

  c1 = ((eh xj − ef xj−1)/(eh − fg)) λ1^{−j},   c2 = ((−fg xj + ef xj−1)/(eh − fg)) λ2^{−j},

c1 and c2 are independent of n, and we have xn = c1 λ1^n + c2 λ2^n, as required.
54. (a) First suppose that the eigenvalues are distinct, i.e., that λ1 ≠ λ2. Then Exercise 53 gives one
formula for c1 and c2 involving the diagonalizing matrix P. But since we now know that c1 and c2
exist, we can derive a simpler formula. We know the solutions are of the form xn = c1 λ1^n + c2 λ2^n,
valid for all n. Since x0 = r and x1 = s, we get the following linear system in c1 and c2:

  λ1^0 c1 + λ2^0 c2 = r          c1 + c2 = r
                          or
  λ1^1 c1 + λ2^1 c2 = s          λ1 c1 + λ2 c2 = s.

Solving this system, for example by setting up the augmented matrix and row-reducing, gives a
unique solution

  c1 = r − (s − λ1 r)/(λ2 − λ1),   c2 = (s − λ1 r)/(λ2 − λ1).

So the system has a unique solution in terms of x0 = r, x1 = s, λ1, and λ2. Note that the system
has a unique solution (and these expressions make sense) since λ1 ≠ λ2.
Next suppose that the eigenvalues are the same, say λ1 = λ2 = λ. Then the solutions are of
the form xn = c1 λ^n + c2 nλ^n. Note that we must have λ ≠ 0, since otherwise the characteristic
polynomial of the recurrence matrix is λ^2, so the recurrence relation is xn = 0. With x0 = r and
x1 = s, we get the following linear system in c1 and c2:

  λ^0 c1 + 0 · λ^0 c2 = r          c1 = r
                            or
  λ^1 c1 + 1 · λ^1 c2 = s          λ c1 + λ c2 = s.

Thus c1 = r, and substituting into the second equation gives (since λ ≠ 0) c2 = (s − λr)/λ. So in this
case as well we can find the scalars c1 and c2 uniquely. Finally, note that 0 is an eigenvalue of the
recurrence matrix only when b = 0 in the recurrence relation xn = axn−1 + bxn−2.
(b) From part (a), the general solution to the recurrence is

  xn = c1 λ1^n + c2 λ2^n = (r − (s − λ1 r)/(λ2 − λ1)) λ1^n + ((s − λ1 r)/(λ2 − λ1)) λ2^n.

Substituting x0 = r = 0 and x1 = s = 1 gives

  xn = −(1/(λ2 − λ1)) λ1^n + (1/(λ2 − λ1)) λ2^n = (1/(λ2 − λ1))(−λ1^n + λ2^n) = (1/(λ1 − λ2))(λ1^n − λ2^n).
55. (a) For n = 1, since f0 = 0 and f1 = f2 = 1, the equation holds:

  A^1 = A = [1 1] = [f2 f1]
            [1 0]   [f1 f0] .

Now assume that the equation is true for n = k, and consider A^{k+1}:

  A^{k+1} = A^k A = [fk+1 fk  ] [1 1] = [fk+1 + fk   fk+1] = [fk+2 fk+1]
                    [fk   fk−1] [1 0]   [fk + fk−1   fk  ]   [fk+1 fk  ]

as desired.
(b) Since det A = −1, it follows that det(A^n) = (det A)^n = (−1)^n. But from part (a),

  det(A^n) = fn+1 fn−1 − fn^2 = (−1)^n.

(c) If we look at the 5 × 13 "rectangle" as being composed of two "triangles", we see that in fact they
are not triangles: the "diagonal" of the "rectangle" is not straight, because the 8 × 3 triangular
pieces are not similar to the entire triangles, since 5/13 ≠ 3/8. So there is empty space along the
diagonal of the figure that adds up to the missing 1 square unit.
56. (a) t1 = 1 (the only way is using the 1 × 1 rectangle); t2 = 3 (two 1 × 1 rectangles or either of the
1 × 2 rectangles); t3 = 5; t4 = 11; t5 = 21. For t0 , we can think of that as the number of ways to
tile an empty rectangle; there is only one way — using no rectangles.
(b) The rightmost rectangle in a tiling of a 1 × n rectangle is either the 1 × 1 or one of the 1 × 2
rectangles. So the number of tilings of a 1 × n rectangle is the number of tilings of a 1 × (n − 1)
rectangle (adding the 1 × 1) plus twice the number of tilings of a 1 × (n − 2) rectangle (adding
either 1 × 2). So the recurrence relation is
tn = tn−1 + 2tn−2 .
(c) The characteristic equation is λ^2 − λ − 2 = (λ − 2)(λ + 1) = 0, so the eigenvalues are −1 and 2. To
find the constants, we solve

  t1 = 1 = c1(−1)^1 + c2 · 2^1 = −c1 + 2c2
  t2 = 3 = c1(−1)^2 + c2 · 2^2 = c1 + 4c2.

This gives c1 = 1/3 and c2 = 2/3, so that

  tn = (1/3)(−1)^n + (2/3) · 2^n = (1/3)[(−1)^n + 2^{n+1}].

Evaluating for various values of n gives

  t0 = (1/3)[(−1)^0 + 2] = 1,      t1 = (1/3)[(−1)^1 + 2^2] = 1,
  t2 = (1/3)[(−1)^2 + 2^3] = 3,    t3 = (1/3)[(−1)^3 + 2^4] = 5,
  t4 = (1/3)[(−1)^4 + 2^5] = 11,   t5 = (1/3)[(−1)^5 + 2^6] = 21.

These agree with part (a).
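The recurrence from part (b) and the closed form from part (c) can be compared directly (a check of ours):

```python
# Exercise 56: t_n = t_{n-1} + 2t_{n-2} counts tilings of a 1 x n board by one
# 1 x 1 square and two distinguishable 1 x 2 pieces; the closed form is
# t_n = ((-1)^n + 2^(n+1)) / 3.
def tilings(n):
    if n < 0:
        return 0
    if n == 0:
        return 1          # the empty tiling
    return tilings(n - 1) + 2 * tilings(n - 2)

counts = [tilings(n) for n in range(8)]
formula = [((-1) ** n + 2 ** (n + 1)) // 3 for n in range(8)]
```

Integer division is exact because (−1)^n + 2^{n+1} is always divisible by 3.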
57. (a) d1 = 1, d2 = 2, d3 = 3, d4 = 5, d5 = 8. As in Exercise 56, d0 can be interpreted as the number of
ways to cover a 2 × 0 rectangle, which is one — use no dominos.
(b) The right end of any tiling of a 2 × n rectangle either consists of a domino laid vertically, or of
two dominos each laid horizontally. Removing those leaves either a 2 × (n − 1) or a 2 × (n − 2)
rectangle. So the number of tilings of a 2 × n rectangle is equal to the number of tilings of a
2 × (n − 1) rectangle plus the number of tilings of a 2 × (n − 2) rectangle, or
dn = dn−1 + dn−2 .
(c) We recognize the recurrence as the recurrence for the Fibonacci sequence; the initial values shift
it by 1, so we get

  dn = (1/√5) ((1 + √5)/2)^{n+1} − (1/√5) ((1 − √5)/2)^{n+1}.

Computing values using this formula gives the same values as in part (a), including the value for
d0.
58. Row-reducing gives

  [A − ((1+√5)/2)I | 0] = [1 − (1+√5)/2      1       0] → [1  −(1+√5)/2  0] ⇒ v1 = [(1+√5)/2]
                          [     1       −(1+√5)/2    0]   [0      0      0]        [   1    ]

  [A − ((1−√5)/2)I | 0] = [1 − (1−√5)/2      1       0] → [1  −(1−√5)/2  0] ⇒ v2 = [(1−√5)/2]
                          [     1       −(1−√5)/2    0]   [0      0      0]        [   1    ] .

So the eigenvectors are v1 = [(1+√5)/2; 1] and v2 = [(1−√5)/2; 1]. Then

  xk/λ1^k = [ (1/√5)((1+√5)/2)^k − (1/√5)((1−√5)/2)^k         ] / ((1+√5)/2)^k
            [ (1/√5)((1+√5)/2)^{k−1} − (1/√5)((1−√5)/2)^{k−1} ]

          = [ 1/√5 − (1/√5)((1−√5)/(1+√5))^k                           ]
            [ 2/(√5(1+√5)) − (1/√5)(2/(1+√5))((1−√5)/(1+√5))^{k−1}     ] .

Since |(1−√5)/(1+√5)| ≈ 0.38 < 1, the terms involving kth powers of this fraction go to zero as k → ∞, leaving
us with (in the limit)

  lim_{k→∞} xk/λ1^k = [1/√5; 2/(√5(1+√5))] = (2/(√5(1+√5))) [(1+√5)/2; 1] = ((5 − √5)/10) v1,

which is a constant times v1.
59. The coefficient matrix has characteristic polynomial

  det [1 − λ    3  ] = (1 − λ)(2 − λ) − 6 = λ^2 − 3λ − 4 = (λ − 4)(λ + 1),
      [  2    2 − λ]

so that the eigenvalues are λ1 = 4 and λ2 = −1. The corresponding eigenvectors are found by row-reduction:

  [A − 4I | 0] = [−3  3  0] → [1 −1  0] ⇒ v1 = [1]
                 [ 2 −2  0]   [0  0  0]        [1]

  [A + I | 0] = [2  3  0] → [1  3/2  0] ⇒ v2 = [ 3]
                [2  3  0]   [0   0   0]        [−2] .

Then by Theorem 4.40,

  x = C1 e^{4t} [1] + C2 e^{−t} [ 3]
                [1]             [−2] .

Substitute t = 0 to determine the constants:

  x(0) = [0] = C1 e^0 [1] + C2 e^0 [ 3] = [C1 + 3C2]
         [5]          [1]          [−2]   [C1 − 2C2] .

These two equations give C1 = 3 and C2 = −1, so that the solution is

  x = 3e^{4t} [1] − e^{−t} [ 3] ,   or   x(t) = 3e^{4t} − 3e^{−t}
              [1]          [−2]          y(t) = 3e^{4t} + 2e^{−t}.
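A numerical sanity check of ours: the solution just found should satisfy the system x′ = x + 3y, y′ = 2x + 2y and the initial condition, which we test with a centered finite difference.

```python
# Exercise 59 check: x(t) = 3e^{4t} - 3e^{-t}, y(t) = 3e^{4t} + 2e^{-t}
# against x' = x + 3y, y' = 2x + 2y with x(0) = 0, y(0) = 5.
import math

def x(t): return 3 * math.exp(4 * t) - 3 * math.exp(-t)
def y(t): return 3 * math.exp(4 * t) + 2 * math.exp(-t)

t, h = 0.5, 1e-6
dx = (x(t + h) - x(t - h)) / (2 * h)   # centered difference for x'(t)
dy = (y(t + h) - y(t - h)) / (2 * h)   # centered difference for y'(t)
res_x = abs(dx - (x(t) + 3 * y(t)))
res_y = abs(dy - (2 * x(t) + 2 * y(t)))
```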
60. The coefficient matrix has characteristic polynomial

  det [2 − λ   −1 ] = (2 − λ)^2 − 1 = λ^2 − 4λ + 3 = (λ − 1)(λ − 3),
      [ −1   2 − λ]

so that the eigenvalues are λ1 = 1 and λ2 = 3. The corresponding eigenvectors are found by row-reduction:

  [A − I | 0] = [ 1 −1  0] → [1 −1  0] ⇒ v1 = [1]
                [−1  1  0]   [0  0  0]        [1]

  [A − 3I | 0] = [−1 −1  0] → [1  1  0] ⇒ v2 = [ 1]
                 [−1 −1  0]   [0  0  0]        [−1] .

Then by Theorem 4.40,

  x = C1 e^t [1] + C2 e^{3t} [ 1]
             [1]             [−1] .

Substitute t = 0 to determine the constants:

  x(0) = [1] = C1 e^0 [1] + C2 e^0 [ 1] = [C1 + C2]
         [1]          [1]          [−1]   [C1 − C2] .

These two equations give C1 = 1 and C2 = 0, so that the solution is

  x = e^t [1] + 0 · e^{3t} [ 1] ,   or   x(t) = e^t
          [1]              [−1]          y(t) = e^t.
61. The coefficient matrix has characteristic polynomial

  det [1 − λ     1   ] = (1 − λ)(−1 − λ) − 1 = λ^2 − 2 = (λ − √2)(λ + √2),
      [  1    −1 − λ ]

so that the eigenvalues are λ1 = √2 and λ2 = −√2. The corresponding eigenvectors are found by
row-reduction:

  [A − √2 I | 0] = [1 − √2      1     0] → [1  −1 − √2  0] ⇒ v1 = [1 + √2]
                   [  1     −1 − √2   0]   [0      0    0]        [  1   ]

  [A + √2 I | 0] = [1 + √2      1     0] → [1  −1 + √2  0] ⇒ v2 = [1 − √2]
                   [  1     −1 + √2   0]   [0      0    0]        [  1   ] .

Then by Theorem 4.40,

  x = C1 e^{√2 t} [1 + √2] + C2 e^{−√2 t} [1 − √2]
                  [  1   ]                [  1   ] .

Substitute t = 0 to determine the constants:

  x(0) = [1] = C1 e^0 [1 + √2] + C2 e^0 [1 − √2] = [C1(1 + √2) + C2(1 − √2)]
         [0]          [  1   ]          [  1   ]   [        C1 + C2        ] .
The second equation gives C1 + C2 = 0; substituting into the first equation gives (C1 − C2)√2 = 1, so
that C1 − C2 = √2/2. It follows that C1 = √2/4 and C2 = −√2/4, so that the solution is

  x = (√2/4) e^{√2 t} [1 + √2] − (√2/4) e^{−√2 t} [1 − √2] ,
                      [  1   ]                    [  1   ]

or

  x1(t) = ((2 + √2)/4) e^{√2 t} + ((2 − √2)/4) e^{−√2 t}
  x2(t) = (√2/4) e^{√2 t} − (√2/4) e^{−√2 t}.
62. The coefficient matrix has characteristic polynomial

  det [1 − λ    −1  ] = (1 − λ)^2 + 1 = λ^2 − 2λ + 2,
      [  1    1 − λ ]

so that the eigenvalues are λ1 = 1 + i and λ2 = 1 − i. The corresponding eigenvectors are found by
row-reduction:

  [A − (1 + i)I | 0] = [−i −1  0] → [1 −i  0] ⇒ v1 = [i]
                       [ 1 −i  0]   [0  0  0]        [1]

  [A − (1 − i)I | 0] = [i −1  0] → [1  i  0] ⇒ v2 = [−i]
                       [1  i  0]   [0  0  0]        [ 1] .

Then by Theorem 4.40,

  y = C1 e^{(1+i)t} [i] + C2 e^{(1−i)t} [−i]
                    [1]                 [ 1] .

Substitute t = 0 to determine the constants:

  y(0) = [1] = C1 e^0 [i] + C2 e^0 [−i] = [(C1 − C2)i]
         [1]          [1]          [ 1]   [ C1 + C2  ] .

Solving gives C1 = (1 − i)/2 and C2 = (1 + i)/2, so that the solution is

  y = ((1 − i)/2) e^{(1+i)t} [i] + ((1 + i)/2) e^{(1−i)t} [−i]
                             [1]                          [ 1] ,

or

  y1(t) = ((1 − i)/2) e^{(1+i)t} · i + ((1 + i)/2) e^{(1−i)t} · (−i) = ((1 + i)/2) e^{(1+i)t} + ((1 − i)/2) e^{(1−i)t}
  y2(t) = ((1 − i)/2) e^{(1+i)t} + ((1 + i)/2) e^{(1−i)t}.

Now substitute cos t + i sin t for e^{it}, giving

  y1(t) = ((1 + i)/2) e^t (cos t + i sin t) + ((1 − i)/2) e^t (cos t − i sin t)
        = e^t cos t + ((i − 1)/2 + (−i − 1)/2) e^t sin t
        = e^t (cos t − sin t)

  y2(t) = ((1 − i)/2) e^t (cos t + i sin t) + ((1 + i)/2) e^t (cos t − i sin t)
        = e^t cos t + ((1 + i)/2 + (1 − i)/2) e^t sin t
        = e^t (cos t + sin t).
63. The coefficient matrix has characteristic polynomial

  det [−λ   1  −1] = −λ^3 + λ = −λ(1 − λ)(1 + λ),
      [ 1  −λ   1]
      [ 1   1  −λ]

so that the eigenvalues are λ1 = 0, λ2 = 1, and λ3 = −1. The corresponding eigenvectors are found by
row-reduction:

  [A − 0I | 0] = [0  1 −1  0] → [1 0  1  0] ⇒ v1 = [−1]
                 [1  0  1  0]   [0 1 −1  0]        [ 1]
                 [1  1  0  0]   [0 0  0  0]        [ 1]

  [A − I | 0] = [−1  1 −1  0] → [1 0  0  0] ⇒ v2 = [0]
                [ 1 −1  1  0]   [0 1 −1  0]        [1]
                [ 1  1 −1  0]   [0 0  0  0]        [1]

  [A + I | 0] = [1 1 −1  0] → [1 1 0  0] ⇒ v3 = [−1]
                [1 1  1  0]   [0 0 1  0]        [ 1]
                [1 1  1  0]   [0 0 0  0]        [ 0] .

Then by Theorem 4.40,

  x = C1 e^{0t} [−1] + C2 e^t [0] + C3 e^{−t} [−1]
                [ 1]          [1]             [ 1]
                [ 1]          [1]             [ 0] .

Substitute t = 0 to determine the constants:

  x(0) = [ 1] = C1 e^0 [−1] + C2 e^0 [0] + C3 e^0 [−1] = [−C1 − C3    ]
         [ 0]          [ 1]          [1]          [ 1]   [C1 + C2 + C3]
         [−1]          [ 1]          [1]          [ 0]   [C1 + C2     ] .

Solving gives C1 = −2, C2 = 1, C3 = 1, so that the solution is

  x = −2 [−1] + e^t [0] + e^{−t} [−1]
         [ 1]       [1]          [ 1]
         [ 1]       [1]          [ 0]

or

  x(t) = 2 − e^{−t}
  y(t) = −2 + e^t + e^{−t}
  z(t) = −2 + e^t.
64. The coefficient matrix has characteristic polynomial

  det [1 − λ     0      3  ] = −(λ − 4)(λ + 2)^2,
      [  1    −2 − λ    1  ]
      [  3       0    1 − λ]

so that the eigenvalues are λ1 = 4 and λ2 = −2 (with algebraic multiplicity 2). The corresponding
eigenvectors are found by row-reduction:

  [A − 4I | 0] = [−3  0  3  0] → [1 0 −1    0] ⇒ v1 = [3]
                 [ 1 −6  1  0]   [0 1 −1/3  0]        [1]
                 [ 3  0 −3  0]   [0 0  0    0]        [3]

  [A + 2I | 0] = [3 0 3  0] → [1 0 1  0] ⇒ v2 = [0] ,  v3 = [−1]
                 [1 0 1  0]   [0 0 0  0]        [1]         [ 0]
                 [3 0 3  0]   [0 0 0  0]        [0]         [ 1] .
Then by Theorem 4.40,

  x = C1 e^{4t} [3] + C2 e^{−2t} [0] + C3 e^{−2t} [−1]
                [1]              [1]              [ 0]
                [3]              [0]              [ 1] .

Substitute t = 0 to determine the constants:

  x(0) = [2] = C1 e^0 [3] + C2 e^0 [0] + C3 e^0 [−1] = [3C1 − C3]
         [3]          [1]          [1]          [ 0]   [C1 + C2 ]
         [4]          [3]          [0]          [ 1]   [3C1 + C3] .

Solving gives C1 = 1, C2 = 2, C3 = 1, so that the solution is

  x = e^{4t} [3] + 2e^{−2t} [0] + e^{−2t} [−1]
             [1]            [1]           [ 0]
             [3]            [0]           [ 1]

or

  x(t) = 3e^{4t} − e^{−2t}
  y(t) = e^{4t} + 2e^{−2t}
  z(t) = 3e^{4t} + e^{−2t}.
65. (a) The coefficient matrix has characteristic polynomial

  det [1.2 − λ   −0.2  ] = (1.2 − λ)(1.5 − λ) − 0.04 = λ^2 − 2.7λ + 1.76 = (λ − 1.1)(λ − 1.6),
      [ −0.2    1.5 − λ]

so that the eigenvalues are λ1 = 1.1 and λ2 = 1.6. The corresponding eigenvectors are found by
row-reduction:

  [A − 1.1I | 0] = [ 0.1 −0.2  0] → [1 −2  0] ⇒ v1 = [2]
                   [−0.2  0.4  0]   [0  0  0]        [1]

  [A − 1.6I | 0] = [−0.4 −0.2  0] → [1  1/2  0] ⇒ v2 = [−1]
                   [−0.2 −0.1  0]   [0   0   0]        [ 2] .

Then by Theorem 4.40,

  x = C1 e^{1.1t} [2] + C2 e^{1.6t} [−1]
                  [1]               [ 2] .

Substitute t = 0 to determine the constants:

  x(0) = [400] = C1 e^0 [2] + C2 e^0 [−1] = [2C1 − C2]
         [500]          [1]          [ 2]   [C1 + 2C2] .

Solving gives C1 = 260, C2 = 120, so that the solution is

  x = 260e^{1.1t} [2] + 120e^{1.6t} [−1]
                  [1]               [ 2]

or

  x(t) = 520e^{1.1t} − 120e^{1.6t}
  y(t) = 260e^{1.1t} + 240e^{1.6t}.
A plot of the populations over time is below [figure omitted: Population vs. Time (days), with curves for X and Y].
From the graph, the population of Y dies out after about three days.
(b) Substituting a and b for 400 and 500 in the initial value computation above gives

  x(0) = [a] = C1 e^0 [2] + C2 e^0 [−1] = [2C1 − C2]
         [b]          [1]          [ 2]   [C1 + 2C2] .

Solving gives C1 = (1/5)(2a + b), C2 = (1/5)(−a + 2b), so that the solution is

  x = (1/5)(2a + b)e^{1.1t} [2] + (1/5)(−a + 2b)e^{1.6t} [−1]
                            [1]                          [ 2]

or

  x(t) = (2/5)(2a + b)e^{1.1t} − (1/5)(−a + 2b)e^{1.6t}
  y(t) = (1/5)(2a + b)e^{1.1t} + (2/5)(−a + 2b)e^{1.6t}.

In the long run, the e^{1.6t} term will dominate, so if a > 2b, then −a + 2b < 0, so the coefficient
of e^{1.6t} in x(t) is positive and in y(t) is negative, so that Y will die out and X will dominate.
If a < 2b, then −a + 2b > 0, so the reverse will happen: X will die out and Y will dominate.
Finally, if a = 2b, then the e^{1.6t} term disappears in the solutions, and both populations will grow
at a rate proportional to e^{1.1t}, though one will grow twice as fast as the other.
66. The coefficient matrix has characteristic polynomial

  det [−0.8 − λ    0.4   ] = (−0.8 − λ)(−0.2 − λ) − 0.16 = λ^2 + λ = λ(λ + 1),
      [  0.4    −0.2 − λ ]

so that the eigenvalues are λ1 = 0 and λ2 = −1. The corresponding eigenvectors are found by
row-reduction:

  [A − 0I | 0] = [−0.8  0.4  0] → [1 −1/2  0] ⇒ v1 = [1]
                 [ 0.4 −0.2  0]   [0   0   0]        [2]

  [A + I | 0] = [0.2  0.4  0] → [1 2  0] ⇒ v2 = [−2]
                [0.4  0.8  0]   [0 0  0]        [ 1] .

Then by Theorem 4.40,

  x = C1 e^{0t} [1] + C2 e^{−t} [−2]
                [2]             [ 1] .

Substitute t = 0 to determine the constants:

  x(0) = [15] = C1 e^0 [1] + C2 e^0 [−2] = [C1 − 2C2]
         [10]          [2]          [ 1]   [2C1 + C2] .

Solving gives C1 = 7, C2 = −4, so that the solution is

  x = 7 [1] − 4e^{−t} [−2]
        [2]           [ 1]

or

  x(t) = 7 + 8e^{−t}
  y(t) = 14 − 4e^{−t}.

As t → ∞, the exponential terms go to zero, and the populations stabilize at 7 of X and 14 of Y.
67. We want to convert x′ = Ax + b into an equation u′ = Au. If we could write b = Ab′, then letting
u = x + b′ would do the trick. To find b′, we invert A:

  b′ = A^{-1}b = [1/2 −1/2] [−30] = [−10]
                 [1/2  1/2] [−10]   [−20] .

Then let u = x + [−10; −20], so that

  u′ = (x + [−10; −20])′ = x′ = Ax + b = Ax + AA^{-1}b = A(x + A^{-1}b) = Au.

So now we want to solve this equation. A has characteristic polynomial

  det [1 − λ    1  ] = (1 − λ)^2 + 1 = λ^2 − 2λ + 2,
      [ −1   1 − λ ]

so the eigenvalues are 1 + i and 1 − i. Eigenvectors are found by row-reducing:

  [A − (1 + i)I | 0] = [−i  1  0] → [1  i  0] ⇒ v1 = [1]
                       [−1 −i  0]   [0  0  0]        [i]

  [A − (1 − i)I | 0] = [ i  1  0] → [1 −i  0] ⇒ v2 = [i]
                       [−1  i  0]   [0  0  0]        [1] .

Then by Theorem 4.40,

  u = C1 e^{(1+i)t} [1] + C2 e^{(1−i)t} [i]
                    [i]                 [1] .

Substitute t = 0 to determine the constants:

  u(0) = x(0) + [−10] = [10] = C1 e^0 [1] + C2 e^0 [i] = [C1 + iC2]
                [−20]   [10]          [i]          [1]   [iC1 + C2] .

Solving gives C1 = C2 = 5 − 5i, so that the solution is

  u = (5 − 5i)e^{(1+i)t} [1] + (5 − 5i)e^{(1−i)t} [i]
                         [i]                      [1]

    = (5 − 5i)e^t (cos t + i sin t) [1] + (5 − 5i)e^t (cos t − i sin t) [i]
                                    [i]                                [1]

    = e^t [(5 − 5i) cos t + (5 + 5i) sin t + (5 + 5i) cos t + (5 − 5i) sin t]
          [(5 + 5i) cos t + (−5 + 5i) sin t + (5 − 5i) cos t + (−5 − 5i) sin t]

    = e^t [10 cos t + 10 sin t]
          [10 cos t − 10 sin t] .

Rewriting in terms of the original variables gives

  x = e^t [10 cos t + 10 sin t] − [−10] = e^t [10 cos t + 10 sin t] + [10]
          [10 cos t − 10 sin t]   [−20]       [10 cos t − 10 sin t]   [20] .

A plot of the populations shows both populations dying out; species Y after about 1.2 time units and
species X after about 2.4 [figure omitted: Population vs. Time (days), with curves for X and Y].
68. We want to convert x′ = Ax + b into an equation u′ = Au. If we could write b = Ab′, then letting
u = x + b′ would do the trick. To find b′, we invert A:

  b′ = A^{-1}b = [−1/2 −1/2] [ 0] = [−20]
                 [ 1/2 −1/2] [40]   [−20] .

Then let u = x + [−20; −20], so that

  u′ = (x + [−20; −20])′ = x′ = Ax + b = Ax + AA^{-1}b = A(x + A^{-1}b) = Au.

So now we want to solve this equation. A has characteristic polynomial

  det [−1 − λ     1   ] = (−1 − λ)^2 + 1 = λ^2 + 2λ + 2,
      [  −1    −1 − λ ]

so the eigenvalues are −1 + i and −1 − i. Eigenvectors are found by row-reducing:

  [A − (−1 + i)I | 0] = [−i  1  0] → [1  i  0] ⇒ v1 = [1]
                        [−1 −i  0]   [0  0  0]        [i]

  [A − (−1 − i)I | 0] = [ i  1  0] → [1 −i  0] ⇒ v2 = [i]
                        [−1  i  0]   [0  0  0]        [1] .

Then by Theorem 4.40,

  u = C1 e^{(−1+i)t} [1] + C2 e^{(−1−i)t} [i]
                     [i]                  [1] .

Substitute t = 0 to determine the constants:

  u(0) = x(0) + [−20] = [−10] = C1 e^0 [1] + C2 e^0 [i] = [C1 + iC2]
                [−20]   [ 10]          [i]          [1]   [iC1 + C2] .

Solving gives C1 = −5 − 5i and C2 = 5 + 5i, so that the solution is

  u = (−5 − 5i)e^{(−1+i)t} [1] + (5 + 5i)e^{(−1−i)t} [i]
                           [i]                       [1]

    = (−5 − 5i)e^{−t} (cos t + i sin t) [1] + (5 + 5i)e^{−t} (cos t − i sin t) [i]
                                        [i]                                   [1]

    = e^{−t} [−10 cos t + 10 sin t]
             [ 10 cos t + 10 sin t] .

Rewriting in terms of the original variables gives

  x = e^{−t} [−10 cos t + 10 sin t] − [−20] = e^{−t} [−10 cos t + 10 sin t] + [20]
             [ 10 cos t + 10 sin t]   [−20]          [ 10 cos t + 10 sin t]   [20] .

A plot of the populations shows both populations stabilizing at 20 as the e^{−t} term goes to zero
[figure omitted: Population vs. Time (days), with curves for X and Y].
69. (a) Making the change of variables y = x′ and z = x, the original equation x″ + ax′ + bx = 0 becomes
y′ + ay + bz = 0; since z = x, we have z′ = x′ = y. So we get the system

  y′ = −ay − bz
  z′ = y.

(b) The coefficient matrix of the system in (a) is

  [−a −b]
  [ 1  0] ,

whose characteristic polynomial is (−a − λ)(−λ) + b = λ^2 + aλ + b.
70. Make the substitutions y1 = x^{(n−1)}, y2 = x^{(n−2)}, ..., yn−1 = x′, yn = x. Then the original equation

  x^{(n)} + a_{n−1}x^{(n−1)} + · · · + a1x′ + a0x = 0

can be written as

  y1′ + a_{n−1}y1 + a_{n−2}y2 + · · · + a2 y_{n−2} + a1 y_{n−1} + a0 yn = 0,

and for i = 2, 3, ..., n, we have yi′ = y_{i−1}. So we get the n equations

  y1′ = −a_{n−1}y1 − a_{n−2}y2 − · · · − a2 y_{n−2} − a1 y_{n−1} − a0 yn
  y2′ = y1
  y3′ = y2
  ...
  yn′ = y_{n−1},

whose coefficient matrix is

  [−a_{n−1} −a_{n−2} · · · −a1 −a0]
  [   1        0     · · ·   0   0]
  [   0        1     · · ·   0   0]
  [  ...      ...     ...   ... ...]
  [   0        0     · · ·   1   0] .

This is the companion matrix of p(λ).
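The companion-matrix construction of Exercises 69-70 is easy to verify in the 2 × 2 case (sample coefficients below are our own choice): the eigenvalues of [[−a, −b], [1, 0]] are exactly the roots of λ² + aλ + b, with eigenvector [λ; 1].

```python
# Companion matrix check for p(l) = l^2 + a*l + b.
# Sample: a = -5, b = 6, so p(l) = l^2 - 5l + 6 with roots 2 and 3.
import math

a, b = -5, 6
A = [[-a, -b], [1, 0]]

disc = math.sqrt(a * a - 4 * b)
roots = sorted([(-a - disc) / 2, (-a + disc) / 2])

# Verify A v = lambda v for the eigenvector v = [lambda, 1].
checks = []
for lam in roots:
    v = [lam, 1]
    Av = [A[0][0] * v[0] + A[0][1] * v[1],
          A[1][0] * v[0] + A[1][1] * v[1]]
    checks.append(abs(Av[0] - lam * v[0]) + abs(Av[1] - lam * v[1]))
```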
71. From Exercise 69, we know that the characteristic polynomial is λ2 − 5λ + 6, so the eigenvalues are 2
and 3. Therefore the general solution is x(t) = C1 e2t + C2 e3t .
72. From Exercise 69, we know that the characteristic polynomial is λ2 + 4λ + 3, so the eigenvalues are −1
and −3. Therefore the general solution is x(t) = C1 e−t + C2 e−3t .
73. From Exercise 59, the matrix of the system has eigenvalues 4 and −1 with corresponding eigenvectors
[1; 1] and [3; −2]. So by Theorem 4.41, the system x′ = Ax has solution x = e^{At}x(0). In this case
x(0) = [0; 5], and we get

  e^{At} = P e^{Dt} P^{-1}
         = [1  3] [e^{4t}    0  ] [1  3]^{-1}
           [1 −2] [  0    e^{−t}] [1 −2]
         = [(2/5)e^{4t} + (3/5)e^{−t}   (3/5)e^{4t} − (3/5)e^{−t}]
           [(2/5)e^{4t} − (2/5)e^{−t}   (3/5)e^{4t} + (2/5)e^{−t}] .

Thus the solution is

  [(2/5)e^{4t} + (3/5)e^{−t}   (3/5)e^{4t} − (3/5)e^{−t}] [0] = [3e^{4t} − 3e^{−t}]
  [(2/5)e^{4t} − (2/5)e^{−t}   (3/5)e^{4t} + (2/5)e^{−t}] [5]   [3e^{4t} + 2e^{−t}] ,

which matches the result from Exercise 59.
74. From Exercise 60, the matrix of the system has eigenvalues 1 and 3 with corresponding eigenvectors
[1; 1] and [1; −1]. So by Theorem 4.41, the system x′ = Ax has solution x = e^{At}x(0). In this case
x(0) = [1; 1], and we get

  e^{At} = P e^{Dt} P^{-1}
         = [1  1] [e^t    0   ] [1  1]^{-1}
           [1 −1] [ 0   e^{3t}] [1 −1]
         = [(1/2)e^t + (1/2)e^{3t}   (1/2)e^t − (1/2)e^{3t}]
           [(1/2)e^t − (1/2)e^{3t}   (1/2)e^t + (1/2)e^{3t}] .

Thus the solution is

  [(1/2)e^t + (1/2)e^{3t}   (1/2)e^t − (1/2)e^{3t}] [1] = [e^t]
  [(1/2)e^t − (1/2)e^{3t}   (1/2)e^t + (1/2)e^{3t}] [1]   [e^t] ,

which matches the result from Exercise 60.
75. From Exercise 63, the matrix of the system has eigenvalues 0, 1, and −1 with corresponding eigenvectors
[−1; 1; 1], [0; 1; 1], and [−1; 1; 0]. So by Theorem 4.41, the system x′ = Ax has solution x = e^{At}x(0). In this
case x(0) = [1; 0; −1], and we get

  e^{At} = P e^{Dt} P^{-1}
         = [−1 0 −1] [1  0     0   ] [−1 0 −1]^{-1}
           [ 1 1  1] [0 e^t    0   ] [ 1 1  1]
           [ 1 1  0] [0  0   e^{−t}] [ 1 1  0]
         = [    1         1 − e^{−t}        −1 + e^{−t}]
           [−1 + e^t   −1 + e^t + e^{−t}    1 − e^{−t} ]
           [−1 + e^t      −1 + e^t             1       ] .

Thus the solution is

  [    1         1 − e^{−t}        −1 + e^{−t}] [ 1]   [2 − e^{−t}       ]
  [−1 + e^t   −1 + e^t + e^{−t}    1 − e^{−t} ] [ 0] = [−2 + e^t + e^{−t}]
  [−1 + e^t      −1 + e^t             1       ] [−1]   [−2 + e^t         ] ,

which matches the result from Exercise 63.
76. From Exercise 64, the matrix of the system
has
 eigenvalues
  4 and −2 (with algebraic multiplicity 2)
0
−1 
3

with corresponding eigenvectors 1 and 1 ,  0 . So by Theorem 4.41, the system x0 = Ax


3
0 
1
2
has solution x = eAt x(0). In this case x(0) = 3, and we get
4

3

AT
t D −1
e
= P (e ) P = 1
3

0
1
0

−1 e4t

0  0
1
0
1 4t
1 −2t
2e + 2e
 1 4t 1 −2t
= 6e − 6e
1 4t
1 −2t
2e − 2e
0
e−2t
0
0
e−2t
0

0
3

0  1
−2t
e
3
0
1
0

1 4t
1 −2t
2e − 2e
1 4t
1 −2t 
.
6e − 6e
1 4t
1 −2t
e
+
e
2
2
−1
−1

0
1
Thus the solution is

  [(1/2)e^{4t} + (1/2)e^{−2t}      0      (1/2)e^{4t} − (1/2)e^{−2t}] [2]   [3e^{4t} − e^{−2t}]
  [(1/6)e^{4t} − (1/6)e^{−2t}   e^{−2t}   (1/6)e^{4t} − (1/6)e^{−2t}] [3] = [e^{4t} + 2e^{−2t}]
  [(1/2)e^{4t} − (1/2)e^{−2t}      0      (1/2)e^{4t} + (1/2)e^{−2t}] [4]   [3e^{4t} + e^{−2t}] ,

which matches the result from Exercise 64.
77. (a) With A = [2 1; 0 3], we have

  x1 = A [1] = [3] ,  x2 = Ax1 = [9] ,  x3 = Ax2 = [27]
         [1]   [3]               [9]               [27] .

[Figure omitted: the points (1, 1), (3, 3), (9, 9), (27, 27) moving away from the origin along the line y = x.]

(b) With A = [2 1; 0 3], we have

  x1 = A [1] = [2] ,  x2 = Ax1 = [4] ,  x3 = Ax2 = [8]
         [0]   [0]               [0]               [0] .

[Figure omitted: the points (1, 0), (2, 0), (4, 0), (8, 0) moving away from the origin along the x-axis.]

(c) Since the matrix is upper triangular, we see that its eigenvalues are 2 and 3; since these are both
greater than 1 in magnitude, the origin is a repeller.

(d) Some other trajectories are [figure omitted].
78. (a) With A = [0.5 −0.5; 0 0.5], we have

  x1 = A [1] = [ 0 ] ,  x2 = Ax1 = [−0.25] ,  x3 = Ax2 = [−0.25 ]
         [1]   [0.5]               [ 0.25]               [0.125] .

[Figure omitted: the trajectory of x0 = (1, 1) approaching the origin.]

(b) With A = [0.5 −0.5; 0 0.5], we have

  x1 = A [1] = [0.5] ,  x2 = Ax1 = [0.25] ,  x3 = Ax2 = [0.125]
         [0]   [ 0 ]               [ 0  ]               [  0  ] .

[Figure omitted: the trajectory of x0 = (1, 0) approaching the origin along the x-axis.]

(c) Since the matrix is upper triangular, we see that its only eigenvalue is 0.5; since this is less than
1 in magnitude, the origin is an attractor.

(d) Some other trajectories are [figure omitted].
79. (a) With A = [2 −1; −1 2], we have
x1 = A[1; 1] = [1; 1],
so that [1; 1] is a fixed point.
(b) With A = [2 −1; −1 2], we have
x1 = A[1; 0] = [2; −1], x2 = Ax1 = [5; −4], x3 = Ax2 = [14; −13].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |2 − λ  −1; −1  2 − λ| = λ² − 4λ + 3 = (λ − 3)(λ − 1).
Thus one eigenvalue is greater than 1 in magnitude and the other is equal to 1. So the origin is
not a repeller, an attractor, or a saddle point; there are no nonzero values of x0 for which the
trajectory of x0 approaches the origin.
(d) Some other trajectories are shown in the accompanying plots.
80. (a) With A = [−4 2; 1 −3], we have
x1 = A[1; 1] = [−2; −2], x2 = Ax1 = [4; 4], x3 = Ax2 = [−8; −8].
(b) With A = [−4 2; 1 −3], we have
x1 = A[1; 0] = [−4; 1], x2 = Ax1 = [18; −7], x3 = Ax2 = [−86; 39].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |−4 − λ  2; 1  −3 − λ| = λ² + 7λ + 10 = (λ + 5)(λ + 2).
Since both eigenvalues, −5 and −2, are greater than 1 in magnitude, the origin is a repeller.
(d) Some other trajectories are shown in the accompanying plots.
81. (a) With A = [1.5 −1; −1 0], we have
x1 = A[1; 1] = [0.5; −1], x2 = Ax1 = [1.75; −0.5], x3 = Ax2 = [3.125; −1.75].
(b) With A = [1.5 −1; −1 0], we have
x1 = A[1; 0] = [1.5; −1], x2 = Ax1 = [3.25; −1.5], x3 = Ax2 = [6.375; −3.25].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |1.5 − λ  −1; −1  −λ| = λ² − 1.5λ − 1 = (λ − 2)(λ + 0.5).
Since one eigenvalue is greater than 1 in magnitude and the other is less, the origin is a saddle
point. Eigenvectors for λ = −0.5 are attracted to the origin, and other vectors are repelled.
(d) Some other trajectories are shown in the accompanying plots.
82. (a) With A = [0.1 0.9; 0.5 0.5], we have
x1 = A[1; 1] = [1; 1],
so [1; 1] is a fixed point; it is an eigenvector for the eigenvalue λ = 1.
(b) With A = [0.1 0.9; 0.5 0.5], we have
x1 = A[1; 0] = [0.1; 0.5], x2 = Ax1 = [0.46; 0.30], x3 = Ax2 = [0.316; 0.38].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |0.1 − λ  0.9; 0.5  0.5 − λ| = λ² − 0.6λ − 0.4 = (λ − 1)(λ + 0.4).
Since one eigenvalue is less than 1 in magnitude and the other is equal to 1, the origin is neither
an attractor, a repeller, nor a saddle point.
(d) Some other trajectories are shown in the accompanying plots.
83. (a) With A = [0.2 0.4; −0.2 0.8], we have
x1 = A[1; 1] = [0.6; 0.6], x2 = Ax1 = [0.36; 0.36], x3 = Ax2 = [0.216; 0.216].
(b) With A = [0.2 0.4; −0.2 0.8], we have
x1 = A[1; 0] = [0.2; −0.2], x2 = Ax1 = [−0.04; −0.20], x3 = Ax2 = [−0.088; −0.152].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |0.2 − λ  0.4; −0.2  0.8 − λ| = λ² − λ + 0.24 = (λ − 0.4)(λ − 0.6).
Since both eigenvalues are smaller than 1 in magnitude, the origin is an attractor.
(d) Some other trajectories are shown in the accompanying plots.
84. (a) With A = [0 −1.5; 1.2 3.6], we have
x1 = A[1; 1] = [−1.5; 4.8], x2 = Ax1 = [−7.2; 15.48], x3 = Ax2 = [−23.22; 47.088].
(b) With A = [0 −1.5; 1.2 3.6], we have
x1 = A[1; 0] = [0; 1.2], x2 = Ax1 = [−1.8; 4.32], x3 = Ax2 = [−6.48; 13.392].
(c) The characteristic polynomial of the matrix is
det(A − λI) = |−λ  −1.5; 1.2  3.6 − λ| = λ² − 3.6λ + 1.8 = (λ − 3)(λ − 0.6).
Since one eigenvalue is smaller than 1 in magnitude and the other is greater, the origin is a saddle
point; the origin attracts some initial points and repels others.
(d) Some other trajectories are shown in the accompanying plots.
85. With the given matrix A, we have r = √(a² + b²) = √2 and θ = π/4, so that
A = [r 0; 0 r][cos θ  −sin θ; sin θ  cos θ] = [√2 0; 0 √2][1/√2  −1/√2; 1/√2  1/√2].
Since r = |λ| = √2 > 1, the origin is a spiral repeller.
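The factorization used in Exercises 85–88 can be sketched as a small routine (assuming only the standard library; `polar_form` is an illustrative name, not from the text). For A = [a −b; b a], r = √(a² + b²) and θ = atan2(b, a):

```python
import math

def polar_form(a, b):
    """For A = [[a, -b], [b, a]], return (r, theta) with A = r * R_theta."""
    r = math.hypot(a, b)
    theta = math.atan2(b, a) % (2 * math.pi)
    return r, theta

# Exercise 85 as reconstructed above: A = [[1, -1], [1, 1]], so a = b = 1.
r, theta = polar_form(1.0, 1.0)
assert math.isclose(r, math.sqrt(2))
assert math.isclose(theta, math.pi / 4)
# r > 1, so the origin is a spiral repeller.
```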
86. With the given matrix A, we have r = √(a² + b²) = 0.5 and θ = 3π/2, so that
A = [r 0; 0 r][cos θ  −sin θ; sin θ  cos θ] = [0.5 0; 0 0.5][0 1; −1 0].
Since r = |λ| = 0.5 < 1, the origin is a spiral attractor.
87. With the given matrix A, we have r = √(a² + b²) = 2 and θ = 5π/3, so that
A = [r 0; 0 r][cos θ  −sin θ; sin θ  cos θ] = [2 0; 0 2][1/2  √3/2; −√3/2  1/2].
Since r = |λ| = 2 > 1, the origin is a spiral repeller.
88. With the given matrix A, we have r = √(a² + b²) = √((−√3/2)² + (1/2)²) = 1 and θ = 5π/6, so that
A = [r 0; 0 r][cos θ  −sin θ; sin θ  cos θ] = [1 0; 0 1][−√3/2  −1/2; 1/2  −√3/2].
Since r = |λ| = 1, the origin is an orbital center.
89. The characteristic polynomial of A is
det(A − λI) = |0.1 − λ  −0.2; 0.1  0.3 − λ| = λ² − 0.4λ + 0.05;
solving this quadratic gives λ = 0.2 ± 0.1i for the eigenvalues. Row-reducing gives [−1 − i; 1] for an eigenvector. Thus P = [Re x  Im x] = [−1 −1; 1 0], and so
C = P⁻¹AP = [0 1; −1 −1][0.1 −0.2; 0.1 0.3][−1 −1; 1 0] = [0.2 −0.1; 0.1 0.2].
Since |λ| = √0.05 < 1, the origin is a spiral attractor. The first six points are
x0 = [1; 1], x1 = [−0.1; 0.4], x2 = [−0.09; 0.11], x3 = [−0.031; 0.024], x4 = [−0.0079; 0.0041], x5 = [−0.00161; 0.00044].
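The construction of the real canonical form C = [a −b; b a] used in Exercises 89–92 can be automated (a sketch assuming NumPy; `real_canonical_form` is an illustrative helper name). Taking λ = a − bi with b > 0 and P = [Re x  Im x] gives C = P⁻¹AP:

```python
import numpy as np

def real_canonical_form(A):
    """For a real 2x2 matrix with eigenvalues a +/- bi (b != 0), return
    (P, C) where P = [Re x  Im x] for an eigenvector x of lam = a - bi,
    and C = P^{-1} A P = [[a, -b], [b, a]]."""
    vals, vecs = np.linalg.eig(A)
    i = int(np.argmin(vals.imag))      # eigenvalue with negative imaginary part
    x = vecs[:, i]
    P = np.column_stack([x.real, x.imag])
    return P, np.linalg.inv(P) @ A @ P

A = np.array([[0.1, -0.2],
              [0.1,  0.3]])
P, C = real_canonical_form(A)
assert np.allclose(C, [[0.2, -0.1], [0.1, 0.2]])   # matches Exercise 89
# |lam| = sqrt(0.05) < 1, so the origin is a spiral attractor.
```

Note that NumPy scales eigenvectors differently from the hand computation, but any eigenvector of λ = a − bi yields the same C.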
90. The characteristic polynomial of A is
det(A − λI) = |2 − λ  1; −2  −λ| = λ² − 2λ + 2;
solving this quadratic gives λ = 1 ± i for the eigenvalues. Row-reducing gives [−1 − i; 2] for an eigenvector. Thus P = [Re x  Im x] = [−1 −1; 2 0], and so
C = P⁻¹AP = [0  1/2; −1  −1/2][2 1; −2 0][−1 −1; 2 0] = [1 1; −1 1].
Since |λ| = √2 > 1, the origin is a spiral repeller. The first six points are
x0 = [1; 1], x1 = [3; −2], x2 = [4; −6], x3 = [2; −8], x4 = [−4; −4], x5 = [−12; 8].
91. The characteristic polynomial of A is
det(A − λI) = |1 − λ  −1; 1  −λ| = λ² − λ + 1;
solving this quadratic gives λ = 1/2 ± (√3/2)i for the eigenvalues. Row-reducing gives [1/2 − (√3/2)i; 1] for an eigenvector. Thus
P = [Re x  Im x] = [1/2  −√3/2; 1  0],
and so
C = P⁻¹AP = [0  1; −2/√3  1/√3][1 −1; 1 0][1/2  −√3/2; 1  0] = [1/2  −√3/2; √3/2  1/2].
Since |λ| = 1, the origin is an orbital center. The first six points are
x0 = [1; 1], x1 = [0; 1], x2 = [−1; 0], x3 = [−1; −1], x4 = [0; −1], x5 = [1; 0].
92. The characteristic polynomial of A is
det(A − λI) = |−λ  −1; 1  √3 − λ| = λ² − √3 λ + 1;
solving this quadratic gives λ = √3/2 ± (1/2)i for the eigenvalues. Row-reducing gives [−√3/2 − (1/2)i; 1] for an eigenvector. Thus
P = [Re x  Im x] = [−√3/2  −1/2; 1  0],
and so
C = P⁻¹AP = [0  1; −2  −√3][0 −1; 1 √3][−√3/2  −1/2; 1  0] = [√3/2  −1/2; 1/2  √3/2].
Since |λ| = 1, the origin is an orbital center. The first six points are
x0 = [1; 1], x1 = [−1; 1 + √3], x2 = [−1 − √3; 2 + √3], x3 = [−2 − √3; 2 + √3], x4 = [−2 − √3; 1 + √3], x5 = [−1 − √3; 1].
Chapter Review
1. (a) False. What is true is that if A is an n × n matrix, then det(−A) = (−1)n det A. This is because
multiplying any row by −1 multiplies the determinant by −1, and −A multiplies every row by
−1. See Theorem 4.7.
(b) True, since det(AB) = det(A) det(B) = det(B) det(A) = det(BA). See Theorem 4.8.
(c) False. It depends on what order the columns are in. Swapping any two columns of A multiplies
the determinant by −1. See Theorem 4.4.
(d) False. While det Aᵀ = det A by Theorem 4.10, det A⁻¹ = 1/det A by Theorem 4.9. So unless det A = ±1, the statement is false.
(e) False. For example, the matrix [0 1; 0 0] has characteristic polynomial λ², so 0 is its only eigenvalue.
(f ) False. For example, the identity matrix has only λ = 1 as an eigenvalue, but ei is an eigenvector
for every i.
(g) True. See Theorem 4.25 in Section 4.4.
(h) False. Any diagonal matrix with two identical entries on the diagonal (such as the identity matrix)
provides a counterexample.
(i) False. If A and B are similar, then they have the same eigenvalues, but not necessarily the same
eigenvectors. Exercise 48 in Section 4.4 shows that if B = P −1 AP and v is an eigenvector of A,
then P −1 v is an eigenvector for B for the same eigenvalue.
(j) False. For example, if A is invertible, then its reduced row echelon form is the identity matrix,
which has only the eigenvalue 1. So if A has an eigenvalue other than 1, it cannot be similar to
the identity matrix.
2. (a) For example, expand along the first row:
|1 3 5; 3 5 7; 7 9 11| = 1·|5 7; 9 11| − 3·|3 7; 7 11| + 5·|3 5; 7 9|
= 1(5 · 11 − 7 · 9) − 3(3 · 11 − 7 · 7) + 5(3 · 9 − 5 · 7)
= 0.
(b) Apply row operations to reduce the matrix to triangular form:
[1 3 5; 3 5 7; 7 9 11] −(R2 − 3R1, R3 − 7R1)→ [1 3 5; 0 −4 −8; 0 −12 −24] −(R3 − 3R2)→ [1 3 5; 0 −4 −8; 0 0 0].
Then det A is the product of the diagonal entries, which is zero.
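Both computations in parts (a) and (b) can be mirrored in code (a sketch assuming NumPy; `det_cofactor` is an illustrative name for the Laplace-expansion method of part (a)):

```python
import numpy as np

def det_cofactor(M):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)
        total += (-1) ** j * M[0, j] * det_cofactor(minor)
    return total

A = [[1, 3, 5],
     [3, 5, 7],
     [7, 9, 11]]
assert abs(det_cofactor(A)) < 1e-12              # expansion gives 0, as above
assert np.isclose(det_cofactor(A), np.linalg.det(A))
```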
3. To get from the first matrix to the second:
• Multiply the first column by 3; this multiplies the determinant by 3.
• Multiply the second column by 2; this multiplies the determinant by 2.
• Subtract 4 times the third column from the second; this does not change the determinant.
• Interchange the first and second rows; this multiplies the determinant by −1.
So the determinant of the resulting matrix is 3 · 3 · 2 · (−1) = −18.
4. (a) We have
det C = det((AB)⁻¹) = 1/det(AB) = 1/(det A det B) = 1/(2 · (−1/4)) = −2.
(b) We have
det C = det(A²B(3Aᵀ)) = det A · det A · det B · det(3Aᵀ)
= −det(3Aᵀ) = −3⁴ det(Aᵀ) = −3⁴ det A = −162.
5. For any n × n matrix, det A = det(AT ), and det(−A) = (−1)n det A. So if A is skew-symmetric, then
AT = −A and
det A = det(AT ) = det(−A) = (−1)n det A = − det A
since n is odd. But det A = − det A means that det A = 0.
6. Compute the determinant by expanding along the first row:
|1 −1 2; 1 1 k; 2 4 k²| = 1·|1 k; 4 k²| − (−1)·|1 k; 2 k²| + 2·|1 1; 2 4|
= 1(k² − 4k) + 1(k² − 2k) + 2(4 − 2)
= 2k² − 6k + 4 = 2(k − 2)(k − 1).
So the determinant is zero when k = 1 or k = 2.
7. Since
Ax = [3 1; 4 3][1; 2] = [5; 10] = 5[1; 2],
we see that x is indeed an eigenvector, with corresponding eigenvalue 5.
8. Since
Ax = [13 −60 −45; −5 18 15; 10 −40 −32][3; −1; 2] = [3 · 13 − 1 · (−60) + 2 · (−45); 3 · (−5) − 1 · 18 + 2 · 15; 3 · 10 − 1 · (−40) + 2 · (−32)] = [9; −3; 6] = 3[3; −1; 2],
we see that x is indeed an eigenvector, with corresponding eigenvalue 3.
9. (a) The characteristic polynomial of A is
det(A − λI) = |−5 − λ  −6  3; 3  4 − λ  −3; 0  0  −2 − λ| = (−2 − λ)·|−5 − λ  −6; 3  4 − λ|
= −(λ + 2)(λ² + λ − 2) = −(λ + 2)(λ + 2)(λ − 1).
(b) The eigenvalues of A are the roots of the characteristic polynomial, which are λ1 = 1 and λ2 = −2
(with algebraic multiplicity 2).
(c) We row-reduce A − λI for each eigenvalue λ:
A − I = [−6 −6 3; 3 3 −3; 0 0 −3] → [1 1 0; 0 0 1; 0 0 0],
A + 2I = [−3 −6 3; 3 6 −3; 0 0 0] → [1 2 −1; 0 0 0; 0 0 0].
Thus the eigenspaces are
E1 = span([−1; 1; 0]),
E−2 = {[−2s + t; s; t]} = span([−2; 1; 0], [1; 0; 1]).
(d) Even though A does not have three distinct eigenvalues, it is diagonalizable since the eigenvalue of algebraic multiplicity 2, which is λ2 = −2, also has geometric multiplicity 2. A diagonalizing matrix is a matrix P whose columns are the eigenvectors above, so
P = [−1 −2 1; 1 1 0; 0 0 1],
and
P⁻¹AP = [1 2 −1; −1 −1 1; 0 0 1][−5 −6 3; 3 4 −3; 0 0 −2][−1 −2 1; 1 1 0; 0 0 1] = [1 0 0; 0 −2 0; 0 0 −2] = D.
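The diagonalization found above is easy to verify numerically (a sketch assuming NumPy is available):

```python
import numpy as np

A = np.array([[-5, -6,  3],
              [ 3,  4, -3],
              [ 0,  0, -2]], dtype=float)

# Eigenvectors found above, one per column of P.
P = np.array([[-1, -2, 1],
              [ 1,  1, 0],
              [ 0,  0, 1]], dtype=float)

D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag([1.0, -2.0, -2.0]))
```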
10. Since A is diagonalizable, A ∼ D where D is a diagonal matrix whose diagonal entries are its eigenvalues. Since A ∼ D, we know that A and D have the same eigenvalues, so the diagonal entries of
D are −2, 3, and 4 and thus det D = −24. But A ∼ D also implies that det D = det(P −1 AP ) =
det(P −1 ) det A det P = det A, so that det A = −24.
11. Since
[3; 7] = 5[1; 1] − 2[1; −1],
we know by Theorem 4.19 that
A⁻⁵[3; 7] = 5(1/2)⁻⁵[1; 1] − 2(−1)⁻⁵[1; −1] = [160; 160] + [2; −2] = [162; 158].
12. Since A is diagonalizable, it has n linearly independent eigenvectors v1, …, vn, which therefore form a basis for Rn. So if x is any vector, then x = Σ_{i=1}^n ci vi for some constants ci, so that by Theorem 4.19, and using the Triangle Inequality,
‖Aⁿx‖ = ‖Aⁿ(Σ_{i=1}^n ci vi)‖ = ‖Σ_{i=1}^n ci λiⁿ vi‖ ≤ Σ_{i=1}^n |ci| |λi|ⁿ ‖vi‖.
Then |λi|ⁿ → 0 as n → ∞ since |λi| < 1, so each term of this sum goes to zero, so that ‖Aⁿx‖ → 0 as n → ∞. Since this holds for all vectors x, it follows that Aⁿ → O as n → ∞.
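The conclusion Aⁿ → O can be illustrated numerically (a sketch assuming NumPy; the matrix below is a hypothetical example built to have all eigenvalue magnitudes below 1, not one from the text):

```python
import numpy as np

# A diagonalizable 3x3 matrix whose eigenvalues all satisfy |lambda| < 1,
# built from an invertible P (hypothetical example).
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
A = P @ np.diag([0.9, -0.5, 0.2]) @ np.linalg.inv(P)

# The powers A^n tend to the zero matrix, as the argument shows.
assert np.allclose(np.linalg.matrix_power(A, 200), np.zeros((3, 3)))
assert not np.allclose(np.linalg.matrix_power(A, 1), np.zeros((3, 3)))
```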
13. The two characteristic polynomials are
pA = |4 − λ  2; 3  1 − λ| = (4 − λ)(1 − λ) − 6 = λ² − 5λ − 2,
pB = |2 − λ  2; 3  2 − λ| = (2 − λ)(2 − λ) − 6 = λ² − 4λ − 2.
Since the characteristic polynomials are different, the matrices are not similar.
14. Since both matrices are diagonal, we can read off their eigenvalues; both matrices have eigenvalues 2 and 3. To convert A to B, we want to reverse the rows and reverse the columns, so choose
P = P⁻¹ = [0 1; 1 0].
Then
P⁻¹AP = [0 1; 1 0][2 0; 0 3][0 1; 1 0] = [3 0; 0 2] = B.
15. These matrices are not similar. If they were, then there would be some 3 × 3 matrix
P = [a b c; d e f; g h i]
such that P⁻¹AP = B, or AP = PB. But
AP = [a + d  b + e  c + f; d + g  e + h  f + i; g  h  i],  PB = [a  a + b  c; d  d + e  f; g  g + h  i].
If these matrices are to be equal, then comparing entries in the last row and column gives f = i = g = 0.
The upper left entries also give d = 0. The two middle entries are e + h and d + e = e, so that h = 0 as
well. Thus the entire bottom row of P is zero, so P cannot be invertible and A and B are not similar.
Note that this is a counterexample to the converse of Theorem 4.22. These two matrices satisfy parts
(a) through (g) of the conclusion, yet they are not similar.
16. The characteristic polynomial of A is
det(A − λI) = |2 − λ  k; 1  −λ| = (2 − λ)(−λ) − k = λ² − 2λ − k.
(a) A has eigenvalues 3 and −1 if λ2 − 2λ − k = (λ − 3)(λ + 1), so when k = 3.
(b) A has an eigenvalue with algebraic multiplicity 2 when its characteristic polynomial has a double
root, which occurs when its discriminant is 0. But its discriminant is (−2)2 − 4 · 1 · (−k) = 4 + 4k,
so we must have k = −1.
(c) A has no real eigenvalues when the discriminant 4 + 4k of its characteristic polynomial is negative,
so for k < −1.
17. If λ is an eigenvalue of A with eigenvector x, then Ax = λx, so that
A3 x = A(A(A(x))) = A(A(λx)) = A(λ2 x) = λ3 x.
If A3 = A, then Ax = λ3 x, so that λ = λ3 and then λ = 0 or λ = ±1.
18. If A has two identical rows, then Theorem 4.3(c) says that det A = 0, so that A is not invertible. But
then by Theorem 4.17, 0 must be an eigenvalue of A.
19. If x is an eigenvector of A with eigenvalue 3, then
(A2 − 5A + 2I)x = A2 x − 5Ax + 2Ix = 9x − 5 · 3x + 2x = −4x.
This shows that x is also an eigenvector of A2 − 5A + 2I, with eigenvalue −4.
20. Suppose that B = P −1 AP where P is invertible. This can be rewritten as BP −1 = P −1 A. Let λ be
an eigenvalue of A with eigenvector x. Then
B(P −1 x) = (BP −1 )x = (P −1 A)x = P −1 (Ax) = P −1 (λx) = λ(P −1 x).
Thus P −1 x is an eigenvector for B, corresponding to λ.
Chapter 5
Orthogonality
5.1 Orthogonality in Rn
1. Since
v1 · v2 = (−3) · 2 + 1 · 4 + 2 · 1 = 0,
v1 · v3 = (−3) · 1 + 1 · (−1) + 2 · 2 = 0,
v2 · v3 = 2 · 1 + 4 · (−1) + 1 · 2 = 0,
this set of vectors is orthogonal.
2. Since
v1 · v2 = 4 · (−1) + 2 · 2 − 5 · 0 = 0,
v1 · v3 = 4 · 2 + 2 · 1 − 5 · 2 = 0,
v2 · v3 = −1 · 2 + 2 · 1 + 0 · 2 = 0,
this set of vectors is orthogonal.
3. Since v1 · v2 = 3 · (−1) + 1 · 2 − 1 · 1 = −2 ≠ 0, this set of vectors is not orthogonal. There is no need
to check the other two dot products; all three must be zero for the set to be orthogonal.
4. Since v1 · v3 = 5 · 3 + 3 · 1 + 1 · (−1) = 17 ≠ 0, this set of vectors is not orthogonal. There is no need
to check the other two dot products; all three must be zero for the set to be orthogonal.
5. Since
v1 · v2 = 2 · (−2) + 3 · 1 − 1 · (−1) + 4 · 0 = 0,
v1 · v3 = 2 · (−4) + 3 · (−6) − 1 · 2 + 4 · 7 = 0,
v2 · v3 = −2 · (−4) + 1 · (−6) − 1 · 2 + 0 · 7 = 0,
this set of vectors is orthogonal.
6. Since v2 · v4 = −1 · 0 + 0 · (−1) + 1 · 1 + 2 · 1 = 3 ≠ 0, this set of vectors is not orthogonal. There is no
need to check the other five dot products (they are all zero); all dot products must be zero for the set
to be orthogonal.
7. Since the vectors are orthogonal:
[4; −2] · [1; 2] = 4 · 1 − 2 · 2 = 0,
they are linearly independent by Theorem 5.1. Since there are two vectors, they form an orthogonal basis for R². By Theorem 5.2, we have
w = ((v1 · w)/(v1 · v1))v1 + ((v2 · w)/(v2 · v2))v2 = ((4 · 1 − 2 · (−3))/(4 · 4 − 2 · (−2)))v1 + ((1 · 1 + 2 · (−3))/(1 · 1 + 2 · 2))v2 = (1/2)v1 − v2,
so that
[w]B = [1/2; −1].
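The Theorem 5.2 coordinate formula used here (and in Exercises 8–10) is mechanical and can be sketched in code (assuming NumPy; `coords_orthogonal` is an illustrative name):

```python
import numpy as np

def coords_orthogonal(basis, w):
    """Coordinates of w with respect to an orthogonal basis (Theorem 5.2):
    c_i = (v_i . w) / (v_i . v_i)."""
    return np.array([np.dot(v, w) / np.dot(v, v) for v in basis])

v1, v2 = np.array([4.0, -2.0]), np.array([1.0, 2.0])
w = np.array([1.0, -3.0])
c = coords_orthogonal([v1, v2], w)
assert np.allclose(c, [0.5, -1.0])            # [w]_B as computed above
assert np.allclose(c[0] * v1 + c[1] * v2, w)  # reconstructs w
```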
8. Since the vectors are orthogonal:
[1; 3] · [−6; 2] = 1 · (−6) + 3 · 2 = 0,
they are linearly independent by Theorem 5.1. Since there are two vectors, they form an orthogonal basis for R². By Theorem 5.2, we have
w = ((v1 · w)/(v1 · v1))v1 + ((v2 · w)/(v2 · v2))v2 = ((1 · 1 + 3 · 1)/(1 · 1 + 3 · 3))v1 + ((−6 · 1 + 2 · 1)/(−6 · (−6) + 2 · 2))v2 = (2/5)v1 − (1/10)v2,
so that
[w]B = [2/5; −1/10].
9. Since the vectors are orthogonal:
[1; 0; −1] · [1; 2; 1] = 1 · 1 + 0 · 2 − 1 · 1 = 0,
[1; 0; −1] · [1; −1; 1] = 1 · 1 + 0 · (−1) − 1 · 1 = 0,
[1; 2; 1] · [1; −1; 1] = 1 · 1 + 2 · (−1) + 1 · 1 = 0,
they are linearly independent by Theorem 5.1. Since there are three vectors, they form an orthogonal basis for R³. By Theorem 5.2, we have
w = ((v1 · w)/(v1 · v1))v1 + ((v2 · w)/(v2 · v2))v2 + ((v3 · w)/(v3 · v3))v3
= ((1 · 1 + 0 · 1 − 1 · 1)/(1 · 1 + 0 · 0 − 1 · (−1)))v1 + ((1 · 1 + 2 · 1 + 1 · 1)/(1 · 1 + 2 · 2 + 1 · 1))v2 + ((1 · 1 − 1 · 1 + 1 · 1)/(1 · 1 − 1 · (−1) + 1 · 1))v3
= (2/3)v2 + (1/3)v3,
so that
[w]B = [0; 2/3; 1/3].
10. Since the vectors are orthogonal:
[1; 1; 1] · [1; −1; 0] = 1 · 1 + 1 · (−1) + 1 · 0 = 0,
[1; 1; 1] · [1; 1; −2] = 1 · 1 + 1 · 1 + 1 · (−2) = 0,
[1; −1; 0] · [1; 1; −2] = 1 · 1 − 1 · 1 + 0 · (−2) = 0,
they are linearly independent by Theorem 5.1. Since there are three vectors, they form an orthogonal basis for R³. By Theorem 5.2, we have
w = ((v1 · w)/(v1 · v1))v1 + ((v2 · w)/(v2 · v2))v2 + ((v3 · w)/(v3 · v3))v3
= ((1 · 1 + 1 · 2 + 1 · 3)/(1 · 1 + 1 · 1 + 1 · 1))v1 + ((1 · 1 − 1 · 2 + 0 · 3)/(1 · 1 − 1 · (−1) + 0 · 0))v2 + ((1 · 1 + 1 · 2 − 2 · 3)/(1 · 1 + 1 · 1 − 2 · (−2)))v3
= 2v1 − (1/2)v2 − (1/2)v3,
so that
[w]B = [2; −1/2; −1/2].
11. Since ‖v1‖ = √((3/5)² + (4/5)²) = 1 and ‖v2‖ = √((−4/5)² + (3/5)²) = 1, the given set of vectors is orthonormal.
12. Since ‖v1‖ = √((1/2)² + (1/2)²) = 1/√2 ≠ 1 and ‖v2‖ = √((1/2)² + (−1/2)²) = 1/√2 ≠ 1, the given set of vectors is not orthonormal. To normalize the vectors, we divide each by its norm to get
{√2[1/2; 1/2], √2[1/2; −1/2]} = {[1/√2; 1/√2], [1/√2; −1/√2]}.
13. Taking norms, we get
‖v1‖ = √((2/3)² + (1/3)² + (2/3)²) = 1,
‖v2‖ = √((2/3)² + (−1/3)² + 0²) = √5/3,
‖v3‖ = √(1² + 2² + (5/2)²) = 3√5/2.
So the first vector is already normalized; we must normalize the other two by dividing by their norms, to get
{[2/3; 1/3; 2/3], [2/√5; −1/√5; 0], [2/(3√5); 4/(3√5); 5/(3√5)]}.
14. Taking norms, we get
‖v1‖ = √((1/2)² + (1/2)² + (−1/2)² + (1/2)²) = 1,
‖v2‖ = √(0² + (1/3)² + (1/3)² + (2/3)²) = √6/3,
‖v3‖ = √((1/2)² + (−1/6)² + (1/6)² + (−1/6)²) = √3/3.
So the first vector is already normalized; we must normalize the other two by dividing by their norms (that is, multiplying v2 by 3/√6 and v3 by √3), which yields an orthonormal set.
15. Taking norms, we get
‖v1‖ = √((1/2)² + (1/2)² + (−1/2)² + (1/2)²) = 1,
‖v2‖ = √(0² + (√6/3)² + (1/√6)² + (−1/√6)²) = 1,
‖v3‖ = √((√3/2)² + (−√3/6)² + (√3/6)² + (−√3/6)²) = 1,
‖v4‖ = √(0² + 0² + (1/√2)² + (1/√2)²) = 1.
So these vectors are already orthonormal.
16. The columns of the matrix are orthogonal:
[0; 1] · [−1; 0] = 0 · (−1) + 1 · 0 = 0.
Further, they are orthonormal since
[0; 1] · [0; 1] = 0 · 0 + 1 · 1 = 1,  [−1; 0] · [−1; 0] = −1 · (−1) + 0 · 0 = 1.
Thus the matrix is an orthogonal matrix, so by Theorem 5.5, the inverse of this matrix is its transpose, or
[0 1; −1 0].
17. The columns of the matrix are orthogonal:
[1/√2; −1/√2] · [1/√2; 1/√2] = (1/√2)(1/√2) − (1/√2)(1/√2) = 0.
Further, they are orthonormal since
[1/√2; −1/√2] · [1/√2; −1/√2] = (1/√2)(1/√2) + (−1/√2)(−1/√2) = 1,
[1/√2; 1/√2] · [1/√2; 1/√2] = (1/√2)(1/√2) + (1/√2)(1/√2) = 1.
Thus the matrix is an orthogonal matrix, so by Theorem 5.5, the inverse of this matrix is its transpose, or
[1/√2  −1/√2; 1/√2  1/√2].
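The checks in Exercises 16–21 follow a single pattern — the columns must be orthonormal, equivalently QᵀQ = I — which can be sketched as follows (assuming NumPy; `is_orthogonal` is an illustrative name):

```python
import numpy as np

def is_orthogonal(Q, tol=1e-12):
    """Q is orthogonal iff its columns are orthonormal, i.e. Q^T Q = I."""
    Q = np.asarray(Q, dtype=float)
    return np.allclose(Q.T @ Q, np.eye(Q.shape[1]), atol=tol)

s = 1 / np.sqrt(2)
Q = np.array([[ s, s],
              [-s, s]])          # the matrix of Exercise 17
assert is_orthogonal(Q)
# For an orthogonal matrix, the inverse is just the transpose (Theorem 5.5).
assert np.allclose(np.linalg.inv(Q), Q.T)
```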
18. This is not an orthogonal matrix, since none of its columns is a unit vector:
q1 · q1 = 1/9 + 1/9 + 1/9 = 1/3,  q2 · q2 = 1/4 + 1/4 + 0 = 1/2,  q3 · q3 = 4/25 + 1/25 + 1/25 = 6/25.
19. This matrix Q is orthogonal by Theorem 5.4: multiplying out QQᵀ and simplifying repeatedly with sin²θ + cos²θ = 1 gives QQᵀ = I. Thus by Theorem 5.5, Q⁻¹ = Qᵀ.
20. The columns are all orthogonal, since each dot product consists of two terms equal to (1/2)(1/2) and two equal to −(1/2)(1/2). The columns are orthonormal since the norm of each column is
√((±1/2)² + (±1/2)² + (±1/2)² + (±1/2)²) = 1.
So the matrix is orthogonal, and by Theorem 5.5 its inverse is equal to its transpose,
[1/2 1/2 1/2 −1/2; −1/2 1/2 1/2 1/2; 1/2 1/2 −1/2 1/2; 1/2 −1/2 1/2 1/2].
21. The dot product of the first and fourth columns is
[1; 0; 0; 0] · [1/√6; 1/√6; −1/√6; 1/√2] = 1 · (1/√6) ≠ 0,
so the columns are not orthonormal and the matrix is not orthogonal.
22. By Theorem 5.5, Q⁻¹ is orthogonal if and only if (Q⁻¹)⁻¹ = (Q⁻¹)ᵀ. But since Q is orthogonal, Q⁻¹ = Qᵀ, so that
(Q⁻¹)ᵀ = (Qᵀ)ᵀ = Q = (Q⁻¹)⁻¹,
and Q⁻¹ is orthogonal.
23. Since det Q = det QT , we have
1 = det(I) = det(QQ−1 ) = det(QQT ) = (det Q)(det QT ) = (det Q)2 .
Since (det Q)2 = 1, we must have det Q = ±1.
24. By Theorem 5.5, it suffices to show that (Q1Q2)⁻¹ = (Q1Q2)ᵀ. But using the fact that Q1⁻¹ = Q1ᵀ and Q2⁻¹ = Q2ᵀ, we have
(Q1Q2)⁻¹ = Q2⁻¹Q1⁻¹ = Q2ᵀQ1ᵀ = (Q1Q2)ᵀ.
Thus Q1Q2 is orthogonal.
25. By Theorem 3.17, if P is a permutation matrix, then P −1 = P T . So by Theorem 5.5, P is orthogonal.
26. Reordering the rows of Q is the same as multiplying it on the left by a permutation matrix P . By the
previous exercise, P is orthogonal, and then by Exercise 24, P Q is also orthogonal. So reordering the
rows of an orthogonal matrix results in an orthogonal matrix.
27. Let θ be the angle between x and y, and φ the angle between Qx and Qy. By Theorem 5.6, ‖Qx‖ = ‖x‖, ‖Qy‖ = ‖y‖, and Qx · Qy = x · y, so that
cos θ = (x · y)/(‖x‖ ‖y‖) = (Qx · Qy)/(‖Qx‖ ‖Qy‖) = cos φ.
Since 0 ≤ θ, φ ≤ π, the fact that their cosines are equal implies that θ = φ.
28. (a) Suppose that Q = [a c; b d] is an orthogonal matrix. Then QQᵀ = QᵀQ = I gives
[a b; c d][a c; b d] = [a² + b²  ac + bd; ac + bd  c² + d²] = [1 0; 0 1].
So a2 + b2 = 1 and c2 + d2 = 1, and thus both columns of Q are unit vectors. Also, ac + bd = 0,
which implies that ac = −bd and therefore a2 c2 = b2 d2 . But c2 = 1 − d2 and b2 = 1 − a2 ;
substituting gives
a²(1 − d²) = (1 − a²)d²  ⇒  a² − a²d² = d² − a²d²  ⇒  a² = d²  ⇒  a = ±d.
If a = d = 0, then b = c = ±1, so we get one of the matrices
[0 1; 1 0], [0 −1; 1 0], [0 1; −1 0], [0 −1; −1 0],
all of which are of one of the required forms. Otherwise, if a = d ≠ 0, then c² = 1 − d² = 1 − a² = b²,
so that b = ±c and we get
[a ±b; b a],
and we must choose the minus sign in order that the columns be orthogonal. So the result is
[a −b; b a],
which is also of the required form. Finally, if a = −d ≠ 0, then again b² = c² and we get
[a ±b; b −a];
this time we must choose the plus sign for the columns to be orthogonal, resulting in
[a b; b −a],
which is again of the required form.
(b) Since [a; b] is a unit vector, we have a² + b² = 1, so that (a, b) is a point on the unit circle. Therefore there is some θ with (a, b) = (cos θ, sin θ). Substituting into the forms from part (a) gives the desired matrices.
(c) Let
Q = [cos θ  −sin θ; sin θ  cos θ],  R = [cos θ  sin θ; sin θ  −cos θ].
From Example 3.58 in Section 3.6, Q is the standard matrix of a rotation through angle θ. From Exercise 26(c) in Section 3.6, R is a reflection through the line ℓ that makes an angle of θ/2 with the x-axis, which is the line y = (tan θ/2)x (or x = 0 if θ = π).
(d) From part (c), we have det Q = cos2 θ + sin2 θ = 1 and det R = − cos2 θ − sin2 θ = −1.
29. Since det Q = (1/√2)(1/√2) − (−1/√2)(1/√2) = 1, this matrix represents a rotation. To find the angle, we have cos θ = 1/√2 and sin θ = 1/√2, so that θ = π/4 = 45°.
30. Since det Q = (−1/2)(−1/2) − (√3/2)(−√3/2) = 1, this matrix represents a rotation. To find the angle, we have cos θ = −1/2 and sin θ = −√3/2, so that θ = 4π/3 = 240°.
31. Since det Q = (−1/2)(1/2) − (√3/2)(√3/2) = −1, this matrix represents a reflection. If cos θ = −1/2 and sin θ = √3/2, then θ = 2π/3, so that by part (d) of the previous exercise the reflection occurs through the line y = (tan π/3)x = √3 x.
32. Since det Q = (−3/5)(3/5) − (−4/5)(−4/5) = −1, this matrix represents a reflection. Let θ be the (third-quadrant) angle with cos θ = −3/5 and sin θ = −4/5; then by part (d) of the previous exercise, the reflection occurs through the line y = (tan θ/2)x.
33. (a) Since A and B are orthogonal matrices, we know that AT = A−1 and B T = B −1 , so that
A(Aᵀ + Bᵀ)B = A(A⁻¹ + B⁻¹)B = AA⁻¹B + AB⁻¹B = B + A = A + B.
(b) Since A and B are orthogonal matrices, we have det A = ±1 and det B = ±1. If det A+det B = 0,
then one of them must be 1 and the other must be −1, so that (det A)(det B) = −1. But by part
(a),
det(A + B) = det(A(Aᵀ + Bᵀ)B) = (det A)(det(Aᵀ + Bᵀ))(det B)
= (det A)(det((A + B)ᵀ))(det B) = (det A)(det B)(det(A + B)) = −det(A + B).
Since det(A+B) = − det(A+B), we conclude that det(A+B) = 0 so that A+B is not invertible.
34. It suffices to show that QQᵀ = I. Now,
Q = [x1  yᵀ; y  I − (1/(1 − x1))yyᵀ].
Also, x1² + yᵀy = x1² + x2² + ⋯ + xn² = 1, since x is a unit vector. Rearranging gives yᵀy = 1 − x1². Further, we have
yᵀ(I − (1/(1 − x1))yyᵀ) = yᵀ − (1/(1 − x1))(yᵀy)yᵀ = yᵀ − (1 + x1)yᵀ = −x1yᵀ,
(I − (1/(1 − x1))yyᵀ)y = y − (1/(1 − x1))(1 − x1²)y = y − (1 + x1)y = −x1y,
(I − (1/(1 − x1))yyᵀ)² = I − (2/(1 − x1))yyᵀ + (1/(1 − x1))²(yyᵀ)(yyᵀ)
= I − (2(1 − x1)/(1 − x1)²)yyᵀ + ((1 − x1²)/(1 − x1)²)yyᵀ
= I − ((1 − 2x1 + x1²)/(1 − x1)²)yyᵀ = I − yyᵀ.
Then evaluating QQᵀ gives
QQᵀ = [x1² + yᵀy    x1yᵀ + yᵀ(I − (1/(1 − x1))yyᵀ); x1y + (I − (1/(1 − x1))yyᵀ)y    yyᵀ + (I − (1/(1 − x1))yyᵀ)²]
= [1    x1yᵀ − x1yᵀ; x1y − x1y    yyᵀ + I − yyᵀ] = [1  O; O  I] = I.
Thus Q is orthogonal.
35. Let Q be orthogonal and upper triangular. Then the columns qi of Q are orthonormal; that is,
qi · qj = Σ_{k=1}^n qki qkj = 0 if i ≠ j, and 1 if i = j.
Since Q is upper triangular, we know that qij = 0 when i > j. We prove by induction on the rows that
each row has only one nonzero entry, along the diagonal. Note that q11 6= 0, since all other entries in
the first column are zero and the norm of the first column vector must be 1. Now the dot product of
q1 with qj for j > 1 is
q1 · qj = Σ_{k=1}^n qk1 qkj = q11 q1j.
Since these columns must be orthogonal, this product is zero, so that q1j = 0 for j > 1 and therefore
q11 is the only nonzero entry in the first row. Now assume that q11 , q22 , . . . , qii are the only nonzero
entries in their rows, and consider qi+1,i+1 . It is the only nonzero entry in its column, since entries
in lower-numbered rows are zero by the inductive hypothesis and entries in higher-numbered rows are
zero since Q is upper triangular. Therefore, qi+1,i+1 6= 0 since the (i + 1)st column must have norm 1.
Now consider the dot product of qi+1 with qj for j > i + 1:
qi+1 · qj = Σ_{k=1}^n qk,i+1 qkj = qi+1,i+1 qi+1,j.
Since these columns must be orthogonal, this product is zero, so that qi+1,j = 0 for j > i + 1. So all
entries to the right of the (i + 1)st column are zero. But also entries to the left of the (i + 1)st column
are zero since Q is upper triangular. Thus qi+1,i+1 is the only nonzero entry in its row. So inductively,
qkk is the only nonzero entry in the k th row for k = 1, 2, . . . , n, so that Q is diagonal.
36. If n > m, then the maximum rank of A is m. By the Rank Theorem (Theorem 3.26, in Section 3.5),
we know that rank(A) + nullity(A) = n; since n > m, we have nullity(A) = n − m > 0. So there is
some nonzero vector x such that Ax = 0. But then ‖x‖ ≠ 0 while ‖Ax‖ = ‖0‖ = 0.
37. (a) Since the {vi} form a basis, we may choose αi, βi such that
x = Σ_{k=1}^n αk vk,  y = Σ_{k=1}^n βk vk.
Because {vi} form an orthonormal basis, we know that vi · vi = 1 and vi · vj = 0 for i ≠ j. Then
x · y = (Σ_{k=1}^n αk vk) · (Σ_{k=1}^n βk vk) = Σ_{i,j=1}^n αi βj (vi · vj) = Σ_{i=1}^n αi βi (vi · vi) = Σ_{i=1}^n αi βi.
But x · vi = Σ_{k=1}^n αk (vk · vi) = αi and similarly y · vi = βi. Therefore the sum at the end of the displayed equation above is
Σ_{i=1}^n αi βi = Σ_{i=1}^n (x · vi)(y · vi) = (x · v1)(y · v1) + (x · v2)(y · v2) + ⋯ + (x · vn)(y · vn),
proving Parseval’s Identity.
(b) Since [x]B = [αk] and [y]B = [βk], Parseval's Identity says that
x · y = Σ_{k=1}^n αk βk = [x]B · [y]B.
In other words, the dot product is invariant under a change of orthonormal basis.
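Parseval's Identity is easy to verify numerically for any orthonormal basis (a sketch assuming NumPy; the basis here is a hypothetical one built by QR factorization of a random matrix):

```python
import numpy as np

rng = np.random.default_rng(1)

# Build an orthonormal basis of R^4 via QR (any orthonormal basis works
# in Parseval's Identity).
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
basis = [Q[:, i] for i in range(4)]

x = rng.standard_normal(4)
y = rng.standard_normal(4)

parseval = sum((x @ v) * (y @ v) for v in basis)
assert np.isclose(parseval, x @ y)
```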
5.2 Orthogonal Complements and Orthogonal Projections
1. W is the column space of A = [1; 2], so that W⊥ is the null space of Aᵀ = [1 2]. The augmented matrix
[1 2 | 0]
is already row-reduced, so that a basis for the null space, and thus for W⊥, is given by
[−2; 1].
2. W is the column space of A = [−4; 3], so that W⊥ is the null space of Aᵀ = [−4 3]. Row-reducing gives
[−4 3 | 0] → [1 −3/4 | 0],
so that a basis for the null space, and thus for W⊥, is given by
[3/4; 1], or [3; 4].
3. W consists of all vectors of the form
[x; y; x + y] = x[1; 0; 1] + y[0; 1; 1], so that {[1; 0; 1], [0; 1; 1]} is a basis for W.
Then W is the column space of
A = [1 0; 0 1; 1 1],
so that W⊥ is the null space of Aᵀ. The augmented matrix is already row-reduced:
[1 0 1 | 0; 0 1 1 | 0],
so that the null space, which is W⊥, is the set
{[−t; −t; t]} = {t[−1; −1; 1]} = span([−1; −1; 1]).
This vector forms a basis for W⊥.
4. W consists of
   

 
 
x
1
0
0 
 1
2x + 3z  = x 2 + y 3 , so that 2 , 3 is a basis for W .


z
0
1
0
1

Then W is the column space of

1
A = 2
0

0
3 ,
1
so that W ⊥ is the null space of AT . Row-reduce the augmented matrix:
"
#
"
#
1 2 0 0
1 0 − 32 0
→
1
0 3 1 0
0 1
0
3
so that the null space, which is W ⊥ , is the set


 
 
2
2
2
t
3
3
 1 
 
 
− 3 t = t − 13  = span −1 .
3
t
1
This vector forms a basis for W ⊥ .


5. W is the column space of A = [1; −1; 3], so that W⊥ is the null space of Aᵀ = [1 −1 3]. The augmented matrix [1 −1 3 | 0] is already row-reduced, so that the null space, which is W⊥, is the set of vectors

[s − 3t, s, t]ᵀ = s[1, 1, 0]ᵀ + t[−3, 0, 1]ᵀ = span([1, 1, 0]ᵀ, [−3, 0, 1]ᵀ).

These two vectors form a basis for W⊥.

6. W is the column space of A = [1/2; −1/2; 2], so that W⊥ is the null space of Aᵀ = [1/2 −1/2 2]. Row-reducing gives

[1/2 −1/2 2 | 0] → [1 −1 4 | 0],

so the null space, which is W⊥, is the set of vectors

[s − 4t, s, t]ᵀ = s[1, 1, 0]ᵀ + t[−4, 0, 1]ᵀ = span([1, 1, 0]ᵀ, [−4, 0, 1]ᵀ).

These two vectors form a basis for W⊥.
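Exercises 1–6 all reduce to computing null(Aᵀ). Numerically this is usually done with the SVD rather than row reduction: right singular vectors belonging to zero singular values span the null space. A sketch (the rank tolerance 1e-10 is an arbitrary choice), checked against Exercise 6:

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis for null(M): right singular vectors whose
    singular values are (numerically) zero."""
    U, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T  # columns span null(M)

A = np.array([[0.5], [-0.5], [2.0]])   # Exercise 6: W = span of this column
N = null_space(A.T)                    # basis for W-perp = null(A^T)

assert N.shape == (3, 2)               # W-perp is 2-dimensional
# The hand-computed basis vectors [1,1,0] and [-4,0,1] lie in null(A^T):
for w in ([1, 1, 0], [-4, 0, 1]):
    assert np.allclose(A.T @ np.array(w, dtype=float), 0)
```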
7. To find the required bases, we row-reduce [A | 0]:

[1 −1 3 | 0; 5 2 1 | 0; 0 1 −2 | 0; −1 −1 1 | 0] → [1 0 1 | 0; 0 1 −2 | 0; 0 0 0 | 0; 0 0 0 | 0].

Therefore, a basis for the row space of A is given by the nonzero rows of the reduced matrix, or

u₁ = [1 0 1],  u₂ = [0 1 −2].

The null space null(A) is the set of solutions of the augmented matrix, which are the vectors [−s, 2s, s]ᵀ. Thus a basis for null(A) is

v = [−1, 2, 1]ᵀ.

Every vector in row(A) is orthogonal to every vector in null(A):

u₁ · v = 1·(−1) + 0·2 + 1·1 = 0,  u₂ · v = 0·(−1) + 1·2 − 2·1 = 0.

8. To find the required bases, we row-reduce [A | 0]:

[1 1 −1 0 2 | 0; −2 0 2 4 4 | 0; 2 2 −2 0 1 | 0; −3 −1 3 4 5 | 0] → [1 0 −1 −2 0 | 0; 0 1 0 2 0 | 0; 0 0 0 0 1 | 0; 0 0 0 0 0 | 0].

Thus a basis for the row space of A is given by the nonzero rows of the reduced matrix, or

u₁ = [1 0 −1 −2 0],  u₂ = [0 1 0 2 0],  u₃ = [0 0 0 0 1].

The null space null(A) is the set of solutions of the augmented matrix, which are the vectors

[s + 2t, −2t, s, t, 0]ᵀ = s[1, 0, 1, 0, 0]ᵀ + t[2, −2, 0, 1, 0]ᵀ = span([1, 0, 1, 0, 0]ᵀ, [2, −2, 0, 1, 0]ᵀ).

Thus a basis for null(A) is

v₁ = [1, 0, 1, 0, 0]ᵀ,  v₂ = [2, −2, 0, 1, 0]ᵀ.

Every vector in row(A) is orthogonal to every vector in null(A):

u₁ · v₁ = 1·1 + 0·0 − 1·1 − 2·0 + 0·0 = 0,  u₁ · v₂ = 1·2 + 0·(−2) − 1·0 − 2·1 + 0·0 = 0,
u₂ · v₁ = 0·1 + 1·0 + 0·1 + 2·0 + 0·0 = 0,  u₂ · v₂ = 0·2 + 1·(−2) + 0·0 + 2·1 + 0·0 = 0,
u₃ · v₁ = 0·1 + 0·0 + 0·1 + 0·0 + 1·0 = 0,  u₃ · v₂ = 0·2 + 0·(−2) + 0·0 + 0·1 + 1·0 = 0.
9. Using the row reduction from Exercise 7, we see that a basis for the column space of A is

u₁ = [1, 5, 0, −1]ᵀ,  u₂ = [−1, 2, 1, −1]ᵀ.

To find the null space of Aᵀ, we row-reduce [Aᵀ | 0]:

[1 5 0 −1 | 0; −1 2 1 −1 | 0; 3 1 −2 1 | 0] → [1 0 −5/7 3/7 | 0; 0 1 1/7 −2/7 | 0; 0 0 0 0 | 0].

The null space null(Aᵀ) is the set of solutions of the augmented matrix, which are the vectors

[(5/7)s − (3/7)t, −(1/7)s + (2/7)t, s, t]ᵀ = s[5/7, −1/7, 1, 0]ᵀ + t[−3/7, 2/7, 0, 1]ᵀ.

Clearing fractions, a basis for null(Aᵀ) is

v₁ = [5, −1, 7, 0]ᵀ,  v₂ = [−3, 2, 0, 7]ᵀ.

Every vector in col(A) is orthogonal to every vector in null(Aᵀ):

u₁ · v₁ = 1·5 + 5·(−1) + 0·7 − 1·0 = 0,  u₁ · v₂ = 1·(−3) + 5·2 + 0·0 − 1·7 = 0,
u₂ · v₁ = −1·5 + 2·(−1) + 1·7 − 1·0 = 0,  u₂ · v₂ = −1·(−3) + 2·2 + 1·0 − 1·7 = 0.
10. Using the row reduction from Exercise 8, we see that a basis for the column space of A is

u₁ = [1, −2, 2, −3]ᵀ,  u₂ = [1, 0, 2, −1]ᵀ,  u₃ = [2, 4, 1, 5]ᵀ.

To find the null space of Aᵀ, we row-reduce [Aᵀ | 0]:

[1 −2 2 −3 | 0; 1 0 2 −1 | 0; −1 2 −2 3 | 0; 0 4 0 4 | 0; 2 4 1 5 | 0] → [1 0 0 1 | 0; 0 1 0 1 | 0; 0 0 1 −1 | 0; 0 0 0 0 | 0; 0 0 0 0 | 0].

The null space null(Aᵀ) is the set of solutions of the augmented matrix, which are the vectors [−s, −s, s, s]ᵀ = s[−1, −1, 1, 1]ᵀ. Thus a basis for null(Aᵀ) is

v₁ = [−1, −1, 1, 1]ᵀ.

Every vector in col(A) is orthogonal to every vector in null(Aᵀ):

u₁ · v₁ = 1·(−1) − 2·(−1) + 2·1 − 3·1 = 0,
u₂ · v₁ = 1·(−1) + 0·(−1) + 2·1 − 1·1 = 0,
u₃ · v₁ = 2·(−1) + 4·(−1) + 1·1 + 5·1 = 0.
11. Let

A = [w₁ w₂] = [2 4; 1 0; −2 1].

Then the subspace W spanned by w₁ and w₂ is the column space of A, so that by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). To compute null(Aᵀ), we row-reduce

[Aᵀ | 0] = [2 1 −2 | 0; 4 0 1 | 0] → [1 0 1/4 | 0; 0 1 −5/2 | 0].

Thus null(Aᵀ) is the set of solutions of this augmented matrix, which is

[−(1/4)t, (5/2)t, t]ᵀ = span([−1/4, 5/2, 1]ᵀ) = span([−1, 10, 4]ᵀ),

so this vector forms a basis for W⊥.
12. Let

A = [w₁ w₂] = [1 0; −1 1; 3 −2; −2 1].

Then the subspace W spanned by w₁ and w₂ is the column space of A, so that by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). To compute null(Aᵀ), we row-reduce

[Aᵀ | 0] = [1 −1 3 −2 | 0; 0 1 −2 1 | 0] → [1 0 1 −1 | 0; 0 1 −2 1 | 0].

Thus null(Aᵀ) is the set of solutions of this augmented matrix, which is

[−s + t, 2s − t, s, t]ᵀ = s[−1, 2, 1, 0]ᵀ + t[1, −1, 0, 1]ᵀ = span([−1, 2, 1, 0]ᵀ, [1, −1, 0, 1]ᵀ),

so these two vectors are a basis for W⊥.
13. Let

A = [w₁ w₂ w₃] = [2 −1 2; −1 2 5; 6 −3 6; 3 −2 1].

Then the subspace W spanned by w₁, w₂, and w₃ is the column space of A, so that by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). To compute null(Aᵀ), we row-reduce

[Aᵀ | 0] = [2 −1 6 3 | 0; −1 2 −3 −2 | 0; 2 5 6 1 | 0] → [1 0 3 4/3 | 0; 0 1 0 −1/3 | 0; 0 0 0 0 | 0].

Thus null(Aᵀ) is the set of solutions of this augmented matrix, which is

[−3s − (4/3)t, (1/3)t, s, t]ᵀ = s[−3, 0, 1, 0]ᵀ + t[−4/3, 1/3, 0, 1]ᵀ = span([−3, 0, 1, 0]ᵀ, [−4, 1, 0, 3]ᵀ),

so these two vectors form a basis for W⊥.

14. Let A = [w₁ w₂ w₃] be the matrix with the given vectors as its columns. Then the subspace W spanned by w₁, w₂, and w₃ is the column space of A, so that by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). Row-reducing [Aᵀ | 0] gives

[Aᵀ | 0] → [1 0 0 −2 7 | 0; 0 1 0 3/2 −5 | 0; 0 0 1 0 −1 | 0].

Thus null(Aᵀ) is the set of solutions of this augmented matrix, which is

[2s − 7t, −(3/2)s + 5t, t, s, t]ᵀ = s[2, −3/2, 0, 1, 0]ᵀ + t[−7, 5, 1, 0, 1]ᵀ.

Clearing fractions, we get [4, −3, 0, 2, 0]ᵀ and [−7, 5, 1, 0, 1]ᵀ as a basis for W⊥.
15. We have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ = ((1·7 + 1·(−4))/(1·1 + 1·1)) u₁ = (3/2) u₁ = [3/2, 3/2]ᵀ.

16. We have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂
= ((1·3 + 1·1 + 1·(−2))/(1·1 + 1·1 + 1·1)) u₁ + ((1·3 − 1·1 + 0·(−2))/(1·1 − 1·(−1) + 0·0)) u₂
= (2/3) u₁ + u₂
= (2/3)[1, 1, 1]ᵀ + [1, −1, 0]ᵀ = [5/3, −1/3, 2/3]ᵀ.
17. We have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂
= ((−1·1 + 1·2 + 4·3)/(−1·(−1) + 1·1 + 4·4)) u₁ + ((2·1 − 2·2 + 1·3)/(2·2 − 2·(−2) + 1·1)) u₂
= (13/18) u₁ + (1/9) u₂
= [−13/18, 13/18, 52/18]ᵀ + [2/9, −2/9, 1/9]ᵀ = [−1/2, 1/2, 3]ᵀ.

18. We have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂ + ((u₃ · v)/(u₃ · u₃)) u₃
= ((1·3 + 1·(−2) + 0·4 + 0·(−3))/(1·1 + 1·1 + 0·0 + 0·0)) u₁ + ((1·3 − 1·(−2) − 1·4 + 1·(−3))/(1·1 − 1·(−1) − 1·(−1) + 1·1)) u₂ + ((0·3 + 0·(−2) + 1·4 + 1·(−3))/(0·0 + 0·0 + 1·1 + 1·1)) u₃
= (1/2) u₁ − (1/2) u₂ + (1/2) u₃
= (1/2)[1, 1, 0, 0]ᵀ − (1/2)[1, −1, −1, 1]ᵀ + (1/2)[0, 0, 1, 1]ᵀ = [0, 1, 1, 0]ᵀ.
19. Since W is spanned by the vector u₁ = [1, 3]ᵀ, we have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ = ((1·2 + 3·(−2))/(1·1 + 3·3)) u₁ = −(2/5)[1, 3]ᵀ = [−2/5, −6/5]ᵀ.

The component of v orthogonal to W is thus

proj_{W⊥}(v) = v − proj_W(v) = [12/5, −4/5]ᵀ.
 
20. Since W is spanned by the vector u₁ = [1, 1, 1]ᵀ, we have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ = ((1·3 + 1·2 + 1·(−1))/(1·1 + 1·1 + 1·1)) u₁ = (4/3) u₁ = [4/3, 4/3, 4/3]ᵀ.

The component of v orthogonal to W is thus

proj_{W⊥}(v) = v − proj_W(v) = [5/3, 2/3, −7/3]ᵀ.
21. Since W is spanned by u₁ = [1, 2, 1]ᵀ and u₂ = [1, −1, 1]ᵀ, we have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂
= ((1·4 + 2·(−2) + 1·3)/(1·1 + 2·2 + 1·1)) u₁ + ((1·4 − 1·(−2) + 1·3)/(1·1 − 1·(−1) + 1·1)) u₂
= (1/2) u₁ + 3 u₂
= [1/2, 1, 1/2]ᵀ + [3, −3, 3]ᵀ = [7/2, −2, 7/2]ᵀ.

The component of v orthogonal to W is thus

proj_{W⊥}(v) = v − proj_W(v) = [1/2, 0, −1/2]ᵀ.
22. Since W is spanned by u₁ = [1, 1, 1, 0]ᵀ and u₂ = [1, 0, −1, 1]ᵀ, we have

proj_W(v) = ((u₁ · v)/(u₁ · u₁)) u₁ + ((u₂ · v)/(u₂ · u₂)) u₂
= ((1·2 + 1·(−1) + 1·5 + 0·6)/(1·1 + 1·1 + 1·1 + 0·0)) u₁ + ((1·2 + 0·(−1) − 1·5 + 1·6)/(1·1 + 0·0 − 1·(−1) + 1·1)) u₂
= 2 u₁ + u₂
= [2, 2, 2, 0]ᵀ + [1, 0, −1, 1]ᵀ = [3, 2, 1, 1]ᵀ.

The component of v orthogonal to W is thus

proj_{W⊥}(v) = v − proj_W(v) = [−1, −3, 4, 5]ᵀ.
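All of Exercises 15–22 use the same formula: for a mutually orthogonal basis {u₁, . . . , u_k} of W, proj_W(v) = Σ ((uᵢ · v)/(uᵢ · uᵢ)) uᵢ. A small sketch, checked against the data of Exercise 22:

```python
import numpy as np

def proj(v, basis):
    """Project v onto span(basis); basis vectors must be mutually orthogonal."""
    v = np.asarray(v, dtype=float)
    p = np.zeros_like(v)
    for u in basis:
        u = np.asarray(u, dtype=float)
        p += (u @ v) / (u @ u) * u
    return p

u1 = [1, 1, 1, 0]
u2 = [1, 0, -1, 1]
v = np.array([2, -1, 5, 6], dtype=float)

p = proj(v, [u1, u2])
assert np.allclose(p, [3, 2, 1, 1])        # proj_W(v)
assert np.allclose(v - p, [-1, -3, 4, 5])  # component orthogonal to W
# The perpendicular component really is orthogonal to W:
assert np.isclose((v - p) @ np.array(u1, dtype=float), 0)
```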
23. W ⊥ is defined to be the set of all vectors in Rn that are orthogonal to every vector in W . So if
w ∈ W ∩ W ⊥ , then w · w = 0. But this means that w = 0. Thus 0 is the only element of W ∩ W ⊥ .
24. If v ∈ W ⊥ , then since v is orthogonal to every vector in W , it is certainly orthogonal to wi , so that
v · wi = 0 for i = 1, 2, . . . , k. For the converse, suppose w ∈ W ; then since W = span(w1 , . . . , wk ),
there are scalars a_i such that w = a₁w₁ + a₂w₂ + ··· + a_k w_k. Then

v · w = v · (a₁w₁ + ··· + a_k w_k) = a₁(v · w₁) + ··· + a_k(v · w_k) = 0
since v is orthogonal to each of the wi . But this means that v is orthogonal to every vector in W , so
that v ∈ W ⊥ .
 
 
25. No, it is not true. For example, let W ⊂ R³ be the xy-plane, spanned by e₁ = [1, 0, 0]ᵀ and e₂ = [0, 1, 0]ᵀ. Then W⊥ = span(e₃) = span([0, 0, 1]ᵀ). Now let w = e₁ and w′ = e₂; then w · w′ = 0 but w′ ∉ W⊥.
26. Yes, this is true. Let V = span(v_{k+1}, . . . , v_n); we want to show that V = W⊥. By Theorem 5.9(d), all we must show is that for every v ∈ V, v_i · v = 0 for i = 1, 2, . . . , k. But v = c_{k+1}v_{k+1} + ··· + c_n v_n, so if i ≤ k, since the v_i are an orthogonal basis for Rⁿ, we get

v_i · (c_{k+1}v_{k+1} + ··· + c_n v_n) = c_{k+1}(v_i · v_{k+1}) + ··· + c_n(v_i · v_n) = 0.
27. Let {u_k} be an orthonormal basis for W. If

x = proj_W(x) = Σ_{k=1}^n ((u_k · x)/(u_k · u_k)) u_k = Σ_{k=1}^n (u_k · x) u_k,

then x is in the span of the u_k, which is W. For the converse, suppose that x ∈ W. Then x = Σ_{k=1}^n a_k u_k. Since the u_k are an orthonormal basis for W, we get

x · u_i = Σ_{k=1}^n a_k (u_k · u_i) = a_i (u_i · u_i) = a_i.

Therefore

x = Σ_{k=1}^n a_k u_k = Σ_{k=1}^n (x · u_k) u_k = proj_W(x).
28. Let {u_k} be an orthonormal basis for W. Then

proj_W(x) = Σ_{k=1}^n ((u_k · x)/(u_k · u_k)) u_k = Σ_{k=1}^n (u_k · x) u_k.

So if proj_W(x) = 0, then u_k · x = 0 for all k, since the u_k are linearly independent. But then by Theorem 5.9(d), x ∈ W⊥, so that x is orthogonal to W. Conversely, if x is orthogonal to W, then u_k · x = 0 for all k, so the sum above is zero and thus proj_W(x) = 0.
29. If {u_k} is an orthonormal basis for W, then

proj_W(x) = Σ_{k=1}^n (u_k · x) u_k ∈ W.

By Exercise 27, x ∈ W if and only if proj_W(x) = x. Applying this to proj_W(x) ∈ W gives

proj_W(proj_W(x)) = proj_W(x)
as desired.
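Exercise 29's identity proj_W(proj_W(x)) = proj_W(x) says that projections are idempotent: with an orthonormal basis of W stored as the columns of a matrix Q, the projection acts as P = QQᵀ, and P² = P. A quick numerical sketch (the subspace here is arbitrary test data):

```python
import numpy as np

# Orthonormal basis for a 2-dimensional W inside R^4 (Q factor of random data).
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 2)))

P = Q @ Q.T                  # proj_W as a matrix: x -> sum over k of (u_k . x) u_k

assert np.allclose(P @ P, P)           # projecting twice changes nothing
x = rng.standard_normal(4)
assert np.allclose(P @ (P @ x), P @ x)
```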
30. (a) Let W = span(S) = span{v₁, . . . , v_k}, and let T = {b₁, . . . , b_j} be an orthonormal basis for W⊥. By Theorem 5.13, dim W + dim W⊥ = k + j = n. Therefore S ∪ T is a basis for Rⁿ. Further, it is an orthonormal basis: any two distinct members of S are orthogonal since S is an orthonormal set; similarly, any two distinct members of T are orthogonal. Also, v_i · b_j = 0 since one of the vectors is in W and the other is in W⊥. Finally, v_i · v_i = b_j · b_j = 1 since both sets are orthonormal sets.

Choose x ∈ Rⁿ; then there are α_r, β_s such that

x = Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s.

Then using orthonormality of S ∪ T, we have

x · v_i = (Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s) · v_i = α_i  and  x · b_i = (Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s) · b_i = β_i.

But then (again using orthonormality)

‖x‖² = x · x = (Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s) · (Σ_{r=1}^k α_r v_r + Σ_{s=1}^j β_s b_s)
= Σ_{r=1}^k α_r² + Σ_{s=1}^j β_s²
= Σ_{r=1}^k |x · v_r|² + Σ_{s=1}^j |x · b_s|²
≥ Σ_{r=1}^k |x · v_r|²,

as required.

(b) Looking at the string of equalities and inequalities above, we see that equality holds if and only if Σ_{s=1}^j |x · b_s|² = 0. But this means that x · b_s = 0 for all s, which in turn means that x ∈ span{v₁, . . . , v_k} = W.
5.3
The Gram-Schmidt Process and the QR Factorization
1. First obtain an orthogonal basis:

v₁ = x₁ = [1, 1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [1, 2]ᵀ − ((1·1 + 1·2)/(1·1 + 1·1))[1, 1]ᵀ = [1, 2]ᵀ − (3/2)[1, 1]ᵀ = [−1/2, 1/2]ᵀ.

Normalizing each of the vectors gives

q₁ = v₁/‖v₁‖ = (1/√2)[1, 1]ᵀ = [1/√2, 1/√2]ᵀ,
q₂ = v₂/‖v₂‖ = √2 [−1/2, 1/2]ᵀ = [−1/√2, 1/√2]ᵀ.
2. First obtain an orthogonal basis:

v₁ = x₁ = [3, −3]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [3, 1]ᵀ − ((3·3 − 3·1)/(3·3 − 3·(−3)))[3, −3]ᵀ = [3, 1]ᵀ − (1/3)[3, −3]ᵀ = [2, 2]ᵀ.

Normalizing each of the vectors gives

q₁ = v₁/‖v₁‖ = (1/√18)[3, −3]ᵀ = [1/√2, −1/√2]ᵀ,
q₂ = v₂/‖v₂‖ = (1/√8)[2, 2]ᵀ = [1/√2, 1/√2]ᵀ.
3. First obtain an orthogonal basis:

v₁ = x₁ = [1, −1, −1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [0, 3, 3]ᵀ − ((1·0 − 1·3 − 1·3)/(1·1 − 1·(−1) − 1·(−1)))[1, −1, −1]ᵀ = [0, 3, 3]ᵀ + 2[1, −1, −1]ᵀ = [2, 1, 1]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [3, 2, 4]ᵀ − ((1·3 − 1·2 − 1·4)/3)[1, −1, −1]ᵀ − ((2·3 + 1·2 + 1·4)/(2·2 + 1·1 + 1·1))[2, 1, 1]ᵀ
= [3, 2, 4]ᵀ + [1, −1, −1]ᵀ − 2[2, 1, 1]ᵀ = [0, −1, 1]ᵀ.

Normalizing each of the vectors gives

q₁ = v₁/‖v₁‖ = (1/√3)[1, −1, −1]ᵀ = [1/√3, −1/√3, −1/√3]ᵀ,
q₂ = v₂/‖v₂‖ = (1/√6)[2, 1, 1]ᵀ = [2/√6, 1/√6, 1/√6]ᵀ,
q₃ = v₃/‖v₃‖ = (1/√2)[0, −1, 1]ᵀ = [0, −1/√2, 1/√2]ᵀ.
4. First obtain an orthogonal basis:

v₁ = x₁ = [1, 1, 1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [1, 1, 0]ᵀ − ((1·1 + 1·1 + 1·0)/(1·1 + 1·1 + 1·1))[1, 1, 1]ᵀ = [1, 1, 0]ᵀ − (2/3)[1, 1, 1]ᵀ = [1/3, 1/3, −2/3]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [1, 0, 0]ᵀ − ((1·1 + 1·0 + 1·0)/3)[1, 1, 1]ᵀ − (((1/3)·1 + (1/3)·0 − (2/3)·0)/((1/3)² + (1/3)² + (2/3)²))[1/3, 1/3, −2/3]ᵀ
= [1, 0, 0]ᵀ − (1/3)[1, 1, 1]ᵀ − (1/2)[1/3, 1/3, −2/3]ᵀ = [1/2, −1/2, 0]ᵀ.

Normalizing each of the vectors gives

q₁ = v₁/‖v₁‖ = (1/√3)[1, 1, 1]ᵀ = [1/√3, 1/√3, 1/√3]ᵀ,
q₂ = v₂/‖v₂‖ = (1/√(2/3))[1/3, 1/3, −2/3]ᵀ = [1/√6, 1/√6, −2/√6]ᵀ,
q₃ = v₃/‖v₃‖ = √2 [1/2, −1/2, 0]ᵀ = [1/√2, −1/√2, 0]ᵀ.
5. Using Gram-Schmidt, we get

v₁ = x₁ = [1, 1, 0]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [3, 4, 2]ᵀ − ((1·3 + 1·4 + 0·2)/(1·1 + 1·1 + 0·0))[1, 1, 0]ᵀ = [3, 4, 2]ᵀ − (7/2)[1, 1, 0]ᵀ = [−1/2, 1/2, 2]ᵀ.
6. Using Gram-Schmidt, we get

v₁ = x₁ = [2, −1, 1, 2]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [3, −1, 0, 4]ᵀ − ((2·3 − 1·(−1) + 1·0 + 2·4)/(2² + (−1)² + 1² + 2²))[2, −1, 1, 2]ᵀ
= [3, −1, 0, 4]ᵀ − (3/2)[2, −1, 1, 2]ᵀ = [0, 1/2, −3/2, 1]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [1, 1, 1, 1]ᵀ − ((2·1 − 1·1 + 1·1 + 2·1)/10)[2, −1, 1, 2]ᵀ − ((0·1 + (1/2)·1 − (3/2)·1 + 1·1)/(0² + (1/2)² + (3/2)² + 1²))[0, 1/2, −3/2, 1]ᵀ
= [1, 1, 1, 1]ᵀ − (2/5)[2, −1, 1, 2]ᵀ − 0 = [1/5, 7/5, 3/5, 1/5]ᵀ.
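Exercises 1–6 repeat the same Gram–Schmidt loop; a minimal implementation (returning the orthogonal, unnormalized vᵢ), checked against Exercise 6:

```python
import numpy as np

def gram_schmidt(xs):
    """Orthogonalize the list xs; returns the orthogonal (not normalized) v_i."""
    vs = []
    for x in xs:
        v = np.array(x, dtype=float)
        for u in vs:
            v -= (u @ v) / (u @ u) * u   # subtract the projection onto u
        vs.append(v)
    return vs

x1 = [2, -1, 1, 2]
x2 = [3, -1, 0, 4]
x3 = [1, 1, 1, 1]
v1, v2, v3 = gram_schmidt([x1, x2, x3])

assert np.allclose(v2, [0, 0.5, -1.5, 1])
assert np.allclose(v3, [0.2, 1.4, 0.6, 0.2])   # [1/5, 7/5, 3/5, 1/5]
assert abs(v1 @ v2) < 1e-12 and abs(v1 @ v3) < 1e-12 and abs(v2 @ v3) < 1e-12
```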
7. First compute the projection of v on each of the vectors v₁ and v₂ from Exercise 5:

proj_{v₁}(v) = ((v₁ · v)/(v₁ · v₁)) v₁ = ((1·4 + 1·(−4) + 0·3)/(1·1 + 1·1 + 0·0)) v₁ = 0 v₁ = [0, 0, 0]ᵀ
proj_{v₂}(v) = ((v₂ · v)/(v₂ · v₂)) v₂ = ((−(1/2)·4 + (1/2)·(−4) + 2·3)/((1/2)² + (1/2)² + 2²)) v₂ = (4/9)[−1/2, 1/2, 2]ᵀ = [−2/9, 2/9, 8/9]ᵀ.

Then the component of v orthogonal to both v₁ and v₂ is

[4, −4, 3]ᵀ − [−2/9, 2/9, 8/9]ᵀ = [38/9, −38/9, 19/9]ᵀ,

so that

v = [38/9, −38/9, 19/9]ᵀ + (4/9) v₂.
8. First compute the projection of v on each of the vectors v₁, v₂, and v₃ from Exercise 6:

proj_{v₁}(v) = ((v₁ · v)/(v₁ · v₁)) v₁ = ((2·1 − 1·4 + 1·0 + 2·2)/(2² + 1² + 1² + 2²)) v₁ = (1/5)[2, −1, 1, 2]ᵀ = [2/5, −1/5, 1/5, 2/5]ᵀ
proj_{v₂}(v) = ((v₂ · v)/(v₂ · v₂)) v₂ = ((0·1 + (1/2)·4 − (3/2)·0 + 1·2)/(0² + (1/2)² + (3/2)² + 1²)) v₂ = (8/7)[0, 1/2, −3/2, 1]ᵀ = [0, 4/7, −12/7, 8/7]ᵀ
proj_{v₃}(v) = ((v₃ · v)/(v₃ · v₃)) v₃ = (((1/5)·1 + (7/5)·4 + (3/5)·0 + (1/5)·2)/((1/5)² + (7/5)² + (3/5)² + (1/5)²)) v₃ = (31/12)[1/5, 7/5, 3/5, 1/5]ᵀ = [31/60, 217/60, 93/60, 31/60]ᵀ.

Then the component of v orthogonal to v₁, v₂, and v₃ is

[1, 4, 0, 2]ᵀ − [2/5, −1/5, 1/5, 2/5]ᵀ − [0, 4/7, −12/7, 8/7]ᵀ − [31/60, 217/60, 93/60, 31/60]ᵀ = [1/12, 1/84, −1/28, −5/84]ᵀ,

so that

v = [1/12, 1/84, −1/28, −5/84]ᵀ + (1/5) v₁ + (8/7) v₂ + (31/12) v₃.
9. To find the column space of the matrix, row-reduce it:

[0 1 1; 1 0 1; 1 1 0] → [1 0 0; 0 1 0; 0 0 1].

It follows that, since there is a leading 1 in every column, the columns of the original matrix are linearly independent, so they form a basis for the column space. We must orthogonalize those three vectors.

v₁ = x₁ = [0, 1, 1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [1, 0, 1]ᵀ − ((0·1 + 1·0 + 1·1)/(0·0 + 1·1 + 1·1))[0, 1, 1]ᵀ = [1, 0, 1]ᵀ − (1/2)[0, 1, 1]ᵀ = [1, −1/2, 1/2]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [1, 1, 0]ᵀ − ((0·1 + 1·1 + 1·0)/2)[0, 1, 1]ᵀ − ((1·1 − (1/2)·1 + (1/2)·0)/(1² + (1/2)² + (1/2)²))[1, −1/2, 1/2]ᵀ
= [1, 1, 0]ᵀ − (1/2)[0, 1, 1]ᵀ − (1/3)[1, −1/2, 1/2]ᵀ = [2/3, 2/3, −2/3]ᵀ.
10. To find the column space of the matrix, row-reduce it:

[1 1 1; 1 −1 2; −1 1 0; 1 5 1] → [1 0 0; 0 1 0; 0 0 1; 0 0 0].

It follows that, since there is a leading 1 in every column, the columns of the original matrix are linearly independent, so they form a basis for the column space. We must orthogonalize those three vectors.

v₁ = x₁ = [1, 1, −1, 1]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [1, −1, 1, 5]ᵀ − ((1·1 + 1·(−1) − 1·1 + 1·5)/(1·1 + 1·1 − 1·(−1) + 1·1))[1, 1, −1, 1]ᵀ = [1, −1, 1, 5]ᵀ − [1, 1, −1, 1]ᵀ = [0, −2, 2, 4]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [1, 2, 0, 1]ᵀ − ((1·1 + 1·2 − 1·0 + 1·1)/4)[1, 1, −1, 1]ᵀ − ((0·1 − 2·2 + 2·0 + 4·1)/(0·0 − 2·(−2) + 2·2 + 4·4))[0, −2, 2, 4]ᵀ
= [1, 2, 0, 1]ᵀ − [1, 1, −1, 1]ᵀ − 0 = [0, 1, 1, 0]ᵀ.
11. The given vector x₁, together with x₂ = [0, 1, 0]ᵀ and x₃ = [0, 0, 1]ᵀ, forms a basis for R³; we want to orthogonalize that basis.

v₁ = x₁ = [3, 1, 5]ᵀ
v₂ = x₂ − proj_{v₁}(x₂) = [0, 1, 0]ᵀ − ((3·0 + 1·1 + 5·0)/(3² + 1² + 5²))[3, 1, 5]ᵀ = [0, 1, 0]ᵀ − (1/35)[3, 1, 5]ᵀ = [−3/35, 34/35, −1/7]ᵀ
v₃ = x₃ − proj_{v₁}(x₃) − proj_{v₂}(x₃)
= [0, 0, 1]ᵀ − ((3·0 + 1·0 + 5·1)/(3² + 1² + 5²))[3, 1, 5]ᵀ − ((−(3/35)·0 + (34/35)·0 − (1/7)·1)/((3/35)² + (34/35)² + (1/7)²))[−3/35, 34/35, −1/7]ᵀ
= [0, 0, 1]ᵀ − (1/7)[3, 1, 5]ᵀ + (5/34)[−3/35, 34/35, −1/7]ᵀ = [−15/34, 0, 9/34]ᵀ.
12. Let

v₁ = [1, 2, −1, 0]ᵀ,  v₂ = [1, 0, 1, 3]ᵀ.

Note that v₁ · v₂ = 0. Now let

x₃ = [0, 0, 1, 0]ᵀ,  x₄ = [0, 0, 0, 1]ᵀ.

To see that v₁, v₂, x₃, and x₄ form a basis for R⁴, row-reduce the matrix that has these vectors as its columns:

[1 1 0 0; 2 0 0 0; −1 1 1 0; 0 3 0 1] → [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1].

Since there is a leading 1 in every column, the column vectors are linearly independent, so they form a basis for R⁴. Next we need to orthogonalize that basis. Since v₁ and v₂ are already orthogonal, we start with x₃:

v₃ = x₃ − ((v₁ · x₃)/(v₁ · v₁)) v₁ − ((v₂ · x₃)/(v₂ · v₂)) v₂ = [0, 0, 1, 0]ᵀ + (1/6)[1, 2, −1, 0]ᵀ − (1/11)[1, 0, 1, 3]ᵀ = [5/66, 1/3, 49/66, −3/11]ᵀ
v₄ = x₄ − ((v₁ · x₄)/(v₁ · v₁)) v₁ − ((v₂ · x₄)/(v₂ · v₂)) v₂ − ((v₃ · x₄)/(v₃ · v₃)) v₃
= [0, 0, 0, 1]ᵀ − 0 − (3/11)[1, 0, 1, 3]ᵀ + (18/49)[5/66, 1/3, 49/66, −3/11]ᵀ
= [−12/49, 6/49, 0, 4/49]ᵀ.
13. To start, we let x3 = 0 be the third column of the matrix. Then the determinant of this matrix
1
is (for example, expanding along the second row) √16 6= 0, so the resulting matrix is invertible and
therefore the three column vectors form a basis for R3 . The first two vectors are orthogonal and each
is of unit length, so we must orthogonalize the third vector as well:
v3 = x3 −
q1 · x3
q1 · q1
q1 −
q2 · x3
q2 · q2
q2
 1 
 1 
 
√
√
0
3
2
√1
√1
−

 √1 
 

2
3
= 0 − 2
0 − 2 2 2  3 
2 
√1
√1
+ 02 + − √12
+ √13 + √13
√1
1
− √12
2
3
3
 
 1 
 1   
1
√
√
0
2
1 
1  13   61 
 

= 0 + √ 
0 − √  √3  = − 3  .
2
3 1
1
1
√
√
1
− 2
6
3
Then v3 is orthogonal to the other two columns of the matrix, but we must normalize v3 to convert it
to a unit vector.
s 2 2
2
1
1
1
1
kv3 k =
+ −
+
=√ ,
6
3
6
6
5.3. THE GRAM-SCHMIDT PROCESS AND THE QR FACTORIZATION
395
so that

√1
6
1


q3 = √ v3 = − √26  ,
1/ 6
√1
6

so that the completed matrix Q is

√1
2

Q=
0
− √12
√1
3
√1
3
√1
3

√1
6

− √26  .
√1
6
 
 
2
1
 1
1

 
14. To simplify the calculations, we will start with the multiples v₁ = [1, 1, 1, 1]ᵀ and v₂ = [2, 1, 0, −3]ᵀ of the given basis vectors (the column vectors of Q). We will find an orthogonal basis containing those two vectors and then normalize all four at the very end. Note that v₁ and v₂ are already orthogonal. If we choose

x₃ = [0, 0, 1, 0]ᵀ and x₄ = [0, 0, 0, 1]ᵀ,

then calculating the determinant of [v₁ v₂ x₃ x₄], for example by expanding along the last column, shows that the matrix is invertible, so that these four vectors form a basis for R⁴. We now need to orthogonalize the last two vectors:

p₃ = x₃ − ((v₁ · x₃)/(v₁ · v₁)) v₁ − ((v₂ · x₃)/(v₂ · v₂)) v₂ = [0, 0, 1, 0]ᵀ − (1/4)[1, 1, 1, 1]ᵀ − 0 = [−1/4, −1/4, 3/4, −1/4]ᵀ.

Now let v₃ = 4p₃ = [−1, −1, 3, −1]ᵀ, again avoiding fractions. Next,

v₄ = x₄ − ((v₁ · x₄)/(v₁ · v₁)) v₁ − ((v₂ · x₄)/(v₂ · v₂)) v₂ − ((v₃ · x₄)/(v₃ · v₃)) v₃
= [0, 0, 0, 1]ᵀ − (1/4)[1, 1, 1, 1]ᵀ + (3/14)[2, 1, 0, −3]ᵀ + (1/12)[−1, −1, 3, −1]ᵀ
= [2/21, −5/42, 0, 1/42]ᵀ.

We now normalize v₁ through v₄; note that we already have normalized forms for the first two.

‖v₃‖ = √(1² + 1² + 3² + 1²) = 2√3,
‖v₄‖ = √((2/21)² + (5/42)² + 0² + (1/42)²) = 1/√42,

so that the matrix is

Q = [1/2 2/√14 −√3/6 4/√42; 1/2 1/√14 −√3/6 −5/√42; 1/2 0 √3/2 0; 1/2 −3/√14 −√3/6 1/√42].

15. From Exercise 9, ‖v₁‖ = √2, ‖v₂‖ = √6/2, and ‖v₃‖ = 2√3/3, so that

Q = [q₁ q₂ q₃] = [0 2/√6 1/√3; 1/√2 −1/√6 1/√3; 1/√2 1/√6 −1/√3].

Therefore

R = QᵀA = [0 1/√2 1/√2; 2/√6 −1/√6 1/√6; 1/√3 1/√3 −1/√3][0 1 1; 1 0 1; 1 1 0] = [√2 1/√2 1/√2; 0 3/√6 1/√6; 0 0 2/√3].
16. From Exercise 10, ‖v₁‖ = 2, ‖v₂‖ = 2√6, and ‖v₃‖ = √2, so that

Q = [q₁ q₂ q₃] = [1/2 0 0; 1/2 −√6/6 1/√2; −1/2 √6/6 1/√2; 1/2 √6/3 0].

Therefore

R = QᵀA = [1/2 1/2 −1/2 1/2; 0 −√6/6 √6/6 √6/3; 0 1/√2 1/√2 0][1 1 1; 1 −1 2; −1 1 0; 1 5 1] = [2 2 2; 0 2√6 0; 0 0 √2].

17. We have

R = QᵀA = [2/3 1/3 2/3; 1/3 2/3 −2/3; 2/3 −2/3 −1/3][2 8 2; 1 7 −1; 2 2 −1] = [3 9 1/3; 0 6 2/3; 0 0 7/3].
18. We have

R = QᵀA = [1/√6 2/√6 −1/√6; 1/√3 0 1/√3][1 0; 2 1; −1 3] = [√6 −1/√6; 0 √3].
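Exercises 15–18 compute R as QᵀA; a library QR routine performs the same factorization, though it may differ from a hand computation by the signs of corresponding columns of Q and rows of R. A sketch with the matrix of Exercise 15:

```python
import numpy as np

A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])

Q, R = np.linalg.qr(A)

# A = QR with Q orthogonal and R upper triangular:
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(3))
assert np.allclose(R, np.triu(R))
# And R is recovered as Q^T A:
assert np.allclose(Q.T @ A, R)
```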
19. If A is already orthogonal, let Q = A and R = I, which is upper triangular; then A = QR.
20. First, if A is invertible, then A has linearly independent columns, so that by Theorem 5.16, A = QR
where R is invertible. Since R is upper triangular, its determinant is the product of its diagonal entries,
so that R invertible implies that all of its diagonal entries are nonzero. For the converse, if R is upper
triangular with all diagonal entries nonzero, then it has nonzero determinant; since Q is orthogonal, it
is invertible so has nonzero determinant as well. But then det A = det(QR) = (det Q)(det R) ≠ 0, so that A is invertible. Note that when this occurs,

A⁻¹ = (QR)⁻¹ = R⁻¹Q⁻¹ = R⁻¹Qᵀ.
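The observation A⁻¹ = R⁻¹Qᵀ from Exercise 20 is how QR factorizations are used to solve systems in practice: only the triangular factor needs inverting (or back-substitution). A sketch, checked against the inverse found in Exercise 21:

```python
import numpy as np
from numpy.linalg import qr, inv

A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])   # invertible, as in Exercise 15

Q, R = qr(A)
A_inv = inv(R) @ Q.T           # A^{-1} = R^{-1} Q^T  (R is triangular)

assert np.allclose(A_inv @ A, np.eye(3))
# Matches the hand-computed inverse from Exercise 21:
assert np.allclose(A_inv, 0.5 * np.array([[-1, 1, 1], [1, -1, 1], [1, 1, -1]]))
```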
21. From Exercise 20, A⁻¹ = R⁻¹Qᵀ. To compute R⁻¹, we row-reduce [R | I] using Exercise 15:

[√2 1/√2 1/√2 | 1 0 0; 0 3/√6 1/√6 | 0 1 0; 0 0 2/√3 | 0 0 1] → [1 0 0 | √2/2 −√6/6 −√3/6; 0 1 0 | 0 √6/3 −√3/6; 0 0 1 | 0 0 √3/2].

Thus

A⁻¹ = R⁻¹Qᵀ = [√2/2 −√6/6 −√3/6; 0 √6/3 −√3/6; 0 0 √3/2][0 1/√2 1/√2; 2/√6 −1/√6 1/√6; 1/√3 1/√3 −1/√3] = [−1/2 1/2 1/2; 1/2 −1/2 1/2; 1/2 1/2 −1/2].
22. From Exercise 20, A⁻¹ = R⁻¹Qᵀ. To compute R⁻¹, we row-reduce [R | I] using Exercise 17:

[3 9 1/3 | 1 0 0; 0 6 2/3 | 0 1 0; 0 0 7/3 | 0 0 1] → [1 0 0 | 1/3 −1/2 2/21; 0 1 0 | 0 1/6 −1/21; 0 0 1 | 0 0 3/7].

Thus

A⁻¹ = R⁻¹Qᵀ = [1/3 −1/2 2/21; 0 1/6 −1/21; 0 0 3/7][2/3 1/3 2/3; 1/3 2/3 −2/3; 2/3 −2/3 −1/3] = [5/42 −2/7 11/21; 1/42 1/7 −2/21; 2/7 −2/7 −1/7].
23. If A = QR is a QR factorization of A, then we want to show that Rx = 0 implies x = 0, so that R is invertible by part (c) of the Fundamental Theorem. Note that since Q is orthogonal, so is Qᵀ, so that by Theorem 5.6(b), ‖Qᵀv‖ = ‖v‖ for any vector v. Then

Rx = 0 ⇒ 0 = ‖Rx‖ = ‖(QᵀA)x‖ = ‖Qᵀ(Ax)‖ = ‖Ax‖ ⇒ Ax = 0.

If aᵢ are the columns of A, then Ax = 0 means that x₁a₁ + ··· + xₙaₙ = 0. But this means that all of the xᵢ are zero, since the columns of A are linearly independent. Thus x = 0. By part (c) of the Fundamental Theorem, we conclude that R is invertible.
24. Recall that row(A) = row(B) if and only if there is an invertible matrix M such that A = MB. Then A = QR implies that Aᵀ = RᵀQᵀ; since A has linearly independent columns, R is invertible, so that Rᵀ is as well. Thus row(Aᵀ) = row(Qᵀ), so that col(A) = col(Q).
Exploration: The Modified QR Process
1. We have

I − 2uuᵀ = I − 2[d₁; d₂][d₁ d₂] = I − [2d₁² 2d₁d₂; 2d₁d₂ 2d₂²] = [1 − 2d₁² −2d₁d₂; −2d₁d₂ 1 − 2d₂²].
2. (a) From Exercise 1,

Q = [1 − 2d₁² −2d₁d₂; −2d₁d₂ 1 − 2d₂²] = [1 − 2·(9/25) −2·(3/5)(4/5); −2·(3/5)(4/5) 1 − 2·(16/25)] = [7/25 −24/25; −24/25 −7/25].

(b) Since ‖x − y‖ = √((5 − 1)² + (5 − 7)²) = 2√5, we have

u = (1/(2√5))(x − y) = (1/(2√5))[4, −2]ᵀ = [2/√5, −1/√5]ᵀ.

Then from Exercise 1,

Q = [1 − 2d₁² −2d₁d₂; −2d₁d₂ 1 − 2d₂²] = [1 − 2·(4/5) −2·(2/√5)(−1/√5); −2·(2/√5)(−1/√5) 1 − 2·(1/5)] = [−3/5 4/5; 4/5 3/5].

3. (a) Qᵀ = (I − 2uuᵀ)ᵀ = I − 2(uuᵀ)ᵀ = I − 2(uᵀ)ᵀuᵀ = I − 2uuᵀ = Q, so that Q is symmetric.

(b) Start with

QQᵀ = QQ = (I − 2uuᵀ)(I − 2uuᵀ) = I − 4uuᵀ + (−2uuᵀ)² = I − 4uuᵀ + 4u(uᵀu)uᵀ.

Now uᵀu = u · u = 1 since u is a unit vector, so the equation above becomes

QQᵀ = I − 4uuᵀ + 4uuᵀ = I.

Thus Qᵀ = Q⁻¹ and it follows that Q is orthogonal.

(c) This follows from parts (a) and (b): Q² = QQ = QQᵀ = QQ⁻¹ = I.
4. First suppose that v is in the span of u; say v = cu. Then, again using the fact that uᵀu = u · u = 1,

Qv = Q(cu) = (I − 2uuᵀ)(cu) = c(u − 2u(uᵀu)) = c(u − 2u) = −cu = −v.

If v · u = uᵀv = 0, then

Qv = (I − 2uuᵀ)v = v − 2u(uᵀv) = v.
5. First normalize the given vector to get

u = (1/√6)[1, −1, 2]ᵀ.

Then

uuᵀ = [1/6 −1/6 1/3; −1/6 1/6 −1/3; 1/3 −1/3 2/3],

so that

Q = I − 2uuᵀ = [2/3 1/3 −2/3; 1/3 2/3 2/3; −2/3 2/3 −1/3].

For Exercise 3, clearly Q is symmetric, and each column has norm 1; a short computation shows that each pair of columns is orthogonal, so Q is orthogonal. Finally, multiplying Q by itself gives Q² = I.

For Exercise 4, first suppose that v = cu. Then

Qv = Q(cu) = cQu = cQ[1/√6, −1/√6, 2/√6]ᵀ = c[−1/√6, 1/√6, −2/√6]ᵀ = −cu = −v.

If v · u = uᵀv = 0, then

(1/√6)x₁ − (1/√6)x₂ + (2/√6)x₃ = 0,

so that x₁ − x₂ + 2x₃ = 0 and therefore x₂ = x₁ + 2x₃. Thus if v · u = 0, then v = s[1, 1, 0]ᵀ + t[0, 2, 1]ᵀ. But then

Qv = sQ[1, 1, 0]ᵀ + tQ[0, 2, 1]ᵀ = s[1, 1, 0]ᵀ + t[0, 2, 1]ᵀ = v,

since Q[1, 1, 0]ᵀ = [1, 1, 0]ᵀ and Q[0, 2, 1]ᵀ = [0, 2, 1]ᵀ.
6. x − y is a multiple of u, so that Q(x − y) = −(x − y) by Exercise 4. Next,

(x + y) · u = (1/‖x − y‖)(x + y) · (x − y).

But by Exercise 61 in Section 1.2, this dot product is zero because ‖x‖ = ‖y‖. Thus x + y is orthogonal to u, so that Q(x + y) = x + y. So we have

Q(x − y) = −(x − y) and Q(x + y) = x + y ⇒ Qx − Qy = −x + y and Qx + Qy = x + y ⇒ 2Qx = 2y ⇒ Qx = y.

7. We have

u = (1/‖x − y‖)(x − y) = (1/(2√3))[−2, 2, 2]ᵀ,

so that

uuᵀ = [1/3 −1/3 −1/3; −1/3 1/3 1/3; −1/3 1/3 1/3] ⇒ Q = I − 2uuᵀ = [1/3 2/3 2/3; 2/3 1/3 −2/3; 2/3 −2/3 1/3].

Then

Qx = [1/3 2/3 2/3; 2/3 1/3 −2/3; 2/3 −2/3 1/3][1, 2, 2]ᵀ = [3, 0, 0]ᵀ = y.

8. Since ‖y‖ = ‖x‖, if aᵢ is the i-th column vector of A, we can apply Exercise 6 to get

Q₁A = [Q₁a₁ Q₁a₂ · · · Q₁aₙ] = [Q₁x Q₁a₂ · · · Q₁aₙ] = [y Q₁a₂ · · · Q₁aₙ] = [∗ ∗; 0 A₁],

since the last m − 1 entries in y are all zero. Here A₁ is (m − 1) × (n − 1), the left-hand ∗ is 1 × 1, and the right-hand ∗ is 1 × (n − 1).
9. Since P₂ is a Householder matrix, it is orthogonal, so that

Q₂Q₂ᵀ = [1 0; 0 P₂][1 0; 0 P₂ᵀ] = [1 0; 0 P₂P₂ᵀ] = I,

so that Q₂ is orthogonal as well. Further,

Q₂Q₁A = [1 0; 0 P₂][∗ ∗; 0 A₁] = [∗ ∗; 0 P₂A₁] = [∗ ∗ ∗; 0 ∗ ∗; 0 0 A₂].
10. Using induction, suppose that

QₖQₖ₋₁ · · · Q₁A = [∗ ∗ ∗; 0 ∗ ∗; 0 0 Aₖ].

Repeat Exercise 8 on the matrix Aₖ, giving a Householder matrix Pₖ₊₁ such that

Pₖ₊₁Aₖ = [∗ ∗; 0 Aₖ₊₁].

Let Qₖ₊₁ = [1 0; 0 Pₖ₊₁], with the identity block sized to match. Then, using the fact that Pₖ₊₁ is orthogonal, we again conclude that Qₖ₊₁ is orthogonal, and that

Qₖ₊₁QₖQₖ₋₁ · · · Q₁A = [∗ ∗ ∗; 0 ∗ ∗; 0 0 Aₖ₊₁].

The result then follows by induction. The resulting matrix R = Qₘ₋₁ · · · Q₂Q₁A is upper triangular by construction.
11. If Q = Q₁Q₂ · · · Qₘ₋₁, then Q is orthogonal since each Qᵢ is orthogonal, and

Qᵀ = (Q₁Q₂ · · · Qₘ₋₁)ᵀ = Qₘ₋₁ᵀ · · · Q₂ᵀQ₁ᵀ = Qₘ₋₁ · · · Q₂Q₁.

Thus QᵀA = Qₘ₋₁ · · · Q₂Q₁A = R, so that QQᵀA = A = QR, since QQᵀ = I.
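Exercises 8–11 describe Householder QR: reflect the current leading column onto ‖x‖e₁, recurse on the trailing block, and multiply the reflectors together. A compact sketch of that process (square matrices only, for simplicity; each reflector is chosen as in Exercise 6):

```python
import numpy as np

def householder_qr(A):
    """QR via Householder reflections, following Exercises 8-11."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.eye(m)
    R = A.copy()
    for k in range(min(m - 1, n)):
        x = R[k:, k]
        y = np.zeros_like(x)
        y[0] = np.linalg.norm(x)
        if np.allclose(x, y):
            continue                      # column already in the required form
        u = (x - y) / np.linalg.norm(x - y)
        P = np.eye(m - k) - 2.0 * np.outer(u, u)
        Qk = np.eye(m)
        Qk[k:, k:] = P                    # Q_k = [I 0; 0 P_k]
        R = Qk @ R                        # zeros out entries below (k, k)
        Q = Q @ Qk                        # Q = Q_1 Q_2 ... (each Q_k symmetric)
    return Q, R

A = np.array([[1., 2., 0.], [2., 0., 1.], [2., 1., 3.]])
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(3))
assert np.allclose(np.tril(R, -1), 0)
```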
12. (a) We start with x = [3, −4]^T, so that ‖x‖ = √(3² + (−4)²) = 5 and y = [5, 0]^T. Then
u = (1/‖x − y‖)(x − y) = [−1/√5, −2/√5]^T   ⇒   uu^T = [1/5 2/5; 2/5 4/5].
Then
Q1 = I − 2uu^T = [3/5 −4/5; −4/5 −3/5]   ⇒   Q1 A = [5 3 −1; 0 −9 −2] = R,
and A = Q1^T R.
(b) We start with x = [1, 2, 2]^T, so that y = [3, 0, 0]^T. Then
u = (1/‖x − y‖)(x − y) = [−1/√3, 1/√3, 1/√3]^T   ⇒   uu^T = [1/3 −1/3 −1/3; −1/3 1/3 1/3; −1/3 1/3 1/3].
Thus
Q1 = I − 2uu^T = [1/3 2/3 2/3; 2/3 1/3 −2/3; 2/3 −2/3 1/3]   ⇒   Q1 A = [3 −5 8/3; 0 4 1/3; 0 3 4/3],
so that
A1 = [4 1/3; 3 4/3],
and in the next iteration
x = [4, 3]^T   ⇒   y = [5, 0]^T.
Then
u = [−1/√10, 3/√10]^T   ⇒   uu^T = [1/10 −3/10; −3/10 9/10]   ⇒   P2 = I − 2uu^T = [4/5 3/5; 3/5 −4/5].
So
Q2 = [1 0; 0 P2] = [1 0 0; 0 4/5 3/5; 0 3/5 −4/5].
CHAPTER 5. ORTHOGONALITY
Finally,
Q2 Q1 A = [1 0 0; 0 4/5 3/5; 0 3/5 −4/5][3 −5 8/3; 0 4 1/3; 0 3 4/3] = [3 −5 8/3; 0 5 16/15; 0 0 −13/15] = R,
and Q1 Q2 R = A.
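The whole process of Exercises 8-11 can be sketched in a few lines. The matrix used below is an arbitrary 3 × 3 example chosen for illustration (it is not specified in the text); the function reproduces the construction R = Qm−1 · · · Q1 A with each Qk a Householder reflector embedded in an identity block, as in Exercises 9 and 10.

```python
import numpy as np

def modified_qr(A):
    # Triangularize A with embedded Householder reflectors:
    # R = Q_{m-1} ... Q_1 A and Q = Q_1 ... Q_{m-1}, so that A = QR.
    m, n = A.shape
    Q, R = np.eye(m), A.astype(float).copy()
    for k in range(m - 1):
        x = R[k:, k]
        y = np.zeros_like(x)
        y[0] = np.linalg.norm(x)
        if np.isclose(np.linalg.norm(x - y), 0.0):
            continue                           # column already has zeros below the diagonal
        u = (x - y) / np.linalg.norm(x - y)
        Qk = np.eye(m)
        Qk[k:, k:] -= 2.0 * np.outer(u, u)     # embed P = I - 2uu^T as in Exercise 9
        R = Qk @ R
        Q = Q @ Qk                             # each Q_k is symmetric, so Q_k^T = Q_k

    return Q, R

A = np.array([[1.0, 3.0, 2.0],
              [2.0, -4.0, 1.0],
              [2.0, -5.0, 2.0]])               # illustrative example only
Q, R = modified_qr(A)
```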
Exploration: Approximating Eigenvalues with the QR Algorithm
1. We have
A1 = RQ = (Q−1 Q)RQ = Q−1 (QR)Q = Q−1 AQ,
so that A1 ∼ A. Thus by Theorem 4.22(e), A1 and A have the same eigenvalues.
2. Using Gram-Schmidt, we have
v1 = [1, 1]^T,
v2 = [0, 3]^T − ((1 · 0 + 1 · 3)/(1 · 1 + 1 · 1)) [1, 1]^T = [0, 3]^T − (3/2)[1, 1]^T = [−3/2, 3/2]^T.
Normalizing these vectors (taking −v2 rather than v2, which still gives an orthonormal basis) produces
Q = [1/√2 1/√2; 1/√2 −1/√2], so that R = Q^T A = [√2 3√2/2; 0 −3√2/2].
Thus
A1 = RQ = [√2 3√2/2; 0 −3√2/2][1/√2 1/√2; 1/√2 −1/√2] = [5/2 −1/2; −3/2 3/2].
Then
det(A − λI) = |1−λ 0; 1 3−λ| = (1 − λ)(3 − λ),
det(A1 − λI) = |5/2−λ −1/2; −3/2 3/2−λ| = (5/2 − λ)(3/2 − λ) − 3/4 = λ² − 4λ + 3 = (λ − 1)(λ − 3).
So A and A1 do indeed have the same eigenvalues.
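One step of the QR algorithm is easy to replicate with technology. The sketch below applies it to the matrix A = [1 0; 1 3] underlying Exercise 2; NumPy may choose different column signs for Q than the hand computation, but that does not affect the similarity A1 = Q^-1 AQ.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 3.0]])   # the matrix behind Exercise 2 (eigenvalues 1 and 3)
Q, R = np.linalg.qr(A)       # numpy's sign conventions may differ from the text
A1 = R @ Q                   # one QR-algorithm step: A1 = Q^T A Q, so A1 ~ A
```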
3. The proof is by induction on k. We know that A1 ∼ A. Now assume An−1 ∼ A. Then
An = Rn−1 Qn−1 = (Qn−1^-1 Qn−1)Rn−1 Qn−1 = Qn−1^-1 (Qn−1 Rn−1)Qn−1 = Qn−1^-1 An−1 Qn−1,
so that An ∼ An−1 ∼ A.
4. Converting A1 = Q1 R1 to decimals gives, using technology,
A1 = Q1 R1 ≈ [0.86 0.51; −0.51 0.86][2.92 −1.20; 0 1.04]   ⇒   A2 = R1 Q1 ≈ [3.12 0.47; −0.53 0.88]
A2 = Q2 R2 ≈ [−0.99 0.17; 0.17 0.99][−3.16 −0.32; 0 0.95]   ⇒   A3 = R2 Q2 ≈ [3.06 −0.84; 0.16 0.93]
A3 = Q3 R3 ≈ [−1.00 −0.05; −0.05 1.00][−3.06 0.79; 0 0.97]   ⇒   A4 = R3 Q3 ≈ [3.02 0.94; −0.05 0.97]
A4 = Q4 R4 ≈ [−1.00 0.02; 0.02 1.00][−3.02 −0.92; 0 0.99]   ⇒   A5 = R4 Q4 ≈ [3.00 −0.98; 0.02 0.99].
The Ak are approaching an upper triangular matrix U.
5. Since Ak ∼ A for all k, the eigenvalues of A and Ak are the same, so in the limit, the upper triangular
matrix U will have eigenvalues equal to those of A; but the eigenvalues of a triangular matrix are its
diagonal entries. Thus the diagonal entries of U will be the eigenvalues of A.
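The convergence described in Exercises 4 and 5 can be observed directly by looping. This sketch iterates on the matrix from Exercise 2, whose eigenvalues 1 and 3 have distinct absolute values, and recovers them on the diagonal.

```python
import numpy as np

Ak = np.array([[1.0, 0.0],
               [1.0, 3.0]])    # start from the matrix of Exercise 2
for _ in range(50):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q                 # each iterate is similar to the previous one
# Ak is now (nearly) upper triangular with the eigenvalues on the diagonal
```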
6. (a) Follow the same process as in Exercise 4.
A = QR ≈ [0.71 0.71; 0.71 −0.71][2.83 2.83; 0 1.41]   ⇒   A1 = RQ ≈ [4.02 0.00; 1.00 −1.00]
A1 = Q1 R1 ≈ [−0.97 −0.24; −0.24 0.97][−4.14 0.24; 0 −0.97]   ⇒   A2 = R1 Q1 ≈ [3.96 1.23; 0.23 −0.94]
A2 = Q2 R2 ≈ [−1.00 −0.06; −0.06 1.00][−3.97 −1.17; 0 −1.01]   ⇒   A3 = R2 Q2 ≈ [4.04 −0.93; 0.06 −1.01]
A3 = Q3 R3 ≈ [−1.00 −0.01; −0.01 1.00][−4.04 0.94; 0 −1.00]   ⇒   A4 = R3 Q3 ≈ [4.03 0.98; 0.01 −1.00]
A4 = Q4 R4 ≈ [−1.00 0.00; 0.00 1.00][−4.03 −0.98; 0 −1.00]   ⇒   A5 = R4 Q4 ≈ [4.03 −0.98; 0.00 −1.00].
The eigenvalues of A are approximately 4.03 and −1.00.
(b) Follow the same process as in Exercise 4.
A = QR ≈ [0.45 0.89; 0.89 −0.45][2.24 1.34; 0 0.45]   ⇒   A1 = RQ ≈ [2.20 1.40; 0.40 −0.20]
A1 = Q1 R1 ≈ [−0.98 −0.18; −0.18 0.98][−2.24 −1.34; 0 −0.45]   ⇒   A2 = R1 Q1 ≈ [2.44 −0.92; 0.08 −0.44]
A2 = Q2 R2 ≈ [−1.00 −0.03; −0.03 1.00][−2.44 0.93; 0 −0.41]   ⇒   A3 = R2 Q2 ≈ [2.41 1.01; 0.01 −0.41]
A3 = Q3 R3 ≈ [−1 0; 0 1][−2.41 −1.01; 0 −0.42]   ⇒   A4 = R3 Q3 ≈ [2.42 −1; 0 −0.42]
A4 = Q4 R4 ≈ [−1.00 0.00; 0.00 1.00][−2.42 1; 0 −0.41]   ⇒   A5 = R4 Q4 ≈ [2.41 1; 0 −0.41].
The eigenvalues of A are approximately 2.41 and −0.41 (the true values are 1 ± √2).
(c) Follow the same process as in Exercise 4.
A = QR ≈ [0.24 −0.06 −0.97; 0.24 0.97 0; −0.94 0.23 −0.24][4.24 0.47 −0.94; 0 1.94 1.26; 0 0 0.73]   ⇒
A1 = RQ ≈ [2.00 0 −3.89; −0.73 2.18 −0.31; −0.69 0.17 −0.18]
A1 = Q1 R1 ≈ [−0.89 −0.33 0.30; 0.33 −0.94 −0.07; 0.31 0.03 0.95][−2.24 0.76 3.32; 0 −2.05 1.57; 0 0 −1.31]   ⇒
A2 = R1 Q1 ≈ [3.27 0.13 2.44; −0.18 1.98 1.64; −0.40 −0.04 −1.25]
A2 = Q2 R2 ≈ [−0.99 −0.05 0.12; 0.06 −1.00 0.01; 0.12 0.02 0.99][−3.3 −0.03 −2.47; 0 −1.99 −1.8; 0 0 −0.92]   ⇒
A3 = R2 Q2 ≈ [2.96 0.16 −2.86; −0.33 1.95 −1.81; −0.11 −0.02 −0.91]
A3 = Q3 R3 ≈ [−0.99 −0.11 0.04; 0.11 −0.99 0.01; 0.04 0.01 1.00][−2.98 0.06 2.61; 0 −1.95 2.10; 0 0 −1.03]   ⇒
A4 = R3 Q3 ≈ [3.07 0.30 2.49; −0.14 1.96 2.09; −0.04 −0.01 −1.03]
A4 = Q4 R4 ≈ [−1 −0.04 0.01; 0.04 −1 0; 0.01 0 1][−3.07 −0.21 −2.41; 0 −1.97 −2.20; 0 0 −0.99]   ⇒
A5 = R4 Q4 ≈ [3.03 0.34 −2.45; −0.12 1.96 −2.21; −0.01 0 −0.99].
The eigenvalues of A are approximately 3.03, 1.96, and −0.99. (The true values are 3, 2, and −1.)
(d) Follow the same process as in Exercise 4. Here A has rank 2, so its QR factorization uses a 3 × 2 matrix Q with orthonormal columns:
A = QR ≈ [0.45 0.72; 0 0.60; −0.89 0.36][2.24 −3.13 −2.24; 0 3.35 0]   ⇒   A1 = RQ ≈ [3.00 −1.07; 0 2.00]
A1 = Q1 R1 ≈ [1.00 0.00; 0.00 1.00][3.00 −1.07; 0 2.00]   ⇒   A2 = R1 Q1 ≈ [3.00 −1.07; 0 2.00].
Since Q1 is the identity matrix, we get Q2 = I and R2 = R1, so that successive Ai are equal to A1. Thus the eigenvalues of A are approximately 3 and 2 (in fact, they are 3, 2, and 0).
7. Following the process in Exercise 4,
A = QR = [2/√5 −1/√5; −1/√5 −2/√5][√5 8/√5; 0 1/√5]   ⇒   A1 = RQ = [2/5 −21/5; −1/5 −2/5]
A1 = Q1 R1 = [2/√5 −1/√5; −1/√5 −2/√5][1/√5 −8/√5; 0 √5]   ⇒   A2 = R1 Q1 = [2 3; −1 −2] = A.
Here A2 = A (Q1 = Q^-1 and R1 = R^-1); the algorithm fails because the eigenvalues of A, which are ±1, do not have distinct absolute values.
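The failure can be reproduced numerically. Two QR steps applied to A = [2 3; −1 −2] return a matrix with the same entries up to sign; NumPy's sign conventions may differ from the hand computation, which conjugates the iterates by a diagonal ±1 matrix but changes nothing essential, so the iteration cycles instead of converging.

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [-1.0, -2.0]])   # eigenvalues +1 and -1: equal absolute values
Q, R = np.linalg.qr(A)
A1 = R @ Q
Q1, R1 = np.linalg.qr(A1)
A2 = R1 @ Q1                   # same entries as A up to sign: no convergence
```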
8. Set
B = A + 0.9I = [2.9 3; −1 −1.1].
Then
B = QR ≈ [−0.95 0.33; 0.33 0.95][−3.07 −3.19; 0 −0.06]   ⇒   B1 = RQ ≈ [1.86 −4.02; −0.02 −0.06]
B1 = Q1 R1 ≈ [−1 0.01; 0.01 1][−1.86 4.02; 0 −0.1]   ⇒   B2 = R1 Q1 ≈ [1.9 4; 0 −0.1]
B2 = Q2 R2 ≈ [−1.00 0; 0 1.00][−1.9 −4; 0 −0.1]   ⇒   B3 = R2 Q2 ≈ [1.9 −4; 0 −0.1].
Further iterations will simply reverse the sign of the upper right entry of Bi. The eigenvalues of B are approximately 1.9 and −0.1, so the eigenvalues of A are approximately 1.9 − 0.9 = 1.0 and −0.1 − 0.9 = −1.0. These are correct (see Exercise 7).
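The shift trick is easy to automate. The sketch below iterates on B = A + 0.9I, whose eigenvalues 1.9 and −0.1 have distinct absolute values, and then subtracts the shift from the diagonal to recover the eigenvalues of A.

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [-1.0, -2.0]])
B = A + 0.9 * np.eye(2)              # shift: eigenvalues become 1.9 and -0.1
Bk = B.copy()
for _ in range(60):
    Q, R = np.linalg.qr(Bk)
    Bk = R @ Q
eigs_A = np.sort(np.diag(Bk)) - 0.9  # undo the shift: approximately -1.0 and 1.0
```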
9. We prove the first equality using induction on k. For k = 1, it says that Q0 A1 = AQ0, or QA1 = AQ. Now, A1 = RQ and A = QR, so that QA1 = QRQ = (QR)Q = AQ. This proves the basis of the induction. Assume the equality holds for k = n; we prove it for k = n + 1:
Q0 Q1 Q2 · · · Qn An+1 = Q0 Q1 Q2 · · · Qn−1 (Qn An+1)
= Q0 Q1 Q2 · · · Qn−1 (Qn (Rn Qn))
= Q0 Q1 Q2 · · · Qn−1 (Qn Rn) Qn
= (Q0 Q1 Q2 · · · Qn−1 An) Qn
= (A Q0 Q1 Q2 · · · Qn−1) Qn
= A Q0 Q1 Q2 · · · Qn.
For the second equality, note that Ak = Qk Rk, so that
Q0 Q1 Q2 · · · Qk−1 Qk Rk = Q0 Q1 Q2 · · · Qk−1 Ak = A Q0 Q1 Q2 · · · Qk−1.
Multiply both sides by Rk−1 · · · R2 R1 to get
(Q0 Q1 Q2 · · · Qk−1 Qk)(Rk Rk−1 · · · R2 R1) = A(Q0 Q1 Q2 · · · Qk−1)(Rk−1 · · · R2 R1),
as desired.
For the QR decomposition of A^(k+1), we again use induction. For k = 0, the equation says that Q0 R0 = A^1 = A, which is true. Now assume the statement holds for k = n. Then using the first equation in this exercise together with the fact that An = Qn Rn and the inductive hypothesis, we get
A^(n+1) = A · A^n
= A(Q0 Q1 · · · Qn−1)(Rn−1 · · · R1 R0)
= Q0 Q1 · · · Qn−1 An Rn−1 · · · R1 R0
= Q0 Q1 · · · Qn−1 Qn Rn Rn−1 · · · R1 R0
= (Q0 Q1 · · · Qn)(Rn Rn−1 · · · R1 R0),
as desired.
5.4
Orthogonal Diagonalization of Symmetric Matrices
1. The characteristic polynomial of A is
det(A − λI) = |4−λ 1; 1 4−λ| = λ² − 8λ + 15 = (λ − 5)(λ − 3).
Thus the eigenvalues of A are λ1 = 3 and λ2 = 5. To find the eigenspace corresponding to λ1 = 3, we row-reduce
[A − 3I | 0] = [1 1 0; 1 1 0] → [1 1 0; 0 0 0].
A basis for the eigenspace is given by [−1, 1]^T.
To find the eigenspace corresponding to λ2 = 5, we row-reduce
[A − 5I | 0] = [−1 1 0; 1 −1 0] → [1 −1 0; 0 0 0].
A basis for the eigenspace is given by [1, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of each vector is √2, so setting
Q = [−1/√2 1/√2; 1/√2 1/√2]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [3 0; 0 5] = D.
2. The characteristic polynomial of A is
det(A − λI) = |−1−λ 3; 3 −1−λ| = λ² + 2λ − 8 = (λ + 4)(λ − 2).
Thus the eigenvalues of A are λ1 = −4 and λ2 = 2. To find the eigenspace corresponding to λ1 = −4, we row-reduce
[A + 4I | 0] = [3 3 0; 3 3 0] → [1 1 0; 0 0 0].
A basis for the eigenspace is given by [−1, 1]^T.
To find the eigenspace corresponding to λ2 = 2, we row-reduce
[A − 2I | 0] = [−3 3 0; 3 −3 0] → [1 −1 0; 0 0 0].
A basis for the eigenspace is given by [1, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of each vector is √2, so setting
Q = [−1/√2 1/√2; 1/√2 1/√2]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [−4 0; 0 2] = D.
3. The characteristic polynomial of A is
det(A − λI) = |1−λ √2; √2 −λ| = λ² − λ − 2 = (λ − 2)(λ + 1).
Thus the eigenvalues of A are λ1 = −1 and λ2 = 2. To find the eigenspace corresponding to λ1 = −1, we row-reduce
[A + I | 0] = [2 √2 0; √2 1 0] → [1 √2/2 0; 0 0 0].
A basis for the eigenspace is given by [−1, √2]^T.
To find the eigenspace corresponding to λ2 = 2, we row-reduce
[A − 2I | 0] = [−1 √2 0; √2 −2 0] → [1 −√2 0; 0 0 0].
A basis for the eigenspace is given by [√2, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of each vector is √3, so setting
Q = [−1/√3 √6/3; √6/3 1/√3]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [−1 0; 0 2] = D.
4. The characteristic polynomial of A is
det(A − λI) = |9−λ −2; −2 6−λ| = λ² − 15λ + 50 = (λ − 5)(λ − 10).
Thus the eigenvalues of A are λ1 = 5 and λ2 = 10. To find the eigenspace corresponding to λ1 = 5, we row-reduce
[A − 5I | 0] = [4 −2 0; −2 1 0] → [1 −1/2 0; 0 0 0].
A basis for the eigenspace is given by [1, 2]^T.
To find the eigenspace corresponding to λ2 = 10, we row-reduce
[A − 10I | 0] = [−1 −2 0; −2 −4 0] → [1 2 0; 0 0 0].
A basis for the eigenspace is given by [−2, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of each vector is √5, so setting
Q = [1/√5 −2/√5; 2/√5 1/√5]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [5 0; 0 10] = D.
5. The characteristic polynomial of A is
det(A − λI) = |5−λ 0 0; 0 1−λ 3; 0 3 1−λ| = −λ³ + 7λ² − 2λ − 40 = −(λ − 4)(λ − 5)(λ + 2).
Thus the eigenvalues of A are λ1 = 4, λ2 = 5, and λ3 = −2. To find the eigenspace corresponding to λ1 = 4, we row-reduce
[A − 4I | 0] = [1 0 0 0; 0 −3 3 0; 0 3 −3 0] → [1 0 0 0; 0 1 −1 0; 0 0 0 0].
A basis for the eigenspace is given by [0, 1, 1]^T.
To find the eigenspace corresponding to λ2 = 5, we row-reduce
[A − 5I | 0] = [0 0 0 0; 0 −4 3 0; 0 3 −4 0] → [0 1 0 0; 0 0 1 0; 0 0 0 0].
A basis for the eigenspace is given by [1, 0, 0]^T.
To find the eigenspace corresponding to λ3 = −2, we row-reduce
[A + 2I | 0] = [7 0 0 0; 0 3 3 0; 0 3 3 0] → [1 0 0 0; 0 1 1 0; 0 0 0 0].
A basis for the eigenspace is given by [0, −1, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of the first and third vectors is √2, while the second is already a unit vector, so setting
Q = [0 1 0; 1/√2 0 −1/√2; 1/√2 0 1/√2]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [4 0 0; 0 5 0; 0 0 −2] = D.
6. The characteristic polynomial of A is
det(A − λI) = |2−λ 3 0; 3 2−λ 4; 0 4 2−λ| = −λ³ + 6λ² + 13λ − 42 = −(λ + 3)(λ − 2)(λ − 7).
Thus the eigenvalues of A are λ1 = −3, λ2 = 2, and λ3 = 7. To find the eigenspace corresponding to λ1 = −3, we row-reduce
[A + 3I | 0] = [5 3 0 0; 3 5 4 0; 0 4 5 0] → [1 0 −3/4 0; 0 1 5/4 0; 0 0 0 0].
A basis for the eigenspace is given (after clearing fractions) by [3, −5, 4]^T.
To find the eigenspace corresponding to λ2 = 2, we row-reduce
[A − 2I | 0] = [0 3 0 0; 3 0 4 0; 0 4 0 0] → [1 0 4/3 0; 0 1 0 0; 0 0 0 0].
A basis for the eigenspace is given (after clearing fractions) by [−4, 0, 3]^T.
To find the eigenspace corresponding to λ3 = 7, we row-reduce
[A − 7I | 0] = [−5 3 0 0; 3 −5 4 0; 0 4 −5 0] → [1 0 −3/4 0; 0 1 −5/4 0; 0 0 0 0].
A basis for the eigenspace is given (after clearing fractions) by [3, 5, 4]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norms of these vectors are respectively 5√2, 5, and 5√2, so setting
Q = [3/(5√2) −4/5 3/(5√2); −1/√2 0 1/√2; 4/(5√2) 3/5 4/(5√2)]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [−3 0 0; 0 2 0; 0 0 7] = D.
7. The characteristic polynomial of A is
det(A − λI) = |1−λ 0 −1; 0 1−λ 0; −1 0 1−λ| = −λ³ + 3λ² − 2λ = −λ(λ − 1)(λ − 2).
Thus the eigenvalues of A are λ1 = 0, λ2 = 1, and λ3 = 2. To find the eigenspace corresponding to λ1 = 0, we row-reduce
[A − 0I | 0] = [1 0 −1 0; 0 1 0 0; −1 0 1 0] → [1 0 −1 0; 0 1 0 0; 0 0 0 0].
A basis for the eigenspace is given by [1, 0, 1]^T.
To find the eigenspace corresponding to λ2 = 1, we row-reduce
[A − I | 0] = [0 0 −1 0; 0 0 0 0; −1 0 0 0] → [1 0 0 0; 0 0 1 0; 0 0 0 0].
A basis for the eigenspace is given by [0, 1, 0]^T.
To find the eigenspace corresponding to λ3 = 2, we row-reduce
[A − 2I | 0] = [−1 0 −1 0; 0 −1 0 0; −1 0 −1 0] → [1 0 1 0; 0 1 0 0; 0 0 0 0].
A basis for the eigenspace is given by [−1, 0, 1]^T.
These eigenvectors are orthogonal, as is guaranteed by Theorem 5.19, so to turn them into orthonormal vectors, we normalize them. The norm of the first and third vectors is √2, while the second is already a unit vector, so setting
Q = [1/√2 0 −1/√2; 0 1 0; 1/√2 0 1/√2]
we see that Q is an orthogonal matrix whose columns are the normalized eigenvectors of A, so that
Q^-1 AQ = Q^T AQ = [0 0 0; 0 1 0; 0 0 2] = D.
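With technology, the orthogonal diagonalization can be checked in one call to a symmetric eigensolver; the sketch below (not part of the text) applies it to the matrix of Exercise 7.

```python
import numpy as np

A = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0,  0.0],
              [-1.0, 0.0, 1.0]])   # the symmetric matrix from Exercise 7
evals, Q = np.linalg.eigh(A)       # eigh returns orthonormal eigenvectors for symmetric A
D = Q.T @ A @ Q                    # should be diag(0, 1, 2)
```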
8. The characteristic polynomial of A is
det(A − λI) = |1−λ 2 2; 2 1−λ 2; 2 2 1−λ| = −λ³ + 3λ² + 9λ + 5 = −(λ + 1)²(λ − 5).
Thus the eigenvalues of A are λ1 = −1 and λ2 = 5. To find the eigenspace E−1 corresponding to λ1 = −1, we row-reduce
[A + I | 0] = [2 2 2 0; 2 2 2 0; 2 2 2 0] → [1 1 1 0; 0 0 0 0; 0 0 0 0].
Thus the eigenspace E−1 is
{[−s − t, s, t]^T} = span{x1 = [−1, 1, 0]^T, x2 = [−1, 0, 1]^T}.
To find the eigenspace E5 corresponding to λ2 = 5, we row-reduce
[A − 5I | 0] = [−4 2 2 0; 2 −4 2 0; 2 2 −4 0] → [1 0 −1 0; 0 1 −1 0; 0 0 0 0].
Thus a basis for E5 is v3 = [1, 1, 1]^T.
E5 is orthogonal to E−1 as guaranteed by Theorem 5.19, so to turn this basis into an orthogonal basis we must orthogonalize the basis for E−1 above. Use Gram-Schmidt:
v1 = [−1, 1, 0]^T,
v2 = x2 − ((v1 · x2)/(v1 · v1)) v1 = [−1, 0, 1]^T − (1/2)[−1, 1, 0]^T = [−1/2, −1/2, 1]^T.
We double v2 to remove fractions, so we get the orthogonal basis
v1 = [−1, 1, 0]^T, v2 = [−1, −1, 2]^T, v3 = [1, 1, 1]^T.
The norms of these vectors are, respectively, √2, √6, and √3, so normalizing each vector and setting Q equal to the matrix whose columns are the normalized eigenvectors gives
Q = [−1/√2 −1/√6 1/√3; 1/√2 −1/√6 1/√3; 0 2/√6 1/√3],
so that
Q^-1 AQ = Q^T AQ = [−1 0 0; 0 −1 0; 0 0 5] = D.
9. The characteristic polynomial of A is
det(A − λI) = |1−λ 1 0 0; 1 1−λ 0 0; 0 0 1−λ 1; 0 0 1 1−λ| = λ⁴ − 4λ³ + 4λ² = λ²(λ − 2)².
Thus the eigenvalues of A are λ1 = 0 and λ2 = 2. To find the eigenspace E0 corresponding to λ1 = 0, we row-reduce
[A + 0I | 0] = [1 1 0 0 0; 1 1 0 0 0; 0 0 1 1 0; 0 0 1 1 0] → [1 1 0 0 0; 0 0 1 1 0; 0 0 0 0 0; 0 0 0 0 0].
Thus the eigenspace E0 is
{[−s, s, −t, t]^T} = span{x1 = [−1, 1, 0, 0]^T, x2 = [0, 0, −1, 1]^T}.
To find the eigenspace E2 corresponding to λ2 = 2, we row-reduce
[A − 2I | 0] = [−1 1 0 0 0; 1 −1 0 0 0; 0 0 −1 1 0; 0 0 1 −1 0] → [1 −1 0 0 0; 0 0 1 −1 0; 0 0 0 0 0; 0 0 0 0 0].
Thus the eigenspace E2 is
{[s, s, t, t]^T} = span{x3 = [1, 1, 0, 0]^T, x4 = [0, 0, 1, 1]^T}.
E0 is orthogonal to E2 as guaranteed by Theorem 5.19, and x1 and x2 are also orthogonal, as are x3 and x4. So these four vectors form an orthogonal basis. The norm of each vector is √2, so normalizing each vector and setting Q equal to the matrix whose columns are the normalized eigenvectors gives
Q = [−1/√2 0 1/√2 0; 1/√2 0 1/√2 0; 0 −1/√2 0 1/√2; 0 1/√2 0 1/√2],
so that
Q^-1 AQ = Q^T AQ = [0 0 0 0; 0 0 0 0; 0 0 2 0; 0 0 0 2] = D.
10. The characteristic polynomial of A is
det(A − λI) = |2−λ 0 0 1; 0 1−λ 0 0; 0 0 1−λ 0; 1 0 0 2−λ| = λ⁴ − 6λ³ + 12λ² − 10λ + 3 = (λ − 1)³(λ − 3).
Thus the eigenvalues of A are λ1 = 1 and λ2 = 3. To find the eigenspace E1 corresponding to λ1 = 1, we row-reduce
[A − I | 0] = [1 0 0 1 0; 0 0 0 0 0; 0 0 0 0 0; 1 0 0 1 0] → [1 0 0 1 0; 0 0 0 0 0; 0 0 0 0 0; 0 0 0 0 0].
Thus the eigenspace E1 is
{[−s, t, u, s]^T} = span{[−1, 0, 0, 1]^T, [0, 1, 0, 0]^T, [0, 0, 1, 0]^T}.
Note that these vectors are mutually orthogonal.
To find the eigenspace E3 corresponding to λ2 = 3, we row-reduce
[A − 3I | 0] = [−1 0 0 1 0; 0 −2 0 0 0; 0 0 −2 0 0; 1 0 0 −1 0] → [1 0 0 −1 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 0 0].
Thus a basis for E3 is [1, 0, 0, 1]^T.
Since the eigenvectors of E1 are orthogonal to those of E3 by Theorem 5.19, and the eigenvectors of E1 are already mutually orthogonal, all we need to do is to normalize the vectors. The norms are, respectively, √2, 1, 1, and √2, so setting Q equal to the matrix whose columns are the normalized eigenvectors gives
Q = [−1/√2 0 0 1/√2; 0 1 0 0; 0 0 1 0; 1/√2 0 0 1/√2],
so that
Q^-1 AQ = Q^T AQ = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 3] = D.
11. The characteristic polynomial is
det(A − λI) = |a−λ b; b a−λ| = (a − λ)² − b².
This is zero when a − λ = ±b, so the eigenvalues are a + b and a − b. To find the corresponding eigenspaces, row-reduce:
[A − (a + b)I | 0] = [−b b 0; b −b 0] → [1 −1 0; 0 0 0]
[A − (a − b)I | 0] = [b b 0; b b 0] → [1 1 0; 0 0 0].
So bases for the two eigenspaces (which are orthogonal by Theorem 5.19) are [1, 1]^T and [−1, 1]^T. Each of these vectors has norm √2, so normalizing each gives
Q = [1/√2 −1/√2; 1/√2 1/√2],
so that
Q^-1 AQ = Q^T AQ = [a+b 0; 0 a−b].
12. The characteristic polynomial is (expanding along the second row)
det(A − λI) = |a−λ 0 b; 0 a−λ 0; b 0 a−λ| = (a − λ)((a − λ)² − b²).
This is zero when a = λ or when a − λ = ±b, so the eigenvalues are a, a + b, and a − b. To find the corresponding eigenspaces, row-reduce:
[A − aI | 0] = [0 0 b 0; 0 0 0 0; b 0 0 0] → [1 0 0 0; 0 0 1 0; 0 0 0 0]
[A − (a + b)I | 0] = [−b 0 b 0; 0 −b 0 0; b 0 −b 0] → [1 0 −1 0; 0 1 0 0; 0 0 0 0]
[A − (a − b)I | 0] = [b 0 b 0; 0 b 0 0; b 0 b 0] → [1 0 1 0; 0 1 0 0; 0 0 0 0].
So bases for the three eigenspaces (which are orthogonal by Theorem 5.19) are [0, 1, 0]^T, [1, 0, 1]^T, and [−1, 0, 1]^T.
Normalizing the second and third vectors gives
Q = [0 1/√2 −1/√2; 1 0 0; 0 1/√2 1/√2],
so that
Q^-1 AQ = Q^T AQ = [a 0 0; 0 a+b 0; 0 0 a−b].
13. (a) Since A and B are each orthogonally diagonalizable, they are each symmetric by the Spectral
Theorem. Therefore A + B is also symmetric, so it too is orthogonally diagonalizable by the
Spectral Theorem.
(b) Since A is orthogonally diagonalizable, it is symmetric by the Spectral Theorem. Therefore cA is
also symmetric, so it too is orthogonally diagonalizable by the Spectral Theorem.
(c) Since A is orthogonally diagonalizable, it is symmetric by the Spectral Theorem. Therefore
A² = AA is also symmetric, so it too is orthogonally diagonalizable by the Spectral Theorem.
14. If A is invertible and orthogonally diagonalizable, then Q^T AQ = D for some invertible diagonal matrix D and orthogonal matrix Q, so that
D^-1 = (Q^T AQ)^-1 = Q^-1 A^-1 (Q^T)^-1 = Q^T A^-1 Q.
Thus A^-1 is also orthogonally diagonalizable (and by the same orthogonal matrix).
15. Since A and B are orthogonally diagonalizable, they are both symmetric. Since AB = BA, Exercise
36 in Section 3.2 implies that AB is also symmetric, so it is orthogonally diagonalizable as well.
16. Let A be a symmetric matrix. Then it is orthogonally diagonalizable, say D = Q^T AQ. Then the diagonal entries of D are the eigenvalues of A. First, if every eigenvalue of A is nonnegative, let
D = diag(λ1, . . . , λn), and define √D = diag(√λ1, . . . , √λn).
Let B = Q √D Q^T. Then B is orthogonally diagonalizable by construction, so it is symmetric. Further, since Q is orthogonal,
B² = Q √D Q^T Q √D Q^T = Q √D Q^-1 Q √D Q^T = Q √D √D Q^T = Q D Q^T = A.
Conversely, suppose A = B² for some symmetric matrix B. Then B is orthogonally diagonalizable, say Q^T BQ = D, so that
Q^T AQ = Q^T B² Q = (Q^T BQ)(Q^T BQ) = D².
But D² is a diagonal matrix whose diagonal entries are the squares of the diagonal entries of D, so they are all nonnegative. Thus the eigenvalues of A are all nonnegative.
17. From Exercise 1 we have
λ1 = 3, q1 = [−1/√2, 1/√2]^T;   λ2 = 5, q2 = [1/√2, 1/√2]^T.
Therefore
A = λ1 q1 q1^T + λ2 q2 q2^T = 3 [1/2 −1/2; −1/2 1/2] + 5 [1/2 1/2; 1/2 1/2].
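The spectral decomposition of Exercise 17 can be verified by reassembling A = λ1 q1 q1^T + λ2 q2 q2^T and checking that the result is the matrix A = [4 1; 1 4] of Exercise 1:

```python
import numpy as np

lam = [3.0, 5.0]
q = [np.array([-1.0, 1.0]) / np.sqrt(2),   # unit eigenvector for lambda = 3
     np.array([1.0, 1.0]) / np.sqrt(2)]    # unit eigenvector for lambda = 5
A = sum(l * np.outer(v, v) for l, v in zip(lam, q))   # spectral decomposition
```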
18. From Exercise 2 we have
λ1 = −4, q1 = [−1/√2, 1/√2]^T;   λ2 = 2, q2 = [1/√2, 1/√2]^T.
Therefore
A = λ1 q1 q1^T + λ2 q2 q2^T = −4 [1/2 −1/2; −1/2 1/2] + 2 [1/2 1/2; 1/2 1/2].
19. From Exercise 5 we have
λ1 = 4, q1 = [0, 1/√2, 1/√2]^T;   λ2 = 5, q2 = [1, 0, 0]^T;   λ3 = −2, q3 = [0, −1/√2, 1/√2]^T.
Therefore
A = λ1 q1 q1^T + λ2 q2 q2^T + λ3 q3 q3^T = 4 [0 0 0; 0 1/2 1/2; 0 1/2 1/2] + 5 [1 0 0; 0 0 0; 0 0 0] − 2 [0 0 0; 0 1/2 −1/2; 0 −1/2 1/2].
20. From Exercise 8 we have
λ1 = −1, q1 = [−1/√2, 1/√2, 0]^T, q2 = [−1/√6, −1/√6, 2/√6]^T;   λ2 = 5, q3 = [1/√3, 1/√3, 1/√3]^T.
Therefore
A = λ1 q1 q1^T + λ1 q2 q2^T + λ2 q3 q3^T = − [1/2 −1/2 0; −1/2 1/2 0; 0 0 0] − [1/6 1/6 −1/3; 1/6 1/6 −1/3; −1/3 −1/3 2/3] + 5 [1/3 1/3 1/3; 1/3 1/3 1/3; 1/3 1/3 1/3].
21. Since v1 and v2 are already orthogonal, normalize each of them to get
q1 = [1/√2, 1/√2]^T,   q2 = [1/√2, −1/√2]^T.
Now let Q be the matrix whose columns are q1 and q2; then the desired matrix is
Q [−1 0; 0 2] Q^-1 = Q [−1 0; 0 2] Q^T = [1/√2 1/√2; 1/√2 −1/√2][−1 0; 0 2][1/√2 1/√2; 1/√2 −1/√2] = [1/2 −3/2; −3/2 1/2].
22. Since v1 and v2 are already orthogonal, normalize each of them to get
q1 = [1/√5, 2/√5]^T,   q2 = [−2/√5, 1/√5]^T.
Now let Q be the matrix whose columns are q1 and q2; then the desired matrix is
Q [3 0; 0 −3] Q^-1 = Q [3 0; 0 −3] Q^T = [1/√5 −2/√5; 2/√5 1/√5][3 0; 0 −3][1/√5 2/√5; −2/√5 1/√5] = [−9/5 12/5; 12/5 9/5].
23. Note that v1, v2, and v3 are mutually orthogonal, with norms √2, √3, and √6 respectively. Let Q be the matrix whose columns are q1 = v1/‖v1‖, q2 = v2/‖v2‖, and q3 = v3/‖v3‖; then the desired matrix is
Q [1 0 0; 0 2 0; 0 0 3] Q^-1 = Q [1 0 0; 0 2 0; 0 0 3] Q^T
= [1/√2 1/√3 −1/√6; 1/√2 −1/√3 1/√6; 0 1/√3 2/√6][1 0 0; 0 2 0; 0 0 3][1/√2 1/√2 0; 1/√3 −1/√3 1/√3; −1/√6 1/√6 2/√6]
= [5/3 −2/3 −1/3; −2/3 5/3 1/3; −1/3 1/3 8/3].
24. Note that v1, v2, and v3 are mutually orthogonal, with norms √42, √3, and √14 respectively. Let Q be the matrix whose columns are q1 = v1/‖v1‖, q2 = v2/‖v2‖, and q3 = v3/‖v3‖; then the desired matrix is
Q [1 0 0; 0 −4 0; 0 0 −4] Q^-1 = Q [1 0 0; 0 −4 0; 0 0 −4] Q^T
= [4/√42 −1/√3 2/√14; 5/√42 1/√3 −1/√14; −1/√42 1/√3 3/√14][1 0 0; 0 −4 0; 0 0 −4][4/√42 5/√42 −1/√42; −1/√3 1/√3 1/√3; 2/√14 −1/√14 3/√14]
= [−44/21 50/21 −10/21; 50/21 −43/42 −25/42; −10/21 −25/42 −163/42].
25. Since q is a unit vector, q · q = 1, so that
projW v = ((q · v)/(q · q)) q = (q · v)q = q(q · v) = q(q^T v) = (qq^T)v.
26. (a) By definition (see Section 5.2), the projection of v onto W is
projW v = ((q1 · v)/(q1 · q1)) q1 + · · · + ((qk · v)/(qk · qk)) qk
= (q1 · v)q1 + · · · + (qk · v)qk
= q1(q1 · v) + · · · + qk(qk · v)
= q1(q1^T v) + · · · + qk(qk^T v)
= (q1 q1^T + · · · + qk qk^T)v.
Therefore the matrix of the projection is indeed q1 q1^T + · · · + qk qk^T.
(b) First,
P^T = (q1 q1^T + · · · + qk qk^T)^T = (q1 q1^T)^T + · · · + (qk qk^T)^T = q1 q1^T + · · · + qk qk^T = P.
Next, since qi · qj = 0 for i ≠ j (so all the cross terms vanish) and qi · qi = qi^T qi = 1, we have
P² = (q1 q1^T)(q1 q1^T) + · · · + (qk qk^T)(qk qk^T) = q1 (q1^T q1) q1^T + · · · + qk (qk^T qk) qk^T = q1 q1^T + · · · + qk qk^T = P.
(c) We have
QQ^T = [q1 · · · qk][q1^T; . . . ; qk^T] = q1 q1^T + · · · + qk qk^T = P.
Thus rank(P) = rank(QQ^T) = rank(Q) by Theorem 3.28 in Section 3.5. But if q1, . . . , qk is an orthonormal basis, then rank(Q) = k, so that rank(P) = k as well.
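The three properties in Exercise 26 can be spot-checked numerically. The orthonormal set below is a hypothetical example, not one taken from the text; it spans a 2-dimensional subspace of R³.

```python
import numpy as np

q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)   # hypothetical orthonormal basis
q2 = np.array([0.0, 0.0, 1.0])
Q = np.column_stack([q1, q2])
P = Q @ Q.T                                   # P = q1 q1^T + q2 q2^T
```

P is symmetric and idempotent, with rank equal to the number of basis vectors.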
27. Suppose the eigenvalues are λ1, . . . , λn and that v1, . . . , vn are a corresponding orthonormal set of eigenvectors. Let Q1 = [v1 · · · vn]. Then
Q1^T A Q1 = [v1^T; . . . ; vn^T] A [v1 · · · vn] = [v1^T; . . . ; vn^T][Av1 · · · Avn] = [v1^T; . . . ; vn^T][λ1 v1 · · · λn vn] = [λ1 ∗; 0 A1] = T,
where A1 is an (n − 1) × (n − 1) matrix. The result now follows by induction since T is block upper triangular.
28. We know that if A is nilpotent, then its only eigenvalue is 0. So if A is nilpotent, then all of its
eigenvalues are real, so that by Exercise 27 there is an orthogonal matrix Q such that QT AQ is upper
triangular. But A and QT AQ have the same eigenvalues; since the eigenvalues of QT AQ are the
diagonal entries, all the diagonal entries must be zero.
5.5
Applications
1. f(x) = x^T Ax = [x y][2 3; 3 4][x; y] = 2x² + 6xy + 4y².
2. f(x) = x^T Ax = [x1 x2][5 1; 1 −1][x1; x2] = 5x1² + 2x1x2 − x2².
3. f(x) = x^T Ax = [1 6][3 −2; −2 4][1; 6] = 3 · 1² − 4 · 1 · 6 + 4 · 6² = 123.
4. f(x) = [x y z][1 0 −3; 0 2 1; −3 1 3][x; y; z] = x² + 2y² + 3z² + 0xy − 6xz + 2yz = x² + 2y² + 3z² − 6xz + 2yz.
5. From Exercise 4,
f([2, −1, 1]^T) = 2² + 2 · (−1)² + 3 · 1² − 6 · 2 · 1 + 2 · (−1) · 1 = −5.
6. f(x) = x^T Ax = [1 2 3][2 2 0; 2 0 1; 0 1 1][1; 2; 3] = 2 · 1² + 0 · 2² + 1 · 3² + 4 · 1 · 2 + 0 · 1 · 3 + 2 · 2 · 3 = 31.
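Evaluating a quadratic form x^T Ax is a one-liner with technology; the sketch below rechecks the values computed in Exercises 3 and 5.

```python
import numpy as np

def f(x, A):
    # Evaluate the quadratic form f(x) = x^T A x.
    return float(x @ A @ x)

A3 = np.array([[3.0, -2.0], [-2.0, 4.0]])     # the matrix of Exercise 3
A5 = np.array([[1.0, 0.0, -3.0],
               [0.0, 2.0, 1.0],
               [-3.0, 1.0, 3.0]])             # the matrix of Exercises 4 and 5
```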
7. The diagonal entries are 1 and 2, and the corners are each (1/2) · 6 = 3, so A = [1 3; 3 2].
8. The diagonal entries are both zero, and the corners are each (1/2) · 1 = 1/2, so A = [0 1/2; 1/2 0].
9. The diagonal entries are 3 and −1, and the corners are each (1/2) · (−3) = −3/2, so A = [3 −3/2; −3/2 −1].
10. The diagonal entries are 1, 0, and −1, while the off-diagonal entries are (1/2) · 8 = 4, 0, and (1/2) · (−6) = −3. Thus
A = [1 4 0; 4 0 −3; 0 −3 −1].
11. The diagonal entries are 5, −1, and 2, while the off-diagonal entries are (1/2) · 2 = 1, (1/2) · (−4) = −2, and (1/2) · 4 = 2. Thus
A = [5 1 −2; 1 −1 2; −2 2 2].
12. The diagonal entries are 2, −3, and 1, while the off-diagonal entries are (1/2) · 0 = 0, (1/2) · (−4) = −2, and (1/2) · 0 = 0. Thus
A = [2 0 −2; 0 −3 0; −2 0 1].
13. The matrix of this form is A = [2 −2; −2 5], so that its characteristic polynomial is
|2−λ −2; −2 5−λ| = λ² − 7λ + 6 = (λ − 1)(λ − 6).
So the eigenvalues are 1 and 6; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [2, 1]^T and v2 = [1, −2]^T. Normalizing these and constructing Q gives
Q = [2/√5 1/√5; 1/√5 −2/√5].
The new quadratic form is
f(y) = y^T Dy = [y1 y2][1 0; 0 6][y1; y2] = y1² + 6y2².
14. The matrix of this form is A = [1 4; 4 1], so that its characteristic polynomial is
|1−λ 4; 4 1−λ| = λ² − 2λ − 15 = (λ − 5)(λ + 3).
So the eigenvalues are 5 and −3; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [1, 1]^T and v2 = [1, −1]^T. Normalizing these and constructing Q gives
Q = [1/√2 1/√2; 1/√2 −1/√2].
The new quadratic form is
f(y) = y^T Dy = [y1 y2][5 0; 0 −3][y1; y2] = 5y1² − 3y2².
15. The matrix of this form is A = [7 4 4; 4 1 −8; 4 −8 1], so that its characteristic polynomial is
|7−λ 4 4; 4 1−λ −8; 4 −8 1−λ| = −(λ − 9)²(λ + 9).
So the eigenvalues are 9 (of algebraic multiplicity 2) and −9; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [2, 0, 1]^T and v2 = [2, 1, 0]^T for λ = 9 and [−1, 2, 2]^T for λ = −9. Orthogonalizing v1 and v2 gives
v2' = v2 − ((v2 · v1)/(v1 · v1)) v1 = [2, 1, 0]^T − (4/5)[2, 0, 1]^T = [2/5, 1, −4/5]^T.
Normalizing these and constructing Q gives
Q = [2/√5 2/(3√5) −1/3; 0 √5/3 2/3; 1/√5 −4/(3√5) 2/3].
The new quadratic form is
f(y) = y^T Dy = [y1 y2 y3][9 0 0; 0 9 0; 0 0 −9][y1; y2; y3] = 9y1² + 9y2² − 9y3².
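The diagonalization in Exercise 15 can be confirmed by computing Q^T AQ directly and checking that it equals diag(9, 9, −9):

```python
import numpy as np

A = np.array([[7.0, 4.0, 4.0],
              [4.0, 1.0, -8.0],
              [4.0, -8.0, 1.0]])          # the matrix of Exercise 15
s5 = np.sqrt(5.0)
Q = np.array([[2/s5, 2/(3*s5), -1/3],
              [0.0,  s5/3,      2/3],
              [1/s5, -4/(3*s5), 2/3]])    # normalized (orthogonalized) eigenvectors
D = Q.T @ A @ Q                            # should be diag(9, 9, -9)
```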
16. The matrix of this form is A = [1 −2 0; −2 1 0; 0 0 3], so that its characteristic polynomial is
|1−λ −2 0; −2 1−λ 0; 0 0 3−λ| = −(λ − 3)²(λ + 1).
So the eigenvalues are 3 (of algebraic multiplicity 2) and −1; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [1, −1, 0]^T and v2 = [0, 0, 1]^T for λ = 3 and [1, 1, 0]^T for λ = −1. Since v1 and v2 are already orthogonal, we normalize all three vectors and construct Q to get
Q = [1/√2 0 1/√2; −1/√2 0 1/√2; 0 1 0].
The new quadratic form is
f(y) = y^T Dy = [y1 y2 y3][3 0 0; 0 3 0; 0 0 −1][y1; y2; y3] = 3y1² + 3y2² − y3².
17. The matrix of this form is A = [1 −1 0; −1 0 1; 0 1 1], so that its characteristic polynomial is
|1−λ −1 0; −1 −λ 1; 0 1 1−λ| = −(λ − 2)(λ − 1)(λ + 1).
So the eigenvalues are 2, 1, and −1; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [1, −1, −1]^T, v2 = [1, 0, 1]^T, and v3 = [1, 2, −1]^T. Normalize all three vectors and construct Q to get
Q = [1/√3 1/√2 1/√6; −1/√3 0 2/√6; −1/√3 1/√2 −1/√6].
The new quadratic form is
f(x') = (x')^T Dx' = [x' y' z'][2 0 0; 0 1 0; 0 0 −1][x'; y'; z'] = 2(x')² + (y')² − (z')².
18. The matrix of this form is A = [0 1 1; 1 0 1; 1 1 0], so that its characteristic polynomial is
|−λ 1 1; 1 −λ 1; 1 1 −λ| = −(λ − 2)(λ + 1)².
So the eigenvalues are 2 and −1; finding the eigenvectors by the usual method gives corresponding eigenvectors v1 = [1, 1, 1]^T for λ = 2 and v2 = [0, 1, −1]^T and v3 = [−2, 1, 1]^T for λ = −1. The latter two are already orthogonal, so simply normalize all three vectors and construct Q to get
Q = [1/√3 0 −2/√6; 1/√3 1/√2 1/√6; 1/√3 −1/√2 1/√6].
The new quadratic form is
f(x') = (x')^T Dx' = [x' y' z'][2 0 0; 0 −1 0; 0 0 −1][x'; y'; z'] = 2(x')² − (y')² − (z')².
19. The matrix of this form is A = [1 0; 0 2], so that its characteristic polynomial is
|1−λ 0; 0 2−λ| = (1 − λ)(2 − λ).
Since the eigenvalues, 1 and 2, are both positive, this is a positive definite form.
20. The matrix of this form is A = [1 −1; −1 1], so that its characteristic polynomial is
|1−λ −1; −1 1−λ| = λ(λ − 2).
Since the eigenvalues, 0 and 2, are both nonnegative, but one is zero, this is a positive semidefinite form.
21. The matrix of this form is A = [−2 1; 1 −2], so that its characteristic polynomial is
|−2−λ 1; 1 −2−λ| = (λ + 3)(λ + 1).
Since the eigenvalues, −1 and −3, are both negative, this is a negative definite form.
22. The matrix of this form is A = [1 2; 2 1], so that its characteristic polynomial is
|1−λ 2; 2 1−λ| = (λ − 3)(λ + 1).
Since the eigenvalues, −1 and 3, have opposite signs, this is an indefinite form.


23. The matrix of this form is A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]], so that its characteristic polynomial is

|2−λ  1  1;  1  2−λ  1;  1  1  2−λ| = −(λ − 1)²(λ − 4).

Since the eigenvalues, 1 and 4, are both positive, this is a positive definite form.


24. The matrix of this form is A = [[1, 0, 1], [0, 1, 0], [1, 0, 1]], so that its characteristic polynomial is

|1−λ  0  1;  0  1−λ  0;  1  0  1−λ| = −λ(λ − 1)(λ − 2).

Since the eigenvalues, 0, 1, and 2, are all nonnegative, but one of them is zero, this is a positive semidefinite form.


25. The matrix of this form is A = [[1, 2, 0], [2, 1, 0], [0, 0, −1]], so that its characteristic polynomial is

|1−λ  2  0;  2  1−λ  0;  0  0  −1−λ| = −(λ + 1)²(λ − 3).

Since the eigenvalues, −1 and 3, have opposite signs, this is an indefinite form.


26. The matrix of this form is A = [[−1, −1, −1], [−1, −1, −1], [−1, −1, −1]], so that its characteristic polynomial is

|−1−λ  −1  −1;  −1  −1−λ  −1;  −1  −1  −1−λ| = −λ²(λ + 3).

Since the eigenvalues, 0 and −3, are both nonpositive but one of them is zero, this is a negative semidefinite form.
27. By the Principal Axes Theorem, if A is the matrix of f(x), then there is an orthogonal matrix Q such that QᵀAQ = D is a diagonal matrix, and then if y = Qᵀx, we have

f(x) = xᵀAx = yᵀDy = λ₁y₁² + · · · + λₙyₙ².

• For 5.22(a), f(x) is positive definite if and only if λ₁y₁² + · · · + λₙyₙ² > 0 for all y₁, . . . , yₙ not all zero. But this happens if and only if λᵢ > 0 for all i, since if some λᵢ ≤ 0, take yᵢ = 1 and all other yⱼ = 0 to get a value for f(x) that is not positive.
• For 5.22(b), f(x) is positive semidefinite if and only if λ₁y₁² + · · · + λₙyₙ² ≥ 0 for all y₁, . . . , yₙ. But this happens if and only if λᵢ ≥ 0 for all i, since if some λᵢ < 0, take yᵢ = 1 and all other yⱼ = 0 to get a value for f(x) that is negative.
• For 5.22(c), f(x) is negative definite if and only if λ₁y₁² + · · · + λₙyₙ² < 0 for all y₁, . . . , yₙ not all zero. But this happens if and only if λᵢ < 0 for all i, since if some λᵢ ≥ 0, take yᵢ = 1 and all other yⱼ = 0 to get a value for f(x) that is nonnegative.
• For 5.22(d), f(x) is negative semidefinite if and only if λ₁y₁² + · · · + λₙyₙ² ≤ 0 for all y₁, . . . , yₙ. But this happens if and only if λᵢ ≤ 0 for all i, since if some λᵢ > 0, take yᵢ = 1 and all other yⱼ = 0 to get a value for f(x) that is positive.
• For 5.22(e), f(x) is indefinite if and only if λ₁y₁² + · · · + λₙyₙ² can be either negative or positive. But from the arguments above, we see that this can happen if and only if some λᵢ is positive and some λⱼ is negative.
28. The given matrix represents the form ax² + 2bxy + dy², which can be written in the form given in the hint:

ax² + 2bxy + dy² = a(x + (b/a)y)² + (d − b²/a)y².

This is a positive definite form if and only if the right-hand side is positive for all values of x and y not both zero. First suppose that a > 0 and det A = ad − b² > 0. Then ad > b², so that d > b²/a (using the fact that a > 0), and thus d − b²/a > 0. Now if y ≠ 0, the second term on the right-hand side is positive and the first is nonnegative; and if y = 0 and x ≠ 0, the first term is ax² > 0. Thus the form is positive definite. Next assume the form is positive definite; we must show a > 0 and det A > 0. By way of contradiction, assume a ≤ 0; then setting x = 1 and y = 0 we get a ≤ 0 for the value of the form, which contradicts the assumption that it is positive definite. Thus a > 0. Next assume that det A = ad − b² ≤ 0; then (since a > 0) d ≤ b²/a. Now setting x = −b/a and y = 1 gives d − b²/a ≤ 0 for the value of the form, which is again a contradiction. Thus a > 0 and det A > 0.
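Exercise 28's 2×2 criterion can be compared against the eigenvalue test of Exercise 27 on a few sample matrices; a quick numerical sketch (our addition):

```python
import numpy as np

# Exercise 28: [[a, b], [b, d]] is positive definite iff a > 0 and ad - b^2 > 0.
def pd_by_criterion(a, b, d):
    return a > 0 and a * d - b * b > 0

def pd_by_eigenvalues(a, b, d):
    M = np.array([[a, b], [b, d]], dtype=float)
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

# Sample matrices drawn from Exercises 19-22 style examples.
cases = [(2, 1, 2), (1, 2, 1), (-2, 1, -2), (1, -1, 1)]
agree = all(pd_by_criterion(*c) == pd_by_eigenvalues(*c) for c in cases)
```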
29. Since A = BᵀB, we have xᵀAx = xᵀBᵀBx = (Bx)ᵀ(Bx) = ‖Bx‖² ≥ 0. Now, if xᵀAx = 0, then ‖Bx‖² = 0, so Bx = 0. But since B is invertible, this implies that x = 0. Therefore xᵀAx > 0 for all x ≠ 0, so that A is positive definite.
30. Since A is symmetric and positive definite, we can find an orthogonal matrix Q and a diagonal matrix D such that A = QDQᵀ, where D = diag(λ₁, . . . , λₙ) with λᵢ > 0 for all i. Let

C = Cᵀ = diag(√λ₁, . . . , √λₙ);

then D = C² = CᵀC. This gives

A = QDQᵀ = QCᵀCQᵀ = (CQᵀ)ᵀ(CQᵀ).

Since all the λᵢ are positive, C is invertible; since Q is orthogonal, it is invertible. Thus B = CQᵀ is invertible and we are done.
31. (a) xᵀ(cA)x = c(xᵀAx) > 0 because c > 0 and A is positive definite.
(b) Since A is symmetric, xᵀA²x = xᵀAᵀAx = (Ax)ᵀ(Ax) = ‖Ax‖² ≥ 0; and since A is positive definite it is invertible, so Ax ≠ 0 when x ≠ 0. Thus xᵀA²x > 0.
(c) xᵀ(A + B)x = xᵀAx + xᵀBx > 0 since both A and B are positive definite.
(d) First note that A is invertible by Exercise 30, since it factors as A = BᵀB where B is invertible. Since A is positive definite, its eigenvalues {λᵢ} are positive; the eigenvalues of A⁻¹ are {1/λᵢ}, which are also positive. Thus A⁻¹ is positive definite.
32. From Exercise 30, we have A = QCᵀCQᵀ, where Q is orthogonal and C is a diagonal matrix. But then Cᵀ = C, so that A = QCCQᵀ = QCQᵀQCQᵀ = (QCQᵀ)(QCQᵀ) = B². B is symmetric since B = QCQᵀ is orthogonally diagonalizable, and it is positive definite since its eigenvalues are the diagonal entries of C, which are all positive.
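Exercise 32's square-root construction B = QCQᵀ can be checked numerically on any positive definite matrix; a sketch using the matrix of Exercise 23 (our addition):

```python
import numpy as np

# B = Q C Q^T with C = diag(sqrt(lambda_i)) satisfies B^2 = A and B symmetric.
A = np.array([[2.0, 1, 1],
              [1, 2, 1],
              [1, 1, 2]])
lam, Q = np.linalg.eigh(A)       # A = Q diag(lam) Q^T, Q orthogonal
C = np.diag(np.sqrt(lam))        # valid since all lam > 0 (A is pos. definite)
B = Q @ C @ Q.T                  # the positive definite square root of A
```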
33. The eigenvalues of this form, from Exercise 20, are λ₁ = 2 and λ₂ = 0, so that by Theorem 5.23, the maximum value of f subject to the constraint ‖x‖ = 1 is 2 and the minimum value is 0. These occur at the unit eigenvectors of A, which are found by row-reducing:

[A − 2I | 0] = [−1 −1 | 0; −1 −1 | 0] → [1 1 | 0; 0 0 | 0]
[A − 0I | 0] = [1 −1 | 0; −1 1 | 0] → [1 −1 | 0; 0 0 | 0].

So normalized eigenvectors are

v1 = (1/√2)(1, −1) = (1/√2, −1/√2),  v2 = (1/√2)(1, 1) = (1/√2, 1/√2).

Therefore the maximum value of 2 occurs at x = ±v1, and the minimum of 0 occurs at x = ±v2.
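The constrained extremes found in Exercise 33 can be checked by sampling the unit circle; a numerical sketch (our addition, not part of the printed solution):

```python
import numpy as np

# On the unit circle, x^T A x should range between the eigenvalues 0 and 2.
A = np.array([[1.0, -1],
              [-1, 1]])
theta = np.linspace(0, 2 * np.pi, 10001)
X = np.vstack([np.cos(theta), np.sin(theta)])   # unit vectors x(theta)
values = np.einsum("it,ij,jt->t", X, A, X)      # x^T A x at each sample
max_val, min_val = values.max(), values.min()
```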
34. The eigenvalues of this form, from Exercise 22, are λ₁ = 3 and λ₂ = −1, so that by Theorem 5.23, the maximum value of f subject to the constraint ‖x‖ = 1 is 3 and the minimum value is −1. These occur at the unit eigenvectors of A, which are found by row-reducing:

[A − 3I | 0] = [−2 2 | 0; 2 −2 | 0] → [1 −1 | 0; 0 0 | 0]
[A + I | 0] = [2 2 | 0; 2 2 | 0] → [1 1 | 0; 0 0 | 0].

So normalized eigenvectors are

v1 = (1/√2)(1, 1) = (1/√2, 1/√2),  v2 = (1/√2)(1, −1) = (1/√2, −1/√2).

Therefore the maximum value of 3 occurs at x = ±v1, and the minimum of −1 occurs at x = ±v2.
35. The eigenvalues of this form, from Exercise 23, are λ₁ = 4 and λ₂ = 1, so that by Theorem 5.23, the maximum value of f subject to the constraint ‖x‖ = 1 is 4 and the minimum value is 1. These occur at the unit eigenvectors of A, which are found by row-reducing:

[A − 4I | 0] = [−2 1 1 | 0; 1 −2 1 | 0; 1 1 −2 | 0] → [1 0 −1 | 0; 0 1 −1 | 0; 0 0 0 | 0]
[A − I | 0] = [1 1 1 | 0; 1 1 1 | 0; 1 1 1 | 0] → [1 1 1 | 0; 0 0 0 | 0; 0 0 0 | 0].
So normalized eigenvectors are

v1 = (1/√3)(1, 1, 1),  v2 = (1/√2)(1, −1, 0),  v3 = (1/√2)(1, 0, −1).

Therefore the maximum value of 4 occurs at x = ±v1, and the minimum of 1 occurs at x = ±v2 as well as at x = ±v3.
36. The eigenvalues of this form, from Exercise 24, are λ₁ = 2, λ₂ = 1, and λ₃ = 0, so that by Theorem 5.23, the maximum value of f subject to the constraint ‖x‖ = 1 is 2 and the minimum value is 0. These occur at the corresponding unit eigenvectors of A, which are found by row-reducing:

[A − 2I | 0] = [−1 0 1 | 0; 0 −1 0 | 0; 1 0 −1 | 0] → [1 0 −1 | 0; 0 1 0 | 0; 0 0 0 | 0]
[A − 0I | 0] = [1 0 1 | 0; 0 1 0 | 0; 1 0 1 | 0] → [1 0 1 | 0; 0 1 0 | 0; 0 0 0 | 0].

So normalized eigenvectors are

v1 = (1/√2)(1, 0, 1),  v2 = (1/√2)(1, 0, −1).

Therefore the maximum value of 2 occurs at x = ±v1, and the minimum of 0 occurs at x = ±v2.
37. Since λ₁ ≥ λ₂ ≥ · · · ≥ λₙ, we have

f(x) = λ₁y₁² + λ₂y₂² + · · · + λₙyₙ² ≥ λₙ(y₁² + y₂² + · · · + yₙ²) = λₙ‖y‖² = λₙ,

since we are assuming the constraint ‖y‖ = 1.
38. If qₙ is a unit eigenvector corresponding to λₙ, then Aqₙ = λₙqₙ, so that

f(qₙ) = qₙᵀAqₙ = qₙᵀ(λₙqₙ) = λₙ(qₙᵀqₙ) = λₙ.

Thus the quadratic form actually takes the value λₙ, so by property (a) of Theorem 5.23, it must be the minimum value of f(x), and it occurs when x = qₙ.
39. Divide this equation through by 25 to get

x²/25 + y²/5 = 1,

which is an ellipse, from Figure 5.15.
40. Divide through by 4 and rearrange terms to get

x²/4 − y²/4 = 1,

which is a hyperbola, from Figure 5.15.
41. Rearrange terms to get y = x² − 1, which is a parabola, from Figure 5.15.
42. Divide through by 8 and rearrange terms to get

y²/4 + x²/8 = 1,

which is an ellipse, from Figure 5.15.
43. Rearrange terms to get y² − 3x² = 1, which is a hyperbola, from Figure 5.15.
44. This is a parabola, from Figure 5.15.
45. Group the x and y terms and then complete the square, giving

(x² − 4x + 4) + (y² − 4y + 4) = −4 + 4 + 4  ⇒  (x − 2)² + (y − 2)² = 4.

This is a circle of radius 2 centered at (2, 2). Letting x′ = x − 2 and y′ = y − 2, we have (x′)² + (y′)² = 4.

[Graph: the circle of radius 2 centered at (2, 2), with the x′- and y′-axes through its center.]
46. Group the x and y terms and simplify:

(4x² − 8x) + (2y² + 12y) = −6  ⇒  4(x² − 2x) + 2(y² + 6y) = −6.

Now complete the square and simplify again:

4(x² − 2x + 1) + 2(y² + 6y + 9) = −6 + 4 + 18  ⇒  4(x − 1)² + 2(y + 3)² = 16  ⇒  (x − 1)²/4 + (y + 3)²/8 = 1.

This is an ellipse centered at (1, −3). Letting x′ = x − 1 and y′ = y + 3, we have (x′)²/4 + (y′)²/8 = 1.

[Graph: the ellipse centered at (1, −3), with the x′- and y′-axes through its center.]
47. Group the x and y terms to get 9x² − 4(y² + y) = 37. Now complete the square and simplify:

9x² − 4(y² + y + 1/4) = 37 − 1  ⇒  9x² − 4(y + 1/2)² = 36  ⇒  x²/4 − (y + 1/2)²/9 = 1.

This is a hyperbola centered at (0, −1/2). Letting x′ = x and y′ = y + 1/2, we have (x′)²/4 − (y′)²/9 = 1.

[Graph: the hyperbola centered at (0, −1/2), with the x′- and y′-axes through its center.]
48. Rearrange terms to get 3y − 13 = x² + 10x; then complete the square:

3y − 13 + 25 = x² + 10x + 25 = (x + 5)²  ⇒  3y + 12 = (x + 5)²  ⇒  y + 4 = (1/3)(x + 5)².

This is a parabola. Letting x′ = x + 5 and y′ = y + 4, we have y′ = (1/3)(x′)².

[Graph: the parabola with vertex (−5, −4), with the x′- and y′-axes through the vertex.]
49. Rearrange terms to get 4x = −2y² − 8y; then divide by 2, giving 2x = −y² − 4y. Finally, complete the square:

2x − 4 = −(y² + 4y + 4) = −(y + 2)²  ⇒  x − 2 = −(1/2)(y + 2)².

This is a parabola. Letting x′ = x − 2 and y′ = y + 2, we have x′ = −(1/2)(y′)².

[Graph: the parabola with vertex (2, −2), with the x′- and y′-axes through the vertex.]
50. Complete the square and simplify:

2y² − 3x² − 18x − 20y + 11 = 0  ⇒  2(y² − 10y + 25) − 3(x² + 6x + 9) + 11 − 50 + 27 = 0  ⇒  2(y − 5)² − 3(x + 3)² = 12  ⇒  (y − 5)²/6 − (x + 3)²/4 = 1.

This is a hyperbola. Letting x′ = x + 3 and y′ = y − 5, we have (y′)²/6 − (x′)²/4 = 1.

[Graph: the hyperbola centered at (−3, 5), with the x′- and y′-axes through its center.]
51. The matrix of this form is A = [[1, 1/2], [1/2, 1]], which has characteristic polynomial

(1 − λ)² − 1/4 = λ² − 2λ + 3/4 = (λ − 3/2)(λ − 1/2).

Therefore the associated diagonal matrix is D = diag(3/2, 1/2), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = (3/2)(x′)² + (1/2)(y′)² = 6.

Dividing through by 6 to simplify gives

(x′)²/4 + (y′)²/12 = 1,

which is an ellipse. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 3/2 and λ₂ = 1/2 by row-reduction to get

Q = [[1/√2, −1/√2], [1/√2, 1/√2]].

Then

e1 → Qe1 = (1/√2, 1/√2),  e2 → Qe2 = (−1/√2, 1/√2),

so that the axis rotation is through a 45° angle.

[Graph: the ellipse with the x′- and y′-axes rotated 45° from the x- and y-axes.]
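The rotation in Exercise 51 can be verified numerically: QᵀAQ should be diagonal, and any point on the rotated ellipse should satisfy the original equation. A sketch (our addition):

```python
import numpy as np

# Exercise 51: x^2 + xy + y^2 = 6 becomes (3/2)(x')^2 + (1/2)(y')^2 = 6.
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])
Q = np.array([[1, -1],
              [1, 1]]) / np.sqrt(2)
D = Q.T @ A @ Q                  # should be diag(3/2, 1/2)

# A point on (x')^2/4 + (y')^2/12 = 1 maps via x = Q x' back to the conic:
xp = np.array([2 * np.cos(0.7), np.sqrt(12) * np.sin(0.7)])
x = Q @ xp
value = x @ A @ x                # should equal 6
```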
52. The matrix of this form is A = [[4, 5], [5, 4]],
which has characteristic polynomial

(4 − λ)² − 25 = λ² − 8λ − 9 = (λ − 9)(λ + 1).

Therefore the associated diagonal matrix is D = diag(9, −1), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = 9(x′)² − (y′)² = 9.

Dividing through by 9 to simplify gives

(x′)² − (y′)²/9 = 1,

which is a hyperbola. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 9 and λ₂ = −1 by row-reduction to get

Q = [[1/√2, −1/√2], [1/√2, 1/√2]].

Then

e1 → Qe1 = (1/√2, 1/√2),  e2 → Qe2 = (−1/√2, 1/√2),

so that the axis rotation is through a 45° angle.

[Graph: the hyperbola with the x′- and y′-axes rotated 45° from the x- and y-axes.]
53. The matrix of this form is A = [[4, 3], [3, −4]], which has characteristic polynomial

(4 − λ)(−4 − λ) − 9 = λ² − 25 = (λ − 5)(λ + 5).

Therefore the associated diagonal matrix is D = diag(5, −5), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = 5(x′)² − 5(y′)² = 5.

Dividing through by 5 to simplify gives

(x′)² − (y′)² = 1,

which is a hyperbola. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 5 and λ₂ = −5 by row-reduction to get

Q = [[3/√10, −1/√10], [1/√10, 3/√10]].

Then

e1 → Qe1 = (3/√10, 1/√10),  e2 → Qe2 = (−1/√10, 3/√10),

so that the axis rotation is through an angle whose tangent is (1/√10)/(3/√10) = 1/3, or about 18.435°.

[Graph: the hyperbola with the x′- and y′-axes rotated about 18.4° from the x- and y-axes.]
54. The matrix of this form is A = [[3, −1], [−1, 3]], which has characteristic polynomial

(3 − λ)(3 − λ) − 1 = λ² − 6λ + 8 = (λ − 4)(λ − 2).

Therefore the associated diagonal matrix is D = diag(4, 2), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = 4(x′)² + 2(y′)² = 8.

Dividing through by 8 to simplify gives

(x′)²/2 + (y′)²/4 = 1,

which is an ellipse. To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 4 and λ₂ = 2 by row-reduction to get

Q = [[1/√2, 1/√2], [−1/√2, 1/√2]].

Then

e1 → Qe1 = (1/√2, −1/√2),  e2 → Qe2 = (1/√2, 1/√2),

so that the axis rotation is through a −45° angle.

[Graph: the ellipse with the x′- and y′-axes rotated −45° from the x- and y-axes.]
55. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[3, −2], [−2, 3]],  B = [−28√2  22√2],  C = 84.

A has characteristic polynomial

(3 − λ)(3 − λ) − 4 = λ² − 6λ + 5 = (λ − 5)(λ − 1).

Therefore the associated diagonal matrix is D = diag(5, 1). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 5 and λ₂ = 1 by row-reduction to get

Q = [[1/√2, 1/√2], [−1/√2, 1/√2]].

Thus

BQ = [−28√2  22√2] Q = [−50  −6],

so the equation becomes

5(x′)² − 50x′ + (y′)² − 6y′ = −84  ⇒  5((x′)² − 10x′ + 25) + ((y′)² − 6y′ + 9) = −84 + 125 + 9  ⇒  5(x′ − 5)² + (y′ − 3)² = 50  ⇒  (x′ − 5)²/10 + (y′ − 3)²/50 = 1.

Letting x″ = x′ − 5 and y″ = y′ − 3 gives

(x″)²/10 + (y″)²/50 = 1,

which is an ellipse.
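When linear terms are present, as in Exercise 55, the substitution x = Qx′ turns Bx into (BQ)x′; this step can be checked numerically (a sketch, our addition):

```python
import numpy as np

# Exercise 55: verify the rotated data Q^T A Q = diag(5, 1) and B Q = [-50, -6].
A = np.array([[3.0, -2],
              [-2, 3]])
B = np.array([-28 * np.sqrt(2), 22 * np.sqrt(2)])
Q = np.array([[1, 1],
              [-1, 1]]) / np.sqrt(2)

D = Q.T @ A @ Q                  # diagonal form of the quadratic part
BQ = B @ Q                       # linear part in the rotated coordinates
```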
56. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[6, −2], [−2, 9]],  B = [−20  −10],  C = −5.

A has characteristic polynomial

(6 − λ)(9 − λ) − 4 = λ² − 15λ + 50 = (λ − 10)(λ − 5).

Therefore the associated diagonal matrix is D = diag(10, 5). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 10 and λ₂ = 5 by row-reduction to get

Q = [[−1/√5, 2/√5], [2/√5, 1/√5]].

Thus

BQ = [−20  −10] Q = [0  −10√5],

so the equation becomes

10(x′)² + 5(y′)² − 10√5 y′ = 5  ⇒  10(x′)² + 5((y′)² − 2√5 y′ + 5) = 5 + 25  ⇒  10(x′)² + 5(y′ − √5)² = 30  ⇒  (x′)²/3 + (y′ − √5)²/6 = 1.

Letting x″ = x′ and y″ = y′ − √5 gives

(x″)²/3 + (y″)²/6 = 1,

which is an ellipse.
57. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[0, 1], [1, 0]],  B = [2√2  0],  C = −1.

A has characteristic polynomial

(−λ)(−λ) − 1 = λ² − 1 = (λ − 1)(λ + 1).

Therefore the associated diagonal matrix is D = diag(1, −1). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 1 and λ₂ = −1 by row-reduction to get

Q = [[1/√2, 1/√2], [1/√2, −1/√2]].

Thus

BQ = [2√2  0] Q = [2  2],

so the equation becomes

(x′)² − (y′)² + 2x′ + 2y′ = 1  ⇒  ((x′)² + 2x′ + 1) − ((y′)² − 2y′ + 1) = 1 + 1 − 1  ⇒  (x′ + 1)² − (y′ − 1)² = 1.

Letting x″ = x′ + 1 and y″ = y′ − 1 gives

(x″)² − (y″)² = 1,

which is a hyperbola.
58. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[1, −1], [−1, 1]],  B = [4√2  0],  C = −4.

A has characteristic polynomial

(1 − λ)(1 − λ) − 1 = λ² − 2λ = λ(λ − 2).

Therefore the associated diagonal matrix is D = diag(2, 0). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 2 and λ₂ = 0 by row-reduction to get

Q = [[1/√2, 1/√2], [−1/√2, 1/√2]].

Thus

BQ = [4√2  0] Q = [4  4],

so the equation becomes

2(x′)² + 4x′ + 4y′ = 4  ⇒  2y′ = −(x′)² − 2x′ + 2  ⇒  2y′ − 3 = −((x′)² + 2x′ + 1) = −(x′ + 1)²  ⇒  y′ − 3/2 = −(1/2)(x′ + 1)².

Letting x″ = x′ + 1 and y″ = y′ − 3/2 gives

y″ = −(1/2)(x″)²,

which is a parabola.
59. x² − y² = (x + y)(x − y) = 0 means that x = y or x = −y, so the graph of this degenerate conic is the pair of lines y = x and y = −x.

[Graph: the two lines y = x and y = −x.]
60. Rearranging terms gives x² + 2y² = −2, which has no solutions since the left-hand side is a sum of squares, so is always nonnegative. Therefore this is an imaginary conic.
61. The left-hand side is a sum of two squares, so it equals zero if and only if x = y = 0. The graph of this degenerate conic is the single point (0, 0).

[Graph: the single point at the origin.]
62. x² + 2xy + y² = (x + y)². Then (x + y)² = 0 is satisfied only when x + y = 0, so the graph of this degenerate conic is the line y = −x.

[Graph: the line y = −x.]
63. Rearranging terms gives

x² − 2xy + y² = 2√2(y − x), or (x − y)² = 2√2(y − x).

This is satisfied when y = x. If y ≠ x, then we can divide through by y − x to get

y − x = 2√2, or y = x + 2√2.

So the graph of this degenerate conic is the union of the lines y = x and y = x + 2√2.

[Graph: the parallel lines y = x and y = x + 2√2.]
64. Writing the equation as xᵀAx + Bx + C = 0, we have

A = [[2, 1], [1, 2]],  B = [2√2  −2√2],  C = 6.
A has characteristic polynomial

(2 − λ)(2 − λ) − 1 = λ² − 4λ + 3 = (λ − 3)(λ − 1).

Therefore the associated diagonal matrix is D = diag(3, 1). To determine the rotation matrix Q, find unit eigenvectors corresponding to λ₁ = 3 and λ₂ = 1 by row-reduction to get

Q = [[1/√2, −1/√2], [1/√2, 1/√2]].

Thus

BQ = [2√2  −2√2] Q = [0  −4],

so the equation becomes

3(x′)² + (y′)² − 4y′ = −6  ⇒  3(x′)² + ((y′)² − 4y′ + 4) = −6 + 4  ⇒  3(x′)² + (y′ − 2)² = −2.

The left-hand side is always nonnegative, so this equation has no solutions and this is an imaginary conic.
65. Let the eigenvalues of A be λ₁ and λ₂. Then there is an orthogonal matrix Q such that

QᵀAQ = D = [[λ₁, 0], [0, λ₂]].

Then by the Principal Axes Theorem, xᵀAx = (x′)ᵀDx′ = λ₁(x′)² + λ₂(y′)² = k. If k ≠ 0, then we may divide through by k to get the standard equation

(λ₁/k)(x′)² + (λ₂/k)(y′)² = 1.

(a) Since det A = det D = λ₁λ₂ < 0, the coefficients of (x′)² and (y′)² in the standard equation have opposite signs, so this curve is a hyperbola.
(b) Since det A = det D = λ₁λ₂ > 0, the coefficients of (x′)² and (y′)² in the standard equation have the same sign. If they are both positive, the curve is an ellipse; if they are positive and equal, then the ellipse is in fact a circle. If they are both negative, then the equation has no solutions, so is an imaginary conic.
(c) Since det A = det D = λ₁λ₂ = 0, at least one of the coefficients of (x′)² and (y′)² in the standard equation is zero. If the remaining coefficient is positive, say λ₁/k > 0, then we get the two parallel lines x′ = ±√(k/λ₁), while if it is zero or negative, the equation has no solutions and we get an imaginary conic.
(d) If k = 0 but det A = det D = λ₁λ₂ ≠ 0, then the equation is

λ₁(x′)² + λ₂(y′)² = 0,

and both λ₁ and λ₂ are nonzero. Then solving for y′ gives

y′ = ±√(−λ₁/λ₂) x′.

If λ₁ and λ₂ have the same sign, then this equation has only the solution x′ = y′ = 0, so the graph is just the origin. If they have opposite signs, then the graph is the two straight lines whose equations are above.
(e) If k = 0 and det A = det D = λ₁λ₂ = 0, then the equation is

λ₁(x′)² + λ₂(y′)² = 0,

and at least one of λ₁ and λ₂ is zero. If λ₁ = 0 (and λ₂ ≠ 0), we get y′ = 0, so the graph is the x′-axis, a straight line. Similarly, if λ₂ = 0, we get x′ = 0, so the graph is the y′-axis, also a straight line (if both λ₁ and λ₂ are zero, then A was the zero matrix to start, so it does not correspond to a quadratic form).
66. The matrix of this form is A = [[4, 2, 2], [2, 4, 2], [2, 2, 4]], which has characteristic polynomial

−λ³ + 12λ² − 36λ + 32 = −(λ − 2)²(λ − 8).

Therefore the associated diagonal matrix is D = diag(8, 2, 2), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = 8(x′)² + 2(y′)² + 2(z′)² = 8, or (x′)² + (y′)²/4 + (z′)²/4 = 1.

This is an ellipsoid.
67. The matrix of this form is A = [[1, 0, 0], [0, 1, −2], [0, −2, 1]], which has characteristic polynomial

−λ³ + 3λ² + λ − 3 = −(λ + 1)(λ − 1)(λ − 3).

Therefore the associated diagonal matrix is D = diag(−1, 1, 3), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = −(x′)² + (y′)² + 3(z′)² = 1,

which is a hyperboloid of one sheet.

68. The matrix of this form is A = [[−1, 2, 2], [2, −1, 2], [2, 2, −1]], which has characteristic polynomial

−λ³ − 3λ² + 9λ + 27 = −(λ + 3)²(λ − 3).
Therefore the associated diagonal matrix is D = diag(−3, −3, 3), so that the equation becomes

xᵀAx = (x′)ᵀDx′ = −3(x′)² − 3(y′)² + 3(z′)² = 12, or (x′)²/4 + (y′)²/4 − (z′)²/4 = −1.

This is a hyperboloid of two sheets.
69. Writing the form as xᵀAx + Bx + C = 0, we have

A = [[0, 1, 0], [1, 0, 0], [0, 0, 0]],  B = [0  0  1],  C = 0.

The characteristic polynomial of A is λ − λ³ = −λ(λ − 1)(λ + 1). Computing unit eigenvectors for each eigenspace gives

v₀ = (0, 0, 1),  v₁ = (1/√2, 1/√2, 0),  v₋₁ = (−1/√2, 1/√2, 0).

Then setting

Q = [[0, 1/√2, −1/√2], [0, 1/√2, 1/√2], [1, 0, 0]],  D = diag(0, 1, −1),

we get

BQ = [1  0  0].

Then the form becomes

xᵀAx + Bx + C = 0  ⇒  (x′)ᵀDx′ + BQx′ = 0  ⇒  (y′)² − (z′)² + x′ = 0  ⇒  x′ = (z′)² − (y′)².

This is a hyperbolic paraboloid.
70. Writing the form as xᵀAx + Bx + C = 0, we have

A = [[16, 0, −12], [0, 100, 0], [−12, 0, 9]],  B = [−60  0  −80],  C = 0.

The characteristic polynomial of A is

−λ³ + 125λ² − 2500λ = −λ(λ − 25)(λ − 100).

Computing unit eigenvectors for each eigenspace gives

v₀ = (3/5, 0, 4/5),  v₂₅ = (−4/5, 0, 3/5),  v₁₀₀ = (0, 1, 0).

Then setting

Q = [[3/5, −4/5, 0], [0, 0, 1], [4/5, 3/5, 0]],  D = diag(0, 25, 100),

we get

BQ = [−100  0  0].

Then the form becomes

xᵀAx + Bx + C = 0  ⇒  (x′)ᵀDx′ + BQx′ = 0  ⇒  25(y′)² + 100(z′)² − 100x′ = 0  ⇒  x′ = (y′)²/4 + (z′)².

This is an elliptic paraboloid.
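The three-variable reductions, like Exercise 70's, follow the same pattern as the conics: QᵀAQ diagonalizes the quadratic part and BQ transforms the linear part. A numerical sketch (our addition):

```python
import numpy as np

# Exercise 70: check Q^T A Q = diag(0, 25, 100) and B Q = [-100, 0, 0].
A = np.array([[16.0, 0, -12],
              [0, 100, 0],
              [-12, 0, 9]])
B = np.array([-60.0, 0, -80])
Q = np.array([[3/5, -4/5, 0],
              [0,    0,  1],
              [4/5,  3/5, 0]])

D = Q.T @ A @ Q
BQ = B @ Q
```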
71. Writing the form as xᵀAx + Bx + C = 0, we have

A = [[1, 2, −1], [2, 1, 1], [−1, 1, −2]],  B = [−1  1  1],  C = 0.

The characteristic polynomial of A is

−λ³ + 9λ = −λ(λ − 3)(λ + 3).

Computing unit eigenvectors for each eigenspace gives

v₀ = (−1/√3, 1/√3, 1/√3),  v₃ = (1/√2, 1/√2, 0),  v₋₃ = (1/√6, −1/√6, 2/√6).

Then setting

Q = [[−1/√3, 1/√2, 1/√6], [1/√3, 1/√2, −1/√6], [1/√3, 0, 2/√6]],  D = diag(0, 3, −3),

we get

BQ = [√3  0  0].

Then the form becomes

xᵀAx + Bx + C = 0  ⇒  (x′)ᵀDx′ + BQx′ = 0  ⇒  3(y′)² − 3(z′)² + √3 x′ = 0  ⇒  x′ = √3(z′)² − √3(y′)².

This is a hyperbolic paraboloid.
72. Writing the form as xᵀAx + Bx = 15, we have

A = [[10, 0, −20], [0, 25, 0], [−20, 0, 10]],  B = [20√2  50  20√2].

The characteristic polynomial of A is

−λ³ + 45λ² − 200λ − 7500 = −(λ + 10)(λ − 25)(λ − 30).

Computing unit eigenvectors for each eigenspace gives

v₋₁₀ = (1/√2, 0, 1/√2),  v₂₅ = (0, 1, 0),  v₃₀ = (−1/√2, 0, 1/√2).
Then setting

Q = [[1/√2, 0, −1/√2], [0, 1, 0], [1/√2, 0, 1/√2]],  D = diag(−10, 25, 30),

we get

BQ = [40  50  0].

Then the form becomes

xᵀAx + Bx = 15  ⇒  (x′)ᵀDx′ + BQx′ = 15  ⇒  −10(x′)² + 25(y′)² + 30(z′)² + 40x′ + 50y′ = 15  ⇒  −10((x′)² − 4x′ + 4) + 25((y′)² + 2y′ + 1) + 30(z′)² = 15 − 40 + 25  ⇒  −10(x′ − 2)² + 25(y′ + 1)² + 30(z′)² = 0  ⇒  (x′ − 2)² = (y′ + 1)²/(2/5) + (z′)²/(1/3).

Letting x″ = x′ − 2, y″ = y′ + 1, z″ = z′ gives

(x″)² = (y″)²/(2/5) + (z″)²/(1/3).

This is an elliptic cone.
73. Writing the form as xᵀAx + Bx = 6, we have

A = [[11, 1, 4], [1, 11, −4], [4, −4, 14]],  B = [−12  12  12].

The characteristic polynomial of A is

−λ³ + 36λ² − 396λ + 1296 = −(λ − 6)(λ − 12)(λ − 18).

Computing unit eigenvectors for each eigenspace gives

v₆ = (−1/√3, 1/√3, 1/√3),  v₁₂ = (1/√2, 1/√2, 0),  v₁₈ = (1/√6, −1/√6, 2/√6).

Then setting

Q = [[−1/√3, 1/√2, 1/√6], [1/√3, 1/√2, −1/√6], [1/√3, 0, 2/√6]],  D = diag(6, 12, 18),

we get

BQ = [12√3  0  0].

Then the form becomes

xᵀAx + Bx = 6  ⇒  (x′)ᵀDx′ + BQx′ = 6  ⇒  6(x′)² + 12(y′)² + 18(z′)² + 12√3 x′ = 6  ⇒  6((x′)² + 2√3 x′ + 3) + 12(y′)² + 18(z′)² = 6 + 18  ⇒  6(x′ + √3)² + 12(y′)² + 18(z′)² = 24  ⇒  (x′ + √3)²/4 + (y′)²/2 + (z′)²/(4/3) = 1.

Letting x″ = x′ + √3, y″ = y′, z″ = z′ gives

(x″)²/4 + (y″)²/2 + (z″)²/(4/3) = 1.

This is an ellipsoid.
74. The strategy is to reduce the problem to one in which we can apply part (b) of Exercise 65. Using the hint, let v be an eigenvector corresponding to λ = a − bi, and let P = [Re v  Im v]. Set B = (PPᵀ)⁻¹. Now, B is a real matrix; it is also symmetric since

Bᵀ = ((PPᵀ)⁻¹)ᵀ = ((PPᵀ)ᵀ)⁻¹ = (PPᵀ)⁻¹ = B.

Also,

det B = det((PPᵀ)⁻¹) = 1/det(PPᵀ) = 1/(det P · det Pᵀ) = 1/(det P)² > 0.

Thus Exercise 65(b) applies to B, so that xᵀBx = k is an ellipse if both eigenvalues of B are positive (since k > 0). But by Exercise 29 in this section, B⁻¹ = PPᵀ = (Pᵀ)ᵀ(Pᵀ) is positive definite, so that B is as well (Exercise 31(d)), and therefore has positive eigenvalues. Hence the level curves xᵀBx = k are ellipses.

Next, let

C = [[a, −b], [b, a]],

so that by Theorem 4.43, A = PCP⁻¹. Note that

CᵀC = [[a, b], [−b, a]][[a, −b], [b, a]] = [[a² + b², 0], [0, a² + b²]] = |λ|² I = I,

since |λ| = 1. Now,

(Ax)ᵀB(Ax) = xᵀAᵀ(PPᵀ)⁻¹Ax
= xᵀ(PCP⁻¹)ᵀ(PPᵀ)⁻¹(PCP⁻¹)x
= xᵀ(P⁻¹)ᵀCᵀPᵀ(Pᵀ)⁻¹P⁻¹PCP⁻¹x
= xᵀ(P⁻¹)ᵀCᵀCP⁻¹x
= xᵀ(P⁻¹)ᵀP⁻¹x
= xᵀ(PPᵀ)⁻¹x = xᵀBx.

Thus the quadratics xᵀBx = k and (Ax)ᵀB(Ax) = k are the same, so that the trajectories of the dynamical system x ↦ Ax lie on ellipses as well.
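The invariance argument of Exercise 74 can be illustrated numerically for a particular choice of P and λ with |λ| = 1 (the specific P below is our own example, not from the text):

```python
import numpy as np

# A = P C P^{-1} with C a rotation (a^2 + b^2 = 1, so |lambda| = 1).
# Then x^T B x with B = (P P^T)^{-1} is unchanged by x -> Ax,
# so every trajectory stays on an ellipse x^T B x = k.
a, b = 0.6, 0.8                      # a^2 + b^2 = 1
C = np.array([[a, -b], [b, a]])
P = np.array([[2.0, 1], [1, 1]])     # any invertible real matrix
A = P @ C @ np.linalg.inv(P)
Bmat = np.linalg.inv(P @ P.T)        # the matrix B of the invariant form

x = np.array([1.0, 3.0])
before = x @ Bmat @ x
after = (A @ x) @ Bmat @ (A @ x)     # should equal `before`
```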
Chapter Review
1. (a) True. See Theorem 5.1 and the definition preceding it.
(b) True. See Theorem 5.15; the Gram-Schmidt Process allows us to turn any basis into an orthonormal basis, showing that any subspace has such a basis.
(c) True. See the definition preceding Theorem 5.5 in Section 5.1.
(d) True. See Theorem 5.5.
(e) False. If Q is an orthogonal matrix, then |det Q| = 1. However, there are many matrices with determinant 1 that are not orthogonal. For example,

[[1, 1], [0, 1]],  [[2, 5], [1, 3]].

(f) True. By Theorem 5.10, (row(A))⊥ = null(A). But if (row(A))⊥ = Rⁿ, then null(A) = Rⁿ, so that Ax = 0 for every x, and therefore A = O (since Aeᵢ = 0 for every i, showing that every column of A is zero).
(g) False. For example, let W be the subset of R² spanned by e1, and let v = e2. Then projW(v) = 0, but v ≠ 0.
(h) True. If A is orthogonal, then A⁻¹ = Aᵀ; but if A is also symmetric, then Aᵀ = A. Thus A⁻¹ = A, so that A² = AA⁻¹ = I.
(i) False. For example, any diagonal matrix is orthogonally diagonalizable, since it is already diagonal. But if it has a zero entry on the diagonal, it is not invertible.
(j) True. For example, the diagonal matrix with λ₁, λ₂, . . . , λₙ on the diagonal has this property.
2. The first pair of vectors is already orthogonal, since 1 · 4 + 2 · 1 + 3 · (−2) = 0. The other two pairs are orthogonal if and only if

1 · a + 2 · b + 3 · 3 = 0  ⇒  a + 2b = −9
4 · a + 1 · b − 2 · 3 = 0  ⇒  4a + b = 6.

Forming and row-reducing the augmented matrix gives

[1 2 | −9; 4 1 | 6] → [1 0 | 3; 0 1 | −6].

So the unique solution is a = 3 and b = −6; these are the only values of a and b for which these vectors form an orthogonal set.
3. Let u1 = (1, 0, 1), u2 = (1, 1, −1), and u3 = (−1, 2, 1). Note that these vectors form an orthogonal set. Therefore using Theorem 5.2, we write

v = c1 u1 + c2 u2 + c3 u3
= ((v · u1)/(u1 · u1)) u1 + ((v · u2)/(u2 · u2)) u2 + ((v · u3)/(u3 · u3)) u3
= ((7 · 1 − 3 · 0 + 2 · 1)/(1 · 1 + 0 · 0 + 1 · 1)) u1 + ((7 · 1 − 3 · 1 + 2 · (−1))/(1 · 1 + 1 · 1 − 1 · (−1))) u2 + ((7 · (−1) − 3 · 2 + 2 · 1)/(−1 · (−1) + 2 · 2 + 1 · 1)) u3
= (9/2) u1 + (2/3) u2 − (11/6) u3.

So the coordinate vector is

[v]_B = (9/2, 2/3, −11/6).
4. We first find v2 . Since v1 and v2 =
x
are orthogonal, we have
y
3
4
x + y = 0,
5
5
4
or x = − y.
3
25 2
3
4
2
2
But also v2 is a unit vector, so that x2 + y 2 = 16
9 y + y = 9 y = 1. Therefore y = ± 5 and x = ∓ 5 .
h
i
• If v2 = − 45 35 , then
• If v2 =
h
4
5
1
v = −3v1 + v2 = −3
2
" #
3
5
4
5
" # "
#
− 11
1 − 54
5
+
=
.
3
2
− 21
5
10
1
v = −3v1 + v2 = −3
2
" #
#
" # "
4
− 57
1
5
+
=
.
2 − 35
− 27
10
i
− 35 , then
3
5
4
5
5. Using Theorem 5.5, we show that the matrix is orthogonal by showing that QQᵀ = I:

QQᵀ = [[6/7, 3/7, 2/7], [−1/√5, 2/√5, 0], [4/(7√5), 2/(7√5), −15/(7√5)]] · [[6/7, −1/√5, 4/(7√5)], [3/7, 2/√5, 2/(7√5)], [2/7, 0, −15/(7√5)]] = [[1, 0, 0], [0, 1, 0], [0, 0, 1]] = I.
6. Let Q be the given matrix. If Q is to be orthogonal, then by Theorem 5.5, we must have

QQᵀ = [[a, 1/2], [b, c]] [[a, b], [1/2, c]] = [[a² + 1/4, ab + c/2], [ab + c/2, b² + c²]] = [[1, 0], [0, 1]].

Therefore we must have

a² + 1/4 = 1,  ab + (1/2)c = 0,  b² + c² = 1.

The first equation gives a = ±√3/2. Substituting this into the second equation gives

±(√3/2)b + (1/2)c = 0  ⇒  c = ∓√3 b  ⇒  c² = 3b².

But b² + c² = 1 and c² = 3b² means that 4b² = 1, so that b = ±1/2. So there are two possibilities:

• If a = √3/2, then c = −√3 b, so we get the two matrices

[[√3/2, 1/2], [1/2, −√3/2]],  [[√3/2, 1/2], [−1/2, √3/2]].

• If a = −√3/2, then c = √3 b, so we get the two matrices

[[−√3/2, 1/2], [1/2, √3/2]],  [[−√3/2, 1/2], [−1/2, −√3/2]].
7. We must show that Qvᵢ · Qvⱼ = 0 for i ≠ j and that Qvᵢ · Qvᵢ = 1. Since Q is orthogonal, we know that QQᵀ = QᵀQ = I. Then for any i and j (possibly equal) we have

Qvᵢ · Qvⱼ = (Qvᵢ)ᵀQvⱼ = vᵢᵀ(QᵀQ)vⱼ = vᵢᵀvⱼ = vᵢ · vⱼ.

Thus if i ≠ j, the dot product is zero since vᵢ · vⱼ = 0 for i ≠ j, and if i = j, then the dot product is 1 since vᵢ · vᵢ = 1.
8. The given condition means that for all vectors x and y,

(x · y)/(‖x‖ ‖y‖) = (Qx · Qy)/(‖Qx‖ ‖Qy‖).

Let qᵢ be the ith column of Q. Then qᵢ = Qeᵢ. Since the eᵢ are orthonormal and the qᵢ are unit vectors, we have

qᵢ · qⱼ = (qᵢ · qⱼ)/(‖qᵢ‖ ‖qⱼ‖) = (eᵢ · eⱼ)/(‖eᵢ‖ ‖eⱼ‖) = eᵢ · eⱼ = 0 if i ≠ j, and 1 if i = j.

It follows that Q is an orthogonal matrix.
9. W is the column space of A = [5; 2], so that W⊥ is the null space of Aᵀ = [5  2]. Row-reduce the augmented matrix:

[5 2 | 0] → [1 2/5 | 0],

so that a basis for the null space, and thus for W⊥, is given by

(−2/5, 1), or (−2, 5).

10. W is the column space of A = [1; 2; −1], so that W⊥ is the null space of Aᵀ = [1  2  −1]. The augmented matrix is already row-reduced:

[1 2 −1 | 0],

so that the null space, which is W⊥, is the set

{(−2s + t, s, t)} = {s(−2, 1, 0) + t(1, 0, 1)} = span((−2, 1, 0), (1, 0, 1)).

These two vectors form a basis for W⊥.
11. Let A be the matrix whose columns are the given basis of W. Then by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). Row-reducing Aᵀ, we get

[1 −1 4 | 0; 0 1 −3 | 0] → [1 0 1 | 0; 0 1 −3 | 0],

so that a basis for W⊥ is

(−1, 3, 1).
12. Let A be the matrix whose columns are the given basis of W. Then by Theorem 5.10, W⊥ = (col(A))⊥ = null(Aᵀ). Row-reducing Aᵀ gives

[1 0 1 3/2 | 0; 0 1 0 −1/2 | 0],

so that

W⊥ = {(−s − (3/2)t, (1/2)t, s, t)} = {s(−1, 0, 1, 0) + (t/2)(−3, 1, 0, 2)} = span((−1, 0, 1, 0), (−3, 1, 0, 2)).
13. Row-reducing A gives

   A → [1 0 2 3 4; 0 1 0 2 1; 0 0 0 0 0; 0 0 0 0 0],

so that a basis for row(A) is given by the nonzero rows of the reduced matrix, or

   [1 0 2 3 4], [0 1 0 2 1].

A basis for col(A) is given by the columns of A corresponding to leading 1s in the reduced matrix, so one basis is

   (1, −1, 2, 3), (−1, 2, 1, −5).

A basis for null(A) is given by the solution set of Ax = 0, which, from the reduced form, is

   {(−2s − 3t − 4u, −2t − u, s, t, u)} = span{(−2, 0, 1, 0, 0), (−3, −2, 0, 1, 0), (−4, −1, 0, 0, 1)}.

Finally, to determine null(AT), we row-reduce AT:

   [1 −1 2 3; −1 2 1 −5; 2 −2 4 6; 1 1 8 −1; 3 −2 9 7] → [1 0 5 1; 0 1 3 −2; 0 0 0 0; 0 0 0 0; 0 0 0 0].

Then

   null(AT) = {(−5s − t, −3s + 2t, s, t)} = span{(−5, −3, 1, 0), (−1, 2, 0, 1)}.
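The four claimed bases can be cross-checked by direct computation. The sketch below (not part of the printed solution; it uses exact rational arithmetic from Python's `fractions` module, and the matrix A reconstructed from this exercise) verifies that each claimed basis vector of null(A) and null(AT) really is annihilated by A or AT:

```python
from fractions import Fraction

# The matrix of Exercise 13 (as reconstructed from the solution's row reductions).
A = [[ 1, -1,  2,  1,  3],
     [-1,  2, -2,  1, -2],
     [ 2,  1,  4,  8,  9],
     [ 3, -5,  6, -1,  7]]

def mat_vec(M, v):
    # Exact matrix-vector product using rational arithmetic.
    return [sum(Fraction(M[i][j]) * v[j] for j in range(len(v)))
            for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

# Claimed bases: every v here should satisfy A v = 0, every w should satisfy A^T w = 0.
null_A  = [[-2, 0, 1, 0, 0], [-3, -2, 0, 1, 0], [-4, -1, 0, 0, 1]]
null_AT = [[-5, -3, 1, 0], [-1, 2, 0, 1]]

ok = all(all(x == 0 for x in mat_vec(A, v)) for v in null_A) and \
     all(all(x == 0 for x in mat_vec(transpose(A), w)) for w in null_AT)
print(ok)  # True
```

Since rank(A) = 2, the three vectors for null(A) and the two for null(AT) have the right counts to be bases once they are checked to be independent, which is clear from the positions of their trailing 1s.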
14. The orthogonal decomposition of v with respect to W is
v = projW (v) + (v − projW (v)),
446
CHAPTER 5. ORTHOGONALITY
so we must find projW(v). Let the three given vectors in W be w1, w2, and w3. Then

   projW(v) = [(w1 · v)/(w1 · w1)] w1 + [(w2 · v)/(w2 · w2)] w2 + [(w3 · v)/(w3 · w3)] w3
            = [(0·1 + 1·0 + 1·(−1) + 1·2)/(0·0 + 1·1 + 1·1 + 1·1)] (0, 1, 1, 1)
              + [(1·1 + 0·0 + 1·(−1) − 1·2)/(1·1 + 0·0 + 1·1 − 1·(−1))] (1, 0, 1, −1)
              + [(3·1 + 1·0 − 2·(−1) + 1·2)/(3·3 + 1·1 − 2·(−2) + 1·1)] (3, 1, −2, 1)
            = (1/3)(0, 1, 1, 1) − (2/3)(1, 0, 1, −1) + (7/15)(3, 1, −2, 1)
            = (11/15, 4/5, −19/15, 22/15).

Then

   v − projW(v) = (1, 0, −1, 2) − (11/15, 4/5, −19/15, 22/15) = (4/15, −4/5, 4/15, 8/15).
15. (a) Using Gram-Schmidt, we get

   v1 = x1 = (1, 1, 1, 1)

   v2 = x2 − [(x2 · v1)/(v1 · v1)] v1
      = (1, 1, 1, 0) − [(1·1 + 1·1 + 1·1 + 0·1)/(1·1 + 1·1 + 1·1 + 1·1)] (1, 1, 1, 1)
      = (1, 1, 1, 0) − (3/4)(1, 1, 1, 1) = (1/4, 1/4, 1/4, −3/4)

   v3 = x3 − [(x3 · v1)/(v1 · v1)] v1 − [(x3 · v2)/(v2 · v2)] v2
      = (0, 1, 1, 1) − (3/4)(1, 1, 1, 1) − [(−1/4)/(3/4)] (1/4, 1/4, 1/4, −3/4)
      = (0, 1, 1, 1) − (3/4)(1, 1, 1, 1) + (1/3)(1/4, 1/4, 1/4, −3/4)
      = (−2/3, 1/3, 1/3, 0).
(b) To find a QR factorization, we normalize the vectors from (a) and use those as the column vectors of Q. Then we set R = QT A. Normalizing v1, v2, and v3 yields

   q1 = v1/‖v1‖ = (1/2)(1, 1, 1, 1) = (1/2, 1/2, 1/2, 1/2)
   q2 = v2/‖v2‖ = (2/√3)(1/4, 1/4, 1/4, −3/4) = (√3/6, √3/6, √3/6, −√3/2)
   q3 = v3/‖v3‖ = (3/√6)(−2/3, 1/3, 1/3, 0) = (−√6/3, √6/6, √6/6, 0).

Then

   Q = [1/2 √3/6 −√6/3; 1/2 √3/6 √6/6; 1/2 √3/6 √6/6; 1/2 −√3/2 0],

and

   R = QT A = QT [1 1 0; 1 1 1; 1 1 1; 1 0 1] = [2 3/2 3/2; 0 √3/2 −√3/6; 0 0 √6/3].

Then A = QR is a QR factorization of A.
16. Note that the two given vectors,

   v1 = (1, 0, 2, 2) and v2 = (0, 1, 1, −1),

are indeed orthogonal. We extend these to a basis by adjoining x3 = e3 and x4 = e4. We can verify that this is a basis by computing the determinant of the matrix whose column vectors are these four vectors; this determinant is 1, so the matrix is invertible and therefore the vectors are linearly independent, so they do in fact
form a basis for R4. We apply Gram-Schmidt to x3 and x4:

   v3 = x3 − [(x3 · v1)/(v1 · v1)] v1 − [(x3 · v2)/(v2 · v2)] v2
      = (0, 0, 1, 0) − [(0·1 + 0·0 + 1·2 + 0·2)/(1² + 0² + 2² + 2²)] (1, 0, 2, 2)
        − [(0·0 + 0·1 + 1·1 + 0·(−1))/(0² + 1² + 1² + (−1)²)] (0, 1, 1, −1)
      = (0, 0, 1, 0) − (2/9)(1, 0, 2, 2) − (1/3)(0, 1, 1, −1)
      = (−2/9, −1/3, 2/9, −1/9)

   v4 = x4 − [(x4 · v1)/(v1 · v1)] v1 − [(x4 · v2)/(v2 · v2)] v2 − [(x4 · v3)/(v3 · v3)] v3
      = (0, 0, 0, 1) − (2/9)(1, 0, 2, 2) + (1/3)(0, 1, 1, −1) + (1/2)(−2/9, −1/3, 2/9, −1/9)
      = (−1/3, 1/6, 0, 1/6).
17. If x1 + x2 + x3 + x4 = 0, then x4 = −x1 − x2 − x3, so that a basis for W is given by

   B = {x1 = (1, 0, 0, −1), x2 = (0, 1, 0, −1), x3 = (0, 0, 1, −1)}.

Then using Gram-Schmidt,

   v1 = x1 = (1, 0, 0, −1)

   v2 = x2 − [(x2 · v1)/(v1 · v1)] v1
      = (0, 1, 0, −1) − [(1·0 + 0·1 + 0·0 + (−1)·(−1))/(1² + 0² + 0² + (−1)²)] (1, 0, 0, −1)
      = (0, 1, 0, −1) − (1/2)(1, 0, 0, −1) = (−1/2, 1, 0, −1/2)
   v3 = x3 − [(x3 · v1)/(v1 · v1)] v1 − [(x3 · v2)/(v2 · v2)] v2
      = (0, 0, 1, −1) − (1/2)(1, 0, 0, −1) − [(1/2)/(3/2)] (−1/2, 1, 0, −1/2)
      = (0, 0, 1, −1) − (1/2)(1, 0, 0, −1) − (1/3)(−1/2, 1, 0, −1/2)
      = (−1/3, −1/3, 1, −1/3).
18. (a) The characteristic polynomial of A is

   det(A − λI) = |2−λ 1 −1; 1 2−λ 1; −1 1 2−λ| = −λ³ + 6λ² − 9λ = −λ(λ − 3)²,

so the eigenvalues are 0 and 3. To find the eigenspaces, we row-reduce [A − λI | 0]:

   [A − 0I | 0] = [2 1 −1 | 0; 1 2 1 | 0; −1 1 2 | 0] → [1 0 −1 | 0; 0 1 1 | 0; 0 0 0 | 0]
   [A − 3I | 0] = [−1 1 −1 | 0; 1 −1 1 | 0; −1 1 −1 | 0] → [1 −1 1 | 0; 0 0 0 | 0; 0 0 0 | 0].

Thus the eigenspaces are

   E0 = span{v1 = (1, −1, 1)},   E3 = span{v2 = (1, 1, 0), x3 = (−1, 0, 1)}.

The vectors in E0 are orthogonal to those in E3 since they correspond to different eigenvalues, but the basis vectors above in E3 are not orthogonal, so we use Gram-Schmidt to convert the second basis vector into one orthogonal to the first:

   v3 = (−1, 0, 1) − [(1·(−1) + 1·0 + 0·1)/(1·1 + 1·1 + 0·0)] (1, 1, 0) = (−1/2, 1/2, 1).

Finally, normalize v1, v2, and v3 and let Q be the matrix whose columns are those vectors; then Q is the orthogonal matrix

   Q = [1/√3 1/√2 −1/√6; −1/√3 1/√2 1/√6; 1/√3 0 2/√6].

Then an orthogonal diagonalization of A is given by

   QT AQ = D = [0 0 0; 0 3 0; 0 0 3].
(b) From part (a), we have λ1 = 0 and λ2 = λ3 = 3, with

   q1 = (1/√3, −1/√3, 1/√3),   q2 = (1/√2, 1/√2, 0),   q3 = (−1/√6, 1/√6, 2/√6).

The spectral decomposition of A is

   A = λ1 q1 q1T + λ2 q2 q2T + λ3 q3 q3T
     = 3 q2 q2T + 3 q3 q3T
     = 3 [1/2 1/2 0; 1/2 1/2 0; 0 0 0] + 3 [1/6 −1/6 −1/3; −1/6 1/6 1/3; −1/3 1/3 2/3].
19. See Example 5.20, or Exercises 23 and 24 in Section 5.4. The basis vector for E−2, which we name v3 = (1, −1, 0), is orthogonal to the other two since they correspond to different eigenvalues. However, we must orthogonalize the given basis of the other eigenspace using Gram-Schmidt:

   v1 = (1, 1, 0),   v2 = (1, 1, 1) − [(1·1 + 1·1 + 1·0)/(1² + 1² + 0²)] (1, 1, 0) = (0, 0, 1).

Now v1, v2, and v3 are mutually orthogonal, with norms √2, 1, and √2 respectively. Let Q be the matrix whose columns are q1 = v1/‖v1‖, q2 = v2/‖v2‖, and q3 = v3/‖v3‖; then the desired matrix is

   Q [1 0 0; 0 1 0; 0 0 −2] QT
     = [1/√2 0 1/√2; 1/√2 0 −1/√2; 0 1 0] [1 0 0; 0 1 0; 0 0 −2] [1/√2 1/√2 0; 0 0 1; 1/√2 −1/√2 0]
     = [−1/2 3/2 0; 3/2 −1/2 0; 0 0 1].

20. vi viT is symmetric since

   (vi viT)T = (viT)T viT = vi viT,
so it is equal to its own transpose. Since a scalar times a symmetric matrix is a symmetric matrix, and
the sum of symmetric matrices is symmetric, it follows that A is symmetric. Next,
   Avj = (Σⁿᵢ₌₁ ci vi viT) vj = Σⁿᵢ₌₁ ci vi (viT vj) = Σⁿᵢ₌₁ ci vi (vi · vj) = cj vj
since {vi } is a set of orthonormal vectors. This shows that each cj is an eigenvalue for A with
corresponding eigenvector vj . Since there are n of them, these are all the eigenvalues and eigenvectors.
Chapter 6
Vector Spaces
6.1 Vector Spaces and Subspaces
1. Let V = {(x, x)}; V is the set of all vectors in R² whose first and second components are the same. We verify all ten axioms of a vector space:

1. If (x, x), (y, y) ∈ V then (x, x) + (y, y) = (x + y, x + y) ∈ V since its first and second components are the same.
2. (x, x) + (y, y) = (x + y, x + y) = (y + x, y + x) = (y, y) + (x, x).
3. ((x, x) + (y, y)) + (z, z) = (x + y, x + y) + (z, z) = (x + y + z, x + y + z) = (x, x) + (y + z, y + z) = (x, x) + ((y, y) + (z, z)).
4. (0, 0) ∈ V, and (x, x) + (0, 0) = (x + 0, x + 0) = (x, x) ∈ V.
5. If (x, x) ∈ V, then (−x, −x) ∈ V, and (x, x) + (−x, −x) = (x − x, x − x) = (0, 0).
6. If (x, x) ∈ V, then c(x, x) = (cx, cx) ∈ V since its first and second components are the same.
7. c((x, x) + (y, y)) = c(x + y, x + y) = (c(x + y), c(x + y)) = (cx + cy, cx + cy) = c(x, x) + c(y, y).
8. (c + d)(x, x) = ((c + d)x, (c + d)x) = (cx + dx, cx + dx) = c(x, x) + d(x, x).
9. c(d(x, x)) = c(dx, dx) = (c(dx), c(dx)) = ((cd)x, (cd)x) = (cd)(x, x).
10. 1(x, x) = (1x, 1x) = (x, x).

2. V = {(x, y) : x, y ≥ 0} fails to satisfy Axioms 5 and 6, so it is not a vector space:

5. If (x, y) ∈ V is nonzero, then −(x, y) = (−x, −y) ∉ V since at least one of −x and −y is negative.
6. If (x, y) ∈ V is nonzero, choose c < 0; then c(x, y) = (cx, cy) ∉ V since at least one of cx and cy is negative.
3. V = {(x, y) : xy ≥ 0} fails to satisfy Axiom 1, so it is not a vector space:

1. Choose x ≠ y with xy ≥ 0. Then (x, y) and (−y, −x) are both in V, but

   (x, y) + (−y, −x) = (x − y, y − x) = (x − y, −(x − y)) ∉ V

since (x − y)(−(x − y)) = −(x − y)² < 0 because x ≠ y.

4. V = {(x, y) : x ≥ y} fails to satisfy Axioms 5 and 6, so it is not a vector space:

5. If (x, y) ∈ V and x ≠ y, then x > y, so that −x < −y. Therefore (−x, −y) ∉ V.
6. Again let (x, y) ∈ V with x ≠ y, and choose c = −1. Then c(x, y) = (−x, −y) ∉ V as above.
5. R² with the given scalar multiplication fails to satisfy Axiom 8, so it is not a vector space:

8. Choose (x, y) ∈ R² with y ≠ 0, and let c and d be nonzero scalars. Then

   (c + d)(x, y) = ((c + d)x, y) = (cx + dx, y), while c(x, y) + d(x, y) = (cx, y) + (dx, y) = (cx + dx, 2y).

These two are unequal since we chose y ≠ 0.
6. R² with the given addition fails to satisfy Axiom 7, so it is not a vector space:

7. Choose (x1, y1), (x2, y2) ∈ R² and let c be a scalar not equal to 1. Then

   c((x1, y1) + (x2, y2)) = c(x1 + x2 + 1, y1 + y2 + 1) = (cx1 + cx2 + c, cy1 + cy2 + c), while
   c(x1, y1) + c(x2, y2) = (cx1, cy1) + (cx2, cy2) = (cx1 + cx2 + 1, cy1 + cy2 + 1).

These two are unequal since we chose c ≠ 1.
7. All axioms hold, so this set is a vector space. Let x, y, and z be positive real numbers, so that x, y, z ∈ V, and let c and d be any real numbers. Then

1. x ⊕ y = xy > 0, so that x ⊕ y ∈ V.
2. x ⊕ y = xy = yx = y ⊕ x.
3. (x ⊕ y) ⊕ z = (xy) ⊕ z = (xy)z = x(yz) = x ⊕ (yz) = x ⊕ (y ⊕ z).
4. 0 = 1, since x ⊕ 0 = x · 1 = x.
5. −x is the positive real number 1/x, since x ⊕ (−x) = x · (1/x) = 1 = 0.
6. c ⊙ x = x^c > 0, so that c ⊙ x ∈ V.
7. c ⊙ (x ⊕ y) = (xy)^c = x^c y^c = x^c ⊕ y^c = (c ⊙ x) ⊕ (c ⊙ y).
8. (c + d) ⊙ x = x^(c+d) = x^c x^d = x^c ⊕ x^d = (c ⊙ x) ⊕ (d ⊙ x).
9. c ⊙ (d ⊙ x) = c ⊙ x^d = (x^d)^c = x^(cd) = (cd) ⊙ x.
10. 1 ⊙ x = x¹ = x.
8. This is not a vector space, since Axiom 6 fails. Choose any nonzero rational number q, and let c = π. Then qπ is not rational, so that V is not closed under scalar multiplication.
9. All axioms hold, so this is a vector space. Let x, y, z, u, v, w, c, and d be real numbers. Then [x y; 0 z] and [u v; 0 w] ∈ V, and
1. The sum of upper triangular matrices is again upper triangular.
2. Matrix addition is commutative.
3. Matrix addition is associative.
4. The zero matrix O is upper triangular, and [x y; 0 z] + O = [x y; 0 z].
5. −[x y; 0 z] is the matrix [−x −y; 0 −z].
6. c[x y; 0 z] = [cx cy; 0 cz] ∈ V since it is upper triangular.
7. c([x y; 0 z] + [u v; 0 w]) = c[x+u y+v; 0 z+w] = [c(x+u) c(y+v); 0 c(z+w)] = [cx+cu cy+cv; 0 cz+cw] = [cx cy; 0 cz] + [cu cv; 0 cw] = c[x y; 0 z] + c[u v; 0 w].
8. (c + d)[x y; 0 z] = [(c+d)x (c+d)y; 0 (c+d)z] = [cx+dx cy+dy; 0 cz+dz] = [cx cy; 0 cz] + [dx dy; 0 dz] = c[x y; 0 z] + d[x y; 0 z].
9. c(d[x y; 0 z]) = c[dx dy; 0 dz] = [c(dx) c(dy); 0 c(dz)] = [(cd)x (cd)y; 0 (cd)z] = (cd)[x y; 0 z].
10. 1[x y; 0 z] = [x y; 0 z].
10. This set V is not a vector space, since it is not closed under addition. For example,

   [1 0; 0 0], [0 0; 0 1] ∈ V, but [1 0; 0 0] + [0 0; 0 1] = [1 0; 0 1] ∉ V

since the product of the diagonal entries is not zero.
11. This set V is a vector space. Let A be a skew-symmetric matrix and c a constant. Then
1. The sum of skew-symmetric matrices is skew-symmetric.
2. Matrix addition is commutative.
3. Matrix addition is associative.
4. The zero matrix O is skew-symmetric, and A + O = A.
5. If A is skew-symmetric, then so is −A, since (−A)ij = −aij = aji = (A)ji = −(−A)ji .
6. If A is skew-symmetric, then so is cA, since (cA)ij = c(A)ij = −c(A)ji = −(cA)ji .
7. Scalar multiplication distributes over matrix addition.
8, 9. These properties hold for general matrices, and skew-symmetric matrices are closed under the relevant operations, so the results are in V.
10. 1A = A is skew-symmetric.
12. Axioms 1, 2, and 6 were verified in Example 6.3. For the remaining axioms, using the notation from Example 6.3, and letting r(x) = c0 + c1 x + c2 x²,
3.
(p(x) + q(x)) + r(x) = ((a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2 ) + r(x)
= (a0 + b0 + c0 ) + (a1 + b1 + c1 )x + (a2 + b2 + c2 )x2
= (a0 + a1 x + a2 x2 ) + ((b0 + c0 ) + (b1 + c1 )x + (b2 + c2 )x2 )
= p(x) + (q(x) + r(x)).
4. The zero polynomial 0 + 0x + 0x² is the zero vector, since
0 + p(x) = (0 + a0 ) + (0 + a1 )x + (0 + a2 )x² = a0 + a1 x + a2 x² = p(x).
5. −p(x) is the polynomial −a0 − a1 x − a2 x2 .
7.
c(p(x) + q(x)) = c (a0 + a1 x + a2 x2 ) + (b0 + b1 x + b2 x2 )
= c (a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2
= c(a0 + b0 ) + c(a1 + b1 )x + c(a2 + b2 )x2
= (ca0 + cb0 ) + (ca1 + cb1 )x + (ca2 + cb2 )x2
= ca0 + ca1 x + ca2 x2 + cb0 + cb1 x + cb2 x2
= c a0 + a1 x + a2 x2 + c b0 + b1 x + b2 x2
= cp(x) + cq(x).
8.
(c + d)p(x) = (c + d) a0 + a1 x + a2 x2
= (c + d)a0 + (c + d)a1 x + (c + d)a2 x2
= (ca0 + da0 ) + (ca1 + da1 )x + (ca2 + da2 )x2
= (ca0 + ca1 x + ca2 x2 ) + (da0 + da1 x + da2 x2 )
= c a0 + a1 x + a2 x2 + d a0 + a1 x + a2 x2
= cp(x) + dp(x).
9.
c(dp(x)) = c da0 + da1 x + da2 x2
= c(da0 ) + c(da1 )x + c(da2 )x2
= (cd)a0 + (cd)a1 x + (cd)a2 x2
= (cd) a0 + a1 x + a2 x2
= (cd)p(x).
10. 1p(x) = 1a0 + 1a1 x + 1a2 x2 = a0 + a1 x + a2 x2 = p(x).
13. Axioms 1 and 6 were verified in Example 6.4. For the rest, using the notation from Example 6.4,
2, 3. Addition of functions is commutative and associative.
4. The zero function f (x) = 0 for all x is the zero vector, since (0 + f )(x) = 0(x) + f (x) = f (x).
5. The function −f such that (−f )(x) = −f (x) is such that (−f + f )(x) = (−f )(x) + f (x) =
−f (x) + f (x) = 0, so their sum is the zero function.
7. The function c(f + g) is defined by (c(f + g))(x) = c((f + g)(x)) by the definition of scalar multiplication. But (f + g)(x) = f (x) + g(x), so we get (c(f + g))(x) = c(f (x) + g(x)) = cf (x) + cg(x) =
(cf + cg)(x).
8. Again using the definition of scalar multiplication, the function (c+d)f is defined by ((c+d)f )(x) =
(c + d)(f (x)) = cf (x) + df (x) = (cf + df )(x).
9. The function df is defined by (df )(x) = df (x), so c(df ) is defined by (c(df ))(x) = c((df )(x)) =
cdf (x) = ((cd)f )(x).
10. (1f )(x) = 1f (x) = f (x).
14. This is not a complex vector space, since Axiom 6 fails. Let z ∈ C be any nonzero complex number, and let c ∈ C be any complex number with nonzero real and imaginary parts. Then

   c(z, z̄) = (cz, cz̄) ≠ (cz, (cz)‾),

since (cz)‾ = c̄z̄ ≠ cz̄ when c is not real; thus c(z, z̄) ∉ V.
15. This is a vector space over C; the proof is identical to the proof that Mmn (R) is a real vector space.
16. This is a complex vector space. Let (z1, z2), (w1, w2) ∈ C², and let c, d ∈ C. Then

1, 2, 3, 4, 5. These follow since addition is the same as in the usual vector space C².
6. c ⊙ (z1, z2) = (c̄z1, c̄z2) ∈ C².
7. c ⊙ ((z1, z2) + (w1, w2)) = c ⊙ (z1 + w1, z2 + w2) = (c̄(z1 + w1), c̄(z2 + w2)) = (c̄z1 + c̄w1, c̄z2 + c̄w2) = (c̄z1, c̄z2) + (c̄w1, c̄w2) = c ⊙ (z1, z2) + c ⊙ (w1, w2).
8. (c + d) ⊙ (z1, z2) = ((c + d)‾z1, (c + d)‾z2) = ((c̄ + d̄)z1, (c̄ + d̄)z2) = (c̄z1, c̄z2) + (d̄z1, d̄z2) = c ⊙ (z1, z2) + d ⊙ (z1, z2).
9. (cd) ⊙ (z1, z2) = ((cd)‾z1, (cd)‾z2) = (c̄d̄z1, c̄d̄z2) = c ⊙ (d̄z1, d̄z2) = c ⊙ (d ⊙ (z1, z2)).
10. 1 ⊙ (z1, z2) = (1̄z1, 1̄z2) = (1z1, 1z2) = (z1, z2).
17. This is not a complex vector space, since Axiom 6 fails. Let x ∈ Rⁿ be any nonzero vector, and let c ∈ C be any complex number with nonzero imaginary part. Then cx ∉ Rⁿ, so that Rⁿ is not closed under scalar multiplication.
18. This set V is a vector space over Z2 . Let x, y ∈ Zn2 each have an even number of 1s, say x has 2r ones
and y has 2s ones.
1. x + y has a 1 where either x or y has a 1, except that if both x and y have a 1, x + y has a zero. Say
this happens in k different places. Then the total number of 1s in x+y is 2r+2s−2k = 2(r+s−k),
which is even, so V is closed under addition.
2. Since addition in Z2 is commutative, we see that x + y = y + x since we are simply adding
corresponding components.
3. Since addition in Z2 is associative, we see that (x + y) + z = x + (y + z) since we are simply adding
corresponding components.
4. 0 ∈ Zn2 has an even number (zero) of ones, so 0 ∈ V , and x + 0 = x.
5. Since 1 = −1 in Z2 , we can take −x = x; then −x + x = x + x = 0.
6. If c ∈ Z2 , then cx is equal to either 0 or x depending on whether c = 0 or c = 1. In either case,
x ∈ V implies that cx ∈ V .
7, 8. Since multiplication distributes over addition in Z2 , these axioms follow since operations are
component by component.
9. Since multiplication in Z2 is distributive, these axioms follow since operations are component by
component.
10. 1x = x since multiplication is component by component.
19. This set V is not a vector space:

1. For example, let x and y be vectors with exactly one 1. Then x + y has either 0 or 2 ones (it has zero if the locations of the 1s in x and y are the same), so x + y ∉ V and V is not closed under addition.
4. 0 ∉ V since 0 does not have an odd number of 1s.
6. If c = 0, then cx = 0 ∉ V if x ∈ V, so that V is not closed under scalar multiplication.
20. This is a vector space over Zp ; the proof is identical to the proof that Mmn (R) is a real vector space.
21. This is not a vector space. Axioms 1 through 7 are all satisfied, as is Axiom 10, but Axioms 8 and 9
fail since addition and multiplication are not the same in Z6 as they are in Z3 . For example:
8. Let c = 2, d = 1, and u = 1. Then (c + d)u = (2 + 1)u = 0u = 0 since addition of scalars is
performed in Z3 , while cu + du = 2 · 1 + 1 · 1 = 2 + 1 = 3 since here the addition is performed in
Z6 .
9. Let c = d = 2 and u = 1. Then (cd)u = 1 · 1 = 1 since the scalar multiplication is done in Z3 , while
c(du) = 2(2 · 1) = 2 · 2 = 4 since here the multiplication is done in Z6 .
22. Since 0 = 0 + 0, we have
0u = (0 + 0)u = 0u + 0u
by Axiom 8. By Axiom 5, we add −0u to both sides of this equation, giving
−0u + 0u = (−0u + 0u) + 0u, so that by Axiom 5 0 = 0 + 0u.
Finally, 0 + 0u = 0u by Axiom 4, so that 0 = 0u.
23. To show that (−1)u = −u, it suffices to show that (−1)u + u = 0, by Axiom 5. But (−1)u + u =
(−1)u + 1u by Axiom 10. Then use Axiom 8 to rewrite this as (−1 + 1)u = 0u. By part (a) of Theorem
6.1, 0u = 0 and we are done.
24. This is a subspace. We check each of the conditions in Theorem 6.2:

1. (a, 0, a) + (b, 0, b) = (a + b, 0, a + b) ∈ W since its first and third components are equal and its second component is zero.
2. c(a, 0, a) = (ca, 0, ca) ∈ W.
25. This is a subspace. We check each of the conditions in Theorem 6.2:

1. (a, −a, 2a) + (b, −b, 2b) = (a + b, −(a + b), 2(a + b)) ∈ W.
2. c(a, −a, 2a) = (ac, −(ac), 2(ac)) ∈ W.
26. This is not a subspace. It fails both conditions in Theorem 6.2:

1. (a, b, a + b + 1) + (c, d, c + d + 1) = (a + c, b + d, (a + c) + (b + d) + 2), which is not in W.
2. c(a, b, a + b + 1) = (ca, cb, ca + cb + c), which is not in W if c ≠ 1.
27. This is not a subspace. It fails both conditions in Theorem 6.2:

1. (a, b, |a|) + (c, d, |c|) = (a + c, b + d, |a| + |c|). If a and c do not have the same sign, then |a| + |c| ≠ |a + c|, so that the sum is not in W.
2. c(a, b, |a|) = (ca, cb, c|a|). If c < 0, then c|a| ≠ |ca|, so that the result is not in W.
28. This is a subspace. We check each of the conditions in Theorem 6.2:

1. [a b; b 2a] + [c d; d 2c] = [a+c b+d; b+d 2a+2c] = [a+c b+d; b+d 2(a+c)] ∈ W.
2. c[a b; b 2a] = [ca cb; cb 2(ca)] ∈ W.
29. This is not a subspace; condition 1 in Theorem 6.2 fails. For example, if a, d > 0, then

   [a 0; 0 d], [−a a; d −d] ∈ W, but [a 0; 0 d] + [−a a; d −d] = [0 a; d 0] ∉ W

since det[0 a; d 0] = −ad < 0.
30. This is not a subspace; both conditions of Theorem 6.2 fail. For example:
Let A = B = I; then det A = det B = 1, so that A, B ∈ W, but det(A + B) = det 2I = 2ⁿ ≠ 1, so A + B ∉ W.
Let A = I and c be any constant other than ±1. Then det(cA) = cⁿ det A ≠ det A = 1, so that cA ∉ W.
31. Since the sum of two diagonal matrices is again diagonal, and a scalar times a diagonal matrix is
diagonal, this subset satisfies both conditions of Theorem 6.2, so it is a subspace.
32. This is not a subspace. For example, it fails the second condition: let A be idempotent and c be any
constant other than 0 or 1. Then
(cA)2 = c2 A2 = c2 A 6= cA
so that cA is not idempotent, so is not in W .
33. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Suppose that A, C ∈ W . Then (A + C)B = AB + CB = BA + BC = B(A + C), so that A + C ∈ W .
2. Suppose that A ∈ W and c is any scalar. Then (cA)B = c(AB) = c(BA) = B(cA), so that cA ∈ W .
34. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let v = bx + cx², v′ = b′x + c′x². Then
v + v′ = bx + cx² + b′x + c′x² = (b + b′)x + (c + c′)x² ∈ W.
2. Let v = bx + cx2 and d be any scalar. Then dv = d(bx + cx2 ) = (db)x + (dc)x2 ∈ W .
35. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let v = a + bx + cx², v′ = a′ + b′x + c′x² ∈ W. Then
v + v′ = a + bx + cx² + a′ + b′x + c′x² = (a + a′) + (b + b′)x + (c + c′)x².
Then (a + a′) + (b + b′) + (c + c′) = (a + b + c) + (a′ + b′ + c′) = 0, so that v + v′ ∈ W.
2. Let v = a + bx + cx² and d be any scalar. Then dv = d(a + bx + cx²) = (da) + (db)x + (dc)x², and da + db + dc = d(a + b + c) = d · 0 = 0, so that dv ∈ W.
36. This is not a subspace; it fails condition 1 in Theorem 6.2. For example, let v = 1 + x and v′ = x². Then both v and v′ are in V, but v + v′ = 1 + x + x² ∉ V since the product of its coefficients is not zero.
37. This is not a subspace; it fails both conditions of Theorem 6.2.
1. For example, let v = x³ and v′ = −x³. Then v + v′ = 0, which is not a degree 3 polynomial.
2. Let c = 0; then cv = 0 for any v, but 0 ∉ W, so that this condition fails.
38. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let f, g ∈ W . Then (f + g)(−x) = f (−x) + g(−x) = f (x) + g(x) = (f + g)(x), so that f + g ∈ W .
2. Let f ∈ W and c be any scalar. Then (cf )(−x) = cf (−x) = cf (x) = (cf )(x), so that cf ∈ W .
39. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let f, g ∈ W . Then (f + g)(−x) = f (−x) + g(−x) = −f (x) − g(x) = −(f (x) + g(x)) = −(f + g)(x),
so that f + g ∈ W .
2. Let f ∈ W and c be any scalar. Then (cf )(−x) = cf (−x) = −cf (x) = −(cf )(x), so that cf ∈ W .
40. This is not a subspace; it fails both conditions in Theorem 6.2. For example, if f, g ∈ W, then (f + g)(0) = f(0) + g(0) = 1 + 1 ≠ 1, so that f + g ∉ W. Similarly, if c is a scalar not equal to 1, then (cf)(0) = cf(0) = c ≠ 1, so that cf ∉ W.
41. This is a subspace. We check each of the conditions in Theorem 6.2:
1. Let f, g ∈ W . Then (f + g)(0) = f (0) + g(0) = 0 + 0 = 0, so that f + g ∈ W .
2. Let f ∈ W and c be any scalar. Then (cf )(0) = cf (0) = c · 0 = 0, so that cf ∈ W .
42. If f and g ∈ W are integrable and c is a constant, then

   ∫(f + g) = ∫f + ∫g,   ∫(cf) = c∫f,

so that both f + g and cf ∈ W. Thus W is a subspace.
43. This is not a subspace; it fails condition 2 in Theorem 6.2. For example, let f ∈ W be such that f′(x) > 0 everywhere, and choose c = −1; then (cf)′(x) = cf′(x) < 0, so that cf ∉ W.
44. If f, g ∈ W and c is a constant, then
(f + g)00 = f 00 + g 00 ,
(cf )00 = cf 00 ,
so that both f + g and cf have continuous second derivatives, so they are both in W .
45. This is not a subspace; it fails both conditions in Theorem 6.2.

1. f(x) = 1/x² ∈ W, but −f(x) = −1/x² ∉ W, since lim_{x→0} (−f(x)) = −∞.
2. Take c = 0 and f ∈ W; then lim_{x→0} (cf)(x) = lim_{x→0} 0 = 0, so that cf ∉ W.
46. Suppose that v, v′ ∈ U ∩ W, and let c be a scalar. Then v, v′ ∈ U, so that v + v′ and cv ∈ U. Similarly, v, v′ ∈ W, so that v + v′ and cv ∈ W. Therefore v + v′ ∈ U ∩ W and cv ∈ U ∩ W, so that U ∩ W is a subspace.
47. For example, let U = {(x, 0)}, the x-axis, and let W = {(0, y)}, the y-axis. Then U ∪ W consists of all vectors in R² in which at least one component is zero. Choose a, b ≠ 0; then

   (a, 0) ∈ U, (0, b) ∈ W, but (a, 0) + (0, b) = (a, b) ∉ U ∪ W

since neither a nor b is zero.
48. (a) U + W is the xy-plane, since

   U = {(x, 0, 0)},   W = {(0, y, 0)},

so that any point in the xy-plane is

   (a, b, 0) = (a, 0, 0) + (0, b, 0) ∈ U + W.

However, a point not in the xy-plane has nonzero z-coordinate, so it cannot be a sum of an element of U and an element of W. Thus U + W is exactly the xy-plane.
(b) Suppose x, x′ ∈ U + W; say x = u + w and x′ = u′ + w′, where u, u′ ∈ U and w, w′ ∈ W. Then
x + x′ = u + w + u′ + w′ = (u + u′) + (w + w′) ∈ U + W
since both U and W are subspaces. If c is a scalar, then
cx = c(u + w) = cu + cw ∈ U + W,
again since both U and W are subspaces.
49. We verify all ten axioms of a vector space. Note that both U and V are vector spaces, so they satisfy
all the vector space axioms. These facts will be used without comment below. Let (u1 , v1 ), (u2 , v2 ),
and (u3 , v3 ) be elements of U × V , and let c and d be scalars. Then
1. (u1 , v1 ) + (u2 , v2 ) = (u1 + u2 , v1 + v2 ) ∈ U × V .
2. (u1 , v1 ) + (u2 , v2 ) = (u1 + u2 , v1 + v2 ) = (u2 + u1 , v2 + v1 ) = (u2 , v2 ) + (u1 , v1 ).
3.
((u1 , v1 ) + (u2 , v2 )) + (u3 , v3 ) = (u1 + u2 , v1 + v2 ) + (u3 , v3 )
= ((u1 + u2 ) + u3 , (v1 + v2 ) + v3 )
= (u1 + (u2 + u3 ), v1 + (v2 + v3 ))
= (u1 , v1 ) + (u2 + u3 , v2 + v3 )
= (u1 , v1 ) + ((u2 , v2 ) + (u3 , v3 )).
4. (0, 0) ∈ U × V , and (u1 , v1 ) + (0, 0) = (u1 + 0, v1 + 0) = (u1 , v1 ).
5. (u1 , v1 ) + (−u1 , −v1 ) = (u1 − u1 , v1 − v1 ) = (0, 0).
6. c(u1 , v1 ) = (cu1 , cv1 ) ∈ U × V .
7.
c ((u1 , v1 ) + (u2 , v2 )) = c(u1 + u2 , v1 + v2 )
= (c(u1 + u2 ), c(v1 + v2 ))
= (cu1 + cu2 , cv1 + cv2 )
= (cu1 , cv1 ) + (cu2 , cv2 )
= c(u1 , v1 ) + c(u2 , v2 ).
8.
(c + d)(u1 , v1 ) = ((c + d)u1 , (c + d)v1 )
= (cu1 + du1 , cv1 + dv1 )
= (cu1 , cv1 ) + (du1 , dv1 )
= c(u1 , v1 ) + d(u1 , v1 ).
9. c (d(u1 , v1 )) = c(du1 , dv1 ) = (c(du1 ), c(dv1 )) = ((cd)u1 , (cd)v1 ) = (cd)(u1 , v1 ).
10. 1(u1 , v1 ) = (1u1 , 1v1 ) = (u1 , v1 ).
50. Suppose that (w, w), (w′, w′) ∈ ∆. Then

   (w, w) + (w′, w′) = (w + w′, w + w′),

which is in ∆ since its two components are the same. Let c be a scalar. Then

   c(w, w) = (cw, cw),

which is also in ∆ since its components are the same. Therefore ∆ satisfies both conditions of Theorem 6.2, so is a subspace.
51. If aA + bB = C, then

   a[1 1; −1 1] + b[1 −1; 1 0] = [a+b a−b; b−a a] = [1 2; 3 4].

Then a − b = 2 and b − a = 3, which is impossible. Therefore C is not in the span of A and B.
52. If aA + bB = C, then

   a[1 1; −1 1] + b[1 −1; 1 0] = [a+b a−b; b−a a] = [3 −5; 5 −1].

Then a = −1 from the lower right. The upper left then gives −1 + b = 3, so that b = 4. Then a − b = −5 and b − a = 5, so that −A + 4B = C.
53. Suppose that

   a(1 − 2x) + b(x − x²) + c(−2 + 3x + x²) = (a − 2c) + (−2a + b + 3c)x + (−b + c)x² = 3 − 5x − x².

Then

   a − 2c = 3
   −2a + b + 3c = −5
   −b + c = −1.

The third equation gives b = c + 1; substituting this into the second equation gives −2a + 4c = −6, which is −2 times the first equation. So there is a family of solutions, which can be parametrized as a = 2t + 3, b = t + 1, c = t. For example, with t = 0 we get

   s(x) = 3p(x) + q(x),

and s(x) ∈ span(p(x), q(x), r(x)).
54. Suppose that

   a(1 − 2x) + b(x − x²) + c(−2 + 3x + x²) = (a − 2c) + (−2a + b + 3c)x + (−b + c)x² = 1 + x + x².

Then

   a − 2c = 1
   −2a + b + 3c = 1
   −b + c = 1.

The third equation gives b = c − 1; substituting into the second equation gives −2a + 4c − 1 = 1, or a − 2c = −1. This conflicts with the first equation, so the system has no solution. Thus s(x) ∉ span(p(x), q(x), r(x)).
55. Since sin2 x + cos2 x = 1, we have 1 = f (x) + g(x) ∈ span(sin2 x, cos2 x).
56. Since cos 2x = cos2 x − sin2 x = g(x) − f (x), we have cos 2x = −f (x) + g(x) ∈ span(sin2 x, cos2 x).
57. Suppose that sin 2x = af(x) + bg(x) = a sin²x + b cos²x for all x. Then

   Let x = 0; then 0 = a sin²0 + b cos²0 = b, so that b = 0.
   Let x = π/2; then sin(2 · π/2) = sin π = 0 = a sin²(π/2) + b cos²(π/2) = a, so that a = 0.

Thus the only solution is a = b = 0, but this is not a valid solution since sin 2x is not identically zero. So sin 2x ∉ span(sin²x, cos²x).
58. Suppose that sin x = af(x) + bg(x) = a sin²x + b cos²x for all x. Then

   Let x = 0; then 0 = a sin²0 + b cos²0 = b, so that b = 0.
   Let x = π/2; then sin(π/2) = 1 = a sin²(π/2) + b cos²(π/2) = a, so that a = 1.

Thus the only possibility is sin x = sin²x. But this is not a valid equation; for example, if x = 3π/2, then the left-hand side is −1 while the right-hand side is 1. So sin x ∉ span(sin²x, cos²x).
59. Let

   V1 = [1 1; 0 1],  V2 = [0 1; 1 0],  V3 = [1 0; 1 1],  V4 = [0 −1; 1 0].

Now, V4 = V3 − V1, so that span(V1, V2, V3, V4) = span(V1, V2, V3). If span(V1, V2, V3) = M22, then every matrix is in the span, so in particular E11 is. But

   aV1 + bV2 + cV3 = E11 ⇒ [a+c a+b; b+c a+c] = [1 0; 0 0].

Now the diagonal entries give a + c = 1 and a + c = 0, which has no solution. So E11 is not in the span and thus span(V1, V2, V3) ≠ M22.
60. Let

   V1 = [1 0; 1 0],  V2 = [1 1; 1 0],  V3 = [1 1; 1 1],  V4 = [0 −1; 1 0].

Solving the four independent systems

   E11 = aV1 + bV2 + cV3 + dV4,  E12 = aV1 + bV2 + cV3 + dV4,
   E21 = aV1 + bV2 + cV3 + dV4,  E22 = aV1 + bV2 + cV3 + dV4

gives

   E11 = 2V1 − V2 − V4,  E12 = −V1 + V2,  E21 = −V1 + V2 + V4,  E22 = −V2 + V3.

Therefore span(V1, V2, V3, V4) ⊇ span(E11, E12, E21, E22) = M22.
61. Let p(x) = 1 + x, q(x) = x + x², and r(x) = 1 + x². Then

   1 = ap(x) + bq(x) + cr(x) ⇒ a + c = 1, a + b = 0, b + c = 0 ⇒ a = 1/2, b = −1/2, c = 1/2
   x = ap(x) + bq(x) + cr(x) ⇒ a + c = 0, a + b = 1, b + c = 0 ⇒ a = 1/2, b = 1/2, c = −1/2
   x² = ap(x) + bq(x) + cr(x) ⇒ a + c = 0, a + b = 0, b + c = 1 ⇒ a = −1/2, b = 1/2, c = 1/2.

Thus span(p(x), q(x), r(x)) ⊇ span(1, x, x²) = P2.
62. Let p(x) = 1 + x + 2x², q(x) = 2 + x + 2x², and r(x) = −1 + x + 2x². Then

   x = ap(x) + bq(x) + cr(x) ⇒ a + 2b − c = 0, a + b + c = 1, 2a + 2b + 2c = 0.

But the last two equations are inconsistent, so that x ∉ span(p(x), q(x), r(x)). Therefore span(p(x), q(x), r(x)) ≠ P2.
63. Suppose 0 and 0′ both satisfy Axiom 4. Then 0 + 0′ = 0 since 0′ satisfies Axiom 4, and also 0 + 0′ = 0′ since 0 satisfies Axiom 4. Putting these two equalities together gives 0 = 0′, so that 0 is unique.
64. Suppose v′ and v″ both satisfy Axiom 5 for v. Then using Axiom 5,

   v″ = v″ + 0 = v″ + (v + v′) = (v″ + v) + v′ = 0 + v′ = v′.
6.2 Linear Independence, Basis, and Dimension
1. We wish to determine whether there are constants c1, c2, and c3, not all zero, such that

   c1[1 1; 0 −1] + c2[1 −1; 1 0] + c3[1 0; 3 2] = O.

This is equivalent to asking for solutions of the linear system

   c1 + c2 + c3 = 0
   c1 − c2 = 0
   c2 + 3c3 = 0
   −c1 + 2c3 = 0.

Row-reducing the corresponding augmented matrix, we get

   [1 1 1 | 0; 1 −1 0 | 0; 0 1 3 | 0; −1 0 2 | 0] → [1 0 0 | 0; 0 1 0 | 0; 0 0 1 | 0; 0 0 0 | 0].

This system has only the trivial solution c1 = c2 = c3 = 0, so that these matrices are linearly independent.
2. We wish to determine if there are constants c1 , c2 , and c3 , not all zero, such that
c1 [2 −3; 4 2] + c2 [1 −1; 3 3] + c3 [−1 3; 1 5] = 0.
This is equivalent to asking for solutions to the linear system
2c1 + c2 − c3 = 0
−3c1 − c2 + 3c3 = 0
4c1 + 3c2 + c3 = 0
2c1 + 3c2 + 5c3 = 0.
Row-reducing the corresponding augmented matrix, we get
[2 1 −1 0; −3 −1 3 0; 4 3 1 0; 2 3 5 0] → [1 0 −2 0; 0 1 3 0; 0 0 0 0; 0 0 0 0].
This system has a nontrivial solution c1 = 2t, c2 = −3t, c3 = t; setting t = 1 for example gives
2 [2 −3; 4 2] − 3 [1 −1; 3 3] + [−1 3; 1 5] = 0,
so that the matrices are linearly dependent.
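The row reduction above can be reproduced with a small exact-arithmetic routine. This is a sketch; `rref` is a helper written here (not a library function), and the rows of the coefficient matrix are the four entry equations from Exercise 2:

```python
from fractions import Fraction

def rref(M):
    """Row-reduce a matrix of Fractions to reduced row echelon form."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [x / M[r][c] for x in M[r]]       # scale pivot row to leading 1
        for i in range(rows):
            if i != r and M[i][c] != 0:          # clear the rest of the column
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M

# Each row is one entry equation in the unknowns c1, c2, c3.
A = [[2, 1, -1], [-3, -1, 3], [4, 3, 1], [2, 3, 5]]
R = rref([[Fraction(x) for x in row] for row in A])
print(R)
```

The result shows free variable c3 with c1 = 2c3 and c2 = −3c3, matching the dependence found above.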
464
CHAPTER 6. VECTOR SPACES
3. We wish to determine if there are constants c1 , c2 , c3 , and c4 , not all zero, such that
c1 [−1 1; −2 2] + c2 [3 0; 1 1] + c3 [0 2; −3 1] + c4 [−1 0; −1 7] = 0.
This is equivalent to asking for solutions to the linear system
−c1 + 3c2 − c4 = 0
c1 + 2c3 = 0
−2c1 + c2 − 3c3 − c4 = 0
2c1 + c2 + c3 + 7c4 = 0.
Row-reducing the corresponding augmented matrix, we get
[−1 3 0 −1 0; 1 0 2 0 0; −2 1 −3 −1 0; 2 1 1 7 0] → [1 0 0 4 0; 0 1 0 1 0; 0 0 1 −2 0; 0 0 0 0 0].
This system has a nontrivial solution c1 = −4t, c2 = −t, c3 = 2t, c4 = t; setting t = 1, for example, gives
−4 [−1 1; −2 2] − [3 0; 1 1] + 2 [0 2; −3 1] + [−1 0; −1 7] = 0,
so that the matrices are linearly dependent.
4. We wish to determine if there are constants c1 , c2 , c3 , and c4 , not all zero, such that
c1 [1 1; 0 1] + c2 [1 0; 1 1] + c3 [0 1; 1 1] + c4 [1 1; 1 0] = 0.
This is equivalent to asking for solutions to the linear system
c1 + c2 + c4 = 0
c1 + c3 + c4 = 0
c2 + c3 + c4 = 0
c1 + c2 + c3 = 0.
Row-reducing the corresponding augmented matrix, we get
[1 1 0 1 0; 1 0 1 1 0; 0 1 1 1 0; 1 1 1 0 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0].
Since the only solution is the trivial solution, it follows that these four matrices are linearly independent in M22.
5. Following, for instance, Example 6.26, we wish to determine if there are constants c1 and c2 , not both
zero, such that
c1 (x) + c2 (1 + x) = 0.
This is equivalent to asking for solutions to the linear system
c2 = 0
c1 + c2 = 0.
By forward substitution, we see that the only solution is the trivial solution, c1 = c2 = 0. It follows
that these polynomials are linearly independent in P2.
6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION
465
6. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , and c3 , not
all zero, such that
c1(1 + x) + c2(1 + x²) + c3(1 − x + x²) = (c1 + c2 + c3) + (c1 − c3)x + (c2 + c3)x² = 0.
This is equivalent to asking for solutions to the linear system
c1 + c2 + c3 = 0
c1 − c3 = 0
c2 + c3 = 0.
Row-reducing the corresponding augmented matrix, we get
[1 1 1 0; 1 0 −1 0; 0 1 1 0] → [1 0 0 0; 0 1 0 0; 0 0 1 0].
Since the only solution is the trivial solution, it follows that these three polynomials are linearly independent in P2.
7. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , and c3 , not
all zero, such that
c1(x) + c2(2x − x²) + c3(3x + 2x²) = (c1 + 2c2 + 3c3)x + (−c2 + 2c3)x² = 0.
This is a homogeneous system of two equations in three unknowns, so we know it has a nontrivial
solution and these polynomials are linearly dependent. To find an expression showing the linear
dependence, we must find solutions to the linear system
c1 + 2c2 + 3c3 = 0
−c2 + 2c3 = 0.
Row-reducing the corresponding augmented matrix, we get
[1 2 3 0; 0 −1 2 0] → [1 0 7 0; 0 1 −2 0].
The general solution is c1 = −7t, c2 = 2t, c3 = t. Setting t = 1, for example, we get
−7(x) + 2(2x − x²) + (3x + 2x²) = 0.
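Working with coefficient vectors over the basis (1, x, x²) turns this polynomial identity into plain arithmetic; a minimal sketch:

```python
# Coefficient vectors of x, 2x - x^2, 3x + 2x^2 over the basis (1, x, x^2).
p1 = [0, 1, 0]
p2 = [0, 2, -1]
p3 = [0, 3, 2]

# The dependence found above: -7*p1 + 2*p2 + 1*p3 should be the zero polynomial.
combo = [-7 * a + 2 * b + 1 * c for a, b, c in zip(p1, p2, p3)]
print(combo)
```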
8. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , c3 , and c4 ,
not all zero, such that
c1(2x) + c2(x − x²) + c3(1 + x³) + c4(2 − x² + x³) = (c3 + 2c4) + (2c1 + c2)x + (−c2 − c4)x² + (c3 + c4)x³ = 0.
This is equivalent to asking for solutions to the linear system
c3 + 2c4 = 0
2c1 + c2 = 0
−c2 − c4 = 0
c3 + c4 = 0.
Row-reducing the corresponding augmented matrix, we get
[0 0 1 2 0; 2 1 0 0 0; 0 −1 0 −1 0; 0 0 1 1 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0].
Since the only solution is the trivial solution, it follows that these four polynomials are linearly independent in P3.
9. Following, for instance, Example 6.26, we wish to determine if there are constants c1 , c2 , c3 , and c4 ,
not all zero, such that
c1(1 − 2x) + c2(3x + x² − x³) + c3(1 + x² + 2x³) + c4(3 + 2x + 3x³)
= (c1 + c3 + 3c4) + (−2c1 + 3c2 + 2c4)x + (c2 + c3)x² + (−c2 + 2c3 + 3c4)x³ = 0.
This is equivalent to asking for solutions to the linear system
c1 + c3 + 3c4 = 0
−2c1 + 3c2 + 2c4 = 0
c2 + c3 = 0
−c2 + 2c3 + 3c4 = 0.
Row-reducing the corresponding augmented matrix, we get
[1 0 1 3 0; −2 3 0 2 0; 0 1 1 0 0; 0 −1 2 3 0] → [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0].
Since the only solution is the trivial solution, it follows that these four polynomials are linearly independent in P3.
10. We wish to find solutions to a + b sin x + c cos x = 0 that hold for all x. Setting x = 0 gives a + c = 0, so that a = −c; setting x = π gives a − c = 0, so that a = c. This forces a = c = 0; then setting x = π/2 gives b = 0. So the only solution is the trivial solution, and these functions are linearly independent in F.
11. Since sin²x + cos²x = 1, the functions are linearly dependent.
12. Suppose that ae^x + be^(−x) = 0 for all x. Setting x = 0 gives a + b = 0, while setting x = ln 2 gives 2a + (1/2)b = 0. Since the coefficient matrix
[1 1; 2 1/2]
has nonzero determinant, the only solution to the system is the trivial solution, so that these functions are linearly independent in F.
13. Using properties of the logarithm, we have ln(2x) = ln 2 + ln x and ln(x²) = 2 ln x, so we want to see if {1, ln 2 + ln x, 2 ln x} is linearly independent in F. But clearly it is not, since by inspection
(−2 ln 2) · 1 + 2 · (ln 2 + ln x) − 2 ln x = 0.
So these functions are linearly dependent in F.
14. We wish to find solutions to a sin x + b sin 2x + c sin 3x = 0 that hold for all x. Setting
x = π/2 ⇒ a − c = 0
x = π/6 ⇒ (1/2)a + (√3/2)b + c = 0
x = 2π/3 ⇒ (√3/2)a − (√3/2)b = 0.
The first equation gives c = a, and the third gives b = a. Substituting both into the second gives ((3 + √3)/2)a = 0, so that a = 0, and then b = c = 0. So the only solution is the trivial solution, and thus these functions are linearly independent in F.
15. We want to prove that if the Wronskian is not identically zero, then f and g are linearly independent. We do this by proving the contrapositive: assume f and g are linearly dependent. If either f or g is zero, then clearly the Wronskian is zero as well. So assume that neither f nor g is zero; since they are linearly dependent, each is a nonzero multiple of the other. Say f(x) = ag(x), so that f′(x) = ag′(x). Now, the Wronskian is
W(x) = |f(x) g(x); f′(x) g′(x)| = f(x)g′(x) − g(x)f′(x) = ag(x)g′(x) − g(x)(ag′(x)) = 0.
Thus the Wronskian is identically zero, proving the result.
16. Exercise 10:
W(x) = |1 sin x cos x; 0 cos x −sin x; 0 −sin x −cos x| = |cos x −sin x; −sin x −cos x| = −cos²x − sin²x = −1 ≠ 0.
Since the Wronskian is not zero, the functions are linearly independent.
Exercise 11:
W(x) = |1 sin²x cos²x; 0 2 sin x cos x −2 sin x cos x; 0 2(cos²x − sin²x) 2(sin²x − cos²x)|
= |1 sin²x cos²x; 0 sin 2x −sin 2x; 0 2 cos 2x −2 cos 2x| = |sin 2x −sin 2x; 2 cos 2x −2 cos 2x| = 0.
The Wronskian is zero, and the functions are linearly dependent (as we showed directly in Exercise 11).
Exercise 12: W(x) = |e^x e^(−x); e^x −e^(−x)| = −1 − 1 = −2 ≠ 0. Since the Wronskian is not zero, the functions are linearly independent.
Exercise 13:
W(x) = |1 ln(2x) ln(x²); 0 1/x 2/x; 0 −1/x² −2/x²| = |1/x 2/x; −1/x² −2/x²| = −2/x³ + 2/x³ = 0.
The Wronskian is zero, and the functions are linearly dependent (as we showed directly in Exercise 13).
Exercise 14:
W(x) = |sin x sin 2x sin 3x; cos x 2 cos 2x 3 cos 3x; −sin x −4 sin 2x −9 sin 3x|.
Rather than computing the determinant and showing it is not identically zero, it suffices to find some value of x such that W(x) ≠ 0. Let x = π/3; then
W(π/3) = |sin(π/3) sin(2π/3) sin π; cos(π/3) 2 cos(2π/3) 3 cos π; −sin(π/3) −4 sin(2π/3) −9 sin π|
= |√3/2 √3/2 0; 1/2 −1 −3; −√3/2 −2√3 0|
= 3 |√3/2 √3/2; −√3/2 −2√3| = 3(−3 + 3/4) = −27/4 ≠ 0.
So the Wronskian is not identically zero and therefore the functions are linearly independent.
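The value W(π/3) = −27/4 found for Exercise 14 can be confirmed numerically. A sketch, with the 3 × 3 determinant expanded by hand along the first row:

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def wronskian(x):
    # Rows: sin x, sin 2x, sin 3x and their first and second derivatives.
    return det3([[math.sin(x), math.sin(2 * x), math.sin(3 * x)],
                 [math.cos(x), 2 * math.cos(2 * x), 3 * math.cos(3 * x)],
                 [-math.sin(x), -4 * math.sin(2 * x), -9 * math.sin(3 * x)]])

w = wronskian(math.pi / 3)
print(w)  # approximately -27/4 = -6.75
```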
17. (a) Yes, it is. Suppose
a(u + v) + b(v + w) + c(u + w) = (a + c)u + (a + b)v + (b + c)w = 0.
Since {u, v, w} are linearly independent, it follows that a + c = 0, a + b = 0, and b + c = 0.
Subtracting the latter two from each other gives a − c = 0; combining this with the first equation
shows that a = c = 0, and then b = 0. So the only solution to the system is the trivial one, so
these three vectors are linearly independent.
(b) No, they are not, since
(u − v) + (v − w) − (u − w) = 0.
18. Since dim M22 = 4 but B has only three elements, it cannot be a basis.
19. If
a [1 0; 0 1] + b [0 −1; 1 0] + c [1 1; 1 1] + d [1 1; 1 −1] = [0 0; 0 0],
then
a + c + d = 0
−b + c + d = 0
b + c + d = 0
a + c − d = 0.
Subtracting the second and third equations shows that b = 0; subtracting the first and fourth shows
that d = 0. Then from the second equation, c = 0 as well, which forces a = 0. So the only solution
is the trivial solution, and these matrices are linearly independent. Since there are four of them, and
dim M22 = 4, it follows that they form a basis for M22 .
20. Since dim M22 = 4, if these vectors are linearly independent then they span and thus form a basis. To
see if they are linearly independent, we want to solve
c1 [1 0; 0 1] + c2 [0 1; 1 0] + c3 [1 1; 0 1] + c4 [1 0; 1 1] = 0.
Solving this linear system gives c1 = −2c4, c2 = −c4, and c3 = c4. For example, letting c4 = 1 gives
−2 [1 0; 0 1] − [0 1; 1 0] + [1 1; 0 1] + [1 0; 1 1] = 0.
Thus the matrices are linearly dependent, so they do not form a basis (see Theorem 6.10).
21. Since dim M22 = 4 but B has five elements, it cannot be a basis.
22. Since dim P2 = 3, if these vectors are linearly independent then they span and thus form a basis. To see if they are linearly independent, we want to solve
c1x + c2(1 + x) + c3(x − x²) = c2 + (c1 + c2 + c3)x − c3x² = 0.
Rather than setting up the augmented matrix and row-reducing it, simply note that the first and third terms force c2 = c3 = 0; then the second term forces c1 = 0. So the only solution is the trivial solution, and these three polynomials are linearly independent, so they form a basis for P2.
6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION
469
23. Since dim P2 = 3, if these vectors are linearly independent then they span and thus form a basis. To see if they are linearly independent, we want to solve
c1(1 − x) + c2(1 − x²) + c3(x − x²) = (c1 + c2) + (−c1 + c3)x + (−c2 − c3)x² = 0.
To solve the resulting linear system, we row-reduce its augmented matrix:
[1 1 0 0; −1 0 1 0; 0 −1 −1 0] → [1 0 −1 0; 0 1 1 0; 0 0 0 0].
This has general solution c1 = t, c2 = −t, c3 = t, so setting t = 1 we get
(1 − x) − (1 − x2 ) + (x − x2 ) = 0.
Therefore the vectors are not linearly independent, so they cannot form a basis for P2 , which has
dimension 3.
24. Since dim P2 = 3 but there are only two vectors in B, they cannot form a basis.
25. Since dim P2 = 3 but there are four vectors in B, they must be linearly dependent, so cannot form a
basis.
26. We want to find constants c1, c2, c3, and c4 such that
c1 E22 + c2 E21 + c3 E12 + c4 E11 = A = [1 2; 3 4].
Setting up the equivalent linear system gives
c1 = 4
c2 = 3
c3 = 2
c4 = 1.
This system obviously has the solution c1 = 4, c2 = 3, c3 = 2, c4 = 1, so the coordinate vector is
[A]B = [4 3 2 1]^T.
27. We want to find constants c1, c2, c3, and c4 such that
c1 [1 0; 0 0] + c2 [1 1; 0 0] + c3 [1 1; 1 0] + c4 [1 1; 1 1] = [1 2; 3 4].
Setting up the equivalent linear system gives
c1 + c2 + c3 + c4 = 1
c2 + c3 + c4 = 2
c3 + c4 = 3
c4 = 4.
By back substitution, the solution is c4 = 4, c3 = −1, c2 = −1, c1 = −1, so the coordinate vector is
[A]B = [−1 −1 −1 4]^T.
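A quick check that these coordinates reproduce A. This is a sketch; the basis matrices and A are those of this exercise, stored as nested lists:

```python
# The four basis matrices of B, in order, and the coordinates found above.
B = [[[1, 0], [0, 0]], [[1, 1], [0, 0]], [[1, 1], [1, 0]], [[1, 1], [1, 1]]]
c = [-1, -1, -1, 4]

# Form the linear combination c1*B1 + c2*B2 + c3*B3 + c4*B4 entrywise.
A = [[sum(ci * M[i][j] for ci, M in zip(c, B)) for j in range(2)]
     for i in range(2)]
print(A)  # [[1, 2], [3, 4]]
```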
470
CHAPTER 6. VECTOR SPACES
28. We want to find constants c1, c2, and c3 such that
p(x) = 1 + 2x + 3x² = c1(1 + x) + c2(1 − x) + c3(x²) = (c1 + c2) + (c1 − c2)x + c3x².
The corresponding linear system is
c1 + c2 = 1
c1 − c2 = 2
c3 = 3.
Row-reducing the corresponding augmented matrix gives
[1 1 0 1; 1 −1 0 2; 0 0 1 3] → [1 0 0 3/2; 0 1 0 −1/2; 0 0 1 3],
so that c1 = 3/2, c2 = −1/2, and c3 = 3. Thus the coordinate vector is
[p]B = [3/2 −1/2 3]^T.
29. We want to find constants c1, c2, and c3 such that
p(x) = 2 − x + 3x² = c1(1) + c2(1 + x) + c3(−1 + x²) = (c1 + c2 − c3) + c2x + c3x².
Equating coefficients shows that c3 = 3 and c2 = −1; substituting into the equation c1 + c2 − c3 = 2 gives c1 = 6, so that the coordinate vector is
[p]B = [6 −1 3]^T.
30. To show B is a basis, we must show that it spans V and that the elements of B are linearly independent. B spans V, since if v ∈ V, then it can be written as a linear combination of vectors in B by hypothesis. To show the elements of B are linearly independent, note that 0 = 0u1 + · · · + 0un; since the representation of 0 as a linear combination of the ui is unique by assumption, it follows that the only solution to 0 = c1u1 + · · · + cnun is the trivial solution, so that B = {ui} is a linearly independent set. Thus B is a basis for V.
31. [c1 u1 + · · · + ck uk ]B = [c1 u1 ]B + · · · + [ck uk ]B = c1 [u1 ]B + · · · + ck [uk ]B .
32. Suppose c1 u1 + · · · + cn un = 0. Then clearly
0 = [c1 u1 + · · · + cn un ]B .
But by Exercise 31, the latter expression is equal to
c1 [u1 ]B + · · · + cn [un ]B ;
since [ui ]B are linearly independent in Rn , we get ci = 0 for all i, so that the only solution to the
original equation is the trivial solution. Thus the ui are linearly independent in V .
33. First suppose that span(u1, . . . , um) = V. Choose x ∈ Rn, and let v be the vector in V whose coordinate vector is [v]B = x. Since V = span(u1, . . . , um), we can write v = c1u1 + · · · + cmum. Therefore by Exercise 31,
x = [c1u1 + · · · + cmum]B = c1[u1]B + · · · + cm[um]B,
which shows that x ∈ span(S).
For the converse, suppose that span(S) = Rn. Choose v ∈ V; then [v]B ∈ Rn, and using Exercise 31 again,
[v]B = c1[u1]B + · · · + cm[um]B = [c1u1 + · · · + cmum]B.
But since B is a basis, and the two elements v and c1u1 + · · · + cmum of V have the same coordinate vector, they must be the same element. Thus v = c1u1 + · · · + cmum, so that {ui} spans V.
34. Using the standard basis {x², x, 1} for P2, if p(x) = ax² + bx + c ∈ V, then p(0) = c = 0, so that c = 0 and therefore p(x) = ax² + bx. Since x² and x are linearly independent and span V, {x, x²} is a basis, so that dim V = 2.
35. Two polynomials in V are p(x) = 1 − x and q(x) = 1 − x², since p(1) = q(1) = 0. If we show these polynomials are linearly independent and span V, we are done. They are linearly independent, since if
a(1 − x) + b(1 − x²) = (a + b) − ax − bx² = 0,
then equating coefficients shows that a = b = 0. To see that they span, suppose r(x) = c + bx + ax² ∈ V. Then r(1) = a + b + c = 0, so that c = −b − a and therefore
r(x) = −(b + a) + bx + ax² = −b(1 − x) − a(1 − x²),
so that r is a linear combination of p and q. Thus {p(x), q(x)} forms a basis for V, which has dimension 2.
36. Suppose p(x) = c + bx + ax²; then xp′(x) = bx + 2ax². So p(x) = xp′(x) means that c = 0 and a = 2a, so that a = 0. Thus p(x) = bx. Then clearly V = span(x), so that {x} forms a basis, and dim V = 1.
37. Upper triangular matrices in M22 have the form
[a b; 0 d] = aE11 + bE12 + dE22.
Since E11, E12, and E22 span V, and we know they are linearly independent, it follows that they form a basis for V, which therefore has dimension 3.
38. Skew-symmetric matrices in M22 have the form
[0 b; −b 0] = b [0 1; −1 0] = bA.
Since A spans V, and {A} is a linearly independent set, we see that {A} is a basis for V, which therefore has dimension 1.
39. Here B = [1 1; 0 1]. Suppose that
A = [a b; c d].
Then
AB = [a a+b; c c+d],   BA = [a+c b+d; c d].
Then AB = BA means that a = a + c, so that c = 0, and also a + b = b + d, so that a = d. So the matrix A has the form
A = [a b; 0 a] = a [1 0; 0 1] + bE12 = aI + bE12.
I and E12 are linearly independent since I = E11 + E22, and {E11, E22, E12} is a linearly independent set. Since they span V, we see that {I, E12} forms a basis for V, which therefore has dimension 2.
472
CHAPTER 6. VECTOR SPACES
40. dim Mnn = n². For symmetric matrices, we have aij = aji. Therefore specifying the entries on or above the diagonal determines the matrix, and these entries may take any value. Since there are
n + (n − 1) + · · · + 2 + 1 = n(n + 1)/2
entries on or above the diagonal, this is the dimension of the space of symmetric matrices.
41. dim Mnn = n². For skew-symmetric matrices, we have aij = −aji. Therefore all the diagonal entries are zero; further, specifying the entries above the diagonal determines the matrix, and these entries may take any value. Since there are
(n − 1) + · · · + 2 + 1 = n(n − 1)/2
entries above the diagonal, this is the dimension of the space of skew-symmetric matrices.
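The two dimension counts can be sanity-checked by counting the free positions directly; a minimal sketch:

```python
def dim_symmetric(n):
    # Entries on or above the diagonal determine a symmetric matrix.
    return sum(1 for i in range(n) for j in range(i, n))

def dim_skew(n):
    # Entries strictly above the diagonal determine a skew-symmetric matrix.
    return sum(1 for i in range(n) for j in range(i + 1, n))

print([(n, dim_symmetric(n), dim_skew(n)) for n in (2, 3, 4)])
```

Note that the two dimensions sum to n², matching the fact that every matrix is the sum of a symmetric and a skew-symmetric matrix (in characteristic ≠ 2).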
42. Following the hint, let B = {v1, . . . , vk} be a basis for U ∩ W. By Theorem 6.10(e), this can be extended to a basis C for U and to a basis D for W. Consider the set C ∪ D. Clearly it spans U + W, since an element of U + W is u + w for u ∈ U, w ∈ W, and C and D are bases for U and W. Next we must show that this set is linearly independent. Suppose dim U = m and dim W = n; then
C = {uk+1, . . . , um} ∪ B,   D = {wk+1, . . . , wn} ∪ B.
Suppose that a linear combination of C ∪ D is the zero vector, say
a1v1 + · · · + akvk + ck+1uk+1 + · · · + cmum + dk+1wk+1 + · · · + dnwn = A + C + D = 0,
where A, C, and D denote the three groups of terms in order. Then C = −A − D. Since A ∈ U ∩ W and D ∈ W, we see that −A − D ∈ W, so that C ∈ W. But also C ∈ U, since it is a combination of basis vectors for U. Thus C ∈ U ∩ W, which is impossible unless all of the ci are zero, since the ui for i > k are elements of U not in the subspace U ∩ W. We are left with A + D = 0; this is a linear combination of basis elements of W, so that all the ai and all the dj must be zero as well. Therefore C ∪ D is linearly independent.
It follows that C ∪ D forms a basis for U + W , so it remains to count the elements in this set. Clearly
C ∩ D = B, for any basis element in common between C and D must be in U ∩ W , so it must be in B.
Write #X for the number of elements in the set X. Then
dim(U + W ) = #(C ∪ D) = #C + #D − #B = n + m − k = dim U + dim W − dim(U ∩ W ).
43. Let dim U = m, with basis B = {u1, . . . , um}, and dim V = n, with basis C = {v1, . . . , vn}.
(a) Claim that a basis for U × V is D = {(u1, 0), . . . , (um, 0), (0, v1), . . . , (0, vn)}; counting the basis vectors, this will also show that dim(U × V) = m + n. First we want to show that D spans. Let x = (u, v) ∈ U × V. Then
x = (u, v) = (c1u1 + · · · + cmum, d1v1 + · · · + dnvn) = c1(u1, 0) + · · · + cm(um, 0) + d1(0, v1) + · · · + dn(0, vn),
proving that D spans U × V. To see that D is linearly independent, suppose that 0 is a linear combination of the elements of D. This means that
0 = (0, 0) = c1(u1, 0) + · · · + cm(um, 0) + d1(0, v1) + · · · + dn(0, vn) = (c1u1 + · · · + cmum, d1v1 + · · · + dnvn).
But the ui are linearly independent, so that c1u1 + · · · + cmum = 0 means that all of the ci are zero; similarly the vj are linearly independent, so that d1v1 + · · · + dnvn = 0 means that all of the dj are zero. Thus D is linearly independent.
So D is a basis for U × V, and therefore dim(U × V) = m + n.
(b) Let dim W = k, with basis {w1, . . . , wk}. Claim that D = {(w1, w1), . . . , (wk, wk)} is a basis for ∆; proving this claim also shows, by counting the basis vectors, that dim ∆ = dim W. To see that D spans ∆, choose (w, w) ∈ ∆; then w = c1w1 + · · · + ckwk, so that
(w, w) = (c1w1 + · · · + ckwk, c1w1 + · · · + ckwk) = c1(w1, w1) + · · · + ck(wk, wk).
Thus D spans ∆. To see that D is linearly independent, suppose that c1(w1, w1) + · · · + ck(wk, wk) = 0. Then
0 = c1(w1, w1) + · · · + ck(wk, wk) = (c1w1 + · · · + ckwk, c1w1 + · · · + ckwk).
This means that c1w1 + · · · + ckwk = 0; since the wi are linearly independent, we see that all the ci are zero. So D is linearly independent, showing that it is a basis for ∆ and thus that dim ∆ = dim W.
44. Suppose that P is finite-dimensional, with basis {p1(x), . . . , pn(x)}. Let m be the largest power of x that appears in any of the basis elements, and consider q(x) = x^(m+1). Since any linear combination of the pi has powers of x at most equal to m, it follows that q ∉ span{pi}. This contradicts the assumption that the pi form a basis, so P cannot have a finite basis.
45. Start by adding all three standard basis vectors for P2 to the set, yielding {1 + x, 1 + x + x², 1, x, x²}. Since (1 + x) + x² = 1 + x + x², this set is linearly dependent, so discard x², leaving {1 + x, 1 + x + x², 1, x}. Next, since 1 + x = 1 · 1 + 1 · x, the set {1 + x, 1, x} is linearly dependent, so discard x, leaving B = {1 + x, 1 + x + x², 1}. This is a set of three vectors, so to prove that it is a basis, it suffices to show that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. Hence B is a basis for P2.
46. Start by adding the standard basis vectors Eij to the set, giving
B = {C = [0 1; 0 1], D = [1 1; 0 1], E11, E12, E21, E22}.
Since E11 = D − C is the difference of the two given matrices, we discard it from B, leaving us with five elements. Next, C = E12 + E22, so discard E22, leaving us with B = {C, D, E12, E21}. This has four elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans and therefore forms a basis for M22.
47. Start by adding the standard basis vectors Eij to the set, giving
B = {I = [1 0; 0 1], C = [0 1; 1 0], D = [0 −1; 1 0], E11, E12, E21, E22}.
Since E21 = (1/2)(C + D), we discard it from B. Next, I = E11 + E22, so discard E22 from B. Finally, E12 = (1/2)(C − D), so we discard it from B. This leaves us with B = {I, C, D, E11}. This has four elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans and therefore forms a basis for M22.
48. Start by adding the standard basis vectors Eij to the set, giving
B = {I = [1 0; 0 1], C = [0 1; 1 0], E11, E12, E21, E22}.
Since I + C = E11 + E12 + E21 + E22 , these are a linearly dependent set; discard E22 from B. Next,
C = E12 + E21 , so discard E21 from B. This leaves us with B = {I, C, E11 , E12 }. This has four
elements, and dim M22 = 4, so to prove that this is a basis, it suffices to prove that it spans. But by
construction, the standard basis vectors not in B are linear combinations of elements of B. So B spans
and therefore forms a basis for M22 .
49. The standard basis for P1 is {1, x}. If we show that each of these vectors is in the span of the given
vectors, then those vectors span P1 . But 1 ∈ span(1, 1 + x, 2x), and x = (1 + x) − 1 ∈ span(1, 1 + x, 2x).
Thus
span(1, 1 + x, 2x) ⊇ span(1, x) = P1 ,
so that a basis for the span is {1, x}.
50. Since (1 − 2x) + (2x − x²) = 1 − x², these three elements are linearly dependent, so we can discard 1 − x² from the set without changing the span, leaving {1 − 2x, 2x − x², 1 + x²}. Claim these are linearly independent. Suppose that
a(1 − 2x) + b(2x − x²) + c(1 + x²) = (a + c) + (−2a + 2b)x + (−b + c)x² = 0.
Then a = −c and a = b from the constant term and the coefficient of x, so that b = −c. But then b = −c together with −b + c = 0 shows that b = c = 0, so that a = 0. Hence these three polynomials are linearly independent, so they form a basis for the span of the original set.
51. Since (1 − x) + (x − x²) = 1 − x² and (1 − x) − (x − x²) = 1 − 2x + x², we can remove both 1 − x² and 1 − 2x + x² from the set without changing the span. This leaves us with {1 − x, x − x²}. Claim these are linearly independent. Suppose that
a(1 − x) + b(x − x²) = a + (−a + b)x − bx² = 0.
Then clearly a = b = 0. Hence these two polynomials are linearly independent, so they form a basis for the span of the original set.
52. Let
V1 = [1 0; 0 1],  V2 = [0 1; 1 0],  V3 = [−1 1; 1 −1],  V4 = [1 −1; −1 1].
Then V3 + V4 = 0, so we can discard V4 from the set without changing the span. Next, V3 = V2 − V1 ,
so we can discard V3 from the set without changing the span. So we are left with {V1 , V2 }; these are
linearly independent since they are not scalar multiples of one another. So this set forms a basis for
the span of the given set.
53. Since cos 2x = cos²x − sin²x, we can discard cos 2x from the set without changing the span, leaving us with {sin²x, cos²x}. Claim that this set is linearly independent. For suppose that a sin²x + b cos²x = 0 holds for all x. Setting x = 0 gives b = 0, while setting x = π/2 gives a = 0. Thus a = b = 0 and these functions are linearly independent. So {sin²x, cos²x} forms a basis for the span of the original set.
54. Suppose that av + c1v1 + · · · + cnvn = 0, so that c1v1 + · · · + cnvn = −av. If a ≠ 0, then dividing through by −a shows that v ∈ span(S), a contradiction. Therefore we must have a = 0, and we are left with
c1v1 + · · · + cnvn = 0,
which implies (since the vi are linearly independent) that all of the ci are zero. Therefore all coefficients must be zero, so that S′ is linearly independent.
55. If vn ∈ span(v1 , . . . , vn−1 ), then there are constants such that
vn = c1 v1 + · · · + cn−1 vn−1 .
Now choose x ∈ V = span(S). Then
x = a1 v1 + · · · + an−1 vn−1 + an vn
= a1 v1 + · · · + an−1 vn−1 + an (c1 v1 + · · · + cn−1 vn−1 )
= (a1 + an c1 )v1 + · · · + (an−1 + an cn−1 )vn−1 ,
showing that x ∈ span(S′). Then V ⊆ span(S′) ⊆ span(S) = V, so that span(S′) = V.
6.2. LINEAR INDEPENDENCE, BASIS, AND DIMENSION
475
56. Use induction on n. If S = {v1}, then S is linearly independent, so it is a basis. Next, suppose the statement is true for all sets that span V and have no more than n − 1 elements. Let S = {v1, . . . , vn}. If S is linearly independent, then it is a basis and we are done. Otherwise it is a linearly dependent set, so we can write some element of S (say, vn, after a possible reordering) as a linear combination of the others, so that vn ∈ span{v1, . . . , vn−1}. Then by Exercise 55, we can remove vn from S, giving S′ = {v1, . . . , vn−1}, which also spans V. By the inductive assumption, S′ can be made into a basis for V by removing some (possibly no) elements, so S can as well.
57. Suppose x ∈ V. Since S = {v1, . . . , vn} is a basis, there are scalars a1, . . . , an such that
x = a1v1 + a2v2 + · · · + anvn.
Since the ci are nonzero, we also have
x = (a1/c1)(c1v1) + (a2/c2)(c2v2) + · · · + (an/cn)(cnvn),
so that T = {c1v1, . . . , cnvn} spans V. To show T is linearly independent, the argument is much the same. Suppose that
b1(c1v1) + b2(c2v2) + · · · + bn(cnvn) = (b1c1)v1 + (b2c2)v2 + · · · + (bncn)vn = 0.
Since S is linearly independent, it follows that bici = 0 for all i. Since all of the ci are nonzero, all of the bi must be zero, so that T is linearly independent. Therefore T is also a basis for V.
58. Let
S = {v1 , v2 , . . . , vn },
T = {v1 , v1 + v2 , v1 + v2 + v3 , . . . , v1 + v2 + · · · + vn }.
By Exercise 21 in Section 2.3, to show that span(T ) = span(S) = V it suffices to show that every
element of S is in span(T ) and vice versa. But
vi = (v1 + v2 + · · · + vi−1 + vi ) − (v1 + v2 + · · · + vi−1 ) ∈ span(T ) and
v1 + · · · + vi = (v1 ) + (v2 ) + · · · + (vi ) ∈ span(S).
Thus the two spans are equal. Since S is a basis, and T has the same number of elements as S, it must
also be a basis, by Theorem 6.10(d).
59. (a)
p0(x) = ((x − a1)(x − a2)) / ((a0 − a1)(a0 − a2)) = ((x − 2)(x − 3)) / ((1 − 2)(1 − 3)) = (x² − 5x + 6)/2 = (1/2)x² − (5/2)x + 3
p1(x) = ((x − a0)(x − a2)) / ((a1 − a0)(a1 − a2)) = ((x − 1)(x − 3)) / ((2 − 1)(2 − 3)) = (x² − 4x + 3)/(−1) = −x² + 4x − 3
p2(x) = ((x − a0)(x − a1)) / ((a2 − a0)(a2 − a1)) = ((x − 1)(x − 2)) / ((3 − 1)(3 − 2)) = (x² − 3x + 2)/2 = (1/2)x² − (3/2)x + 1.
(b) If i = j, then substituting aj for x in the definition of pi gives a numerator equal to the denominator, so that pi(aj) = 1 when i = j. If i ≠ j, however, then substituting aj for x in the definition of pi gives zero, since the factor x − aj in the numerator of pi becomes zero. Thus pi(aj) = 0 if i ≠ j and pi(aj) = 1 if i = j.
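The defining property pi(aj) = 1 if i = j and 0 otherwise can be checked mechanically; a sketch using exact rational arithmetic, with the nodes a0 = 1, a1 = 2, a2 = 3 from part (a):

```python
from fractions import Fraction

def lagrange_basis(nodes):
    """Return the Lagrange basis polynomials for the given nodes as callables."""
    def p(i):
        def eval_at(x):
            num = den = Fraction(1)
            for j, a in enumerate(nodes):
                if j != i:
                    num *= x - a          # factor (x - a_j) in the numerator
                    den *= nodes[i] - a   # factor (a_i - a_j) in the denominator
            return num / den
        return eval_at
    return [p(i) for i in range(len(nodes))]

nodes = [1, 2, 3]
ps = lagrange_basis(nodes)
table = [[ps[i](a) for a in nodes] for i in range(3)]
print(table)  # the identity pattern: p_i(a_j) = 1 if i == j else 0
```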
60. (a) Note first that the degree of each pi is n, since there are n factors in the numerator, so that
pi ∈ Pn for all i. Suppose that there are constants ci such that
c0 p0 (x) + c1 p1 (x) + · · · + cn pn (x) = 0.
Then this equation must hold for all x. Substitute ai for x. Then by Exercise 59(b), all pj (ai )
for j 6= i vanish, leaving only ci pi (ai ) = ci . Therefore ci = 0. Since this holds for every i, we see
that all of the ci are zero, so that the pi are linearly independent.
(b) Since dim Pn = n + 1 and there are n + 1 polynomials p0 , p1 , . . . , pn that span, they must form
a basis, by Theorem 6.10(d).
61. (a) This is similar to part (a) of Exercise 60. Suppose that q(x) ∈ P, and that
q(x) = c0 p0 (x) + c1 p1 (x) + · · · + cn pn (x).
Then substituting ai for x gives q(ai ) = ci pi (ai ) = ci , since pj (ai ) = 0 for j 6= i by Exercise
59(b). Since the pi form a basis for P, any element of P has a unique representation as a linear
combination of the pi , so that unique representation must be the one we just found:
q(x) = q(a0 )p0 (x) + q(a1 )p1 (x) + · · · + q(an )pn (x).
(b) q(x) passes through the given n + 1 points if and only if q(ai) = ci for each i. But this is exactly what we proved in part (a). Since the representation in part (a) is unique, this is the only such polynomial of degree at most n.
(c) (i) Here a0 = 1, a1 = 2, a2 = 3, so we use the Lagrange polynomials determined in Exercise 59(a) and therefore
q(x) = c0p0(x) + c1p1(x) + c2p2(x)
= 6((1/2)x² − (5/2)x + 3) − (−x² + 4x − 3) − 2((1/2)x² − (3/2)x + 1)
= 3x² − 16x + 19.
(ii) Here a0 = −1, a1 = 0, a2 = 3. We first compute the Lagrange polynomials:
p0(x) = ((x − a1)(x − a2)) / ((a0 − a1)(a0 − a2)) = ((x − 0)(x − 3)) / ((−1 − 0)(−1 − 3)) = (x² − 3x)/4 = (1/4)x² − (3/4)x
p1(x) = ((x − a0)(x − a2)) / ((a1 − a0)(a1 − a2)) = ((x − (−1))(x − 3)) / ((0 − (−1))(0 − 3)) = (x² − 2x − 3)/(−3) = −(1/3)x² + (2/3)x + 1
p2(x) = ((x − a0)(x − a1)) / ((a2 − a0)(a2 − a1)) = ((x − (−1))(x − 0)) / ((3 − (−1))(3 − 0)) = (x² + x)/12 = (1/12)x² + (1/12)x.
Using these polynomials, we get
q(x) = c0p0(x) + c1p1(x) + c2p2(x)
= 10((1/4)x² − (3/4)x) + 5(−(1/3)x² + (2/3)x + 1) + 2((1/12)x² + (1/12)x)
= (5/2 − 5/3 + 1/6)x² + (−15/2 + 10/3 + 1/6)x + 5
= x² − 4x + 5.
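The interpolation in part (ii) can be checked with a short routine. This is a sketch, using exact arithmetic; the data points are those of the exercise:

```python
from fractions import Fraction

def interpolate(points):
    """Lagrange interpolation through (a_i, c_i); returns q as a callable."""
    def q(x):
        total = Fraction(0)
        for i, (ai, ci) in enumerate(points):
            term = Fraction(ci)
            for j, (aj, _) in enumerate(points):
                if j != i:
                    term *= Fraction(x - aj, ai - aj)
            total += term
        return total
    return q

q = interpolate([(-1, 10), (0, 5), (3, 2)])
print([q(x) for x in (-1, 0, 3)])  # recovers the data values 10, 5, 2
```

Evaluating q at other points agrees with the closed form x² − 4x + 5 found above.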
62. Let (a0, 0), (a1, 0), . . . , (an, 0) be the n + 1 distinct zeros. Then by Exercise 61(a), the only polynomial of degree at most n passing through these n + 1 points is
q(x) = 0p0(x) + 0p1(x) + · · · + 0pn(x) = 0,
so q is the zero polynomial.
63. An n × n matrix is invertible if and only if it row-reduces to the identity matrix, so that its columns are linearly independent. Since n linearly independent columns in Z_p^n form a basis for Z_p^n, finding the number of invertible matrices in Mnn(Zp) is the same as finding the number of ways to construct a basis for Z_p^n.
To construct a basis, we can start by choosing any nonzero vector v1 ∈ Z_p^n. There are p^n − 1 ways to choose this vector (since Z_p^n has p^n elements, and we are excluding only the zero vector). For the second basis element v2, we can choose any vector not in span(v1) = {a1v1 : a1 ∈ Zp}, so there are p^n − p choices for v2. Likewise, suppose we have chosen v1, . . . , vk−1. Then we can choose for vk any vector not in the span of the previous vectors, so any vector not in
span(v1, . . . , vk−1) = {a1v1 + a2v2 + · · · + ak−1vk−1 : a1, a2, . . . , ak−1 ∈ Zp},
a set with p^(k−1) elements. There are therefore p^n − p^(k−1) ways of choosing vk.
Thus the total number of bases is (p^n − 1)(p^n − p)(p^n − p²) · · · (p^n − p^(n−1)).
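For small cases this count can be checked by brute force. A sketch for n = 2 and prime p, using the fact that over the field Zp a 2 × 2 matrix is invertible exactly when its determinant is nonzero mod p:

```python
from itertools import product

def count_invertible(p):
    """Brute-force count of invertible 2x2 matrices over Z_p (p prime)."""
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p != 0)

for p in (2, 3, 5):
    # Compare against the formula (p^n - 1)(p^n - p) with n = 2.
    print(p, count_invertible(p), (p**2 - 1) * (p**2 - p))
```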
Exploration: Magic Squares
1. A classical magic square M contains each of the numbers 1, 2, . . . , n² exactly once. Since wt(M) is the common sum of any row (or column), and there are n rows, we have (using the comments preceding Exercise 51 in Section 2.4)
wt(M) = (1/n)(1 + 2 + · · · + n²) = (1/n) · n²(n² + 1)/2 = n(n² + 1)/2.
2
2. A classical 3 × 3 magic square will have common sum 3(3^2 + 1)/2 = 15. Therefore no two of 7, 8,
and 9 can be in the same row, column, or diagonal, since the sum of any two of these numbers is at
least 15. The central square cannot be 1: every other entry lies on a line through the center, so some
line would contain both 1 and 2, forcing a third entry k with 1 + 2 + k = 15, which is impossible.
Similarly, it cannot be any of 2, 3, or 4, since the line through the center and the entry 1 would then
need a third entry of 12, 11, or 10, respectively, which is impossible. The central square cannot be
7, 8, or 9, since then some line will contain two of those numbers. Finally, it cannot be 6, since then
some line will contain both 6 and 9. So the central square must be 5. Also, 7 and 1 cannot be in the
same line, since the third number would have to be another 7. Finally, 9 cannot be in a corner, since
it would then participate in three lines adding to 15, but there are only two possible sets of numbers
containing 9 that add to 15: {9, 5, 1} and {9, 4, 2}. Thus 9 must be in the middle of a side. Keeping
these constraints in mind and trying a few configurations gives, for example,

[2 7 6]       [8 1 6]
[9 5 1]  and  [3 5 7]
[4 3 8]       [4 9 2]

These examples are essentially the same: the second one is a rotated and reflected version of the
first.
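The two squares can be checked mechanically. A minimal sketch in pure Python:

```python
def lines(M):
    """The three rows, three columns, and two diagonals of a 3 x 3 square."""
    rows = [list(r) for r in M]
    cols = [[M[i][j] for i in range(3)] for j in range(3)]
    diags = [[M[i][i] for i in range(3)], [M[i][2 - i] for i in range(3)]]
    return rows + cols + diags

def is_classical_magic(M, w=15):
    """Entries 1..9 exactly once, and every line sums to w."""
    entries = sorted(v for row in M for v in row)
    return entries == list(range(1, 10)) and all(sum(l) == w for l in lines(M))

M1 = [[2, 7, 6], [9, 5, 1], [4, 3, 8]]
M2 = [[8, 1, 6], [3, 5, 7], [4, 9, 2]]
```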
3. Since the magic squares in Exercise 2 have a weight of 15, subtracting 14/3 from each entry will subtract
14 from each row, column, and diagonal sum while maintaining common sums, so the result will have
weight 1. For example, using the second magic square above, we get

[ 10/3  −11/3    4/3]
[ −5/3    1/3    7/3]
[ −2/3   13/3   −8/3]

In general, starting from any 3 × 3 classical magic square, to construct a magic square of weight w,
add (w − 15)/3 to each entry in the magic square. This clearly preserves the common-sum property, and
the new sum of each row is 15 + 3 · (w − 15)/3 = w.
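The weight-w construction can be checked with exact rational arithmetic. A minimal sketch in pure Python, starting from the second square of Exercise 2:

```python
from fractions import Fraction

M = [[8, 1, 6], [3, 5, 7], [4, 9, 2]]  # classical square, weight 15

def reweight(M, w):
    """Add (w - 15)/3 to every entry, as described above."""
    shift = Fraction(w - 15, 3)
    return [[e + shift for e in row] for row in M]

def line_sums(M):
    rows = [sum(r) for r in M]
    cols = [sum(M[i][j] for i in range(3)) for j in range(3)]
    diags = [sum(M[i][i] for i in range(3)), sum(M[i][2 - i] for i in range(3))]
    return rows + cols + diags

weight_one = reweight(M, 1)   # every line now sums to 1
```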
4. (a) We must show that if A and B are magic squares, then so is A + B, and if c is any scalar, then
cA is also a magic square. Suppose wt(A) = a and wt(B) = b. Then the sum of the elements in
any row of A + B is the sum of the elements in that row of A plus the sum of the elements in that
row of B, which is therefore a + b. The same argument shows that the sum of the elements in any
column or either diagonal of A + B is also a + b, so that A + B ∈ Mag_3 with weight a + b. In the
same way, the sum of the elements in any row or column, or either diagonal, of cA is c times the
sum of the elements in that line of A, so it is equal to ca. Thus cA ∈ Mag_3 with weight ca.
478
CHAPTER 6. VECTOR SPACES
(b) If A, B ∈ Mag_3^0, then wt(A) = wt(B) = 0. From part (a), wt(A + B) = 0 as well, and also
wt(cA) = c wt(A) = 0. Thus A + B, cA ∈ Mag_3^0, so by Theorem 6.2, Mag_3^0 is a subspace of Mag_3.
5. Let M be any 3 × 3 magic square, with wt(M) = w. Using Exercise 3, we define M_0 to be the magic
square of weight 0 derived from M by adding −w/3 to every element of M. This is the same as adding
the matrix

[−w/3 −w/3 −w/3]
[−w/3 −w/3 −w/3]  =  −(w/3) J
[−w/3 −w/3 −w/3]

to M. Thus

M_0 = M − (w/3) J, so that M = M_0 + (w/3) J.

Thus k = w/3.
6. The linear equations express the fact that every row sum, every column sum, and each diagonal sum is zero:

a + b + c = 0
d + e + f = 0
g + h + i = 0
a + d + g = 0
b + e + h = 0
c + f + i = 0
a + e + i = 0
c + e + g = 0
Row-reducing the associated matrix gives

[1 1 1 0 0 0 0 0 0]       [1 0 0 0 0 0 0  0  1]
[0 0 0 1 1 1 0 0 0]       [0 1 0 0 0 0 0  1  0]
[0 0 0 0 0 0 1 1 1]       [0 0 1 0 0 0 0 −1 −1]
[1 0 0 1 0 0 1 0 0]  →    [0 0 0 1 0 0 0 −1 −2]
[0 1 0 0 1 0 0 1 0]       [0 0 0 0 1 0 0  0  0]
[0 0 1 0 0 1 0 0 1]       [0 0 0 0 0 1 0  1  2]
[1 0 0 0 1 0 0 0 1]       [0 0 0 0 0 0 1  1  1]
[0 0 1 0 1 0 1 0 0]       [0 0 0 0 0 0 0  0  0]
So there are two free variables in the general solution, which gives the magic square

[ −s       −t      s + t  ]
[ t + 2s    0     −t − 2s ]
[ −s − t    t      s      ]
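The two-parameter family can be verified directly: for every choice of s and t, all eight line sums vanish. A minimal sketch in pure Python:

```python
def magic_zero(s, t):
    """The general weight-0 magic square found above."""
    return [[-s, -t, s + t],
            [t + 2 * s, 0, -t - 2 * s],
            [-s - t, t, s]]

def line_sums(M):
    rows = [sum(r) for r in M]
    cols = [sum(M[i][j] for i in range(3)) for j in range(3)]
    diags = [sum(M[i][i] for i in range(3)), sum(M[i][2 - i] for i in range(3))]
    return rows + cols + diags

all_zero = all(v == 0
               for s in range(-3, 4) for t in range(-3, 4)
               for v in line_sums(magic_zero(s, t)))
```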
7. From Exercise 6, a basis for Mag_3^0 is given by

     [−1 0  1]        [ 0 −1  1]
A =  [ 2 0 −2] ,  B = [ 1  0 −1] ,
     [−1 0  1]        [−1  1  0]

so that Mag_3^0 has dimension 2. Note that these matrices are linearly independent since they are not
scalar multiples of each other. (The matrix in Exercise 6 is not the same as that given in the hint. To
show that the two are just different forms of one another, make the substitutions u = −s, v = s + t;
then s = −u and t = v − s = u + v, so the matrix becomes

[ u       −u − v    v    ]
[−u + v    0        u − v]
[−v        u + v   −u    ]

which is the matrix given in the hint.)
8. From Exercise 5, an arbitrary element M ∈ Mag_3 can be written

M = M_0 + kJ,    M_0 ∈ Mag_3^0,    J = [1 1 1; 1 1 1; 1 1 1].

Since Mag_3^0 has basis {A, B} (see Exercise 7), it follows that Mag_3 is spanned by S = {A, B, J}. To see that
this is a basis, we must prove that these matrices are linearly independent. Now, we have seen that
A and B are linearly independent, so if S is linearly dependent, then span(S) = span(A, B) = Mag_3^0.
But this is impossible, since S spans Mag_3, which contains matrices not in Mag_3^0. Thus S is linearly
independent, so it forms a basis for Mag_3, which therefore has dimension 3.
9. Writing M = [a b c; d e f; g h i] as in Exercise 6, adding both diagonals and the middle column gives

(a + e + i) + (c + e + g) + (b + e + h) = (a + b + c) + (g + h + i) + 3e,

which is the sum of the first and third rows, plus 3 times the central entry. Since each row, column,
and diagonal sum is w, we get w + w + w = w + w + 3e, so that e = w/3.
10. The sum of the squares of the entries of M is

s^2 + (−s − t)^2 + t^2 + (−s + t)^2 + 0^2 + (s − t)^2 + (−t)^2 + (s + t)^2 + (−s)^2
  = s^2 + (s^2 + 2st + t^2) + t^2 + (s^2 − 2st + t^2) + 0 + (s^2 − 2st + t^2) + t^2 + (s^2 + 2st + t^2) + s^2
  = 6s^2 + 6t^2.

We have another way of computing this sum, however. Since M was obtained from a classical 3 × 3
magic square by subtracting 5 from each entry (see Exercise 5), the sum of the squares of the entries
of M is also

(1 − 5)^2 + (2 − 5)^2 + ··· + (9 − 5)^2 = 60.

So the equation is 6s^2 + 6t^2 = 60, or s^2 + t^2 = 10, which is a circle of radius √10 centered at the origin.
The only way to write 10 as a sum of two integral squares is (±1)^2 + (±3)^2, so this equation has the
eight integral solutions

(s, t) = (−3, −1), (−3, 1), (−1, −3), (−1, 3), (3, −1), (3, 1), (1, −3), (1, 3).
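Both the enumeration of integer points and the squares they produce in the next step can be checked by brute force. A minimal sketch in pure Python, where classical(s, t) applies the M + 5J construction used below:

```python
# All integer points on s^2 + t^2 = 10
sols = [(s, t) for s in range(-4, 5) for t in range(-4, 5)
        if s * s + t * t == 10]

def classical(s, t):
    """M + 5J for the given parameters."""
    return [[s + 5, -s - t + 5, t + 5],
            [-s + t + 5, 5, s - t + 5],
            [-t + 5, s + t + 5, -s + 5]]

# Every solution yields a square using each of 1..9 exactly once
all_classical = all(sorted(v for row in classical(s, t) for v in row)
                    == list(range(1, 10))
                    for s, t in sols)
```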
Substituting those values of s and t into

          [ s + 5       −s − t + 5    t + 5    ]
M + 5J =  [−s + t + 5    5            s − t + 5]
          [−t + 5        s + t + 5   −s + 5    ]
gives the eight classical magic squares

[2 9 4]   [8 3 4]   [2 7 6]   [8 1 6]
[7 5 3]   [1 5 9]   [9 5 1]   [3 5 7]
[6 1 8]   [6 7 2]   [4 3 8]   [4 9 2]

[4 9 2]   [6 7 2]   [4 3 8]   [6 1 8]
[3 5 7]   [1 5 9]   [9 5 1]   [7 5 3]
[8 1 6]   [8 3 4]   [2 7 6]   [2 9 4]

These matrices can all be obtained from one another by some combination of row and column interchanges and matrix transposition. A plot of the circle, with these eight points marked, is
[Figure: the circle s^2 + t^2 = 10 in the (s, t)-plane, with the eight integer points (±1, ±3) and (±3, ±1) marked.]
6.3 Change of Basis
1. (a)
[2; 3] = b_1 [1; 0] + b_2 [0; 1]  ⇒  b_1 = 2, b_2 = 3  ⇒  [x]_B = [2; 3]
[2; 3] = c_1 [1; 1] + c_2 [1; −1]  ⇒  c_1 = 5/2, c_2 = −1/2  ⇒  [x]_C = [5/2; −1/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1  1 | 1 0]  →  [1 0 | 1/2  1/2]   ⇒  P_{C←B} = [1/2  1/2]
[1 −1 | 0 1]     [0 1 | 1/2 −1/2]                [1/2 −1/2]

(c) [x]_C = P_{C←B}[x]_B = [1/2 1/2; 1/2 −1/2][2; 3] = [5/2; −1/2], which is the same as the answer from part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 | 1  1]  ⇒  P_{B←C} = [1  1]
[0 1 | 1 −1]               [1 −1]

(e) [x]_B = P_{B←C}[x]_C = [1 1; 1 −1][5/2; −1/2] = [2; 3], which is the same as the answer from part (a).
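The coordinate computations in Exercise 1 can be replayed with exact arithmetic. A minimal sketch in pure Python, using the matrices found above:

```python
from fractions import Fraction as F

def matvec(A, v):
    return [sum(F(A[i][j]) * v[j] for j in range(len(v))) for i in range(len(A))]

C = [[1, 1], [1, -1]]                            # columns are the C basis vectors
P_CB = [[F(1, 2), F(1, 2)], [F(1, 2), F(-1, 2)]]  # P_{C<-B}
P_BC = [[1, 1], [1, -1]]                          # P_{B<-C}

xB = [F(2), F(3)]           # coordinates in B (the standard basis)
xC = matvec(P_CB, xB)       # change to C coordinates
xB_back = matvec(P_BC, xC)  # and back again
```

Since B is the standard basis here, C applied to [x]_C must reproduce [x]_B; that is the "same vector, two coordinate systems" check.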
2. (a)
[4; −1] = b_1 [1; 0] + b_2 [1; 1]  ⇒  b_1 = 5, b_2 = −1  ⇒  [x]_B = [5; −1]
[4; −1] = c_1 [0; 1] + c_2 [2; 3]  ⇒  c_1 = −7, c_2 = 2  ⇒  [x]_C = [−7; 2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[0 2 | 1 1]  →  [1 0 | −3/2 −1/2]  ⇒  P_{C←B} = [−3/2 −1/2]
[1 3 | 0 1]     [0 1 |  1/2  1/2]               [ 1/2  1/2]

(c) [x]_C = P_{C←B}[x]_B = [−3/2 −1/2; 1/2 1/2][5; −1] = [−7; 2], which is the same as the answer from part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 1 | 0 2]  →  [1 0 | −1 −1]  ⇒  P_{B←C} = [−1 −1]
[0 1 | 1 3]     [0 1 |  1  3]               [ 1  3]

(e) [x]_B = P_{B←C}[x]_C = [−1 −1; 1 3][−7; 2] = [5; −1], which is the same as the answer from part (a).
3. (a)
[1; 0; −1] = b_1 [1; 0; 0] + b_2 [0; 1; 0] + b_3 [0; 0; 1]  ⇒  b_1 = 1, b_2 = 0, b_3 = −1  ⇒  [x]_B = [1; 0; −1]
[1; 0; −1] = c_1 [1; 1; 1] + c_2 [0; 1; 1] + c_3 [0; 0; 1]  ⇒  c_1 = 1, c_2 = −1, c_3 = −1  ⇒  [x]_C = [1; −1; −1]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 0 0 | 1 0 0]      [1 0 0 |  1  0 0]                  [ 1  0 0]
[1 1 0 | 0 1 0]  →   [0 1 0 | −1  1 0]  ⇒  P_{C←B} =   [−1  1 0]
[1 1 1 | 0 0 1]      [0 0 1 |  0 −1 1]                  [ 0 −1 1]

(c) [x]_C = P_{C←B}[x]_B = [1 0 0; −1 1 0; 0 −1 1][1; 0; −1] = [1; −1; −1], which is the same as the answer from part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 0 | 1 0 0]                  [1 0 0]
[0 1 0 | 1 1 0]  ⇒  P_{B←C} =   [1 1 0]
[0 0 1 | 1 1 1]                  [1 1 1]

(e) [x]_B = P_{B←C}[x]_C = [1 0 0; 1 1 0; 1 1 1][1; −1; −1] = [1; 0; −1], which is the same as the answer from part (a).
4. (a)
[3; 1; 5] = b_1 [0; 1; 0] + b_2 [0; 0; 1] + b_3 [1; 0; 0]  ⇒  b_1 = 1, b_2 = 5, b_3 = 3  ⇒  [x]_B = [1; 5; 3]
[3; 1; 5] = c_1 [1; 1; 0] + c_2 [0; 1; 1] + c_3 [1; 0; 1]  ⇒  c_1 = −1/2, c_2 = 3/2, c_3 = 7/2  ⇒  [x]_C = [−1/2; 3/2; 7/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 0 1 | 0 0 1]      [1 0 0 |  1/2 −1/2  1/2]                  [ 1/2 −1/2  1/2]
[1 1 0 | 1 0 0]  →   [0 1 0 |  1/2  1/2 −1/2]  ⇒  P_{C←B} =   [ 1/2  1/2 −1/2]
[0 1 1 | 0 1 0]      [0 0 1 | −1/2  1/2  1/2]                  [−1/2  1/2  1/2]

(c) [x]_C = P_{C←B}[x]_B = [1/2 −1/2 1/2; 1/2 1/2 −1/2; −1/2 1/2 1/2][1; 5; 3] = [−1/2; 3/2; 7/2], which is the same as the answer from part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[0 0 1 | 1 0 1]      [1 0 0 | 1 1 0]                  [1 1 0]
[1 0 0 | 1 1 0]  →   [0 1 0 | 0 1 1]  ⇒  P_{B←C} =   [0 1 1]
[0 1 0 | 0 1 1]      [0 0 1 | 1 0 1]                  [1 0 1]

(e) [x]_B = P_{B←C}[x]_C = [1 1 0; 0 1 1; 1 0 1][−1/2; 3/2; 7/2] = [1; 5; 3], which is the same as the answer from part (a).
5. Note that in terms of the standard basis {1, x},

B = {[1; 0], [0; 1]},   C = {[0; 1], [1; 1]}.

(a)
2 − x = b_1(1) + b_2(x)  ⇒  b_1 = 2, b_2 = −1  ⇒  [2 − x]_B = [2; −1]
2 − x = c_1(x) + c_2(1 + x)  ⇒  c_1 = −3, c_2 = 2  ⇒  [2 − x]_C = [−3; 2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[0 1 | 1 0]  →  [1 0 | −1 1]  ⇒  P_{C←B} = [−1 1]
[1 1 | 0 1]     [0 1 |  1 0]                [ 1 0]

(c) [2 − x]_C = P_{C←B}[2 − x]_B = [−1 1; 1 0][2; −1] = [−3; 2], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 | 0 1]  ⇒  P_{B←C} = [0 1]
[0 1 | 1 1]                [1 1]

(e) [2 − x]_B = P_{B←C}[2 − x]_C = [0 1; 1 1][−3; 2] = [2; −1], which agrees with the answer in part (a).
6. Note that in terms of the standard basis {1, x},

B = {[1; 1], [1; −1]},   C = {[0; 2], [4; 0]}.

(a)
1 + 3x = b_1(1 + x) + b_2(1 − x)  ⇒  b_1 = 2, b_2 = −1  ⇒  [1 + 3x]_B = [2; −1]
1 + 3x = c_1(2x) + c_2(4)  ⇒  c_1 = 3/2, c_2 = 1/4  ⇒  [1 + 3x]_C = [3/2; 1/4]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[0 4 | 1  1]  →  [1 0 | 1/2 −1/2]  ⇒  P_{C←B} = [1/2 −1/2]
[2 0 | 1 −1]     [0 1 | 1/4  1/4]               [1/4  1/4]

(c) [1 + 3x]_C = P_{C←B}[1 + 3x]_B = [1/2 −1/2; 1/4 1/4][2; −1] = [3/2; 1/4], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1  1 | 0 4]  →  [1 0 |  1 2]  ⇒  P_{B←C} = [ 1 2]
[1 −1 | 2 0]     [0 1 | −1 2]               [−1 2]

(e) [1 + 3x]_B = P_{B←C}[1 + 3x]_C = [1 2; −1 2][3/2; 1/4] = [2; −1], which agrees with the answer in part (a).

7. Note that in terms of the standard basis {1, x, x^2},

B = {[1; 1; 1], [0; 1; 1], [0; 0; 1]},   C = {[1; 0; 0], [0; 1; 0], [0; 0; 1]}.
(a)
1 + x^2 = b_1(1 + x + x^2) + b_2(x + x^2) + b_3(x^2)  ⇒  b_1 = 1, b_2 = −1, b_3 = 1  ⇒  [1 + x^2]_B = [1; −1; 1]
1 + x^2 = c_1(1) + c_2(x) + c_3(x^2)  ⇒  c_1 = 1, c_2 = 0, c_3 = 1  ⇒  [1 + x^2]_C = [1; 0; 1]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 0 0 | 1 0 0]                  [1 0 0]
[0 1 0 | 1 1 0]  ⇒  P_{C←B} =   [1 1 0]
[0 0 1 | 1 1 1]                  [1 1 1]

(c) [1 + x^2]_C = P_{C←B}[1 + x^2]_B = [1 0 0; 1 1 0; 1 1 1][1; −1; 1] = [1; 0; 1], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 0 | 1 0 0]      [1 0 0 |  1  0 0]                  [ 1  0 0]
[1 1 0 | 0 1 0]  →   [0 1 0 | −1  1 0]  ⇒  P_{B←C} =   [−1  1 0]
[1 1 1 | 0 0 1]      [0 0 1 |  0 −1 1]                  [ 0 −1 1]

(e) [1 + x^2]_B = P_{B←C}[1 + x^2]_C = [1 0 0; −1 1 0; 0 −1 1][1; 0; 1] = [1; −1; 1], which agrees with the answer in part (a).
8. Note that in terms of the standard basis {1, x, x^2},

B = {[0; 1; 0], [1; 0; 1], [0; 1; 1]},   C = {[1; 0; 0], [1; 1; 0], [0; 0; 1]}.

(a)
4 − 2x − x^2 = b_1(x) + b_2(1 + x^2) + b_3(x + x^2)  ⇒  b_1 = 3, b_2 = 4, b_3 = −5  ⇒  [4 − 2x − x^2]_B = [3; 4; −5]
4 − 2x − x^2 = c_1(1) + c_2(1 + x) + c_3(x^2)  ⇒  c_1 = 6, c_2 = −2, c_3 = −1  ⇒  [4 − 2x − x^2]_C = [6; −2; −1]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 1 0 | 0 1 0]      [1 0 0 | −1 1 −1]                  [−1 1 −1]
[0 1 0 | 1 0 1]  →   [0 1 0 |  1 0  1]  ⇒  P_{C←B} =   [ 1 0  1]
[0 0 1 | 0 1 1]      [0 0 1 |  0 1  1]                  [ 0 1  1]

(c) [4 − 2x − x^2]_C = P_{C←B}[4 − 2x − x^2]_B = [−1 1 −1; 1 0 1; 0 1 1][3; 4; −5] = [6; −2; −1], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[0 1 0 | 1 1 0]      [1 0 0 |  1  2 −1]                  [ 1  2 −1]
[1 0 1 | 0 1 0]  →   [0 1 0 |  1  1  0]  ⇒  P_{B←C} =   [ 1  1  0]
[0 1 1 | 0 0 1]      [0 0 1 | −1 −1  1]                  [−1 −1  1]

(e) [4 − 2x − x^2]_B = P_{B←C}[4 − 2x − x^2]_C = [1 2 −1; 1 1 0; −1 −1 1][6; −2; −1] = [3; 4; −5], which agrees with the answer in part (a).
9. (a) Working with coordinate vectors in R^4 (so that a 2 × 2 matrix corresponds to the column of its entries):

[4; 2; 0; −1] = b_1 [1; 0; 0; 0] + b_2 [0; 1; 0; 0] + b_3 [0; 0; 1; 0] + b_4 [0; 0; 0; 1]
  ⇒  b_1 = 4, b_2 = 2, b_3 = 0, b_4 = −1  ⇒  [A]_B = [4; 2; 0; −1]
[4; 2; 0; −1] = c_1 [1; 2; 0; −1] + c_2 [2; 1; 1; 0] + c_3 [1; 1; 0; 1] + c_4 [1; 0; 0; 1]
  ⇒  c_1 = 5/2, c_2 = 0, c_3 = −3, c_4 = 9/2  ⇒  [A]_C = [5/2; 0; −3; 9/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[ 1 2 1 1 | 1 0 0 0]      [1 0 0 0 |  1/2  0 −1 −1/2]
[ 2 1 1 0 | 0 1 0 0]  →   [0 1 0 0 |  0    0  1  0  ]
[ 0 1 0 0 | 0 0 1 0]      [0 0 1 0 | −1    1  1  1  ]
[−1 0 1 1 | 0 0 0 1]      [0 0 0 1 |  3/2 −1 −2 −1/2]

⇒ P_{C←B} = [1/2 0 −1 −1/2; 0 0 1 0; −1 1 1 1; 3/2 −1 −2 −1/2].

(c) [A]_C = P_{C←B}[A]_B = P_{C←B}[4; 2; 0; −1] = [5/2; 0; −3; 9/2], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]; since B is the standard basis, this gives immediately

P_{B←C} = [1 2 1 1; 2 1 1 0; 0 1 0 0; −1 0 1 1].

(e) [A]_B = P_{B←C}[A]_C = P_{B←C}[5/2; 0; −3; 9/2] = [4; 2; 0; −1], which agrees with the result in part (a).
10. (a)
[1; 1; 1; 1] = b_1 [1; 0; 0; 1] + b_2 [0; 1; 1; 0] + b_3 [1; 1; 0; 0] + b_4 [1; 0; 1; 0]
  ⇒  b_1 = 1, b_2 = 1, b_3 = 0, b_4 = 0  ⇒  [A]_B = [1; 1; 0; 0]
[1; 1; 1; 1] = c_1 [1; 1; 0; 1] + c_2 [1; 1; 1; 0] + c_3 [1; 0; 1; 1] + c_4 [0; 1; 1; 1]
  ⇒  c_1 = c_2 = c_3 = c_4 = 1/3  ⇒  [A]_C = [1/3; 1/3; 1/3; 1/3]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1 1 1 0 | 1 0 1 1]      [1 0 0 0 |  2/3 −1/3  2/3 −1/3]
[1 1 0 1 | 0 1 1 0]  →   [0 1 0 0 | −1/3  2/3  2/3  2/3]
[0 1 1 1 | 0 1 0 1]      [0 0 1 0 |  2/3 −1/3 −1/3  2/3]
[1 0 1 1 | 1 0 0 0]      [0 0 0 1 | −1/3  2/3 −1/3 −1/3]

⇒ P_{C←B} = [2/3 −1/3 2/3 −1/3; −1/3 2/3 2/3 2/3; 2/3 −1/3 −1/3 2/3; −1/3 2/3 −1/3 −1/3].

(c) [A]_C = P_{C←B}[A]_B = P_{C←B}[1; 1; 0; 0] = [1/3; 1/3; 1/3; 1/3], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 1 1 | 1 1 1 0]      [1 0 0 0 |  1    0    1    1  ]
[0 1 1 0 | 1 1 0 1]  →   [0 1 0 0 |  1/2  1/2  1/2  3/2]
[0 1 0 1 | 0 1 1 1]      [0 0 1 0 |  1/2  1/2 −1/2 −1/2]
[1 0 0 0 | 1 0 1 1]      [0 0 0 1 | −1/2  1/2  1/2 −1/2]

⇒ P_{B←C} = [1 0 1 1; 1/2 1/2 1/2 3/2; 1/2 1/2 −1/2 −1/2; −1/2 1/2 1/2 −1/2].

(e) [A]_B = P_{B←C}[A]_C = P_{B←C}[1/3; 1/3; 1/3; 1/3] = [1; 1; 0; 0], which agrees with the result in part (a).
11. Note that in terms of the standard basis {sin x, cos x},

B = {[1; 1], [0; 1]},   C = {[1; 1], [1; −1]}.

(a)
2 sin x − 3 cos x = b_1(sin x + cos x) + b_2(cos x)  ⇒  b_1 = 2, b_2 = −5  ⇒  [2 sin x − 3 cos x]_B = [2; −5]
2 sin x − 3 cos x = c_1(sin x + cos x) + c_2(sin x − cos x)  ⇒  c_1 = −1/2, c_2 = 5/2  ⇒  [2 sin x − 3 cos x]_C = [−1/2; 5/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[1  1 | 1 0]  →  [1 0 | 1  1/2]  ⇒  P_{C←B} = [1  1/2]
[1 −1 | 0 1]     [0 1 | 0 −1/2]               [0 −1/2]

(c) [2 sin x − 3 cos x]_C = P_{C←B}[2 sin x − 3 cos x]_B = [1 1/2; 0 −1/2][2; −5] = [−1/2; 5/2], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 | 1  1]  →  [1 0 | 1  1]  ⇒  P_{B←C} = [1  1]
[1 1 | 1 −1]     [0 1 | 0 −2]               [0 −2]

(e) [2 sin x − 3 cos x]_B = P_{B←C}[2 sin x − 3 cos x]_C = [1 1; 0 −2][−1/2; 5/2] = [2; −5], which agrees with the answer in part (a).
12. Note that in terms of the standard basis {sin x, cos x},

B = {[1; 1], [0; 1]},   C = {[−1; 1], [1; 1]}.

(a)
sin x = b_1(sin x + cos x) + b_2(cos x)  ⇒  b_1 = 1, b_2 = −1  ⇒  [sin x]_B = [1; −1]
sin x = c_1(cos x − sin x) + c_2(sin x + cos x)  ⇒  c_1 = −1/2, c_2 = 1/2  ⇒  [sin x]_C = [−1/2; 1/2]

(b) By Theorem 6.13, [C | B] → [I | P_{C←B}]:

[−1 1 | 1 0]  →  [1 0 | 0 1/2]  ⇒  P_{C←B} = [0 1/2]
[ 1 1 | 0 1]     [0 1 | 1 1/2]               [1 1/2]

(c) [sin x]_C = P_{C←B}[sin x]_B = [0 1/2; 1 1/2][1; −1] = [−1/2; 1/2], which agrees with the answer in part (a).

(d) By Theorem 6.13, [B | C] → [I | P_{B←C}]:

[1 0 | −1 1]  →  [1 0 | −1 1]  ⇒  P_{B←C} = [−1 1]
[1 1 |  1 1]     [0 1 |  2 0]               [ 2 0]

(e) [sin x]_B = P_{B←C}[sin x]_C = [−1 1; 2 0][−1/2; 1/2] = [1; −1], which agrees with the answer in part (a).
13. From Section 3.6, Example 3.58, if B is the usual basis and C is the basis whose basis vectors are the
rotations of the usual basis vectors through 60°, the matrix of the rotation is

P_{C←B} = [cos 60°  sin 60°; −sin 60°  cos 60°] = [1/2  √3/2; −√3/2  1/2].

Then

(a) [x]_C = P_{C←B}[x]_B = [1/2 √3/2; −√3/2 1/2][3; 2] = [3/2 + √3; 1 − 3√3/2].

(b) Since P_{C←B} is an orthogonal matrix, we have

[x]_B = (P_{C←B})^{−1}[x]_C = (P_{C←B})^T [x]_C = [1/2 −√3/2; √3/2 1/2][4; −4] = [2√3 + 2; 2√3 − 2].
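The rotation computation can be checked in floating point; since the matrix is orthogonal, its transpose undoes it. A minimal sketch in pure Python for the 60° case:

```python
import math

theta = math.radians(60)
P = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]   # P_{C<-B} for a 60 degree rotation

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

xB = [3.0, 2.0]
xC = matvec(P, xB)
expected = [1.5 + math.sqrt(3), 1.0 - 1.5 * math.sqrt(3)]

# applying the transpose (= inverse, by orthogonality) recovers xB
PT = [[P[0][0], P[1][0]], [P[0][1], P[1][1]]]
xB_back = matvec(PT, xC)
```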
14. From Section 3.6, Example 3.58, if B is the usual basis and C is the basis whose basis vectors are the
rotations of the usual basis vectors through 135°, the matrix of the rotation is

P_{C←B} = [cos 135°  sin 135°; −sin 135°  cos 135°] = [−√2/2  √2/2; −√2/2  −√2/2].

Then

(a) [x]_C = P_{C←B}[x]_B = [−√2/2 √2/2; −√2/2 −√2/2][3; 2] = [−√2/2; −5√2/2].

(b) Since P_{C←B} is an orthogonal matrix, we have

[x]_B = (P_{C←B})^{−1}[x]_C = (P_{C←B})^T [x]_C = [−√2/2 −√2/2; √2/2 −√2/2][4; −4] = [0; 4√2].
15. By definition, the columns of P_{C←B} are the coordinates of the vectors of B in the basis C, so we have

[u_1]_C = [1; −1], so that u_1 = 1·[1; 2] − 1·[2; 3] = [−1; −1]
[u_2]_C = [−1; 2], so that u_2 = −1·[1; 2] + 2·[2; 3] = [3; 4].

Therefore

B = {u_1, u_2} = {[−1; −1], [3; 4]}.
16. Let C = {p_1(x), p_2(x), p_3(x)}. By definition, the columns of P_{C←B} are the coordinates of the vectors of B in the basis
C, so that

[x]_C = [1; 0; −1]  ⇒  x = p_1(x) − p_3(x),
[1 + x]_C = [0; 2; 1]  ⇒  1 + x = 2p_2(x) + p_3(x),
[1 − x + x^2]_C = [0; 1; 1]  ⇒  1 − x + x^2 = p_2(x) + p_3(x).

Subtract the third equation from the second, giving p_2(x) = 2x − x^2; substituting back into the third
equation gives p_3(x) = 1 − 3x + 2x^2; then from the first equation, p_1(x) = 1 − 2x + 2x^2. So

C = {p_1(x), p_2(x), p_3(x)} = {1 − 2x + 2x^2, 2x − x^2, 1 − 3x + 2x^2}.
17. Since a = 1 in this case and we are considering a quadratic, we have for the Taylor basis of P_2

B = {1, x − 1, (x − 1)^2} = {1, x − 1, x^2 − 2x + 1}.

Let C = {1, x, x^2} be the standard basis for P_2. Then

[p(x)]_C = [1; 2; −5].

To find the change-of-basis matrix P_{B←C}, we reduce [B | C] → [I | P_{B←C}]:

[1 −1  1 | 1 0 0]      [1 0 0 | 1 1 1]                  [1 1 1]
[0  1 −2 | 0 1 0]  →   [0 1 0 | 0 1 2]  ⇒  P_{B←C} =   [0 1 2]
[0  0  1 | 0 0 1]      [0 0 1 | 0 0 1]                  [0 0 1]

Therefore

[p(x)]_B = P_{B←C}[p(x)]_C = [1 1 1; 0 1 2; 0 0 1][1; 2; −5] = [−2; −8; −5],

so that the Taylor polynomial centered at 1 is

−2 − 8(x − 1) − 5(x − 1)^2.
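The re-centered form can be checked by comparing values: two polynomials of degree at most 2 that agree at three or more points are identical. A minimal sketch in pure Python:

```python
def p(x):
    """p(x) = 1 + 2x - 5x^2, the polynomial with [p]_C = (1, 2, -5)."""
    return 1 + 2 * x - 5 * x * x

def taylor(x):
    """The Taylor form found above, centered at 1."""
    return -2 - 8 * (x - 1) - 5 * (x - 1) ** 2

agree = all(p(x) == taylor(x) for x in range(-10, 11))
```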
18. Since a = −2 in this case and we are considering a quadratic, we have for the Taylor basis of P_2

B = {1, x + 2, (x + 2)^2} = {1, x + 2, x^2 + 4x + 4}.

Let C = {1, x, x^2} be the standard basis for P_2. Then

[p(x)]_C = [1; 2; −5].

To find the change-of-basis matrix P_{B←C}, we reduce [B | C] → [I | P_{B←C}]:

[1 2 4 | 1 0 0]      [1 0 0 | 1 −2  4]                  [1 −2  4]
[0 1 4 | 0 1 0]  →   [0 1 0 | 0  1 −4]  ⇒  P_{B←C} =   [0  1 −4]
[0 0 1 | 0 0 1]      [0 0 1 | 0  0  1]                  [0  0  1]

Therefore

[p(x)]_B = P_{B←C}[p(x)]_C = [1 −2 4; 0 1 −4; 0 0 1][1; 2; −5] = [−23; 22; −5],

so that the Taylor polynomial centered at −2 is

−23 + 22(x + 2) − 5(x + 2)^2.
19. Since a = −1 in this case and we are considering a cubic, we have for the Taylor basis of P_3

B = {1, x + 1, (x + 1)^2, (x + 1)^3} = {1, x + 1, x^2 + 2x + 1, x^3 + 3x^2 + 3x + 1}.

Let C = {1, x, x^2, x^3} be the standard basis for P_3. Then

[p(x)]_C = [0; 0; 0; 1].

To find the change-of-basis matrix P_{B←C}, we reduce [B | C] → [I | P_{B←C}]:

[1 1 1 1 | 1 0 0 0]      [1 0 0 0 | 1 −1  1 −1]
[0 1 2 3 | 0 1 0 0]  →   [0 1 0 0 | 0  1 −2  3]
[0 0 1 3 | 0 0 1 0]      [0 0 1 0 | 0  0  1 −3]
[0 0 0 1 | 0 0 0 1]      [0 0 0 1 | 0  0  0  1]

Therefore

[p(x)]_B = P_{B←C}[p(x)]_C = [1 −1 1 −1; 0 1 −2 3; 0 0 1 −3; 0 0 0 1][0; 0; 0; 1] = [−1; 3; −3; 1],

so that the Taylor polynomial centered at −1 is

−1 + 3(x + 1) − 3(x + 1)^2 + (x + 1)^3.
20. Since a = 1/2 in this case and we are considering a cubic, we have for the Taylor basis of P_3

B = {1, x − 1/2, (x − 1/2)^2, (x − 1/2)^3} = {1, x − 1/2, x^2 − x + 1/4, x^3 − (3/2)x^2 + (3/4)x − 1/8}.

Let C = {1, x, x^2, x^3} be the standard basis for P_3. Then

[p(x)]_C = [0; 0; 0; 1].

To find the change-of-basis matrix P_{B←C}, we reduce [B | C] → [I | P_{B←C}]:

[1 −1/2  1/4 −1/8 | 1 0 0 0]      [1 0 0 0 | 1 1/2 1/4 1/8]
[0  1   −1    3/4 | 0 1 0 0]  →   [0 1 0 0 | 0 1   1   3/4]
[0  0    1   −3/2 | 0 0 1 0]      [0 0 1 0 | 0 0   1   3/2]
[0  0    0    1   | 0 0 0 1]      [0 0 0 1 | 0 0   0   1  ]

Therefore

[p(x)]_B = P_{B←C}[p(x)]_C = [1/8; 3/4; 3/2; 1],

so that the Taylor polynomial centered at 1/2 is

1/8 + (3/4)(x − 1/2) + (3/2)(x − 1/2)^2 + (x − 1/2)^3.
21. Since [x]D = PD←C [x]C and [x]C = PC←B [x]B , we get
[x]D = PD←C [x]C = PD←C (PC←B [x]B ) = (PD←C PC←B )[x]B .
Since the change-of-basis matrix is unique (its columns are determined by the two bases), it follows
that
PD←C PC←B = PD←B .
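The identity P_{D←C} P_{C←B} = P_{D←B} can be illustrated concretely. A minimal sketch in pure Python; the three bases of R^2 are example choices, and each change-of-basis matrix is computed as, e.g., C^{-1}B, which is exactly what the reduction [C | B] → [I | P_{C←B}] produces:

```python
from fractions import Fraction as F

def inv2(M):
    """Exact inverse of a 2 x 2 matrix."""
    a, b = M[0]
    c, d = M[1]
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# columns of each matrix are the basis vectors (example bases)
Bm = [[1, 2], [0, 1]]
Cm = [[1, 1], [1, -1]]
Dm = [[2, 0], [1, 1]]

P_CB = mul(inv2(Cm), Bm)
P_DC = mul(inv2(Dm), Cm)
P_DB = mul(inv2(Dm), Bm)
```

The assertion P_{D←C} P_{C←B} = P_{D←B} then holds exactly, matching the uniqueness argument in Exercise 21.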
22. By Theorem 6.10 (c), to show that C is a basis it suffices to show that it is linearly independent, since
C has n elements and dim V = n. Since P is invertible, the columns of P are linearly independent. But
the given equation says that [ui ]B = pi , since the elements of that column are the coordinates of ui with
respect to the basis B. Since the columns of P are linearly independent, so is {[u1 ]B , [u2 ]B , . . . , [un ]B }.
But then by Theorem 6.7 in Section 6.2, it follows that {u1 , u2 , . . . , un } is linearly independent, so
that C is a basis for V .
To show that P = PB←C , recall that the columns of PB←C are the coordinate vectors of C with respect
to the basis B. But looking at the given equation, we see that this is exactly what the columns of P
are. Thus P = PB←C .
6.4 Linear Transformations
1. T is a linear transformation:

T([a b; c d] + [a′ b′; c′ d′]) = T([a + a′  b + b′; c + c′  d + d′])
  = [(a + a′) + (b + b′)  0; 0  (c + c′) + (d + d′)]
  = [a + b  0; 0  c + d] + [a′ + b′  0; 0  c′ + d′]
  = T([a b; c d]) + T([a′ b′; c′ d′])

and

T(α[a b; c d]) = T([αa αb; αc αd]) = [αa + αb  0; 0  αc + αd] = α[a + b  0; 0  c + d] = αT([a b; c d]).
2. T is not a linear transformation:

T([w x; y z] + [w′ x′; y′ z′]) = T([w + w′  x + x′; y + y′  z + z′]) = [1  (w + w′) − (z + z′); (x + x′) − (y + y′)  1]

while

T([w x; y z]) + T([w′ x′; y′ z′]) = [1  w − z; x − y  1] + [1  w′ − z′; x′ − y′  1] = [2  (w + w′) − (z + z′); (x + x′) − (y + y′)  2].

Since the two are not equal, T is not a linear transformation.
3. T is a linear transformation. If A, C ∈ Mnn and α is a scalar, then
T (A + C) = (A + C)B = AB + CB = T (A) + T (C)
T (αA) = (αA)B = α(AB) = αT (A).
4. T is a linear transformation. If A, C ∈ Mnn and α is a scalar, then
T (A + C) = (A + C)B − B(A + C) = AB + CB − BA − BC = (AB − BA) + (CB − BC)
= T (A) + T (C)
T (αA) = (αA)B − B(αA) = α(AB) − α(BA) = α(AB − BA) = αT (A).
5. T is a linear transformation, since by Exercise 44 in Section 3.2, tr(A + B) = tr(A) + tr(B) and
tr(αA) = α tr(A).
6. T is not a linear transformation. For example, in M22 , let A = I; then T (2A) = T (2I) = 2 · 2 = 4,
while 2T (A) = 2T (I) = 2(1 · 1) = 2.
7. T is not a linear transformation. For example, let A be any invertible matrix in Mnn ; then −A is
invertible as well, so that rank(A) = rank(−A) = n. Then
T (A) + T (−A) = rank(A) + rank(−A) = 2n while T (A + (−A)) = T (O) = rank(O) = 0.
8. Since T(0) = 1 + x + x^2 ≠ 0, T is not a linear transformation by Theorem 6.14(a).
9. T is a linear transformation. Let p(x) = a + bx + cx^2 and p′(x) = a′ + b′x + c′x^2 be elements of P_2,
and let α be a scalar. Then

(T(p + p′))(d) = (T((a + a′) + (b + b′)x + (c + c′)x^2))(d)
  = (a + a′) + (b + b′)(d + 1) + (c + c′)(d + 1)^2
  = (a + b(d + 1) + c(d + 1)^2) + (a′ + b′(d + 1) + c′(d + 1)^2)
  = (T(p))(d) + (T(p′))(d),

and

(T(αp))(d) = (T(αa + αbx + αcx^2))(d) = αa + αb(d + 1) + αc(d + 1)^2
  = α(a + b(d + 1) + c(d + 1)^2) = α(T(p))(d).
10. T is a linear transformation. Let f, g ∈ F and α be a scalar. Then
(T (f + g))(x) = (f + g)(x2 ) = f (x2 ) + g(x2 ) = (T (f ))(x) + (T (g))(x)
(T (αf ))(x) = (αf )(x2 ) = αf (x2 ) = α(T (f ))(x).
11. T is not a linear transformation. For example, let f be any function in F that takes on a value other
than 0 or 1. Then

(T(2f))(x) = ((2f)(x))^2 = (2f(x))^2 = 4f(x)^2 ≠ 2f(x)^2 = 2(T(f))(x),

where the inequality holds because of the conditions on f.
12. This is similar to Exercise 10; T is a linear transformation. Let f, g ∈ F and α be a scalar. Then
(T (f + g))(c) = (f + g)(c) = f (c) + g(c) = (T (f ))(c) + (T (g))(c)
(T (αf ))(c) = (αf )(c) = αf (c) = α(T (f ))(c)
13. For T, we have (with α a scalar)

T([a; b] + [c; d]) = T([a + c; b + d]) = (a + c) + ((a + c) + (b + d))x
  = (a + (a + b)x) + (c + (c + d)x) = T([a; b]) + T([c; d])
T(α[a; b]) = T([αa; αb]) = αa + (αa + αb)x = α(a + (a + b)x) = αT([a; b]).

For S, suppose that p(x), q(x) ∈ P_1 and α is a scalar; then

S(p(x) + q(x)) = x(p(x) + q(x)) = xp(x) + xq(x) = S(p(x)) + S(q(x))
S(αp(x)) = x(αp(x)) = α(xp(x)) = αS(p(x)).
14. Using the comment following the definition of a linear transformation, we have

T([5; 2]) = T(5[1; 0] + 2[0; 1]) = 5T([1; 0]) + 2T([0; 1]) = 5[1; 2; −1] + 2[3; 0; 4] = [5; 10; −5] + [6; 0; 8] = [11; 10; 3].

In a similar way,

T([a; b]) = T(a[1; 0] + b[0; 1]) = aT([1; 0]) + bT([0; 1]) = a[1; 2; −1] + b[3; 0; 4] = [a + 3b; 2a; −a + 4b].
15. We must first find the coordinates of [−7; 9] in the basis {[1; 1], [3; −1]}. So we want to find c, d such that

[−7; 9] = c[1; 1] + d[3; −1]  ⇒  c + 3d = −7, c − d = 9  ⇒  c = 5, d = −4.

Then using the comment following the definition of a linear transformation, we have

T([−7; 9]) = T(5[1; 1] − 4[3; −1]) = 5T([1; 1]) − 4T([3; −1]) = 5(1 − 2x) − 4(x + 2x^2) = −8x^2 − 14x + 5.

The second part is much the same, except that we need to find c, d such that

[a; b] = c[1; 1] + d[3; −1]  ⇒  c + 3d = a, c − d = b  ⇒  c = (a + 3b)/4, d = (a − b)/4.

Then using the comment following the definition of a linear transformation, we have

T([a; b]) = T(((a + 3b)/4)[1; 1] + ((a − b)/4)[3; −1])
  = ((a + 3b)/4) T([1; 1]) + ((a − b)/4) T([3; −1])
  = ((a + 3b)/4)(1 − 2x) + ((a − b)/4)(x + 2x^2)
  = ((a − b)/2)x^2 − ((a + 7b)/4)x + (a + 3b)/4.
16. Since T is linear,

T(6 + x − 4x^2) = 6T(1) + T(x) − 4T(x^2) = 6(3 − 2x) + (4x − x^2) − 4(2 + 2x^2) = 10 − 8x − 9x^2.

Similarly,

T(a + bx + cx^2) = aT(1) + bT(x) + cT(x^2) = a(3 − 2x) + b(4x − x^2) + c(2 + 2x^2) = (3a + 2c) + (4b − 2a)x + (2c − b)x^2.
17. We do the second part first, then substitute for a, b, and c to get the first part. We must find the
coordinates of a + bx + cx^2 in the basis {1 + x, x + x^2, 1 + x^2}. So we want to find r, s, t such that

a + bx + cx^2 = r(1 + x) + s(x + x^2) + t(1 + x^2) = (r + t) + (r + s)x + (s + t)x^2,

so r + t = a, r + s = b, s + t = c, giving

r = (a + b − c)/2,  s = (−a + b + c)/2,  t = (a − b + c)/2.

Then using the comment following the definition of a linear transformation, we have

T(a + bx + cx^2) = ((a + b − c)/2) T(1 + x) + ((−a + b + c)/2) T(x + x^2) + ((a − b + c)/2) T(1 + x^2)
  = ((a + b − c)/2)(1 + x^2) + ((−a + b + c)/2)(x − x^2) + ((a − b + c)/2)(1 + x + x^2)
  = a + cx + ((3a − b − c)/2) x^2.

For the first part, substitute a = 4, b = −1, and c = 3 to get

T(4 − x + 3x^2) = 4 + 3x + ((12 + 1 − 3)/2) x^2 = 4 + 3x + 5x^2.
18. We do the second part first, then substitute for a, b, c, and d to get the first part. We must find the
coordinates of [a b; c d] in the basis {[1 0; 0 0], [1 1; 0 0], [1 1; 1 0], [1 1; 1 1]}. So we want to find r, s, t, u such that

[a b; c d] = r[1 0; 0 0] + s[1 1; 0 0] + t[1 1; 1 0] + u[1 1; 1 1],

so r + s + t + u = a, s + t + u = b, t + u = c, u = d, giving

r = a − b,  s = b − c,  t = c − d,  u = d.

Then using the comment following the definition of a linear transformation, we have

T([a b; c d]) = (a − b)T([1 0; 0 0]) + (b − c)T([1 1; 0 0]) + (c − d)T([1 1; 1 0]) + dT([1 1; 1 1])
  = (a − b)·1 + (b − c)·2 + (c − d)·3 + d·4
  = a + b + c + d.

For the first part, substitute a = 1, b = 3, c = 4, d = 2 to get

T([1 3; 4 2]) = 1 + 3 + 4 + 2 = 10.
19. Let

T(E_11) = a,  T(E_12) = b,  T(E_21) = c,  T(E_22) = d.

Then

T([w x; y z]) = T(wE_11 + xE_12 + yE_21 + zE_22) = wT(E_11) + xT(E_12) + yT(E_21) + zT(E_22) = aw + bx + cy + dz.
20. Writing

[0; 6; −8] = a[2; 1; 0] + b[3; 0; 2]

means that we must have b = −4, and then a = 6, so that if T were a linear transformation, then

T([0; 6; −8]) = 6T([2; 1; 0]) − 4T([3; 0; 2]) = 6(1 + x) − 4(2 − x + x^2) = −2 + 10x − 4x^2 ≠ −2 + 2x^2.

This is a contradiction, so T cannot be linear.
21. Choose v ∈ V . Then −v = 0 − v by definition, so using part (a) of Theorem 6.14 and the definition
of a linear transformation,
T (−v) = T (0 − v) = T (0) − T (v) = 0 − T (v) = −T (v).
22. Choose v ∈ V. Then there are scalars c_1, ..., c_n such that v = Σ_{i=1}^n c_i v_i, since the v_i are a basis.
Since T is linear, using the comment following the definition of a linear transformation we get

T(v) = T(Σ_{i=1}^n c_i v_i) = Σ_{i=1}^n c_i T(v_i) = Σ_{i=1}^n c_i v_i = v.

Since T(v) = v for every vector v ∈ V, T is the identity transformation.
23. We must show that if p(x) ∈ P_n, then T(p(x)) = p′(x). Suppose that p(x) = Σ_{k=0}^n a_k x^k. Then since T
is linear, using the comment following the definition of a linear transformation we get

T(p(x)) = T(Σ_{k=0}^n a_k x^k) = Σ_{k=0}^n a_k T(x^k) = a_0 T(1) + Σ_{k=1}^n a_k T(x^k) = Σ_{k=1}^n k a_k x^{k−1} = p′(x).
24. (a) We prove the contrapositive: suppose that {v1 , . . . , vn } is linearly dependent in V ; then we have
c1 v1 + · · · + cn vn = 0,
where not all of the ci are zero. Then
T (c1 v1 + · · · + cn vn ) = c1 T (v1 ) + · · · + cn T (vn ) = T (0) = 0,
showing that {T (v1 ), . . . , T (vn )} is linearly dependent in W . Therefore if {T (v1 ), . . . , T (vn )} is
linearly independent in W , we must also have that {v1 , . . . , vn } is linearly independent in V .
(b) For example, the zero transformation from V to W, which takes every element of V to 0, shows
that the converse is false. There are many other examples as well, however. For instance, let

T : R^2 → R^2 : [x; y] ↦ [x; 0].

Then e_1 and e_2 are linearly independent in V, but

T(e_1) = [1; 0]  and  T(e_2) = [0; 0]

are linearly dependent.
25. We have

(S ∘ T)([2; 1]) = S(T([2; 1])) = S([5; −1]) = [4 −1; 0 6]

(S ∘ T)([x; y]) = S(T([x; y])) = S([2x + y; −y]) = [2x  −y; 0  2x + 2y].

Since the domain of T is R^2 but the codomain of S is M_22, T ∘ S does not make sense and we cannot
compute it.
26. We have
(S ◦ T )(3 + 2x − x2 ) = S(T (3 + 2x − x2 )) = S(2 + 2 · (−1)x) = S(2 − 2x)
= 2 + (2 − 2)x + 2 · (−2)x2 = 2 − 4x2
(S ◦ T )(a + bx + cx2 ) = S(T (a + bx + cx2 )) = S(b + 2cx) = b + (b + 2c)x + 2(2c)x2
= b + (b + 2c)x + 4cx2 .
Since the domain of S is P1 , which equals the codomain of T , we can compute T ◦ S:
(T ◦ S)(a + bx) = T (S(a + bx)) = T (a + (a + b)x + 2bx2 ) = (a + b) + 2 · 2bx = (a + b) + 4bx.
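These compositions can also be checked symbolically. A minimal sketch, assuming T is the differentiation map P2 → P1 and S(a + bx) = a + (a + b)x + 2bx², consistent with the computations above:

```python
import sympy as sp

x = sp.symbols('x')

def T(p):
    # T : P2 -> P1 is differentiation, as in the solution above
    return sp.diff(p, x)

def S(p):
    # S(a + bx) = a + (a + b)x + 2bx^2; a and b are read off from p
    a = p.subs(x, 0)
    b = sp.diff(p, x).subs(x, 0)
    return a + (a + b) * x + 2 * b * x**2

composed = sp.expand(S(T(3 + 2 * x - x**2)))
print(composed)  # equals 2 - 4*x**2, as computed above
```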
27. The Chain Rule says that (f ∘ g)′(x) = g′(x)f′(g(x)). Let g(x) = x + 1 and f(x) = p(x). Then
(S ∘ T)(p(x)) = S(T(p(x))) = S(p′(x)) = p′(x + 1)
(T ∘ S)(p(x)) = T(S(p(x))) = T(p(x + 1)) = T((p ∘ g)(x)) = (p ∘ g)′(x) = g′(x)p′(g(x)) = p′(x + 1)
since the derivative of g(x) is 1.
28. Use the Chain Rule from the previous exercise, and let g(x) = x + 1:
(S ∘ T)(p(x)) = S(T(p(x))) = S(xp′(x)) = (x + 1)p′(x + 1)
(T ∘ S)(p(x)) = T(S(p(x))) = T(p(x + 1)) = T((p ∘ g)(x)) = x(p ∘ g)′(x) = xg′(x)p′(g(x)) = xp′(x + 1)
since the derivative of g(x) is 1.
29. We must verify that S ∘ T = IR² and T ∘ S = IR²:
(S ∘ T)(x, y) = S(T(x, y)) = S(x − y, −3x + 4y) = (4(x − y) + (−3x + 4y), 3(x − y) + (−3x + 4y)) = (x, y)
(T ∘ S)(x, y) = T(S(x, y)) = T(4x + y, 3x + y) = ((4x + y) − (3x + y), −3(4x + y) + 4(3x + y)) = (x, y).
Thus (S ∘ T)(x, y) = (T ∘ S)(x, y) = (x, y) for any vector (x, y), so both compositions are the identity and
we are done.
30. We must verify that S ∘ T = IP1 and T ∘ S = IP1:
(S ∘ T)(a + bx) = S(T(a + bx)) = S(b/2 + (a + 2b)x) = (−4 · b/2 + (a + 2b)) + 2 · (b/2)x = a + bx
(T ∘ S)(a + bx) = T(S(a + bx)) = T((−4a + b) + 2ax) = 2a/2 + ((−4a + b) + 2 · 2a)x = a + bx.
Thus (S ∘ T)(a + bx) = (T ∘ S)(a + bx) = a + bx for any a + bx ∈ P1, so both compositions are the
identity and we are done.
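The two compositions in Exercise 29 can be verified symbolically. A minimal sketch, assuming the maps T(x, y) = (x − y, −3x + 4y) and S(x, y) = (4x + y, 3x + y) from that exercise:

```python
import sympy as sp

x, y = sp.symbols('x y')

# The maps of Exercise 29
T = lambda v: (v[0] - v[1], -3 * v[0] + 4 * v[1])
S = lambda v: (4 * v[0] + v[1], 3 * v[0] + v[1])

ST = tuple(sp.expand(c) for c in S(T((x, y))))
TS = tuple(sp.expand(c) for c in T(S((x, y))))
print(ST, TS)  # (x, y) (x, y): both compositions are the identity
```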
6.4. LINEAR TRANSFORMATIONS
497
31. Suppose T′ and T″ are such that T′ ∘ T = T″ ∘ T = IV and T ∘ T′ = T ∘ T″ = IW. To show that
T′ = T″, it suffices to show that T′(w) = T″(w) for any vector w ∈ W. But
T′(w) = (IV ∘ T′)(w) = ((T″ ∘ T) ∘ T′)(w) = T″((T ∘ T′)(w)) = (T″ ∘ IW)(w) = T″(w).
32. (a) First, if T(v) = ±v, then clearly {v, Tv} is linearly dependent. For the converse, if {v, T(v)} is
linearly dependent, then cv + dTv = 0 for some c, d not both zero. Then
T(cv + dTv) = cTv + d(T ∘ T)v = cTv + dv = T(0) = 0
since T ∘ T = IV. Thus
cv + dTv = 0
cTv + dv = 0.
Subtracting these two equations gives c(v − Tv) − d(v − Tv) = (c − d)(v − Tv) = 0. But this
means that either c = d ≠ 0 or that Tv = v. If c = d ≠ 0, then divide through by c in the
equation cv + dTv = 0 to get v + Tv = 0, so that Tv = −v. Hence Tv = ±v.
(b) Let T(x, y) = (−x, −y). Then
(T ∘ T)(x, y) = T(T(x, y)) = T(−x, −y) = (−(−x), −(−y)) = (x, y),
so that T ∘ T = IV, and (x, y) and T(x, y) = (−x, −y) are linearly dependent.
33. (a) First, if T (v) = v, then clearly {v, T v} = {v, v} is linearly dependent, and if T (v) = 0, then
{v, T v} = {v, 0} is linearly dependent. For the converse, suppose that {v, T v} is linearly dependent. Then there exist c, d, not both zero, such that
cv + dT v = 0.
As in Exercise 32, we get
T(cv + dTv) = cTv + d(T ∘ T)v = cTv + dTv = (c + d)Tv = T(0) = 0
since T ∘ T = T. Thus
cv + dTv = 0
(c + d)Tv = 0.
Subtracting the two equations gives c(v − T v) = 0, so that either T v = v or c = 0. But if c = 0,
then cv + dT v = dT v = 0, so that since c and d were not both zero, d cannot be zero, so that
T v = 0. Thus either T v = v or T v = 0.
(b) For example, let T(x, y) = (x, 0). This is a projection operator, so we already know that T ∘ T = T;
to show this via computation,
(T ∘ T)(x, y) = T(T(x, y)) = T(x, 0) = (x, 0) = T(x, y).
Then for example T(1, 0) = (1, 0) while T(0, 1) = 0.
34. To show that S+T is a linear transformation, we must show it obeys the rules of a linear transformation.
Using the definition of S + T and of cT , as well as the fact that both S and T are linear, we have
(S + T )(u + v) = S(u + v) + T (u + v) = Su + Sv + T u + T v = (Su + T u) + (Sv + T v)
= (S + T )u + (S + T )v
(S + T)(cv) = S(cv) + T(cv) = c(Sv) + c(Tv) = c(Sv + Tv) = c((S + T)v).
Similar computations show that cT is linear: (cT)(u + v) = c(T(u + v)) = c(Tu + Tv) = (cT)u + (cT)v, and (cT)(dv) = cT(dv) = cd(Tv) = d((cT)v).
35. We show that L = L (V, W ) satisfies the ten axioms of a vector space. We use freely below the fact
that V and W are both vector spaces. Also, note that S = T in L (V, W ) simply means that Sv = T v
for all v ∈ V .
1. From Exercise 34, S, T ∈ L implies that S + T ∈ L .
2. (S + T )v = Sv + T v = T v + Sv = (T + S)v.
3. ((R + S) + T )v = (R + S)v + T v = Rv + Sv + T v = Rv + (S + T )v = (R + (S + T ))v.
4. Let Z : V → W be the map Z(v) = 0 for all v ∈ V. Then Z is the zero map, so it is linear and
thus in L , and
(S + Z)v = Sv + Zv = Sv + 0 = Sv.
5. If S ∈ L , let −S : V → W be the map (−S)v = −Sv. By Exercise 34, −S ∈ L , and
(S + (−S))v = Sv + (−S)v = Sv − Sv = 0.
6. By Exercise 34, if S ∈ L , then cS ∈ L .
7. (c(S + T ))v = c(S + T )v = c(Sv + T v) = cSv + cT v = (cS)v + (cT )v = ((cS) + (cT ))v.
8. ((c + d)S)v = (c + d)Sv = cSv + dSv = (cS)v + (dS)v = ((cS) + (dS))v.
9. (c(dS))v = c(dS)v = c(dSv) = (cd)Sv = ((cd)S)v.
10. By Exercise 34, (1S)v = 1Sv = Sv.
36. To show that these linear transformations are equal, it suffices to show that they have the same value
on every element of V .
(a) (R ◦ (S + T ))v = R((S + T )v) = R(Sv + T v) = R(Sv) + R(T v) = (R ◦ S)v + (R ◦ T )v.
(b)
(c(R ◦ S))v = c(R ◦ S)v = cR(Sv) = (cR)(Sv) = ((cR) ◦ S)v
(c(R ◦ S))v = c(R ◦ S)v = cR(Sv) = (cR)(Sv) = R((cS)v) = (R ◦ (cS))v.
6.5
The Kernel and Range of a Linear Transformation
1. (a) Only (ii) is in ker(T), since
(i) T[1, 2; −1, 3] = [1, 0; 0, 3] ≠ O,
(ii) T[0, 4; 2, 0] = [0, 0; 0, 0] = O,
(iii) T[3, 0; 0, −3] = [3, 0; 0, −3] ≠ O.
(b) From the definition of T, a matrix is in range(T) if and only if its two off-diagonal elements are
zero. Thus only (iii) is in range(T).
2. (a) T is explicitly defined by
T[a, b; c, d] = a + d.
Thus
(i) T[1, 2; −1, 3] = 1 + 3 = 4,
(ii) T[0, 4; 2, 0] = 0 + 0 = 0,
(iii) T[1, 3; 0, −1] = 1 + (−1) = 0,
and therefore (ii) and (iii) are in ker(T) but (i) is not.
(b) All three of these are in range(T). For example,
(i) T[0, 0; 0, 0] = 0 + 0 = 0,
(ii) T[2, 1; 4, 0] = 2 + 0 = 2,
(iii) T[√2/2, π; 12, 0] = √2/2 + 0 = √2/2.
(c) ker(T) is the set of all matrices in M22 whose diagonal elements are negatives of each other.
range(T) is all real numbers.
3. (a) We have
(i) T(1 + x) = T(1 + 1x + 0x²) = (1 − 1, 1 + 0) = (0, 1) ≠ 0,
(ii) T(x − x²) = T(0 + 1x − 1x²) = (0 − 1, 1 + (−1)) = (−1, 0) ≠ 0,
(iii) T(1 + x − x²) = T(1 + 1x − 1x²) = (1 − 1, 1 + (−1)) = (0, 0) = 0,
so only (iii) is in ker(T).
(b) All of them are. For example,
(i) T(1 + x − x²) = (0, 0), from part (a),
(ii) T(1) = (1 − 0, 0 + 0) = (1, 0),
(iii) T(x²) = (0 − 0, 0 + 1) = (0, 1).
(c) ker(T) consists of all polynomials a + bx + cx² such that a = b and b = −c; parametrizing via
a = t gives ker(T) = {t + tx − tx²}, so that ker(T) consists of all multiples of 1 + x − x². range(T)
is all of R², since it is a subspace of R² containing both e1 and e2 (see items (ii) and (iii) of part
(b)).
4. (a) Since xp′(x) = 0 if and only if p′(x) = 0, we see that only (i) is in ker(T). For the other two, we
have T(x) = x · x′ = x and T(x²) = x(x²)′ = 2x².
(b) A polynomial in range(T) must be divisible by x; that is, it must have no constant term. Thus
(i) is not in range(T). However, x = T(x) from part (a), and also from part (a), we see that
T(½x²) = x². Thus (ii) and (iii) are in range(T).
(c) ker(T) consists of all polynomials whose derivative vanishes, so it consists of the constant polynomials.
range(T) consists of all polynomials with zero constant term, since given any such polynomial,
we may write it as xg(x) for some polynomial g(x); then clearly T(∫g(x) dx) = xg(x).
5. Since ker(T) = {[0, b; c, 0]}, a basis for ker(T) is
{[0, 1; 0, 0], [0, 0; 1, 0]}.
Similarly, since range(T) = {[a, 0; 0, d]}, a basis for range(T) is
{[1, 0; 0, 0], [0, 0; 0, 1]}.
Thus dim ker(T) = 2 and dim range(T) = 2. The Rank Theorem says that
4 = dim M22 = nullity(T) + rank(T) = dim ker(T) + dim range(T) = 2 + 2,
which is true.
6. Since
ker(T) = {[a, b; c, −a]},
a basis for ker(T) is
{[1, 0; 0, −1], [0, 1; 0, 0], [0, 0; 1, 0]}.
From Exercise 2, range(T) = R. Thus dim ker(T) = 3 and dim range(T) = 1. The Rank Theorem says
that
4 = dim M22 = nullity(T) + rank(T) = dim ker(T) + dim range(T) = 3 + 1,
which is true.
7. Since ker(T) = {a + ax − ax²}, a basis for ker(T) is {1 + x − x²}. From Exercise 3, range(T) = R².
Thus dim ker(T) = 1 and dim range(T) = 2. The Rank Theorem says that
3 = dim P2 = nullity(T) + rank(T) = dim ker(T) + dim range(T) = 1 + 2,
which is true.
8. Since ker(T) is the constant polynomials, it has a basis {1}. From Exercise 4, range(T) is the set of
polynomials with zero constant term, or {bx + cx²}, so it has a basis {x, x²}. Thus dim ker(T) = 1
and dim range(T) = 2. The Rank Theorem says that
3 = dim P2 = nullity(T) + rank(T) = dim ker(T) + dim range(T) = 1 + 2,
which is true.
9. Since
T[1, 0; 0, 0] = (1 − 0, 0 − 0) = (1, 0) and T[0, 0; 1, 0] = (0 − 0, 1 − 0) = (0, 1),
it follows that range(T) = R², since it is a subspace of R² containing both e1 and e2. Thus rank(T) =
dim range(T) = 2, so that by the Rank Theorem,
nullity(T) = dim M22 − rank(T) = 4 − 2 = 2.
10. Since
T(1 − x) = (1 − 0, 1 − 1) = (1, 0) and T(x) = (0, 1),
it follows that range(T) = R², since it is a subspace of R² containing both e1 and e2. Thus rank(T) =
dim range(T) = 2, so that by the Rank Theorem,
nullity(T) = dim P2 − rank(T) = 3 − 2 = 1.
11. If A = [a, b; c, d], then
T(A) = AB = [a, b; c, d][1, −1; −1, 1] = [a − b, −a + b; c − d, −c + d] = [a − b, −(a − b); c − d, −(c − d)].
Thus T(A) = O if and only if a = b and c = d, so that
ker(T) = {[a, a; c, c]},
so that a basis for ker(T) is
{[1, 1; 0, 0], [0, 0; 1, 1]}.
Hence nullity(T) = dim ker(T) = 2, so that by the Rank Theorem
rank(T) = dim M22 − nullity(T) = 4 − 2 = 2.
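One way to double-check this rank and nullity is to write T in coordinates: the columns of its 4 × 4 matrix are the entries of T(Eij) for the standard basis of M22. A minimal numpy sketch:

```python
import numpy as np

B = np.array([[1, -1], [-1, 1]])
T = lambda A: A @ B  # the map T(A) = AB of this exercise

# Columns of M are vec(T(Eij)) over the standard basis {E11, E12, E21, E22}
cols = []
for i in range(2):
    for j in range(2):
        Eij = np.zeros((2, 2))
        Eij[i, j] = 1
        cols.append(T(Eij).flatten())
M = np.column_stack(cols)

rank = np.linalg.matrix_rank(M)
print(rank, 4 - rank)  # 2 2: rank(T) = 2 and nullity(T) = 2
```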
12. If A = [a, b; c, d], then
T(A) = AB − BA = [a, b; c, d][1, 1; 0, 1] − [1, 1; 0, 1][a, b; c, d] = [a, a + b; c, c + d] − [a + c, b + d; c, d] = [−c, a − d; 0, c].
So A ∈ ker(T) if and only if c = 0 and a = d, so that
A = [a, b; 0, a] = a[1, 0; 0, 1] + b[0, 1; 0, 0].
These two matrices are linearly independent since they are not multiples of one another, so that
nullity(T) = 2. Since dim M22 = 4, we see that by the Rank Theorem
rank(T) = dim M22 − nullity(T) = 4 − 2 = 2.
13. The derivative of p(x) = a + bx + cx² is p′(x) = b + 2cx; then p′(0) = b, so this is zero if and only if b = 0. Thus
ker(T) = {a + cx²}, so that a basis for ker(T) is {1, x²}. Then nullity(T) = dim ker(T) = 2, so that by
the Rank Theorem,
rank(T) = dim P2 − nullity(T) = 3 − 2 = 1.
14. We have
T(A) = A − Aᵀ = [0, a12 − a21, a13 − a31; a21 − a12, 0, a23 − a32; a31 − a13, a32 − a23, 0],
so that A ∈ ker(T) if and only if a12 = a21, a13 = a31, and a23 = a32; that is, if and only if A is
symmetric. By Exercise 40 in Section 6.2, nullity(T) = dim ker(T) = 3·4/2 = 6. Then by the Rank
Theorem,
rank(T) = dim M33 − nullity(T) = 9 − 6 = 3.
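The same coordinate trick as in Exercise 11 confirms these numbers: build the 9 × 9 matrix of T(A) = A − Aᵀ over the standard basis of M33 and read off its rank. A minimal numpy sketch:

```python
import numpy as np

T = lambda A: A - A.T  # the map of this exercise on M33

# 9x9 matrix of T with respect to the standard basis of M33
cols = []
for i in range(3):
    for j in range(3):
        Eij = np.zeros((3, 3))
        Eij[i, j] = 1
        cols.append(T(Eij).flatten())
M = np.column_stack(cols)

rank = np.linalg.matrix_rank(M)
print(rank, 9 - rank)  # 3 6: rank(T) = 3 and nullity(T) = 6
```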
15. (a) Yes, T is one-to-one. If
T(a, b) = (2a − b, a + 2b) = (0, 0),
then 2a − b = 0 and a + 2b = 0. The only solution to this pair of equations is a = b = 0, so that
the only element of the kernel is (0, 0) and thus T is one-to-one.
(b) Since T is one-to-one and the dimensions of the domain and codomain are both 2, T must be onto
by Theorem 6.21.
16. (a) Yes, T is one-to-one. If
T(a, b) = (a − 2b) + (3a + b)x + (a + b)x² = 0,
then a + b = 0 from the third term. Since 3a + b = 0 as well, subtracting gives 2a = 0, so that
a = 0. The third term then forces b = 0. Thus (a, b) = 0 and T is one-to-one.
(b) T cannot be onto. R² has dimension 2, so by the Rank Theorem 2 = dim R² = rank(T) +
nullity(T) ≥ rank(T). It follows that the dimension of the range of T is at most 2 (in fact, since
from part (a) nullity(T) = 0, the dimension of the range is equal to 2). But dim P2 = 3, so T is
not onto.
17. (a) If T(a + bx + cx²) = 0, then
2a − b = 0
a + b − 3c = 0
−a + c = 0.
Row-reducing the augmented matrix gives
[2, −1, 0 | 0; 1, 1, −3 | 0; −1, 0, 1 | 0] → [1, 0, −1 | 0; 0, 1, −2 | 0; 0, 0, 0 | 0].
This has nontrivial solutions a = t, b = 2t, c = t, so that
ker(T) = {t + 2tx + tx²} = {t(1 + 2x + x²)}.
So T is not one-to-one.
(b) Since dim P2 = dim R³ = 3, the fact that T is not one-to-one implies, by Theorem 6.21, that T
is not onto.
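The row reduction in part (a) is easy to confirm symbolically; a minimal sympy sketch:

```python
import sympy as sp

# Coefficient matrix of the homogeneous system in part (a)
A = sp.Matrix([[2, -1, 0],
               [1, 1, -3],
               [-1, 0, 1]])

rref, pivots = A.rref()
print(rref)           # rows (1, 0, -1), (0, 1, -2), (0, 0, 0)
print(A.nullspace())  # kernel spanned by (1, 2, 1), i.e. multiples of 1 + 2x + x^2
```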
18. (a) By Exercise 10, nullity(T) = 1, so that ker(T) ≠ {0}, so that T is not one-to-one.
(b) By Exercise 10, range(T) = R², so that T is onto by definition.
 
a
19. (a) If T  b  = O, then a − b = a + b = 0 and b − c = b + c = 0; these equations give a = b = c = 0 as
c
the only solution. Thus T is one-to-one.
(b) By the Rank Theorem,
dim range(T ) = dim R3 − dim ker(T ) = 3 − 0 = 3.
However, dim M22 = 4 > 3 = dim range(T ), so that T cannot be onto.
 
a
20. (a) If T  b  = O, then a + b + c = 0, b − 2c = 0, and a − c = 0. The latter two equations give a = c
c
and b = 2c, so that from the first equation 4c = 0 and thus c = 0. It follows that a = b = 0. Thus
T is one-to-one.
3
(b) By Exercise 40 in Section 6.2, dim W = 2·3
2 = 3 = dim R . Since the dimensions of the domain
and codomain are equal and T is one-to-one, Theorem 6.21 implies that T is also onto.
21. These vector spaces are isomorphic. Bases for the two vector spaces are
D3: {E11, E22, E33}, R³: {e1, e2, e3}.
Since both have dimension 3, we know that D3 ≅ R³. Define
T(aE11 + bE22 + cE33) = ae1 + be2 + ce3.
Then T is a linear transformation from the way we have defined it. Since the ei are linearly independent,
it follows that ae1 + be2 + ce3 = 0 if and only if a = b = c = 0. Thus ker(T) = {0} so that T is
one-to-one. Since dim D3 = dim R³, we know that T is onto as well, so it is an isomorphism.
22. By Exercise 40 in Section 6.2, dim S3 = 3·4/2 = 6. Since an upper triangular matrix has three entries
that are forced to be zero (those below the diagonal), and the other six may take any value, we see
that dim U3 = 6 as well. Thus S3 ≅ U3. Define
T : S3 → U3 : [a11, a12, a13; a21, a22, a23; a31, a32, a33] ↦ [a11, a12, a13; 0, a22, a23; 0, 0, a33].
Then any matrix in ker(T) must have a11 = a12 = a13 = a22 = a23 = a33 = 0, so that (being symmetric)
it must itself be the zero matrix. Thus ker(T) = {0} so that T is one-to-one. Since T is one-to-one and
dim S3 = dim U3 = 6, it follows that T is onto and is therefore an isomorphism.
23. By Exercises 40 and 41 in Section 6.2, dim S3 = 3·4/2 = 6 while dim S3′ = 2·3/2 = 3. Since their dimensions
are unequal, S3 ≇ S3′.
24. Since a basis for P2 is {1, x, x²}, we have dim P2 = 3. If p(x) ∈ W, say p(x) = a + bx + cx² + dx³,
then p(0) = 0 implies that a = 0, so that p(x) = bx + cx² + dx³; a basis for this subspace is {x, x², x³}
so that dim W = 3 as well. Therefore P2 ≅ W. Define
T : P2 → W : a + bx + cx² ↦ ax + bx² + cx³.
Then T(a + bx + cx²) = ax + bx² + cx³ = 0 implies that a = b = c = 0. Thus T is one-to-one; since
the dimensions of domain and codomain are equal, T is also onto, so it is an isomorphism.
25. A basis for C as a vector space over R is {1, i}, so that dim C = 2. A basis for R² is {e1, e2}, so that
dim R² = 2. Thus C ≅ R². Define
T : C → R² : a + bi ↦ (a, b).
Then T(a + bi) = 0 means that a = b = 0. Thus T is one-to-one; since the dimensions of domain and
codomain are equal, T is also onto, so it is an isomorphism.
26. V consists of all 2 × 2 matrices whose upper-left and lower-right entries are negatives of each other, so
that
V = {[a, b; c, −a]}.
Thus a basis for V is
{[1, 0; 0, −1], [0, 1; 0, 0], [0, 0; 1, 0]},
and therefore dim V = 3. Since dim W = dim R² = 2, these vector spaces are not isomorphic.
27. Since the dimensions of the domain and codomain are the same, it suffices to show that T is one-to-one.
Let p(x) = a0 + a1x + · · · + anxⁿ. Then
p(x) + p′(x) = (a0 + a1x + · · · + an−1xⁿ⁻¹ + anxⁿ) + (a1 + 2a2x + · · · + nanxⁿ⁻¹)
= (a0 + a1) + (a1 + 2a2)x + · · · + (an−1 + nan)xⁿ⁻¹ + anxⁿ.
If p(x) + p′(x) = 0, then the coefficient of xⁿ is zero, so that an = 0. But then the coefficient of xⁿ⁻¹
is 0 = an−1 + nan = an−1, so that an−1 = 0. We continue this process, determining that ai = 0 for all
i. Thus p(x) = 0, so that T is one-to-one and is thus an isomorphism.
28. Since the dimensions of the domain and codomain are the same, it suffices to show that T is one-to-one.
Now, T(xᵏ) = (x − 2)ᵏ. Suppose that
T(a0 + a1x + · · · + anxⁿ) = a0 + a1(x − 2) + · · · + an(x − 2)ⁿ = 0.
If we show that {1, x − 2, (x − 2)², . . . , (x − 2)ⁿ} is a basis for Pn, then it follows that all the ai are
zero, so that ker(T) = {0} and T is one-to-one. To prove they form a basis, it suffices to show they
are linearly independent, since there are n + 1 elements in the set and dim Pn = n + 1. We do this
by induction. For n = 0, this is clear. For n = 1, suppose that a + b(x − 2) = (a − 2b) + bx = 0.
Then b = 0 and thus a = 0, so that linear independence for n = 1 is established. Now suppose that
{1, x − 2, . . . , (x − 2)ᵏ} is a linearly independent set, and suppose that
a0 + a1(x − 2) + · · · + ak−1(x − 2)ᵏ⁻¹ + ak(x − 2)ᵏ = 0.
If this polynomial were expanded, the only term involving xᵏ comes from the last term of the sum,
and it is akxᵏ. Therefore ak = 0, and then the other ai = 0 by the induction hypothesis. Thus this set forms a basis
for Pn, so that T is one-to-one and thus is an isomorphism.
29. Since the dimensions of the domain and codomain are the same, it suffices to show that T is one-to-one.
That is, we want to show that xⁿp(1/x) = 0 implies that p(x) = 0. Suppose
p(x) = a0 + a1x + · · · + anxⁿ.
Then
xⁿp(1/x) = xⁿ(a0 + a1(1/x) + · · · + an(1/xⁿ)) = a0xⁿ + a1xⁿ⁻¹ + · · · + an.
Therefore if xⁿp(1/x) = 0, all of the ai are zero so that p(x) = 0. Thus T is one-to-one and is an
isomorphism.
30. (a) As the hint suggests, define a map T : C [0, 1] → C [2, 3] by (T (f ))(x) = f (x − 2) for f ∈ C [0, 1].
Thus for example (T (f ))(2.5) = f (2.5 − 2) = f (0.5). T is linear, since
(T (f + g))(x) = (f + g)(x − 2) = f (x − 2) + g(x − 2) = (T (f ))(x) + (T (g))(x)
(T (cf ))(x) = (cf )(x − 2) = cf (x − 2) = c(T (f ))(x).
We must show that T is one-to-one and onto. First suppose that T (f ) = 0; that is, suppose that
f ∈ C [0, 1] and that f (x − 2) = 0 for all x ∈ [2, 3]. Let y = x − 2; then x ∈ [2, 3] means y ∈ [0, 1],
and we get f (y) = 0 for all y ∈ [0, 1]. But this means that f = 0 in C [0, 1]. Thus T is one-to-one.
To see that it is onto, choose f ∈ C [2, 3], and let g(x) = f (x + 2). Then
(T (g))(x) = g(x − 2) = f (x + 2 − 2) = f (x),
so that T (g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.
(b) This is similar to part (a). Define a map T : C [0, 1] → C [a, a + 1] by (T (f ))(x) = f (x − a) for
f ∈ C [0, 1]. T is proved linear in a similar fashion to part (a). We must show that T is one-to-one
and onto. First suppose that T (f ) = 0; that is, suppose that f ∈ C [0, 1] and that f (x − a) = 0
for all x ∈ [a, a + 1]. Let y = x − a; then x ∈ [a, a + 1] means y ∈ [0, 1], and we get f (y) = 0 for
all y ∈ [0, 1]. But this means that f = 0 in C [0, 1]. Thus T is one-to-one. To see that it is onto,
choose f ∈ C [a, a + 1], and let g(x) = f (x + a). Then
(T (g))(x) = g(x − a) = f (x + a − a) = f (x),
so that T (g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.
31. This is similar to Exercise 30. Define a map
T : C[0, 1] → C[0, 2] by (T(f))(x) = f(½x) for f ∈ C[0, 1].
Thus for example (T(f))(1) = f(½ · 1) = f(½). T is proved linear in a similar fashion to part (a)
of Exercise 30. We must show that T is one-to-one and onto. First suppose that T(f) = 0; that is,
suppose that f ∈ C[0, 1] and that f(½x) = 0 for all x ∈ [0, 2]. Let y = ½x; then x ∈ [0, 2] means
y ∈ [0, 1], and we get f(y) = 0 for all y ∈ [0, 1]. But this means that f = 0 in C[0, 1]. Thus T is
one-to-one. To see that it is onto, choose f ∈ C[0, 2], and let g(x) = f(2x). Then
(T(g))(x) = g(½x) = f(2 · ½x) = f(x),
so that T(g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.
32. This is again similar to Exercises 30 and 31, and is the most general form. Define a map T : C[a, b] →
C[c, d] by (T(f))(x) = f((b − a)/(d − c) · (x − c) + a) for f ∈ C[a, b]. Then (T(f))(c) = f(a) and (T(f))(d) =
f(b − a + a) = f(b). T is proved linear in a similar fashion to part (a) of Exercise 30. We must show that
T is one-to-one and onto. First suppose that T(f) = 0; that is, suppose that f ∈ C[a, b] and that
f((b − a)/(d − c)(x − c) + a) = 0 for all x ∈ [c, d]. Let y = (b − a)/(d − c)(x − c) + a; then x ∈ [c, d] means y ∈ [a, b], and
we get f(y) = 0 for all y ∈ [a, b]. But this means that f = 0 in C[a, b]. Thus T is one-to-one. To see
that it is onto, choose f ∈ C[c, d], and let g(x) = f((d − c)/(b − a)(x − a) + c). Then
(T(g))(x) = g((b − a)/(d − c)(x − c) + a) = f((d − c)/(b − a) · ((b − a)/(d − c)(x − c) + a − a) + c) = f((x − c) + c) = f(x),
so that T(g) = f and thus T is onto. Since T is one-to-one and onto, it is an isomorphism.
33. (a) Suppose that (S ◦ T )(u) = 0. Then (S ◦ T )(u) = S(T u) = 0. Since S is one-to-one, we must have
T u = 0. But then since T is one-to-one, it follows that u = 0. Thus S ◦ T is one-to-one.
(b) Choose w ∈ W . Since S is onto, we can find some v ∈ V such that Sv = w. But T is onto as
well, so there is some u ∈ U such that T u = v. Then
(S ◦ T )(u) = S(T u) = Sv = w,
so that S ◦ T is onto.
34. (a) Suppose that T u = 0. Then since S0 = 0, we have
(S ◦ T )u = S(T u) = S0 = 0.
But S ◦ T is one-to-one, so we must have u = 0, so that T is one-to-one as well. Intuitively, if T
takes two elements of U to the same element of V , then certainly S ◦ T takes those two elements
of U to the same element of W .
(b) Choose w ∈ W . Since S ◦ T is onto, there is some u ∈ U such that (S ◦ T )u = w. Let v = T u.
Then
Sv = S(T u) = (S ◦ T )u = w,
so that S is onto. Intuitively, if S ◦ T “hits” every element of W , then surely S must.
35. (a) Suppose that T : V → W and that dim V < dim W. Then range(T) is a subspace of W, and the
Rank Theorem asserts that
dim range(T) = dim V − nullity(T) ≤ dim V < dim W.
Theorem 6.11(b) in Section 6.2 states that if U is a subspace of W, then dim U = dim W if and
only if U = W. But here dim range(T) ≠ dim W, and range(T) is a subspace of W, so that
range(T) ≠ W and T cannot be onto.
(b) We want to show that ker(T) ≠ {0}. Since range(T) is a subspace of W, then dim W ≥
dim range(T) = rank(T). Then the Rank Theorem asserts that
dim W + nullity(T) ≥ rank(T) + nullity(T) = dim V,
so that dim ker(T) = nullity(T) ≥ dim V − dim W > 0. Thus ker(T) ≠ {0}, so that T is not
one-to-one.
36. Since dim Pn = dim Rⁿ⁺¹ = n + 1, we need only show that T is a one-to-one linear transformation. It
is linear since
T(p + q) = ((p + q)(a0), (p + q)(a1), . . . , (p + q)(an)) = (p(a0) + q(a0), . . . , p(an) + q(an))
= (p(a0), . . . , p(an)) + (q(a0), . . . , q(an)) = T(p) + T(q)
T(cp) = ((cp)(a0), . . . , (cp)(an)) = (cp(a0), . . . , cp(an)) = c(p(a0), . . . , p(an)) = cT(p).
Now suppose T(p) = 0. That means that
p(a0) = p(a1) = · · · = p(an) = 0,
so that p has n + 1 zeroes since the ai are distinct. But Exercise 62 in Section 6.2 implies that p(x)
must be the zero polynomial, so that ker(T) = {0} and thus T is one-to-one.
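In coordinates, this evaluation map is a Vandermonde matrix, which has full rank exactly when the evaluation points are distinct. A minimal numpy sketch with hypothetical sample points:

```python
import numpy as np

# Hypothetical distinct evaluation points a_0, ..., a_3 (so n = 3)
pts = np.array([0.0, 1.0, 2.0, 3.0])
V = np.vander(pts, increasing=True)  # V[i, k] = pts[i] ** k, the matrix of T
rank = np.linalg.matrix_rank(V)
print(rank)  # 4: full rank, so ker(T) = {0} and T is an isomorphism
```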
37. By the Rank Theorem, since T : V → V and T² : V → V, we have
rank(T) + nullity(T) = dim V = rank(T²) + nullity(T²).
Since rank(T) = rank(T²), it follows that nullity(T) = nullity(T²), and thus that dim ker(T) =
dim ker(T²). So if we show ker(T) ⊆ ker(T²), we can conclude that ker(T) = ker(T²). Choose
v ∈ ker(T); then T²v = T(Tv) = T(0) = 0. Hence ker(T) = ker(T²).
Now choose v ∈ range(T) ∩ ker(T); we want to show v = 0. Since v ∈ range(T), there is w ∈ V such
that v = Tw. Then since v ∈ ker(T),
0 = Tv = T(Tw) = T²w,
so that w ∈ ker(T²) = ker(T). So w is also in ker(T). As a result, v = Tw = 0, as we were to show.
38. (a) We have
T ((u1 , w1 ) + (u2 , w2 )) = T (u1 + u2 , w1 + w2 ) = u1 + u2 − w1 − w2
= (u1 − w1 ) + (u2 − w2 ) = T (u1 , w1 ) + T (u2 , w2 )
T (c(u, w)) = T (cu, cw) = cu − cw = c(u − w) = cT (u, w).
(b) By definition, U + W = {u + w : u ∈ U, w ∈ W }. So if u + w ∈ U + W , we have
T (u, −w) = u + w,
so that U + W ⊂ range(T ). But since T (u, w) = u − w ∈ U + W , we also have range(T ) ⊂ U + W .
Thus range(T ) = U + W .
(c) Suppose v = (u, w) ∈ ker(T). Then u − w = 0, so that u = w and thus u ∈ U ∩ W, and
v = (u, u). So
ker(T) = {(u, u) : u ∈ U ∩ W} ⊂ U × W
is a subspace of U × W . Then
S : U ∩ W → ker(T ) : u 7→ (u, u)
is clearly a linear transformation that is one-to-one and onto. Thus S is an isomorphism between
U ∩ W and ker T .
(d) From Exercise 43(b) in Section 6.2, dim(U × W ) = dim U + dim W . Then the Rank Theorem
gives
rank(T ) + nullity(T ) = dim(U × W ) = dim U + dim W.
But from parts (a) and (b), we have
rank(T ) = dim range(T ) = dim(U + W ),
nullity(T ) = dim ker(T ) = dim(U ∩ W ).
Thus
rank(T ) + nullity(T ) = dim(U + W ) + dim(U ∩ W ) = dim U + dim W.
Rearranging this equation gives dim(U + W ) = dim U + dim W − dim(U ∩ W ).
6.6
The Matrix of a Linear Transformation
1. Computing T(v) directly gives
T(v) = T(4 + 2x) = 2 − 4x.
To compute the matrix [T]C←B, we compute the value of T on each basis element of B:
[T(1)]C = [0 − x]C = (0, −1), [T(x)]C = [1 − 0x]C = (1, 0) ⇒ [T]C←B = [[T(1)]C [T(x)]C] = [0, 1; −1, 0].
Then
[T]C←B[4 + 2x]B = [0, 1; −1, 0](4, 2) = (2, −4) = [2 − 4x]C = [T(4 + 2x)]C.
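The matrix-times-coordinates step is a one-line numpy check; a minimal sketch using the matrix [T]C←B computed above:

```python
import numpy as np

# Columns are [T(1)]_C = (0, -1) and [T(x)]_C = (1, 0)
M = np.array([[0, 1],
              [-1, 0]])

v = np.array([4, 2])  # [4 + 2x]_B
print(M @ v)          # [ 2 -4], i.e. the coordinates of 2 - 4x
```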
2. Computing T(v) directly gives (as in Exercise 1)
T(v) = T(4 + 2x) = 2 − 4x.
In terms of the basis B, we have 4 + 2x = 3(1 + x) + (1 − x), so that
[4 + 2x]B = (3, 1).
Now,
[T(1 + x)]C = [1 − x]C = (1, −1), [T(1 − x)]C = [−1 − x]C = (−1, −1) ⇒ [T]C←B = [[T(1 + x)]C [T(1 − x)]C] = [1, −1; −1, −1].
Then
[T]C←B[4 + 2x]B = [1, −1; −1, −1](3, 1) = (2, −4) = [2 − 4x]C = [T(4 + 2x)]C.
3. Computing T(v) directly gives
T(v) = T(a + bx + cx²) = a + b(x + 2) + c(x + 2)².
In terms of the standard basis B, we have
[a + bx + cx²]B = (a, b, c).
Now,
[T(1)]C = [1]C = (1, 0, 0), [T(x)]C = [x + 2]C = (0, 1, 0), [T(x²)]C = [(x + 2)²]C = (0, 0, 1)
⇒ [T]C←B = [[T(1)]C [T(x)]C [T(x²)]C] = [1, 0, 0; 0, 1, 0; 0, 0, 1].
Then
[T]C←B[a + bx + cx²]B = [1, 0, 0; 0, 1, 0; 0, 0, 1](a, b, c) = (a, b, c) = [a + b(x + 2) + c(x + 2)²]C = [T(a + bx + cx²)]C.
4. Computing T(v) directly gives
T(v) = T(a + bx + cx²) = a + b(x + 2) + c(x + 2)² = (a + 2b + 4c) + (b + 4c)x + cx².
To compute [a + bx + cx²]B, we want to solve
a + bx + cx² = r + s(x + 2) + t(x + 2)² = (r + 2s + 4t) + (s + 4t)x + tx²
for r, s, and t. The x² terms give t = c; then from the linear term b = s + 4c, so that s = b − 4c; finally,
a = r + 2(b − 4c) + 4c = r + 2b − 4c, so that r = a − 2b + 4c. Hence
[a + bx + cx²]B = (a − 2b + 4c, b − 4c, c).
Now,
[T(1)]C = [1]C = (1, 0, 0),
[T(x + 2)]C = [x + 4]C = (4, 1, 0),
[T((x + 2)²)]C = [(x + 4)²]C = [x² + 8x + 16]C = (16, 8, 1),
so that
[T]C←B = [[T(1)]C [T(x + 2)]C [T((x + 2)²)]C] = [1, 4, 16; 0, 1, 8; 0, 0, 1].
Then
[T]C←B[a + bx + cx²]B = [1, 4, 16; 0, 1, 8; 0, 0, 1](a − 2b + 4c, b − 4c, c)
= ((a − 2b + 4c) + 4(b − 4c) + 16c, (b − 4c) + 8c, c)
= (a + 2b + 4c, b + 4c, c)
= [(a + 2b + 4c) + (b + 4c)x + cx²]C = [T(a + bx + cx²)]C.
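The final matrix-vector product can be verified symbolically; a minimal sympy sketch using the matrix and coordinate vector derived above:

```python
import sympy as sp

a, b, c = sp.symbols('a b c')

M = sp.Matrix([[1, 4, 16],
               [0, 1, 8],
               [0, 0, 1]])                    # [T]_{C<-B} from above
vB = sp.Matrix([a - 2*b + 4*c, b - 4*c, c])   # [a + bx + cx^2]_B

result = (M * vB).expand()
print(result)  # the column (a + 2b + 4c, b + 4c, c)
```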
5. Directly,
T(a + bx + cx²) = (a, a + b + c).
Using the standard basis B, we have
[a + bx + cx²]B = (a, b, c).
Now,
[T(1)]C = [(1, 1)]C = (1, 1), [T(x)]C = [(0, 1)]C = (0, 1), [T(x²)]C = [(0, 1)]C = (0, 1),
so that
[T]C←B = [[T(1)]C [T(x)]C [T(x²)]C] = [1, 0, 0; 1, 1, 1].
Then
[T]C←B[a + bx + cx²]B = [1, 0, 0; 1, 1, 1](a, b, c) = (a, a + b + c) = [T(a + bx + cx²)]C.
6. Directly, as in Exercise 5,
T(a + bx + cx²) = (a, a + b + c).
Using the basis B, we have
[a + bx + cx²]B = (c, b, a).
Now,
[T(x²)]C = [(0, 1)]C = (−1, 1), [T(x)]C = [(0, 1)]C = (−1, 1), [T(1)]C = [(1, 1)]C = (0, 1),
so that
[T]C←B = [[T(x²)]C [T(x)]C [T(1)]C] = [−1, −1, 0; 1, 1, 1].
Then
[T]C←B[a + bx + cx²]B = [−1, −1, 0; 1, 1, 1](c, b, a) = (−(b + c), a + b + c) = [(a, a + b + c)]C = [T(a + bx + cx²)]C.
7. Computing directly gives
Tv = T(−7, 7) = (−7 + 2 · 7, −(−7), 7) = (7, 7, 7).
We first compute the coordinates of (−7, 7) in B. We want to solve
(−7, 7) = r(1, 2) + s(3, −1)
for r and s; that is, we want r and s such that −7 = r + 3s and 7 = 2r − s. Solving this pair of
equations gives r = 2 and s = −3. Thus
[(−7, 7)]B = (2, −3).
Now,
[T(1, 2)]C = [(5, −1, 2)]C = (6, −3, 2), [T(3, −1)]C = [(1, −3, −1)]C = (4, −2, −1),
so that
[T]C←B = [[T(1, 2)]C [T(3, −1)]C] = [6, 4; −3, −2; 2, −1].
Then
[T]C←B[(−7, 7)]B = [6, 4; −3, −2; 2, −1](2, −3) = (0, 0, 7) = [(7, 7, 7)]C = [T(−7, 7)]C.
8. Computing directly gives (as in the statement of Exercise 7)
Tv = T(a, b) = (a + 2b, −a, b).
We first compute the coordinates of (a, b) in B. We want to solve
(a, b) = r(1, 2) + s(3, −1)
for r and s; that is, we want r and s such that a = r + 3s and b = 2r − s. Solving this pair of equations
gives r = (a + 3b)/7 and s = (2a − b)/7. Thus
[(a, b)]B = ((a + 3b)/7, (2a − b)/7).
Now,
[T(1, 2)]C = [(5, −1, 2)]C = (6, −3, 2), [T(3, −1)]C = [(1, −3, −1)]C = (4, −2, −1),
so that
[T]C←B = [[T(1, 2)]C [T(3, −1)]C] = [6, 4; −3, −2; 2, −1].
Then
[T]C←B[(a, b)]B = [6, 4; −3, −2; 2, −1]((a + 3b)/7, (2a − b)/7)
= (6(a + 3b)/7 + 4(2a − b)/7, −3(a + 3b)/7 − 2(2a − b)/7, 2(a + 3b)/7 − (2a − b)/7)
= (2a + 2b, −a − b, b) = [(a + 2b, −a, b)]C = [T(a, b)]C.
9. Computing directly gives
Tv = TA = Aᵀ = [a, c; b, d].
Since A = aE11 + bE12 + cE21 + dE22, the coordinate vector of A in B is
[A]B = (a, b, c, d).
Now,
[TE11]C = [E11ᵀ]C = [E11]C = (1, 0, 0, 0),
[TE12]C = [E12ᵀ]C = [E21]C = (0, 0, 1, 0),
[TE21]C = [E21ᵀ]C = [E12]C = (0, 1, 0, 0),
[TE22]C = [E22ᵀ]C = [E22]C = (0, 0, 0, 1).
Therefore
[T]C←B = [[TE11]C [TE12]C [TE21]C [TE22]C] = [1, 0, 0, 0; 0, 0, 1, 0; 0, 1, 0, 0; 0, 0, 0, 1].
Then
[T]C←B[A]B = [1, 0, 0, 0; 0, 0, 1, 0; 0, 1, 0, 0; 0, 0, 0, 1](a, b, c, d) = (a, c, b, d) = [[a, c; b, d]]C = [TA]C.
10. Computing directly gives
Tv = TA = Aᵀ = [a, c; b, d].
Since A = aE11 + bE12 + cE21 + dE22, the coordinate vector of A in B = {E22, E21, E12, E11} is
[A]B = (d, c, b, a).
Now,
[TE22]C = [E22ᵀ]C = [E22]C = (0, 0, 0, 1),
[TE21]C = [E21ᵀ]C = [E12]C = (0, 1, 0, 0),
[TE12]C = [E12ᵀ]C = [E21]C = (0, 0, 1, 0),
[TE11]C = [E11ᵀ]C = [E11]C = (1, 0, 0, 0).
Therefore
[T]C←B = [[TE22]C [TE21]C [TE12]C [TE11]C] = [0, 0, 0, 1; 0, 1, 0, 0; 0, 0, 1, 0; 1, 0, 0, 0].
Then
[T]C←B[A]B = [0, 0, 0, 1; 0, 1, 0, 0; 0, 0, 1, 0; 1, 0, 0, 0](d, c, b, a) = (a, c, b, d) = [[a, c; b, d]]C = [TA]C.
11. Computing directly gives
Tv = TA = [a, b; c, d][1, −1; −1, 1] − [1, −1; −1, 1][a, b; c, d]
= [a − b, b − a; c − d, d − c] − [a − c, b − d; c − a, d − b]
= [c − b, d − a; a − d, b − c].
Since A = aE11 + bE12 + cE21 + dE22, the coordinate vector of A in B is
[A]B = (a, b, c, d).
Now,
[TE11]C = [E11B − BE11]C = [[0, −1; 1, 0]]C = (0, −1, 1, 0),
[TE12]C = [E12B − BE12]C = [[−1, 0; 0, 1]]C = (−1, 0, 0, 1),
[TE21]C = [E21B − BE21]C = [[1, 0; 0, −1]]C = (1, 0, 0, −1),
[TE22]C = [E22B − BE22]C = [[0, 1; −1, 0]]C = (0, 1, −1, 0).
Therefore
[T]C←B = [[TE11]C [TE12]C [TE21]C [TE22]C] = [0, −1, 1, 0; −1, 0, 0, 1; 1, 0, 0, −1; 0, 1, −1, 0].
Then
[T]C←B[A]B = [0, −1, 1, 0; −1, 0, 0, 1; 1, 0, 0, −1; 0, 1, −1, 0](a, b, c, d) = (c − b, d − a, a − d, b − c) = [[c − b, d − a; a − d, b − c]]C = [TA]C.
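Building this matrix column-by-column from the commutator is mechanical, so it is easy to verify numerically; a minimal numpy sketch:

```python
import numpy as np

B = np.array([[1, -1], [-1, 1]])
T = lambda A: A @ B - B @ A  # the commutator map of this exercise

# Columns are vec(T(Eij)) over the standard basis of M22
cols = []
for i in range(2):
    for j in range(2):
        Eij = np.zeros((2, 2))
        Eij[i, j] = 1
        cols.append(T(Eij).flatten())
M = np.column_stack(cols)
print(M.astype(int))  # matches [T]_{C<-B} computed above
```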
12. Computing directly gives
Tv = TA = [a, b; c, d] − [a, c; b, d] = [0, b − c; c − b, 0].
Using the given basis B for M22, the coordinate vector of A is
[A]B = (a, b, c, d).
Now,
[TE11]C = [E11 − E11ᵀ]C = (0, 0, 0, 0),
[TE12]C = [E12 − E12ᵀ]C = [E12 − E21]C = (0, 1, −1, 0),
[TE21]C = [E21 − E21ᵀ]C = [E21 − E12]C = (0, −1, 1, 0),
[TE22]C = [E22 − E22ᵀ]C = (0, 0, 0, 0).
Therefore
[T]C←B = [[TE11]C [TE12]C [TE21]C [TE22]C] = [0, 0, 0, 0; 0, 1, −1, 0; 0, −1, 1, 0; 0, 0, 0, 0].
Then
[T]C←B[A]B = [0, 0, 0, 0; 0, 1, −1, 0; 0, −1, 1, 0; 0, 0, 0, 0](a, b, c, d) = (0, b − c, c − b, 0) = [[0, b − c; c − b, 0]]C = [TA]C.
13. (a) If $a\sin x + b\cos x \in W$, then
$$D(a\sin x + b\cos x) = D(a\sin x) + D(b\cos x) = aD\sin x + bD\cos x = a\cos x - b\sin x \in W.$$
(b) Since
$$[\sin x]_B = \begin{bmatrix}1\\0\end{bmatrix}, \qquad [\cos x]_B = \begin{bmatrix}0\\1\end{bmatrix},$$
then
$$[D\sin x]_B = [\cos x]_B = \begin{bmatrix}0\\1\end{bmatrix}, \qquad [D\cos x]_B = [-\sin x]_B = \begin{bmatrix}-1\\0\end{bmatrix} \quad\Rightarrow\quad [D]_B = \begin{bmatrix}0&-1\\1&0\end{bmatrix}.$$
(c) $[f(x)]_B = [3\sin x - 5\cos x]_B = \begin{bmatrix}3\\-5\end{bmatrix}$, so
$$[D(f(x))]_B = [D]_B[f(x)]_B = \begin{bmatrix}0&-1\\1&0\end{bmatrix}\begin{bmatrix}3\\-5\end{bmatrix} = \begin{bmatrix}5\\3\end{bmatrix},$$
so that $D(f(x)) = f'(x) = 5\sin x + 3\cos x$. Directly, since $(\sin x)' = \cos x$ and $(\cos x)' = -\sin x$,
we also get $f'(x) = 3\cos x - 5(-\sin x) = 5\sin x + 3\cos x$.
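The matrix computation in part (c) can be double-checked against a numerical derivative; the short sketch below (variable names are our own) applies $[D]_B$ to the coordinates of $f$ and compares the result with a central-difference derivative at a sample point:

```python
import math

# B = {sin x, cos x}; [D]_B = [[0, -1], [1, 0]] as in Exercise 13(b).
D = [[0, -1], [1, 0]]
f = [3, -5]                       # f(x) = 3 sin x - 5 cos x in B-coordinates
Df = [sum(D[i][j] * f[j] for j in range(2)) for i in range(2)]
print(Df)                         # [5, 3] -> f'(x) = 5 sin x + 3 cos x

x, h = 0.7, 1e-6
fx = lambda t: 3 * math.sin(t) - 5 * math.cos(t)
num = (fx(x + h) - fx(x - h)) / (2 * h)          # numerical f'(x)
sym = Df[0] * math.sin(x) + Df[1] * math.cos(x)  # matrix-computed f'(x)
print(abs(num - sym) < 1e-6)      # True
```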
14. (a) If $ae^{2x} + be^{-2x} \in W$, then
$$D(ae^{2x} + be^{-2x}) = D(ae^{2x}) + D(be^{-2x}) = aDe^{2x} + bDe^{-2x} = 2ae^{2x} - 2be^{-2x} \in W.$$
(b) Since
$$[e^{2x}]_B = \begin{bmatrix}1\\0\end{bmatrix}, \qquad [e^{-2x}]_B = \begin{bmatrix}0\\1\end{bmatrix},$$
then
$$[De^{2x}]_B = [2e^{2x}]_B = \begin{bmatrix}2\\0\end{bmatrix}, \qquad [De^{-2x}]_B = [-2e^{-2x}]_B = \begin{bmatrix}0\\-2\end{bmatrix} \quad\Rightarrow\quad [D]_B = \begin{bmatrix}2&0\\0&-2\end{bmatrix}.$$
(c) $[f(x)]_B = [e^{2x} - 3e^{-2x}]_B = \begin{bmatrix}1\\-3\end{bmatrix}$, so
$$[D(f(x))]_B = [D]_B[f(x)]_B = \begin{bmatrix}2&0\\0&-2\end{bmatrix}\begin{bmatrix}1\\-3\end{bmatrix} = \begin{bmatrix}2\\6\end{bmatrix},$$
so that $D(f(x)) = f'(x) = 2e^{2x} + 6e^{-2x}$. Directly, $f'(x) = e^{2x}\cdot 2 - 3e^{-2x}\cdot(-2) = 2e^{2x} + 6e^{-2x}$.
15. (a) Since
$$[e^{2x}]_B = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [e^{2x}\cos x]_B = \begin{bmatrix}0\\1\\0\end{bmatrix}, \qquad [e^{2x}\sin x]_B = \begin{bmatrix}0\\0\\1\end{bmatrix},$$
then
$$[De^{2x}]_B = [2e^{2x}]_B = \begin{bmatrix}2\\0\\0\end{bmatrix}, \quad [D(e^{2x}\cos x)]_B = [2e^{2x}\cos x - e^{2x}\sin x]_B = \begin{bmatrix}0\\2\\-1\end{bmatrix}, \quad [D(e^{2x}\sin x)]_B = [2e^{2x}\sin x + e^{2x}\cos x]_B = \begin{bmatrix}0\\1\\2\end{bmatrix}.$$
Thus
$$[D]_B = \begin{bmatrix}2&0&0\\0&2&1\\0&-1&2\end{bmatrix}.$$
(b) $[f(x)]_B = [3e^{2x} - e^{2x}\cos x + 2e^{2x}\sin x]_B = \begin{bmatrix}3\\-1\\2\end{bmatrix}$, so
$$[D(f(x))]_B = [D]_B[f(x)]_B = \begin{bmatrix}2&0&0\\0&2&1\\0&-1&2\end{bmatrix}\begin{bmatrix}3\\-1\\2\end{bmatrix} = \begin{bmatrix}6\\0\\5\end{bmatrix},$$
so that $D(f(x)) = f'(x) = 6e^{2x} + 5e^{2x}\sin x$. Directly,
$$f'(x) = 3e^{2x}\cdot 2 - (2e^{2x}\cos x - e^{2x}\sin x) + (4e^{2x}\sin x + 2e^{2x}\cos x) = 6e^{2x} + 5e^{2x}\sin x.$$
16. (a) Since
$$[\cos x]_B = \begin{bmatrix}1\\0\\0\\0\end{bmatrix}, \qquad [\sin x]_B = \begin{bmatrix}0\\1\\0\\0\end{bmatrix}, \qquad [x\cos x]_B = \begin{bmatrix}0\\0\\1\\0\end{bmatrix}, \qquad [x\sin x]_B = \begin{bmatrix}0\\0\\0\\1\end{bmatrix},$$
then
$$[D\cos x]_B = [-\sin x]_B = \begin{bmatrix}0\\-1\\0\\0\end{bmatrix}, \qquad [D\sin x]_B = [\cos x]_B = \begin{bmatrix}1\\0\\0\\0\end{bmatrix},$$
$$[D(x\cos x)]_B = [\cos x - x\sin x]_B = \begin{bmatrix}1\\0\\0\\-1\end{bmatrix}, \qquad [D(x\sin x)]_B = [\sin x + x\cos x]_B = \begin{bmatrix}0\\1\\1\\0\end{bmatrix}.$$
Thus
$$[D]_B = \begin{bmatrix}0&1&1&0\\-1&0&0&1\\0&0&0&1\\0&0&-1&0\end{bmatrix}.$$
(b) $[f(x)]_B = [\cos x + 2x\cos x]_B = \begin{bmatrix}1\\0\\2\\0\end{bmatrix}$, so
$$[D(f(x))]_B = [D]_B[f(x)]_B = \begin{bmatrix}0&1&1&0\\-1&0&0&1\\0&0&0&1\\0&0&-1&0\end{bmatrix}\begin{bmatrix}1\\0\\2\\0\end{bmatrix} = \begin{bmatrix}2\\-1\\0\\-2\end{bmatrix},$$
so that $D(f(x)) = f'(x) = 2\cos x - \sin x - 2x\sin x$. Directly,
$$f'(x) = -\sin x + 2\cos x - 2x\sin x = 2\cos x - \sin x - 2x\sin x.$$
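The same kind of check as in Exercise 13 works for this four-dimensional space; the sketch below (names are ours) multiplies $[D]_B$ by the coordinates of $f(x) = \cos x + 2x\cos x$ and compares with a numerical derivative:

```python
import math

# B = {cos x, sin x, x cos x, x sin x}; [D]_B as in Exercise 16(a).
D = [[0, 1, 1, 0],
     [-1, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 0, -1, 0]]
f = [1, 0, 2, 0]                  # f(x) = cos x + 2x cos x
Df = [sum(D[i][j] * f[j] for j in range(4)) for i in range(4)]
print(Df)                         # [2, -1, 0, -2] -> 2 cos x - sin x - 2x sin x

x, h = 0.3, 1e-6
fx = lambda t: math.cos(t) + 2 * t * math.cos(t)
basis = lambda t: [math.cos(t), math.sin(t), t * math.cos(t), t * math.sin(t)]
num = (fx(x + h) - fx(x - h)) / (2 * h)
sym = sum(c * b for c, b in zip(Df, basis(x)))
print(abs(num - sym) < 1e-6)      # True
```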
17. (a) Let $p(x) = a + bx$; then
$$(S \circ T)p(x) = S(Tp(x)) = S\begin{bmatrix}p(0)\\p(1)\end{bmatrix} = S\begin{bmatrix}a\\a+b\end{bmatrix} = \begin{bmatrix}a - 2(a+b)\\2a - (a+b)\end{bmatrix} = \begin{bmatrix}-a-2b\\a-b\end{bmatrix}.$$
Then
$$[(S\circ T)(1)]_D = \begin{bmatrix}-1\\1\end{bmatrix}, \qquad [(S\circ T)(x)]_D = \begin{bmatrix}-2\\-1\end{bmatrix},$$
so that
$$[S\circ T]_{D\leftarrow B} = \begin{bmatrix}-1&-2\\1&-1\end{bmatrix}.$$
(b) We have
$$[T(1)]_C = \begin{bmatrix}1\\1\end{bmatrix}, \qquad [T(x)]_C = \begin{bmatrix}0\\1\end{bmatrix}, \quad\text{so that}\quad [T]_{C\leftarrow B} = \begin{bmatrix}1&0\\1&1\end{bmatrix}.$$
Next,
$$S\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}1\\2\end{bmatrix}, \qquad S\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}-2\\-1\end{bmatrix}, \quad\text{so that}\quad [S]_{D\leftarrow C} = \begin{bmatrix}1&-2\\2&-1\end{bmatrix}.$$
Then
$$[S\circ T]_{D\leftarrow B} = [S]_{D\leftarrow C}[T]_{C\leftarrow B} = \begin{bmatrix}1&-2\\2&-1\end{bmatrix}\begin{bmatrix}1&0\\1&1\end{bmatrix} = \begin{bmatrix}-1&-2\\1&-1\end{bmatrix},$$
which matches the result from part (a).
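The matrix product in part (b) takes one line to verify; the snippet below (names are ours) multiplies the two matrices found above and confirms the answer from part (a):

```python
# [S o T]_{D<-B} = [S]_{D<-C} [T]_{C<-B}, Exercise 17(b).
T = [[1, 0], [1, 1]]      # matrix of T with respect to B = {1, x} and C
S = [[1, -2], [2, -1]]    # matrix of S with respect to C and D
ST = [[sum(S[i][k] * T[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(ST)                 # [[-1, -2], [1, -1]], matching part (a)
```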
18. (a) Let $p(x) = a + bx$; then
$$(S\circ T)p(x) = S(Tp(x)) = S(a + b(x+1)) = S((a+b) + bx) = (a+b) + b(x+1) = (a+2b) + bx.$$
Then
$$[(S\circ T)(1)]_D = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [(S\circ T)(x)]_D = \begin{bmatrix}2\\1\\0\end{bmatrix}, \quad\text{so that}\quad [S\circ T]_{D\leftarrow B} = \begin{bmatrix}1&2\\0&1\\0&0\end{bmatrix}.$$
(b) We have
$$[T(1)]_C = [1]_C = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x)]_C = [x+1]_C = \begin{bmatrix}1\\1\\0\end{bmatrix}, \quad\text{so that}\quad [T]_{C\leftarrow B} = \begin{bmatrix}1&1\\0&1\\0&0\end{bmatrix}.$$
Next,
$$[S(1)]_D = [1]_D = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [S(x)]_D = [x+1]_D = \begin{bmatrix}1\\1\\0\end{bmatrix}, \qquad [S(x^2)]_D = [(x+1)^2]_D = [x^2+2x+1]_D = \begin{bmatrix}1\\2\\1\end{bmatrix},$$
so that
$$[S]_{D\leftarrow C} = \begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix}.$$
Then
$$[S\circ T]_{D\leftarrow B} = [S]_{D\leftarrow C}[T]_{C\leftarrow B} = \begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix}\begin{bmatrix}1&1\\0&1\\0&0\end{bmatrix} = \begin{bmatrix}1&2\\0&1\\0&0\end{bmatrix},$$
which matches the result from part (a).
19. In Exercise 1, the basis for $P_1$ in both the domain and codomain was already the standard basis $E$, so we can use the matrix computed there:
$$[T]_{E\leftarrow E} = \begin{bmatrix}0&1\\-1&0\end{bmatrix}.$$
Since this matrix has determinant 1, it is invertible, so that $T$ is invertible as well. Then
$$[T^{-1}(a+bx)]_E = \left([T]_{E\leftarrow E}\right)^{-1}\begin{bmatrix}a\\b\end{bmatrix} = \begin{bmatrix}0&-1\\1&0\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix} = \begin{bmatrix}-b\\a\end{bmatrix}.$$
So $T^{-1}(a + bx) = -b + ax$.
20. From Exercise 5,
$$[T]_{C\leftarrow B} = \begin{bmatrix}1&0\\1&1\\0&1\end{bmatrix}.$$
Since this matrix is not square, it is not invertible, so that $T$ is not invertible.
21. We first compute $[T]_{E\leftarrow E}$, where $E = \{1, x, x^2\}$ is the standard basis:
$$[T(1)]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \quad [T(x)]_E = [x+2]_E = \begin{bmatrix}2\\1\\0\end{bmatrix}, \quad [T(x^2)]_E = [(x+2)^2]_E = [x^2+4x+4]_E = \begin{bmatrix}4\\4\\1\end{bmatrix},$$
so that
$$[T]_{E\leftarrow E} = \begin{bmatrix}1&2&4\\0&1&4\\0&0&1\end{bmatrix}.$$
Since this is an upper triangular matrix with nonzero diagonal entries, it is invertible, so that $T$ is invertible as well. To find $T^{-1}$, we have
$$[T^{-1}(a+bx+cx^2)]_E = \left([T]_{E\leftarrow E}\right)^{-1}\begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix}1&-2&4\\0&1&-4\\0&0&1\end{bmatrix}\begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix}a-2b+4c\\b-4c\\c\end{bmatrix}.$$
So $T^{-1}(a + bx + cx^2) = (a-2b+4c) + (b-4c)x + cx^2 = a + b(x-2) + c(x-2)^2 = p(x-2)$.
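The closing identity, that the inverse matrix really implements $p(x) \mapsto p(x-2)$, can be spot-checked numerically; the sketch below (the sample coefficients and names are ours) applies the inverse matrix to a polynomial's coefficients and compares $q(x)$ with $p(x-2)$ at a few points:

```python
# Inverse matrix from Exercise 21 acting on the coefficients of p(x) = a + bx + cx^2.
Tinv = [[1, -2, 4], [0, 1, -4], [0, 0, 1]]
a, b, c = 2.0, -3.0, 5.0                     # sample polynomial
coeffs = [sum(Tinv[i][j] * v for j, v in enumerate([a, b, c])) for i in range(3)]
print(coeffs)                                # [28.0, -23.0, 5.0]

p = lambda x: a + b * x + c * x * x
q = lambda x: coeffs[0] + coeffs[1] * x + coeffs[2] * x * x
print(all(abs(q(x) - p(x - 2)) < 1e-9 for x in [-1.0, 0.5, 3.0]))   # True
```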
22. We first compute $[T]_{E\leftarrow E}$, where $E = \{1, x, x^2\}$ is the standard basis:
$$[T(1)]_E = [0]_E = \begin{bmatrix}0\\0\\0\end{bmatrix}, \qquad [T(x)]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x^2)]_E = [2x]_E = \begin{bmatrix}0\\2\\0\end{bmatrix},$$
so that
$$[T]_{E\leftarrow E} = \begin{bmatrix}0&1&0\\0&0&2\\0&0&0\end{bmatrix}.$$
Since this is an upper triangular matrix with a zero entry on the diagonal, it has determinant zero, so it is not invertible; therefore, $T$ is not invertible either.
23. We first compute $[T]_{E\leftarrow E}$, where $E = \{1, x, x^2\}$ is the standard basis:
$$[T(1)]_E = [1+0]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x)]_E = [x+1]_E = \begin{bmatrix}1\\1\\0\end{bmatrix}, \qquad [T(x^2)]_E = [x^2+2x]_E = \begin{bmatrix}0\\2\\1\end{bmatrix},$$
so that
$$[T]_{E\leftarrow E} = \begin{bmatrix}1&1&0\\0&1&2\\0&0&1\end{bmatrix}.$$
Since this is an upper triangular matrix with nonzero diagonal entries, it is invertible, so that $T$ is invertible as well. To find $T^{-1}$, we have
$$[T^{-1}(a+bx+cx^2)]_E = \left([T]_{E\leftarrow E}\right)^{-1}\begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix}1&-1&2\\0&1&-2\\0&0&1\end{bmatrix}\begin{bmatrix}a\\b\\c\end{bmatrix} = \begin{bmatrix}a-b+2c\\b-2c\\c\end{bmatrix}.$$
So $T^{-1}(a + bx + cx^2) = (a-b+2c) + (b-2c)x + cx^2$.
24. We first compute $[T]_{E\leftarrow E}$, where $E = \{E_{11}, E_{12}, E_{21}, E_{22}\}$ is the standard basis:
$$[T(E_{11})]_E = \left[\begin{bmatrix}1&0\\0&0\end{bmatrix}\begin{bmatrix}3&2\\2&1\end{bmatrix}\right]_E = \left[\begin{bmatrix}3&2\\0&0\end{bmatrix}\right]_E = \begin{bmatrix}3\\2\\0\\0\end{bmatrix}, \qquad [T(E_{12})]_E = \left[\begin{bmatrix}2&1\\0&0\end{bmatrix}\right]_E = \begin{bmatrix}2\\1\\0\\0\end{bmatrix},$$
$$[T(E_{21})]_E = \left[\begin{bmatrix}0&0\\3&2\end{bmatrix}\right]_E = \begin{bmatrix}0\\0\\3\\2\end{bmatrix}, \qquad [T(E_{22})]_E = \left[\begin{bmatrix}0&0\\2&1\end{bmatrix}\right]_E = \begin{bmatrix}0\\0\\2\\1\end{bmatrix},$$
so that
$$[T]_{E\leftarrow E} = \begin{bmatrix}3&2&0&0\\2&1&0&0\\0&0&3&2\\0&0&2&1\end{bmatrix}.$$
This matrix is block triangular, and each diagonal block is invertible, so it is invertible and therefore $T$ is invertible as well. To find $T^{-1}$, we have
$$\left[T^{-1}\begin{bmatrix}a&b\\c&d\end{bmatrix}\right]_{E} = \left([T]_{E\leftarrow E}\right)^{-1}\begin{bmatrix}a\\b\\c\\d\end{bmatrix} = \begin{bmatrix}-1&2&0&0\\2&-3&0&0\\0&0&-1&2\\0&0&2&-3\end{bmatrix}\begin{bmatrix}a\\b\\c\\d\end{bmatrix} = \begin{bmatrix}-a+2b\\2a-3b\\-c+2d\\2c-3d\end{bmatrix}.$$
So
$$T^{-1}\begin{bmatrix}a&b\\c&d\end{bmatrix} = \begin{bmatrix}-a+2b & 2a-3b\\ -c+2d & 2c-3d\end{bmatrix}.$$
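Since $T$ here is right-multiplication by $\begin{bmatrix}3&2\\2&1\end{bmatrix}$, its inverse is right-multiplication by that matrix's inverse; the sketch below (the sample matrix and names are ours) confirms both that fact and the formula just derived:

```python
# T(A) = A * B0 with B0 = [3 2; 2 1]; its inverse multiplies by B0^{-1} = [-1 2; 2 -3].
def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

B0 = [[3, 2], [2, 1]]
B0inv = [[-1, 2], [2, -3]]
prod = matmul2(B0, B0inv)
print(prod)        # [[1, 0], [0, 1]]

A = [[5, 7], [-2, 4]]              # sample input (a, b, c, d) = (5, 7, -2, 4)
Tinv = matmul2(A, B0inv)
print(Tinv)        # [[9, -11], [10, -16]] = [-a+2b, 2a-3b; -c+2d, 2c-3d]
```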
25. From Exercise 11, since the bases given there were both the standard basis,
$$[T]_{E\leftarrow E} = \begin{bmatrix}0&-1&1&0\\-1&0&0&1\\1&0&0&-1\\0&1&-1&0\end{bmatrix}.$$
This matrix has determinant zero (for example, the second and third rows are multiples of one another), so that $T$ is not invertible.
26. From Exercise 12, since the bases given there were both the standard basis,
$$[T]_{E\leftarrow E} = \begin{bmatrix}0&0&0&0\\0&1&-1&0\\0&-1&1&0\\0&0&0&0\end{bmatrix}.$$
This matrix has determinant zero, so that $T$ is not invertible.
27. In Exercise 13, the basis is $B = \{\sin x, \cos x\}$. Using the method of Example 6.83,
$$\left[\int (a\sin x + b\cos x)\,dx\right]_B = [D]_{B\leftarrow B}^{-1}\begin{bmatrix}a\\b\end{bmatrix}.$$
Using $[D]_{B\leftarrow B}$ from Exercise 13, we get
$$\left[\int (\sin x - 3\cos x)\,dx\right]_B = \begin{bmatrix}0&-1\\1&0\end{bmatrix}^{-1}\begin{bmatrix}1\\-3\end{bmatrix} = \begin{bmatrix}0&1\\-1&0\end{bmatrix}\begin{bmatrix}1\\-3\end{bmatrix} = \begin{bmatrix}-3\\-1\end{bmatrix}.$$
So $\int (\sin x - 3\cos x)\,dx = -3\sin x - \cos x + C$. This is the same answer as integrating directly.
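The antiderivative produced by the inverse matrix can be verified by differentiating it numerically; the snippet below (names are ours) does exactly that at a sample point:

```python
import math

# Antidifferentiation on span{sin x, cos x} via the inverse of [D]_B, Exercise 27.
Dinv = [[0, 1], [-1, 0]]          # inverse of [D]_B = [[0, -1], [1, 0]]
g = [1, -3]                       # g(x) = sin x - 3 cos x
F = [sum(Dinv[i][j] * g[j] for j in range(2)) for i in range(2)]
print(F)                          # [-3, -1] -> -3 sin x - cos x + C

x, h = 1.1, 1e-6
Ffun = lambda t: F[0] * math.sin(t) + F[1] * math.cos(t)
gfun = lambda t: math.sin(t) - 3 * math.cos(t)
print(abs((Ffun(x + h) - Ffun(x - h)) / (2 * h) - gfun(x)) < 1e-6)   # True
```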
28. In Exercise 14, the basis is $B = \{e^{2x}, e^{-2x}\}$. Using the method of Example 6.83,
$$\left[\int \left(ae^{2x} + be^{-2x}\right)dx\right]_B = [D]_{B\leftarrow B}^{-1}\begin{bmatrix}a\\b\end{bmatrix}.$$
Using $[D]_{B\leftarrow B}$ from Exercise 14, we get
$$\left[\int 5e^{-2x}\,dx\right]_B = \begin{bmatrix}2&0\\0&-2\end{bmatrix}^{-1}\begin{bmatrix}0\\5\end{bmatrix} = \begin{bmatrix}\frac12&0\\0&-\frac12\end{bmatrix}\begin{bmatrix}0\\5\end{bmatrix} = \begin{bmatrix}0\\-\frac52\end{bmatrix}.$$
So $\int 5e^{-2x}\,dx = -\frac52 e^{-2x} + C$. This is the same answer as integrating directly.
29. In Exercise 15, the basis is $B = \{e^{2x}, e^{2x}\cos x, e^{2x}\sin x\}$. Using the method of Example 6.83,
$$\left[\int \left(ae^{2x} + be^{2x}\cos x + ce^{2x}\sin x\right)dx\right]_B = [D]_{B\leftarrow B}^{-1}\begin{bmatrix}a\\b\\c\end{bmatrix}.$$
Using $[D]_{B\leftarrow B}$ from Exercise 15, we get
$$\left[\int \left(e^{2x}\cos x - 2e^{2x}\sin x\right)dx\right]_B = \begin{bmatrix}2&0&0\\0&2&1\\0&-1&2\end{bmatrix}^{-1}\begin{bmatrix}0\\1\\-2\end{bmatrix} = \begin{bmatrix}\frac12&0&0\\0&\frac25&-\frac15\\0&\frac15&\frac25\end{bmatrix}\begin{bmatrix}0\\1\\-2\end{bmatrix} = \begin{bmatrix}0\\\frac45\\-\frac35\end{bmatrix}.$$
So $\int \left(e^{2x}\cos x - 2e^{2x}\sin x\right)dx = \frac45 e^{2x}\cos x - \frac35 e^{2x}\sin x + C$. Checking via differentiation gives
$$\left(\frac45 e^{2x}\cos x - \frac35 e^{2x}\sin x + C\right)' = \frac45\left(2e^{2x}\cos x - e^{2x}\sin x\right) - \frac35\left(2e^{2x}\sin x + e^{2x}\cos x\right) = e^{2x}\cos x - 2e^{2x}\sin x.$$
30. In Exercise 16, the basis is $B = \{\cos x, \sin x, x\cos x, x\sin x\}$. Using the method of Example 6.83,
$$\left[\int (a\cos x + b\sin x + cx\cos x + dx\sin x)\,dx\right]_B = [D]_{B\leftarrow B}^{-1}\begin{bmatrix}a\\b\\c\\d\end{bmatrix}.$$
Using $[D]_{B\leftarrow B}$ from Exercise 16, we get
$$\left[\int (x\cos x + x\sin x)\,dx\right]_B = \begin{bmatrix}0&1&1&0\\-1&0&0&1\\0&0&0&1\\0&0&-1&0\end{bmatrix}^{-1}\begin{bmatrix}0\\0\\1\\1\end{bmatrix} = \begin{bmatrix}0&-1&1&0\\1&0&0&1\\0&0&0&-1\\0&0&1&0\end{bmatrix}\begin{bmatrix}0\\0\\1\\1\end{bmatrix} = \begin{bmatrix}1\\1\\-1\\1\end{bmatrix}.$$
So $\int (x\cos x + x\sin x)\,dx = \cos x + \sin x - x\cos x + x\sin x + C$. Checking via differentiation gives
$$(\cos x + \sin x - x\cos x + x\sin x + C)' = -\sin x + \cos x - \cos x + x\sin x + \sin x + x\cos x = x\cos x + x\sin x.$$
31. With $E = \{\mathbf{e}_1, \mathbf{e}_2\}$, we have
$$[T\mathbf{e}_1]_E = \begin{bmatrix}-4\cdot 0\\1 + 5\cdot 0\end{bmatrix}_E = \begin{bmatrix}0\\1\end{bmatrix}, \qquad [T\mathbf{e}_2]_E = \begin{bmatrix}-4\cdot 1\\0 + 5\cdot 1\end{bmatrix}_E = \begin{bmatrix}-4\\5\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}0&-4\\1&5\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ will give us a diagonalizing basis $C$. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be
$$\lambda_1 = 4,\; \mathbf{v}_1 = \begin{bmatrix}-1\\1\end{bmatrix}; \qquad \lambda_2 = 1,\; \mathbf{v}_2 = \begin{bmatrix}-4\\1\end{bmatrix}.$$
Thus
$$C = \left\{\begin{bmatrix}-1\\1\end{bmatrix}, \begin{bmatrix}-4\\1\end{bmatrix}\right\}, \quad\text{and}\quad [T]_C = \begin{bmatrix}-1&-4\\1&1\end{bmatrix}^{-1}[T]_E\begin{bmatrix}-1&-4\\1&1\end{bmatrix} = \begin{bmatrix}4&0\\0&1\end{bmatrix}.$$
32. With $E = \{\mathbf{e}_1, \mathbf{e}_2\}$, we have
$$[T\mathbf{e}_1]_E = \begin{bmatrix}1-0\\1+0\end{bmatrix}_E = \begin{bmatrix}1\\1\end{bmatrix}, \qquad [T\mathbf{e}_2]_E = \begin{bmatrix}0-1\\0+1\end{bmatrix}_E = \begin{bmatrix}-1\\1\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}1&-1\\1&1\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ would give us a diagonalizing basis $C$. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be
$$\lambda_1 = 1+i,\; \mathbf{v}_1 = \begin{bmatrix}i\\1\end{bmatrix}; \qquad \lambda_2 = 1-i,\; \mathbf{v}_2 = \begin{bmatrix}-i\\1\end{bmatrix}.$$
Since the eigenvectors are not real, $T$ is not diagonalizable.
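The eigenpairs claimed in Exercise 31 are easy to confirm by direct multiplication; the sketch below (names are ours) checks $A\mathbf{v} = \lambda\mathbf{v}$ for both pairs:

```python
# A = [T]_E from Exercise 31; eigenpairs (4, (-1, 1)) and (1, (-4, 1)).
A = [[0, -4], [1, 5]]
checks = []
for lam, v in [(4, (-1, 1)), (1, (-4, 1))]:
    Av = [A[0][0] * v[0] + A[0][1] * v[1],
          A[1][0] * v[0] + A[1][1] * v[1]]
    checks.append(Av == [lam * v[0], lam * v[1]])
print(checks)   # [True, True]
```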
33. With $E = \{1, x\}$, we have
$$[T(1)]_E = [4+x]_E = \begin{bmatrix}4\\1\end{bmatrix}, \qquad [T(x)]_E = [2+3x]_E = \begin{bmatrix}2\\3\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}4&2\\1&3\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ will give us a diagonalizing basis $C$. Using methods from Chapter 4, we find the eigenvalues and corresponding eigenvectors to be
$$\lambda_1 = 5,\; \mathbf{v}_1 = \begin{bmatrix}2\\1\end{bmatrix}; \qquad \lambda_2 = 2,\; \mathbf{v}_2 = \begin{bmatrix}-1\\1\end{bmatrix}.$$
Thus
$$C = \left\{\begin{bmatrix}2\\1\end{bmatrix}, \begin{bmatrix}-1\\1\end{bmatrix}\right\}, \quad\text{and}\quad [T]_C = \begin{bmatrix}2&-1\\1&1\end{bmatrix}^{-1}[T]_E\begin{bmatrix}2&-1\\1&1\end{bmatrix} = \begin{bmatrix}5&0\\0&2\end{bmatrix}.$$
34. With $E = \{1, x, x^2\}$, we have
$$[T(1)]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x)]_E = [x+1]_E = \begin{bmatrix}1\\1\\0\end{bmatrix}, \qquad [T(x^2)]_E = [(x+1)^2]_E = [x^2+2x+1]_E = \begin{bmatrix}1\\2\\1\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ would give us a diagonalizing basis $C$. Since this is an upper triangular matrix with 1s on the diagonal, the only eigenvalue is 1; using methods from Chapter 4 we find the eigenspace to be one-dimensional with basis $\left\{\begin{bmatrix}1\\0\\0\end{bmatrix}\right\}$. Thus $T$ is not diagonalizable.
35. With $E = \{1, x\}$, we have
$$[T(1)]_E = [1 + x\cdot 0]_E = [1]_E = \begin{bmatrix}1\\0\end{bmatrix}, \qquad [T(x)]_E = [x + x\cdot 1]_E = [2x]_E = \begin{bmatrix}0\\2\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}1&0\\0&2\end{bmatrix}.$$
Thus $T$ is already diagonal with respect to the standard basis $E$.
36. With $E = \{1, x, x^2\}$, we have
$$[T(1)]_E = [1]_E = \begin{bmatrix}1\\0\\0\end{bmatrix}, \qquad [T(x)]_E = [3x+2]_E = \begin{bmatrix}2\\3\\0\end{bmatrix}, \qquad [T(x^2)]_E = [(3x+2)^2]_E = [9x^2+12x+4]_E = \begin{bmatrix}4\\12\\9\end{bmatrix}.$$
Therefore
$$[T]_E = \begin{bmatrix}1&2&4\\0&3&12\\0&0&9\end{bmatrix}.$$
Using the method of Example 6.86(b), the eigenvectors of $[T]_E$ will give us a diagonalizing basis $C$. Since this is an upper triangular matrix, the eigenvalues are its diagonal entries 1, 3, and 9; using methods from Chapter 4 we find the corresponding eigenvectors to be
$$\lambda_1 = 1,\; \mathbf{v}_1 = \begin{bmatrix}1\\0\\0\end{bmatrix}; \qquad \lambda_2 = 3,\; \mathbf{v}_2 = \begin{bmatrix}1\\1\\0\end{bmatrix}; \qquad \lambda_3 = 9,\; \mathbf{v}_3 = \begin{bmatrix}1\\2\\1\end{bmatrix}.$$
Thus
$$C = \left\{\begin{bmatrix}1\\0\\0\end{bmatrix}, \begin{bmatrix}1\\1\\0\end{bmatrix}, \begin{bmatrix}1\\2\\1\end{bmatrix}\right\}, \quad\text{and}\quad [T]_C = \begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix}^{-1}[T]_E\begin{bmatrix}1&1&1\\0&1&2\\0&0&1\end{bmatrix} = \begin{bmatrix}1&0&0\\0&3&0\\0&0&9\end{bmatrix}.$$
37. Let $\ell$ have direction vector $\mathbf{d} = \begin{bmatrix}d_1\\d_2\end{bmatrix}$, and let $T$ be the reflection in $\ell$. Let $\mathbf{d}' = \begin{bmatrix}-d_2\\d_1\end{bmatrix}$. As in Example 6.85, we may as well assume for simplicity that $\mathbf{d}$ is a unit vector; then $\mathbf{d}'$ is as well. Continuing to follow Example 6.85, $D = \{\mathbf{d}, \mathbf{d}'\}$ is a basis for $\mathbb{R}^2$. Since $\mathbf{d}'$ is orthogonal to $\mathbf{d}$, it follows that $T\mathbf{d}' = -\mathbf{d}'$. Thus
$$[T(\mathbf{d})]_D = [\mathbf{d}]_D = \begin{bmatrix}1\\0\end{bmatrix}, \qquad [T(\mathbf{d}')]_D = [-\mathbf{d}']_D = \begin{bmatrix}0\\-1\end{bmatrix}, \quad\text{so}\quad [T]_D = \begin{bmatrix}1&0\\0&-1\end{bmatrix}.$$
Now, $[T]_E = P_{E\leftarrow D}[T]_D P_{D\leftarrow E}$, so we must find $P_{E\leftarrow D}$. The columns of this matrix are the expressions of the basis vectors of $D$ in the standard basis $E$, so we have
$$P_{E\leftarrow D} = \begin{bmatrix}d_1&-d_2\\d_2&d_1\end{bmatrix}.$$
Since we assumed $\mathbf{d}$ was a unit vector, this is an orthogonal matrix, so that
$$[T]_E = P_{E\leftarrow D}[T]_D P_{E\leftarrow D}^{-1} = P_{E\leftarrow D}[T]_D P_{E\leftarrow D}^{T} = \begin{bmatrix}d_1&-d_2\\d_2&d_1\end{bmatrix}\begin{bmatrix}1&0\\0&-1\end{bmatrix}\begin{bmatrix}d_1&d_2\\-d_2&d_1\end{bmatrix} = \begin{bmatrix}d_1^2-d_2^2 & 2d_1d_2\\ 2d_1d_2 & d_2^2-d_1^2\end{bmatrix}.$$
Note that this is the same as the answer from Exercise 26 in Section 3.6, with the assumption that $\mathbf{d}$ is a unit vector.
38. Let $T$ be the projection onto $W$. From Example 5.11, we have an orthogonal basis for $\mathbb{R}^3$,
$$\mathbf{u}_1 = \begin{bmatrix}1\\1\\0\end{bmatrix}, \qquad \mathbf{u}_2 = \begin{bmatrix}-1\\1\\1\end{bmatrix}, \qquad \mathbf{u}_3 = \begin{bmatrix}1\\-1\\2\end{bmatrix},$$
where $\{\mathbf{u}_1, \mathbf{u}_2\}$ is a basis for $W$ and $\{\mathbf{u}_3\}$ is a basis for $W^\perp$. Then
$$[T(\mathbf{u}_1)]_D = \begin{bmatrix}1\\0\\0\end{bmatrix}, \quad [T(\mathbf{u}_2)]_D = \begin{bmatrix}0\\1\\0\end{bmatrix}, \quad [T(\mathbf{u}_3)]_D = \begin{bmatrix}0\\0\\0\end{bmatrix}, \quad\text{so that}\quad [T]_D = \begin{bmatrix}1&0&0\\0&1&0\\0&0&0\end{bmatrix}.$$
Further, since the columns of $P_{E\leftarrow D}$ are the expressions of the basis vectors of $D$ in the standard basis $E$,
$$P_{E\leftarrow D} = \begin{bmatrix}1&-1&1\\1&1&-1\\0&1&2\end{bmatrix} \quad\Rightarrow\quad P_{E\leftarrow D}^{-1} = \begin{bmatrix}\frac12&\frac12&0\\-\frac13&\frac13&\frac13\\\frac16&-\frac16&\frac13\end{bmatrix}.$$
Thus
$$[T]_E = P_{E\leftarrow D}[T]_D P_{D\leftarrow E} = P_{E\leftarrow D}[T]_D P_{E\leftarrow D}^{-1} = \begin{bmatrix}\frac56&\frac16&-\frac13\\\frac16&\frac56&\frac13\\-\frac13&\frac13&\frac13\end{bmatrix}.$$
If $\mathbf{v} = \begin{bmatrix}3\\-1\\2\end{bmatrix}$, then
$$\operatorname{proj}_W \mathbf{v} = [T]_E[\mathbf{v}]_E = \begin{bmatrix}\frac56&\frac16&-\frac13\\\frac16&\frac56&\frac13\\-\frac13&\frac13&\frac13\end{bmatrix}\begin{bmatrix}3\\-1\\2\end{bmatrix} = \begin{bmatrix}\frac53\\\frac13\\-\frac23\end{bmatrix},$$
which is the same as was found in Example 11 in Section 5.2.
39. Let B = {v1 , . . . , vn }, and suppose that A is such that A[v]B = [T v]C for all v ∈ V . Then A[vi ]B =
[T vi ]C . But [vi ]B = ei since the vi are the basis vectors of B, so this equation says that Aei = [T vi ]C .
The left-hand side of this equation is the ith column of A, while the right-hand side is the ith column
of [T ]C←B . Thus those ith columns are equal; since this holds for all i, we get A = [T ]C←B .
40. Suppose that [v]B ∈ null([T ]C←B ). That means that
0 = [T ]C←B [v]B = [T v]C ,
so that [T v]C = 0 and thus v ∈ ker(T ). Similarly, if v ∈ ker(T ), then
0 = [T v]C = [T ]C←B [v]B ,
so that [v]B ∈ null([T ]C←B ). Then the map V → Rn : v 7→ [v]B is an isomorphism mapping ker(T ) to
null([T ]C←B ), so the two subspaces are isomorphic and thus have the same dimension. So nullity(T ) =
nullity([T ]C←B ).
41. Suppose $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$, and write $A = [T]_{C\leftarrow B}$ with columns $\mathbf{a}_i$. By definition of the matrix of $T$ with respect to these bases,
$$\mathbf{a}_i = [T]_{C\leftarrow B}[\mathbf{b}_i]_B = [T(\mathbf{b}_i)]_C.$$
Now $\operatorname{rank}(A) = \dim \operatorname{col}(A)$, which is the number of linearly independent columns $\mathbf{a}_i$, and $\operatorname{rank}(T) = \dim \operatorname{range}(T)$, which is the number of linearly independent vectors $T(\mathbf{b}_i)$. Since the coordinate map $\mathbf{w} \mapsto [\mathbf{w}]_C$ is an isomorphism, $\{\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_m}\}$ is linearly independent if and only if $\{T(\mathbf{b}_{i_1}), \ldots, T(\mathbf{b}_{i_m})\}$ is linearly independent, so $\operatorname{col}(A) = \operatorname{span}(\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_m})$ if and only if $\operatorname{range}(T) = \operatorname{span}(T(\mathbf{b}_{i_1}), \ldots, T(\mathbf{b}_{i_m}))$, and therefore $\operatorname{rank}(A) = \operatorname{rank}(T)$.
42. T is diagonalizable if and only if there is some basis C and matrix P such that [T ]C = P −1 AP is
diagonal. Thus if T is diagonalizable, so is A. For the reverse, if A is diagonalizable, then the basis C
whose elements are the columns of P is a basis in which [T ]C is diagonal.
43. Let dim V = n and dim W = m. Then [T ]C←B is an m × n matrix, and by Theorem 3.26 in Section
3.5, rank([T ]C←B ) + nullity([T ]C←B ) = n. But by Exercises 40 and 41, rank([T ]C←B ) = rank(T ) and
nullity([T ]C←B ) = nullity(T ); since dim V = n, we get
dim V = rank(T ) + nullity(T ).
44. Claim that [T ]C 0 ←B0 = PC 0 ←C [T ]C←B PB←B0 . To prove this, it is enough to show that for all x ∈ V ,
[x]C 0 = PC 0 ←C [T ]C←B PB←B0 [x]B0
since the matrix of T with respect to B 0 and C 0 is unique by Exercise 39. And indeed
PC 0 ←C [T ]C←B PB←B0 [x]B0 = PC 0 ←C [T ]C←B (PB←B0 [x]B0 ) = PC 0 ←C ([T ]C←B [x]B ) = PC 0 ←C [x]C = [x]C 0 .
45. For T ∈ L(V, W ), define ϕ(T ) = [T ]C←B ∈ Mmn . Claim that ϕ is an isomorphism. We must first show
it is linear:
$$\varphi(S+T) = [S+T]_{C\leftarrow B} = \begin{bmatrix}[(S+T)\mathbf{b}_1]_C & \cdots & [(S+T)\mathbf{b}_n]_C\end{bmatrix} = \begin{bmatrix}[S\mathbf{b}_1]_C + [T\mathbf{b}_1]_C & \cdots & [S\mathbf{b}_n]_C + [T\mathbf{b}_n]_C\end{bmatrix}$$
$$= \begin{bmatrix}[S\mathbf{b}_1]_C & \cdots & [S\mathbf{b}_n]_C\end{bmatrix} + \begin{bmatrix}[T\mathbf{b}_1]_C & \cdots & [T\mathbf{b}_n]_C\end{bmatrix} = \varphi(S) + \varphi(T).$$
Similarly
$$\varphi(cS) = [cS]_{C\leftarrow B} = \begin{bmatrix}[(cS)\mathbf{b}_1]_C & \cdots & [(cS)\mathbf{b}_n]_C\end{bmatrix} = \begin{bmatrix}c[S\mathbf{b}_1]_C & \cdots & c[S\mathbf{b}_n]_C\end{bmatrix} = c\begin{bmatrix}[S\mathbf{b}_1]_C & \cdots & [S\mathbf{b}_n]_C\end{bmatrix} = c\varphi(S).$$
Next we show that ker ϕ = 0, so that ϕ is one-to-one. Suppose that ϕ(T ) = 0. That means that
[T ]C←B = [0]mn , so that nullity([T ]C←B ) = n. By Exercise 40, this is equal to nullity(T ). Thus
nullity(T ) = n = dim V , so that T is the zero linear transformation, showing that ϕ is one-to-one.
Finally, since dim Mmn = mn, we will have shown that ϕ is an isomorphism if we prove that
dim L(V, W ) = mn. But if {v1 , . . . , vn } and {w1 , . . . , wm } are bases for V and W , then a linear
map from V to W sends each vi to some linear combination of the wj , so it is a sum of maps that send
vi to wk and the other vi to zero. There are mn of these maps, which are clearly linearly independent,
so that dim L(V, W ) = mn.
46. From Exercise 45 with $W = \mathbb{R}$, we have $\dim W = \dim \mathbb{R} = 1$, so that $L(V, \mathbb{R}) \cong M_{1n}$. Since $\dim M_{1n} = n = \dim V$, we get
$$V^* = L(V, \mathbb{R}) \cong M_{1n} \cong V.$$
Exploration: Tiles, Lattices, and the Crystallographic Restriction
1. Let P be the pattern in Figure 6.15. We are given, from the figure, that P +u = P and that P +v = P .
Therefore if a is a positive integer, then P + au = (P + u) + (a − 1)u = P + (a − 1)u, and by induction
this is just P . Similarly if b is a positive integer then P + bv = (P + v) + (b − 1)v = P + (b − 1)v,
which again by induction is equal to P . If a is a negative integer, then since P + (−a)u = P , it follows
that P = P − (−a)u = P + au, and similarly for b < 0 and v.
2. The lattice points correspond to the centers of the trefoils in Figure 6.15. (The accompanying figure, showing the lattice spanned by the vectors u and v, is omitted here.)
3. Let $\theta$ be the smallest positive angle through which the lattice exhibits rotational symmetry. Let $R_O^\theta$ denote the operation of rotation through an angle $\theta > 0$ around a center of rotation $O$, and let $P$ be a pattern or lattice. Now we have
$$R_O^{n\theta}(P) = \left(R_O^\theta\right)^n(P) = P, \qquad n \in \mathbb{Z},\ n > 0,$$
since rotation through $n\theta$ is the same as applying a rotation through $\theta$ $n$ times. If $n < 0$, then rotating through $n\theta$ gives a figure which, if rotated back through $(-n)\theta$, is $P$, so it must be $P$ as well. This shows that $P$ is rotationally symmetric under any integer multiple of $\theta$.
Next, let $m$ be the smallest positive integer such that $m\theta \geq 360^\circ$, and suppose that $m\theta = 360^\circ + \varphi$. Note that $R_O^{360^\circ}(P) = P$, so that $R_O^{m\theta}(P) = R_O^{360^\circ+\varphi}(P) = R_O^{\varphi}(P)$. We want to show that $\frac{360^\circ}{\theta}$ is an integer, so it suffices to show that $\varphi = 0$. Now, since $m$ is minimal, we know that
$$m\theta - \varphi = 360^\circ > (m-1)\theta,$$
so that $m\theta - \varphi > m\theta - \theta$ and thus $-\varphi > -\theta$, so that $\varphi < \theta$. But
$$P = R_O^{m\theta}(P) = R_O^{360^\circ+\varphi}(P) = R_O^{\varphi}(P),$$
and we see that the lattice is rotationally symmetric under a rotation of $\varphi < \theta$. Since $\varphi \geq 0$ and $\theta$ is the smallest positive such angle, it follows that $\varphi = 0$. Thus $\theta = \frac{360^\circ}{m}$.
4. The smallest positive angle of rotational symmetry in the lattice of Exercise 2 is 60◦ ; this rotational
symmetry is also present in the original figure, in Figure 6.14 or 6.15.
5. Some examples are below: lattices with n-fold rotational symmetry for n = 1, 2, 3, 4, and 6, each generated by suitable vectors u and v (figures omitted here).
It is impossible to draw a lattice with 8-fold rotational symmetry, as the rest of this section will show.
6. (a) With $\theta = 60^\circ$, we have
$$\mathbf{u} = \begin{bmatrix}1\\0\end{bmatrix}, \qquad \mathbf{v} = \begin{bmatrix}\frac12\\\frac{\sqrt3}{2}\end{bmatrix}, \qquad\text{and}\qquad [R_\theta]_E = \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix} = \begin{bmatrix}\frac12 & -\frac{\sqrt3}{2}\\ \frac{\sqrt3}{2} & \frac12\end{bmatrix}.$$
(b) From the discussion in part (a),
$$R_\theta(\mathbf{u}) = \begin{bmatrix}\frac12 & -\frac{\sqrt3}{2}\\ \frac{\sqrt3}{2} & \frac12\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}\frac12\\\frac{\sqrt3}{2}\end{bmatrix} = \mathbf{v},$$
$$R_\theta(\mathbf{v}) = \begin{bmatrix}\frac12 & -\frac{\sqrt3}{2}\\ \frac{\sqrt3}{2} & \frac12\end{bmatrix}\begin{bmatrix}\frac12\\\frac{\sqrt3}{2}\end{bmatrix} = \begin{bmatrix}-\frac12\\\frac{\sqrt3}{2}\end{bmatrix} = -\mathbf{u} + \mathbf{v}.$$
With $B = \{\mathbf{u}, \mathbf{v}\}$, we have
$$[R_\theta(\mathbf{u})]_B = [\mathbf{v}]_B = \begin{bmatrix}0\\1\end{bmatrix}, \qquad [R_\theta(\mathbf{v})]_B = [-\mathbf{u}+\mathbf{v}]_B = \begin{bmatrix}-1\\1\end{bmatrix} \quad\Rightarrow\quad [R_\theta]_B = \begin{bmatrix}0&-1\\1&1\end{bmatrix}.$$
7. Since the lattice is invariant under $R_\theta$, $\mathbf{u}$ and $\mathbf{v}$ are mapped to lattice points under this rotation. But each lattice point is by definition $a\mathbf{u} + b\mathbf{v}$ where $a$ and $b$ are integers. Thus
$$[R_\theta(\mathbf{u})]_B = [a\mathbf{u} + c\mathbf{v}]_B = \begin{bmatrix}a\\c\end{bmatrix}, \qquad [R_\theta(\mathbf{v})]_B = [b\mathbf{u} + d\mathbf{v}]_B = \begin{bmatrix}b\\d\end{bmatrix} \quad\Rightarrow\quad [R_\theta]_B = \begin{bmatrix}a&b\\c&d\end{bmatrix},$$
and the matrix entries are integers.
8. Let E = {e1 , e2 } and B = {u, v}. Then by Theorem 6.29 in Section 6.6, [Rθ ]E = P −1 [Rθ ]B P . But this
means that [Rθ ]E ∼ [Rθ ]B (that is, the two matrices are similar). By Exercise 35 in Section 4.4, this
implies that
2 cos θ = tr[Rθ ]E = tr[Rθ ]B = a + d,
which is an integer.
9. If $2\cos\theta = m$ is an integer, then $m$ must be $0$, $\pm 1$, or $\pm 2$, so the angles are those whose cosines are $0$, $\pm\frac12$, or $\pm 1$; these angles are
$$\theta = 60^\circ, 90^\circ, 120^\circ, 180^\circ, 240^\circ, 270^\circ, 300^\circ, 360^\circ,$$
corresponding to $n = 1, 2, 3, 4,$ or $6$.
These are the only possible n-fold rotational symmetries.
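The crystallographic restriction just derived (that $2\cos\frac{360^\circ}{n}$ must be an integer) is easy to scan by machine; the sketch below (names are ours) tests each candidate $n$ and recovers exactly the allowed values:

```python
import math

# For n-fold symmetry, 2*cos(360/n degrees) must be an integer (Exercises 8-9).
allowed = []
for n in range(1, 13):
    t = 2 * math.cos(2 * math.pi / n)
    if abs(t - round(t)) < 1e-9:
        allowed.append(n)
print(allowed)    # [1, 2, 3, 4, 6]
```

In particular n = 8 fails, since $2\cos 45^\circ = \sqrt 2$ is irrational.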
10. Left to the reader.
6.7
Applications
1. Here a = −3, so by Theorem 6.32, {e3t } is a basis for solutions, which therefore have the form
y(t) = ce3t . Since y(1) = ce3 = 2, we have c = 2e−3 , so that y(t) = 2e−3 e3t = 2e3t−3 .
2. Here a = 1, so by Theorem 6.32, {e−t } is a basis for solutions, which therefore have the form x(t) = ce−t .
Since x(1) = ce−1 = 1, we have c = e, so that x(t) = ee−t = e1−t .
3. $y'' - 7y' + 12y = 0$ corresponds to the characteristic equation $\lambda^2 - 7\lambda + 12 = (\lambda-3)(\lambda-4) = 0$. Then $\lambda_1 = 3$ and $\lambda_2 = 4$, so Theorem 6.33 tells us that the general solution has the form $y(t) = c_1e^{3t} + c_2e^{4t}$. Then
$$\begin{aligned} y(0) = 1 &\;\Rightarrow\; c_1 + c_2 = 1 \\ y(1) = 1 &\;\Rightarrow\; e^3c_1 + e^4c_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = \frac{1-e^4}{e^3-e^4}, \quad c_2 = \frac{e^3-1}{e^3-e^4},$$
so that
$$y(t) = \frac{1}{e^3-e^4}\left[\left(1-e^4\right)e^{3t} + \left(e^3-1\right)e^{4t}\right].$$
4. $x'' + x' - 12x = 0$ corresponds to the characteristic equation $\lambda^2 + \lambda - 12 = (\lambda-3)(\lambda+4) = 0$. Then $\lambda_1 = 3$ and $\lambda_2 = -4$, so Theorem 6.33 tells us that the general solution has the form $x(t) = c_1e^{3t} + c_2e^{-4t}$. Then
$$\begin{aligned} x(0) = 0 &\;\Rightarrow\; c_1 + c_2 = 0 \\ x'(0) = 1 &\;\Rightarrow\; 3c_1 - 4c_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = \frac17, \quad c_2 = -\frac17,$$
so that
$$x(t) = \frac17\left(e^{3t} - e^{-4t}\right).$$
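A closed-form ODE solution like this one can be sanity-checked by plugging it back into the equation; the sketch below (names are ours) verifies the initial conditions and the residual $x'' + x' - 12x$ at several points:

```python
import math

# x(t) = (e^{3t} - e^{-4t}) / 7 from Exercise 4, with its exact derivatives.
x = lambda t: (math.exp(3 * t) - math.exp(-4 * t)) / 7
xp = lambda t: (3 * math.exp(3 * t) + 4 * math.exp(-4 * t)) / 7
xpp = lambda t: (9 * math.exp(3 * t) - 16 * math.exp(-4 * t)) / 7

print(abs(x(0)) < 1e-12, abs(xp(0) - 1) < 1e-12)                              # True True
print(all(abs(xpp(t) + xp(t) - 12 * x(t)) < 1e-9 for t in [0.0, 0.5, 1.0]))   # True
```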
5. $f'' - f' - f = 0$ corresponds to the characteristic equation $\lambda^2 - \lambda - 1 = 0$, so that $\lambda_1 = \frac{1+\sqrt5}{2}$ and $\lambda_2 = \frac{1-\sqrt5}{2}$. Therefore by Theorem 6.33, the general solution is $f(t) = c_1e^{(1+\sqrt5)t/2} + c_2e^{(1-\sqrt5)t/2}$. So
$$\begin{aligned} f(0) = 0 &\;\Rightarrow\; c_1 + c_2 = 0 \\ f(1) = 1 &\;\Rightarrow\; e^{(1+\sqrt5)/2}c_1 + e^{(1-\sqrt5)/2}c_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = \frac{e^{(\sqrt5-1)/2}}{e^{\sqrt5}-1}, \quad c_2 = -\frac{e^{(\sqrt5-1)/2}}{e^{\sqrt5}-1},$$
so that
$$f(t) = \frac{e^{(\sqrt5-1)/2}}{e^{\sqrt5}-1}\left(e^{(1+\sqrt5)t/2} - e^{(1-\sqrt5)t/2}\right).$$
6. $g'' - 2g = 0$ corresponds to the characteristic equation $\lambda^2 - 2 = 0$, so that $\lambda_1 = \sqrt2$ and $\lambda_2 = -\sqrt2$. Therefore by Theorem 6.33, the general solution is $g(t) = c_1e^{\sqrt2\,t} + c_2e^{-\sqrt2\,t}$. So
$$\begin{aligned} g(0) = 1 &\;\Rightarrow\; c_1 + c_2 = 1 \\ g(1) = 0 &\;\Rightarrow\; e^{\sqrt2}c_1 + e^{-\sqrt2}c_2 = 0 \end{aligned} \qquad\Rightarrow\qquad c_1 = -\frac{1}{e^{2\sqrt2}-1}, \quad c_2 = \frac{e^{2\sqrt2}}{e^{2\sqrt2}-1},$$
so that
$$g(t) = \frac{1}{e^{2\sqrt2}-1}\left(-e^{\sqrt2\,t} + e^{2\sqrt2 - \sqrt2\,t}\right).$$
7. $y'' - 2y' + y = 0$ corresponds to the characteristic equation $\lambda^2 - 2\lambda + 1 = (\lambda-1)^2 = 0$, so that $\lambda_1 = \lambda_2 = 1$. Therefore by Theorem 6.33, the general solution is $y(t) = c_1e^t + c_2te^t$. So
$$\begin{aligned} y(0) = 1 &\;\Rightarrow\; c_1 = 1 \\ y(1) = 1 &\;\Rightarrow\; ec_1 + ec_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = 1, \quad c_2 = \frac{1-e}{e},$$
so that
$$y(t) = e^t + \frac{1-e}{e}te^t = e^t + (1-e)te^{t-1}.$$
8. $x'' + 4x' + 4x = 0$ corresponds to the characteristic equation $\lambda^2 + 4\lambda + 4 = (\lambda+2)^2 = 0$, so that $\lambda_1 = \lambda_2 = -2$. Therefore by Theorem 6.33, the general solution is $x(t) = c_1e^{-2t} + c_2te^{-2t}$. So
$$\begin{aligned} x(0) = 1 &\;\Rightarrow\; c_1 = 1 \\ x'(0) = 1 &\;\Rightarrow\; -2c_1 + c_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = 1, \quad c_2 = 3,$$
so that
$$x(t) = e^{-2t} + 3te^{-2t}.$$
9. $y'' - k^2y = 0$ corresponds to the characteristic equation $\lambda^2 - k^2 = 0$, so that $\lambda_1 = k$ and $\lambda_2 = -k$. Therefore by Theorem 6.33, the general solution is $y(t) = c_1e^{kt} + c_2e^{-kt}$. So
$$\begin{aligned} y(0) = 1 &\;\Rightarrow\; c_1 + c_2 = 1 \\ y'(0) = 1 &\;\Rightarrow\; kc_1 - kc_2 = 1 \end{aligned} \qquad\Rightarrow\qquad c_1 = \frac{k+1}{2k}, \quad c_2 = \frac{k-1}{2k},$$
so that
$$y(t) = \frac{1}{2k}\left[(k+1)e^{kt} + (k-1)e^{-kt}\right].$$
10. $y'' - 2ky' + k^2y = 0$ corresponds to the characteristic equation $\lambda^2 - 2k\lambda + k^2 = (\lambda-k)^2 = 0$, so that $\lambda_1 = \lambda_2 = k$. Therefore by Theorem 6.33, the general solution is $y(t) = c_1e^{kt} + c_2te^{kt}$. So
$$\begin{aligned} y(0) = 1 &\;\Rightarrow\; c_1 = 1 \\ y(1) = 0 &\;\Rightarrow\; c_1 + c_2 = 0 \end{aligned} \qquad\Rightarrow\qquad c_1 = 1, \quad c_2 = -1,$$
so that
$$y(t) = e^{kt} - te^{kt}.$$
11. $f'' - 2f' + 5f = 0$ corresponds to the characteristic equation $\lambda^2 - 2\lambda + 5 = 0$, so that $\lambda_1 = 1 + 2i$ and $\lambda_2 = 1 - 2i$. Therefore by Theorem 6.33, the general solution is $f(t) = c_1e^t\cos 2t + c_2e^t\sin 2t$. Then
$$\begin{aligned} f(0) = 1 &\;\Rightarrow\; c_1 = 1 \\ f\!\left(\tfrac{\pi}{4}\right) = 0 &\;\Rightarrow\; e^{\pi/4}c_2 = 0 \end{aligned} \qquad\Rightarrow\qquad c_1 = 1, \quad c_2 = 0,$$
so that
$$f(t) = e^t\cos 2t.$$
12. $h'' - 4h' + 5h = 0$ corresponds to the characteristic equation $\lambda^2 - 4\lambda + 5 = 0$, so that $\lambda_1 = 2 + i$ and $\lambda_2 = 2 - i$. Therefore by Theorem 6.33, the general solution is $h(t) = c_1e^{2t}\cos t + c_2e^{2t}\sin t$. Then $h'(t) = c_1\left(2e^{2t}\cos t - e^{2t}\sin t\right) + c_2\left(2e^{2t}\sin t + e^{2t}\cos t\right)$, and
$$\begin{aligned} h(0) = 0 &\;\Rightarrow\; c_1 = 0 \\ h'(0) = -1 &\;\Rightarrow\; 2c_1 + c_2 = -1 \end{aligned} \qquad\Rightarrow\qquad c_1 = 0, \quad c_2 = -1,$$
so that
$$h(t) = -e^{2t}\sin t.$$
13. (a) We start with $p(t) = ce^{kt}$; then $p(0) = ce^{0} = c = 100$. Next, $p(3) = 100e^{3k} = 1600$, so that $e^{3k} = 16$ and $k = \frac{\ln 16}{3}$. Hence
$$p(t) = 100e^{(\ln 16/3)t} = 100\left(e^{\ln 16}\right)^{t/3} = 100 \cdot 16^{t/3}.$$
(b) The population doubles when $p(t) = 100e^{(\ln 16/3)t} = 200$. Simplifying gives
$$\frac{\ln 16}{3}t = \ln 2, \quad\text{so that}\quad t = \frac{3\ln 2}{\ln 16} = \frac{3\ln 2}{4\ln 2} = \frac34.$$
The population doubles in $\frac34$ of an hour, or 45 minutes.
(c) We want to find $t$ such that $p(t) = 100e^{(\ln 16/3)t} = 1000000$. Simplifying gives
$$\frac{\ln 16}{3}t = \ln 10000, \quad\text{so that}\quad t = \frac{3\ln 10000}{\ln 16} = \frac{12\ln 10}{4\ln 2} = \frac{3\ln 10}{\ln 2} \approx 9.966 \text{ hours.}$$
14. (a) Start with $p(t) = ce^{kt}$; then $p(0) = ce^{0} = c = 76$. Next, $p(1) = 76e^{k} = 92$, so that $k = \ln\frac{92}{76}$, and therefore $p(t) = 76e^{t\ln(92/76)}$. For the year 2000, we get
$$p(10) = 76e^{10\ln(92/76)} \approx 513.5.$$
This is almost twice the actual population in 2000.
(b) Start with $p(t) = ce^{kt}$; then $p(0) = ce^{0} = c = 203$. Next, $p(1) = 203e^{k} = 227$, so that $k = \ln\frac{227}{203}$, and therefore $p(t) = 203e^{t\ln(227/203)}$. For the year 2000, we get
$$p(3) = 203e^{3\ln(227/203)} \approx 283.8.$$
This is a pretty good approximation.
(c) Since the first estimate is very high but the second estimate is close, we can conclude that population growth was lower than exponential in the early 20th century, but has been close to exponential
since 1970. This could be attributed for example to three major wars during the period from 1900
to 1970.
15. (a) Let $m(t) = ae^{-ct}$. Then $m(0) = ae^{-c\cdot 0} = a = 50$, so that $m(t) = 50e^{-ct}$. Since the half-life is 1590 years, we know that 25 mg will remain after that time, so that
$$m(1590) = 50e^{-1590c} = 25.$$
Simplifying gives $-1590c = \ln\frac12 = -\ln 2$, so that $c = \frac{\ln 2}{1590}$. So finally, $m(t) = 50e^{-t\ln 2/1590}$. After 1000 years, the amount remaining is
$$m(1000) = 50e^{-1000\ln 2/1590} \approx 32.33 \text{ mg.}$$
(b) Only 10 mg will remain when $m(t) = 10$. Solving $50e^{-t\ln 2/1590} = 10$ gives
$$-t\frac{\ln 2}{1590} = \ln\frac15 = -\ln 5, \quad\text{so that}\quad t = \frac{1590\ln 5}{\ln 2} \approx 3692 \text{ years.}$$
16. With $m(t) = ae^{-ct}$, a half-life of 5730 years means that $m(5730) = ae^{-5730c} = \frac{a}{2}$, so that $e^{-5730c} = \frac12$ and thus $c = -\frac{\ln(1/2)}{5730} = \frac{\ln 2}{5730}$. To find out how old the posts are, we want to find $t$ such that
$$m(t) = ae^{-t\ln 2/5730} = 0.45a, \quad\text{which gives}\quad t = -\frac{5730\ln 0.45}{\ln 2} \approx 6600 \text{ years.}$$
17. Following Example 6.92, the general equation of the spring motion is $x(t) = c_1\cos\sqrt{K}\,t + c_2\sin\sqrt{K}\,t$. Then $x(0) = 10 = c_1\cos 0 + c_2\sin 0 = c_1$, so that $c_1 = 10$. Then
$$x(10) = 5 = 10\cos\left(10\sqrt{K}\right) + c_2\sin\left(10\sqrt{K}\right), \quad\text{so that}\quad c_2 = \frac{5 - 10\cos\left(10\sqrt{K}\right)}{\sin\left(10\sqrt{K}\right)}.$$
Therefore the equation of the spring, giving its length at time $t$, is
$$x(t) = 10\cos\sqrt{K}\,t + \frac{5 - 10\cos\left(10\sqrt{K}\right)}{\sin\left(10\sqrt{K}\right)}\sin\sqrt{K}\,t.$$
18. From the analysis in Example 6.92, since $K = \frac{k}{m}$ and $m = 50$, we have $K = \frac{k}{50}$, so that
$$x(t) = c_1\cos\sqrt{K}\,t + c_2\sin\sqrt{K}\,t = c_1\cos\sqrt{\tfrac{k}{50}}\,t + c_2\sin\sqrt{\tfrac{k}{50}}\,t.$$
Since the periods of $\sin bt$ and $\cos bt$ are each $\frac{2\pi}{b}$, the fact that the period is 10 means that $\frac{2\pi}{\sqrt{k/50}} = 10$, so that $\sqrt{\frac{k}{50}} = \frac{\pi}{5}$ and thus $k = 2\pi^2$.
19. The differential equation for the pendulum motion is the same as the equation of spring motion, so we apply the analysis from Example 6.92. Recall that the periods of $\sin bt$ and $\cos bt$ are each $\frac{2\pi}{b}$.
(a) Since $\theta'' + \frac{g}{L}\theta = 0$, we have $K = \frac{g}{L}$ and
$$\theta(t) = c_1\cos\sqrt{K}\,t + c_2\sin\sqrt{K}\,t = c_1\cos\sqrt{\tfrac{g}{L}}\,t + c_2\sin\sqrt{\tfrac{g}{L}}\,t.$$
Then the period is $P = \frac{2\pi}{\sqrt{g/L}} = 2\pi\sqrt{\frac{L}{g}}$. In this case, since $L = 1$, we get $P = 2\pi\sqrt{\frac1g} = \frac{2\pi}{\sqrt g}$.
(b) Since $\theta_1$ does not appear in the formula, the period does not depend on $\theta_1$. It depends only on the gravitational constant $g$ and the length of the pendulum $L$. (Note that it also does not depend on the weight of the pendulum.)
20. We want to show that if y1 and y2 are two solutions, and c is a constant, then y1 + y2 and cy1 are also
solutions. But this follows directly from properties of the derivative:
(y1 + y2 )00 + a(y1 + y2 )0 + b(y1 + y2 ) = (y100 + ay10 + by1 ) + (y200 + ay20 + by2 ) = 0 + 0 = 0
(cy1 )00 + a(cy1 )0 + b(cy1 ) = cy100 + acy10 + bcy1 = c(y100 + ay10 + by1 ) = c · 0 = 0.
Thus the solution set is a subspace of F .
21. Since we know that $\dim S = 2$, it suffices to show that $e^{\lambda t}$ and $te^{\lambda t}$ are solutions (are in $S$) and that they are linearly independent. Note that $\lambda$ is a solution of $\lambda^2 + a\lambda + b = 0$. Then
$$\left(e^{\lambda t}\right)'' + a\left(e^{\lambda t}\right)' + be^{\lambda t} = \lambda^2e^{\lambda t} + a\lambda e^{\lambda t} + be^{\lambda t} = e^{\lambda t}\left(\lambda^2 + a\lambda + b\right) = 0.$$
For $te^{\lambda t}$, note that
$$\left(te^{\lambda t}\right)' = e^{\lambda t} + \lambda te^{\lambda t}, \qquad \left(te^{\lambda t}\right)'' = \left(e^{\lambda t} + \lambda te^{\lambda t}\right)' = \lambda e^{\lambda t} + \lambda e^{\lambda t} + \lambda^2te^{\lambda t} = 2\lambda e^{\lambda t} + \lambda^2te^{\lambda t}.$$
Then
$$\left(te^{\lambda t}\right)'' + a\left(te^{\lambda t}\right)' + bte^{\lambda t} = 2\lambda e^{\lambda t} + \lambda^2te^{\lambda t} + a\left(e^{\lambda t} + \lambda te^{\lambda t}\right) + bte^{\lambda t} = (2\lambda + a)e^{\lambda t} + te^{\lambda t}\left(\lambda^2 + a\lambda + b\right) = (2\lambda + a)e^{\lambda t}.$$
But the solutions of $\lambda^2 + a\lambda + b = 0$ are $\lambda = \frac{-a \pm \sqrt{a^2 - 4b}}{2}$, so if the two roots are equal, we must have $a^2 - 4b = 0$, so that $\lambda = -\frac{a}{2}$, so that $2\lambda = -a$ and then $2\lambda + a = 0$. So the remaining term vanishes, and $te^{\lambda t}$ is a solution.
To see that the two are linearly independent, suppose $c_1e^{\lambda t} + c_2te^{\lambda t}$ is zero for all values of $t$. Setting $t = 0$ gives $c_1 = 0$, so that $c_2te^{\lambda t}$ is zero. Then set $t = 1$ to get $c_2 = 0$. Thus the two are linearly independent, so they form a basis for the solution set $S$.
22. Suppose that c1 ept cos qt + c2 ept sin qt is identically zero. Setting t = 0 gives c1 = 0, so that c2 ept sin qt
is identically zero, which clearly forces c2 = 0, proving linear independence.
Chapter Review
1. (a) False. All we can say is that dim V ≤ n, so that any spanning set for V contains at most n vectors.
For example, let V = R, v1 = e1 and v2 = 2e1 . Then V = span(v1 , v2 ), but V has spanning sets
consisting of only one vector.
(b) True. See Exercise 17(a) in Section 6.2.
(c) True. For example, take the invertible matrices
$$I = \begin{bmatrix}1&0\\0&1\end{bmatrix}, \qquad J = \begin{bmatrix}0&1\\1&0\end{bmatrix}, \qquad L = \begin{bmatrix}1&1\\0&1\end{bmatrix}, \qquad R = \begin{bmatrix}1&1\\1&0\end{bmatrix}.$$
Then $R - J = E_{11}$, $L - I = E_{12}$, $J + I - L = E_{21}$, and $J + I - R = E_{22}$. Therefore this set of four invertible matrices spans; since $\dim M_{22} = 4$, it is a basis.
(d) False. By Exercise 2 in Section 6.5, the kernel of the map tr : M22 → R has dimension 3, so that
matrices with trace 0 span a subspace of M22 of dimension 3.
(e) False. In general, $\|x + y\| \neq \|x\| + \|y\|$. For example, take $x = \mathbf{e}_1$ and $y = \mathbf{e}_2$ in $\mathbb{R}^2$.
(f ) True. If T is one-to-one and {ui } is a basis for V , then {T ui } is a linearly independent set by
Theorem 6.22. It spans range(T ), so it is a basis for range(T ). Thus if T is onto, range(T ) = W ,
so that {T ui } is a basis for W and therefore dim V = dim W .
(g) False. This is only true if T is onto. For example, let T : R → R be T (x) = 0. Then ker T = R
but the codomain is not {0} (although the range is).
(h) True. If nullity(T ) = 4, then rank T = dim M33 − nullity(T ) = 9 − 4 = 5. So range(T ) is a
five-dimensional subspace of P4 , so it is the entire space.
(i) True. Since dim P3 = 4, it suffices to show that dim V = 4. But polynomials in V are of the
form a + bx + cx² + dx³ + ex⁴ where a + b + c + d + e = 0, so they are of the form
(−b − c − d − e) + bx + cx² + dx³ + ex⁴ = b(x − 1) + c(x² − 1) + d(x³ − 1) + e(x⁴ − 1),
where b, c, d, and e are arbitrary. Thus a basis for V is {x − 1, x² − 1, x³ − 1, x⁴ − 1},
so dim V = 4.
(j) False. See Example 6.80 in Section 6.6.
2. This is the subspace {0}, since the only vector [x; y] satisfying x² + 3y² = 0 is [0; 0].
3. The conditions a + b = a + c and a + c = c + d imply that b = c and a = d, so that
W = {[a b; b a]}.
CHAPTER 6. VECTOR SPACES
(Note that matrices of this form satisfy all of the original equalities.) To show that this is a subspace,
we must show that it is closed under sums and scalar multiplication. Thus suppose
[a b; b a], [a′ b′; b′ a′] ∈ W,   and let c be a scalar.
Then
[a b; b a] + [a′ b′; b′ a′] = [a + a′, b + b′; b + b′, a + a′] ∈ W
and
c[a b; b a] = [ca cb; cb ca] ∈ W.
4. To show that this is a subspace, we must show that it is closed under sums and scalar multiplication.
Suppose p(x), q(x) ∈ W. Then
x³(p + q)(1/x) = x³(p(1/x) + q(1/x)) = x³p(1/x) + x³q(1/x) = p(x) + q(x) = (p + q)(x),
so that p + q ∈ W. Next, if c is a scalar, then
x³(cp)(1/x) = cx³p(1/x) = cp(x) = (cp)(x),
so that cp ∈ W. Thus W is a subspace.
5. To show that this is a subspace, we must show that it is closed under sums and scalar multiplication.
Suppose f (x), g(x) ∈ W . Then
(f + g)(x + π) = f (x + π) + g(x + π) = f (x) + g(x) = (f + g)(x),
so that f + g ∈ W . Next, if c is a scalar, then
(cf )(x + π) = cf (x + π) = cf (x) = (cf )(x),
so that cf ∈ W . Thus W is a subspace.
6. They are linearly dependent. By the double angle formula, cos 2x = 1 − 2 sin² x, so that
1 − cos 2x − (2/3)(3 sin² x) = 0.
7. Suppose that cA + dB = O. Taking the transpose of both sides gives cAT + dB T = O; since A is
symmetric and B is skew-symmetric, this is just cA − dB = O. Adding these two equations gives
2cA = O. Since A is nonzero, we must have c = 0, so that dB = O. Then since B is nonzero, we must
have d = 0. Thus A and B are linearly independent.
8. The condition a + d = b + c means that a = b + c − d, so that
W = {[a b; c d] : a + d = b + c} = {[b + c − d, b; c, d]}.
Then define a basis by finding three matrices in W in each of which exactly one of b, c, and d is 1 while the
others are zero:
B = [1 1; 0 0],   C = [1 0; 1 0],   D = [−1 0; 0 1].
Consider the expression
bB + cC + dD = [b + c − d, b; c, d].
If this is O, then clearly b = c = d = 0, so that B, C, and D are linearly independent. But also,
bB + cC + dD is a general matrix in W, so that B, C, and D span W. Thus {B, C, D} is a basis for
W.
9. Suppose p(x) = a + bx + cx2 + dx3 + ex4 + f x5 ∈ P5 with p(−x) = p(x). Then
p(1) = p(−1) ⇒ a + b + c + d + e + f = a − b + c − d + e − f ⇒ b + d + f = 0
p(2) = p(−2) ⇒ a + 2b + 4c + 8d + 16e + 32f = a − 2b + 4c − 8d + 16e − 32f ⇒ b + 4d + 16f = 0
p(3) = p(−3) ⇒ a + 3b + 9c + 27d + 81e + 243f = a − 3b + 9c − 27d + 81e − 243f
⇒ b + 9d + 81f = 0.
These three equations in the unknowns b, d, f have only the trivial solution, so that b = d = f = 0,
and p(x) = a + cx² + ex⁴. Any polynomial of this form clearly has p(x) = p(−x), since (−x)² = x²
and (−x)⁴ = x⁴. So {1, x², x⁴} is a basis for W.
10. Let B = {1, 1 + x, 1 + x + x²} = {b1, b2, b3} and C = {1 + x, x + x², 1 + x²} = {c1, c2, c3}. Then
[c1]B = [0; 1; 0],   [c2]B = [−1; 0; 1],   [c3]B = [1; −1; 1],
so that by Theorem 6.13
PB←C = [0 −1 1; 1 0 −1; 0 1 1].
Then
PC←B = (PB←C)⁻¹ = [1/2, 1, 1/2; −1/2, 0, 1/2; 1/2, 0, 1/2].
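The inverse can be checked numerically, for example with numpy:

```python
import numpy as np

P_BC = np.array([[0, -1,  1],
                 [1,  0, -1],
                 [0,  1,  1]], dtype=float)

P_CB = np.linalg.inv(P_BC)    # the change-of-basis matrix in the other direction
print(P_CB)
```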
11. To show that T is linear, we must show that T (x + z) = T x + T z and that T (cx) = cT x, where c is a
scalar.
T(x + z) = y(x + z)ᵀy = y(xᵀ + zᵀ)y = yxᵀy + yzᵀy = Tx + Tz
T(cx) = y(cx)ᵀy = c·yxᵀy = cTx.
Thus T is linear.
12. T is not a linear transformation. Let A ∈ Mnn and c be a scalar other than 0 or 1. Then
T (cA) = (cA)T (cA) = c2 AT A = c2 T (A).
Since c ≠ 0, 1, it follows that c²T(A) ≠ cT(A), so that T is not linear.
13. To show that T is linear, we must show that (T (p + q))(x) = T p(x) + T q(x) and that (T (cp))(x) =
cT p(x), where c is a scalar.
(T (p + q))(x) = (p + q)(2x − 1) = p(2x − 1) + q(2x − 1) = T p(x) + T q(x)
(T(cp))(x) = (cp)(2x − 1) = cp(2x − 1) = cTp(x).
Thus T is linear.
14. We first write 5 − 3x + 2x2 in terms of 1, 1 + x, and 1 + x + x2 :
5 − 3x + 2x2 = a(1) + b(1 + x) + c(1 + x + x2 ) = (a + b + c) + (b + c)x + cx2 .
Then c = 2, so that (from the x term) b = −5, and then a = 8. Thus
T(5 − 3x + 2x²) = 8T(1) − 5T(1 + x) + 2T(1 + x + x²)
= 8[1 0; 0 1] − 5[1 1; 0 1] + 2[0 −1; 1 0] = [3 −7; 2 3].
15. Since T : Mnn → R is onto, we know from the rank theorem that
n2 = dim Mnn = rank(T ) + nullity(T ) = 1 + nullity(T ),
so that nullity(T ) = n2 − 1.
16. (a) We want a map T : M22 → M22 such that T(A) = O if and only if A is upper triangular. If
A = [a b; c d],
an obvious map to try is
T(A) = [0 0; c 0].
Clearly the matrices sent to O by T are exactly the upper triangular matrices (together with O),
which is the subspace W . To see that T is linear, note that we can rewrite T as T (A) = a21 E21 .
Then
T (A + B) = (A + B)21 E21 = (a21 + b21 )E21 = a21 E21 + b21 E21 = T (A) + T (B)
T (cA) = (cA)21 E21 = ca21 E21 = cT (A).
(b) For this map, we want the image of any matrix A to be upper triangular. With A as above, an
obvious map to try is the one that sets the entry below the main diagonal to zero:
T(A) = [a b; 0 d].
It is obvious that range T = W , since any upper triangular matrix is the image of itself under T ,
and every matrix of the form T (A) is upper triangular by construction. To see that T is linear,
write T (A) = A − a21 E21 ; then
T (A + B) = (A + B) − (A + B)21 E21 = (A + B) − (a21 + b21 )E21
= (A − a21 E21 ) + (B − b21 E21 ) = T (A) + T (B)
T (cA) = (cA) − (cA)21 E21 = cA − ca21 E21 = c(A − a21 E21 ) = cT (A).
17. We have
T(1) = [1 0; 0 1] = 1·E11 + 0·E12 + 0·E21 + 1·E22 ⇒ [T(1)]C = [1; 0; 0; 1]
T(x) = T((x + 1) − 1) = T(x + 1) − T(1) = [1 1; 0 1] − [1 0; 0 1] = [0 1; 0 0]
= 0·E11 + 1·E12 + 0·E21 + 0·E22 ⇒ [T(x)]C = [0; 1; 0; 0]
T(x²) = T((1 + x + x²) − (1 + x)) = T(1 + x + x²) − T(1 + x) = [0 −1; 1 0] − [1 1; 0 1] = [−1 −2; 1 −1]
= −1·E11 − 2·E12 + 1·E21 − 1·E22 ⇒ [T(x²)]C = [−1; −2; 1; −1].
Since these vectors are the columns of [T]C←B, we have
[T]C←B = [1 0 −1; 0 1 −2; 0 0 1; 1 0 −1].
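As a cross-check against Exercise 14: applying [T]C←B to the B-coordinates of 5 − 3x + 2x² (here taking B = {1, x, x²}) should give the C-coordinates of [3 −7; 2 3]. A quick numpy sketch:

```python
import numpy as np

T = np.array([[1, 0, -1],
              [0, 1, -2],
              [0, 0,  1],
              [1, 0, -1]])

p = np.array([5, -3, 2])      # coordinates of 5 - 3x + 2x^2 in the basis {1, x, x^2}
print(T @ p)                  # [ 3 -7  2  3], the coordinates of [3 -7; 2 3]
```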
18. We must show that S spans V and that S is linearly independent. First, since every vector can be
written as a linear combination of elements of S in one way, S spans V . Next, suppose
c1 v1 + c2 v2 + · · · + cn vn = 0.
Since we know that we can write 0 as the linear combination of the elements of S with zero coefficients,
and since the representation of 0 is unique by hypothesis, it follows that ci = 0 for all i, so that S is
linearly independent. Thus S is a basis for V .
19. If u ∈ U , then T u ∈ ker(S), so that (S ◦ T )u = S(T u) = 0. Thus S ◦ T is the zero map.
20. Since {T (v1 ), T (v2 ), . . . , T (vn )} is a basis for V , it follows that range(T ) = V since any element of V
can be expressed as a linear combination of elements in range(T ). By Theorem 6.30 (p) and (t), this
means that T is invertible.
Chapter 7
Distance and Approximation
7.1
Inner Product Spaces
1. With ⟨u, v⟩ = 2u1v1 + 3u2v2, we get
(a) ⟨u, v⟩ = 2·1·4 + 3·(−2)·3 = −10.
(b) ‖u‖ = √⟨u, u⟩ = √(2·1·1 + 3·(−2)·(−2)) = √14.
(c) d(u, v) = ‖u − v‖ = ‖[−3; −5]‖ = √(2·(−3)·(−3) + 3·(−5)·(−5)) = √93.
2. With A = [6 2; 2 3], we get
(a) ⟨u, v⟩ = uᵀAv = [1 −2][6 2; 2 3][4; 3] = −4.
(b) ‖u‖ = √⟨u, u⟩ = √([1 −2][6 2; 2 3][1; −2]) = √10.
(c) d(u, v) = ‖u − v‖ = ‖[−3; −5]‖ = √([−3 −5][6 2; 2 3][−3; −5]) = √189.
3. We want a vector w = [a; b] such that
⟨u, w⟩ = 2·1·a + 3·(−2)·b = 2a − 6b = 0.
Any vector of the form [3t; t] is such a vector; for example, [3; 1].
4. We want a vector w = [a; b] such that
⟨u, w⟩ = uᵀAw = [1 −2][6 2; 2 3][a; b] = 2a − 4b = 0.
There are many such vectors, for example w = [2; 1].
5. (a) ⟨p(x), q(x)⟩ = ⟨3 − 2x, 1 + x + x²⟩ = 3·1 + (−2)·1 + 0·1 = 1.
(b) ‖p(x)‖ = √⟨p(x), p(x)⟩ = √(3·3 + (−2)·(−2) + 0·0) = √13.
(c) d(p(x), q(x)) = ‖p(x) − q(x)‖ = ‖2 − 3x − x²‖ = √(2² + (−3)² + (−1)²) = √14.
CHAPTER 7. DISTANCE AND APPROXIMATION
6. (a) ⟨p(x), q(x)⟩ = ∫₀¹ (3 − 2x)(1 + x + x²) dx = ∫₀¹ (−2x³ + x² + x + 3) dx
= [−(1/2)x⁴ + (1/3)x³ + (1/2)x² + 3x]₀¹ = 10/3.
(b) ‖p(x)‖ = √⟨p(x), p(x)⟩ = √(∫₀¹ (3 − 2x)² dx) = √([9x − 6x² + (4/3)x³]₀¹) = √(13/3).
(c) d(p(x), q(x)) = ‖p(x) − q(x)‖ = ‖2 − 3x − x²‖ = √(∫₀¹ (2 − 3x − x²)² dx)
= √([(1/5)x⁵ + (3/2)x⁴ + (5/3)x³ − 6x² + 4x]₀¹) = √(41/30).
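These three integrals can be verified with sympy:

```python
import sympy as sp

x = sp.symbols('x')
p = 3 - 2*x
q = 1 + x + x**2

inner_pq = sp.integrate(p*q, (x, 0, 1))        # <p, q>
norm_p_sq = sp.integrate(p**2, (x, 0, 1))      # ||p||^2
dist_sq = sp.integrate((p - q)**2, (x, 0, 1))  # d(p, q)^2
print(inner_pq, norm_p_sq, dist_sq)            # 10/3 13/3 41/30
```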
7. We want a polynomial r(x) = a + bx + cx² such that
⟨p(x), r(x)⟩ = 3·a − 2·b + 0·c = 3a − 2b = 0.
There are many such polynomials; two examples are 2 + 3x and 4 + 6x + 7x².
8. We want a polynomial r(x) = a + bx + cx² such that
⟨p(x), r(x)⟩ = ∫₀¹ (3 − 2x)(a + bx + cx²) dx
= ∫₀¹ (3a + (3b − 2a)x + (3c − 2b)x² − 2cx³) dx
= [3ax + ((3b − 2a)/2)x² + ((3c − 2b)/3)x³ − (c/2)x⁴]₀¹
= 2a + (5/6)b + (1/2)c = 0.
There are many such polynomials. Three examples are −1 + 4x², 5 − 12x, and 12x − 20x².
9. Recall the double-angle formulas sin² x = 1/2 − (1/2)cos 2x and cos² x = 1/2 + (1/2)cos 2x. Then
(a) ⟨f, g⟩ = ∫₀^{2π} f(x)g(x) dx = ∫₀^{2π} (sin² x + sin x cos x) dx
= ∫₀^{2π} (1/2 − (1/2)cos 2x + sin x cos x) dx = [(1/2)x − (1/4)sin 2x + (1/2)sin² x]₀^{2π} = π.
(b) ‖f‖ = √(∫₀^{2π} f(x)² dx) = √(∫₀^{2π} sin² x dx) = √(∫₀^{2π} (1/2 − (1/2)cos 2x) dx)
= √([(1/2)x − (1/4)sin 2x]₀^{2π}) = √π.
(c) d(f, g) = ‖f − g‖ = √(∫₀^{2π} (−cos x)² dx) = √(∫₀^{2π} (1/2 + (1/2)cos 2x) dx)
= √([(1/2)x + (1/4)sin 2x]₀^{2π}) = √π.
10. We want to find a function g such that ∫₀^{2π} sin x · g(x) dx = 0. Note from the previous exercise that
∫₀^{2π} sin x cos x dx = [(1/2)sin² x]₀^{2π} = 0, so choose g(x) = cos x.
11. We must show it satisfies all four properties of an inner product. First,
hp(x), q(x)i = p(a)q(a) + p(b)q(b) + p(c)q(c) = q(a)p(a) + q(b)p(b) + q(c)p(c) = hq(x), p(x)i .
Next,
hp(x), q(x) + r(x)i = p(a)(q(a) + r(a)) + p(b)(q(b) + r(b)) + p(c)(q(c) + r(c))
= p(a)q(a) + p(a)r(a) + p(b)q(b) + p(b)r(b) + p(c)q(c) + p(c)r(c)
= p(a)q(a) + p(b)q(b) + p(c)q(c) + p(a)r(a) + p(b)r(b) + p(c)r(c)
= hp(x), q(x)i + hp(x), r(x)i .
Third,
hdp(x), q(x)i = dp(a)q(a) + dp(b)q(b) + dp(c)q(c) = d(p(a)q(a) + p(b)q(b) + p(c)q(c)) = d hp(x), q(x)i .
Finally,
⟨p(x), p(x)⟩ = p(a)p(a) + p(b)p(b) + p(c)p(c) = p(a)² + p(b)² + p(c)² ≥ 0,
and if equality holds, then p(a) = p(b) = p(c) = 0 since squares are always nonnegative. But p ∈ P2 ,
so if it is nonzero, it has at most two roots. Since a, b, and c are distinct, it follows that p(x) = 0.
12. We must show it satisfies all four properties of an inner product. First,
hp(x), q(x)i = p(0)q(0) + p(1)q(1) + p(2)q(2) = q(0)p(0) + q(1)p(1) + q(2)p(2) = hq(x), p(x)i .
Next,
hp(x), q(x) + r(x)i = p(0)(q(0) + r(0)) + p(1)(q(1) + r(1)) + p(2)(q(2) + r(2))
= p(0)q(0) + p(0)r(0) + p(1)q(1) + p(1)r(1) + p(2)q(2) + p(2)r(2)
= p(0)q(0) + p(1)q(1) + p(2)q(2) + p(0)r(0) + p(1)r(1) + p(2)r(2)
= hp(x), q(x)i + hp(x), r(x)i .
Third,
hcp(x), q(x)i = cp(0)q(0) + cp(1)q(1) + cp(2)q(2) = c(p(0)q(0) + p(1)q(1) + p(2)q(2)) = c hp(x), q(x)i .
Finally,
⟨p(x), p(x)⟩ = p(0)p(0) + p(1)p(1) + p(2)p(2) = p(0)² + p(1)² + p(2)² ≥ 0,
and if equality holds, then 0, 1, and 2 are all roots of p since squares are always nonnegative. But
p ∈ P2 , so if it is nonzero, it has at most two roots. It follows that p(x) = 0.
13. The fourth axiom does not hold. For example, let u = [0; 1]. Then ⟨u, u⟩ = 0 but u ≠ 0.
14. The fourth axiom does not hold. For example, let u = [1; 1]. Then ⟨u, u⟩ = 1·1 − 1·1 = 0, but u ≠ 0.
Alternatively, if u = [1; 2], then ⟨u, u⟩ = −3 < 0, so the fourth axiom again fails.
15. The fourth axiom does not hold. For example, let u = [1; 0]. Then ⟨u, u⟩ = 0·1 + 1·0 = 0, but u ≠ 0.
Alternatively, if u = [1; −2], then ⟨u, u⟩ = −4 < 0, so the fourth axiom again fails.
16. The fourth axiom does not hold. For example, if p is any nonzero polynomial with zero constant term,
then hp(x), p(x)i = 0 but p(x) is not the zero polynomial.
17. The fourth axiom does not hold. For example, let p(x) = 1 − x; then p(1) = 0 so that hp(x), p(x)i = 0,
but p(x) is not the zero polynomial.
18. The fourth axiom does not hold. For example, E12² = O, so that ⟨E12, E12⟩ = det O = 0, but
E12 ≠ O. Alternatively, let A = [1 1; 1 1]; then A ≠ O but ⟨A, A⟩ = 0.
19. Examining Example 7.3, the matrix should be [4 1; 1 4]; and indeed
[u1 u2][4 1; 1 4][v1; v2] = [4u1 + u2, u1 + 4u2][v1; v2] = 4u1v1 + u2v1 + u1v2 + 4u2v2,
as desired.
20. Examining Example 7.3, the matrix should be [1 2; 2 5]; and indeed
[u1 u2][1 2; 2 5][v1; v2] = [u1 + 2u2, 2u1 + 5u2][v1; v2] = u1v1 + 2u2v1 + 2u1v2 + 5u2v2,
as desired.
as desired.
21. The unit circle is
{u = [u1; u2] : u1² + (1/4)u2² = 1},
an ellipse meeting the u1-axis at ±1 and the u2-axis at ±2. [Plot omitted.]
22. The unit circle is
{u = [u1; u2] : 4u1² + 2u1u2 + 4u2² = 1},
a tilted ellipse contained in the square |u1| ≤ 0.6, |u2| ≤ 0.6. [Plot omitted.]
23. We have, using properties of the inner product,
hu, cvi = hcv, ui
(Property 1)
= c hv, ui
(Property 3)
= c hu, vi
(Property 1).
24. Let w be any vector. Then hu, 0i = hu, 0wi which, by Theorem 7.1(b), is equal to 0 hu, wi = 0.
Similarly, h0, vi = h0w, vi which, by property 3 of inner products, is equal to 0 hw, vi = 0.
25. Using properties of the inner product and Theorem 7.1 as needed,
⟨u + w, v − w⟩ = ⟨u, v − w⟩ + ⟨w, v − w⟩
= ⟨u, v⟩ + ⟨u, −w⟩ + ⟨w, v⟩ + ⟨w, −w⟩
= ⟨u, v⟩ − ⟨u, w⟩ + ⟨v, w⟩ − ‖w‖²
= 1 − 5 + 0 − 2² = −4 − 4 = −8.
26. Using properties of the inner product and Theorem 7.1 as needed,
⟨2v − w, 3u + 2w⟩ = ⟨2v, 3u + 2w⟩ + ⟨−w, 3u + 2w⟩
= 6⟨v, u⟩ + 4⟨v, w⟩ − 3⟨w, u⟩ − 2⟨w, w⟩
= 6⟨u, v⟩ + 4⟨v, w⟩ − 3⟨u, w⟩ − 2‖w‖²
= 6·1 + 4·0 − 3·5 − 2·2² = −17.
27. Using properties of the inner product and Theorem 7.1 as needed,
‖u + v‖ = √⟨u + v, u + v⟩ = √(⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩) = √(‖u‖² + 2⟨u, v⟩ + ‖v‖²)
= √(1 + 2·1 + 3) = √6.
28. Using properties of the inner product and Theorem 7.1 as needed,
‖2u − 3v + w‖ = √⟨2u − 3v + w, 2u − 3v + w⟩
= √(4⟨u, u⟩ − 12⟨u, v⟩ + 4⟨u, w⟩ − 6⟨v, w⟩ + 9⟨v, v⟩ + ⟨w, w⟩)
= √(4‖u‖² − 12⟨u, v⟩ + 4⟨u, w⟩ − 6⟨v, w⟩ + 9‖v‖² + ‖w‖²)
= √(4·1² − 12·1 + 4·5 − 6·0 + 9·(√3)² + 2²)
= √43.
29. u + v = w means that u + v − w = 0; this is true if and only if ⟨u + v − w, u + v − w⟩ = 0. Computing
this inner product gives
⟨u + v − w, u + v − w⟩ = ⟨u, u⟩ + 2⟨u, v⟩ − 2⟨u, w⟩ − 2⟨v, w⟩ + ⟨v, v⟩ + ⟨w, w⟩
= ‖u‖² + 2·1 − 2·5 − 2·0 + ‖v‖² + ‖w‖²
= 1 + 2 − 10 + 3 + 4 = 0.
30. Since ‖u‖ = 1, we also have 1 = ‖u‖² = ⟨u, u⟩. Similarly, 1 = ⟨v, v⟩. But then
0 ≤ ⟨u + v, u + v⟩ = ⟨u, u⟩ + 2⟨u, v⟩ + ⟨v, v⟩ = 2 + 2⟨u, v⟩.
Solving this inequality gives 2⟨u, v⟩ ≥ −2, so that ⟨u, v⟩ ≥ −1.
31. We have
⟨u + v, u − v⟩ = ⟨u, u⟩ − ⟨u, v⟩ + ⟨v, u⟩ − ⟨v, v⟩.
Since inner products are commutative, the middle two terms cancel, leaving us with
⟨u, u⟩ − ⟨v, v⟩ = ‖u‖² − ‖v‖².
32. We have
‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩.
Since inner products are commutative, the middle two terms are the same, and we get
⟨u, u⟩ + 2⟨u, v⟩ + ⟨v, v⟩ = ‖u‖² + 2⟨u, v⟩ + ‖v‖².
33. From Exercise 32, we have
‖u + v‖² = ‖u‖² + 2⟨u, v⟩ + ‖v‖².    (1)
Substituting −v for v gives another identity,
‖u − v‖² = ‖u‖² + 2⟨u, −v⟩ + ‖−v‖² = ‖u‖² − 2⟨u, v⟩ + ‖v‖².    (2)
Adding these two gives
‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖²,
and dividing through by 2 gives the required identity.
34. Solve (1) and (2) in the previous exercise for ⟨u, v⟩, giving
⟨u, v⟩ = (1/2)(‖u + v‖² − ‖u‖² − ‖v‖²)
⟨u, v⟩ = (1/2)(−‖u − v‖² + ‖u‖² + ‖v‖²).
Add these two equations, giving
2⟨u, v⟩ = (1/2)(‖u + v‖² − ‖u − v‖²);
dividing through by 2 gives
⟨u, v⟩ = (1/4)(‖u + v‖² − ‖u − v‖²) = (1/4)‖u + v‖² − (1/4)‖u − v‖².
35. Use Exercise 34. First, if ‖u + v‖ = ‖u − v‖, then Exercise 34 implies that ⟨u, v⟩ = 0. Second, suppose
that u and v are orthogonal. Then again from Exercise 34,
0 = (1/4)‖u + v‖² − (1/4)‖u − v‖² = (1/4)(‖u + v‖ + ‖u − v‖)(‖u + v‖ − ‖u − v‖),
so that one of the two factors must be zero. If the first factor is zero, then both norms are zero, so
they are equal. If the second factor is zero, then we have ‖u + v‖ − ‖u − v‖ = 0 and again the norms
are equal.
36. From (2) in Exercise 33, we have
d(u, v) = ‖u − v‖ = √(‖u‖² − 2⟨u, v⟩ + ‖v‖²).
This is equal to √(‖u‖² + ‖v‖²) if and only if ⟨u, v⟩ = 0, which is to say when u and v are orthogonal.
37. The inner product we are working with is ⟨u, v⟩ = 2u1v1 + 3u2v2, so that
v1 = [1; 0]
v2 = [1; 1] − ((2·1·1 + 3·1·0)/(2·1·1 + 3·0·0))[1; 0] = [1; 1] − [1; 0] = [0; 1].
The orthogonal basis is {[1; 0], [0; 1]}.
38. The inner product we are working with is
⟨u, v⟩ = uᵀAv = uᵀ[4 −2; −2 7]v = 4u1v1 − 2u1v2 − 2u2v1 + 7u2v2.
Then
v1 = [1; 0]
v2 = [1; 1] − ((4·1·1 − 2·1·1 − 2·1·0 + 7·0·1)/(4·1·1 − 2·1·0 − 2·1·0 + 7·0·0))[1; 0]
= [1; 1] − (1/2)[1; 0] = [1/2; 1].
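One step of Gram-Schmidt under this weighted inner product can be sketched in numpy:

```python
import numpy as np

A = np.array([[4, -2], [-2, 7]], dtype=float)

def inner(u, v):
    # the weighted inner product <u, v> = u^T A v
    return u @ A @ v

x1 = np.array([1.0, 0.0])
x2 = np.array([1.0, 1.0])

v1 = x1
v2 = x2 - (inner(v1, x2) / inner(v1, v1)) * v1
print(v2)                          # v2 = [1/2, 1]
```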
39. The inner product we are working with is ⟨a0 + a1x + a2x², b0 + b1x + b2x²⟩ = a0b0 + a1b1 + a2b2.
Then
v1 = 1
v2 = (1 + x) − (⟨1, 1 + x⟩/⟨1, 1⟩)·1 = (1 + x) − 1 = x
v3 = (1 + x + x²) − (⟨1, 1 + x + x²⟩/⟨1, 1⟩)·1 − (⟨x, 1 + x + x²⟩/⟨x, x⟩)·x = (1 + x + x²) − 1 − x = x².
The orthogonal basis is {1, x, x²}.
40. The inner product we are working with is ⟨p(x), q(x)⟩ = ∫₀¹ p(x)q(x) dx. Then
v1 = 1
v2 = (1 + x) − (∫₀¹ (1 + x) dx / ∫₀¹ 1 dx)·1 = (1 + x) − 3/2 = −1/2 + x
v3 = (1 + x + x²) − (⟨1, 1 + x + x²⟩/⟨1, 1⟩)·1 − (⟨−1/2 + x, 1 + x + x²⟩/⟨−1/2 + x, −1/2 + x⟩)(−1/2 + x)
= (1 + x + x²) − (∫₀¹ (1 + x + x²) dx / ∫₀¹ 1 dx)·1 − (∫₀¹ (−1/2 + x)(1 + x + x²) dx / ∫₀¹ (−1/2 + x)² dx)(−1/2 + x)
= (1 + x + x²) − 11/6 − ((1/6)/(1/12))(−1/2 + x)
= (1 + x + x²) − 11/6 − 2(−1/2 + x)
= 1/6 − x + x².
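The whole Gram-Schmidt computation over this inner product can be reproduced exactly with sympy:

```python
import sympy as sp

x = sp.symbols('x')

def ip(p, q):
    # inner product <p, q> = integral of p*q over [0, 1]
    return sp.integrate(p*q, (x, 0, 1))

basis = [1, 1 + x, 1 + x + x**2]
ortho = []
for p in basis:
    for v in ortho:
        p = p - ip(v, p)/ip(v, v) * v   # subtract the projection onto each earlier v
    ortho.append(sp.expand(p))
print(ortho)   # [1, x - 1/2, x**2 - x + 1/6]
```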
41. The inner product here is ⟨f, g⟩ = ∫_{−1}^{1} f(x)g(x) dx.
(a) To normalize the first three Legendre polynomials, we find the norm of each:
‖1‖ = √(∫_{−1}^{1} 1·1 dx) = √2
‖x‖ = √(∫_{−1}^{1} x·x dx) = √([(1/3)x³]_{−1}^{1}) = √(2/3)
‖x² − 1/3‖ = √(∫_{−1}^{1} (x² − 1/3)² dx) = √(∫_{−1}^{1} (x⁴ − (2/3)x² + 1/9) dx)
= √([(1/5)x⁵ − (2/9)x³ + (1/9)x]_{−1}^{1}) = 2√2/(3√5).
So the normalized Legendre polynomials are
1/‖1‖ = 1/√2,   x/‖x‖ = (√6/2)x,   (x² − 1/3)/‖x² − 1/3‖ = (3√5/(2√2))(x² − 1/3) = (√5/(2√2))(3x² − 1).
(b) For the computation, we will use the unnormalized Legendre polynomials given in Example 3.8,
namely 1, x, and x² − 1/3, and normalize the fourth one at the end. With x4 = x³, we have
v4 = x4 − ((x4·v1)/(v1·v1))v1 − ((x4·v2)/(v2·v2))v2 − ((x4·v3)/(v3·v3))v3
= x³ − (∫_{−1}^{1} x³ dx / ∫_{−1}^{1} 1 dx)·1 − (∫_{−1}^{1} x⁴ dx / ∫_{−1}^{1} x² dx)·x
− (∫_{−1}^{1} x³(x² − 1/3) dx / ∫_{−1}^{1} (x² − 1/3)² dx)(x² − 1/3)
= x³ − 0 − ((2/5)/(2/3))x − 0
= x³ − (3/5)x.
The norm of this polynomial is
‖v4‖ = √(∫_{−1}^{1} (x³ − (3/5)x)² dx) = √(∫_{−1}^{1} (x⁶ − (6/5)x⁴ + (9/25)x²) dx)
= √([(1/7)x⁷ − (6/25)x⁵ + (3/25)x³]_{−1}^{1}) = 2√14/35.
Thus the normalized Legendre polynomial is
(35/(2√14))(x³ − (3/5)x) = (√14/4)(5x³ − 3x).
42. (a) Starting with ℓ0(x) = 1, ℓ1(x) = x, ℓ2(x) = x² − 1/3, ℓ3(x) = x³ − (3/5)x, we have
ℓ0(1) = 1 ⇒ L0(x) = 1
ℓ1(1) = 1 ⇒ L1(x) = x
ℓ2(1) = 1 − 1/3 = 2/3 ⇒ L2(x) = (3/2)(x² − 1/3) = (3/2)x² − 1/2
ℓ3(1) = 1 − 3/5 = 2/5 ⇒ L3(x) = (5/2)(x³ − (3/5)x) = (5/2)x³ − (3/2)x.
(b) For L2 and L3, the given recurrence says
L2(x) = ((2·2 − 1)/2)xL1(x) − ((2 − 1)/2)L0(x) = (3/2)x² − 1/2
L3(x) = ((2·3 − 1)/3)xL2(x) − ((3 − 1)/3)L1(x) = (5/3)x((3/2)x² − 1/2) − (2/3)x = (5/2)x³ − (3/2)x,
so that the recurrence indeed works for n = 2 and n = 3. Then
L4(x) = ((2·4 − 1)/4)xL3(x) − ((4 − 1)/4)L2(x) = (7/4)x((5/2)x³ − (3/2)x) − (3/4)((3/2)x² − 1/2)
= (35/8)x⁴ − (15/4)x² + 3/8
L5(x) = ((2·5 − 1)/5)xL4(x) − ((5 − 1)/5)L3(x) = (9/5)x((35/8)x⁴ − (15/4)x² + 3/8) − (4/5)((5/2)x³ − (3/2)x)
= (63/8)x⁵ − (35/4)x³ + (15/8)x.
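The recurrence is easy to iterate with sympy; the sketch below reproduces L4 and L5 from L0 = 1 and L1 = x:

```python
import sympy as sp

x = sp.symbols('x')
L = [sp.Integer(1), x]            # L0, L1
for n in range(2, 6):
    # Legendre recurrence: L_n = ((2n-1)/n) x L_{n-1} - ((n-1)/n) L_{n-2}
    Ln = sp.expand(sp.Rational(2*n - 1, n)*x*L[n-1] - sp.Rational(n - 1, n)*L[n-2])
    L.append(Ln)
print(L[4])   # L4
print(L[5])   # L5
```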
43. Let B = {ui} be an orthogonal basis for W. Then
⟨projW(v), uj⟩ = ⟨Σᵢ (⟨ui, v⟩/⟨ui, ui⟩)ui, uj⟩ = (⟨uj, v⟩/⟨uj, uj⟩)⟨uj, uj⟩ = ⟨uj, v⟩,
since ⟨ui, uj⟩ = 0 for i ≠ j. Then
⟨perpW(v), uj⟩ = ⟨v − projW(v), uj⟩ = ⟨v, uj⟩ − ⟨projW(v), uj⟩ = ⟨v, uj⟩ − ⟨uj, v⟩ = 0.
Thus perpW(v) is orthogonal to every vector in B, so that perpW(v) is orthogonal to every w ∈ W.
44. (a) We have
⟨tu + v, tu + v⟩ = t²⟨u, u⟩ + 2t⟨u, v⟩ + ⟨v, v⟩ = ‖u‖²t² + 2⟨u, v⟩t + ‖v‖² ≥ 0.
This is a quadratic at² + bt + c in t, where a = ‖u‖², b = 2⟨u, v⟩, and c = ‖v‖².
(b) Since a = ‖u‖² > 0, this is an upward-opening parabola, so its vertex is its lowest point. That
vertex is at t = −b/(2a), so the lowest point, which must be nonnegative, is
a(−b/(2a))² + b(−b/(2a)) + c = b²/(4a) − b²/(2a) + c = (b² − 2b² + 4ac)/(4a) = (4ac − b²)/(4a) ≥ 0.
Since 4a > 0, we must have 4ac ≥ b², so that √(ac) ≥ (1/2)|b|.
(c) Substituting back in the values of a, b, and c from part (a) gives
√(ac) = √(‖u‖²‖v‖²) = ‖u‖‖v‖ ≥ (1/2)|b| = (1/2)|2⟨u, v⟩| = |⟨u, v⟩|.
Thus ‖u‖‖v‖ ≥ |⟨u, v⟩|, which is the Cauchy-Schwarz inequality.
Exploration: Vectors and Matrices with Complex Entries
1. Recall that if z ∈ C, then z̄z = |z|². Then for v as given, we have
‖v‖ = √(v·v) = √(conj(v1)v1 + conj(v2)v2 + · · · + conj(vn)vn) = √(|v1|² + · · · + |vn|²).
Finally, we must show that |z|² = |z²|. But
|z²|² = z²·conj(z²) = z²·conj(z)² = (z·conj(z))² = (|z|²)²,
so |z²| = |z|², and the above is in fact equal to √(|v1²| + · · · + |vn²|).
2. (a) u·v = conj(i)·(2 − 3i) + conj(1)·(1 + 5i) = −i(2 − 3i) + (1 + 5i) = (−3 − 2i) + (1 + 5i) = −2 + 3i.
(b) By Exercise 1, ‖u‖ = √(|i|² + |1|²) = √(1 + 1) = √2.
(c) By Exercise 1, ‖v‖ = √(|2 − 3i|² + |1 + 5i|²) = √(13 + 26) = √39.
(d) d(u, v) = ‖u − v‖ = ‖[−2 + 4i; −5i]‖ = √(|−2 + 4i|² + |−5i|²) = √(20 + 25) = √45 = 3√5.
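These computations can be checked with numpy, whose `vdot` conjugates its first argument, matching the complex dot product used here:

```python
import numpy as np

u = np.array([1j, 1])
v = np.array([2 - 3j, 1 + 5j])

print(np.vdot(u, v))             # conjugates the first argument: (-2+3j)
print(np.linalg.norm(u))         # sqrt(2)
print(np.linalg.norm(u - v))     # 3*sqrt(5)
```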
(e) If w = [a; b] is orthogonal to u, then
u·w = conj(i)·a + conj(1)·b = −ai + b = 0,
so that b = ai. Thus, for example, setting a = 1, we get [1; i].
(f) If w = [a; b] is orthogonal to v, then
v·w = conj(2 − 3i)·a + conj(1 + 5i)·b = (2 + 3i)a + (1 − 5i)b = 0.
So b = −((2 + 3i)/(1 − 5i))a; setting for example a = 1 − 5i, we get b = −2 − 3i, and one possible vector is
[1 − 5i; −2 − 3i].
3. (a) v·u = conj(v1)u1 + · · · + conj(vn)un = conj(conj(u1)v1 + · · · + conj(un)vn) = conj(u·v).
(b) u·(v + w) = conj(u1)(v1 + w1) + conj(u2)(v2 + w2) + · · · + conj(un)(vn + wn)
= conj(u1)v1 + conj(u1)w1 + · · · + conj(un)vn + conj(un)wn
= (conj(u1)v1 + · · · + conj(un)vn) + (conj(u1)w1 + · · · + conj(un)wn)
= u·v + u·w.
(c) First,
(cu)·v = conj(cu1)v1 + · · · + conj(cun)vn = conj(c)(conj(u1)v1 + · · · + conj(un)vn) = conj(c)(u·v).
For the second equality, we have, using part (a),
u·(cv) = conj((cv)·u) = conj(conj(c)(v·u)) = c·conj(v·u) = c(u·v).
(d) By Exercise 1, u·u = ‖u‖² = |u1|² + · · · + |un|² ≥ 0. Further, equality holds if and only if each
|ui|² = 0, so all the ui must be zero and therefore u = 0.
4. Just apply the definition: conjugate A and take its transpose.
(a) A* = conj(A)ᵀ = [−i i; −2i 3].
(b) A* = conj(A)ᵀ = [2, 5 − 2i; 5 + 2i, −1].
(c) A* = conj(A)ᵀ = [2 + i, 4; 1 − 3i, 0; −2, 3 + 4i].
(d) A* = conj(A)ᵀ = [−3i, 1 + i, 1 − i; 0, 4, 0; 1 − i, −i, i].
5. These facts follow directly from standard properties of complex numbers together with the fact that
matrix conjugation is element by element:
(a) conj(conj(A)) = [conj(conj(aij))] = [aij] = A.
(b) conj(A + B) = [conj(aij + bij)] = [conj(aij) + conj(bij)] = [conj(aij)] + [conj(bij)] = conj(A) + conj(B).
(c) conj(cA) = [conj(caij)] = [conj(c)·conj(aij)] = conj(c)[conj(aij)] = conj(c)·conj(A).
(d) Let C = AB; then cij = Σk aik bkj, so
(conj(AB))ij = conj(cij) = conj(Σk aik bkj) = Σk conj(aik)conj(bkj) = (conj(A)·conj(B))ij,
so that conj(AB) = conj(A)·conj(B).
(e) The ij entry of conj(Aᵀ) is conj(aji), which is also the ij entry of conj(A)ᵀ. So the two matrices
have the same entries and thus are equal.
6. These facts follow directly from the facts in Exercise 5.
(a) (A*)* = conj(conj(A)ᵀ)ᵀ = (conj(conj(A))ᵀ)ᵀ = A.
(b) (A + B)* = conj(A + B)ᵀ = (conj(A) + conj(B))ᵀ = conj(A)ᵀ + conj(B)ᵀ = A* + B*.
(c) (cA)* = conj(cA)ᵀ = (conj(c)·conj(A))ᵀ = conj(c)·conj(A)ᵀ = conj(c)A*.
(d) (AB)* = conj(AB)ᵀ = (conj(A)·conj(B))ᵀ = conj(B)ᵀ·conj(A)ᵀ = B*A*.
 
7. We have u·v = Σᵢ conj(ui)vi = [conj(u1) conj(u2) · · · conj(un)][v1; v2; …; vn] = conj(u)ᵀv = u*v.
8. If A is Hermitian, then aij = aji for all i and j. In particular, when i = j, this gives aii = aii , so that
aii is equal to its own conjugate and therefore must be real.
9. (a) A is not Hermitian since it does not have real diagonal entries.
(b) A is not Hermitian since a12 = 2 − 3i while conj(a21) = conj(2 − 3i) = 2 + 3i; the two are not equal.
(c) A is not Hermitian since a12 = 1 − 5i and a21 = −1 − 5i; the two are not equal.
(d) A is Hermitian, since every entry is the conjugate of the same entry in its transpose.
(e) A is not Hermitian. Since it is a real matrix, it is equal to its conjugate, but AT 6= A since A is
not symmetric.
(f ) A is Hermitian. Since it is a real matrix, it is equal to its conjugate, and it is symmetric, so that
T
A = AT = A.
10. Suppose that λ is an eigenvalue of a Hermitian matrix A, with corresponding eigenvector v. Now,
v*A = v*A* = (Av)* = (λv)*
since A is Hermitian. But by Exercise 6(c), (λv)* = conj(λ)v*. Then
conj(λ)(v*v) = (conj(λ)v*)v = (v*A)v = v*(Av) = v*(λv) = λ(v*v).
Then since v ≠ 0, we have by Exercise 7 that v*v = v·v = ‖v‖² ≠ 0. So we can divide the above equation
through by v*v to get conj(λ) = λ, so that λ is real.
11. Following the hint, adapt the proof of Theorem 5.19. Suppose that λ1 6= λ2 are eigenvalues of a
Hermitian matrix A corresponding to eigenvectors v1 and v2 respectively. Then
λ1 (v1 · v2 ) = (λ1 v1 ) · v2 = (Av1 ) · v2 = (Av1 )∗ v2 = (v1∗ A∗ )v2 .
But A is Hermitian, so A∗ = A and we continue:
(v1∗ A∗ )v2 = (v1∗ A)v2 = v1∗ (Av2 ) = v1∗ (λ2 v2 ) = λ2 (v1∗ v2 ) = λ2 (v1 · v2 ).
Thus λ1 (v1 · v2 ) = λ2 (v1 · v2 ), so that (λ1 − λ2 )(v1 · v2 ) = 0. But λ1 6= λ2 by assumption, so we must
have v1 · v2 = 0.
12. (a) Multiplying, we get AA* = I, so that A is unitary and
A⁻¹ = A* = [i/√2, −i/√2; −i/√2, −i/√2].
(b) Multiplying, we get AA* = [4 0; 0 4] ≠ I, so that A is not unitary.
(c) Multiplying, we get AA* = I, so that A is unitary and
A⁻¹ = A* = [3/5, −4i/5; −4/5, −3i/5].
(d) Multiplying, we get AA* = I, so that A is unitary and
A⁻¹ = A* = [(1 − i)/√6, 0, (−1 + i)/√3; 0, 1, 0; 2/√6, 0, 1/√3].
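Part (c) is easy to verify numerically; this sketch builds the matrix from its columns [3/5; 4i/5] and [−4/5; 3i/5] and checks both claims:

```python
import numpy as np

# the matrix from part (c), columns [3/5, 4i/5] and [-4/5, 3i/5]
A = np.array([[3/5, -4/5],
              [4j/5, 3j/5]])
A_star = A.conj().T

print(np.allclose(A @ A_star, np.eye(2)))      # True, so A is unitary
print(np.allclose(np.linalg.inv(A), A_star))   # True, so A^{-1} = A*
```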
13. First show (a) ⇔ (b). Let ui denote the ith column of a matrix U. Then the ith row of U* is conj(ui)ᵀ,
so by the definition of matrix multiplication,
(U*U)ij = ui·uj.
Then the columns of U form an orthonormal set if and only if
ui·uj = 0 for i ≠ j and ui·uj = 1 for i = j,
which holds if and only if (U*U)ij = 0 for i ≠ j and (U*U)ij = 1 for i = j,
Now apply this to the matrix U ∗ . We get that U ∗ is unitary if and only if the columns of U ∗ are
orthogonal. But U ∗ is unitary if and only if U is, since
(U ∗ )∗ U ∗ = U U ∗ ,
and the columns of U ∗ are just the conjugates of the rows of U , so that (letting Ui be the ith row of
U)
∗
Ui · Uj = Ui Uj = (Ui )T Uj = Uj · Ui .
So the columns of U ∗ are orthogonal if and only if the rows of U are orthogonal. This proves (a) ⇔
(c).
Next, we show (a) ⇒ (e) ⇒ (d) ⇒ (e) ⇒ (a). To prove (a) ⇒ (e), assume that U is unitary. Then
U ∗ U = I, so that
∗
U x · U y = (U x) U y = x∗ U ∗ U y = x∗ y = x · y.
Next, for (e) ⇒ (d), assume that (e) holds. Then Ux·Ux = x·x, so that
‖Ux‖ = √(Ux·Ux) = √(x·x) = ‖x‖.
Next, we show (d) ⇒ (e). Assuming (d),
x·y = (1/4)(‖x + y‖² − ‖x − y‖²)
= (1/4)(‖U(x + y)‖² − ‖U(x − y)‖²)
= (1/4)(‖Ux + Uy‖² − ‖Ux − Uy‖²)
= Ux·Uy.
Thus (d) ⇒ (e). Finally, to see that (e) ⇒ (a), let ei be the ith standard basis vector and let ui denote
the ith column of U. Then ui = Uei, so that since (e) holds,
ui·uj = Uei·Uej = ei·ej = 0 for i ≠ j and 1 for i = j.
This shows that the columns of U form an orthonormal set, so that U is unitary.
14. (a) Since
[−i/√2; i/√2]·[i/√2; i/√2] = conj(−i/√2)(i/√2) + conj(i/√2)(i/√2) = −1/2 + 1/2 = 0,
[−i/√2; i/√2]·[−i/√2; i/√2] = conj(−i/√2)(−i/√2) + conj(i/√2)(i/√2) = 1/2 + 1/2 = 1,
[i/√2; i/√2]·[i/√2; i/√2] = conj(i/√2)(i/√2) + conj(i/√2)(i/√2) = 1/2 + 1/2 = 1,
the columns of the matrix are orthonormal, so the matrix is unitary.
(b) The dot product of the first column with itself is
conj(1 + i)(1 + i) + conj(1 − i)(1 − i) = (1 − i)(1 + i) + (1 + i)(1 − i) = 4 ≠ 1,
so that the matrix is not unitary.
(c) Since
[3/5; 4i/5]·[−4/5; 3i/5] = (3/5)(−4/5) + conj(4i/5)(3i/5) = −12/25 + 12/25 = 0,
[3/5; 4i/5]·[3/5; 4i/5] = (3/5)(3/5) + conj(4i/5)(4i/5) = 9/25 + 16/25 = 1,
[−4/5; 3i/5]·[−4/5; 3i/5] = (−4/5)(−4/5) + conj(3i/5)(3i/5) = 16/25 + 9/25 = 1,
the columns of the matrix are orthonormal, so the matrix is unitary.
(d) Clearly column 2 is a unit vector, and is orthogonal to each of the other two columns, so we need
only show the first and third columns are unit vectors and are orthogonal. But
[(1 + i)/√6; 0; (−1 − i)/√3]·[2/√6; 0; 1/√3] = conj((1 + i)/√6)(2/√6) + 0 + conj((−1 − i)/√3)(1/√3)
= (2 − 2i)/6 + (−1 + i)/3 = 0,
[(1 + i)/√6; 0; (−1 − i)/√3]·[(1 + i)/√6; 0; (−1 − i)/√3] = |1 + i|²/6 + 0 + |−1 − i|²/3 = 1/3 + 2/3 = 1,
[2/√6; 0; 1/√3]·[2/√6; 0; 1/√3] = 4/6 + 0 + 1/3 = 1,
and we are done.
15. (a) With A as given, the characteristic polynomial is (λ − 2)² − i·(−i) = λ² − 4λ + 3 = (λ − 1)(λ − 3).
Thus the eigenvalues are λ1 = 1 and λ2 = 3. Corresponding eigenvectors are
λ = 1: [1; i],   λ = 3: [1 + i; 1 − i].
Since these eigenvectors correspond to different eigenvalues, they are already orthogonal. Their
norms are
‖[1; i]‖ = √(1·1 + (−i)·i) = √2,   ‖[1 + i; 1 − i]‖ = √(|1 + i|² + |1 − i|²) = 2.
So defining a matrix U whose columns are the normalized eigenvectors gives
U = [1/√2, (1 + i)/2; i/√2, (1 − i)/2],
which is unitary. Then
U*AU = [1 0; 0 3].
(b) With A as given, the characteristic polynomial is λ² + 1. Thus the eigenvalues are λ1 = i and
λ2 = −i. Corresponding eigenvectors are
λ = i: [i; 1],   λ = −i: [−i; 1].
Since these eigenvectors correspond to different eigenvalues, they are already orthogonal. Their
norms are
‖[i; 1]‖ = √(|i|² + |1|²) = √2,   ‖[−i; 1]‖ = √(|−i|² + |1|²) = √2.
So defining a matrix U whose columns are the normalized eigenvectors gives
U = [i/√2, −i/√2; 1/√2, 1/√2],
which is unitary. Then
U*AU = [i 0; 0 −i].
(c) With A as given, the characteristic polynomial is (λ + 1)λ − (1 + i)(1 − i) = λ² + λ − 2 = (λ + 2)(λ − 1).
Thus the eigenvalues are λ1 = 1 and λ2 = −2. Corresponding eigenvectors are
λ = 1: [1 + i; 2],   λ = −2: [1 + i; −1].
Since these eigenvectors correspond to different eigenvalues, they are already orthogonal. Their
norms are
‖[1 + i; 2]‖ = √(|1 + i|² + |2|²) = √6,   ‖[1 + i; −1]‖ = √(|1 + i|² + |1|²) = √3.
So defining a matrix U whose columns are the normalized eigenvectors gives
U = [(1 + i)/√6, (1 + i)/√3; 2/√6, −1/√3],
which is unitary. Then
U*AU = [1 0; 0 −2].
(d) With A as given, the characteristic polynomial is

    (λ − 1)((λ − 2)(λ − 3) − (1 − i)(1 + i)) = (λ − 1)(λ² − 5λ + 4) = (λ − 1)²(λ − 4).

Thus the eigenvalues are λ₁ = 1 and λ₂ = 4. Corresponding eigenvectors are

    λ = 1 : (0, −1 + i, 1)ᵀ and (1, 0, 0)ᵀ,    λ = 4 : (0, 1 − i, 2)ᵀ.

The eigenvectors corresponding to different eigenvalues are already orthogonal, and the two eigenvectors corresponding to λ = 1 are also orthogonal, so these vectors form an orthogonal set. Their norms are

    ‖(0, −1 + i, 1)ᵀ‖ = √(|0|² + |−1 + i|² + |1|²) = √3,    ‖(1, 0, 0)ᵀ‖ = 1,
    ‖(0, 1 − i, 2)ᵀ‖ = √(|0|² + |1 − i|² + |2|²) = √6.

So defining a matrix U whose columns are the normalized eigenvectors gives

    U = [     0        1      0      ]
        [ (−1 + i)/√3  0  (1 − i)/√6 ]
        [   1/√3       0    2/√6     ],

which is unitary. Then

    U*AU = [ 1 0 0 ]
           [ 0 1 0 ]
           [ 0 0 4 ].
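The diagonalization in part (d) can be checked numerically. A sketch with NumPy follows; the matrix A is read back from the characteristic polynomial quoted above (an assumption, since the exercise statement itself is not reproduced here):

```python
import numpy as np

# Matrix for part (d), reconstructed from its characteristic polynomial
# (lambda - 1)((lambda - 2)(lambda - 3) - (1 - i)(1 + i)).
A = np.array([[1, 0, 0],
              [0, 2, 1 - 1j],
              [0, 1 + 1j, 3]])

# Columns of U are the normalized eigenvectors found above.
U = np.array([[0, 1, 0],
              [(-1 + 1j) / np.sqrt(3), 0, (1 - 1j) / np.sqrt(6)],
              [1 / np.sqrt(3), 0, 2 / np.sqrt(6)]])

# U is unitary: U* U = I.
assert np.allclose(U.conj().T @ U, np.eye(3))

# U* A U is the diagonal matrix of eigenvalues diag(1, 1, 4).
assert np.allclose(U.conj().T @ A @ U, np.diag([1, 1, 4]))
```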
16. If A is Hermitian, then A∗ = A, so that A∗ A = AA = AA∗ . Next, if U is unitary, then U U ∗ = I. But if
U is unitary, then its rows are orthonormal. Since the columns of U ∗ are the conjugates of the rows of
U , they are orthonormal as well and therefore U ∗ is also unitary. Hence U U ∗ = U ∗ U = I and therefore
U is normal. Finally, if A is skew-Hermitian, then A∗ = −A; then A∗ A = −AA = A(−A) = AA∗ . So
all three of these types of matrices are normal.
17. By the result quoted just before Exercise 16, if A is unitarily diagonalizable, then A∗ A = AA∗ ; this is
the definition of A being normal.
Exploration: Geometric Inequalities and Optimization Problems
1. We have

    |u · v| = √x · √y + √y · √x = 2√x√y,
    ‖u‖ = ‖v‖ = √((√x)² + (√y)²) = √(x + y),

so that 2√x√y ≤ √(x + y) · √(x + y) = x + y, and thus

    √(xy) ≤ (x + y)/2.
2. (a) Since (√x − √y)² is a square, it is nonnegative. Expanding then gives x − 2√x√y + y ≥ 0, so that 2√(xy) ≤ x + y and thus

    √(xy) ≤ (x + y)/2.

(b) Let AD = a and BD = b. Then by the Pythagorean Theorem,

    CD² = a² − x² = b² − y²,

so that 2CD² = (a² + b²) − x² − y² = (x + y)² − (x² + y²) = 2xy, so that CD = √(xy). But CD is clearly at most equal to a radius of the circle, since it extends from a diameter to the circle. Since the radius is (x + y)/2, we again get

    √(xy) ≤ (x + y)/2.
3. Since the area is xy = 100, we have 10 = √(xy) ≤ (x + y)/2, so that 40 ≤ 2(x + y), which is the perimeter. By the AMGM, equality holds when x = y; that is, when x = y = 10, so the square of side 10 is the rectangle with smallest perimeter.
4. Let 1/x play the role of y in the AMGM. Then

    √(x · 1/x) ≤ (x + 1/x)/2,  or  2√1 = 2 ≤ x + 1/x.

Equality holds if and only if x = 1/x, i.e., when x = 1. Thus 1 + 1/1 = 2 is the minimum value of f(x) for x > 0.
5. If the dimensions of the cut-out square are x × x, then the base of the box will be (10 − 2x) × (10 − 2x), so its volume will be x(10 − 2x)². Then

    ∛V ≤ (x + (10 − 2x) + (10 − 2x))/3 = (20 − 3x)/3,

with equality holding if x = 10 − 2x. Thus x = 10/3, so that 10 − 2x = 10/3 as well. Then the largest possible volume is 10/3 × 10/3 × 10/3.
6. We have

    ∛f ≤ ((x + y) + (y + z) + (z + x))/3 = (2/3)(x + y + z),

with equality holding if x + y = y + z = z + x. But this implies that x = y = z; since xyz = 1, we must have x = y = z = 1, so that f(x, y, z) = (1 + 1)(1 + 1)(1 + 1) = 8. This is the minimum value of f subject to the constraint xyz = 1.
7. Use the substitution u = x − y. Then the original expression becomes

    u + y + 8/(uy).

Using the AMGM for three variables, we get

    ∛(u · y · 8/(uy)) ≤ (u + y + 8/(uy))/3,  or  3∛8 = 6 ≤ u + y + 8/(uy),

with equality holding if and only if u = y = 8/(uy); this implies that u = y = 2, so that x = u + y = 4. So the minimum value is 4 + 8/(2 · 2) = 6, which occurs when x = 4 and y = 2.

8. Follow the method of Example 7.11. x² + 2y² + z² is the square of the norm of v = (x, √2 y, z)ᵀ, so that x + 2y + 4z is the dot product of that vector with u = (1, √2, 4)ᵀ. Then the component-wise form of the Cauchy-Schwarz Inequality gives

    (1 · x + √2 · √2 y + 4 · z)² = (x + 2y + 4z)² ≤ (1² + (√2)² + 4²)(x² + 2y² + z²) = 19(x² + 2y² + z²) = 19.

So x + 2y + 4z ≤ √19, and the maximum value is √19.

9. Follow the method of Example 7.11. Here f(x, y, z) is the square of the norm of v = (x, y, z/√2)ᵀ, so set u = (1, 1, √2)ᵀ. Then the component-wise form of the Cauchy-Schwarz Inequality gives

    (u · v)² = (x + y + z)² ≤ (1² + 1² + (√2)²)(x² + y² + ½z²).

Since x + y + z = 10, we get

    100 ≤ 4(x² + y² + ½z²),  or  25 ≤ x² + y² + ½z².

Thus the minimum value of the function is 25.
10. Follow the method of Example 7.11 with u = (1, 1)ᵀ and v = (sin θ, cos θ)ᵀ. (This works because ‖v‖² = sin²θ + cos²θ = 1.) Then the component-wise form of the Cauchy-Schwarz Inequality gives

    (u · v)² = (sin θ + cos θ)² ≤ (1² + 1²)(sin²θ + cos²θ) = 2.

Therefore sin θ + cos θ ≤ √2, so the maximum value of the expression is √2.
11. We want to minimize x² + y² subject to the constraint x + 2y = 5. Thus set v = (x, y)ᵀ and u = (1, 2)ᵀ. Then the component-wise form of the Cauchy-Schwarz Inequality gives

    (x + 2y)² ≤ (1² + 2²)(x² + y²).

Since x + 2y = 5, we get

    25 ≤ 5(x² + y²),  or  5 ≤ x² + y².

Thus the minimum value is 5, which occurs at equality. To find the point where equality occurs, substitute 5 − 2y for x to get

    5 = (5 − 2y)² + y² = 25 − 20y + 5y²,  so that  y² − 4y + 4 = 0.

This has the solution y = 2; then x = 5 − 2y = 1. So the point on x + 2y = 5 closest to the origin is (1, 2).
12. To show that √(xy) ≥ 2/(1/x + 1/y), apply the AMGM to x′ = 1/x and y′ = 1/y, giving

    √(x′y′) ≤ (x′ + y′)/2,  or  1/√(x′y′) ≥ 2/(x′ + y′),  or  √(xy) ≥ 2/(1/x + 1/y).

Since this inequality results from the AMGM, we have equality when x′ = y′, so when x = y. For the first inequality, note first that (x − y)² = x² − 2xy + y² ≥ 0, so that x² + y² ≥ 2xy. Then

    (x² + y²)/2 = (x² + y² + (x² + y²))/4 ≥ (x² + 2xy + y²)/4 = ((x + y)/2)²,  so that  √((x² + y²)/2) ≥ (x + y)/2.

Equality occurs when x² + y² = 2xy, which is the same as (x − y)² = 0, so that x = y.
13. The area of the rectangle shown is A = 2xy, where x² + y² = r². From Exercise 12, we have

    √((x² + y²)/2) ≥ √(xy)  ⇒  √(x² + y²) ≥ √(2xy)  ⇒  √(r²) = r ≥ √(2xy) = √A  ⇒  A ≤ r².

The maximum occurs at equality, and the maximum area is A = r². By Exercise 12, this happens when x = y.
14. Following the hint, we have

    (x + y)²/(xy) = (x + y)(1/x + 1/y).

So from Exercise 12, we have

    (x + y)/2 ≥ 2/(1/x + 1/y)  ⇒  (x + y)(1/x + 1/y) ≥ 4.

Thus the minimum value is 4; this minimum occurs when x = y, so when

    (x + x)(1/x + 1/x) = 4x/x = 4,

for example when x = y = 1.
15. Squaring the inequalities in Exercise 12 gives

    (x² + y²)/2 ≥ ((x + y)/2)² ≥ (2/(1/x + 1/y))².

Substituting x + 1/x for x and y + 1/y for y in the first inequality gives

    ((x + 1/x)² + (y + 1/y)²)/2 ≥ (((x + 1/x) + (y + 1/y))/2)².

Since the constraint we are working with is x + y = 1, we get

    (((x + 1/x) + (y + 1/y))/2)² = (((x + y) + 1/x + 1/y)/2)² = (1/2 + (1/x + 1/y)/2)² ≥ (1/2 + 2/(x + y))²,

using the second and fourth terms in Exercise 12 for the final inequality. But now again since x + y = 1, we get

    (1/2 + 2/(x + y))² = (5/2)² = 25/4,

so that f(x, y) = 2 · 25/4 = 25/2 is the minimum when x + y = 1. This equality holds when x = y, so that x = y = 1/2.
7.2
Norms and Distance Functions
1. We have

    ‖u‖_E = √((−1)² + 4² + (−5)²) = √42,
    ‖u‖_s = |−1| + |4| + |−5| = 10,
    ‖u‖_m = max(|−1|, |4|, |−5|) = 5.
2. We have

    ‖v‖_E = √(2² + (−2)² + 0²) = 2√2,
    ‖v‖_s = |2| + |−2| + |0| = 4,
    ‖v‖_m = max(|2|, |−2|, |0|) = 2.

3. u − v = (−3, 6, −5)ᵀ, so

    d_E(u, v) = √((−3)² + 6² + (−5)²) = √70,
    d_s(u, v) = |−3| + |6| + |−5| = 14,
    d_m(u, v) = max(|−3|, |6|, |−5|) = 6.
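These distance computations can be verified with NumPy's `np.linalg.norm`, whose `ord` argument selects the 2-, 1-, and ∞-norms:

```python
import numpy as np

# Exercise 3: the Euclidean, sum, and max distances via u - v.
u_minus_v = np.array([-3, 6, -5])

d_E = np.linalg.norm(u_minus_v, 2)       # Euclidean norm
d_s = np.linalg.norm(u_minus_v, 1)       # sum norm
d_m = np.linalg.norm(u_minus_v, np.inf)  # max norm

assert np.isclose(d_E, np.sqrt(70))
assert d_s == 14
assert d_m == 6
```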
4. (a) ds (u, v) measures the total difference in magnitude between corresponding components.
(b) dm (u, v) measures the greatest difference between any pair of corresponding components.
5. ‖u‖_H = w(u), which is the number of 1s in u, or 4. ‖v‖_H = w(v), which is the number of 1s in v, or 5.

6. d_H(u, v) = w(u − v) = w(u) + w(v) less twice the overlap, or 4 + 5 − 2 · 2 = 5; this is the number of 1s in u − v.
7. (a) If ‖v‖_E = √(Σvᵢ²) = max{|vᵢ|} = ‖v‖_m, then squaring both sides gives Σvᵢ² = max{|vᵢ|²} = max{vᵢ²}. Since vᵢ² ≥ 0 for all i, equality holds if and only if exactly one of the vᵢ is nonzero.

(b) Suppose ‖v‖_s = Σ|vᵢ| = max{|vᵢ|} = ‖v‖_m. Since |vᵢ| ≥ 0 for all i, equality holds if and only if exactly one of the vᵢ is nonzero.

(c) By parts (a) and (b), ‖v‖_E = ‖v‖_m = ‖v‖_s if and only if exactly one of the components of v is nonzero.
8. (a) Since

    ‖u + v‖²_E = Σ(uᵢ + vᵢ)² = Σuᵢ² + Σvᵢ² + 2Σuᵢvᵢ = ‖u‖²_E + ‖v‖²_E + 2u · v,

we see that ‖u + v‖²_E = ‖u‖²_E + ‖v‖²_E if and only if u · v = 0; that is, if and only if u and v are orthogonal.

(b) We have

    ‖u + v‖_s = Σ|uᵢ + vᵢ|,    ‖u‖_s + ‖v‖_s = Σ(|uᵢ| + |vᵢ|).

For a given i, if the signs of uᵢ and vᵢ are the same, then |uᵢ + vᵢ| = |uᵢ| + |vᵢ|, while if they are different, then |uᵢ + vᵢ| < |uᵢ| + |vᵢ|. Thus the two norms are equal exactly when uᵢ and vᵢ have the same sign for each i, which is to say that uᵢvᵢ ≥ 0 for all i.

(c) We have

    ‖u + v‖_m = max{|uᵢ + vᵢ|},    ‖u‖_m + ‖v‖_m = max{|uᵢ|} + max{|vᵢ|}.

Suppose that |u_r| = max{|uᵢ|} and |v_t| = max{|vᵢ|}. Then for any j, we know that

    |u_j + v_j| ≤ |u_j| + |v_j| ≤ |u_r| + |v_t|,

where the first inequality is the triangle inequality and the second comes from the condition on r and t. Choose j such that |u_j + v_j| = ‖u + v‖_m. If the two norms are equal, then

    |u_j + v_j| = ‖u + v‖_m = ‖u‖_m + ‖v‖_m = |u_r| + |v_t|.

Thus both of the inequalities above must be equalities. The first is an equality if u_j and v_j have the same sign. For the second, since |u_j| ≤ |u_r| and |v_j| ≤ |v_t|, equality means that |u_j| = |u_r| and |v_j| = |v_t|. Putting this together gives the following (somewhat complicated) condition for equality to hold in the triangle inequality for the max norm: equality holds if there is some component j such that u_j and v_j have the same sign and such that |u_j| is a component of maximal magnitude of u and |v_j| is a component of maximal magnitude of v.
9. Let v_k be a component of v with maximum magnitude. Then the result follows from the string of equalities and inequalities

    ‖v‖_m = |v_k| = √(v_k²) ≤ √(Σvᵢ²) = ‖v‖_E.

10. We have ‖v‖²_E = Σvᵢ² ≤ (Σ|vᵢ|)² = ‖v‖²_s. Taking square roots gives the desired result since norms are always nonnegative.

11. We have ‖v‖_s = Σ|vᵢ| ≤ Σ max{|vᵢ|} = n max{|vᵢ|} = n‖v‖_m.

12. We have ‖v‖_E = √(Σvᵢ²) ≤ √(Σ(max{|vᵢ|})²) = √(n max{|vᵢ|²}) = √n max{|vᵢ|} = √n ‖v‖_m.
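The chain of inequalities in Exercises 9–12, ‖v‖_m ≤ ‖v‖_E ≤ ‖v‖_s ≤ n‖v‖_m, can be spot-checked numerically on random vectors (a sanity check, not a proof):

```python
import numpy as np

# Check the norm inequalities from Exercises 9-12 on random vectors in R^5.
rng = np.random.default_rng(0)
n = 5
for _ in range(100):
    v = rng.normal(size=n)
    m = np.linalg.norm(v, np.inf)   # max norm
    E = np.linalg.norm(v, 2)        # Euclidean norm
    s = np.linalg.norm(v, 1)        # sum norm
    assert m <= E + 1e-12           # Exercise 9
    assert E <= s + 1e-12           # Exercise 10
    assert s <= n * m + 1e-12       # Exercise 11
    assert E <= np.sqrt(n) * m + 1e-12  # Exercise 12
```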
13. The unit circle in R2 relative to the sum norm is
{(x, y) : |x| + |y| = 1} = {(x, y) : x + y = 1 or x − y = 1 or − x + y = 1 or − x − y = 1}.
The unit circle in R2 relative to the max norm is
{(x, y) : max{|x| , |y|} = 1}.
These unit circles are shown below: [figures: the sum-norm unit circle is the diamond with vertices (±1, 0) and (0, ±1); the max-norm unit circle is the square with vertices (±1, ±1)].
14. For example, in R² with the sum norm, let u = ⟨2, 1⟩ and v = ⟨1, 2⟩. Then ‖u‖ = ‖v‖ = 2 + 1 = 3. Also, ‖u + v‖ = ‖⟨3, 3⟩‖ = 6 and ‖u − v‖ = ‖⟨1, −1⟩‖ = 2. But then

    ‖u‖² + ‖v‖² = 3² + 3² = 18 ≠ ½‖u + v‖² + ½‖u − v‖² = 18 + 2 = 20.
15. Clearly ‖(a, b)ᵀ‖ = max{|2a|, |3b|} ≥ 0, and since max{|2a|, |3b|} = 0 if and only if both a and b are zero, we see that the norm is zero only for the zero vector. Thus property (1) is satisfied. For property (2),

    ‖c(a, b)ᵀ‖ = ‖(ca, cb)ᵀ‖ = max{|2ac|, |3bc|} = |c| max{|2a|, |3b|} = |c| ‖(a, b)ᵀ‖.

Finally,

    ‖(a, b)ᵀ + (c, d)ᵀ‖ = ‖(a + c, b + d)ᵀ‖ = max{|2(a + c)|, |3(b + d)|}.

Now use the triangle inequality:

    max{|2(a + c)|, |3(b + d)|} ≤ max{|2a| + |2c|, |3b| + |3d|} ≤ max{|2a|, |3b|} + max{|2c|, |3d|}.

But the last expression is just ‖(a, b)ᵀ‖ + ‖(c, d)ᵀ‖, and we are done.
16. Clearly max_{i,j}{|a_ij|} ≥ 0, and is zero only when all of the a_ij are zero, so when A = O. Thus property (1) is satisfied. For property (2),

    ‖cA‖ = max_{i,j}{|c a_ij|} = max_{i,j}{|c| |a_ij|} = |c| max_{i,j}{|a_ij|} = |c| ‖A‖.

For property (3), using the triangle inequality for absolute values gives

    ‖A + B‖ = max_{i,j}{|a_ij + b_ij|} ≤ max_{i,j}{|a_ij| + |b_ij|} ≤ max_{i,j}{|a_ij|} + max_{i,j}{|b_ij|} = ‖A‖ + ‖B‖.
17. Clearly ‖f‖ ≥ 0, since |f| ≥ 0. Since f is continuous, |f| is also continuous, so its integral is zero if and only if it is identically zero. So property (1) holds. For property (2), use standard properties of integrals:

    ‖cf‖ = ∫₀¹ |cf(x)| dx = ∫₀¹ |c| · |f(x)| dx = |c| ∫₀¹ |f(x)| dx = |c| ‖f‖.

Finally, property (3) relies on the triangle inequality for absolute value:

    ‖f + g‖ = ∫₀¹ |f(x) + g(x)| dx ≤ ∫₀¹ (|f(x)| + |g(x)|) dx = ∫₀¹ |f(x)| dx + ∫₀¹ |g(x)| dx = ‖f‖ + ‖g‖.
18. Since the norm is the maximum of a set of nonnegative numbers, it is always nonnegative, and is zero exactly when all of the numbers are zero, which means that f is the zero function on [0, 1]. So property (1) holds. For property (2), we have

    ‖cf‖ = max_{0≤x≤1}{|cf(x)|} = max_{0≤x≤1}{|c| |f(x)|} = |c| max_{0≤x≤1}{|f(x)|} = |c| ‖f‖.

Finally, property (3) relies on the triangle inequality for absolute values:

    ‖f + g‖ = max_{0≤x≤1}{|f(x) + g(x)|} ≤ max_{0≤x≤1}{|f(x)| + |g(x)|} ≤ max_{0≤x≤1}{|f(x)|} + max_{0≤x≤1}{|g(x)|} = ‖f‖ + ‖g‖.
19. By property (2) of the definition of a norm,

    d(u, v) = ‖u − v‖ = |−1| ‖u − v‖ = ‖(−1)(u − v)‖ = ‖v − u‖ = d(v, u).
20. The Frobenius norm is ‖A‖_F = √(2² + 3² + 4² + 1²) = √30. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is 2 + 4 = 6, while ‖A‖_∞ is the largest row sum of |A|; both rows of |A| sum to 5 (2 + 3 = 4 + 1 = 5), so ‖A‖_∞ = 5.
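A quick check with NumPy, where the matrix entries (2, 3; 4, 1) are read off from the computation above:

```python
import numpy as np

# Exercise 20: Frobenius, 1-, and infinity-norms of A = [2 3; 4 1].
A = np.array([[2.0, 3.0],
              [4.0, 1.0]])

fro = np.linalg.norm(A, 'fro')       # Frobenius norm
one = np.linalg.norm(A, 1)           # largest absolute column sum
inf = np.linalg.norm(A, np.inf)      # largest absolute row sum

assert np.isclose(fro, np.sqrt(30))
assert one == 6
assert inf == 5
```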
21. The Frobenius norm is ‖A‖_F = √(0² + (−1)² + (−3)² + 3²) = √19. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is |−1| + 3 = 4, while ‖A‖_∞ is the largest row sum, which is |−3| + 3 = 6.

22. The Frobenius norm is ‖A‖_F = √(1² + 5² + (−2)² + (−1)²) = √31. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is 5 + |−1| = 6, while ‖A‖_∞ is the largest row sum, which is 1 + 5 = 6.

23. The Frobenius norm is ‖A‖_F = √(2² + 1² + 1² + 1² + 3² + 2² + 1² + 1² + 3²) = √31. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is 1 + 2 + 3 = 6, while ‖A‖_∞ is the largest row sum, which is 1 + 3 + 2 = 6.

24. The Frobenius norm is ‖A‖_F = √(0² + (−5)² + 2² + 3² + 1² + (−3)² + (−4)² + (−4)² + 3²) = √89. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is |−5| + 1 + |−4| = 10, while ‖A‖_∞ is the largest row sum, which is |−4| + |−4| + 3 = 11.

25. The Frobenius norm is ‖A‖_F = √(4² + (−2)² + (−1)² + 0² + (−1)² + 2² + 3² + (−3)² + 0²) = √44 = 2√11. From Theorem 7.7 and the comment following, ‖A‖₁ is the largest column sum of |A|, which is 4 + 0 + 3 = 7, while ‖A‖_∞ is the largest row sum, which is 4 + |−2| + |−1| = 7.
26. Since ‖A‖₁ = 6, which is the sum of the first column, such a vector x is e₁, since

    [2 3; 4 1](1, 0)ᵀ = (2, 4)ᵀ,  and  ‖(2, 4)ᵀ‖_s = 6.

Since ‖A‖_∞ = 5, which is the sum of (say) the first row, let y = (1, 1)ᵀ; then

    [2 3; 4 1](1, 1)ᵀ = (5, 5)ᵀ,  and  ‖(5, 5)ᵀ‖_m = 5.
27. Since ‖A‖₁ = 4, which is the sum of the magnitudes of the second column, such a vector x is e₂, since

    [0 −1; −3 3](0, 1)ᵀ = (−1, 3)ᵀ,  and  ‖(−1, 3)ᵀ‖_s = |−1| + 3 = 4.

Since ‖A‖_∞ = 6, which is the sum of the magnitudes of the second row, let y = (−1, 1)ᵀ since the first column in that row is negative; then

    [0 −1; −3 3](−1, 1)ᵀ = (−1, 6)ᵀ,  and  ‖(−1, 6)ᵀ‖_m = 6.
28. Since ‖A‖₁ = 6, which is the sum of the magnitudes of the second column, such a vector x is e₂, since

    [1 5; −2 −1](0, 1)ᵀ = (5, −1)ᵀ,  and  ‖(5, −1)ᵀ‖_s = 5 + |−1| = 6.

Since ‖A‖_∞ = 6, which is the sum of the first row, let y = (1, 1)ᵀ; then

    [1 5; −2 −1](1, 1)ᵀ = (6, −3)ᵀ,  and  ‖(6, −3)ᵀ‖_m = 6.
29. Since ‖A‖₁ = 6, which is the sum of the magnitudes of the third column, such a vector x is e₃, since

    [2 1 1; 1 3 2; 1 1 3](0, 0, 1)ᵀ = (1, 2, 3)ᵀ,  and  ‖(1, 2, 3)ᵀ‖_s = 6.

Since ‖A‖_∞ = 6, which is the sum of the second row, let y = (1, 1, 1)ᵀ; then

    [2 1 1; 1 3 2; 1 1 3](1, 1, 1)ᵀ = (4, 6, 5)ᵀ,  and  ‖(4, 6, 5)ᵀ‖_m = 6.
30. Since ‖A‖₁ = 10, which is the sum of the magnitudes of the second column, such a vector x is e₂, since

    [0 −5 2; 3 1 −3; −4 −4 3](0, 1, 0)ᵀ = (−5, 1, −4)ᵀ,  and  ‖(−5, 1, −4)ᵀ‖_s = 10.

Since ‖A‖_∞ = 11, which is the sum of the magnitudes of the third row, let y = (−1, −1, 1)ᵀ since the entries in the first two columns are negative; then

    [0 −5 2; 3 1 −3; −4 −4 3](−1, −1, 1)ᵀ = (7, −7, 11)ᵀ,  and  ‖(7, −7, 11)ᵀ‖_m = 11.
31. Since ‖A‖₁ = 7, which is the sum of the magnitudes of the first column, such a vector x is e₁, since

    [4 −2 −1; 0 −1 2; 3 −3 0](1, 0, 0)ᵀ = (4, 0, 3)ᵀ,  and  ‖(4, 0, 3)ᵀ‖_s = 7.

Since ‖A‖_∞ = 7, which is the sum of the magnitudes of the first row, let y = (1, −1, −1)ᵀ since the entries in the last two columns are negative; then

    [4 −2 −1; 0 −1 2; 3 −3 0](1, −1, −1)ᵀ = (7, −1, 6)ᵀ,  and  ‖(7, −1, 6)ᵀ‖_m = 7.
32. Let Aᵢ be the ith row of A, let M = max_{j=1,2,...,n}{‖A_j‖_s} be the maximum absolute row sum, and let x be any vector with ‖x‖_m = 1 (which means that |xᵢ| ≤ 1 for all i). Then

    ‖Ax‖_m = max_i{|Aᵢx|}
           = max_i{|a_{i1}x₁ + a_{i2}x₂ + ··· + a_{in}x_n|}
           ≤ max_i{|a_{i1}| |x₁| + |a_{i2}| |x₂| + ··· + |a_{in}| |x_n|}
           ≤ max_i{|a_{i1}| + |a_{i2}| + ··· + |a_{in}|}
           = max_i{‖Aᵢ‖_s} = M.

Thus ‖Ax‖_m is at most equal to the maximum absolute row sum. Next, we find a specific unit vector so that it is actually equal to M. Let row k be the row of A that gives the maximum sum norm; that is, such that ‖A_k‖_s = M. Let

    y_j = 1 if a_{kj} ≥ 0,  y_j = −1 if a_{kj} < 0,  for j = 1, 2, ..., n.

Then with y = (y₁, ..., y_n)ᵀ, we have ‖y‖_m = 1, and we have corrected any negative signs in row k, so that a_{kj}y_j ≥ 0 for all j and therefore the kth entry of Ay is

    Σ_j a_{kj}y_j = Σ_j |a_{kj}y_j| = Σ_j |a_{kj}| |y_j| = Σ_j |a_{kj}| = ‖A_k‖_s = M.

So we have shown that ‖Ax‖_m is bounded by the maximum row sum, and then found a vector y with ‖y‖_m = 1 for which ‖Ay‖_m is equal to the maximum row sum. Thus ‖A‖_∞ is equal to the maximum row sum, as desired.
33. (a) If ‖·‖ is an operator norm, then ‖I‖ = max_{‖x̂‖=1} ‖Ix̂‖ = max_{‖x̂‖=1} ‖x̂‖ = 1.

(b) No, there is not, for n > 1, since in the Frobenius norm, ‖I‖_F = √(Σ 1²) = √n ≠ 1. When n = 1, the Frobenius norm is the same as the absolute value norm on the real number line.
34. Let λ be an eigenvalue of A and v a corresponding eigenvector, and normalize v to the unit vector v̂. Then Av̂ = λv̂, so that

    ‖A‖ = max_{‖x̂‖=1} ‖Ax̂‖ ≥ ‖Av̂‖ = ‖λv̂‖ = |λ| ‖v̂‖ = |λ|.
35. Computing the inverse gives

    A⁻¹ = [  1  −1/2 ]
          [ −2   3/2 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (3 + 4)(1 + |−2|) = 21,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (4 + 2)(|−2| + 3/2) = 21.

The matrix is well-conditioned.
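This computation can be verified numerically; in the sketch below A is recovered by inverting the A⁻¹ shown above (which gives A = [3 1; 4 2]):

```python
import numpy as np

# Exercise 35: condition numbers from the computed inverse.
A_inv = np.array([[1.0, -0.5],
                  [-2.0, 1.5]])
A = np.linalg.inv(A_inv)          # recovers [[3, 1], [4, 2]]

cond_1 = np.linalg.norm(A, 1) * np.linalg.norm(A_inv, 1)
cond_inf = np.linalg.norm(A, np.inf) * np.linalg.norm(A_inv, np.inf)

assert np.isclose(cond_1, 21)
assert np.isclose(cond_inf, 21)
# np.linalg.cond computes the same products directly:
assert np.isclose(np.linalg.cond(A, 1), 21)
assert np.isclose(np.linalg.cond(A, np.inf), 21)
```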
36. Since det A = 0, by definition cond1 (A) = cond∞ (A) = ∞, so this matrix is ill-conditioned.
37. Computing the inverse gives

    A⁻¹ = [  100  −99 ]
          [ −100  100 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (1 + 1)(100 + |−100|) = 400,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (1 + 1)(100 + |−100|) = 400.

The matrix is ill-conditioned.
38. Computing the inverse gives

    A⁻¹ = [  2001/50    −2  ]
          [ −3001/100   3/2 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (200 + 4002)(2001/50 + 3001/100) ≈ 294266,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (3001 + 4002)(2001/50 + 2) ≈ 294266.

The matrix is ill-conditioned.
39. Computing the inverse gives

    A⁻¹ = [  0   0   1 ]
          [  6  −1  −1 ]
          [ −5   1   0 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (1 + 5 + 1)(0 + 6 + 5) = 77,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (5 + 5 + 6)(6 + 1 + 1) = 128.

The matrix is moderately ill-conditioned.
40. Computing the inverse gives

    A⁻¹ = [   9  −36    30 ]
          [ −36  192  −180 ]
          [  30 −180   180 ].

Then

    cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (1 + 1/2 + 1/3)(36 + 192 + 180) = 748,
    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (1 + 1/2 + 1/3)(36 + 192 + 180) = 748.

The matrix is ill-conditioned.
41. (a) Computing the inverse gives

    A⁻¹ = (1/(1 − k)) [  1  −k ]
                      [ −1   1 ].

Then

    cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = max{|k| + 1, 2} · max{ |k|/|k − 1| + 1/|k − 1|, 2/|k − 1| }.

(b) As k → 1, since the second factor goes to ∞, cond_∞(A) → ∞. This makes sense since when k = 1, A is not invertible.
42. Since the matrix norm is compatible, we have from the definition that ‖Ax‖ ≤ ‖A‖ ‖x‖ for any x, so that

    ‖b‖ = ‖AA⁻¹b‖ ≤ ‖A‖ ‖A⁻¹b‖,  so that  1/‖A⁻¹b‖ ≤ ‖A‖/‖b‖.

Then

    ‖Δx‖/‖x‖ = ‖A⁻¹Δb‖/‖A⁻¹b‖ ≤ ‖A⁻¹‖ ‖Δb‖/‖A⁻¹b‖ ≤ ‖A‖ ‖A⁻¹‖ ‖Δb‖/‖b‖ = cond(A) · ‖Δb‖/‖b‖.
43. (a) Since

    A⁻¹ = [ −9/10   1 ]
          [   1    −1 ],

we have cond_∞(A) = ‖A‖_∞ ‖A⁻¹‖_∞ = (10 + 10)(1 + 1) = 40.

(b) Here

    ΔA = [10 10; 10 11] − [10 10; 10 9] = [0 0; 0 2]  ⇒  ‖ΔA‖_∞ = 2.

Therefore

    ‖Δx‖_m/‖x′‖_m ≤ cond_∞(A) · ‖ΔA‖_∞/‖A‖_∞ = 40 · (2/20) = 4,

so that this can produce at most a change of a factor of 4, or a 400% relative change.

(c) Row-reducing gives

    [A | b]  = [10 10 100; 10 9 99]  → [1 0 9; 0 1 1],
    [A′ | b] = [10 10 100; 10 11 99] → [1 0 11; 0 1 −1].

Thus x = (9, 1)ᵀ and x′ = (11, −1)ᵀ, so that Δx = x′ − x = (2, −2)ᵀ, and ‖Δx‖_m/‖x‖_m = 2/9, for approximately a 22% relative error.

(d) Using Exercise 42,

    Δb = (100, 101)ᵀ − (100, 99)ᵀ = (0, 2)ᵀ,

so that ‖Δb‖_m = 2 and

    ‖Δx‖_m/‖x‖_m ≤ cond_∞(A) · ‖Δb‖_m/‖b‖_m = 40 · (2/100) = 0.8,

so that at most an 80% change can result.

(e) Row-reducing [A | b] gives the same result as above, while

    [A | b′] = [10 10 100; 10 9 101] → [1 0 11; 0 1 −1].

Thus Δx = (11, −1)ᵀ − (9, 1)ᵀ = (2, −2)ᵀ, and again ‖Δx‖_m/‖x‖_m = 2/9, for about a 22% actual relative error.
44. (a) Since

    A⁻¹ = [ −10   3   5 ]
          [   4  −1  −2 ]
          [   7  −2  −3 ],

we have cond₁(A) = ‖A‖₁ ‖A⁻¹‖₁ = (1 + 5 + |−1|)(|−10| + 4 + 7) = 147.

(b) Here

    ΔA = [1 1 1; 2 5 0; 1 −1 2] − [1 1 1; 1 5 0; 1 −1 2] = [0 0 0; 1 0 0; 0 0 0]  ⇒  ‖ΔA‖₁ = 1.

Therefore

    ‖Δx‖_s/‖x′‖_s ≤ cond₁(A) · ‖ΔA‖₁/‖A‖₁ = 147 · (1/7) = 21,

so that this can produce at most a change of a factor of 21, or a 2100% relative change.

(c) Row-reducing gives

    [A | b]  = [1 1 1 1; 2 5 0 2; 1 −1 2 3] → [1 0 0 11; 0 1 0 −4; 0 0 1 −6],
    [A′ | b] = [1 1 1 1; 1 5 0 2; 1 −1 2 3] → [1 0 0 −11/2; 0 1 0 3/2; 0 0 1 5].

Thus x = (11, −4, −6)ᵀ and x′ = (−11/2, 3/2, 5)ᵀ, so that Δx = x′ − x = (−33/2, 11/2, 11)ᵀ, and ‖Δx‖_s/‖x‖_s = 33/21 ≈ 1.57, for approximately a 157% actual relative error.

(d) Using Exercise 42,

    Δb = (1, 1, 3)ᵀ − (1, 2, 3)ᵀ = (0, −1, 0)ᵀ,

so that ‖Δb‖_s = 1 and

    ‖Δx‖_s/‖x‖_s ≤ cond₁(A) · ‖Δb‖_s/‖b‖_s = 147 · (1/6) = 24.5,

so that at most a 2450% relative change can result.

(e) Row-reducing [A | b] gives the same result as above, while

    [A | b′] = [1 1 1 1; 1 5 0 1; 1 −1 2 3] → [1 0 0 −4; 0 1 0 1; 0 0 1 4].

Thus Δx = (−4, 1, 4)ᵀ − (11, −4, −6)ᵀ = (−15, 5, 10)ᵀ, and ‖Δx‖_s/‖x‖_s = 30/21 ≈ 1.43, for about a 143% actual relative error.
45. From Exercise 33(a), 1 = ‖I‖ = ‖A⁻¹A‖ ≤ ‖A⁻¹‖ ‖A‖ = cond(A).

46. From Exercise 33(a),

    cond(AB) = ‖(AB)⁻¹‖ ‖AB‖ = ‖B⁻¹A⁻¹‖ ‖AB‖ ≤ ‖A⁻¹‖ ‖A‖ ‖B⁻¹‖ ‖B‖ = cond(A) cond(B).

47. From Theorem 4.18(b) in Chapter 4, if λₙ is an eigenvalue of A, then 1/λₙ is an eigenvalue of A⁻¹. By Exercise 34, ‖A‖ ≥ |λ₁| and ‖A⁻¹‖ ≥ |1/λₙ|, so that

    cond(A) = ‖A⁻¹‖ ‖A‖ ≥ |λ₁|/|λₙ|.

48. With

    A = [ 7 −1 ]
        [ 1 −5 ],
we have

    M = −D⁻¹(L + U) = −[7 0; 0 −5]⁻¹ [0 −1; 1 0] = [0 1/7; 1/5 0],

so that ‖M‖_∞ = 1/5. Then

    b = (6, −4)ᵀ  ⇒  c = D⁻¹b = (6/7, 4/5)ᵀ  ⇒  ‖c‖_m = 6/7.

Note that x₀ = 0, so that x₁ = c, and then ‖M‖ᵏ_∞ ‖x₁ − x₀‖_m = ‖M‖ᵏ_∞ ‖c‖_m < 0.0005 means that

    k > (4 + log₁₀(‖c‖_m/5)) / log₁₀(1/‖M‖_∞).

In this case, with ‖M‖_∞ = 1/5 and ‖c‖_m = 6/7, we get

    k > (4 + log₁₀(6/35)) / log₁₀ 5 ≈ 4.6,  so that k ≥ 5.

Computing x₅ using x_{k+1} = Mx_k + c gives x₅ = (0.999, 0.999)ᵀ, as compared with an actual solution of x = (1, 1)ᵀ.
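The Jacobi iteration used in Exercises 48–51 can be sketched in NumPy; the splitting below forms M = −D⁻¹(L + U) and c = D⁻¹b directly from the A and b of Exercise 48:

```python
import numpy as np

# Jacobi iteration x_{k+1} = M x_k + c for Exercise 48.
A = np.array([[7.0, -1.0],
              [1.0, -5.0]])
b = np.array([6.0, -4.0])

D = np.diag(np.diag(A))
M = -np.linalg.inv(D) @ (A - D)   # A - D = L + U
c = np.linalg.inv(D) @ b

x = np.zeros(2)                   # x_0 = 0
for _ in range(5):                # five iterations, as computed above
    x = M @ x + c

# x_5 is within about 10^-3 of the actual solution (1, 1).
assert np.allclose(x, [1, 1], atol=1e-3)
```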
49. With

    A = [ 4.5 −0.5 ]
        [ 1   −3.5 ],

we have

    M = −D⁻¹(L + U) = −[4.5 0; 0 −3.5]⁻¹ [0 −0.5; 1 0] = [0 1/9; 2/7 0],

so that ‖M‖_∞ = 2/7. Then

    b = (1, −1)ᵀ  ⇒  c = D⁻¹b = (2/9, 2/7)ᵀ  ⇒  ‖c‖_m = 2/7.

Note that x₀ = 0, so that x₁ = c, and then ‖M‖ᵏ_∞ ‖x₁ − x₀‖_m = ‖M‖ᵏ_∞ ‖c‖_m < 0.0005 means that

    k > (4 + log₁₀(‖c‖_m/5)) / log₁₀(1/‖M‖_∞).

In this case, with ‖M‖_∞ = 2/7 and ‖c‖_m = 2/7, we get

    k > (4 + log₁₀(2/35)) / log₁₀(7/2) ≈ 5.06,  so that k ≥ 6.

Computing x₆ using x_{k+1} = Mx_k + c gives x₆ = (0.2622, 0.3606)ᵀ, as compared with an actual solution of x = (0.2623, 0.3606)ᵀ.
50. With

    A = [ 20   1  −1 ]
        [  1 −10   1 ]
        [ −1   1  10 ],

we have

    M = −D⁻¹(L + U) = −[20 0 0; 0 −10 0; 0 0 10]⁻¹ [0 1 −1; 1 0 1; −1 1 0]
      = [ 0    −1/20  1/20 ]
        [ 1/10   0    1/10 ]
        [ 1/10 −1/10    0  ],

so that ‖M‖_∞ = 2/10 = 1/5. Then

    b = (17, 13, 18)ᵀ  ⇒  c = D⁻¹b = (17/20, −13/10, 9/5)ᵀ  ⇒  ‖c‖_m = 9/5.

Note that x₀ = 0, so that x₁ = c, and then ‖M‖ᵏ_∞ ‖x₁ − x₀‖_m = ‖M‖ᵏ_∞ ‖c‖_m < 0.0005 means that

    k > (4 + log₁₀(‖c‖_m/5)) / log₁₀(1/‖M‖_∞).

In this case, with ‖M‖_∞ = 1/5 and ‖c‖_m = 9/5, we get

    k > (4 + log₁₀(9/25)) / log₁₀ 5 ≈ 5.08,  so that k ≥ 6.

Computing x₆ using x_{k+1} = Mx_k + c gives x₆ = (0.999, −1.000, 1.999)ᵀ, as compared with an actual solution of x = (1, −1, 2)ᵀ.
51. With

    A = [ 3 1 0 ]
        [ 1 4 1 ]
        [ 0 1 3 ],

we have

    M = −D⁻¹(L + U) = −[3 0 0; 0 4 0; 0 0 3]⁻¹ [0 1 0; 1 0 1; 0 1 0]
      = [  0   −1/3   0  ]
        [ −1/4   0  −1/4 ]
        [  0   −1/3   0  ],

so that ‖M‖_∞ = 2/4 = 1/2. Then

    b = (1, 1, 1)ᵀ  ⇒  c = D⁻¹b = (1/3, 1/4, 1/3)ᵀ  ⇒  ‖c‖_m = 1/3.

Note that x₀ = 0, so that x₁ = c, and then ‖M‖ᵏ_∞ ‖x₁ − x₀‖_m = ‖M‖ᵏ_∞ ‖c‖_m < 0.0005 means that

    k > (4 + log₁₀(‖c‖_m/5)) / log₁₀(1/‖M‖_∞).

In this case, with ‖M‖_∞ = 1/2 and ‖c‖_m = 1/3, we get

    k > (4 + log₁₀(1/15)) / log₁₀ 2 ≈ 9.3,  so that k ≥ 10.

Computing x₁₀ using x_{k+1} = Mx_k + c gives x₁₀ = (0.299, 0.099, 0.299)ᵀ, as compared with an actual solution of x = (0.3, 0.1, 0.3)ᵀ.
52. The sum and max norms are operator norms induced from the sum and max norms on Rⁿ, so they are matrix norms.

(a) If C and D are any two matrices and ‖·‖ is either the sum or max norm, then ‖CD‖ ≤ ‖C‖ ‖D‖. Now assume A is such that ‖A‖ = δ < 1. Then

    ‖Aⁿ‖ ≤ ‖A‖ⁿ = δⁿ.

Thus lim_{n→∞} ‖Aⁿ‖ = lim_{n→∞} δⁿ = 0. So by property (1) in the definition of matrix norms, Aⁿ → O.

(b) A computation shows that

    (I − A)(I + A + ··· + Aⁿ) = I − A^{n+1},

since all the other terms cancel in pairs. Thus

    (I − A)(I + A + A² + A³ + ···) = (I − A) lim_{n→∞}(I + A + ··· + Aⁿ)
                                   = lim_{n→∞}((I − A)(I + A + ··· + Aⁿ))
                                   = lim_{n→∞}(I − A^{n+1})
                                   = I.

It follows that I − A is invertible and that its inverse is I + A + A² + A³ + ···.

(c) Recall that a consumption matrix C is a matrix such that all entries are nonnegative and the sum of the entries in each column does not exceed 1. A consumption matrix C is productive if I − C is invertible and (I − C)⁻¹ ≥ O; that is, if it has only nonnegative entries.

To prove Corollary 3.35, use ‖·‖_∞, the max norm. Note that since C ≥ O, the sum of the absolute values of the entries of a row is the same as the sum of the entries of the row. If the sum of each row of C is less than 1, then ‖C‖_∞ < 1, so by part (b), I − C is invertible. Since C ≥ O, it is clear that each entry of (I − C)⁻¹ = I + C + C² + C³ + ··· is nonnegative, so that (I − C)⁻¹ ≥ O. Thus C is productive.

The proof of Corollary 3.36 is similar, but it uses ‖·‖₁, the sum norm. If the sum of each column of C is less than 1, then ‖C‖₁ < 1. The rest of the argument is identical to that in the preceding paragraph.
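Part (b) can be illustrated numerically: for a small matrix with norm less than 1, partial sums of I + A + A² + ··· approach (I − A)⁻¹. The matrix below is an arbitrary example chosen for the sketch, not one from the text:

```python
import numpy as np

# Neumann series: if ||A|| < 1, then I + A + A^2 + ... = (I - A)^{-1}.
A = np.array([[0.2, 0.1],
              [0.0, 0.3]])          # sum and max norms both < 1

S = np.zeros_like(A)
term = np.eye(2)
for _ in range(100):                 # partial sums of the series
    S = S + term
    term = term @ A

assert np.allclose(S, np.linalg.inv(np.eye(2) - A))
```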
7.3
Least Squares Approximation
1. The corresponding data points predicted by the line y = −2 + 2x are (1, 0), (2, 2), and (3, 4), so the least squares error is

    √((0 − 0)² + (2 − 1)² + (4 − 5)²) = √2.

A plot of the points and the line on the same set of axes is: [figure].
2. The corresponding data points predicted by the line y = x are (1, 1), (2, 2), and (3, 3), so the least squares error is

    √((1 − 0)² + (2 − 1)² + (3 − 5)²) = √6.

A plot of the points and the line on the same set of axes is: [figure].
3. The corresponding data points predicted by the line y = −3 + (5/2)x are (1, −1/2), (2, 2), and (3, 9/2), so the least squares error is

    √((−1/2 − 0)² + (2 − 1)² + (9/2 − 5)²) = √6/2.

A plot of the points and the line on the same set of axes is: [figure].
4. The corresponding data points predicted by the line y = 3 − (1/3)x are

    (−5, 3 − (1/3)(−5)) = (−5, 14/3),    (0, 3 − (1/3) · 0) = (0, 3),
    (5, 3 − (1/3) · 5) = (5, 4/3),       (10, 3 − (1/3) · 10) = (10, −1/3),

so the least squares error is

    √((3 − 14/3)² + (3 − 3)² + (2 − 4/3)² + (0 − (−1/3))²) = √(25/9 + 0 + 4/9 + 1/9) = √30/3.

A plot of the points and the line on the same set of axes is: [figure].
5. The corresponding data points predicted by the line y = 5/2 are

    (−5, 5/2),    (0, 5/2),    (5, 5/2),    (10, 5/2),

so the least squares error is

    √((3 − 5/2)² + (3 − 5/2)² + (2 − 5/2)² + (0 − 5/2)²) = √(1/4 + 1/4 + 1/4 + 25/4) = √7.

A plot of the points and the line on the same set of axes is: [figure].
6. The corresponding data points predicted by the line y = 2 − (1/5)x are

    (−5, 3),    (0, 2),    (5, 1),    (10, 0),

so the least squares error is

    √((3 − 3)² + (3 − 2)² + (2 − 1)² + (0 − 0)²) = √2.

A plot of the points and the line on the same set of axes is: [figure].
7. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 1; 1 2; 1 3],    x = (a, b)ᵀ,    b = (y₁, y₂, ..., yₙ)ᵀ = (0, 1, 5)ᵀ.

Since

    AᵀA = [1 1 1; 1 2 3][1 1; 1 2; 1 3] = [3 6; 6 14]  ⇒  (AᵀA)⁻¹ = [7/3 −1; −1 1/2],

we get

    x = (AᵀA)⁻¹Aᵀb = [7/3 −1; −1 1/2][1 1 1; 1 2 3](0, 1, 5)ᵀ = (−3, 5/2)ᵀ.

Thus the least squares approximation is y = −3 + (5/2)x. To find the least squares error, compute

    e = b − Ax = (0, 1, 5)ᵀ − [1 1; 1 2; 1 3](−3, 5/2)ᵀ = (1/2, −1, 1/2)ᵀ.

Then ‖e‖ = √((1/2)² + (−1)² + (1/2)²) = √6/2.
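The normal-equations solution of Exercise 7 can be reproduced with NumPy; `np.linalg.lstsq` computes the same least squares solution directly:

```python
import numpy as np

# Exercise 7: solve A^T A x = A^T b for the best-fit line y = a + bx
# through (1, 0), (2, 1), (3, 5).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.0, 1.0, 5.0])

x = np.linalg.solve(A.T @ A, A.T @ b)    # (a, b) = (-3, 5/2)
e = b - A @ x                             # least squares error vector

assert np.allclose(x, [-3, 2.5])
assert np.isclose(np.linalg.norm(e), np.sqrt(6) / 2)
# np.linalg.lstsq finds the same solution directly:
assert np.allclose(np.linalg.lstsq(A, b, rcond=None)[0], x)
```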
8. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 1; 1 2; 1 3],    x = (a, b)ᵀ,    b = (y₁, y₂, ..., yₙ)ᵀ = (6, 3, 1)ᵀ.

Since

    AᵀA = [1 1 1; 1 2 3][1 1; 1 2; 1 3] = [3 6; 6 14]  ⇒  (AᵀA)⁻¹ = [7/3 −1; −1 1/2],

we get

    x = (AᵀA)⁻¹Aᵀb = [7/3 −1; −1 1/2][1 1 1; 1 2 3](6, 3, 1)ᵀ = (25/3, −5/2)ᵀ.

Thus the least squares approximation is y = 25/3 − (5/2)x. To find the least squares error, compute

    e = b − Ax = (6, 3, 1)ᵀ − [1 1; 1 2; 1 3](25/3, −5/2)ᵀ = (1/6, −1/3, 1/6)ᵀ.

Then ‖e‖ = √((1/6)² + (−1/3)² + (1/6)²) = √6/6.
9. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 0; 1 1; 1 2],    x = (a, b)ᵀ,    b = (y₁, y₂, ..., yₙ)ᵀ = (4, 1, 0)ᵀ.

Since

    AᵀA = [1 1 1; 0 1 2][1 0; 1 1; 1 2] = [3 3; 3 5]  ⇒  (AᵀA)⁻¹ = [5/6 −1/2; −1/2 1/2],

we get

    x = (AᵀA)⁻¹Aᵀb = [5/6 −1/2; −1/2 1/2][1 1 1; 0 1 2](4, 1, 0)ᵀ = (11/3, −2)ᵀ.

Thus the least squares approximation is y = 11/3 − 2x. To find the least squares error, compute

    e = b − Ax = (4, 1, 0)ᵀ − [1 0; 1 1; 1 2](11/3, −2)ᵀ = (1/3, −2/3, 1/3)ᵀ.

Then ‖e‖ = √((1/3)² + (−2/3)² + (1/3)²) = √6/3.
10. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 0; 1 1; 1 2],    x = (a, b)ᵀ,    b = (y₁, y₂, ..., yₙ)ᵀ = (3, 3, 5)ᵀ.

Since

    AᵀA = [1 1 1; 0 1 2][1 0; 1 1; 1 2] = [3 3; 3 5]  ⇒  (AᵀA)⁻¹ = [5/6 −1/2; −1/2 1/2],

we get

    x = (AᵀA)⁻¹Aᵀb = [5/6 −1/2; −1/2 1/2][1 1 1; 0 1 2](3, 3, 5)ᵀ = (8/3, 1)ᵀ.

Thus the least squares approximation is y = 8/3 + x. To find the least squares error, compute

    e = b − Ax = (3, 3, 5)ᵀ − [1 0; 1 1; 1 2](8/3, 1)ᵀ = (1/3, −2/3, 1/3)ᵀ.

Then ‖e‖ = √((1/3)² + (−2/3)² + (1/3)²) = √6/3.
11. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where

    A = [1 x₁; 1 x₂; ...; 1 xₙ] = [1 −5; 1 0; 1 5; 1 10],    x = (a, b)ᵀ,    b = (y₁, ..., yₙ)ᵀ = (−1, 1, 2, 4)ᵀ.

Since

    AᵀA = [1 1 1 1; −5 0 5 10][1 −5; 1 0; 1 5; 1 10] = [4 10; 10 150]  ⇒  (AᵀA)⁻¹ = [3/10 −1/50; −1/50 1/125],

we get

    x = (AᵀA)⁻¹Aᵀb = [3/10 −1/50; −1/50 1/125][1 1 1 1; −5 0 5 10](−1, 1, 2, 4)ᵀ = (7/10, 8/25)ᵀ.

Thus the least squares approximation is y = 7/10 + (8/25)x. To find the least squares error, compute

    e = b − Ax = (−1, 1, 2, 4)ᵀ − [1 −5; 1 0; 1 5; 1 10](7/10, 8/25)ᵀ = (−1/10, 3/10, −3/10, 1/10)ᵀ.

Then ‖e‖ = √((−1/10)² + (3/10)² + (−3/10)² + (1/10)²) = √5/5.
12. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&x_1 \\ 1&x_2 \\ \vdots&\vdots \\ 1&x_n \end{bmatrix} = \begin{bmatrix} 1&-5 \\ 1&0 \\ 1&5 \\ 1&10 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 3\\3\\2\\0 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 1&1&1&1 \\ -5&0&5&10 \end{bmatrix}\begin{bmatrix} 1&-5 \\ 1&0 \\ 1&5 \\ 1&10 \end{bmatrix} = \begin{bmatrix} 4&10 \\ 10&150 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{3}{10} & -\frac{1}{50} \\ -\frac{1}{50} & \frac{1}{125} \end{bmatrix},$$
7.3. LEAST SQUARES APPROXIMATION
573
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{3}{10} & -\frac{1}{50} \\ -\frac{1}{50} & \frac{1}{125} \end{bmatrix}\begin{bmatrix} 1&1&1&1 \\ -5&0&5&10 \end{bmatrix}\begin{bmatrix} 3\\3\\2\\0 \end{bmatrix} = \begin{bmatrix} \frac52 \\ -\frac15 \end{bmatrix}.$$
Thus the least squares approximation is $y = \frac52 - \frac15 x$. To find the least squares error, compute
$$e = b - Ax = \begin{bmatrix} 3\\3\\2\\0 \end{bmatrix} - \begin{bmatrix} 1&-5 \\ 1&0 \\ 1&5 \\ 1&10 \end{bmatrix}\begin{bmatrix} \frac52 \\ -\frac15 \end{bmatrix} = \begin{bmatrix} -\frac12 \\ \frac12 \\ \frac12 \\ -\frac12 \end{bmatrix}.$$
Then $\|e\| = \sqrt{\left(-\frac12\right)^2 + \left(\frac12\right)^2 + \left(\frac12\right)^2 + \left(-\frac12\right)^2} = 1$.
13. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&x_1 \\ 1&x_2 \\ \vdots&\vdots \\ 1&x_n \end{bmatrix} = \begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1\\3\\4\\5\\7 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 1&1&1&1&1 \\ 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix} = \begin{bmatrix} 5&15 \\ 15&55 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{11}{10} & -\frac{3}{10} \\ -\frac{3}{10} & \frac{1}{10} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{11}{10} & -\frac{3}{10} \\ -\frac{3}{10} & \frac{1}{10} \end{bmatrix}\begin{bmatrix} 1&1&1&1&1 \\ 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 1\\3\\4\\5\\7 \end{bmatrix} = \begin{bmatrix} -\frac15 \\ \frac75 \end{bmatrix}.$$
Thus the least squares approximation is $y = -\frac15 + \frac75 x$. To find the least squares error, compute
$$e = b - Ax = \begin{bmatrix} 1\\3\\4\\5\\7 \end{bmatrix} - \begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix}\begin{bmatrix} -\frac15 \\ \frac75 \end{bmatrix} = \begin{bmatrix} -\frac15 \\ \frac25 \\ 0 \\ -\frac25 \\ \frac15 \end{bmatrix}.$$
Then $\|e\| = \sqrt{\left(-\frac15\right)^2 + \left(\frac25\right)^2 + 0^2 + \left(-\frac25\right)^2 + \left(\frac15\right)^2} = \frac{\sqrt{10}}{5}$.
14. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&x_1 \\ 1&x_2 \\ \vdots&\vdots \\ 1&x_n \end{bmatrix} = \begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 10\\8\\5\\3\\0 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 1&1&1&1&1 \\ 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix} = \begin{bmatrix} 5&15 \\ 15&55 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{11}{10} & -\frac{3}{10} \\ -\frac{3}{10} & \frac{1}{10} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{11}{10} & -\frac{3}{10} \\ -\frac{3}{10} & \frac{1}{10} \end{bmatrix}\begin{bmatrix} 1&1&1&1&1 \\ 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 10\\8\\5\\3\\0 \end{bmatrix} = \begin{bmatrix} \frac{127}{10} \\ -\frac52 \end{bmatrix}.$$
Thus the least squares approximation is $y = \frac{127}{10} - \frac52 x$. To find the least squares error, compute
$$e = b - Ax = \begin{bmatrix} 10\\8\\5\\3\\0 \end{bmatrix} - \begin{bmatrix} 1&1 \\ 1&2 \\ 1&3 \\ 1&4 \\ 1&5 \end{bmatrix}\begin{bmatrix} \frac{127}{10} \\ -\frac52 \end{bmatrix} = \begin{bmatrix} -\frac15 \\ \frac{3}{10} \\ -\frac15 \\ \frac{3}{10} \\ -\frac15 \end{bmatrix}.$$
Then $\|e\| = \sqrt{\left(-\frac15\right)^2 + \left(\frac{3}{10}\right)^2 + \left(-\frac15\right)^2 + \left(\frac{3}{10}\right)^2 + \left(-\frac15\right)^2} = \frac{\sqrt{30}}{10}$.
15. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx² = y to get
a + b + c = 1
a + 2b + 4c = −2
a + 3b + 9c = 3
a + 4b + 16c = 4,
which in matrix form is
$$\begin{bmatrix} 1&1&1 \\ 1&2&4 \\ 1&3&9 \\ 1&4&16 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 1\\-2\\3\\4 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 4 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 1&1&1&1 \\ 1&2&3&4 \\ 1&4&9&16 \end{bmatrix}\begin{bmatrix} 1&1&1 \\ 1&2&4 \\ 1&3&9 \\ 1&4&16 \end{bmatrix} = \begin{bmatrix} 4&10&30 \\ 10&30&100 \\ 30&100&354 \end{bmatrix} \Rightarrow (A^TA)^{-1} = \begin{bmatrix} \frac{31}{4} & -\frac{27}{4} & \frac54 \\ -\frac{27}{4} & \frac{129}{20} & -\frac54 \\ \frac54 & -\frac54 & \frac14 \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{31}{4} & -\frac{27}{4} & \frac54 \\ -\frac{27}{4} & \frac{129}{20} & -\frac54 \\ \frac54 & -\frac54 & \frac14 \end{bmatrix}\begin{bmatrix} 1&1&1&1 \\ 1&2&3&4 \\ 1&4&9&16 \end{bmatrix}\begin{bmatrix} 1\\-2\\3\\4 \end{bmatrix} = \begin{bmatrix} 3 \\ -\frac{18}{5} \\ 1 \end{bmatrix}.$$
Thus the least squares approximation is $y = 3 - \frac{18}{5}x + x^2$.
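The same normal-equations method extends directly to quadratic fits; the sketch below (a numerical check, not part of the original solution, assuming NumPy) reproduces the result of Exercise 15.

```python
import numpy as np

# Exercise 15 data: fit y = a + b*x + c*x^2 to (1, 1), (2, -2), (3, 3), (4, 4)
xs = np.array([1.0, 2, 3, 4])
ys = np.array([1.0, -2, 3, 4])

# Design matrix with columns 1, x, x^2
A = np.column_stack([np.ones_like(xs), xs, xs**2])

# Solve the normal equations A^T A x = A^T b
coef = np.linalg.solve(A.T @ A, A.T @ ys)
print(coef)    # close to [3, -18/5, 1]
```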
16. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx² = y to get
a + b + c = 6
a + 2b + 4c = 0
a + 3b + 9c = 0
a + 4b + 16c = 2,
which in matrix form is
$$\begin{bmatrix} 1&1&1 \\ 1&2&4 \\ 1&3&9 \\ 1&4&16 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 6\\0\\0\\2 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 4 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 1&1&1&1 \\ 1&2&3&4 \\ 1&4&9&16 \end{bmatrix}\begin{bmatrix} 1&1&1 \\ 1&2&4 \\ 1&3&9 \\ 1&4&16 \end{bmatrix} = \begin{bmatrix} 4&10&30 \\ 10&30&100 \\ 30&100&354 \end{bmatrix} \Rightarrow (A^TA)^{-1} = \begin{bmatrix} \frac{31}{4} & -\frac{27}{4} & \frac54 \\ -\frac{27}{4} & \frac{129}{20} & -\frac54 \\ \frac54 & -\frac54 & \frac14 \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{31}{4} & -\frac{27}{4} & \frac54 \\ -\frac{27}{4} & \frac{129}{20} & -\frac54 \\ \frac54 & -\frac54 & \frac14 \end{bmatrix}\begin{bmatrix} 1&1&1&1 \\ 1&2&3&4 \\ 1&4&9&16 \end{bmatrix}\begin{bmatrix} 6\\0\\0\\2 \end{bmatrix} = \begin{bmatrix} 15 \\ -\frac{56}{5} \\ 2 \end{bmatrix}.$$
Thus the least squares approximation is $y = 15 - \frac{56}{5}x + 2x^2$.
17. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx² = y to get
a − 2b + 4c = 4
a − b + c = 7
a = 3
a + b + c = 0
a + 2b + 4c = −1,
which in matrix form is
$$\begin{bmatrix} 1&-2&4 \\ 1&-1&1 \\ 1&0&0 \\ 1&1&1 \\ 1&2&4 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 4\\7\\3\\0\\-1 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 1&1&1&1&1 \\ -2&-1&0&1&2 \\ 4&1&0&1&4 \end{bmatrix}\begin{bmatrix} 1&-2&4 \\ 1&-1&1 \\ 1&0&0 \\ 1&1&1 \\ 1&2&4 \end{bmatrix} = \begin{bmatrix} 5&0&10 \\ 0&10&0 \\ 10&0&34 \end{bmatrix} \Rightarrow (A^TA)^{-1} = \begin{bmatrix} \frac{17}{35} & 0 & -\frac17 \\ 0 & \frac{1}{10} & 0 \\ -\frac17 & 0 & \frac{1}{14} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{17}{35} & 0 & -\frac17 \\ 0 & \frac{1}{10} & 0 \\ -\frac17 & 0 & \frac{1}{14} \end{bmatrix}\begin{bmatrix} 1&1&1&1&1 \\ -2&-1&0&1&2 \\ 4&1&0&1&4 \end{bmatrix}\begin{bmatrix} 4\\7\\3\\0\\-1 \end{bmatrix} = \begin{bmatrix} \frac{18}{5} \\ -\frac{17}{10} \\ -\frac12 \end{bmatrix}.$$
Thus the least squares approximation is $y = \frac{18}{5} - \frac{17}{10}x - \frac12 x^2$.
18. Following the method of Example 7.28, we substitute the given values of x into the quadratic equation a + bx + cx² = y to get
a − 2b + 4c = 0
a − b + c = −11
a = −10
a + b + c = −9
a + 2b + 4c = 8,
which in matrix form is
$$\begin{bmatrix} 1&-2&4 \\ 1&-1&1 \\ 1&0&0 \\ 1&1&1 \\ 1&2&4 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 0\\-11\\-10\\-9\\8 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 1&1&1&1&1 \\ -2&-1&0&1&2 \\ 4&1&0&1&4 \end{bmatrix}\begin{bmatrix} 1&-2&4 \\ 1&-1&1 \\ 1&0&0 \\ 1&1&1 \\ 1&2&4 \end{bmatrix} = \begin{bmatrix} 5&0&10 \\ 0&10&0 \\ 10&0&34 \end{bmatrix} \Rightarrow (A^TA)^{-1} = \begin{bmatrix} \frac{17}{35} & 0 & -\frac17 \\ 0 & \frac{1}{10} & 0 \\ -\frac17 & 0 & \frac{1}{14} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{17}{35} & 0 & -\frac17 \\ 0 & \frac{1}{10} & 0 \\ -\frac17 & 0 & \frac{1}{14} \end{bmatrix}\begin{bmatrix} 1&1&1&1&1 \\ -2&-1&0&1&2 \\ 4&1&0&1&4 \end{bmatrix}\begin{bmatrix} 0\\-11\\-10\\-9\\8 \end{bmatrix} = \begin{bmatrix} -\frac{62}{5} \\ \frac95 \\ 4 \end{bmatrix}.$$
Thus the least squares approximation is $y = -\frac{62}{5} + \frac95 x + 4x^2$.
19. The normal equation is $A^TAx = A^Tb$. We have
$$A^TA = \begin{bmatrix} 3&1&1 \\ 1&1&2 \end{bmatrix}\begin{bmatrix} 3&1 \\ 1&1 \\ 1&2 \end{bmatrix} = \begin{bmatrix} 11&6 \\ 6&6 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 3&1&1 \\ 1&1&2 \end{bmatrix}\begin{bmatrix} 1\\1\\1 \end{bmatrix} = \begin{bmatrix} 5\\4 \end{bmatrix},$$
so that the normal form of this system is
$$\begin{bmatrix} 11&6 \\ 6&6 \end{bmatrix}x = \begin{bmatrix} 5\\4 \end{bmatrix}.$$
Thus
$$x = \begin{bmatrix} 11&6 \\ 6&6 \end{bmatrix}^{-1}\begin{bmatrix} 5\\4 \end{bmatrix} = \begin{bmatrix} \frac15 & -\frac15 \\ -\frac15 & \frac{11}{30} \end{bmatrix}\begin{bmatrix} 5\\4 \end{bmatrix} = \begin{bmatrix} \frac15 \\ \frac{7}{15} \end{bmatrix}$$
is the least squares solution.
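The normal system above can be verified numerically; this sketch (not part of the original manual, assuming NumPy) solves it directly from the coefficient matrix and right-hand side computed in the solution.

```python
import numpy as np

# Normal equations of Exercise 19: (A^T A) x = A^T b
ATA = np.array([[11.0, 6], [6, 6]])
ATb = np.array([5.0, 4])

x = np.linalg.solve(ATA, ATb)
print(x)    # close to [1/5, 7/15]
```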
20. The normal equation is $A^TAx = A^Tb$. We have
$$A^TA = \begin{bmatrix} 1&3&2 \\ -2&-2&1 \end{bmatrix}\begin{bmatrix} 1&-2 \\ 3&-2 \\ 2&1 \end{bmatrix} = \begin{bmatrix} 14&-6 \\ -6&9 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 1&3&2 \\ -2&-2&1 \end{bmatrix}\begin{bmatrix} 1\\1\\1 \end{bmatrix} = \begin{bmatrix} 6\\-3 \end{bmatrix},$$
so that the normal form of this system is
$$\begin{bmatrix} 14&-6 \\ -6&9 \end{bmatrix}x = \begin{bmatrix} 6\\-3 \end{bmatrix}.$$
Thus
$$x = \begin{bmatrix} 14&-6 \\ -6&9 \end{bmatrix}^{-1}\begin{bmatrix} 6\\-3 \end{bmatrix} = \begin{bmatrix} \frac{1}{10} & \frac{1}{15} \\ \frac{1}{15} & \frac{7}{45} \end{bmatrix}\begin{bmatrix} 6\\-3 \end{bmatrix} = \begin{bmatrix} \frac25 \\ -\frac{1}{15} \end{bmatrix}$$
is the least squares solution.
21. The normal equation is $A^TAx = A^Tb$. We have
$$A^TA = \begin{bmatrix} 1&0&2&3 \\ -2&-3&5&0 \end{bmatrix}\begin{bmatrix} 1&-2 \\ 0&-3 \\ 2&5 \\ 3&0 \end{bmatrix} = \begin{bmatrix} 14&8 \\ 8&38 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 1&0&2&3 \\ -2&-3&5&0 \end{bmatrix}\begin{bmatrix} 4\\1\\-2\\4 \end{bmatrix} = \begin{bmatrix} 12\\-21 \end{bmatrix},$$
so that the normal form of this system is
$$\begin{bmatrix} 14&8 \\ 8&38 \end{bmatrix}x = \begin{bmatrix} 12\\-21 \end{bmatrix}.$$
Thus
$$x = \begin{bmatrix} 14&8 \\ 8&38 \end{bmatrix}^{-1}\begin{bmatrix} 12\\-21 \end{bmatrix} = \begin{bmatrix} \frac{19}{234} & -\frac{2}{117} \\ -\frac{2}{117} & \frac{7}{234} \end{bmatrix}\begin{bmatrix} 12\\-21 \end{bmatrix} = \begin{bmatrix} \frac43 \\ -\frac56 \end{bmatrix}$$
is the least squares solution.
22. The normal equation is $A^TAx = A^Tb$. We have
$$A^TA = \begin{bmatrix} 1&2&-1&0 \\ 0&-1&1&2 \end{bmatrix}\begin{bmatrix} 1&0 \\ 2&-1 \\ -1&1 \\ 0&2 \end{bmatrix} = \begin{bmatrix} 6&-3 \\ -3&6 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 1&2&-1&0 \\ 0&-1&1&2 \end{bmatrix}\begin{bmatrix} 1\\5\\-1\\2 \end{bmatrix} = \begin{bmatrix} 12\\-2 \end{bmatrix},$$
so that the normal form of this system is
$$\begin{bmatrix} 6&-3 \\ -3&6 \end{bmatrix}x = \begin{bmatrix} 12\\-2 \end{bmatrix}.$$
Thus
$$x = \begin{bmatrix} 6&-3 \\ -3&6 \end{bmatrix}^{-1}\begin{bmatrix} 12\\-2 \end{bmatrix} = \begin{bmatrix} \frac29 & \frac19 \\ \frac19 & \frac29 \end{bmatrix}\begin{bmatrix} 12\\-2 \end{bmatrix} = \begin{bmatrix} \frac{22}{9} \\ \frac89 \end{bmatrix}$$
is the least squares solution.
23. Instead of trying to invert $A^TA$, we instead use an alternative method, row-reducing $\left[A^TA \mid A^Tb\right]$. We will see that $A^TA$ does not reduce to the identity matrix, so that the system does not have a unique solution. We get
$$A^TA = \begin{bmatrix} 3&0&2&1 \\ 0&3&-2&-1 \\ 2&-2&3&2 \\ 1&-1&2&2 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 2\\-5\\3\\-1 \end{bmatrix}.$$
Row-reducing, we have
$$\left[\begin{array}{cccc|c} 3&0&2&1&2 \\ 0&3&-2&-1&-5 \\ 2&-2&3&2&3 \\ 1&-1&2&2&-1 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1&0&0&-1&4 \\ 0&1&0&1&-5 \\ 0&0&1&2&-5 \\ 0&0&0&0&0 \end{array}\right].$$
So $A^TA$ is not invertible, and therefore there is not a unique solution. The general solution is
$$x = \begin{bmatrix} 4+t \\ -5-t \\ -5-2t \\ t \end{bmatrix} = \begin{bmatrix} 4\\-5\\-5\\0 \end{bmatrix} + t\begin{bmatrix} 1\\-1\\-2\\1 \end{bmatrix}.$$
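The rank deficiency in Exercise 23 can be confirmed numerically. This sketch (a check, not part of the original solution, assuming NumPy) uses the normal-equation matrices computed above and verifies that every member of the one-parameter family solves the system.

```python
import numpy as np

# Exercise 23: A^T A is singular, so the normal equations have
# infinitely many least squares solutions x = (4+t, -5-t, -5-2t, t).
ATA = np.array([[3.0, 0, 2, 1],
                [0, 3, -2, -1],
                [2, -2, 3, 2],
                [1, -1, 2, 2]])
ATb = np.array([2.0, -5, 3, -1])

print(np.linalg.matrix_rank(ATA))   # 3, not 4, so A^T A is not invertible

# Every member of the family satisfies the normal equations
for t in (-1.0, 0.0, 2.5):
    x = np.array([4 + t, -5 - t, -5 - 2*t, t])
    assert np.allclose(ATA @ x, ATb)
```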
24. Instead of trying to invert $A^TA$, we instead use an alternative method, row-reducing $\left[A^TA \mid A^Tb\right]$. We will see that $A^TA$ does not reduce to the identity matrix, so that the system does not have a unique solution. We get
$$A^TA = \begin{bmatrix} 3&0&3&0 \\ 0&3&1&2 \\ 3&1&4&0 \\ 0&2&0&2 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 3\\3\\8\\-2 \end{bmatrix}.$$
Row-reducing, we have
$$\left[\begin{array}{cccc|c} 3&0&3&0&3 \\ 0&3&1&2&3 \\ 3&1&4&0&8 \\ 0&2&0&2&-2 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1&0&0&1&-5 \\ 0&1&0&1&-1 \\ 0&0&1&-1&6 \\ 0&0&0&0&0 \end{array}\right].$$
So $A^TA$ is not invertible, and therefore there is not a unique solution. The general solution is
$$x = \begin{bmatrix} -5-t \\ -1-t \\ 6+t \\ t \end{bmatrix} = \begin{bmatrix} -5\\-1\\6\\0 \end{bmatrix} + t\begin{bmatrix} -1\\-1\\1\\1 \end{bmatrix}.$$
25. Solving these equations corresponds to solving Ax = b, where
$$A = \begin{bmatrix} 1&1&-1 \\ 0&-1&2 \\ 3&2&-1 \\ -1&0&1 \end{bmatrix}, \qquad b = \begin{bmatrix} 2\\6\\11\\0 \end{bmatrix}.$$
To solve, we row-reduce $\left[A^TA \mid A^Tb\right]$:
$$A^TA = \begin{bmatrix} 1&0&3&-1 \\ 1&-1&2&0 \\ -1&2&-1&1 \end{bmatrix}\begin{bmatrix} 1&1&-1 \\ 0&-1&2 \\ 3&2&-1 \\ -1&0&1 \end{bmatrix} = \begin{bmatrix} 11&7&-5 \\ 7&6&-5 \\ -5&-5&7 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 1&0&3&-1 \\ 1&-1&2&0 \\ -1&2&-1&1 \end{bmatrix}\begin{bmatrix} 2\\6\\11\\0 \end{bmatrix} = \begin{bmatrix} 35\\18\\-1 \end{bmatrix}.$$
Row-reducing, we have
$$\left[\begin{array}{ccc|c} 11&7&-5&35 \\ 7&6&-5&18 \\ -5&-5&7&-1 \end{array}\right] \longrightarrow \left[\begin{array}{ccc|c} 1&0&0&\frac{42}{11} \\ 0&1&0&\frac{19}{11} \\ 0&0&1&\frac{42}{11} \end{array}\right].$$
Thus the best approximation is
$$x = \begin{bmatrix} x\\y\\z \end{bmatrix} = \begin{bmatrix} \frac{42}{11} \\ \frac{19}{11} \\ \frac{42}{11} \end{bmatrix}.$$
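For an inconsistent system like the one in Exercise 25, the least squares solution can be checked numerically; this sketch (not part of the original manual, assuming NumPy) solves the normal equations for the reconstructed A and b.

```python
import numpy as np

# Exercise 25: the 4x3 system Ax = b is inconsistent,
# so we compute the least squares solution instead.
A = np.array([[1.0, 1, -1],
              [0, -1, 2],
              [3, 2, -1],
              [-1, 0, 1]])
b = np.array([2.0, 6, 11, 0])

x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)    # close to [42/11, 19/11, 42/11]
```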
26. Solving these equations corresponds to solving Ax = b, where
$$A = \begin{bmatrix} 2&3&1 \\ 1&1&1 \\ -1&1&-1 \\ 0&2&1 \end{bmatrix}, \qquad b = \begin{bmatrix} 21\\7\\14\\0 \end{bmatrix}.$$
To solve, we row-reduce $\left[A^TA \mid A^Tb\right]$:
$$A^TA = \begin{bmatrix} 2&1&-1&0 \\ 3&1&1&2 \\ 1&1&-1&1 \end{bmatrix}\begin{bmatrix} 2&3&1 \\ 1&1&1 \\ -1&1&-1 \\ 0&2&1 \end{bmatrix} = \begin{bmatrix} 6&6&4 \\ 6&15&5 \\ 4&5&4 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 2&1&-1&0 \\ 3&1&1&2 \\ 1&1&-1&1 \end{bmatrix}\begin{bmatrix} 21\\7\\14\\0 \end{bmatrix} = \begin{bmatrix} 35\\84\\14 \end{bmatrix}.$$
Row-reducing, we have
$$\left[\begin{array}{ccc|c} 6&6&4&35 \\ 6&15&5&84 \\ 4&5&4&14 \end{array}\right] \longrightarrow \left[\begin{array}{ccc|c} 1&0&0&\frac{469}{66} \\ 0&1&0&\frac{224}{33} \\ 0&0&1&-\frac{133}{11} \end{array}\right].$$
Thus the best approximation is
$$x = \begin{bmatrix} x\\y\\z \end{bmatrix} = \begin{bmatrix} \frac{469}{66} \\ \frac{224}{33} \\ -\frac{133}{11} \end{bmatrix}.$$
27. With Ax = QRx = b, we get $Rx = Q^Tb$. Back-substitution in this upper triangular system gives the least squares solution
$$x = \begin{bmatrix} \frac53 \\ -2 \end{bmatrix}.$$
28. With Ax = QRx = b, we get $Rx = Q^Tb$, or
$$\begin{bmatrix} \sqrt6 & -\frac{\sqrt6}{2} \\ 0 & \frac{1}{\sqrt2} \end{bmatrix}x = \begin{bmatrix} \frac{1}{\sqrt6} & \frac{2}{\sqrt6} & -\frac{1}{\sqrt6} \\ \frac{1}{\sqrt2} & 0 & \frac{1}{\sqrt2} \end{bmatrix}\begin{bmatrix} 1\\1\\1 \end{bmatrix} = \begin{bmatrix} \frac{2}{\sqrt6} \\ \sqrt2 \end{bmatrix}.$$
Back-substitution gives the solution
$$x = \begin{bmatrix} \frac43 \\ 2 \end{bmatrix}.$$
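The QR route in Exercise 28 can be checked numerically. In the sketch below (not part of the original solution, assuming NumPy), the matrix A = QR is recovered by multiplying the factors shown above; that product, [[1, 0], [2, -1], [-1, 1]], is an inference from the factors rather than a value stated in this excerpt.

```python
import numpy as np

# Q and R as in the solution to Exercise 28
Q = np.array([[1/np.sqrt(6), 1/np.sqrt(2)],
              [2/np.sqrt(6), 0],
              [-1/np.sqrt(6), 1/np.sqrt(2)]])
R = np.array([[np.sqrt(6), -np.sqrt(6)/2],
              [0, 1/np.sqrt(2)]])
bvec = np.array([1.0, 1, 1])

# Back-substitution on the upper triangular system R x = Q^T b
rhs = Q.T @ bvec
x2 = rhs[1] / R[1, 1]
x1 = (rhs[0] - R[0, 1] * x2) / R[0, 0]
print(x1, x2)    # close to 4/3 and 2

# Agrees with ordinary least squares applied to A = QR
x_ref, *_ = np.linalg.lstsq(Q @ R, bvec, rcond=None)
```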
29. We are looking for a line y = a + bx that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&x_1 \\ \vdots&\vdots \\ 1&x_n \end{bmatrix} = \begin{bmatrix} 1&20 \\ 1&40 \\ 1&48 \\ 1&60 \\ 1&80 \\ 1&100 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 14.5\\31\\36\\45.5\\59\\73.5 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 6&348 \\ 348&24304 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{1519}{1545} & -\frac{29}{2060} \\ -\frac{29}{2060} & \frac{1}{4120} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb \approx \begin{bmatrix} 0.918 \\ 0.730 \end{bmatrix}.$$
Thus the least squares approximation is y = 0.918 + 0.730x.
30. (a) We are looking for a line L = a + bF that best fits the given points. Using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&2 \\ 1&4 \\ 1&6 \\ 1&8 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} 7.4\\9.6\\11.5\\13.6 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 4&20 \\ 20&120 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac32 & -\frac14 \\ -\frac14 & \frac{1}{20} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac32 & -\frac14 \\ -\frac14 & \frac{1}{20} \end{bmatrix}\begin{bmatrix} 1&1&1&1 \\ 2&4&6&8 \end{bmatrix}\begin{bmatrix} 7.4\\9.6\\11.5\\13.6 \end{bmatrix} = \begin{bmatrix} 5.4 \\ 1.025 \end{bmatrix}.$$
Thus the least squares approximation is L = 5.4 + 1.025F. Here a = 5.4 represents the length of the spring when no force is applied (F = 0).
(b) When a weight of 5 ounces is attached, the spring length will be about 5.4 + 1.025 · 5 = 10.525 inches.
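The spring fit and its prediction are easy to reproduce; the following sketch (a numerical check, not part of the original solution, assuming NumPy) does both.

```python
import numpy as np

# Exercise 30 data: spring length L (inches) under force F (ounces)
F = np.array([2.0, 4, 6, 8])
L = np.array([7.4, 9.6, 11.5, 13.6])

A = np.column_stack([np.ones_like(F), F])
a, slope = np.linalg.solve(A.T @ A, A.T @ L)
print(a, slope)        # 5.4 and 1.025
print(a + slope * 5)   # predicted length for a 5 ounce weight: 10.525
```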
31. (a) We are looking for a line y = a + bx that best fits the given points. Letting x = 0 correspond to 1920 and using the least squares approximation theorem, we want a solution to Ax = b, where
$$A = \begin{bmatrix} 1&0 \\ 1&10 \\ 1&20 \\ 1&30 \\ 1&40 \\ 1&50 \\ 1&60 \\ 1&70 \end{bmatrix}, \quad x = \begin{bmatrix} a\\b \end{bmatrix}, \quad b = \begin{bmatrix} 54.1\\59.7\\62.9\\68.2\\69.7\\70.8\\73.7\\75.4 \end{bmatrix}.$$
Since
$$A^TA = \begin{bmatrix} 8&280 \\ 280&14000 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{5}{12} & -\frac{1}{120} \\ -\frac{1}{120} & \frac{1}{4200} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb \approx \begin{bmatrix} 56.63 \\ 0.291 \end{bmatrix}.$$
Thus the least squares approximation is y = 56.63 + 0.291x. The model predicts that the life expectancy of someone born in 2000 is y(80) = 56.63 + 0.291 · 80 = 79.91 years.
(b) To see how good the model is, compute the least squares error:
$$e = b - Ax \approx \begin{bmatrix} -2.53 \\ 0.16 \\ 0.45 \\ 2.84 \\ 1.43 \\ -0.38 \\ -0.39 \\ -1.6 \end{bmatrix}, \quad\text{so } \|e\| \approx 4.43.$$
The model is not bad, with an error of under 10% of the initial value.
32. (a) Following the method of Example 7.28, we substitute the given values of t into the equation a + bt + ct² = y to get
a + 0.5b + 0.25c = 11
a + b + c = 17
a + 1.5b + 2.25c = 21
a + 2b + 4c = 23
a + 3b + 9c = 18,
which in matrix form is
$$\begin{bmatrix} 1&0.5&0.25 \\ 1&1&1 \\ 1&1.5&2.25 \\ 1&2&4 \\ 1&3&9 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 11\\17\\21\\23\\18 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 5&8&16.5 \\ 8&16.5&39.5 \\ 16.5&39.5&103.125 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} \approx \begin{bmatrix} 3.330&-4.082&1.031 \\ -4.082&5.735&-1.543 \\ 1.031&-1.543&0.436 \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb \approx \begin{bmatrix} 1.918 \\ 20.306 \\ -4.972 \end{bmatrix}.$$
Thus the least squares approximation is y = 1.918 + 20.306t − 4.972t².
(b) The height at which the object was released is y(0) ≈ 1.918 m. Its initial velocity was y′(0) ≈ 20.306 m/sec, and its acceleration due to gravity is y″(t) ≈ −9.944 m/s².
(c) The object hits the ground when y(t) = 0. Solving with the quadratic formula gives t ≈ −0.092 and t ≈ 4.176 seconds, so the object hits the ground after about 4.176 seconds.
33. (a) If p(t) = ce^{kt}, then p(0) = 150 means that c = 150, so that p(t) = 150e^{kt}. Taking logs gives ln p(t) = ln 150 + kt ≈ 5.011 + kt. We wish to find the value of k which best fits the data. Let t = 0 correspond to 1950 and t = 1 to 1960. Then kt = ln p(t) − ln 150 = ln(p(t)/150), so
$$A = \begin{bmatrix} 1\\2\\3\\4\\5 \end{bmatrix}, \quad x = \begin{bmatrix} k \end{bmatrix}, \quad b = \begin{bmatrix} \ln\frac{179}{150} \\ \ln\frac{203}{150} \\ \ln\frac{227}{150} \\ \ln\frac{250}{150} \\ \ln\frac{281}{150} \end{bmatrix} \approx \begin{bmatrix} 0.177\\0.303\\0.414\\0.511\\0.628 \end{bmatrix}.$$
Since $A^TA = 55$ and thus $(A^TA)^{-1} = \frac{1}{55}$, we get
$$x = (A^TA)^{-1}A^Tb = \frac{1}{55}\begin{bmatrix} 1&2&3&4&5 \end{bmatrix}\begin{bmatrix} 0.177\\0.303\\0.414\\0.511\\0.628 \end{bmatrix} \approx 0.131.$$
Thus the least squares approximation is p(t) = 150e^{0.131t}.
(b) 2010 corresponds to t = 6, so the predicted population is p(6) = 150e^{0.131·6} ≈ 329.19 million.
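Because the model has a single parameter, the normal equation here is scalar; the following sketch (a numerical check, not part of the original manual, assuming NumPy) reproduces k and the 2010 prediction.

```python
import numpy as np

# Exercise 33: fit p(t) = 150 e^{kt}; taking logs gives k * t_i = ln(p_i / 150)
t = np.array([1.0, 2, 3, 4, 5])
p = np.array([179.0, 203, 227, 250, 281])
y = np.log(p / 150)

# The normal equation is the scalar equation (t . t) k = t . y
k = (t @ y) / (t @ t)
print(k)                     # close to 0.131
print(150 * np.exp(6 * k))   # prediction for t = 6: about 329 million
```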
34. Let t = 0 correspond to 1970.
(a) For a quadratic approximation, follow the method of Example 7.28 and substitute the given values of t into the equation a + bt + ct² = y to get
a = 29.3
a + 5b + 25c = 44.7
a + 10b + 100c = 143.8
a + 15b + 225c = 371.6
a + 20b + 400c = 597.5
a + 25b + 625c = 1110.8
a + 30b + 900c = 1895.6
a + 35b + 1225c = 2476.6,
which in matrix form is
$$\begin{bmatrix} 1&0&0 \\ 1&5&25 \\ 1&10&100 \\ 1&15&225 \\ 1&20&400 \\ 1&25&625 \\ 1&30&900 \\ 1&35&1225 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 29.3\\44.7\\143.8\\371.6\\597.5\\1110.8\\1895.6\\2476.6 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 8 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 8&140&3500 \\ 140&3500&98000 \\ 3500&98000&2922500 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{17}{24} & -\frac{3}{40} & \frac{1}{600} \\ -\frac{3}{40} & \frac{53}{4200} & -\frac{1}{3000} \\ \frac{1}{600} & -\frac{1}{3000} & \frac{1}{105000} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb \approx \begin{bmatrix} 57.063 \\ -20.335 \\ 2.589 \end{bmatrix}.$$
Thus the least squares approximation is y = 57.063 − 20.335t + 2.589t².
(b) With y(t) = ce^{kt}, since y(0) = 29.3 we have c = 29.3, so that y(t) = 29.3e^{kt}. Taking logs gives ln y(t) = ln 29.3 + kt, or kt = ln(y(t)/29.3). Then
$$A = \begin{bmatrix} 0\\5\\10\\15\\20\\25\\30\\35 \end{bmatrix}, \quad b = \begin{bmatrix} \ln\frac{29.3}{29.3} \\ \ln\frac{44.7}{29.3} \\ \ln\frac{143.8}{29.3} \\ \ln\frac{371.6}{29.3} \\ \ln\frac{597.5}{29.3} \\ \ln\frac{1110.8}{29.3} \\ \ln\frac{1895.6}{29.3} \\ \ln\frac{2476.6}{29.3} \end{bmatrix} \approx \begin{bmatrix} 0\\0.422\\1.591\\2.540\\3.015\\3.635\\4.170\\4.437 \end{bmatrix}.$$
Then $A^TA = 3500$ and $A^Tb \approx 487.696$, so that $x = \frac{487.696}{3500} \approx 0.1393$. Thus y(t) = 29.3e^{0.1393t}.
(c) The models produce the following predictions:
Year         1970   1975   1980   1985   1990    1995    2000    2005
Actual       29.3   44.7  143.8  371.6  597.5  1110.8  1895.6  2476.6
Quadratic    57.1   20.1  112.6  334.6  686.0  1166.8  1777.1  2516.9
Exponential  29.3   58.8  118.0  236.8  475.1   953.5  1913.3  3839.4
The quadratic appears to be a better fit overall, even though it does a poor job in the first 15 years or so. The exponential model simply grows too fast.
(d) Using the quadratic approximation, we get for 2010 and 2015
y(40) = 57.063 − 20.335 · 40 + 2.589 · 40² = 3386.1
y(45) = 57.063 − 20.335 · 45 + 2.589 · 45² = 4384.7.
35. Using Example 7.29, taking logs gives
$$A = \begin{bmatrix} t_1\\t_2\\t_3 \end{bmatrix} = \begin{bmatrix} 30\\60\\90 \end{bmatrix}, \qquad b = \begin{bmatrix} kt_1\\kt_2\\kt_3 \end{bmatrix} = \begin{bmatrix} \ln\frac{172}{200} \\ \ln\frac{148}{200} \\ \ln\frac{128}{200} \end{bmatrix}.$$
Then $A^TA = 12600$ and $A^Tb \approx -62.8$, so that $x \approx -\frac{62.8}{12600} \approx -0.005$. Thus k = −0.005 and p(t) = ce^{−0.005t}. To compute the half-life, we want to solve $\frac12 c = ce^{-0.005t}$, so that $-0.005t = \ln\frac12 = -\ln 2$; then $t = \frac{\ln 2}{0.005} \approx 139$ days.
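The decay constant and half-life can be checked numerically; this sketch (not part of the original solution, assuming NumPy) repeats the scalar least squares computation.

```python
import numpy as np

# Exercise 35: k * t_i = ln(p_i / 200) at t = 30, 60, 90 days
t = np.array([30.0, 60, 90])
p = np.array([172.0, 148, 128])
y = np.log(p / 200)

k = (t @ y) / (t @ t)        # scalar normal equation
half_life = np.log(2) / (-k)
print(k)                     # close to -0.005
print(half_life)             # close to 139 days
```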
36. We want to find the best fit z = a + bx + cy to the given data; the data give the equations
a − 4c = 0
a + 5b = 0
a + 4b − c = 1
a + b − 3c = 1
a − b − 5c = −2,
which in matrix form is
$$\begin{bmatrix} 1&0&-4 \\ 1&5&0 \\ 1&4&-1 \\ 1&1&-3 \\ 1&-1&-5 \end{bmatrix}\begin{bmatrix} a\\b\\c \end{bmatrix} = \begin{bmatrix} 0\\0\\1\\1\\-2 \end{bmatrix}.$$
Thus we want the least squares approximation to Ax = b, where A is the 5 × 3 matrix on the left and b is the matrix on the right in the above equation. Since
$$A^TA = \begin{bmatrix} 5&9&-13 \\ 9&43&-2 \\ -13&-2&51 \end{bmatrix} \quad\Rightarrow\quad (A^TA)^{-1} = \begin{bmatrix} \frac{2189}{15} & -\frac{433}{15} & \frac{541}{15} \\ -\frac{433}{15} & \frac{86}{15} & -\frac{107}{15} \\ \frac{541}{15} & -\frac{107}{15} & \frac{134}{15} \end{bmatrix},$$
we get
$$x = (A^TA)^{-1}A^Tb = \begin{bmatrix} \frac{2189}{15} & -\frac{433}{15} & \frac{541}{15} \\ -\frac{433}{15} & \frac{86}{15} & -\frac{107}{15} \\ \frac{541}{15} & -\frac{107}{15} & \frac{134}{15} \end{bmatrix}\begin{bmatrix} 0\\7\\6 \end{bmatrix} = \begin{bmatrix} \frac{43}{3} \\ -\frac83 \\ \frac{11}{3} \end{bmatrix}.$$
Thus the least squares approximation is $z = \frac13(43 - 8x + 11y)$.
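The plane fit generalizes the line fits earlier in the section; this sketch (a numerical check, not part of the original solution, assuming NumPy and the reconstructed data above) reproduces the coefficients.

```python
import numpy as np

# Exercise 36: best-fit plane z = a + b*x + c*y through the five points
A = np.array([[1.0, 0, -4],
              [1, 5, 0],
              [1, 4, -1],
              [1, 1, -3],
              [1, -1, -5]])
z = np.array([0.0, 0, 1, 1, -2])

coef = np.linalg.solve(A.T @ A, A.T @ z)
print(coef)    # close to [43/3, -8/3, 11/3]
```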
37. We have
$$A = \begin{bmatrix} 1\\1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 2 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \tfrac12 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1\\1 \end{bmatrix}\begin{bmatrix} \tfrac12 \end{bmatrix}\begin{bmatrix} 1&1 \end{bmatrix} = \begin{bmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{bmatrix}\begin{bmatrix} 3\\4 \end{bmatrix} = \begin{bmatrix} \frac72 \\ \frac72 \end{bmatrix}.$$
38. We have
$$A = \begin{bmatrix} 1\\-2 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&-2 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 5 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \tfrac15 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1\\-2 \end{bmatrix}\begin{bmatrix} \tfrac15 \end{bmatrix}\begin{bmatrix} 1&-2 \end{bmatrix} = \begin{bmatrix} \frac15 & -\frac25 \\ -\frac25 & \frac45 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac15 & -\frac25 \\ -\frac25 & \frac45 \end{bmatrix}\begin{bmatrix} 1\\1 \end{bmatrix} = \begin{bmatrix} -\frac15 \\ \frac25 \end{bmatrix}.$$
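The projection matrices in these exercises all share two defining properties, symmetry and idempotence, which give a quick sanity check; the sketch below (not part of the original manual, assuming NumPy) verifies them for Exercise 37.

```python
import numpy as np

# Exercise 37: standard matrix of the projection onto W = span{(1, 1)}
A = np.array([[1.0], [1.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T
v = np.array([3.0, 4.0])
print(P)        # [[0.5, 0.5], [0.5, 0.5]]
print(P @ v)    # [3.5, 3.5]

# Projection matrices are symmetric and idempotent
assert np.allclose(P, P.T) and np.allclose(P @ P, P)
```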
39. We have
$$A = \begin{bmatrix} 1\\1\\1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&1&1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 3 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \tfrac13 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1\\1\\1 \end{bmatrix}\begin{bmatrix} \tfrac13 \end{bmatrix}\begin{bmatrix} 1&1&1 \end{bmatrix} = \begin{bmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{bmatrix}\begin{bmatrix} 1\\2\\3 \end{bmatrix} = \begin{bmatrix} 2\\2\\2 \end{bmatrix}.$$
40. We have
$$A = \begin{bmatrix} 2\\2\\-1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 2&2&-1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 9 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \tfrac19 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 2\\2\\-1 \end{bmatrix}\begin{bmatrix} \tfrac19 \end{bmatrix}\begin{bmatrix} 2&2&-1 \end{bmatrix} = \begin{bmatrix} \frac49 & \frac49 & -\frac29 \\ \frac49 & \frac49 & -\frac29 \\ -\frac29 & -\frac29 & \frac19 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac49 & \frac49 & -\frac29 \\ \frac49 & \frac49 & -\frac29 \\ -\frac29 & -\frac29 & \frac19 \end{bmatrix}\begin{bmatrix} 1\\0\\0 \end{bmatrix} = \begin{bmatrix} \frac49 \\ \frac49 \\ -\frac29 \end{bmatrix}.$$
41. We have
$$A = \begin{bmatrix} 1&1 \\ 0&1 \\ -1&1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&0&-1 \\ 1&1&1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 2&0 \\ 0&3 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \frac12&0 \\ 0&\frac13 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1&1 \\ 0&1 \\ -1&1 \end{bmatrix}\begin{bmatrix} \frac12&0 \\ 0&\frac13 \end{bmatrix}\begin{bmatrix} 1&0&-1 \\ 1&1&1 \end{bmatrix} = \begin{bmatrix} \frac56 & \frac13 & -\frac16 \\ \frac13 & \frac13 & \frac13 \\ -\frac16 & \frac13 & \frac56 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac56 & \frac13 & -\frac16 \\ \frac13 & \frac13 & \frac13 \\ -\frac16 & \frac13 & \frac56 \end{bmatrix}\begin{bmatrix} 1\\0\\0 \end{bmatrix} = \begin{bmatrix} \frac56 \\ \frac13 \\ -\frac16 \end{bmatrix}.$$
42. We have
$$A = \begin{bmatrix} 1&1 \\ -2&0 \\ 1&-1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&-2&1 \\ 1&0&-1 \end{bmatrix} \;\Rightarrow\; A^TA = \begin{bmatrix} 6&0 \\ 0&2 \end{bmatrix}, \; (A^TA)^{-1} = \begin{bmatrix} \frac16&0 \\ 0&\frac12 \end{bmatrix}.$$
So the standard matrix of the orthogonal projection onto W is
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1&1 \\ -2&0 \\ 1&-1 \end{bmatrix}\begin{bmatrix} \frac16&0 \\ 0&\frac12 \end{bmatrix}\begin{bmatrix} 1&-2&1 \\ 1&0&-1 \end{bmatrix} = \begin{bmatrix} \frac23 & -\frac13 & -\frac13 \\ -\frac13 & \frac23 & -\frac13 \\ -\frac13 & -\frac13 & \frac23 \end{bmatrix}.$$
Then
$$\operatorname{proj}_W(v) = Pv = \begin{bmatrix} \frac23 & -\frac13 & -\frac13 \\ -\frac13 & \frac23 & -\frac13 \\ -\frac13 & -\frac13 & \frac23 \end{bmatrix}\begin{bmatrix} 1\\2\\3 \end{bmatrix} = \begin{bmatrix} -1\\0\\1 \end{bmatrix}.$$
43. With the new basis, we have
$$A = \begin{bmatrix} 1&1 \\ 1&3 \\ 0&1 \end{bmatrix} \;\Rightarrow\; A^T = \begin{bmatrix} 1&1&0 \\ 1&3&1 \end{bmatrix}$$