Selected Solutions to Linear Algebra Done Wrong

Jing Huang

August 2018

Introduction

Linear Algebra Done Wrong by Prof. Sergei Treil is a well-known linear algebra reference for college students. However, I found that no solution manual was available when I read this book and worked through the exercises. In contrast, a solution manual for the other famous book, Linear Algebra Done Right, is readily available thanks to its greater popularity (without any doubt, Done Wrong is an excellent book). Reference solutions are important because insights tend to hide behind mistakes and ambiguity, and they may be missed if there is no good way to check our answers. Reference solutions also save time when there is no hint or the hint is obscure. I read all and solved most of the problems in the first seven chapters (2014 version) and share the valuable ones here (those that are relatively hard or insightful from my perspective; I read this book for review and for deeper mathematical understanding). The remaining problems should be tractable even for a novice in linear algebra. Chapter 7 contains only a few easy problems, and none is selected here. The last chapter, Chapter 8, deals with dual spaces and tensors, which may be too advanced for most readers and is also not covered. As references meant to facilitate the reader's learning, the correctness and optimality of the solutions are not guaranteed. Lastly, do not copy the contents, as doing so can violate your college's academic rules. I am now a PhD student with limited time to polish the contents, so typos and grammar mistakes are inevitable. I can be reached at huangjingonly@gmail.com for any feedback. Cheers.

Chapter 1. Basic Notations

1.4. Prove that the zero vector of a vector space V is unique.

Proof Suppose there exist two zero vectors 01 and 02. Then, for any v in V,

v + 01 = v,    v + 02 = v.

Take v = 02 in the first equation and v = 01 in the second:

02 + 01 = 02,    01 + 02 = 01.

By commutativity of addition the two left-hand sides are equal, so 01 = 02. Thus the zero vector is unique.

1.6. Prove that the additive inverse, defined in Axiom 4 of a vector space, is unique.

Proof Assume there exist two additive inverses w1 and w2 of a vector v in V. Then

v + w1 = 0,    v + w2 = 0.

Hence w1 = w1 + 0 = w1 + (v + w2) = (w1 + v) + w2 = 0 + w2 = w2. Therefore the additive inverse is unique.

1.7. Prove that 0v = 0 for any vector v in V.

Proof For any scalar α we have 0v = (0·α)v = α(0v), and hence (1 − α)(0v) = 0v − α(0v) = 0 (the zero vector). Taking α = 0 gives 1·(0v) = 0v = 0.

1.8. Prove that for any vector v its additive inverse −v is given by (−1)v.

Proof v + (−1)v = 1·v + (−1)v = (1 − 1)v = 0v = 0, and we know from Problem 1.6 that the additive inverse is unique. Hence −v = (−1)v.

2.5. Let a system of vectors v1, v2, ..., vr be linearly independent but not generating. Show that it is possible to find a vector vr+1 such that the system v1, v2, ..., vr, vr+1 is linearly independent.

Proof Take a vector vr+1 that cannot be represented as Σ_{k=1}^{r} αk vk. Such a vr+1 exists because v1, v2, ..., vr is not generating. We now show that v1, v2, ..., vr, vr+1 is linearly independent. Suppose, on the contrary, that it is linearly dependent, i.e.,

α1 v1 + α2 v2 + ... + αr vr + αr+1 vr+1 = 0

with not all of α1, ..., αr+1 equal to zero. If αr+1 = 0, then α1 v1 + α2 v2 + ... + αr vr = 0 with not all of α1, ..., αr equal to zero, which contradicts the linear independence of v1, v2, ..., vr. So αr+1 ≠ 0, and vr+1 can be represented as

vr+1 = −(1/αr+1) Σ_{k=1}^{r} αk vk.

This contradicts the choice of vr+1. Thus the system v1, v2, ..., vr, vr+1 is linearly independent.
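For concrete vectors in Rn, the argument of Problem 2.5 can be watched numerically: a vector outside the span raises the rank by one, so the enlarged system stays independent. The following is a small sketch of mine (not part of the book); the particular vectors are arbitrary choices for illustration.

```python
import numpy as np

# Two linearly independent but non-generating vectors in R^3.
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])

# A vector that cannot be written as a combination of v1, v2
# (any vector with a nonzero third component works).
v3 = np.array([2.0, -1.0, 5.0])

A = np.column_stack([v1, v2])
B = np.column_stack([v1, v2, v3])

# The rank grows from 2 to 3, so v1, v2, v3 are linearly independent.
print(np.linalg.matrix_rank(A))  # 2
print(np.linalg.matrix_rank(B))  # 3
```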
3.2. Let a linear transformation in R2 be the reflection in the line x1 = x2. Find its matrix.

Solution 1 Reflection is a linear transformation, so it is completely defined by its action on the standard basis: e1 = [1 0]^T maps to r1 = [0 1]^T and e2 = [0 1]^T maps to r2 = [1 0]^T. The matrix has the two transformed standard basis vectors as its columns, i.e.,

T = [r1 r2] = [ 0  1 ]
              [ 1  0 ].

Solution 2 (A more general method.) Transformations such as reflections about and rotations with respect to the coordinate axes are easy to write down, so the general idea is to reduce an arbitrary transformation to one of these. Let α be the angle between the x-axis and the line. The reflection can be achieved through the following steps. First, rotate the plane about the origin by −α so that the line aligns with the x-axis. (The line x1 = x2 passes through the origin; if it did not, a translation would also be needed, and homogeneous coordinates would be required, since a translation is not a linear transformation in standard coordinates.) Second, reflect about the x-axis, whose matrix is easy to write. Lastly, rotate back by α. Reading the composition from right to left,

T = R(α) · Ref · R(−α).

With α = π/4,

T = [ cos(π/4)  −sin(π/4) ] [ 1  0 ] [ cos(−π/4)  −sin(−π/4) ]   [ 0  1 ]
    [ sin(π/4)   cos(π/4) ] [ 0 −1 ] [ sin(−π/4)   cos(−π/4) ] = [ 1  0 ].

3.7. Show that any linear transformation in C (treated as a complex vector space) is a multiplication by some α ∈ C.

Proof Suppose T : C → C is linear and T(1) = a + ib for real numbers a and b. Then T(−1) = −T(1) = −a − ib. Since i² = −1, T(−1) = T(i·i) = iT(i), so

T(i) = (−a − ib)/i = i(a + ib).

For any ω = x + iy ∈ C with x, y ∈ R,

T(ω) = T(x + iy) = xT(1) + yT(i) = x(a + ib) + yi(a + ib) = (x + iy)(a + ib) = ωT(1) = ωα,

where α = T(1).

5.4. Find the matrix of the orthogonal projection in R2 onto the line x1 = −2x2.

Solution Following the same steps as in Problem 3.2,

T = R(α) P R(−α) = R(α) [ 1  0 ] R(−α),
                        [ 0  0 ]

where α = arctan(−1/2) is the angle the line makes with the x-axis. The resulting matrix is

T = [  4/5  −2/5 ]
    [ −2/5   1/5 ].

5.7. Find the matrix of the reflection through the line y = −2x/3. Perform all the multiplications.

Solution As above,

T = R(α) Ref R(−α) = R(α) [ 1   0 ] R(−α),
                          [ 0  −1 ]

where α = arctan(−2/3), and

T = [   5/13  −12/13 ]
    [ −12/13   −5/13 ].
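The "rotate, act along the x-axis, rotate back" recipe of Problems 3.2, 5.4 and 5.7 is easy to check numerically. The sketch below is my own (not from the book); the helper names R, reflect_about_line and project_onto_line are introduced here for illustration only.

```python
import numpy as np

def R(a):
    """2-D counterclockwise rotation by angle a."""
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

def reflect_about_line(a):
    """Reflection about the line through the origin at angle a."""
    ref_x = np.diag([1.0, -1.0])          # reflection about the x-axis
    return R(a) @ ref_x @ R(-a)

def project_onto_line(a):
    """Orthogonal projection onto the line at angle a."""
    proj_x = np.diag([1.0, 0.0])          # projection onto the x-axis
    return R(a) @ proj_x @ R(-a)

print(reflect_about_line(np.pi / 4))          # Problem 3.2: [[0, 1], [1, 0]]
print(project_onto_line(np.arctan(-1 / 2)))   # Problem 5.4: [[0.8, -0.4], [-0.4, 0.2]]
print(reflect_about_line(np.arctan(-2 / 3)))  # Problem 5.7: [[5/13, -12/13], [-12/13, -5/13]]
```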
6.1. Prove that if A : V → W is an isomorphism (i.e. an invertible linear transformation) and v1, v2, ..., vn is a basis in V, then Av1, Av2, ..., Avn is a basis in W.

Proof For any w ∈ W, let v = A^(−1)w ∈ V and expand it in the basis: v = α1 v1 + α2 v2 + ... + αn vn. Then

w = Av = α1 Av1 + α2 Av2 + ... + αn Avn,

so the system Av1, Av2, ..., Avn is generating in W. Next we show that Av1, Av2, ..., Avn is linearly independent. If it were linearly dependent, then (after reindexing) Av1 could be expressed as a linear combination of Av2, ..., Avn. Multiplying this relation by A^(−1) on the left shows that v1 can be expressed as a linear combination of v2, ..., vn, which contradicts the fact that v1, v2, ..., vn is a basis in V. The proposition is proved.

7.4. Let X and Y be subspaces of a vector space V. Using the previous exercise, show that X ∪ Y is a subspace if and only if X ⊂ Y or Y ⊂ X.

Proof The sufficiency is obvious and easy to verify. For the necessity, suppose X ⊄ Y, Y ⊄ X, and X ∪ Y is a subspace of V. Then there are vectors x ∈ X, y ∈ Y with x ∉ Y and y ∉ X. According to Problem 7.3, x + y ∉ X and x + y ∉ Y. As a result, x + y ∉ X ∪ Y. That is, x ∈ X ∪ Y and y ∈ X ∪ Y, but x + y ∉ X ∪ Y, which contradicts the assumption that X ∪ Y is a subspace. Thus X ⊂ Y or Y ⊂ X.

8.5. A transformation T in R3 is a rotation about the line y = x + 3 in the x-y plane through an angle γ. Write a 4 × 4 matrix corresponding to this transformation. You can leave the result as a product of matrices.

Solution For a general spatial rotation about a given line through an angle γ, the 3 × 3 rotation matrix can be written as

R = Rx^(−1) Ry^(−1) Rz(γ) Ry Rx,

where the rotation by γ is performed about the z-axis, and Rx and Ry are rotations that align the direction of the given line with the z-axis; they can be determined by simple trigonometry. For this problem the line y = x + 3 does not pass through the origin, so an extra translation T0 is needed to move it through the origin, and homogeneous coordinates are used:

R_{4×4} = T0^(−1) Rx^(−1) Ry^(−1) Rz(γ) Ry Rx T0,

where all rotation matrices are written in their 4 × 4 homogeneous forms. T0 is not unique, since any translation that aligns the two parallel lines will do. For example:

T0 = [ 1 0 0 3 ]
     [ 0 1 0 0 ]
     [ 0 0 1 0 ]
     [ 0 0 0 1 ]   (move y = x + 3 so that it passes through the origin as y = x),

Rx = [ 1 0  0 0 ]
     [ 0 0 −1 0 ]
     [ 0 1  0 0 ]
     [ 0 0  0 1 ]   (rotate y = x about the x-axis by π/2),

Ry = [ √2/2 0 −√2/2 0 ]
     [ 0    1   0   0 ]
     [ √2/2 0  √2/2 0 ]
     [ 0    0   0   1 ]   (rotate the line about the y-axis by −π/4),

Rz(γ) = [ cos γ −sin γ 0 0 ]
        [ sin γ  cos γ 0 0 ]
        [  0      0    1 0 ]
        [  0      0    0 1 ]   (rotate about the z-axis by γ).

Combining all the matrices above and their inverses yields

R_{4×4} = [ (1+cos γ)/2    (1−cos γ)/2    (√2/2) sin γ   (3cos γ − 3)/2 ]
          [ (1−cos γ)/2    (1+cos γ)/2   −(√2/2) sin γ   (3 − 3cos γ)/2 ]
          [ −(√2/2) sin γ  (√2/2) sin γ    cos γ         −(3√2/2) sin γ ]
          [  0              0              0               1           ].

Notice that the procedure above showcases the general scenario in which two rotations about two coordinate axes are performed in sequence to align the line with the third coordinate axis. In this particular problem it is easier to align the line with the x-axis or the y-axis, since the line lies in the x-y plane. For instance, performing the rotation by γ about the x-axis, R_{4×4} is given by

R_{4×4} = T0^(−1) Rz^(−1) Rx(γ) Rz T0.

Here T0 remains the same and

Rz = [  √2/2  √2/2 0 0 ]
     [ −√2/2  √2/2 0 0 ]
     [   0     0   1 0 ]
     [   0     0   0 1 ]   (rotate about the z-axis by −π/4),

Rx(γ) = [ 1   0      0    0 ]
        [ 0 cos γ −sin γ  0 ]
        [ 0 sin γ  cos γ  0 ]
        [ 0   0      0    1 ]   (rotate about the x-axis by γ).

After performing the multiplications, one can verify that this choice results in the same R_{4×4} as above.
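As a sanity check on Problem 8.5, the product T0^(−1) Rx^(−1) Ry^(−1) Rz(γ) Ry Rx T0 can be formed numerically and applied to points of the line y = x + 3, which a rotation about that line must leave fixed. The sketch below is mine, not the book's; it simply multiplies the matrices stated above for an arbitrary test angle.

```python
import numpy as np

def rot_x(a):  # 4x4 homogeneous rotation about the x-axis
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1.0]])

def rot_y(a):  # about the y-axis
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1.0]])

def rot_z(a):  # about the z-axis
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1.0]])

def translate(dx, dy, dz):
    T = np.eye(4)
    T[:3, 3] = [dx, dy, dz]
    return T

gamma = 0.7                          # an arbitrary test angle
T0 = translate(3, 0, 0)              # moves y = x + 3 onto y = x
Rx, Ry = rot_x(np.pi / 2), rot_y(-np.pi / 4)
R = (np.linalg.inv(T0) @ np.linalg.inv(Rx) @ np.linalg.inv(Ry)
     @ rot_z(gamma) @ Ry @ Rx @ T0)

# Points on the line y = x + 3 (z = 0) must be fixed by the rotation.
for x in (-3.0, 0.0, 2.0):
    p = np.array([x, x + 3, 0, 1.0])
    assert np.allclose(R @ p, p)
print(np.round(R, 3))
```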
Chapter 2. Systems of Linear Equations

3.8. Show that if the equation Ax = 0 has a unique solution (i.e. if the echelon form of A has a pivot in every column), then A is left invertible.

Proof If Ax = 0 has a unique solution, that solution is the trivial one, and the echelon form of A has a pivot in every column. Let A be m × n; then m ≥ n, i.e. the number of rows is greater than or equal to the number of columns. The reduced echelon form of A can be written as

Are = [ I_{n×n}     ]
      [ 0_{(m−n)×n} ].

Suppose Are is obtained by a sequence of elementary row operations E1, E2, ..., Ek, each of size m × m, so that

Are = Ek ... E2 E1 A.

A left inverse of A is given by the first n rows of the product of the Ei, i.e.

Eleft = I_{n×m} Ek ... E2 E1,

where I_{n×m} is the n × m matrix whose (i, i) entries are 1 and all other entries are 0; it extracts the I_{n×n} identity block of Are. Then Eleft A = I_{n×m} Are = I_{n×n}, so A is left invertible.

5.5. Let vectors u, v, w be a basis in V. Show that u + v + w, v + w, w is also a basis in V.

Solution For any vector x ∈ V, write x = x1 u + x2 v + x3 w. It is easy to check that

x = x1(u + v + w) + (x2 − x1)(v + w) + (x3 − x2)w.

Moreover u + v + w, v + w, w are linearly independent: a relation a(u + v + w) + b(v + w) + cw = 0 gives au + (a + b)v + (a + b + c)w = 0, hence a = b = c = 0. The system is therefore generating and linearly independent, and thus a basis in V.

7.4. Prove that if A : X → Y and V is a subspace of X, then dim AV ≤ rank A. (Here AV denotes the subspace V transformed by the transformation A, i.e., any vector in AV can be represented as Av, v ∈ V.) Deduce from this that rank(AB) ≤ rank A.

Proof dim AV ≤ dim AX = dim Ran A = rank A. For the second statement, take V = Ran B (the column space of B); then AV = Ran(AB), so rank(AB) = dim AV ≤ rank A.

7.5. Prove that if A : X → Y and V is a subspace of X, then dim AV ≤ dim V. Deduce from this that rank(AB) ≤ rank B.

Proof Suppose dim V = k and let v1, v2, ..., vk be a basis of V. Then AV is spanned by Av1, Av2, ..., Avk, so

dim AV = rank [Av1, Av2, ..., Avk] ≤ k = dim V.

Similarly, let rank B = k and let b1, b2, ..., bk be linearly independent columns of B spanning Ran B. Then rank(AB) = rank [Ab1, Ab2, ..., Abk] ≤ k = rank B.

7.6. Prove that if the product AB of two n × n matrices is invertible, then both A and B are invertible. Do not use determinants for this problem.

Proof Since AB is invertible, rank(AB) = n. From Problems 7.4 and 7.5 we have n = rank(AB) ≤ rank A ≤ n and n = rank(AB) ≤ rank B ≤ n. Thus rank A = rank B = n; both A and B have full rank and are therefore invertible.

7.7. Prove that if Ax = 0 has a unique solution, then the equation A^T x = b has a solution for every right side b. (Hint: count pivots.)

Proof Suppose A is m × n. For Ax = 0 there is always the trivial solution x = 0 ∈ Rn; since the solution is unique, the echelon form of A has a pivot in every column. Accordingly, the echelon form of A^T has a pivot in every row (the echelon form of A^T can be obtained by the column operations corresponding to the row operations performed on A). Hence A^T x = b is consistent for every b.

7.14. Is it possible for a real matrix A that Ran A = Ker A^T? Is it possible for a complex A?

Solution It is not possible for a real A, but it is possible for a complex A.

For a real A, suppose A is m × n and Ran A = Ker A^T. Then Ran A ⊂ Ker A^T, i.e., A^T Av = 0 for every v ∈ Rn, which means A^T A = 0_{n×n}. Writing A = [a1, a2, ..., an] with ai ∈ Rm and looking at the diagonal entries, (A^T A)_{i,i} = ai^T ai = ||ai||² = 0, so every column is zero and A = 0_{m×n}. On the other hand, Ran A = Ker A^T also gives Ker A^T ⊂ Ran A, i.e., if A^T b = 0 then Ax = b has a solution. But with A = 0_{m×n} we have A^T b = 0 for every b ∈ Rm, while Ax = b has no solution for b ≠ 0, a contradiction. So Ran A = Ker A^T is impossible for a real matrix.

For a complex A the diagonal argument breaks down, since ai^T ai is then a sum of squares of complex numbers and can vanish for a nonzero column, and the equality can indeed happen. For example,

A = [ 1  i ]
    [ i −1 ]

satisfies A^T = A, Ran A = span{(1, i)^T} and Ker A^T = Ker A = span{(1, i)^T}, so Ran A = Ker A^T. (If A^T is replaced by the Hermitian adjoint A^*, the real argument goes through verbatim and the equality is again impossible.)

8.5. Prove that if A and B are similar matrices then trace A = trace B. (Hint: recall how trace(XY) and trace(YX) are related.)

Proof Write A = Q^(−1)BQ. Then trace(A) = trace(Q^(−1) · (BQ)) = trace((BQ) · Q^(−1)) = trace(B). (Note that trace(XY) = trace(YX) whenever both products are defined.)

Chapter 3. Determinants

3.4. A square matrix (n × n) is called skew-symmetric (or antisymmetric) if A^T = −A. Prove that if A is skew-symmetric and n is odd, then det A = 0. Is this true for even n?

Proof det A = det A^T = det(−A) = (−1)^n det A, by the properties of the determinant and the definition of a skew-symmetric matrix. If n is odd, (−1)^n = −1 and det A = −det A, so det A = 0. If n is even we only get det A = det A, so the statement is generally not true; for example,

A = [  0 1 ]
    [ −1 0 ]

is skew-symmetric with det A = 1.
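A quick numerical illustration of Problem 3.4 (my own check, not from the book): the determinant of a random odd-dimensional skew-symmetric matrix vanishes, while an even-dimensional one generally does not.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_skew(n):
    M = rng.standard_normal((n, n))
    return M - M.T                       # A^T = -A by construction

print(np.linalg.det(random_skew(5)))     # ~0 (up to round-off), n odd
print(np.linalg.det(random_skew(4)))     # generally nonzero, n even
```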
3.5. A square matrix is called nilpotent if A^k = 0 for some positive integer k. Show that for a nilpotent matrix A, det A = 0.

Proof (det A)^k = det(A^k) = det 0 = 0, thus det A = 0.

3.6. Prove that if A and B are similar, then det A = det B.

Proof A and B are similar, so A = Q^(−1)BQ for some invertible matrix Q. Then

det A = det(Q^(−1)BQ) = (det Q^(−1))(det B)(det Q) = (det Q^(−1))(det Q)(det B) = det(Q^(−1)Q)(det B) = (det I)(det B) = det B.

3.7. A real square matrix Q is called orthogonal if Q^T Q = I. Prove that if Q is an orthogonal matrix then det Q = ±1.

Proof det(Q^T Q) = (det Q^T)(det Q) = (det Q)² = det I = 1, hence det Q = ±1.

3.9. Let points A, B and C in the plane R2 have coordinates (x1, y1), (x2, y2) and (x3, y3) respectively. Show that the area of triangle ABC is the absolute value of

(1/2) det [ 1 x1 y1 ]
          [ 1 x2 y2 ]
          [ 1 x3 y3 ].

Hint: use row operations and the geometric interpretation of 2 × 2 determinants (area).

Proof The area of triangle ABC is half the area of the parallelogram spanned by the sides AB and AC, which can be computed as

S_ABC = (1/2) |det [ x2 − x1  y2 − y1 ]|
                   [ x3 − x1  y3 − y1 ]
      = (1/2) |(x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1)|.

At the same time, row reduction of the 3 × 3 determinant gives

(1/2) det [ 1 x1 y1 ]        [ 1   x1        y1      ]
          [ 1 x2 y2 ] = (1/2) det [ 0 x2 − x1  y2 − y1 ]
          [ 1 x3 y3 ]        [ 0   x3 − x1   y3 − y1  ]

            [ 1   x1        y1                                        ]
 = (1/2) det [ 0 x2 − x1  y2 − y1                                      ]
            [ 0   0       (y3 − y1) − (y2 − y1)(x3 − x1)/(x2 − x1)     ]

 = (1/2)((x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1)).

Here we assumed x2 − x1 ≠ 0; one can verify that the result still holds when x2 − x1 = 0. Taking absolute values, the two expressions coincide, so the conclusion holds.

3.10. Let A be a square matrix. Show that the block triangular matrices

[ A 0 ]   [ A * ]   [ I 0 ]   [ I * ]
[ * I ],  [ 0 I ],  [ * A ],  [ 0 A ]

all have determinant equal to det A. Here * can be anything.

Proof Perform on the block containing A the row reduction that makes A triangular; the whole matrix then becomes triangular, while the other diagonal block remains I. The determinant of the block matrix therefore equals the product of the diagonal entries of the triangularized A, i.e. det A.

(Problems 3.11 and 3.12 are just applications of the conclusion of Problem 3.10; the hint essentially gives the answer.)

4.2. Let P be a permutation matrix, i.e., an n × n matrix consisting of zeros and ones, such that there is exactly one 1 in every row and every column.
a) Can you describe the corresponding linear transformation? That will explain the name.
b) Show that P is invertible. Can you describe P^(−1)?
c) Show that for some N > 0, P^N := P·P·...·P (N times) = I. Use the fact that there are only finitely many permutations.

Solution a) Consider the linear transformation y = Px and the rows of P. There is exactly one 1 in each row of P. Suppose P_{1,j} = 1 in the first row; then y1 = p1 x = xj, where p1 is the first row of P, i.e., xj is moved to the 1st position by the transformation. Similarly, if P_{2,k} = 1 in the second row, then y2 = xk and xk is moved to the second position, and so on. Since there is also exactly one 1 in each column, the column indices j, k, m, ... of the 1 entries P_{1,j}, P_{2,k}, P_{3,m}, ... form a permutation of {1, 2, ..., n}. So multiplication by P reorders the entries of x into [xj, xk, xm, ...]^T, which explains the name. (Arguing from the columns of P works as well.)

b) The transformation y = Px merely reorders the entries of x, so it is invertible: the inverse is the reordering that puts every entry back in its place. Since y1 = xj, the inverse must move the 1st entry back to the j-th position, so (P^(−1))_{j,1} = 1 = P_{1,j}; similarly, y2 = xk gives (P^(−1))_{k,2} = 1 = P_{2,k}, and in general (P^(−1))_{i,j} = P_{j,i}. Hence P is invertible and P^(−1) = P^T.

c) There are only finitely many permutations of {1, 2, ..., n} (namely n!), hence only finitely many permutation matrices, while every power P, P², P³, ... is again a permutation matrix (a product of permutation matrices again just reorders coordinates). Therefore two powers must coincide, say P^a = P^b with a < b. Multiplying both sides by (P^a)^(−1) = (P^(−1))^a gives P^(b−a) = I, so P^N = I for N = b − a > 0. In particular, N can be taken with N ≤ n!.
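The claims of Problem 4.2 are easy to observe numerically. In the sketch below (mine, not the book's), a permutation matrix is built from a permutation of indices, its inverse is checked to be its transpose, and powers are taken until the identity reappears.

```python
import numpy as np

perm = [2, 0, 3, 1]                     # an arbitrary permutation of {0, 1, 2, 3}
n = len(perm)
P = np.zeros((n, n))
P[np.arange(n), perm] = 1               # row i has its 1 in column perm[i]

x = np.array([10.0, 20.0, 30.0, 40.0])
print(P @ x)                            # entries of x reordered: [30, 10, 40, 20]

assert np.allclose(np.linalg.inv(P), P.T)   # P^{-1} = P^T

# Find the smallest N > 0 with P^N = I (it exists, and N <= n!).
Q, N = P.copy(), 1
while not np.allclose(Q, np.eye(n)):
    Q, N = Q @ P, N + 1
print(N)
```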
The exercises of Part 5 and Part 7 in this chapter are not difficult. Some ideas and answers are given for reference:

• Problem 5.3: use cofactor expansion along the last column; the remaining minor (A + tI)_{i,j} at each step is a triangular matrix. The final expression is det(A + tI) = a0 + a1 t + a2 t² + ... + a_{n−1} t^{n−1}. The power of −1 appearing in each term is even.

• Problem 5.7: n! multiplications are needed; this can be proved by induction.

• Problems 7.4 and 7.5: consider det(RA) = (det R)(det A) = det A, where R is a rotation matrix with determinant 1. For the proof of the parallelogram area formula, one can also parametrize by angles: v1 = [x1, y1]^T = [|v1| cos α, |v1| sin α]^T and v2 = [x2, y2]^T = [|v2| cos β, |v2| sin β]^T, where |v1|, |v2| are the lengths of v1, v2 and α, β are the angles between each vector and the positive x-axis. Then

det A = det [ x1 x2 ] = x1 y2 − x2 y1 = |v1||v2|(cos α sin β − cos β sin α) = |v1||v2| sin(β − α),
            [ y1 y2 ]

where β − α is the angle from v1 to v2.

Chapter 4. Introduction to Spectral Theory (Eigenvalues and Eigenvectors)

1.1. (Part) True or false:

b) If a matrix has one eigenvector, it has infinitely many eigenvectors. True: if Ax = λx, then A(αx) = λ(αx) for any nonzero scalar α, so every nonzero multiple αx is also an eigenvector of A.

c) There exists a square matrix with no real eigenvalues. True: for example the 2D rotation matrix Rα with α ≠ nπ.

d) There exists a square matrix with no (complex) eigenvectors. False: over the complex numbers the characteristic polynomial always has a root, so there is always an eigenvalue λ, and A − λI has a nontrivial null space.

f) Similar matrices always have the same eigenvectors. False: if A and B are similar with A = SBS^(−1) and Ax = λx, then SBS^(−1)x = λx, i.e., B(S^(−1)x) = λ(S^(−1)x); the eigenvector of B is S^(−1)x, not x in general.

g) The sum of two eigenvectors of a matrix A is always an eigenvector. False: for example, for A = diag(1, 2) the sum e1 + e2 of the two eigenvectors e1, e2 is not an eigenvector.

1.6. An operator A is called nilpotent if A^k = 0 for some k. Prove that if A is nilpotent, then σ(A) = {0} (i.e. that 0 is the only eigenvalue of A).

Proof If λ ∈ σ(A) with eigenvector x ≠ 0, then A²x = A(λx) = λ²x, A³x = λ³x, ..., A^k x = λ^k x, i.e., λ^k ∈ σ(A^k). Since A^k = 0, σ(A^k) = {0}, so λ^k = 0 and hence λ = 0. Moreover, 0 is indeed an eigenvalue, since (det A)^k = det(A^k) = 0 implies det A = 0. Thus σ(A) = {0}.

1.8. Let v1, v2, ..., vn be a basis in a vector space V. Assume also that the first k vectors v1, v2, ..., vk of the basis are eigenvectors of an operator A, corresponding to an eigenvalue λ (i.e. Avj = λvj, j = 1, 2, ..., k). Show that in this basis the matrix of the operator A has the block triangular form

[ λIk  * ]
[  0   B ].

Proof Write A_{VV} = [I]_{VS} A_{SS} [I]_{SV}, where S denotes the standard basis, [I] denotes the change-of-coordinates matrix, and [I]_{SV} = [v1 v2 ... vn] (its columns are the vectors vj written in the standard basis). Then

[I]_{SV} A_{VV} = A_{SS} [I]_{SV} = A_{SS} [v1 v2 ... vn] = [λv1 λv2 ... λvk A_{SS}v_{k+1} ... A_{SS}vn].

Denote the i-th column of A_{VV} by ai. For the first column, [I]_{SV} a1 = λv1; since v1, v2, ..., vn is a basis, the only coordinate column representing λv1 is a1 = [λ, 0, 0, ..., 0]^T. Similarly, the first k columns of A_{VV} are λ times the first k standard basis columns. Hence A_{VV} has the block triangular form above.
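A concrete numerical illustration of Problem 1.8 (my own sketch, with arbitrarily chosen matrices): take an operator with a repeated eigenvalue, put two of its eigenvectors first in a basis, and form the matrix of the operator in that basis; the first k columns then have the shape λ·ej described above.

```python
import numpy as np

# An operator on R^3 whose eigenvalue 2 has two independent eigenvectors.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 5.0]])

v1 = np.array([1.0, 0.0, 0.0])   # eigenvectors for lambda = 2
v2 = np.array([0.0, 1.0, 0.0])
v3 = np.array([1.0, 1.0, 1.0])   # any vector completing a basis

S = np.column_stack([v1, v2, v3])          # [I]_{S,V}
A_VV = np.linalg.inv(S) @ A @ S            # matrix of A in the basis v1, v2, v3
print(np.round(A_VV, 10))
# Expected block triangular form:
# [[2, 0, *],
#  [0, 2, *],
#  [0, 0, *]]
```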
1.9. Use the two previous exercises to prove that the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity.

Proof Let λ0 be an eigenvalue of A and let k be its geometric multiplicity, i.e., k = dim Ker(A − λ0 I), the number of linearly independent eigenvectors corresponding to λ0 (consider the equation (A − λ0 I)x = 0). Choose a basis v1, ..., vk of this eigenspace and complete it to a basis v1, ..., vn of the whole space. By Problem 1.8, in this basis the matrix of A has the block triangular form with λ0 Ik in the upper left corner. For the algebraic multiplicity, compute

det(A − λI) = det [ (λ0 − λ)Ik     *        ]
                  [     0      B − λI_{n−k} ]  = (λ0 − λ)^k det(B − λI_{n−k}).

So the algebraic multiplicity of λ0 is at least k; it exceeds k exactly when λ0 is also a root of the polynomial det(B − λI_{n−k}). Thus the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity.

1.10. Prove that the determinant of a matrix A is the product of its eigenvalues (counting multiplicity).

Proof (Following the hint.) The characteristic polynomial of the n × n matrix A is det(A − λI); consider its roots over the complex numbers. By the fundamental theorem of algebra, det(A − λI) has n roots counting multiplicity and factors as

det(A − λI) = (λ1 − λ)(λ2 − λ)...(λn − λ).

(Recall the formal definition of the determinant: the highest-order term in λ comes from the diagonal product Π_{i=1}^{n}(a_{i,i} − λ), which contributes (−1)^n λ^n, the same leading coefficient as the factorization above, so the sign of the factorization is correct.) Setting λ = 0 gives det A = λ1 λ2 ... λn.

1.11. Prove that the trace of a matrix equals the sum of its eigenvalues in three steps. First, compute the coefficient of λ^{n−1} on the right-hand side of the equality

det(A − λI) = (λ1 − λ)(λ2 − λ)...(λn − λ).

Then show that det(A − λI) can be represented as

det(A − λI) = (a_{1,1} − λ)(a_{2,2} − λ)...(a_{n,n} − λ) + q(λ),

where q(λ) is a polynomial of degree at most n − 2. Finally, compare the coefficients of λ^{n−1} to obtain the conclusion.

Proof First, the coefficient of λ^{n−1} in (λ1 − λ)(λ2 − λ)...(λn − λ) is

C(λ^{n−1}) = (−1)^{n−1}(λ1 + λ2 + ... + λn):

to obtain a λ^{n−1} term we must pick −λ from n − 1 of the n factors and λj from the one remaining factor, which yields (−1)^{n−1} λj λ^{n−1}; summing over the n choices of j gives the coefficient above.

Next, det(A − λI) can indeed be represented as

det(A − λI) = (a_{1,1} − λ)(a_{2,2} − λ)...(a_{n,n} − λ) + q(λ)

with deg q ≤ n − 2, i.e., all λ^{n−1} terms come from the diagonal product. This holds because λ appears only on the diagonal of A − λI: in the formal definition of the determinant, any product containing n − 1 diagonal entries must in fact contain all n of them, so there is no other way to generate λ^{n−1}. Hence the coefficient of λ^{n−1} also equals

C(λ^{n−1}) = (−1)^{n−1}(a_{1,1} + a_{2,2} + ... + a_{n,n}).

The coefficients obtained in the two ways must be identical, so Σ_{i=1}^{n} a_{i,i} = Σ_{i=1}^{n} λi, i.e., the trace of a matrix equals the sum of its eigenvalues.

2.1. Let A be an n × n matrix. True or false:
a) A^T has the same eigenvalues as A. True: det(A^T − λI) = det((A − λI)^T) = det(A − λI).
b) A^T has the same eigenvectors as A. False in general.
c) If A is diagonalizable, then so is A^T. True: if A = SDS^(−1), then A^T = (SDS^(−1))^T = (S^(−1))^T D^T S^T = (S^T)^(−1) D S^T.
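Problems 1.10, 1.11 and 2.1 a) can all be observed on a random matrix. The sketch below is my own numerical check, not part of the book.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
eig = np.linalg.eigvals(A)               # complex in general

# Problem 1.10: det A is the product of the eigenvalues.
assert np.isclose(np.linalg.det(A), np.prod(eig).real)

# Problem 1.11: trace A is the sum of the eigenvalues.
assert np.isclose(np.trace(A), np.sum(eig).real)

# Problem 2.1 a): A and A^T have the same eigenvalues.
eigT = np.linalg.eigvals(A.T)
assert np.allclose(np.sort(eig.real), np.sort(eigT.real))
assert np.allclose(np.sort(eig.imag), np.sort(eigT.imag))
print("all checks passed")
```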
2.2. Let A be a square matrix with real entries, and let λ be its complex eigenvalue. Suppose v = (v1, v2, ..., vn)^T is a corresponding eigenvector, Av = λv. Prove that λ̄ is an eigenvalue of A and that Av̄ = λ̄v̄. Here v̄ is the complex conjugate of the vector v, v̄ := (v̄1, v̄2, ..., v̄n)^T.

Proof Since A is a real matrix, Av̄ = Āv̄. At the same time, Āv̄ = conj(Av) = conj(λv) = λ̄v̄. Thus Av̄ = λ̄v̄, and since v̄ ≠ 0, λ̄ is an eigenvalue of A.

Chapter 5. Inner Product Spaces

1.4. Prove that for vectors in an inner product space

||x ± y||² = ||x||² + ||y||² ± 2 Re(x, y).

Recall that Re z = (1/2)(z + z̄).

Proof

||x − y||² = (x − y, x − y) = (x, x − y) − (y, x − y)
           = (x, x) − (x, y) − (y, x) + (y, y)
           = ||x||² + ||y||² − (x, y) − conj(x, y)
           = ||x||² + ||y||² − 2 Re(x, y),

using (y, x) = conj(x, y). Similarly, ||x + y||² = ||x||² + ||y||² + 2 Re(x, y).

1.5. Hint: a) check conjugate symmetry; b) check linearity; c) check conjugate symmetry.

1.7. Prove the parallelogram identity for an inner product space V:

||x + y||² + ||x − y||² = 2(||x||² + ||y||²).

Proof

||x + y||² + ||x − y||² = (x + y, x + y) + (x − y, x − y)
 = (x, x) + (x, y) + (y, x) + (y, y) + (x, x) − (x, y) − (y, x) + (y, y)
 = 2(x, x) + 2(y, y) = 2(||x||² + ||y||²).

1.8. Proof sketch: a) Let v = x; then (x, x) = 0, so x = 0. b) If (x, vk) = 0 for all k, then (x, v) = 0 for every v, and by part a), x = 0. c) If (x − y, vk) = 0 for all k, then by part b) x − y = 0, i.e., x = y.

2.3. Let v1, v2, ..., vn be an orthonormal basis in V.
a) Prove that for any x = Σ_{k=1}^{n} αk vk and y = Σ_{k=1}^{n} βk vk,

(x, y) = Σ_{k=1}^{n} αk conj(βk).

b) Deduce from this Parseval's identity

(x, y) = Σ_{k=1}^{n} (x, vk) conj(y, vk).

c) Assume now that v1, v2, ..., vn is only an orthogonal basis, not an orthonormal one. Can you write down Parseval's identity in this case?

Proof a)

(x, y) = (Σ_{k=1}^{n} αk vk, Σ_{k=1}^{n} βk vk) = Σ_{i=1}^{n} Σ_{j=1}^{n} αi conj(βj) (vi, vj) = Σ_{k=1}^{n} αk conj(βk),

because v1, v2, ..., vn is orthonormal: (vi, vj) = 0 for i ≠ j and (vi, vi) = 1.

b) Use (x, vk) = αk and (y, vk) = βk together with part a).

c) By the same computation as in a),

(x, y) = Σ_{k=1}^{n} αk conj(βk) (vk, vk) = Σ_{k=1}^{n} αk conj(βk) ||vk||².

Since the basis is only orthogonal, not orthonormal, (x, vk) = (αk vk, vk) = αk ||vk||², i.e., αk = (x, vk)/||vk||², and similarly conj(βk) = conj(y, vk)/||vk||². Hence

(x, y) = Σ_{k=1}^{n} (x, vk) conj(y, vk) / ||vk||².

3.3. Complete the orthogonal system obtained in the previous problem to an orthogonal basis in R3, i.e., add to the system some vectors (how many?) to get an orthogonal basis. Can you describe how to complete an orthogonal system to an orthogonal basis in the general situation of Rn or Cn?

Solution In the 3-dimensional space we already have 2 orthogonal vectors v1, v2, so only one more basis vector v3 is needed. The computation of v3 exploits orthogonality: (v1, v3) = 0 and (v2, v3) = 0. In matrix form, let A = [v1, v2] and solve A^T v3 = 0. (Since we are in 3D, using the cross product is also simple.) In general, to complete an orthogonal system v1, v2, ..., vr, let A = [v1, v2, ..., vr]; by orthogonality the remaining basis vectors are obtained from A^T v = 0 (A^* v = 0 in Cn), i.e., they form an orthogonal basis of Ker A^T (which can be produced, for example, by Gram-Schmidt).
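The recipe in Problem 3.3 of solving A^T v = 0 can be carried out numerically. The sketch below is my own (the example vectors are arbitrary); it reads the missing basis vectors off the null space of A^T via the SVD.

```python
import numpy as np

# Two orthogonal vectors in R^4; we want to complete them to an orthogonal basis.
v1 = np.array([1.0, 1.0, 0.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0, 0.0])
A = np.column_stack([v1, v2])

# Null space of A^T: the rows of Vt corresponding to (numerically) zero singular values.
_, s, Vt = np.linalg.svd(A.T)
null_basis = Vt[len(s[s > 1e-12]):]      # here: 2 orthonormal vectors

basis = np.column_stack([v1, v2, *null_basis])
# All columns are mutually orthogonal:
print(np.round(basis.T @ basis, 10))     # a diagonal matrix
```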
3.9. (Using eigenvalues to compute determinants.)
a) Find the matrix of the orthogonal projection onto the one-dimensional subspace in Rn spanned by the vector (1, 1, ..., 1)^T;
b) Let A be the n × n matrix with all entries equal to 1. Compute its eigenvalues and their multiplicities (use the previous problem);
c) Compute the eigenvalues (and multiplicities) of the matrix A − I, i.e., of the matrix with zeros on the main diagonal and ones everywhere else;
d) Compute det(A − I).

Solution a) From Remark 3.5 we know PE = Σ_k (1/||vk||²) vk vk^*, summed over an orthogonal basis {vk} of E. For this one-dimensional subspace (v = (1, 1, ..., 1)^T, ||v||² = n),

PE = (1/n) [ 1 1 ... 1 ]
           [ 1 1 ... 1 ]
           [ ...       ]
           [ 1 1 ... 1 ]   (n × n).

b) Note that A = nPE. Suppose Ax = λx; then nPE x = λx, i.e., n times the orthogonal projection of the eigenvector onto the 1D subspace equals λ times the eigenvector itself. In particular, the projection of an eigenvector is parallel to the eigenvector, so there are two possibilities:

• the eigenvector is parallel to the spanning vector v of the 1D subspace, i.e., x = αv with α ≠ 0; then PE x = x and λ = n. The eigenspace is the 1-dimensional subspace itself, so the multiplicity is 1;

• the eigenvector is orthogonal to v, i.e., x ⊥ v; then PE x = 0 and λ = 0. One can find n − 1 linearly independent such eigenvectors, so the multiplicity of the eigenvalue λ = 0 is n − 1.

c) det(A − I − λI) = det(A − (λ + 1)I), i.e., λ is an eigenvalue of A − I exactly when λ + 1 is an eigenvalue of A. Hence the eigenvalues of A − I are the eigenvalues of A minus 1: n − 1 with multiplicity 1 and −1 with multiplicity n − 1.

d) det(A − I) = (n − 1)(−1)^{n−1}, which equals n − 1 if n is odd and 1 − n if n is even.

3.10. (Legendre's polynomials.) Hint: running the Gram-Schmidt orthogonalization algorithm is sufficient, but remember to use the inner product defined in the problem, e.g., ||1||² = (1, 1) = ∫_{−1}^{1} 1·1 dt = 2.

3.11. Let P = PE be the matrix of an orthogonal projection onto a subspace E. Show that
a) the matrix P is self-adjoint, meaning that P^* = P;
b) P² = P.
Remark: these two properties completely characterize orthogonal projections.

Proof a) By orthogonality, (x, x − Px) = (x − Px, x) = 0 for all x. On one hand,

(x, x − Px) = (x − Px)^* x = (x^* − x^*P^*)x = x^*x − x^*P^*x = 0,

and on the other hand,

(x − Px, x) = x^*(x − Px) = x^*x − x^*Px = 0.

Subtracting the two equalities gives x^*(P − P^*)x = 0 for all x, which forces P − P^* = 0_{n×n}, i.e., P = P^*.

b) Consider (Px, x − Px) = (x − Px)^* Px = (x^* − x^*P^*)Px = x^*(P − P^*P)x = 0 for all x. Thus P = P^*P = P², using P = P^*.

3.13. Suppose P is the orthogonal projection onto a subspace E, and Q is the orthogonal projection onto the orthogonal complement E⊥.
a) What are P + Q and PQ?
b) Show that P − Q is its own inverse.

Proof a) P + Q = I, since (P + Q)x = Px + Qx = P_E x + P_{E⊥} x = x for every x. PQ = 0_{n×n}, since for every x, x^*PQx = x^*P^*Qx = (Qx, Px) = 0 (P is self-adjoint by Problem 3.11, and Px ∈ E, Qx ∈ E⊥ are orthogonal); hence PQ = 0.

b) (P − Q)² = (P − Q)(P − Q) = P² − PQ − QP + Q² = P² + Q² = P² + Q² + PQ + QP = (P + Q)² = I² = I, using PQ = QP = 0. Hence (P − Q)^(−1) = P − Q.

4.5. (Minimal norm solution.) Let the equation Ax = b have a solution, and let A have a non-trivial kernel (so the solution is not unique). Prove that
a) there exists a unique solution x0 of Ax = b minimizing the norm ||x||, i.e., there exists a unique x0 such that Ax0 = b and ||x0|| ≤ ||x|| for any x satisfying Ax = b;
b) x0 = P_{(Ker A)⊥} x for any x satisfying Ax = b.

Proof a) Suppose x0, x1 are solutions of Ax = b. Then A(x1 − x0) = 0, i.e., x1 − x0 ∈ Ker A. As a result,

P_{(Ker A)⊥}(x1 − x0) = 0 = P_{(Ker A)⊥} x1 − P_{(Ker A)⊥} x0,

so P_{(Ker A)⊥} x takes the same value, call it h, for every solution x. Note that h itself is a solution: h = x − P_{Ker A} x with P_{Ker A} x ∈ Ker A, so Ah = Ax = b. Moreover,

||x||² = ||P_{(Ker A)⊥} x||² + ||x − P_{(Ker A)⊥} x||² ≥ ||h||²

for any x satisfying Ax = b, with equality only when x − P_{(Ker A)⊥} x = 0, i.e., x = h. Hence x0 = h is the unique solution of smallest norm, which gives both existence and uniqueness.

b) This was shown above: x0 = h = P_{(Ker A)⊥} x for any solution x.
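Problem 4.5 can be illustrated numerically: for an underdetermined consistent system, projecting any particular solution onto (Ker A)⊥ gives the minimal-norm solution, and it agrees with what the pseudoinverse returns (numpy.linalg.pinv yields the minimal-norm least-squares solution). The example matrix and vectors below are my own choices.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])          # 2 x 3, non-trivial kernel
b = np.array([6.0, 2.0])

x_min = np.linalg.pinv(A) @ b            # minimal-norm solution x0

# A particular solution, e.g. x = (2, 2, 0).
x = np.array([2.0, 2.0, 0.0])
assert np.allclose(A @ x, b)

# Orthonormal basis of Ker A from the SVD, then project x onto (Ker A)^perp.
_, s, Vt = np.linalg.svd(A)
kernel = Vt[len(s):].T                   # columns span Ker A (A has full row rank here)
x_proj = x - kernel @ (kernel.T @ x)     # P_{(Ker A)^perp} x

assert np.allclose(x_proj, x_min)
print(x_min, np.linalg.norm(x_min), np.linalg.norm(x))
```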
5.1. Show that for a square matrix A the equality det(A^*) = conj(det A) holds.

Proof det(A^*) = det(conj(A)^T) = det(conj(A)) = conj(det A), since transposition does not change the determinant and conjugating every entry conjugates every product in the determinant expansion.

5.3. Let A be an m × n matrix. Show that Ker A = Ker(A^*A).

Proof It is easy to see that Ker A ⊂ Ker(A^*A). Next we show Ker(A^*A) ⊂ Ker A. Consider ||Ax||² = (Ax, Ax) = x^*A^*Ax. Thus, if A^*Ax = 0, then x^*A^*Ax = ||Ax||² = 0, i.e., Ax = 0. Hence Ker(A^*A) ⊂ Ker A, and we conclude that Ker A = Ker(A^*A).

6.4. Show that a product of unitary (orthogonal) matrices is unitary (orthogonal) as well.

Proof Suppose U1, U2 are unitary (orthogonal). Then (U1U2)^* U1U2 = U2^*U1^*U1U2 = U2^* I U2 = I. By Lemma 6.2, the product is unitary (orthogonal).

Chapter 6. Structure of Operators in Inner Product Spaces

1.1. Use the upper triangular representation of an operator to give an alternative proof of the fact that the determinant is the product and the trace is the sum of the eigenvalues, counting multiplicities.

Proof (The proof uses the fact that the entries on the diagonal of T are the eigenvalues of A, counting multiplicity; this does not seem to be stated explicitly in the book, but it is standard.) Write the upper triangular (Schur) representation A = UTU^* with U unitary and T upper triangular. Then

det A = (det U)(det T)(det U^*) = det T = Π_{i=1}^{n} λi,

because (det U)(det U^*) = det(UU^*) = 1 and T is upper triangular with the eigenvalues of A on its diagonal.

For the trace, write U = [u1 u2 ... un]. Multiplying out UTU^* gives

A = Σ_{j=1}^{n} λj uj uj^* + Σ_{i<j} t_{ij} ui uj^*.

Recall that trace(xy^*) = y^*x for column vectors x, y (the trace of an outer product is the corresponding inner product). By the orthonormality of u1, ..., un, trace(ui uj^*) = uj^* ui = 0 for i ≠ j and trace(uj uj^*) = ||uj||² = 1. Hence

trace A = Σ_{j=1}^{n} λj trace(uj uj^*) + Σ_{i<j} t_{ij} trace(ui uj^*) = λ1 + λ2 + ... + λn = Σ_{i=1}^{n} λi.

2.2. True or false: the sum of normal operators is normal? Justify your conclusion.

Solution False; normality is not preserved under addition in general. For example,

N1 = [  0 1 ]      N2 = [ 1  0 ]
     [ −1 0 ],          [ 0 −1 ]

are both normal (N1 is skew-symmetric, so N1^*N1 = −N1² = N1N1^*; N2 is diagonal), but their sum

S = N1 + N2 = [  1  1 ]
              [ −1 −1 ]

satisfies S^*S = [ 2 2 ; 2 2 ] while SS^* = [ 2 −2 ; −2 2 ], so S is not normal. (The sum is normal in special cases, for example when both operators are self-adjoint or, more generally, when they are diagonalized by a common orthonormal basis.)

2.9. Give a proof if the statement is true, or give a counterexample if it is false:

a) If A = A^* then A + iI is invertible. True: the eigenvalues of A + iI are λj + i, where the λj are the (real) eigenvalues of the self-adjoint A, so det(A + iI) = Π_{j=1}^{n}(λj + i) ≠ 0 (if c1, c2 ∈ C and c1c2 = 0, then at least one of c1, c2 is 0).

b) If U is unitary, then U + (3/4)I is invertible. True: if (U + (3/4)I)x = Ux + (3/4)x = 0, then, since ||Ux|| = ||x||, we get 0 = ||Ux + (3/4)x|| ≥ ||Ux|| − ||(3/4)x|| = (1/4)||x||, so x = 0. The homogeneous equation has only the trivial solution, hence U + (3/4)I is invertible.

c) If a matrix A is real, then A − iI is invertible. False: a real matrix can have the eigenvalue i; for example, the rotation by π/2, [ 0 −1 ; 1 0 ], has eigenvalues ±i, and then A − iI is not invertible.

3.1. Show that the number of non-zero singular values of a matrix A coincides with its rank.

Proof Suppose A is m × n. We know that Ker A = Ker(A^*A) (Problem 5.3 of Chapter 5), so

rank A = n − dim Ker A = n − dim Ker(A^*A) = rank(A^*A).

Writing the singular value decomposition A = WΣV^*, we get A^*A = VΣ^*W^*WΣV^* = V(Σ^*Σ)V^*, where Σ^*Σ is the n × n diagonal matrix whose diagonal entries are the squares of the singular values. Then

rank A = rank(A^*A) = rank(V(Σ^*Σ)V^*) = rank(Σ^*Σ) = number of non-zero singular values,

because V is unitary (full rank).
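Problem 3.1 is easy to watch in action: build a matrix of known rank and compare numpy's rank with the number of (numerically) nonzero singular values. The example below is mine, not the book's.

```python
import numpy as np

rng = np.random.default_rng(2)
# A 5 x 4 matrix of rank 2: product of 5 x 2 and 2 x 4 factors.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

s = np.linalg.svd(A, compute_uv=False)
print(np.round(s, 6))                                  # two large values, two ~ 0
print(np.sum(s > 1e-10), np.linalg.matrix_rank(A))     # both equal 2
```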
3.5. Find the singular value decomposition of the matrix

A = [ 2 3 ]
    [ 0 2 ].

Use it to find
a) max_{||x|| ≤ 1} ||Ax|| and the vector where the maximum is attained;
b) min_{||x|| = 1} ||Ax|| and the vector where the minimum is attained;
c) the image A(B) of the closed unit ball in R2, B = {x ∈ R2 : ||x|| ≤ 1}. Describe A(B) geometrically.

Solution (The steps of computing the SVD are omitted here.) a) Write A = WΣV^*. Then (Ax, Ax) = x^*A^*Ax = x^*VΣ²V^*x = (V^*x)^*Σ²(V^*x). Define y = [y1 y2]^T = V^*x; because V is orthogonal, y also ranges over the unit ball. Thus

(Ax, Ax) = y^*Σ²y = [y1 y2] [ 16 0 ] [ y1 ]  = 16y1² + y2²,
                            [  0 1 ] [ y2 ]

with y1² + y2² ≤ 1. Hence the maximum of ||Ax||² is 16, i.e., max ||Ax|| = 4, attained when y = [1 0]^T; the corresponding x is recovered as x = Vy, the first column of V.

b) Similarly, the minimum of ||Ax||² is 1, i.e., min_{||x||=1} ||Ax|| = 1, attained when y = [0 1]^T, i.e., x is the second column of V.

c) A(B) is an ellipse together with its interior, with semi-axes 4 and 1 along the directions of the columns of W.

3.8. Let A be an m × n matrix. Prove that the non-zero eigenvalues of the matrices A^*A and AA^* (counting multiplicities) coincide.

Proof Suppose v is an eigenvector of A^*A corresponding to a non-zero eigenvalue λ, i.e., A^*Av = λv. Then AA^*(Av) = A(λv) = λ(Av), and Av ≠ 0 (otherwise λv = A^*Av = 0 would force v = 0), so λ is an eigenvalue of AA^* with eigenvector Av. Similarly, every non-zero eigenvalue of AA^* is an eigenvalue of A^*A. For multiplicities, note that v ↦ Av maps Ker(A^*A − λI) linearly and injectively into Ker(AA^* − λI), and w ↦ A^*w maps back, so the two eigenspaces have the same dimension; since A^*A and AA^* are Hermitian, geometric and algebraic multiplicities agree. Thus the non-zero eigenvalues of A^*A and AA^* coincide, counting multiplicities.

4.2. Let A be a normal operator, and let λ1, λ2, ..., λn be its eigenvalues (counting multiplicities). Show that the singular values of A are |λ1|, |λ2|, ..., |λn|.

Proof First we show that a normal operator A satisfies ||Ax|| = ||A^*x|| for every x. Since AA^* = A^*A,

0 = ((AA^* − A^*A)x, x) = (AA^*x, x) − (A^*Ax, x) = (A^*x, A^*x) − (Ax, Ax) = ||A^*x||² − ||Ax||²,

hence ||Ax|| = ||A^*x||. Now suppose v is an eigenvector of A corresponding to the eigenvalue λ. The operator A − λI is also normal (a direct computation using AA^* = A^*A shows (A − λI)(A − λI)^* = (A − λI)^*(A − λI)), so

0 = ||(A − λI)v|| = ||(A − λI)^*v|| = ||(A^* − λ̄I)v||,

i.e., A^*v = λ̄v. Therefore A^*Av = A^*(λv) = λλ̄v = |λ|²v, so |λ|² is an eigenvalue of A^*A and |λ| is a singular value of A. Since a normal operator has an orthonormal basis of eigenvectors v1, ..., vn, this gives A^*Avj = |λj|²vj for all j, so the singular values of A, counting multiplicities, are exactly |λ1|, |λ2|, ..., |λn|.

4.4. Let A = W̃Σ̃Ṽ^* be a reduced singular value decomposition of A. Show that Ran A = Ran W̃ and then, by taking adjoints, that Ran A^* = Ran Ṽ.

Proof Suppose A is m × n with rank r, Σ̃ = diag(σ1, σ2, ..., σr), W̃ is m × r and Ṽ^* is r × n. To show Ran A = Ran W̃ it suffices to show that Ran(Σ̃Ṽ^*) = R^r, for then Ran A = Ran(W̃ Σ̃Ṽ^*) = W̃(R^r) = Ran W̃. This holds because rank Ṽ^* = r (and r ≤ n) and Σ̃ has full rank r, so Σ̃Ṽ^* maps onto R^r. Taking adjoints, A^* = ṼΣ̃W̃^*, and the same argument gives Ran A^* = Ran Ṽ.

* * * * * End * * * * *