Math 103, Summer 2006
Orthonormal Bases, Orthogonal Complements and Projections
July 17, 2006
aschultz@stanford.edu · http://math.stanford.edu/~aschultz/summer06/math103

1. Recap

Recently we've talked about plenty of topics. These will form the core of our linear algebra vocabulary and are as important as ever (so don't forget them!). Mastering this language, however, means understanding how different terms interrelate. The following theorem is just one attempt to bring together many of the concepts we have discussed in this class. You should think about how you could make this result more general.

Theorem 1.1. Suppose that $A$ is an $n \times n$ matrix. Then the following are equivalent statements.
(1) $A$ is invertible.
(2) The linear system $A\vec{x} = \vec{b}$ has a unique solution for all $\vec{b} \in \mathbb{R}^n$.
(3) The linear system $A\vec{x} = \vec{b}$ has a unique solution for some $\vec{b} \in \mathbb{R}^n$.
(4) The linear system $A\vec{x} = \vec{0}$ has a unique solution.
(5) $\operatorname{rref}(A) = I_n$.
(6) $\operatorname{rank}(A) = n$.
(7) $\operatorname{im}(A) = \mathbb{R}^n$.
(8) $\ker(A) = \{\vec{0}\}$.
(9) The columns of $A$ are linearly independent.
(10) The columns of $A$ span $\mathbb{R}^n$.
(11) The columns of $A$ form a basis for $\mathbb{R}^n$.

2. Orthogonality

The dot product has been our friend over the last few weeks, as it is a simple algebraic tool which reveals a lot about the geometry of vectors. For instance, we can detect whether two vectors are perpendicular via the dot product. We have also described the magnitude of a vector with the dot product. Since we know
$$\vec{x} \cdot \vec{y} = \cos(\theta)\,\|\vec{x}\|\,\|\vec{y}\|,$$
where $\theta$ is the angle between $\vec{x}$ and $\vec{y}$, we can solve for $\theta$:
$$\theta = \arccos\left(\frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\|\,\|\vec{y}\|}\right).$$
The inequality $|\vec{x} \cdot \vec{y}| \leq \|\vec{x}\|\,\|\vec{y}\|$ is known as the Cauchy-Schwarz inequality and follows from the formula for the dot product listed above: simply take the absolute value of each side and remember that cosine takes values between $-1$ and $1$.

We're going to focus on orthogonality this week, and we'll find that this will give us 'convenient' bases for a space $V$.

Definition 2.1. A collection of vectors $\vec{v}_1, \dots, \vec{v}_s$ in $\mathbb{R}^m$ is called orthonormal if
$$\vec{v}_i \cdot \vec{v}_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases}$$
This means that distinct vectors in the collection are orthogonal and that each vector in the collection has unit length.

Example. You can check very quickly that the standard basis of $\mathbb{R}^n$ is a collection of orthonormal vectors. You can also check that for any value of $\theta$, the collection
$$\left\{ \begin{pmatrix} \cos(\theta) \\ \sin(\theta) \end{pmatrix}, \begin{pmatrix} -\sin(\theta) \\ \cos(\theta) \end{pmatrix} \right\}$$
is orthonormal. In fact, any orthonormal collection of two vectors in $\mathbb{R}^2$ takes this form (this is harder to prove, but not by much).

Orthonormal collections are nice for many reasons, but one of the big ones is the following.

Theorem 2.1. A collection of orthonormal vectors is linearly independent.

Proof. Let $\vec{v}_1, \dots, \vec{v}_s$ be our orthonormal collection, and suppose we have
$$c_1\vec{v}_1 + \cdots + c_s\vec{v}_s = \vec{0}.$$
We would like to show that $c_i = 0$ for every $i$. Toward this end, we will take the dot product of the above equation with $\vec{v}_i$:
$$\vec{v}_i \cdot (c_1\vec{v}_1 + \cdots + c_s\vec{v}_s) = \vec{v}_i \cdot \vec{0}.$$
The right hand side is clearly $0$; on the left hand side we distribute the dot product across our sum and then pull out scalars, leaving
$$c_1(\vec{v}_i \cdot \vec{v}_1) + \cdots + c_i(\vec{v}_i \cdot \vec{v}_i) + \cdots + c_s(\vec{v}_i \cdot \vec{v}_s).$$
But since our collection is orthonormal, $\vec{v}_i \cdot \vec{v}_j = 0$ if $i \neq j$ and $\vec{v}_i \cdot \vec{v}_i = 1$. This means the left hand side of our equation is just $c_i$, and we conclude $c_i = 0$ as desired. $\square$
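Definition 2.1 is easy to experiment with on a computer. Here is a minimal numerical sketch in Python with NumPy (not part of these notes; the helper name `is_orthonormal` and the sample vectors are mine) that checks the rotated collection from the example and recovers an angle via the arccos formula above.

```python
import numpy as np

def is_orthonormal(vectors, tol=1e-12):
    """Check v_i . v_j = 0 for i != j and v_i . v_i = 1 (Definition 2.1)."""
    V = np.column_stack(vectors)   # the vectors become columns of V
    gram = V.T @ V                 # entry (i, j) is exactly v_i . v_j
    return np.allclose(gram, np.eye(len(vectors)), atol=tol)

theta = 0.7
u1 = np.array([np.cos(theta), np.sin(theta)])
u2 = np.array([-np.sin(theta), np.cos(theta)])
print(is_orthonormal([u1, u2]))    # True: the collection from the example

# Recover the angle between two vectors: theta = arccos(x.y / (|x||y|))
x, y = np.array([1.0, 0.0]), np.array([1.0, 1.0])
angle = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
print(np.isclose(angle, np.pi / 4))   # True: 45 degrees, as geometry predicts
```

The Gram-matrix check is just Definition 2.1 written all at once: entry $(i, j)$ of $V^T V$ is $\vec{v}_i \cdot \vec{v}_j$, so orthonormality means $V^T V$ is the identity.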
Corollary 2.2. $n$ orthonormal vectors in an $n$-dimensional space form a basis.

Proof. We have just seen that orthonormal vectors are linearly independent. An old result on dimension says that if we have $n$ linearly independent vectors in an $n$-dimensional space, these vectors actually form a basis for the space. Since we are living in an $n$-dimensional space by hypothesis, our collection of $n$ linearly independent (orthonormal) vectors is therefore a basis. $\square$

We will now introduce a second topic in the vein of orthogonality.

Definition 2.2. For a subspace $V$ of $\mathbb{R}^m$, the orthogonal complement of $V$ is defined to be
$$V^{\perp} := \{\vec{x} \in \mathbb{R}^m : \vec{x} \cdot \vec{v} = 0 \text{ for all } \vec{v} \in V\}.$$
In words, $V^{\perp}$ is the collection of all vectors which are orthogonal to every vector in $V$.

The orthogonal complement will be useful to us in many ways, and it is helpful to have some basic facts under our belt.

Theorem 2.3. For a subspace $V$ of $\mathbb{R}^n$,
(1) $V^{\perp}$ is a subspace;
(2) $V \cap V^{\perp} = \{\vec{0}\}$;
(3) $\dim(V) + \dim(V^{\perp}) = n$; and
(4) $(V^{\perp})^{\perp} = V$.

Sketch of proof. We have dealt with spaces like $V^{\perp}$ before, and in particular on the test you essentially proved the first fact (for a more detailed proof of this fact, just come talk to me). For the second statement, suppose that $\vec{v} \in V \cap V^{\perp}$. This means that $\vec{v}$ is a vector which is orthogonal to every vector in $V$, including itself! On the test we saw that the only vector which is orthogonal to itself is $\vec{0}$, and so we conclude $\vec{v} = \vec{0}$ as desired. We'll prove the third statement tomorrow after we recognize $V^{\perp}$ as the kernel of a linear operator. The fourth statement is a kind of duality statement; it is a cute exercise you might want to do by yourself (so I'll leave it to you). $\square$

3. Projection

We're going to discuss a class of linear operators which are simplified greatly because of orthonormal bases. We'll start by first considering the $1$-dimensional case.

Example. Suppose $L$ is a line through the origin in $\mathbb{R}^2$. For a vector $\vec{v} \in \mathbb{R}^2$, what is a formula for the projection of $\vec{v}$ to $L$?

Solution. Sketching the situation, we see that the projection of $\vec{v}$ to $L$ lies in the direction of $L$, so we only need to find its magnitude. If $\theta$ is the angle between $\vec{v}$ and $L$ then this magnitude is just $\cos(\theta)\|\vec{v}\|$ (just think of $\vec{v}$ as the hypotenuse of a right triangle and use your trigonometry skills). How can we express this in a more linear-algebra-friendly way? Choose a unit vector $\vec{u}$ in the direction of $L$. The magnitude of the projection of $\vec{v}$ onto $L$ is then
$$\|\operatorname{proj}_L(\vec{v})\| = \vec{u} \cdot \vec{v}.$$
Since we have $\vec{u}$ handy, we can in fact describe more than just the magnitude of the projection; we can give a formula for the vector itself:
$$\operatorname{proj}_L(\vec{v}) = (\vec{u} \cdot \vec{v})\,\vec{u}. \qquad \square$$
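The solution translates directly into a two-line computation. Below is a sketch of the formula $\operatorname{proj}_L(\vec{v}) = (\vec{u} \cdot \vec{v})\vec{u}$ in NumPy; the particular line and vector are arbitrary choices of mine for illustration.

```python
import numpy as np

def proj_line(v, d):
    """Project v onto the line spanned by d: (u . v) u with u = d/|d|."""
    u = d / np.linalg.norm(d)   # unit vector in the direction of L
    return (u @ v) * u

v = np.array([3.0, 1.0])
d = np.array([1.0, 1.0])        # L is the line y = x
p = proj_line(v, d)
print(p)                             # [2. 2.]
print(np.isclose((v - p) @ d, 0.0))  # True: the leftover v - p is orthogonal to L
```

The last line previews the theme of the next theorem: the part of $\vec{v}$ left over after projecting is orthogonal to the subspace we projected onto.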
Now we'll bump this up to a more general space.

Theorem 3.1. For a vector $\vec{x} \in \mathbb{R}^m$ and a subspace $V$ of $\mathbb{R}^m$, we can write
$$\vec{x} = \vec{x}^{\parallel} + \vec{x}^{\perp}$$
where $\vec{x}^{\parallel} \in V$ and $\vec{x}^{\perp} \in V^{\perp}$. Moreover this is the only way to write $\vec{x}$ as the sum of a vector in $V$ and a vector in $V^{\perp}$.

[Figure 1. Decomposing $\vec{x}$ into its components $\vec{x}^{\parallel} \in V$ and $\vec{x}^{\perp} \in V^{\perp}$.]

Proof. I'll start the proof by choosing an orthonormal basis $\vec{u}_1, \dots, \vec{u}_s$ of $V$. We'll see in class tomorrow that such a basis always exists.

Now we'll do a little scratch work to help us find the vectors we're after. First, since $\vec{x}^{\parallel} \in V$, and since $\vec{u}_1, \dots, \vec{u}_s$ is a basis of $V$, we know we will be able to write
$$\vec{x}^{\parallel} = c_1\vec{u}_1 + \cdots + c_s\vec{u}_s.$$
To find what the $c_i$ should be, we'll recognize that since $\vec{x}^{\perp} = \vec{x} - \vec{x}^{\parallel}$ is an element of $V^{\perp}$, we should have
$$0 = \vec{u}_i \cdot \vec{x}^{\perp} = \vec{u}_i \cdot (\vec{x} - \vec{x}^{\parallel}) = \vec{u}_i \cdot (\vec{x} - c_1\vec{u}_1 - \cdots - c_s\vec{u}_s) = \vec{u}_i \cdot \vec{x} - c_1(\vec{u}_i \cdot \vec{u}_1) - \cdots - c_s(\vec{u}_i \cdot \vec{u}_s) = \vec{u}_i \cdot \vec{x} - c_i.$$
This means that we should choose $c_i = \vec{u}_i \cdot \vec{x}$ to make our vector $\vec{x}^{\parallel}$. So let's do it.

Define $\vec{x}^{\parallel} = (\vec{u}_1 \cdot \vec{x})\vec{u}_1 + \cdots + (\vec{u}_s \cdot \vec{x})\vec{u}_s$. Since the $\vec{u}_i$ are in $V$, this means that $\vec{x}^{\parallel} \in V$ as desired. Now define $\vec{x}^{\perp} = \vec{x} - \vec{x}^{\parallel}$; this makes the equality $\vec{x} = \vec{x}^{\parallel} + \vec{x}^{\perp}$ clear. We now have to show that $\vec{x}^{\perp} \in V^{\perp}$.

Toward this end, let $\vec{v}$ be an element of $V$. We'll write $\vec{v} = d_1\vec{u}_1 + \cdots + d_s\vec{u}_s$, and we compute
$$\vec{v} \cdot \vec{x}^{\perp} = (d_1\vec{u}_1 + \cdots + d_s\vec{u}_s) \cdot (\vec{x} - (\vec{u}_1 \cdot \vec{x})\vec{u}_1 - \cdots - (\vec{u}_s \cdot \vec{x})\vec{u}_s)$$
$$= d_1(\vec{u}_1 \cdot \vec{x}) + \cdots + d_s(\vec{u}_s \cdot \vec{x}) - d_1\,\vec{u}_1 \cdot \big((\vec{u}_1 \cdot \vec{x})\vec{u}_1 + \cdots + (\vec{u}_s \cdot \vec{x})\vec{u}_s\big) - \cdots - d_s\,\vec{u}_s \cdot \big((\vec{u}_1 \cdot \vec{x})\vec{u}_1 + \cdots + (\vec{u}_s \cdot \vec{x})\vec{u}_s\big)$$
$$= d_1(\vec{u}_1 \cdot \vec{x}) + \cdots + d_s(\vec{u}_s \cdot \vec{x}) - d_1(\vec{u}_1 \cdot \vec{x})(\vec{u}_1 \cdot \vec{u}_1) - \cdots - d_s(\vec{u}_s \cdot \vec{x})(\vec{u}_s \cdot \vec{u}_s)$$
$$= d_1(\vec{u}_1 \cdot \vec{x}) + \cdots + d_s(\vec{u}_s \cdot \vec{x}) - d_1(\vec{u}_1 \cdot \vec{x}) - \cdots - d_s(\vec{u}_s \cdot \vec{x}) = 0.$$
In the second- and third-to-last equalities we have used the fact that the $\vec{u}_i$ are an orthonormal basis.

Great! So we have written $\vec{x} = \vec{x}^{\parallel} + \vec{x}^{\perp}$ with $\vec{x}^{\parallel} \in V$ and $\vec{x}^{\perp} \in V^{\perp}$ as desired. We have only to show this decomposition is unique. Suppose then that
$$\vec{x} = \vec{x}_1^{\parallel} + \vec{x}_1^{\perp} = \vec{x}_2^{\parallel} + \vec{x}_2^{\perp}$$
with $\vec{x}_i^{\parallel} \in V$ and $\vec{x}_i^{\perp} \in V^{\perp}$. We need to show that $\vec{x}_1^{\parallel} = \vec{x}_2^{\parallel}$ and $\vec{x}_1^{\perp} = \vec{x}_2^{\perp}$. Using the equality above I reorganize and find
$$\vec{x}_1^{\parallel} - \vec{x}_2^{\parallel} = \vec{x}_2^{\perp} - \vec{x}_1^{\perp}.$$
But the left hand side is in $V$ while the right hand side is in $V^{\perp}$, so this vector is an element of $V \cap V^{\perp}$. We have seen earlier that the only element in both $V$ and $V^{\perp}$ is the zero vector, and hence we conclude
$$\vec{0} = \vec{x}_1^{\parallel} - \vec{x}_2^{\parallel} = \vec{x}_2^{\perp} - \vec{x}_1^{\perp}.$$
Hence we have $\vec{x}_1^{\parallel} = \vec{x}_2^{\parallel}$ and $\vec{x}_1^{\perp} = \vec{x}_2^{\perp}$ as desired. $\square$
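Since the proof is constructive, it runs verbatim on a computer: form $\vec{x}^{\parallel} = \sum_i (\vec{u}_i \cdot \vec{x})\vec{u}_i$ and set $\vec{x}^{\perp} = \vec{x} - \vec{x}^{\parallel}$. Below is a NumPy sketch; the choice of $V$ (the $xy$-plane in $\mathbb{R}^3$) and the test vector are hypothetical examples of mine.

```python
import numpy as np

def decompose(x, ortho_basis):
    """Return (x_par, x_perp): x_par = sum (u_i . x) u_i lies in V, x_perp in V-perp."""
    x_par = sum((u @ x) * u for u in ortho_basis)
    return x_par, x - x_par

# V = the xy-plane in R^3, with its standard orthonormal basis
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
x = np.array([2.0, -3.0, 5.0])

x_par, x_perp = decompose(x, [u1, u2])
print(x_par, x_perp)    # [ 2. -3.  0.] [0. 0. 5.]
# x_perp is orthogonal to every basis vector of V, hence to all of V:
print(np.allclose([u1 @ x_perp, u2 @ x_perp], 0.0))   # True
```

Checking orthogonality against the basis vectors suffices, exactly as in the proof: any $\vec{v} \in V$ is a combination of the $\vec{u}_i$, so $\vec{v} \cdot \vec{x}^{\perp} = 0$ follows by linearity.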
For your general math edification, the first part of this theorem says $\mathbb{R}^m = V + V^{\perp}$. The uniqueness part makes this a more special sum: sometimes mathematicians will write $\mathbb{R}^m = V \oplus V^{\perp}$ and say that the sum is 'direct.'

We now define a map which projects any vector onto a space $V$: for any vector $\vec{x} \in \mathbb{R}^m$ and any subspace $V \subseteq \mathbb{R}^m$, the projection of $\vec{x}$ onto $V$ is
$$\operatorname{proj}_V(\vec{x}) = \vec{x}^{\parallel},$$
where $\vec{x}^{\parallel}$ is the vector in the theorem above. The projection function will be handy later in the week. For now let's record some consequences of the theorem above (and its proof).

Corollary 3.2. Suppose that $\vec{u}_1, \dots, \vec{u}_s$ is an orthonormal basis for $V$. Then
$$\operatorname{proj}_V(\vec{x}) = (\vec{u}_1 \cdot \vec{x})\vec{u}_1 + \cdots + (\vec{u}_s \cdot \vec{x})\vec{u}_s.$$

In the special case where $V = \mathbb{R}^m$, the projection map leaves every vector alone. Combined with the previous corollary, this gives

Corollary 3.3. For an orthonormal basis $\vec{u}_1, \dots, \vec{u}_m$ of $\mathbb{R}^m$,
$$\vec{x} = (\vec{u}_1 \cdot \vec{x})\vec{u}_1 + \cdots + (\vec{u}_m \cdot \vec{x})\vec{u}_m.$$

Proof. The previous corollary gives
$$\operatorname{proj}_{\mathbb{R}^m}(\vec{x}) = (\vec{u}_1 \cdot \vec{x})\vec{u}_1 + \cdots + (\vec{u}_m \cdot \vec{x})\vec{u}_m,$$
while the observation $\operatorname{proj}_{\mathbb{R}^m}(\vec{x}) = \vec{x}$ finishes the proof. $\square$

One geometric property to note is that magnitude cannot go up after projection:

Corollary 3.4. For a vector $\vec{x} \in \mathbb{R}^m$ and a subspace $V \subseteq \mathbb{R}^m$,
$$\|\operatorname{proj}_V(\vec{x})\| \leq \|\vec{x}\|,$$
with equality if and only if $\vec{x} \in V$.

Proof. The proof of this fact is essentially the Pythagorean theorem. Notice that $\vec{x}$, $\vec{x}^{\parallel}$ and $\vec{x}^{\perp}$ make a right triangle (at least after we shift $\vec{x}^{\perp}$), so we have
$$\|\vec{x}\|^2 = \|\vec{x}^{\parallel}\|^2 + \|\vec{x}^{\perp}\|^2.$$
Since $\|\vec{x}^{\perp}\|^2 \geq 0$ (and equal to $0$ only in the case when $\vec{x}^{\perp} = \vec{0}$, i.e., when $\vec{x} = \vec{x}^{\parallel}$, i.e., when $\vec{x} \in V$) we have $\|\vec{x}^{\parallel}\|^2 \leq \|\vec{x}\|^2$ with equality if and only if $\vec{x} \in V$. Taking square roots of both sides gives the desired result. $\square$

[Figure 2. $\vec{x}$, $\vec{x}^{\parallel} = \operatorname{proj}_V(\vec{x})$ and $\vec{x}^{\perp}$ (shifted) make a right triangle.]

Finally we will notice that projections are linear transformations, thus justifying why we, as linear algebraists, care about them at all.

Corollary 3.5. The map $T(\vec{x}) = \operatorname{proj}_V(\vec{x})$ is linear.

Proof. We just need to show that the action of projection is given by a matrix. First recall that
$$\operatorname{proj}_V(\vec{x}) = (\vec{u}_1 \cdot \vec{x})\vec{u}_1 + \cdots + (\vec{u}_s \cdot \vec{x})\vec{u}_s,$$
where $\vec{u}_1, \dots, \vec{u}_s$ is an orthonormal basis for $V$. Now let $A$ be the matrix
$$A = \begin{pmatrix} \vec{u}_1 & \cdots & \vec{u}_s \end{pmatrix} \begin{pmatrix} \vec{u}_1^{\,T} \\ \vdots \\ \vec{u}_s^{\,T} \end{pmatrix}.$$
Then we have
$$A\vec{x} = \begin{pmatrix} \vec{u}_1 & \cdots & \vec{u}_s \end{pmatrix} \begin{pmatrix} \vec{u}_1^{\,T} \\ \vdots \\ \vec{u}_s^{\,T} \end{pmatrix} \vec{x} = \begin{pmatrix} \vec{u}_1 & \cdots & \vec{u}_s \end{pmatrix} \begin{pmatrix} \vec{u}_1 \cdot \vec{x} \\ \vdots \\ \vec{u}_s \cdot \vec{x} \end{pmatrix} = (\vec{u}_1 \cdot \vec{x})\vec{u}_1 + \cdots + (\vec{u}_s \cdot \vec{x})\vec{u}_s = \operatorname{proj}_V(\vec{x}). \qquad \square$$
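To see Corollary 3.5 in action, here is a sketch that builds the matrix $A$ from the proof as (columns)(rows) and confirms that $A\vec{x}$ agrees with the dot-product formula for $\operatorname{proj}_V$. The tilted plane in $\mathbb{R}^3$ and the name `Q` for the matrix of basis columns are hypothetical choices of mine; only $A$ appears in the proof.

```python
import numpy as np

u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)   # an orthonormal basis of a
u2 = np.array([0.0, 0.0, 1.0])                 # tilted plane V inside R^3
Q = np.column_stack([u1, u2])                  # columns u_1, ..., u_s

A = Q @ Q.T                                    # the matrix of proj_V from the proof

x = np.array([1.0, 3.0, -2.0])
formula = (u1 @ x) * u1 + (u2 @ x) * u2        # (u_1 . x)u_1 + ... + (u_s . x)u_s
print(np.allclose(A @ x, formula))             # True: A x = proj_V(x)
print(np.allclose(A @ A, A))                   # True: projecting twice changes nothing
```

The last check reflects the geometry: once a vector lies in $V$, Corollary 3.4 says projection leaves it alone, so $A^2 = A$. Algebraically this is because $Q^T Q = I$ when the columns of $Q$ are orthonormal.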