THE GRAM-SCHMIDT PROCESS

MATH 5316, FALL 2012

LANCE D. DRAGER

Version Time-stamp: "2012-11-29 16:58:23 drager".

1. The Gram-Schmidt Process

Algorithm 1.1. Let $V$ be an inner product space over $\mathbb{K}$. Let $v_1, v_2, \dots, v_n$ be a basis of $V$. The Gram-Schmidt Process below constructs an orthonormal basis $u_1, u_2, \dots, u_n$ such that
\[
\operatorname{span}(u_1, u_2, \dots, u_k) = \operatorname{span}(v_1, v_2, \dots, v_k), \qquad k = 1, 2, \dots, n.
\]

Introduce the notation
\[
U_k = \operatorname{span}(u_1, u_2, \dots, u_k), \qquad
M_k = \operatorname{span}(v_1, v_2, \dots, v_k).
\]

Recall that if $S \subseteq V$ is a subspace with orthonormal basis $u_1, u_2, \dots, u_k$, and $v \in V$, we can write $v$ uniquely as $v = w + p$, where $w \in S$ and $p$ is orthogonal to $S$. Specifically,
\[
w = \operatorname{proj}_S(v) = \sum_{j=1}^{k} \langle u_j, v \rangle u_j, \qquad
p = \operatorname{proj}_S^{\perp}(v) = v - \operatorname{proj}_S(v).
\]

To start the inductive construction, let
\[
u_1' = v_1, \qquad u_1 = u_1' / \|u_1'\|.
\]
It should be clear that $U_1 = M_1$, and we record the fact that $v_1 = \|u_1'\| u_1$.

For the next step, we define
\[
u_2' = \operatorname{proj}_{U_1}^{\perp}(v_2) = v_2 - \operatorname{proj}_{U_1}(v_2) = v_2 - \langle u_1, v_2 \rangle u_1, \qquad
u_2 = u_2' / \|u_2'\|.
\]

Clearly $u_2'$, and hence $u_2$, is orthogonal to $u_1$. If $u_2' = 0$, then $v_2 \in U_1 = M_1$, which implies that $v_1$ and $v_2$ are dependent, contrary to our assumption. Thus $u_2' \neq 0$ and the definition of $u_2$ is legitimate.

Since $u_1 \in U_1 = M_1$, we can say that $u_2' \in \operatorname{span}(v_1, v_2) = M_2$, and $u_1 \in M_1 \subseteq M_2$, so $\operatorname{span}(u_1, u_2') \subseteq M_2$. On the other hand, we have
\[
v_2 = u_2' + \langle u_1, v_2 \rangle u_1 \in \operatorname{span}(u_1, u_2'),
\]
and $v_1 \in M_1 = U_1 = \operatorname{span}(u_1)$. Thus $M_2 = \operatorname{span}(v_1, v_2) \subseteq \operatorname{span}(u_1, u_2')$, and we conclude that $\operatorname{span}(u_1, u_2') = \operatorname{span}(v_1, v_2) = M_2$. Since $u_2$ is just a scalar multiple of $u_2'$, $\operatorname{span}(u_1, u_2') = \operatorname{span}(u_1, u_2) = U_2$. Thus $U_2 = M_2$. We record the equation
\[
v_2 = \|u_2'\| u_2 + \langle u_1, v_2 \rangle u_1.
\]

For the next step, we define
\[
u_3' = \operatorname{proj}_{U_2}^{\perp}(v_3) = v_3 - \operatorname{proj}_{U_2}(v_3) = v_3 - \langle u_1, v_3 \rangle u_1 - \langle u_2, v_3 \rangle u_2, \qquad
u_3 = u_3' / \|u_3'\|,
\]
and we can prove $U_3 = M_3$.
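Read computationally, the construction so far is a loop: subtract from each $v_k$ its projections onto the $u$'s already found, then normalize. A minimal numerical sketch in Python (NumPy assumed; the standard dot product on $\mathbb{R}^3$ plays the role of $\langle \cdot, \cdot \rangle$; the function name `gram_schmidt` is ours, not from the notes):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors in R^n.

    Mirrors the construction above: u'_k = v_k - proj_{U_{k-1}}(v_k),
    then u_k = u'_k / ||u'_k||.
    """
    us = []
    for v in vectors:
        # subtract the projection onto the span of the previous u's
        u_prime = v - sum(np.dot(u, v) * u for u in us)
        norm = np.linalg.norm(u_prime)
        if norm < 1e-12:
            # u'_k = 0 would mean v_k lies in U_{k-1} = M_{k-1}
            raise ValueError("input vectors are linearly dependent")
        us.append(u_prime / norm)
    return us

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
us = gram_schmidt(vs)
```

Afterwards the `us` are pairwise orthogonal unit vectors with $\operatorname{span}(u_1, \dots, u_k) = \operatorname{span}(v_1, \dots, v_k)$ for each $k$, matching the algorithm's guarantee.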
The reader should check this as an exercise, but we'll do the general case in a moment.

For the inductive step, suppose that we have constructed an orthonormal list $u_1, u_2, \dots, u_\ell$ so that $U_j = M_j$ for $j = 1, 2, \dots, \ell$. We define
\[
u_{\ell+1}' = \operatorname{proj}_{U_\ell}^{\perp}(v_{\ell+1}) = v_{\ell+1} - \operatorname{proj}_{U_\ell}(v_{\ell+1})
= v_{\ell+1} - \sum_{j=1}^{\ell} \langle u_j, v_{\ell+1} \rangle u_j, \qquad
u_{\ell+1} = u_{\ell+1}' / \|u_{\ell+1}'\|.
\]
If we had $u_{\ell+1}' = 0$, we would have $v_{\ell+1} \in U_\ell = M_\ell$, which would contradict the independence of $v_1, v_2, \dots, v_n$. Thus, our definition of $u_{\ell+1}$ is legitimate.

Since $U_\ell = M_\ell$, we have $u_{\ell+1}' \in M_{\ell+1}$, and we already know that $u_1, \dots, u_\ell$ are in $M_\ell \subseteq M_{\ell+1}$. Thus,
\[
\operatorname{span}(u_1, u_2, \dots, u_\ell, u_{\ell+1}') \subseteq M_{\ell+1}.
\]
On the other hand,
\[
v_{\ell+1} = u_{\ell+1}' + \sum_{j=1}^{\ell} \langle u_j, v_{\ell+1} \rangle u_j,
\]
so $v_{\ell+1} \in \operatorname{span}(u_1, u_2, \dots, u_\ell, u_{\ell+1}')$. We already know $v_1, \dots, v_\ell \in \operatorname{span}(u_1, \dots, u_\ell) = U_\ell = M_\ell$. Thus,
\[
\operatorname{span}(v_1, v_2, \dots, v_{\ell+1}) \subseteq \operatorname{span}(u_1, u_2, \dots, u_\ell, u_{\ell+1}').
\]
Thus, $\operatorname{span}(u_1, u_2, \dots, u_\ell, u_{\ell+1}') = M_{\ell+1}$. Since $u_{\ell+1}$ is just a scalar multiple of $u_{\ell+1}'$, we conclude that $U_{\ell+1} = M_{\ell+1}$. We record the fact that
\[
v_{\ell+1} = \|u_{\ell+1}'\| u_{\ell+1} + \sum_{j=1}^{\ell} \langle u_j, v_{\ell+1} \rangle u_j.
\]

Continuing this inductive construction, we arrive at $u_1, u_2, \dots, u_n$, as stated in the algorithm.

Corollary 1.2. Every finite dimensional inner product space over $\mathbb{K}$ has an orthonormal basis.

In the rest of these notes, we'll work out some consequences of the Gram-Schmidt algorithm.

2. Adjoint Transformations

First, we prove a classic theorem, which is simple in this case.

Theorem 2.1 (Riesz Representation Theorem). Let $V$ be an inner product space over $\mathbb{K}$. Let $f \colon V \to \mathbb{K}$ be a linear map. Then there is a unique vector $w \in V$ so that
\[
f(v) = \langle w, v \rangle, \qquad \forall v \in V.
\]
Briefly, $f = \langle w, \cdot \rangle$.

Remark 2.2. By tradition, a linear map $f \colon V \to \mathbb{K}$ is called a linear functional.

Proof of Theorem. Let $f$ be a linear functional on $V$.
Choose an orthonormal basis $u_1, u_2, \dots, u_n$, where $n = \dim(V)$. Define a vector $w \in V$ by
\[
w = \overline{f(u_1)}\, u_1 + \overline{f(u_2)}\, u_2 + \dots + \overline{f(u_n)}\, u_n.
\]
Then we have
\[
\langle w, u_k \rangle
= \Bigl\langle \sum_{j=1}^{n} \overline{f(u_j)}\, u_j,\; u_k \Bigr\rangle
= \sum_{j=1}^{n} f(u_j) \langle u_j, u_k \rangle
= \sum_{j=1}^{n} f(u_j) \delta_{jk}
= f(u_k).
\]
Since $k$ was arbitrary, we conclude $f(u_k) = \langle w, u_k \rangle$ for $k = 1, 2, \dots, n$. Thus, the linear maps $f$ and $\langle w, \cdot \rangle$ agree on a basis, so they must be the same.

To prove the vector $w$ is unique, suppose that $\langle w_1, \cdot \rangle = f = \langle w_2, \cdot \rangle$. Then, for all $v \in V$, we have
\[
\langle w_1, v \rangle = \langle w_2, v \rangle
\implies \langle w_1, v \rangle - \langle w_2, v \rangle = 0
\implies \langle w_1 - w_2, v \rangle = 0.
\]
Taking $v = w_1 - w_2$ gives $\|w_1 - w_2\|^2 = 0$. Thus, $w_1 - w_2 = 0$.

Theorem 2.3. Let $V$ and $W$ be inner product spaces over $\mathbb{K}$ and let $T \colon V \to W$ be a linear map. Then there is a unique linear map $S \colon W \to V$ so that
\[
\langle w, T(v) \rangle_W = \langle S(w), v \rangle_V.
\]
Here $\langle \cdot, \cdot \rangle_V$ is the inner product on $V$ and $\langle \cdot, \cdot \rangle_W$ is the inner product on $W$.

Remark 2.4. Usually we'll drop the subscripts on the inner products, which should be clear from context, unless it seems particularly useful to show the distinction.

Proof of Theorem. If we fix a vector $w \in W$, then the map $v \mapsto \langle w, T(v) \rangle$ is a linear functional on $V$. By the Riesz Representation Theorem, there is a unique vector $u \in V$ so that $\langle w, T(v) \rangle = \langle u, v \rangle$. Since $u$ is determined by $w$, there is a unique function $S \colon W \to V$ that sends $w$ to the corresponding vector $u$. Thus,
\[
\langle w, T(v) \rangle = \langle S(w), v \rangle
\]
for all $v$ and $w$. It remains to prove that this function $S$ is linear.

To do this, let $w_1$ and $w_2$ be vectors in $W$ and let $\alpha$ and $\beta$ be scalars. Consider $\langle \alpha w_1 + \beta w_2, T(v) \rangle$. On the one hand,
\[
\langle \alpha w_1 + \beta w_2, T(v) \rangle = \langle S(\alpha w_1 + \beta w_2), v \rangle.
\]
On the other hand,
\[
\langle \alpha w_1 + \beta w_2, T(v) \rangle
= \bar{\alpha} \langle w_1, T(v) \rangle + \bar{\beta} \langle w_2, T(v) \rangle
= \bar{\alpha} \langle S(w_1), v \rangle + \bar{\beta} \langle S(w_2), v \rangle
= \langle \alpha S(w_1) + \beta S(w_2), v \rangle.
\]
Thus,
\[
\langle S(\alpha w_1 + \beta w_2), v \rangle = \langle \alpha S(w_1) + \beta S(w_2), v \rangle.
\]
Since $v$ is arbitrary, we conclude that
\[
S(\alpha w_1 + \beta w_2) = \alpha S(w_1) + \beta S(w_2).
\]
We've now shown that $S$ is linear, and the proof is complete.

Definition 2.5. The unique linear transformation $S$ in Theorem 2.3 will be denoted $T^*$. We call $T^*$ the adjoint of $T$.

The reader is strongly advised to write out the details of the following Theorem.

Theorem 2.6. The operation $T \mapsto T^*$ has the following properties.
(1) $(T^*)^* = T$.
(2) $(\alpha S + \beta T)^* = \bar{\alpha} S^* + \bar{\beta} T^*$.
(3) $(ST)^* = T^* S^*$. Note that the order reverses.

Exercise 2.7. Prove the following.
(1) $T$ is injective if and only if $T^*$ is surjective.
(2) $T$ is surjective if and only if $T^*$ is injective.

Let's examine what happens in the case of the standard inner products on $\mathbb{K}^n$. The usual inner product is
\[
\langle x, y \rangle = \sum_{j=1}^{n} \bar{x}_j y_j,
\]
where
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
\]
are column vectors. If $A$ is an $m \times n$ matrix over $\mathbb{K}$, with entries $A = [a_{ij}]$, we define the $n \times m$ matrix $A^* = [b_{ij}]$ by $b_{ij} = \bar{a}_{ji}$. In other words, $A^* = (\bar{A})^t = \overline{(A^t)}$, where we take the conjugate of each entry in the matrix and then take the transpose. In the case $\mathbb{K} = \mathbb{R}$, $A^* = A^t$.

Remark 2.8. One convention sometimes used is to write $A^T$ for the transpose and $A^H$ for $A^*$ (H for Hermitian transpose).

Exercise 2.9. Show that the operation on matrices sending $A$ to $A^*$ has the properties
(1) $(A^*)^* = A$.
(2) $(\alpha A + \beta B)^* = \bar{\alpha} A^* + \bar{\beta} B^*$.
(3) $(AB)^* = B^* A^*$.

With this definition, we can write
\[
\langle x, y \rangle = \sum_{j=1}^{n} \bar{x}_j y_j
= \begin{bmatrix} \bar{x}_1 & \bar{x}_2 & \dots & \bar{x}_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= x^* y.
\]

Let $A$ be an $m \times n$ matrix, which we can think of as defining a linear transformation $\mathbb{K}^n \to \mathbb{K}^m$. We then have
\[
\langle A x, y \rangle = (A x)^* y = (x^* A^*) y = x^* (A^* y) = \langle x, A^* y \rangle.
\]
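This identity is easy to check numerically. A small sketch in Python (NumPy assumed), using the complex inner product $\langle x, y \rangle = \sum_j \bar{x}_j y_j$, which is conjugate-linear in its first slot:

```python
import numpy as np

rng = np.random.default_rng(0)

# a random complex matrix A : C^3 -> C^2, and test vectors x, y
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(2) + 1j * rng.standard_normal(2)

def inner(a, b):
    # <a, b> = sum_j conj(a_j) b_j
    return np.sum(np.conj(a) * b)

A_star = A.conj().T  # the adjoint matrix: conjugate transpose

lhs = inner(A @ x, y)       # <Ax, y>   in C^2
rhs = inner(x, A_star @ y)  # <x, A*y>  in C^3
```

Up to floating-point roundoff, `lhs` and `rhs` agree, matching the computation $\langle Ax, y \rangle = (Ax)^* y = x^* (A^* y) = \langle x, A^* y \rangle$ above.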
In other words, if $T \colon \mathbb{K}^n \to \mathbb{K}^m \colon x \mapsto Ax$ is the transformation given by multiplication by $A$, then $T^* \colon \mathbb{K}^m \to \mathbb{K}^n$, the adjoint of $T$, is given by multiplication by $A^*$ (so our notation should not cause any confusion).

We can carry this idea further to general vector spaces.

Theorem 2.10. Let $V$ be an inner product space over $\mathbb{K}$ and let $n = \dim(V)$. Let $\mathcal{V}$ be an orthonormal basis of $V$. Recall that the coordinate map $c_{\mathcal{V}} \colon V \to \mathbb{K}^n$ sends $v$ to its coordinate vector, denoted $[v]_{\mathcal{V}}$, with respect to $\mathcal{V}$. In other words, $[v]_{\mathcal{V}}$ is the unique column vector so that $v = \mathcal{V} [v]_{\mathcal{V}}$. Then we have
\[
\langle v, w \rangle_V = \langle [v]_{\mathcal{V}}, [w]_{\mathcal{V}} \rangle_{\mathbb{K}^n} = [v]_{\mathcal{V}}^* [w]_{\mathcal{V}}.
\]
Perhaps it will help to say that the following diagram commutes: applying the map $c_{\mathcal{V}} \times c_{\mathcal{V}} \colon (v, w) \mapsto ([v]_{\mathcal{V}}, [w]_{\mathcal{V}})$ from $V \times V$ to $\mathbb{K}^n \times \mathbb{K}^n$ and then the inner product $\langle \cdot, \cdot \rangle_{\mathbb{K}^n}$ gives the same element of $\mathbb{K}$ as applying $\langle \cdot, \cdot \rangle_V$ directly.

Proof of Theorem. Our orthonormal basis is $\mathcal{V} = \begin{bmatrix} v_1 & v_2 & \dots & v_n \end{bmatrix}$. If $[v]_{\mathcal{V}} = x$, then
\[
v = x_1 v_1 + x_2 v_2 + \dots + x_n v_n.
\]
Similarly, if $[w]_{\mathcal{V}} = y$,
\[
w = y_1 v_1 + y_2 v_2 + \dots + y_n v_n.
\]
Then,
\[
\langle v, w \rangle
= \Bigl\langle \sum_{i=1}^{n} x_i v_i,\; \sum_{j=1}^{n} y_j v_j \Bigr\rangle
= \sum_{i=1}^{n} \sum_{j=1}^{n} \bar{x}_i y_j \langle v_i, v_j \rangle
= \sum_{i=1}^{n} \sum_{j=1}^{n} \bar{x}_i y_j \delta_{ij}
= \sum_{i=1}^{n} \bar{x}_i y_i
= \langle x, y \rangle
= \langle [v]_{\mathcal{V}}, [w]_{\mathcal{V}} \rangle.
\]

Theorem 2.11. Let $V$ and $W$ be inner product spaces over $\mathbb{K}$. Let $n = \dim(V)$ and $m = \dim(W)$. Choose orthonormal bases $\mathcal{V}$ for $V$ and $\mathcal{W}$ for $W$. Let $T \colon V \to W$ be a linear transformation and let $A = [T]_{\mathcal{V}\mathcal{W}}$ be the matrix of $T$ with respect to our chosen bases. Then the matrix of $T^* \colon W \to V$ is $A^*$, i.e.,
\[
[T^*]_{\mathcal{W}\mathcal{V}} = [T]_{\mathcal{V}\mathcal{W}}^*.
\]
To put it yet another way, if $v \in V$ and $w \in W$, then
\[
(2.1) \qquad
\langle T(v), w \rangle_W
= \langle A [v]_{\mathcal{V}}, [w]_{\mathcal{W}} \rangle_{\mathbb{K}^m}
= \bigl( A [v]_{\mathcal{V}} \bigr)^* [w]_{\mathcal{W}}
= [v]_{\mathcal{V}}^* \bigl( A^* [w]_{\mathcal{W}} \bigr)
= \langle [v]_{\mathcal{V}}, A^* [w]_{\mathcal{W}} \rangle_{\mathbb{K}^n}
= \langle v, T^*(w) \rangle_V.
\]

Remark 2.12. Warning! Warning! The last theorem only works if you choose orthonormal bases.

Proof of Theorem. The manipulations in (2.1) are straightforward.
We just have to show that (2.1) implies that $[T^*]_{\mathcal{W}\mathcal{V}} = A^*$. Let's focus on
\[
\langle [v]_{\mathcal{V}}, A^* [w]_{\mathcal{W}} \rangle_{\mathbb{K}^n} = \langle v, T^*(w) \rangle_V.
\]
For notational convenience, let $C = [T^*]_{\mathcal{W}\mathcal{V}}$. By Theorem 2.10,
\[
\langle v, T^*(w) \rangle_V
= \langle [v]_{\mathcal{V}}, [T^*(w)]_{\mathcal{V}} \rangle_{\mathbb{K}^n}
= \langle [v]_{\mathcal{V}}, [T^*]_{\mathcal{W}\mathcal{V}} [w]_{\mathcal{W}} \rangle_{\mathbb{K}^n}
= \langle [v]_{\mathcal{V}}, C [w]_{\mathcal{W}} \rangle_{\mathbb{K}^n}.
\]
Thus,
\[
\langle [v]_{\mathcal{V}}, C [w]_{\mathcal{W}} \rangle_{\mathbb{K}^n} = \langle [v]_{\mathcal{V}}, A^* [w]_{\mathcal{W}} \rangle_{\mathbb{K}^n}
\]
for all vectors $v \in V$ and $w \in W$. But we can make $[v]_{\mathcal{V}}$ and $[w]_{\mathcal{W}}$ any vectors we want by an appropriate choice of $v$ and $w$. Thus, we must have $\langle x, A^* y \rangle = \langle x, C y \rangle$ for all vectors $x \in \mathbb{K}^n$ and $y \in \mathbb{K}^m$. If we fix $y$, we have $\langle x, A^* y - C y \rangle = 0$ for all $x$, which implies $A^* y - C y = 0$. Thus $A^* y = C y$ for all $y$, which we know from previous work implies $A^* = C$. Our proof is complete.

3. Orthogonal Decompositions

Let $V$ be an inner product space over $\mathbb{K}$. If $S \subseteq V$ is any set, we define
\[
S^{\perp} = \{\, v \in V \mid \forall x \in S,\ \langle v, x \rangle = 0 \,\},
\]
i.e., the set of vectors that are orthogonal to everything in $S$.

Theorem 3.1. Let $V$ be an inner product space over $\mathbb{K}$ and let $S \subseteq V$ be any set.
(1) $S^{\perp}$ is a subspace of $V$.
(2) For any set $S \subseteq V$, $S^{\perp} = \operatorname{span}(S)^{\perp}$.
(3) If $W = \operatorname{span}(s_1, s_2, \dots, s_k)$, then $v \in W^{\perp}$ if and only if $\langle s_j, v \rangle = 0$ for $j = 1, 2, \dots, k$.

Proof. For the first part, note first that $0 \in S^{\perp}$. To show that $S^{\perp}$ is closed under addition and scalar multiplication, let $v_1, v_2 \in S^{\perp}$ and let $c_1, c_2 \in \mathbb{K}$. For any $x \in S$, we have
\[
\langle x, c_1 v_1 + c_2 v_2 \rangle = c_1 \langle x, v_1 \rangle + c_2 \langle x, v_2 \rangle = c_1 0 + c_2 0 = 0,
\]
so $c_1 v_1 + c_2 v_2 \in S^{\perp}$.

Since we didn't say that $S$ is finite, we should add that $\operatorname{span}(S)$ is defined to be the set of all finite linear combinations of elements of $S$, i.e., all sums of the form
\[
c_1 x_1 + c_2 x_2 + \dots + c_k x_k,
\]
where $x_1, x_2, \dots, x_k \in S$ and the $c_j$'s are scalars. See Exercise 3.2. Clearly, $S \subseteq \operatorname{span}(S)$, since if $x \in S$, then $x = 1x \in \operatorname{span}(S)$.

Consider the second statement in the Theorem. We first show that $\operatorname{span}(S)^{\perp} \subseteq S^{\perp}$.
To do this, suppose that $v \in \operatorname{span}(S)^{\perp}$. This means that $\langle v, s \rangle = 0$ for all $s \in \operatorname{span}(S)$. But $S \subseteq \operatorname{span}(S)$, so $\langle v, x \rangle = 0$ for all $x \in S$. Thus $v \in S^{\perp}$. See Exercise 3.3.

Secondly, we show the other inclusion $S^{\perp} \subseteq \operatorname{span}(S)^{\perp}$. Suppose $v \in S^{\perp}$, which means $\langle v, x \rangle = 0$ for all $x \in S$. If $s \in \operatorname{span}(S)$, then
\[
s = c_1 x_1 + c_2 x_2 + \dots + c_k x_k
\]
for some $x_j \in S$ and scalars $c_j$. But then
\[
\langle v, s \rangle = \Bigl\langle v, \sum_{j=1}^{k} c_j x_j \Bigr\rangle
= \sum_{j=1}^{k} c_j \langle v, x_j \rangle
= \sum_{j=1}^{k} c_j 0 = 0.
\]
Thus, $v \in \operatorname{span}(S)^{\perp}$.

The proof of the third statement is very similar to the proof of the second statement and is left as (yet another) exercise.

Exercise 3.2. Suppose that $S \subseteq V$. Show that $\operatorname{span}(S)$, as defined in the proof, is a subspace. Show $\operatorname{span}(S)$ is the smallest subspace containing $S$. Show that if $S$ is finite, $\operatorname{span}(S)$ is the span of finitely many vectors as we have previously defined it.

The next exercise will (probably) be used later.

Exercise 3.3. Let $R \subseteq S \subseteq V$, where $V$ is an inner product space. Then $S^{\perp} \subseteq R^{\perp}$.

Theorem 3.4 (Orthogonal Decomposition Theorem). Let $V$ be an inner product space of dimension $n$ over $\mathbb{K}$. If $W$ is a subspace of $V$, then
\[
V = W \oplus W^{\perp}.
\]
It follows that $W^{\perp\perp} = W$.

Proof. Let the dimension of $W$ be $k$. Choose any basis $v_1, v_2, \dots, v_k$ of $W$. We can complete this linearly independent set to a basis
\[
v_1, v_2, \dots, v_k, v_{k+1}, \dots, v_n
\]
of $V$. Apply the Gram-Schmidt process to the basis $v_1, v_2, \dots, v_n$ to get an orthonormal basis $u_1, u_2, \dots, u_n$ of $V$. By the properties of the Gram-Schmidt process,
\[
\operatorname{span}(u_1, u_2, \dots, u_k) = \operatorname{span}(v_1, v_2, \dots, v_k) = W,
\]
so $u_1, u_2, \dots, u_k$ is an orthonormal basis of $W$.

Define $X = \operatorname{span}(u_{k+1}, u_{k+2}, \dots, u_n)$. We have, of course, $V = W \oplus X$ (if that's not obvious, check the definition of direct sum). We claim that $X = W^{\perp}$.

To see this, first suppose that $w \in X$. Then we have
\[
w = \sum_{j=1}^{n-k} c_{k+j} u_{k+j}
\]
for some scalars $c_{k+j}$. Consider $u_i$, where $i \in \{1, \dots, k\}$. We have
\[
\langle u_i, w \rangle
= \Bigl\langle u_i, \sum_{j=1}^{n-k} c_{k+j} u_{k+j} \Bigr\rangle
= \sum_{j=1}^{n-k} c_{k+j} \langle u_i, u_{k+j} \rangle
= \sum_{j=1}^{n-k} c_{k+j} \delta_{i,k+j}
= \sum_{j=1}^{n-k} c_{k+j}\, 0
= 0,
\]
because $i \neq k + j$. By the third statement in Theorem 3.1, we conclude that $w \in W^{\perp}$. Thus, $X \subseteq W^{\perp}$.

To do the reverse inclusion, suppose that $x \in W^{\perp}$. Since $x \in V$, we can write it in terms of our orthonormal basis; in fact, we know what the coefficients must be. We have
\[
x = \langle u_1, x \rangle u_1 + \langle u_2, x \rangle u_2 + \dots + \langle u_k, x \rangle u_k + \langle u_{k+1}, x \rangle u_{k+1} + \dots + \langle u_n, x \rangle u_n.
\]
But $x \in W^{\perp}$, so we must have $\langle u_j, x \rangle = 0$ for $j = 1, 2, \dots, k$ (since these $u_j$'s are in $W$). But then
\[
x = \langle u_{k+1}, x \rangle u_{k+1} + \dots + \langle u_n, x \rangle u_n \in X.
\]
We've now shown $W^{\perp} \subseteq X$, so $X = W^{\perp}$. We now have
\[
V = W \oplus W^{\perp}.
\]

To get the last statement of the theorem, we have to show that $W = W^{\perp\perp} = (W^{\perp})^{\perp}$. Let $v \in V$. Then we can write $v = w + p$ uniquely, where $w \in W$ and $p \in W^{\perp}$. Thus, $v \in W$ if and only if $p = 0$. We have
\[
v \in (W^{\perp})^{\perp}
\iff v \perp W^{\perp}
\iff \langle v, q \rangle = 0,\ \forall q \in W^{\perp}
\iff 0 = \langle w + p, q \rangle = \langle w, q \rangle + \langle p, q \rangle = \langle p, q \rangle,\ \forall q \in W^{\perp}
\iff p = 0, \text{ since } p \in W^{\perp}
\iff v = w
\iff v \in W.
\]
Thus, $(W^{\perp})^{\perp} = W$.

Exercise 3.5. If $S$ is just a subset of $V$, show that $(S^{\perp})^{\perp} = \operatorname{span}(S)$.

This theorem has many nice consequences.

Exercise 3.6. Let $V$ and $W$ be inner product spaces over $\mathbb{K}$ and let $T \colon V \to W$ be a linear transformation, so $T^* \colon W \to V$. Show that
\[
W = \operatorname{im}(T) \oplus \ker(T^*), \qquad
V = \operatorname{im}(T^*) \oplus \ker(T),
\]
where these are orthogonal direct sums, e.g., $\operatorname{im}(T)^{\perp} = \ker(T^*)$.

Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX 79409-1042
E-mail address: lance.drager@ttu.edu