THE GRAM-SCHMIDT PROCESS
MATH 5316, FALL 2012
LANCE D. DRAGER
1. The Gram-Schmidt Process
Algorithm 1.1. Let 𝑉 be an inner product space over K. Let 𝑣1 , 𝑣2 , . . . , 𝑣𝑑
be a basis of 𝑉 . The Gram-Schmidt Process below constructs an orthonormal basis 𝑒1 , 𝑒2 , . . . , 𝑒𝑑 such that
span(𝑒1 , 𝑒2 , . . . , π‘’π‘˜ ) = span(𝑣1 , 𝑣2 , . . . , π‘£π‘˜ ),
π‘˜ = 1, 2, . . . , 𝑑.
Introduce the notation
π‘ˆπ‘˜ = span(𝑒1 , 𝑒2 , . . . , π‘’π‘˜ )
π‘‰π‘˜ = span(𝑣1 , 𝑣2 , . . . , π‘£π‘˜ ).
Recall that if π‘Š ⊆ 𝑉 is a subspace with orthonormal basis 𝑒1 , 𝑒2 , . . . , π‘’π‘˜ ,
and 𝑣 ∈ 𝑉 , we can write 𝑣 uniquely as 𝑣 = 𝑀 + 𝑝, where 𝑀 ∈ π‘Š and 𝑝
is orthogonal to π‘Š . Specifically
𝑀 = projπ‘Š (𝑣) =
π‘˜
∑︁
βŸ¨π‘’π‘– , π‘£βŸ©π‘’π‘–
𝑖=1
𝑝=
proj⊥
π‘Š (𝑣)
= 𝑣 − projπ‘Š (𝑣).
To start the inductive construction, let
𝑒′1 = 𝑣1 ,
𝑒1 = 𝑒′1 /‖𝑒′1 β€–.
It should be clear that π‘ˆ1 = 𝑉1 and we record the fact that
𝑣1 = ‖𝑒′1 ‖𝑒1 .
For the next step, we define
𝑒′2 = proj⊥𝑉1 (𝑣2 ) = 𝑣2 − projπ‘ˆ1 (𝑣2 ) = 𝑣2 − βŸ¨π‘’1 , 𝑣2 βŸ©π‘’1 ,
𝑒2 = 𝑒′2 /‖𝑒′2 β€–.
Clearly 𝑒′2 , and hence 𝑒2 , are orthogonal to 𝑒1 . If 𝑒′2 = 0, then 𝑣2 ∈
π‘ˆ1 = 𝑉1 , which implies that 𝑣1 and 𝑣2 are dependent contrary to our
assumption. Thus, 𝑒′2 ≠ 0 and the definition of 𝑒2 is legitimate.
Since 𝑒1 ∈ π‘ˆ1 = 𝑉1 , we can say that
𝑒′2 ∈ span(𝑣1 , 𝑣2 ) = 𝑉2
and 𝑒1 ∈ 𝑉1 ⊆ 𝑉2 , so
span(𝑒1 , 𝑒′2 ) ⊆ 𝑉2 .
On the other hand, we have
𝑣2 = 𝑒′2 + βŸ¨π‘’1 , 𝑣2 βŸ©π‘’1 ∈ span(𝑒1 , 𝑒′2 )
and 𝑣1 ∈ 𝑉1 = π‘ˆ1 = span(𝑒1 ). Thus,
𝑉2 = span(𝑣1 , 𝑣2 ) ⊆ span(𝑒1 , 𝑒′2 ),
and we conclude that
span(𝑒1 , 𝑒′2 ) = span(𝑣1 , 𝑣2 ) = 𝑉2 .
Since 𝑒2 is just a scalar multiple of 𝑒′2 , span(𝑒′2 , 𝑒1 ) = span(𝑒1 , 𝑒2 ) =
π‘ˆ2 . Thus,
π‘ˆ2 = 𝑉2 .
We record the equation
𝑣2 = ‖𝑒′2 ‖𝑒2 + βŸ¨π‘’1 , 𝑣2 βŸ©π‘’1 .
For the next step, we define
𝑒′3 = proj⊥𝑉2 (𝑣3 ) = 𝑣3 − proj𝑉2 (𝑣3 ) = 𝑣3 − βŸ¨π‘’1 , 𝑣3 βŸ©π‘’1 − βŸ¨π‘’2 , 𝑣3 βŸ©π‘’2 ,
𝑒3 = 𝑒′3 /‖𝑒′3 β€–,
and we can prove π‘ˆ3 = 𝑉3 . The reader should check this as an exercise,
but we’ll do the general case in a moment.
For the inductive step, suppose that we have constructed an orthonormal list 𝑒1 , 𝑒2 , . . . , 𝑒ℓ so that π‘‰π‘˜ = π‘ˆπ‘˜ for π‘˜ = 1, 2, . . . , β„“. We
define
𝑒′β„“+1 = proj⊥
𝑉ℓ (𝑣ℓ+1 )
= 𝑣ℓ+1 − proj𝑉ℓ (𝑣ℓ+1 )
= 𝑣ℓ+1 −
𝑒ℓ+1 =
β„“
∑︁
βŸ¨π‘’π‘— , 𝑣ℓ+1 βŸ©π‘’π‘— ,
𝑗=1
′
𝑒ℓ+1 /‖𝑒′β„“+1 β€–.
If we had 𝑒′β„“+1 = 0, we would have 𝑣ℓ+1 ∈ π‘ˆβ„“ = 𝑉ℓ , which would
contradict the independence of 𝑣1 , 𝑣2 , . . . , 𝑣𝑑 . Thus, our definition of
𝑒ℓ+1 is legitimate.
Since π‘ˆβ„“ = 𝑉ℓ , we have 𝑒′β„“+1 ∈ 𝑉ℓ+1 , and we already know that
𝑒1 , . . . , 𝑒ℓ are in 𝑉ℓ ⊆ 𝑉ℓ+1 . Thus,
span(𝑒1 , 𝑒2 , . . . , 𝑒ℓ , 𝑒′ℓ+1 ) ⊆ 𝑉ℓ+1 .
On the other hand,
𝑣ℓ+1 = 𝑒′ℓ+1 + βˆ‘_{𝑗=1}^{β„“} βŸ¨π‘’π‘— , 𝑣ℓ+1 βŸ©π‘’π‘— ,
so 𝑣ℓ+1 ∈ span(𝑒1 , 𝑒2 , . . . , 𝑒ℓ , 𝑒′ℓ+1 ). We already know 𝑣1 , . . . , 𝑣ℓ ∈ span(𝑒1 , . . . , 𝑒ℓ ) =
𝑉ℓ = π‘ˆβ„“ . Thus,
span(𝑣1 , 𝑣2 , . . . , 𝑣ℓ+1 ) ⊆ span(𝑒1 , 𝑒2 , . . . , 𝑒ℓ , 𝑒′ℓ+1 ).
Thus, span(𝑒1 , 𝑒2 , . . . , 𝑒ℓ , 𝑒′ℓ+1 ) = 𝑉ℓ+1 . Since 𝑒ℓ+1 is just a scalar multiple of 𝑒′ℓ+1 , we conclude that π‘ˆβ„“+1 = 𝑉ℓ+1 .
We record the fact that
𝑣ℓ+1 = ‖𝑒′ℓ+1 ‖𝑒ℓ+1 + βˆ‘_{𝑗=1}^{β„“} βŸ¨π‘’π‘— , 𝑣ℓ+1 βŸ©π‘’π‘— .
Continuing this inductive construction, we arrive at 𝑒1 , 𝑒2 , . . . , 𝑒𝑑 , as
stated in the algorithm.
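The construction above translates directly into code. Here is a minimal NumPy sketch (the function name gram_schmidt and the dependence tolerance are our choices, not part of the notes):

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the (linearly independent) columns of V.

    Follows the construction above: subtract from v_k its projection
    onto span(e_1, ..., e_{k-1}), then normalize.
    """
    E = []
    for v in V.T:                                   # columns of V
        u = v - sum(np.vdot(e, v) * e for e in E)   # the proj-perp step
        norm = np.linalg.norm(u)
        if norm < 1e-12:                            # tolerance is our choice
            raise ValueError("columns are (numerically) dependent")
        E.append(u / norm)
    return np.column_stack(E)

V = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
E = gram_schmidt(V)
assert np.allclose(E.conj().T @ E, np.eye(2))       # E* E = I: orthonormal columns
```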
Corollary 1.2. Every finite dimensional inner product space over K
has an orthonormal basis.
In the rest of these notes, we’ll work out some consequences of the
Gram-Schmidt algorithm.
2. Adjoint Transformations
First, we prove a classic theorem, which is simple in this case.
Theorem 2.1 (Riesz Representation Theorem). Let 𝑉 be an inner product space over K. Let πœ™ : 𝑉 → K be a linear map. Then there is a unique
vector 𝑀 ∈ 𝑉 so that
πœ™(𝑣) = βŸ¨π‘€, π‘£βŸ©,
∀𝑣 ∈ 𝑉.
Briefly, πœ™ = βŸ¨π‘€, ·βŸ©.
Remark 2.2. By tradition, a linear map πœ™ : 𝑉 → K is called a linear
functional.
Proof of Theorem. Let πœ™ be a linear functional on 𝑉 . Choose an orthonormal basis 𝑒1 , 𝑒2 , . . . , 𝑒𝑑 , where 𝑑 = dim(𝑉 ).
Define a vector 𝑀 ∈ 𝑉 by
𝑀 = πœ™(𝑒1 )𝑒1 + πœ™(𝑒2 )𝑒2 + · · · + πœ™(𝑒𝑑 )𝑒𝑑 .
Then we have
βŸ¨π‘€, 𝑒𝑖 ⟩ =
⟨∑︁
𝑑
⟩
πœ™(𝑒𝑗 ) 𝑒𝑗 , 𝑒𝑖
𝑗=1
=
𝑑
∑︁
πœ™(𝑒𝑗 )βŸ¨π‘’π‘— , 𝑒𝑖 ⟩
𝑗=1
=
𝑑
∑︁
πœ™(𝑒𝑗 )𝛿𝑗𝑖
𝑗=1
= πœ™(𝑒𝑖 ).
Since 𝑖 was arbitrary, we conclude πœ™(𝑒𝑖 ) = βŸ¨π‘€, 𝑒𝑖 ⟩ for 𝑖 = 1, 2, . . . , 𝑑.
Thus, the linear maps πœ™ and βŸ¨π‘€, ·βŸ© agree on a basis, so they must be
the same.
To prove the vector 𝑀 is unique, suppose that βŸ¨π‘€1 , ·βŸ© = πœ™ = βŸ¨π‘€2 , ·βŸ©.
Then, for all 𝑣 ∈ 𝑉 , we have
βŸ¨π‘€1 , π‘£βŸ© = βŸ¨π‘€2 , π‘£βŸ© =⇒ βŸ¨π‘€1 , π‘£βŸ© − βŸ¨π‘€2 , π‘£βŸ© = 0 =⇒ βŸ¨π‘€1 − 𝑀2 , π‘£βŸ© = 0.
Taking 𝑣 = 𝑀1 − 𝑀2 gives ‖𝑀1 − 𝑀2 β€–Β² = 0, so 𝑀1 − 𝑀2 = 0.
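To illustrate the construction in the proof, here is a small numerical sketch (assuming NumPy; the functional phi is a made-up example):

```python
import numpy as np

# The standard inner product <w, v> = w* v on C^3, with the standard
# basis as the orthonormal basis.  phi is an arbitrary linear functional.
E = np.eye(3)
phi = lambda v: 2 * v[0] + (1 - 1j) * v[2]

# w = conj(phi(e_1)) e_1 + ... + conj(phi(e_d)) e_d, as in the proof
w = sum(np.conj(phi(e)) * e for e in E.T)

v = np.array([1.0 + 2.0j, -1.0, 3.0j])
assert np.isclose(phi(v), np.vdot(w, v))   # phi(v) = <w, v>
```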
Theorem 2.3. Let 𝑉 and π‘Š be inner product spaces over K and
let 𝑇 : 𝑉 → π‘Š be a linear map. Then there is a unique linear map
𝑆 : π‘Š → 𝑉 so that
βŸ¨π‘€, 𝑇 (𝑣)βŸ©π‘Š = βŸ¨π‘†(𝑀), π‘£βŸ©π‘‰
Here ⟨·, ·βŸ©π‘‰ is the inner product on 𝑉 and ⟨·, ·βŸ©π‘Š is the inner product
on π‘Š .
Remark 2.4. Usually we’ll drop the subscripts on the inner products,
which should be clear from context, unless it seems particularly useful
to show the distinction.
Proof of Theorem. If we fix a vector 𝑀 ∈ π‘Š , then the map
𝑣 ↦→ βŸ¨π‘€, 𝑇 (𝑣)⟩
is a linear functional on 𝑉 . By the Riesz Representation Theorem, there
is a unique vector 𝑒 ∈ 𝑉 so that
βŸ¨π‘€, 𝑇 (𝑣)⟩ = βŸ¨π‘’, π‘£βŸ©.
Since 𝑒 is determined by 𝑀, there is a unique function 𝑆 : π‘Š → 𝑉 that
sends 𝑀 to the corresponding vector 𝑒. Thus,
βŸ¨π‘€, 𝑇 (𝑣)⟩ = βŸ¨π‘†(𝑀), π‘£βŸ©
for all 𝑣 and 𝑀.
It remains to prove that this function 𝑆 is linear. To do this, let 𝑀1
and 𝑀2 be vectors in π‘Š and let 𝛼 and 𝛽 be scalars. Consider
βŸ¨π›Όπ‘€1 + 𝛽𝑀2 , 𝑇 (𝑣)⟩.
On the one hand,
βŸ¨π›Όπ‘€1 + 𝛽𝑀2 , 𝑇 (𝑣)⟩ = βŸ¨π‘†(𝛼𝑀1 + 𝛽𝑀2 ), π‘£βŸ©.
On the other hand,
βŸ¨π›Όπ‘€1 + 𝛽𝑀2 , 𝑇 (𝑣)⟩ = 𝛼̄ βŸ¨π‘€1 , 𝑇 (𝑣)⟩ + 𝛽̄ βŸ¨π‘€2 , 𝑇 (𝑣)⟩
= 𝛼̄ βŸ¨π‘†(𝑀1 ), π‘£βŸ© + 𝛽̄ βŸ¨π‘†(𝑀2 ), π‘£βŸ©
= βŸ¨π›Όπ‘†(𝑀1 ) + 𝛽𝑆(𝑀2 ), π‘£βŸ©.
Thus,
βŸ¨π‘†(𝛼𝑀1 + 𝛽𝑀2 ), π‘£βŸ© = βŸ¨π›Όπ‘†(𝑀1 ) + 𝛽𝑆(𝑀2 ), π‘£βŸ©,
Since 𝑣 is arbitrary, we conclude that
𝑆(𝛼𝑀1 + 𝛽𝑀2 ) = 𝛼𝑆(𝑀1 ) + 𝛽𝑆(𝑀2 ).
We’ve now shown that 𝑆 is linear and the proof is complete.
Definition 2.5. The unique linear transformation 𝑆 in Theorem 2.3
will be denoted 𝑇 * . We call 𝑇 * the adjoint of 𝑇 .
The reader is strongly encouraged to write out the details of the proof of the following theorem.
Theorem 2.6. The operation 𝑇 ↦→ 𝑇 * has the following properties.
(1) (𝑇 *)* = 𝑇 .
(2) (𝛼𝑇 + 𝛽𝑆)* = 𝛼̄ 𝑇 * + 𝛽̄ 𝑆 *.
(3) (𝑆𝑇 )* = 𝑇 * 𝑆 *. Note that the order reverses.
Exercise 2.7. Prove the following.
(1) 𝑇 is injective if and only if 𝑇 * is surjective.
(2) 𝑇 is surjective if and only if 𝑇 * is injective.
Let’s examine what happens in the case of the standard inner product on K𝑑 . The usual inner product is
⟨π‘₯, π‘¦βŸ© = βˆ‘_{𝑖=1}^{𝑑} π‘₯̄𝑖 𝑦𝑖 ,
where π‘₯ and 𝑦 are the column vectors in K𝑑 with entries π‘₯1 , . . . , π‘₯𝑑 and 𝑦1 , . . . , 𝑦𝑑 .
If 𝐴 is an π‘š × π‘› matrix over K, with entries 𝐴 = [π‘Žπ‘–π‘— ], we define the
matrix 𝐴* = [𝑏𝑖𝑗 ] by
𝑏𝑖𝑗 = π‘Žπ‘—π‘– .
¯ = (𝐴𝑑 )¯, where we take the conjugate of each
In other words, 𝐴 = (𝐴)
entry in the matrix and then take the transpose. In the case K = R,
𝐴* = 𝐴𝑑 .
*
𝑑
Remark 2.8. One convention sometimes used is to write 𝐴𝑇 for the transpose and 𝐴𝐻 for 𝐴* (H for Hermitian transpose).
Exercise 2.9. Show that the operation on matrices sending 𝐴 to 𝐴*
has the properties
(1) (𝐴*)* = 𝐴.
(2) (𝛼𝐴 + 𝛽𝐡)* = 𝛼̄ 𝐴* + 𝛽̄ 𝐡 *.
(3) (𝐴𝐡)* = 𝐡 * 𝐴*.
With this definition, we can write
⟨π‘₯, π‘¦βŸ© =
𝑑
∑︁
π‘₯¯π‘– 𝑦𝑖
𝑖=1
[οΈ€
= π‘₯¯1 π‘₯¯2
⎑ ⎀
𝑦1
]οΈ€ βŽ’π‘¦2 βŽ₯
βŽ₯
. . . π‘₯¯π‘‘ ⎒
⎣ ... ⎦
𝑦𝑑
*
= π‘₯ 𝑦.
Let 𝐴 be an π‘š × π‘› matrix, which we can think of as defining a linear
transformation K𝑛 → Kπ‘š . We then have
⟨𝐴π‘₯, π‘¦βŸ© = (𝐴π‘₯)* 𝑦
= (π‘₯* 𝐴* )𝑦
= π‘₯* (𝐴* 𝑦)
= ⟨π‘₯, 𝐴* π‘¦βŸ©.
In other words, if 𝑇 : K𝑛 → Kπ‘š : π‘₯ ↦→ 𝐴π‘₯ is the transformation given by
multiplication by 𝐴, then 𝑇 * : Kπ‘š → K𝑛 , the adjoint of 𝑇 , is given by
multiplication by 𝐴* (so our notation should not cause any confusion).
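Here is a quick numerical sanity check of the identity ⟨𝐴π‘₯, π‘¦βŸ© = ⟨π‘₯, 𝐴* π‘¦βŸ© (assuming NumPy; the sizes and the random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

A_star = A.conj().T                 # conjugate transpose: A* in the notes
lhs = np.vdot(A @ x, y)             # <Ax, y> in K^m
rhs = np.vdot(x, A_star @ y)        # <x, A*y> in K^n
assert np.isclose(lhs, rhs)
```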
We can carry this idea further to general vector spaces.
Theorem 2.10. Let 𝑉 be an inner product space over K and let 𝑛 = dim(𝑉 ). Let 𝒱 be an orthonormal basis of 𝑉 . Recall that the coordinate map πœ‰π’± : 𝑉 → K𝑛 sends 𝑣 to its coordinate vector, denoted [𝑣]𝒱 , with respect to 𝒱. In other words, [𝑣]𝒱 is the unique column vector so that
𝑣 = 𝒱 [𝑣]𝒱 .
Then we have
βŸ¨π‘£, π‘€βŸ©π‘‰ = ⟨[𝑣]𝒱 , [𝑀]𝒱 ⟩K𝑛 = [𝑣]*𝒱 [𝑀]𝒱 .
Perhaps it will help to say that the following diagram commutes.
⟨·,·βŸ©π‘‰
𝑉 ×𝑉
/
K
πœ‰π’± ×πœ‰π’±
/
K𝑛 × K𝑛
⟨·,·βŸ©K𝑛
K
where
πœ‰π’± × πœ‰π’± : (𝑣, 𝑀) ↦→ ([𝑣]𝒱 , [𝑀]𝒱 )
Proof of Theorem. Our orthonormal basis is 𝒱 = [𝑣1 𝑣2 . . . 𝑣𝑑 ].
If [𝑣]𝒱 = π‘₯, then 𝑣 = π‘₯1 𝑣1 + π‘₯2 𝑣2 + · · · + π‘₯𝑑 𝑣𝑑 . Similarly, if [𝑀]𝒱 = 𝑦,
𝑀 = 𝑦1 𝑣1 + 𝑦2 𝑣2 + · · · + 𝑦𝑑 𝑣𝑑 . Then,
βŸ¨π‘£, π‘€βŸ© =
⟨∑︁
𝑑
π‘₯𝑖 𝑣𝑖 ,
𝑖=1
=
𝑑 ∑︁
𝑑
∑︁
𝑑
∑︁
⟩
𝑦𝑗 𝑣𝑗
𝑗=1
π‘₯¯π‘– 𝑦𝑗 βŸ¨π‘£π‘– , 𝑣𝑗 ⟩
𝑖=1 𝑗=1
=
𝑑 ∑︁
𝑑
∑︁
π‘₯¯π‘– 𝑦𝑗 𝛿𝑖𝑗
𝑖=1 𝑗=1
=
𝑑
∑︁
π‘₯¯π‘– 𝑦𝑖
𝑖=1
= ⟨π‘₯, π‘¦βŸ©
= ⟨[𝑣]𝒱 , [𝑀]𝒱 ⟩.
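A small numerical illustration of Theorem 2.10 (assuming NumPy; here the orthonormal basis is packaged as the columns of a unitary matrix Q, so that the coordinate vector is Q* v):

```python
import numpy as np

rng = np.random.default_rng(1)
# Columns of Q: an orthonormal basis of C^3 (Q is unitary, Q* Q = I).
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

cv = Q.conj().T @ v    # coordinate vector of v in this basis
cw = Q.conj().T @ w
assert np.isclose(np.vdot(v, w), np.vdot(cv, cw))   # <v, w> = <[v], [w]>
```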
Theorem 2.11. Let 𝑉 and π‘Š be inner product spaces over K. Let
𝑛 = dim(𝑉 ) and π‘š = dim(π‘Š ). Choose orthonormal bases 𝒱 for 𝑉
and 𝒲 for π‘Š .
Let 𝑇 : 𝑉 → π‘Š be a linear transformation and let 𝐴 = [𝑇 ]𝒱𝒲 be
the matrix of 𝑇 with respect to our chosen bases. Then then matrix of
𝑇 * : π‘Š → 𝑉 is 𝐴* , i.e.,
[𝑇 * ]𝒲𝒱 = [𝑇 ]*𝒱𝒲 .
To put it yet another way, if 𝑣 ∈ 𝑉 and 𝑀 ∈ π‘Š , then
(2.1) βŸ¨π‘‡ (𝑣), π‘€βŸ©π‘Š = ⟨𝐴 [𝑣]𝒱 , [𝑀]𝒲 ⟩Kπ‘š
= (𝐴 [𝑣]𝒱 )* [𝑀]𝒲
= [𝑣]*𝒱 (𝐴* [𝑀]𝒲 )
= ⟨[𝑣]𝒱 , 𝐴* [𝑀]𝒲 ⟩K𝑛
= βŸ¨π‘£, 𝑇 *(𝑀)βŸ©π‘‰ .
Remark 2.12. Warning! Warning! The last theorem only works if you choose orthonormal bases.
Proof of Theorem. The manipulations in (2.1) are straightforward. We
just have to show that (2.1) implies that [𝑇 * ]𝒲𝒱 = 𝐴* .
Let’s focus on
⟨[𝑣]𝒱 , 𝐴* [𝑀]𝒲 ⟩K𝑛 = βŸ¨π‘£, 𝑇 * (𝑀)βŸ©π‘‰ .
For notational convenience, let 𝐢 = [𝑇 *]𝒲𝒱 . By Theorem 2.10,
βŸ¨π‘£, 𝑇 *(𝑀)βŸ©π‘‰ = ⟨[𝑣]𝒱 , [𝑇 *(𝑀)]𝒱 ⟩K𝑛
= ⟨[𝑣]𝒱 , [𝑇 *]𝒲𝒱 [𝑀]𝒲 ⟩K𝑛
= ⟨[𝑣]𝒱 , 𝐢 [𝑀]𝒲 ⟩K𝑛 .
Thus,
⟨[𝑣]𝒱 , 𝐢 [𝑀]𝒲 ⟩K𝑛 = ⟨[𝑣]𝒱 , 𝐴* [𝑀]𝒲 ⟩K𝑛 ,
for all vectors 𝑣 ∈ 𝑉 and 𝑀 ∈ π‘Š . But we can make [𝑣]𝒱 and [𝑀]𝒲 any
vectors we want by an appropriate choice of 𝑣 and 𝑀. Thus, we must
have
⟨π‘₯, 𝐴* π‘¦βŸ© = ⟨π‘₯, πΆπ‘¦βŸ©
for all vectors π‘₯ ∈ K𝑛 and 𝑦 ∈ Kπ‘š . If we fix 𝑦, we have
⟨π‘₯, 𝐴* 𝑦 − πΆπ‘¦βŸ© = 0
for all π‘₯, which implies 𝐴* 𝑦 − 𝐢𝑦 = 0. Thus 𝐴* 𝑦 = 𝐢𝑦 for all 𝑦,
which we know from previous work implies 𝐴* = 𝐢. Our proof is
complete.
3. Orthogonal Decompositions
Let 𝑉 be an inner product space over K. If 𝑋 ⊆ 𝑉 is any set, we
define
𝑋 ⊥ = {𝑣 ∈ 𝑉 | ∀π‘₯ ∈ 𝑋, βŸ¨π‘£, π‘₯⟩ = 0},
i.e., the set of vectors that are orthogonal to everything in 𝑋.
Theorem 3.1. Let 𝑉 be an inner product space over K and let 𝑋 ⊆ 𝑉
be any set.
(1) 𝑋 ⊥ is a subspace of 𝑉 .
(2) For any set 𝑋 ⊆ 𝑉 , 𝑋 ⊥ = span(𝑋)⊥ .
(3) If 𝑆 = span(𝑠1 , 𝑠2 , . . . , π‘ π‘˜ ), then 𝑣 ∈ 𝑆 ⊥ if and only if βŸ¨π‘ π‘— , π‘£βŸ© = 0 for 𝑗 = 1, 2, . . . , π‘˜.
Proof. For the first part, note first that 0 ∈ 𝑋 ⊥ . To show that 𝑋 ⊥ is
closed under addition and scalar multiplication, let 𝑣1 , 𝑣2 ∈ 𝑋 ⊥ and let
πœ†1 , πœ†2 ∈ K. For any π‘₯ ∈ 𝑋, we have
⟨π‘₯, πœ†1 𝑣1 + πœ†2 𝑣2 ⟩ = πœ†1 ⟨π‘₯, 𝑣1 ⟩ + πœ†2 ⟨π‘₯, 𝑣2 ⟩ = πœ†1 0 + πœ†2 0 = 0,
so πœ†1 𝑣1 + πœ†2 𝑣2 ∈ 𝑋 ⊥ .
Since we didn’t say that 𝑋 is finite, we should add that span(𝑋) is
defined to be the set of all finite linear combinations of elements of 𝑋,
i.e., all sums of the form
πœ† 1 π‘₯1 + πœ† 2 π‘₯2 + · · · + πœ† π‘˜ π‘₯π‘˜
where π‘₯1 , π‘₯2 , . . . , π‘₯π‘˜ ∈ 𝑋 and the πœ†π‘– ’s are scalars. See Exercise 3.2.
Clearly, 𝑋 ⊆ span(𝑋), since if π‘₯ ∈ 𝑋, then π‘₯ = 1π‘₯ ∈ span(𝑋).
Consider the second statement in the Theorem. We first show that
span(𝑋)⊥ ⊆ 𝑋 ⊥ . To do this, suppose that 𝑣 ∈ span(𝑋)⊥ . This means
that βŸ¨π‘£, π‘ βŸ© = 0 for all 𝑠 ∈ span(𝑋). But, 𝑋 ⊆ span(𝑋), so βŸ¨π‘£, π‘₯⟩ = 0
for all π‘₯ ∈ 𝑋. Thus 𝑣 ∈ 𝑋 ⊥ . See Exercise 3.3
Secondly, we show the other inclusion 𝑋 ⊥ ⊆ span(𝑋)⊥ . Suppose
𝑣 ∈ 𝑋 ⊥ , which means βŸ¨π‘£, π‘₯⟩ = 0 for all π‘₯ ∈ 𝑋. If 𝑠 ∈ span(𝑋), then
𝑠 = πœ†1 π‘₯1 + πœ†2 π‘₯2 + · · · + πœ†π‘˜ π‘₯π‘˜ for some π‘₯𝑖 ∈ 𝑋 and scalars πœ†π‘– . But then
⟨ ∑︁
⟩ ∑︁
π‘˜
π‘˜
π‘˜
∑︁
βŸ¨π‘£, π‘ βŸ© = 𝑣,
πœ†π‘– π‘₯𝑖 =
πœ†π‘– βŸ¨π‘£, π‘₯𝑖 ⟩ =
πœ†π‘– 0 = 0.
𝑖=1
𝑖=1
𝑖=1
Thus, 𝑣 ∈ span(𝑋)⊥ .
The proof of the third statement is very similar to the proof of the
second statement and is left as (yet another) exercise.
Exercise 3.2. Suppose that 𝑋 ⊆ 𝑉 . Show that span(𝑋), as defined in the proof, is a subspace.
Show that span(𝑋) is the smallest subspace containing 𝑋.
Show that if 𝑋 is finite, span(𝑋) is the span of finitely many vectors as we have previously defined it.
The next exercise will (probably) be used later.
Exercise 3.3. Let 𝑋 ⊆ π‘Œ ⊆ 𝑉 , where 𝑉 is an inner product space.
Then π‘Œ ⊥ ⊆ 𝑋 ⊥ .
Theorem 3.4 (Orthogonal Decomposition Theorem). Let 𝑉 be an
inner product space of dimension 𝑑 over K. If 𝑆 is a subspace of 𝑉 ,
then
𝑉 = 𝑆 ⊕ 𝑆 ⊥.
It follows that 𝑆 ⊥⊥ = 𝑆.
Proof. Let the dimension of 𝑆 be 𝑛. Choose any basis 𝑣1 , 𝑣2 , . . . , 𝑣𝑛 of 𝑆.
We can complete this linearly independent set to a basis 𝑣1 , 𝑣2 , . . . , 𝑣𝑛 , 𝑣𝑛+1 , . . . , 𝑣𝑑 of 𝑉 .
Apply the Gram-Schmidt process to the basis 𝑣1 , 𝑣2 , . . . , 𝑣𝑑 to get an orthonormal basis 𝑒1 , 𝑒2 , . . . , 𝑒𝑑 of 𝑉 . By the properties of the Gram-Schmidt process,
span(𝑒1 , 𝑒2 , . . . , 𝑒𝑛 ) = span(𝑣1 , 𝑣2 , . . . , 𝑣𝑛 ) = 𝑆,
so 𝑒1 , 𝑒2 , . . . , 𝑒𝑛 is an orthonormal basis of 𝑆.
Define
π‘Š = span(𝑒𝑛+1 , 𝑒𝑛+2 , . . . , 𝑒𝑑 ).
We have, of course,
𝑉 = 𝑆 ⊕ π‘Š,
(if that’s not obvious, check the definition of direct sum).
We claim that π‘Š = 𝑆 ⊥ . To see this, first suppose that 𝑀 ∈ π‘Š .
Then we have
𝑀 = βˆ‘_{𝑖=1}^{𝑑−𝑛} πœ†π‘›+𝑖 𝑒𝑛+𝑖
for some scalars πœ†π‘›+𝑖 . Consider 𝑒𝑗 , where 𝑗 ∈ {1, . . . , 𝑛}. We have
βŸ¨π‘’π‘— , π‘€βŸ© = βŸ¨π‘’π‘— ,
𝑑−𝑛
∑︁
πœ†π‘›+𝑖 𝑒𝑛+𝑖 ⟩
𝑖=1
=
𝑑−𝑛
∑︁
πœ†π‘›+𝑖 βŸ¨π‘’π‘— , 𝑒𝑛+𝑖 ⟩
𝑖=1
=
𝑑−𝑛
∑︁
πœ†π‘›+𝑖 𝛿𝑗,𝑛+𝑖
𝑖=1
=
𝑑−𝑛
∑︁
because 𝑗 ΜΈ= 𝑛 + 𝑖
πœ†π‘›+1 0,
𝑖=1
= 0.
By the third statement in Theorem 3.1, we conclude that 𝑀 ∈ 𝑆 ⊥ .
Thus, π‘Š ⊆ 𝑆 ⊥ .
To do the reverse inclusion, suppose that π‘₯ ∈ 𝑆 ⊥ . Since π‘₯ ∈ 𝑉 , we
can write it in terms of our orthonormal basis; in fact, we know what
the coefficients must be. We have
π‘₯ = βŸ¨π‘’1 , π‘₯βŸ©π‘’1 + βŸ¨π‘’2 , π‘₯βŸ©π‘’2 + · · · + βŸ¨π‘’π‘› , π‘₯βŸ©π‘’π‘› + βŸ¨π‘’π‘›+1 , π‘₯⟩ + · · · + βŸ¨π‘’π‘‘ , π‘₯βŸ©π‘’π‘‘ .
But, π‘₯ ∈ 𝑆 ⊥ , so we must have βŸ¨π‘’π‘— , π‘₯⟩ = 0, for 𝑗 = 1, 2, . . . , 𝑛 (since
these 𝑒𝑗 ’s are in 𝑆). But then
π‘₯ = βŸ¨π‘’π‘› , π‘₯βŸ©π‘’π‘› + βŸ¨π‘’π‘›+1 , π‘₯⟩ + · · · + βŸ¨π‘’π‘‘ , π‘₯βŸ©π‘’π‘‘ ∈ π‘Š.
We’ve now shown 𝑆 ⊥ ⊆ π‘Š , so π‘Š = 𝑆 ⊥ . We now have
𝑉 = 𝑆 ⊕ 𝑆 ⊥ .
To get the last statement of the theorem, we have to show that
𝑆 = 𝑆 ⊥⊥ = (𝑆 ⊥ )⊥ .
Let 𝑣 ∈ 𝑉 . Then we can write 𝑣 = 𝑠 + 𝑝 uniquely, where 𝑠 ∈ 𝑆 and
𝑝 ∈ 𝑆 ⊥ . Thus, 𝑣 ∈ 𝑆 if and only if 𝑝 = 0. We have
𝑣 ∈ (𝑆 ⊥ )⊥ ⇐⇒ 𝑣 ⊥ 𝑆 ⊥
⇐⇒ βŸ¨π‘£, π‘žβŸ© = 0, ∀π‘ž ∈ 𝑆 ⊥
⇐⇒ 0 = βŸ¨π‘  + 𝑝, π‘žβŸ© = βŸ¨π‘ , π‘žβŸ© + βŸ¨π‘, π‘žβŸ© = βŸ¨π‘, π‘žβŸ©, ∀π‘ž ∈ 𝑆 ⊥
⇐⇒ 𝑝 = 0, since 𝑝 ∈ 𝑆 ⊥
⇐⇒ 𝑣 = 𝑠
⇐⇒ 𝑣 ∈ 𝑆.
Thus, (𝑆 ⊥ )⊥ = 𝑆.
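As a numerical illustration of the decomposition 𝑉 = 𝑆 ⊕ 𝑆 ⊥ (assuming NumPy; the subspace 𝑆 is a made-up example spanned by orthonormal columns):

```python
import numpy as np

rng = np.random.default_rng(2)
# Columns of E: an orthonormal basis of a 2-dimensional subspace S of R^5.
E, _ = np.linalg.qr(rng.standard_normal((5, 2)))
v = rng.standard_normal(5)

s = E @ (E.T @ v)     # proj_S(v) = sum_i <e_i, v> e_i  (real case)
p = v - s             # the S-perp component
assert np.allclose(E.T @ p, np.zeros(2))   # p is orthogonal to S
assert np.allclose(s + p, v)               # v = s + p
```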
Exercise 3.5. If 𝑋 is just a subset of 𝑉 , show that (𝑋 ⊥ )⊥ = span(𝑋).
This theorem has many nice consequences.
Exercise 3.6. Let 𝑉 and π‘Š be inner product spaces over K and let
𝑇 : 𝑉 → π‘Š be a linear transformation, so 𝑇 * : π‘Š → 𝑉 . Show that
π‘Š = im(𝑇 ) ⊕ ker(𝑇 * )
𝑉 = im(𝑇 * ) ⊕ ker(𝑇 ),
where these are orthogonal direct sums, e.g., im(𝑇 )⊥ = ker(𝑇 * ).
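A numerical sanity check of the first decomposition, not a substitute for the proof (assuming NumPy; the matrix 𝐴 plays the role of 𝑇 and, since it is real, 𝐴𝑑 plays the role of 𝑇 *):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 2))      # full rank (almost surely)
x = rng.standard_normal(2)

# The last m - r left singular vectors of A span the orthogonal
# complement of im(A); for real A this is ker(A^t) = ker(A*).
U, sigma, Vt = np.linalg.svd(A)
K = U[:, 2:]

assert np.allclose(K.T @ (A @ x), np.zeros(2))   # im(A) is orthogonal to ker(A*)
```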
Department of Mathematics and Statistics, Texas Tech University,
Lubbock, TX 79409-1042
E-mail address: lance.drager@ttu.edu