Orthogonal Projection
Given any nonzero vector v, it is possible to decompose an arbitrary vector u into a component that points in the direction of v and one that points in a direction orthogonal to v (see Fig. 2; the plane of this diagram is the plane determined by the two vectors u and v). The component of u in the direction of v, also called the projection of u onto v, is denoted û; it equals αv for an appropriate choice of scalar α. The component of u orthogonal to v, a vector we label w, must therefore satisfy

u = û + w.
Thus, since w is orthogonal to v, we have

0 = w ⋅ v = (u − û) ⋅ v = (u − αv) ⋅ v = u ⋅ v − α(v ⋅ v);

that is, α = (u ⋅ v)/(v ⋅ v). In other words,

û = αv = ((u ⋅ v)/(v ⋅ v)) v.
Notice that the projection of u onto any nonzero multiple cv of v is the same vector:

û = ((u ⋅ cv)/(cv ⋅ cv)) (cv) = ((u ⋅ v)/(v ⋅ v)) v.
That is, u has the same projection onto any nonzero
vector in the linear subspace L spanned by v. For
this reason, we often denote û by projL u,
recognizing that it has the same value for any
vector v chosen from L:
û = proj_L u = ((u ⋅ v)/(v ⋅ v)) v.
If the vectors u and v lie in R^2 (see Fig. 3), then the point determined by the vector û is the point on the line L through v that lies closest to the point determined by u. It follows that the distance between the point determined by u and the line L is the distance between u and û: ||u − û|| = ||w||.
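These formulas are easy to check numerically. Below is a minimal NumPy sketch (the vectors u and v are arbitrary illustrative choices; proj is a small helper defined in the sketch, not a library routine):

```python
import numpy as np

def proj(u, v):
    """Projection of u onto the line spanned by a nonzero vector v."""
    return (np.dot(u, v) / np.dot(v, v)) * v

u = np.array([3.0, 4.0])
v = np.array([2.0, 0.0])

u_hat = proj(u, v)   # component of u along v
w = u - u_hat        # component of u orthogonal to v

print(u_hat)                               # [3. 0.]
print(np.dot(w, v))                        # 0.0 -- w really is orthogonal to v
print(np.linalg.norm(w))                   # 4.0 -- distance from u to the line L
print(np.allclose(proj(u, 5 * v), u_hat))  # True -- same projection onto any cv
```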
We can apply the formula for proj_L u to a much more general setting. Suppose that S = {v_1, v_2, …, v_k} is an orthogonal set of nonzero vectors in R^n. Then the following theorem comes into play:
Theorem If S = {v_1, v_2, …, v_k} is an orthogonal set of nonzero vectors in R^n, then it is a basis for the subspace it spans.
Proof Suppose that 0 = c_1 v_1 + ⋯ + c_k v_k for suitable scalars c_1, …, c_k. Then, because the v's are orthogonal to each other,

0 = 0 ⋅ v_i
= (c_1 v_1 + ⋯ + c_k v_k) ⋅ v_i
= c_1 (v_1 ⋅ v_i) + ⋯ + c_k (v_k ⋅ v_i)
= c_i (v_i ⋅ v_i).

But since v_i ⋅ v_i is never zero, it follows that each of the c's equals 0. So S is a linearly independent set, and is therefore a basis for the space it spans. //
Thus, any orthogonal set of nonzero vectors is automatically an orthogonal basis for the space it spans.
Theorem Let V = Span{v_1, v_2, …, v_k} be the subspace of R^n spanned by an orthogonal set S = {v_1, v_2, …, v_k} of nonzero vectors. Then any vector u in V can be represented in terms of the basis S as

u = ((u ⋅ v_1)/(v_1 ⋅ v_1)) v_1 + ⋯ + ((u ⋅ v_k)/(v_k ⋅ v_k)) v_k.
Proof Write u = c_1 v_1 + ⋯ + c_k v_k. Dotting both sides with v_i and using orthogonality gives u ⋅ v_i = c_i (v_i ⋅ v_i), so c_i = (u ⋅ v_i)/(v_i ⋅ v_i). That is, the component of u in the direction of the basis vector v_i is its projection onto v_i, namely

proj_{v_i} u = ((u ⋅ v_i)/(v_i ⋅ v_i)) v_i,

and the result follows. //
The representation given in the last theorem is simplified considerably when, in addition to being orthogonal, the basis {v_1, v_2, …, v_k} is orthonormal, i.e., each of the basis vectors has unit length. For then we have v_i ⋅ v_i = ||v_i||^2 = 1 and the denominators of the fractions disappear. In terms of an orthonormal basis, vectors in such a space have the simple form

u = (u ⋅ v_1) v_1 + ⋯ + (u ⋅ v_k) v_k.
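To see the representation theorem in action, here is a short NumPy sketch (the orthogonal pair v1, v2 and the coefficients are arbitrary illustrative choices):

```python
import numpy as np

# An orthogonal (not orthonormal) basis of a plane V in R^3 -- arbitrary choice.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 2.0])
assert np.dot(v1, v2) == 0                 # the pair really is orthogonal

u = 2 * v1 - 3 * v2                        # a vector known to lie in V

# Recover the coefficients from the projection formula -- no system to solve.
c1 = np.dot(u, v1) / np.dot(v1, v1)
c2 = np.dot(u, v2) / np.dot(v2, v2)
print(c1, c2)                              # 2.0 -3.0
print(np.allclose(u, c1 * v1 + c2 * v2))   # True
```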
Orthonormal sets of vectors can be used to build matrices that are important in many applications of linear algebra.
Theorem An m × n matrix U has orthonormal columns if and only if U^T U = I, the n × n identity matrix. (Since m and n need not be equal, this is not equivalent to saying that U is invertible!)
Proof Suppose that U = [u_1 u_2 ⋯ u_n], with u_1, u_2, …, u_n ∈ R^m. Then the rows of U^T are u_1^T, u_2^T, …, u_n^T, so U^T U is the n × n matrix whose (i, j) entry is

u_i^T u_j = u_i ⋅ u_j.

This matrix equals the identity matrix if and only if u_i ⋅ u_i = 1 for each i and u_i ⋅ u_j = 0 whenever i ≠ j, that is, if and only if {u_1, u_2, …, u_n} is an orthonormal set of vectors. //
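A short NumPy sketch checks both claims on a rectangular example (the matrix is built from arbitrary random data; QR factorization is used here merely as a convenient way to manufacture orthonormal columns):

```python
import numpy as np

# Manufacture a 4 x 2 matrix with orthonormal columns by QR-factoring
# arbitrary random data; any U with orthonormal columns would do.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((4, 2)))

print(np.allclose(U.T @ U, np.eye(2)))  # True: U^T U is the 2 x 2 identity
print(np.allclose(U @ U.T, np.eye(4)))  # False: U is rectangular, not invertible
```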
Theorem Let U be an m × n matrix with orthonormal columns. Then for any x, y ∈ R^n,

(1) ||Ux|| = ||x||;
(2) (Ux) ⋅ (Uy) = x ⋅ y; and
(3) (Ux) ⋅ (Uy) = 0 if and only if x ⋅ y = 0.

That is, the transformation T: R^n → R^m with matrix representation T(x) = Ux preserves lengths of vectors and the angle between vectors.
Proof If U = [u_1 u_2 ⋯ u_n], x = (x_1, x_2, …, x_n), and y = (y_1, y_2, …, y_n), then

(Ux) ⋅ (Uy) = (x_1 u_1 + ⋯ + x_n u_n) ⋅ (y_1 u_1 + ⋯ + y_n u_n)
= (x_1 u_1 + ⋯ + x_n u_n) ⋅ (y_1 u_1) + ⋯ + (x_1 u_1 + ⋯ + x_n u_n) ⋅ (y_n u_n)
= [x_1 y_1 (u_1 ⋅ u_1) + ⋯ + x_n y_1 (u_n ⋅ u_1)] + ⋯ + [x_1 y_n (u_1 ⋅ u_n) + ⋯ + x_n y_n (u_n ⋅ u_n)]
= x_1 y_1 (u_1 ⋅ u_1) + ⋯ + x_n y_n (u_n ⋅ u_n)
= x_1 y_1 + ⋯ + x_n y_n
= x ⋅ y,

which proves (2). (1) follows, for if we set y = x, then

||Ux||^2 = (Ux) ⋅ (Ux) = x ⋅ x = ||x||^2 ⇒ ||Ux|| = ||x||.

Finally, (3) is an even more immediate consequence of (2). //
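A quick numerical confirmation of (1) and (2), again using a QR factorization of arbitrary random data to obtain orthonormal columns:

```python
import numpy as np

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((5, 3)))  # orthonormal columns
x = rng.standard_normal(3)
y = rng.standard_normal(3)

print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))  # (1) True
print(np.isclose(np.dot(U @ x, U @ y), np.dot(x, y)))        # (2) True
```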
Returning to the idea of decomposing a vector with
respect to an orthogonal basis, we have the
following important generalization:
The Orthogonal Decomposition Theorem Let V be a subspace of R^n. Then every vector u in R^n has a unique decomposition of the form u = û + w, where û lies in V and w lies in V^⊥. If V has an orthogonal basis {v_1, v_2, …, v_k}, then

û = ((u ⋅ v_1)/(v_1 ⋅ v_1)) v_1 + ⋯ + ((u ⋅ v_k)/(v_k ⋅ v_k)) v_k

and so w = u − û.
Proof The vector

û = ((u ⋅ v_1)/(v_1 ⋅ v_1)) v_1 + ⋯ + ((u ⋅ v_k)/(v_k ⋅ v_k)) v_k

certainly lies in V. Also, the vector w = u − û lies in V^⊥ because for every v_i,

w ⋅ v_i = u ⋅ v_i − û ⋅ v_i
= u ⋅ v_i − ((u ⋅ v_1)/(v_1 ⋅ v_1)) (v_1 ⋅ v_i) − ⋯ − ((u ⋅ v_k)/(v_k ⋅ v_k)) (v_k ⋅ v_i)
= u ⋅ v_i − ((u ⋅ v_i)/(v_i ⋅ v_i)) (v_i ⋅ v_i)
= 0,

and a vector orthogonal to each basis vector of V is orthogonal to every linear combination of them, hence to every vector in V,
whereby the decomposition u = û + w does represent u as a sum of a vector in V and a vector in V^⊥. This decomposition is unique, for if there are vectors û′ ∈ V and w′ ∈ V^⊥ for which u = û′ + w′, then û′ + w′ = û + w ⇒ û′ − û = w − w′. But the vector on the left side of this last equation lies in V, while the vector on the right side lies in V^⊥. Thus it is orthogonal to itself. But since v ⋅ v = 0 ⇒ v = 0, we must have û′ − û = 0 and w − w′ = 0. That is, û′ = û and w′ = w. //
Notice that the proof of the uniqueness of the decomposition u = û + w is independent of the choice of the basis for the space V. Thus, despite the formula given in the theorem, the vector û does not depend on the choice of basis, only on the space V. It makes sense, then, to use the notation proj_V u for û.
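As a concrete check, here is a NumPy sketch of the decomposition for a plane V in R^3 (the orthogonal basis and the vector u are arbitrary illustrative choices):

```python
import numpy as np

# The same arbitrary orthogonal basis of a plane V in R^3 as before.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 2.0])
u = np.array([3.0, 1.0, 5.0])    # an arbitrary vector of R^3, not in V

u_hat = (np.dot(u, v1) / np.dot(v1, v1)) * v1 \
      + (np.dot(u, v2) / np.dot(v2, v2)) * v2   # proj_V u
w = u - u_hat                                    # the component in V-perp

print(u_hat, w)                                  # [4. 0. 4.] [-1. 1. 1.]
print(np.isclose(np.dot(w, v1), 0), np.isclose(np.dot(w, v2), 0))  # True True
```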
Corollary When the basis {u_1, u_2, …, u_k} of V is orthonormal, the projection of u in R^n onto V is

proj_V u = (u ⋅ u_1) u_1 + ⋯ + (u ⋅ u_k) u_k.

If U = [u_1 u_2 ⋯ u_k], then proj_V u = UU^T u.
Proof The first formula for proj_V u follows directly from the theorem. To get the second formula, observe that the weights u ⋅ u_1, …, u ⋅ u_k in the first formula can be written in the form u_1^T u, …, u_k^T u. That is, they are the entries of the vector U^T u. Consequently,

proj_V u = (u ⋅ u_1) u_1 + ⋯ + (u ⋅ u_k) u_k
= [u_1 u_2 ⋯ u_k] U^T u
= UU^T u,

and the second formula follows. //
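In other words, UU^T is the matrix of the projection onto V. A NumPy sketch, reusing the plane V from the previous examples with its basis normalized to unit length:

```python
import numpy as np

# Normalize the same orthogonal pair to get an orthonormal basis of V.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 2.0])
U = np.column_stack([v1 / np.linalg.norm(v1), v2 / np.linalg.norm(v2)])

P = U @ U.T                       # 3 x 3 matrix projecting R^3 onto V
u = np.array([3.0, 1.0, 5.0])

print(P @ u)                      # [4. 0. 4.] -- matches proj_V u computed above
print(np.allclose(P @ P, P))      # True: projecting twice changes nothing
```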
We mentioned earlier that in R^2, the point determined by the projection of u onto the line L through v is the point on the line closest to that determined by u itself. This generalizes to higher-dimensional Euclidean space:
The Best Approximation Theorem Let V be a subspace of R^n. Then given any u in R^n, the vector û = proj_V u is the closest point in V to u; that is,

||u − û|| < ||u − v||

for any v ∈ V different from û.
Proof Let v ∈ V be any vector other than û. Then û − v is a nonzero vector in V. But w = u − û lies in V^⊥, so it is orthogonal to û − v. Now the sum of these orthogonal vectors is (u − û) + (û − v) = u − v, so by the Pythagorean Theorem,

||u − û||^2 + ||û − v||^2 = ||u − v||^2.

Since û − v is nonzero, ||û − v||^2 > 0, so ||u − û||^2 < ||u − v||^2, from which the result follows. //
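A final NumPy sketch illustrates the inequality (the subspace V and the vector u are arbitrary random choices; each sampled competitor v in V ends up no closer to u than û):

```python
import numpy as np

rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.standard_normal((4, 2)))  # orthonormal basis of V in R^4
u = rng.standard_normal(4)
u_hat = U @ (U.T @ u)                             # proj_V u

# No sampled v in V comes closer to u than u_hat does.
for _ in range(5):
    v = U @ rng.standard_normal(2)                # a random vector in V
    print(np.linalg.norm(u - u_hat) <= np.linalg.norm(u - v))  # True each time
```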