Orthogonal Projection

Given any nonzero vector $v$, it is possible to decompose an arbitrary vector $u$ into a component that points in the direction of $v$ and one that points in a direction orthogonal to $v$ (see Fig. 2, p. 386; the plane of this diagram is the plane determined by the two vectors $u$ and $v$). The component of $u$ in the direction of $v$, also called the projection of $u$ onto $v$, is denoted $\hat{u}$; it equals $\alpha v$ for an appropriate choice of scalar $\alpha$. The component of $u$ orthogonal to $v$, a vector we label $w$, must therefore satisfy $u = \hat{u} + w$. Thus, since $w$ is orthogonal to $v$, we have

$$0 = w \cdot v = (u - \hat{u}) \cdot v = (u - \alpha v) \cdot v = u \cdot v - \alpha (v \cdot v),$$

that is, $\alpha = \dfrac{u \cdot v}{v \cdot v}$. In other words,

$$\hat{u} = \alpha v = \frac{u \cdot v}{v \cdot v}\, v.$$

Notice that the projection of $u$ onto any nonzero multiple $cv$ of $v$ is the same vector,

$$\hat{u} = \frac{u \cdot cv}{cv \cdot cv}\,(cv) = \frac{u \cdot v}{v \cdot v}\, v.$$

That is, $u$ has the same projection onto any nonzero vector in the linear subspace $L$ spanned by $v$. For this reason, we often denote $\hat{u}$ by $\operatorname{proj}_L u$, recognizing that it has the same value for any nonzero vector $v$ chosen from $L$:

$$\hat{u} = \operatorname{proj}_L u = \frac{u \cdot v}{v \cdot v}\, v.$$

If the vectors $u$ and $v$ lie in $\mathbb{R}^2$ (see Fig. 3, p. 387), then the point determined by the vector $\hat{u}$ is the point on the line $L$ through $v$ that lies closest to the point determined by $u$. It follows that the distance between the point determined by $u$ and the line $L$ is the distance between $u$ and $\hat{u}$: $\|u - \hat{u}\| = \|w\|$.

We can apply the formula for $\operatorname{proj}_L u$ in a much more general setting. Suppose that $S = \{v_1, v_2, \ldots, v_k\}$ is an orthogonal set of nonzero vectors in $\mathbb{R}^n$. Then the following theorem comes into play:

Theorem  If $S = \{v_1, v_2, \ldots, v_k\}$ is an orthogonal set of nonzero vectors in $\mathbb{R}^n$, then it is a basis for the subspace it spans.

Proof  Suppose that $0 = c_1 v_1 + \cdots + c_k v_k$ for suitable scalars $c_1, \ldots, c_k$. Then, because the $v$'s are orthogonal to each other,

$$0 = 0 \cdot v_i = (c_1 v_1 + \cdots + c_k v_k) \cdot v_i = c_1 (v_1 \cdot v_i) + \cdots + c_k (v_k \cdot v_i) = c_i (v_i \cdot v_i).$$

But since $v_i \cdot v_i$ is never zero, it follows that each of the $c$'s equals 0. So $S$ is a linearly independent set, and is therefore a basis for the space it spans. //

Thus, any orthogonal set of nonzero vectors is automatically an orthogonal basis for the space it spans.

Theorem  Let $V = \operatorname{Span}\{v_1, v_2, \ldots, v_k\}$ be the subspace of $\mathbb{R}^n$ spanned by an orthogonal set $S = \{v_1, v_2, \ldots, v_k\}$ of nonzero vectors. Then any vector $u$ in $V$ can be represented in terms of the basis $S$ as

$$u = \frac{u \cdot v_1}{v_1 \cdot v_1}\, v_1 + \cdots + \frac{u \cdot v_k}{v_k \cdot v_k}\, v_k.$$

Proof  The component of $u$ in the direction of the basis vector $v_i$ is its projection onto $v_i$, namely

$$\operatorname{proj}_{v_i} u = \frac{u \cdot v_i}{v_i \cdot v_i}\, v_i.$$

The result follows. //

The representation given in the last theorem is simplified considerably when, in addition to being orthogonal, the basis $\{v_1, v_2, \ldots, v_k\}$ is orthonormal, i.e., each of the basis vectors has unit length. For then we have $v_i \cdot v_i = \|v_i\|^2 = 1$ and the denominators of the fractions disappear. In terms of an orthonormal basis, vectors in such a space have the simple form

$$u = (u \cdot v_1) v_1 + \cdots + (u \cdot v_k) v_k.$$
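This expansion is easy to verify numerically. Below is a minimal NumPy sketch; the particular orthogonal set and the vector $u$ are illustrative assumptions, not taken from the text. It computes the weights $(u \cdot v_i)/(v_i \cdot v_i)$ and confirms that they reconstruct $u$.

```python
import numpy as np

# An orthogonal set of nonzero vectors in R^3 (illustrative choice);
# pairwise dot products are all zero.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
v3 = np.array([0.0, 0.0, 2.0])

u = np.array([3.0, 1.0, 4.0])

# Weight on v_i in the expansion of u is (u . v_i) / (v_i . v_i).
weights = [float(np.dot(u, v) / np.dot(v, v)) for v in (v1, v2, v3)]
reconstruction = sum(w * v for w, v in zip(weights, (v1, v2, v3)))

print(weights)                         # [2.0, 1.0, 2.0]
print(np.allclose(reconstruction, u))  # True: the expansion recovers u
```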
Orthonormal sets of vectors can be used to build matrices that are important in many applications of linear algebra.

Theorem  An $m \times n$ matrix $U$ has orthonormal columns if and only if $U^T U = I$. (Since $m$ and $n$ need not be equal, this is not equivalent to saying that $U$ is invertible!)

Proof  Suppose that $U = [\,u_1 \; u_2 \; \cdots \; u_n\,]$, with $u_1, u_2, \ldots, u_n \in \mathbb{R}^m$. Then

$$U^T U = \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_n^T \end{bmatrix} [\,u_1 \; u_2 \; \cdots \; u_n\,]
= \begin{bmatrix} u_1^T u_1 & u_1^T u_2 & \cdots & u_1^T u_n \\ u_2^T u_1 & u_2^T u_2 & \cdots & u_2^T u_n \\ \vdots & \vdots & \ddots & \vdots \\ u_n^T u_1 & u_n^T u_2 & \cdots & u_n^T u_n \end{bmatrix}
= \begin{bmatrix} u_1 \cdot u_1 & u_1 \cdot u_2 & \cdots & u_1 \cdot u_n \\ u_2 \cdot u_1 & u_2 \cdot u_2 & \cdots & u_2 \cdot u_n \\ \vdots & \vdots & \ddots & \vdots \\ u_n \cdot u_1 & u_n \cdot u_2 & \cdots & u_n \cdot u_n \end{bmatrix},$$

which equals the identity matrix if and only if $\{u_1, u_2, \ldots, u_n\}$ is an orthonormal set of vectors. //

Theorem  Let $U$ be an $m \times n$ matrix with orthonormal columns. Then for any $x, y \in \mathbb{R}^n$,

(1) $\|Ux\| = \|x\|$;
(2) $(Ux) \cdot (Uy) = x \cdot y$; and
(3) $(Ux) \cdot (Uy) = 0$ if and only if $x \cdot y = 0$.

That is, the transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ with matrix representation $T(x) = Ux$ preserves lengths of vectors and the angle between vectors.

Proof  If $U = [\,u_1 \; u_2 \; \cdots \; u_n\,]$, $x = (x_1, \ldots, x_n)^T$ and $y = (y_1, \ldots, y_n)^T$, then

$$\begin{aligned}
(Ux) \cdot (Uy) &= (x_1 u_1 + \cdots + x_n u_n) \cdot (y_1 u_1 + \cdots + y_n u_n) \\
&= (x_1 u_1 + \cdots + x_n u_n) \cdot (y_1 u_1) + \cdots + (x_1 u_1 + \cdots + x_n u_n) \cdot (y_n u_n) \\
&= [\,x_1 y_1 (u_1 \cdot u_1) + \cdots + x_n y_1 (u_n \cdot u_1)\,] + \cdots + [\,x_1 y_n (u_1 \cdot u_n) + \cdots + x_n y_n (u_n \cdot u_n)\,] \\
&= x_1 y_1 (u_1 \cdot u_1) + \cdots + x_n y_n (u_n \cdot u_n) \\
&= x_1 y_1 + \cdots + x_n y_n \\
&= x \cdot y,
\end{aligned}$$

which proves (2). (1) follows, for if we set $y = x$,

$$\|Ux\|^2 = (Ux) \cdot (Ux) = x \cdot x = \|x\|^2 \;\Rightarrow\; \|Ux\| = \|x\|.$$

Finally, (3) is an even more immediate consequence of (2). //

Returning to the idea of decomposing a vector with respect to an orthogonal basis, we have the following important generalization:

The Orthogonal Decomposition Theorem  Let $V$ be a subspace of $\mathbb{R}^n$. Then every vector $u$ in $\mathbb{R}^n$ has a unique decomposition of the form $u = \hat{u} + w$, where $\hat{u}$ lies in $V$ and $w$ lies in $V^\perp$. If $V$ has an orthogonal basis $\{v_1, v_2, \ldots, v_k\}$, then

$$\hat{u} = \frac{u \cdot v_1}{v_1 \cdot v_1}\, v_1 + \cdots + \frac{u \cdot v_k}{v_k \cdot v_k}\, v_k$$

and so $w = u - \hat{u}$.

Proof  The vector

$$\hat{u} = \frac{u \cdot v_1}{v_1 \cdot v_1}\, v_1 + \cdots + \frac{u \cdot v_k}{v_k \cdot v_k}\, v_k$$

certainly lies in $V$. Also, the vector $w = u - \hat{u}$ lies in $V^\perp$ because for every $v_i$,

$$\begin{aligned}
w \cdot v_i &= u \cdot v_i - \hat{u} \cdot v_i \\
&= u \cdot v_i - \frac{u \cdot v_1}{v_1 \cdot v_1}\,(v_1 \cdot v_i) - \cdots - \frac{u \cdot v_k}{v_k \cdot v_k}\,(v_k \cdot v_i) \\
&= u \cdot v_i - \frac{u \cdot v_i}{v_i \cdot v_i}\,(v_i \cdot v_i) \\
&= 0,
\end{aligned}$$

whereby the decomposition $u = \hat{u} + w$ does represent $u$ as a sum of a vector in $V$ and a vector in $V^\perp$. This decomposition is unique, for if there are vectors $\hat{u}' \in V$ and $w' \in V^\perp$ for which $u = \hat{u}' + w'$, then

$$\hat{u}' + w' = \hat{u} + w \;\Rightarrow\; \hat{u}' - \hat{u} = w - w'.$$

But the vector on the left side of this last equation lies in $V$ while the vector on the right side lies in $V^\perp$. Thus, it is orthogonal to itself. But since $v \cdot v = 0 \Rightarrow v = 0$, we must have $\hat{u}' - \hat{u} = 0$ and $w - w' = 0$. That is, $\hat{u}' = \hat{u}$ and $w' = w$. //

Notice that the proof of the uniqueness of the decomposition $u = \hat{u} + w$ is independent of the choice of the basis for the space $V$. Thus, despite the formula given in the theorem, the vector $\hat{u}$ does not depend on the choice of basis, only on the space $V$. It makes sense then to use the notation $\operatorname{proj}_V u$ for $\hat{u}$.

Corollary  When the basis $\{u_1, u_2, \ldots, u_k\}$ of $V$ is orthonormal, the projection of $u$ in $\mathbb{R}^n$ onto $V$ is

$$\operatorname{proj}_V u = (u \cdot u_1) u_1 + \cdots + (u \cdot u_k) u_k.$$

If $U = [\,u_1 \; u_2 \; \cdots \; u_k\,]$, then $\operatorname{proj}_V u = U U^T u$.

Proof  The first formula for $\operatorname{proj}_V u$ follows directly from the theorem. To get the second formula, observe that the weights $u \cdot u_1, \ldots, u \cdot u_k$ in the first formula can be written in the form $u_1^T u, \ldots, u_k^T u$. That is, they are the entries of the vector $U^T u$. Consequently,

$$\operatorname{proj}_V u = (u \cdot u_1) u_1 + \cdots + (u \cdot u_k) u_k = [\,u_1 \; u_2 \; \cdots \; u_k\,]\, U^T u = U U^T u,$$

and the second formula follows. //
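The corollary's formula $\operatorname{proj}_V u = U U^T u$ translates directly into code. Here is a minimal NumPy sketch; the subspace $V$ and the vector $u$ are illustrative assumptions. It checks that $U^T U = I$ (orthonormal columns), computes the projection, and verifies that the residual $w = u - \hat{u}$ lies in $V^\perp$.

```python
import numpy as np

# Orthonormal basis for a 2-dimensional subspace V of R^3 (illustrative):
# normalize an orthogonal pair and stack the results as columns of U.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 1.0])   # orthogonal to v1
U = np.column_stack([v1 / np.linalg.norm(v1), v2 / np.linalg.norm(v2)])

u = np.array([2.0, 0.0, 3.0])

# U has orthonormal columns: U^T U is the 2x2 identity,
# even though U U^T is NOT the 3x3 identity (m != n here).
print(np.allclose(U.T @ U, np.eye(2)))  # True

# Projection onto V via the corollary: proj_V u = U U^T u.
u_hat = U @ (U.T @ u)
w = u - u_hat

# w lies in V-perp: it is orthogonal to every column of U.
print(np.allclose(U.T @ w, 0))          # True
```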
We mentioned earlier that in $\mathbb{R}^2$, the point determined by the projection of $u$ onto the line $L$ through $v$ is the point on the line closest to that determined by $u$ itself. This has a generalization to higher-dimensional Euclidean spaces:

The Best Approximation Theorem  Let $V$ be a subspace of $\mathbb{R}^n$. Then given any $u$ in $\mathbb{R}^n$, the vector $\hat{u} = \operatorname{proj}_V u$ is the closest point in $V$ to $u$; that is,

$$\|u - \hat{u}\| < \|u - v\|$$

for any $v \in V$ different from $\hat{u}$.

Proof  Let $v \in V$ be any vector other than $\hat{u}$. Then $\hat{u} - v$ is a nonzero vector in $V$. But $w = u - \hat{u}$ lies in $V^\perp$, so it is orthogonal to $\hat{u} - v$. Now the sum of these orthogonal vectors is $(u - \hat{u}) + (\hat{u} - v) = u - v$, so by the Pythagorean Theorem,

$$\|u - \hat{u}\|^2 + \|\hat{u} - v\|^2 = \|u - v\|^2.$$

Since $\hat{u} - v$ is nonzero, $\|\hat{u} - v\|^2 > 0$, so $\|u - \hat{u}\|^2 < \|u - v\|^2$, from which the result follows. //
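The theorem can also be checked empirically: no point of $V$ sampled at random ever comes closer to $u$ than $\hat{u}$ does. The sketch below continues the illustrative setup of the previous example (same assumed $U$ and $u$), drawing points $v = Uc$ of $V$ for random coefficient vectors $c$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same illustrative subspace and vector as before.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 1.0])
U = np.column_stack([v1 / np.linalg.norm(v1), v2 / np.linalg.norm(v2)])
u = np.array([2.0, 0.0, 3.0])
u_hat = U @ (U.T @ u)

# Sample random points v = U c in V; by the Best Approximation Theorem,
# ||u - u_hat|| <= ||u - v|| for every v in V, with equality only at v = u_hat.
best = np.linalg.norm(u - u_hat)
others = [np.linalg.norm(u - U @ rng.normal(size=2)) for _ in range(1000)]
print(best <= min(others))  # True: no sampled point of V is closer to u
```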