Vectors and Vector Operations

7.2 Projection on a Plane
In the previous section we looked at projections on a line. In this section we extend this
to projections on planes.
Problem. Given a plane P through the origin and a point v, find the point w on P closest to v.

[Figure: the plane P, the point v, and the closest point w on P.]

A plane through the origin can be described as all linear combinations of any two vectors lying
in the plane that do not lie on the same line through the origin. So the problem can be restated
as follows.
Problem. Given u1 and u2, find numbers c1 and c2 so that
| v - (c1u1 + c2u2) | ≤ | v - (a1u1 + a2u2) | for all numbers a1 and a2.
As in the case of the same problem for a line instead of a plane, it turns out that the vector
from w to v is perpendicular to P, and w is called the orthogonal projection of v on P.
Proposition 1. Let u1, u2 and v be given vectors. If c1 and c2 are numbers such that
v - (c1u1 + c2u2) is orthogonal to u1 and u2, then | v - (c1u1 + c2u2) | ≤ | v - (a1u1 + a2u2) | for
all numbers a1 and a2.
Proof. We apply the Pythagorean theorem (1) in the
previous section with y = v - (c1u1 + c2u2) and
x = (a1u1 + a2u2) – (c1u1 + c2u2). By hypothesis y is
orthogonal to u1 and u2, i.e. y . u1 = 0 and y . u2 = 0.

[Figure: the point v, its projection w = c1u1 + c2u2, a general point a1u1 + a2u2 of the plane,
and the vectors y = v - (c1u1 + c2u2), x = (a1u1 + a2u2) - (c1u1 + c2u2), and
y - x = v - (a1u1 + a2u2).]

One has
y . x = y . ((a1 – c1)u1 + (a2 – c2)u2)
= (a1 – c1)(y . u1) + (a2 – c2)(y . u2) = 0
so x and y are orthogonal. Then y – x = v - (a1u1 + a2u2) and (1) of the previous section
becomes
| v - (a1u1 + a2u2) |² = | (a1u1 + a2u2) - (c1u1 + c2u2) |² + | v - (c1u1 + c2u2) |²
The right side is greater than or equal to | v - (c1u1 + c2u2) |², so we get
| v - (c1u1 + c2u2) | ≤ | v - (a1u1 + a2u2) |. //
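The proposition can be checked numerically. Here is a minimal sketch in Python with NumPy (the code and names are ours, not part of the text): it picks three random vectors, chooses c1 and c2 with the orthogonality property (by solving the small linear system that Proposition 2 below makes explicit), and compares w with many other points of the plane.

```python
import numpy as np

rng = np.random.default_rng(0)
u1, u2, v = rng.standard_normal((3, 3))   # three random vectors in R^3

# c1, c2 chosen so that v - (c1 u1 + c2 u2) is orthogonal to u1 and u2
# (this is the system made explicit in Proposition 2 below)
G = np.array([[u1 @ u1, u1 @ u2],
              [u2 @ u1, u2 @ u2]])
c1, c2 = np.linalg.solve(G, np.array([u1 @ v, u2 @ v]))
w = c1 * u1 + c2 * u2

# Proposition 1: w is at least as close to v as any other point of the plane
a = rng.standard_normal((1000, 2))                 # 1000 random pairs (a1, a2)
others = a[:, [0]] * u1 + a[:, [1]] * u2           # the points a1 u1 + a2 u2
dists = np.linalg.norm(v - others, axis=1)
print(np.all(np.linalg.norm(v - w) <= dists + 1e-12))   # True
```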
It turns out that finding c1 and c2 so that v - (c1u1 + c2u2) is orthogonal to u1 and u2 is
equivalent to solving a system of equations.
Proposition 2. Let u1, u2 and v be given vectors. Then v - (c1u1 + c2u2) is orthogonal to
u1 and u2 if and only if
(1)    (u1 . u1) c1 + (u1 . u2) c2 = u1 . v
       (u2 . u1) c1 + (u2 . u2) c2 = u2 . v
Proof. v - (c1u1 + c2u2) is orthogonal to u1 ⇔ u1 . (v - (c1u1 + c2u2)) = 0 ⇔
u1 . v - (c1(u1 . u1) + c2(u1 . u2)) = 0 ⇔ the first equation in (1) holds. Similarly
v - (c1u1 + c2u2) is orthogonal to u2 ⇔ the second equation in (1) holds. //
0
Example 1. Find the orthogonal projection of  0  on the plane of all linear combination
3
1
1
 
 
of  - 1  and  - 1 .
 1
 1
 1
-1
0
1
1
In this case u1 =  , u2 =   and v =  0 . One has u1 . u1 = 3, u1 . u2 = 1, u2 . u2 = 3,
 1
 1
3
u1 . v = 3 and u2 . v = 3, so the equations (1) become
3c1 + c2 = 3
c1 + 3c2 = 3
Multiply the first equation by three and subtract the second equation to get 8c1 = 6. So
c1 = 3/4. Substitute into the first equation and solve for c2 to get c2 = 3/4. So
w = c1u1 + c2u2 = (3/4)(1, -1, 1) + (3/4)(-1, -1, 1) = (0, -3/2, 3/2).
Another Point of View. Suppose we wanted to find c1 and c2 to satisfy all three of the
equations
(2)      c1 - c2 = 0
       - c1 - c2 = 0
         c1 + c2 = 3
Clearly we can't do it. However, for a given pair of values of c1 and c2 let
e1 = 0 - (c1 - c2)     = error in the first equation
e2 = 0 - (- c1 - c2)   = error in the second equation
e3 = 3 - (c1 + c2)     = error in the third equation
S  = (e1)² + (e2)² + (e3)²
   = (0 - (c1 - c2))² + (0 - (- c1 - c2))² + (3 - (c1 + c2))²
So S is the sum of the squares of the errors in the equations (2) for a given pair of values of c1
and c2. It is a measure of how far off c1 and c2 are from a solution of (2). As a substitute for an
exact solution of (2) we might ask to find c1 and c2 to minimize S. Such c1 and c2 are called
a least squares solution to (2). Note that S can be written as

    S = | (0 - (c1 - c2), 0 - (- c1 - c2), 3 - (c1 + c2)) |²
      = | (0, 0, 3) - (c1(1, -1, 1) + c2(-1, -1, 1)) |²
      = | v - (c1u1 + c2u2) |²
So finding c1 and c2 to minimize S is just what we did to find the orthogonal projection of
v on the plane through u1 and u2. We saw that c1 = 3/4 and c2 = 3/4.
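As a sketch of this point of view (with NumPy; the code and names are ours, not the text's), the overdetermined system (2) can be handed to a least squares routine, which minimizes S and returns the same c1 and c2 as Example 1:

```python
import numpy as np

# System (2): the two columns multiply c1 and c2, the right side is (0, 0, 3)
M = np.array([[ 1.0, -1.0],
              [-1.0, -1.0],
              [ 1.0,  1.0]])
v = np.array([0.0, 0.0, 3.0])

c, residuals, rank, _ = np.linalg.lstsq(M, v, rcond=None)
S = np.sum((v - M @ c) ** 2)    # sum of squared errors at the minimizer

print(c)        # [0.75 0.75]
print(M @ c)    # [ 0.  -1.5  1.5], the orthogonal projection w found in Example 1
print(S)        # the minimal value of S (equals residuals[0])
```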
An Equivalent Way of Finding the Orthogonal Projection of v on P. When one
writes the equations (1) in vector form and uses the fact that x . y = xTy one gets
 (3)     [ (u1)Tu1   (u1)Tu2 ] [ c1 ]     [ (u1)Tv ]
         [ (u2)Tu1   (u2)Tu2 ] [ c2 ]  =  [ (u2)Tv ]
Let A be the matrix whose columns are u1 and u2. Then AT is the matrix whose rows are
(u1)T and (u2)T. Then
           [ (u1)Tu1   (u1)Tu2 ]
    ATA =  [ (u2)Tu1   (u2)Tu2 ]

           [ (u1)Tv ]
    ATv =  [ (u2)Tv ]
since the entries of ATA are rows of AT times the columns of A and the entries of ATv are
rows of AT times v. So (3) becomes
 (4)     ATA [ c1 ]  =  ATv
             [ c2 ]
If u1 and u2 are linearly independent, then ATA is invertible and (4) can be written as

    [ c1 ]
    [ c2 ]  =  (ATA)-1ATv
The orthogonal projection w of v on the plane of all linear combinations of u1 and u2 is
    w = c1u1 + c2u2 = A [ c1 ]
                        [ c2 ]

So
w = A(ATA)-1ATv = Qv
where
Q = A(ATA)-1AT
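In code, this formula translates directly. Here is a minimal sketch with NumPy (the function name and code are ours, not the text's):

```python
import numpy as np

def projection_matrix(u1, u2):
    """Q = A (ATA)^(-1) AT, where the columns of A are u1 and u2.
    Assumes u1 and u2 are linearly independent, so ATA is invertible."""
    A = np.column_stack([u1, u2])
    return A @ np.linalg.inv(A.T @ A) @ A.T

# projection_matrix(u1, u2) @ v is then the orthogonal projection of v
# on the plane of all linear combinations of u1 and u2.
```

In practice one usually solves ATA c = ATv with np.linalg.solve or np.linalg.lstsq rather than forming the inverse explicitly, but the version above matches the formula in the text.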
Q is called the orthogonal projection on the plane of all linear combinations of u1 and u2.

Example 2. Let P be the plane of all linear combinations of (1, -1, 1) and (-1, -1, 1). Find the
orthogonal projection Q on P and use it to find the closest point of P to (0, 0, 3).
As in Example 1, one has u1 = (1, -1, 1), u2 = (-1, -1, 1) and v = (0, 0, 3). So

         [  1  -1 ]
    A =  [ -1  -1 ]       AT =  [  1  -1   1 ]       and   ATA =  [ 3  1 ]
         [  1   1 ]             [ -1  -1   1 ]                    [ 1  3 ]

It is not hard to see that

                       [  3  -1 ]
    (ATA)-1 = (1/8)    [ -1   3 ]

Then

                               [  1  -1 ]
    Q = A(ATA)-1AT  =  (1/8)   [ -1  -1 ]  [  3  -1 ]  [  1  -1   1 ]
                               [  1   1 ]  [ -1   3 ]  [ -1  -1   1 ]

                               [  2  -2 ]                            [ 2   0   0 ]
                    =  (1/4)   [ -1  -1 ]  [  1  -1   1 ]  =  (1/2)  [ 0   1  -1 ]
                               [  1   1 ]  [ -1  -1   1 ]            [ 0  -1   1 ]

and

                      [ 2   0   0 ] [ 0 ]      [   0  ]
    w = Qv  = (1/2)   [ 0   1  -1 ] [ 0 ]  =   [ -3/2 ]
                      [ 0  -1   1 ] [ 3 ]      [  3/2 ]
Least Squares Curve Fitting. Just as with the case of finding the orthogonal projection
of a point on a line, the process of finding the orthogonal projection of a point on a plane
has applications to curve fitting. We illustrate this with an example.

Example 3. (taken from Linear Algebra with Applications, 3rd edition, by Gareth Williams,
p. 380) A study was made of traffic fatalities and the age of the driver. Below is a table of
data that was collected on the number y of fatalities and the age x of the driver.

    Age of driver, x           20    25    30    35
    Number of fatalities, y   101   115    92    64

[Figure: scatter plot of the data, y against x.]
Often we would like to summarize a set of data values by
a linear equation y = mx + b. Usually there will be no
linear equation that fits the data exactly. Instead we find
the linear equation that fits the data best in the sense of
least squares.
In general we have a set (x1, y1), (x2, y2), …, (xn, yn) of data values. In our example n = 4.
We want to find a linear function y = mx + b that describes the data best in some sense.
We use the following measure of how well a particular linear function y = mx + b fits the
data.
For each pair of data values (xj, yj) we measure the vertical distance from (xj, yj) to the
line y = mx + b. This distance is yj - (mxj + b). We square this distance: (yj - (mxj + b))².
This squared value is a measure of how well the line y = mx + b fits the pair of data
values (xj, yj). For example, for the first pair (20, 101) of data values in the above
example this squared value would be (101 - (20m + b))².
We add up all these squared values.
         n
    S =  Σ  (yj - (mxj + b))²
        j=1
S is a measure of how well the line y = mx + b fits the set (x1, y1), (x2, y2), …, (xn, yn) of
data values; the smaller S the better the fit.
We want to find m and b so as to minimize S. The resulting line y = mx + b is called the line
that fits the data best in the sense of least squares.
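In the language of this section, the "plane" is spanned by the vector of x values and the vector of all ones, the unknowns are m and b, and v is the vector of y values. Here is a minimal sketch for the data of Example 3 (with NumPy; the code is ours, and this section stops before stating the numerical answer, so the printout is only a check the reader can run):

```python
import numpy as np

# Data of Example 3: ages x and fatality counts y
x = np.array([20.0, 25.0, 30.0, 35.0])
y = np.array([101.0, 115.0, 92.0, 64.0])

# Columns of A multiply the unknowns m and b in y = m x + b
A = np.column_stack([x, np.ones_like(x)])

# The least squares solution minimizes S = sum over j of (yj - (m xj + b))^2
(m, b), *_ = np.linalg.lstsq(A, y, rcond=None)
S = np.sum((y - (m * x + b)) ** 2)
print(m, b, S)
```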