Math 480
Notes on Orthogonality
The word orthogonal is a synonym for perpendicular.
Question 1: When are two vectors $\vec{v}_1$ and $\vec{v}_2$ in $\mathbb{R}^n$ orthogonal to one another?
The most basic answer is “if the angle between them is $90^\circ$,” but this is not very practical. How could you tell whether the vectors
$$\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix}$$
are at $90^\circ$ from one another?
One way to think about this is as follows: $\vec{v}_1$ and $\vec{v}_2$ are orthogonal if and only if the triangle formed by $\vec{v}_1$, $\vec{v}_2$, and $\vec{v}_1 - \vec{v}_2$ (drawn with its tail at $\vec{v}_2$ and its head at $\vec{v}_1$) is a right triangle. The Pythagorean Theorem then tells us that this triangle is a right triangle if and only if
$$\|\vec{v}_1\|^2 + \|\vec{v}_2\|^2 = \|\vec{v}_1 - \vec{v}_2\|^2, \tag{1}$$
where $\|\cdot\|$ denotes the length of a vector.


The length of a vector $\vec{x} = (x_1, \ldots, x_n)^T$ is easy to measure: the Pythagorean Theorem (once again) tells us that
$$\|\vec{x}\| = \sqrt{x_1^2 + \cdots + x_n^2}.$$
This expression under the square root is simply the matrix product
$$\vec{x}^T \vec{x} = (x_1 \; \cdots \; x_n) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
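For instance, here is a quick numerical sketch in Python (assuming NumPy is available); the vector $(1, 2, 2)^T$ is just a made-up example:
\begin{verbatim}
import numpy as np

x = np.array([1.0, 2.0, 2.0])   # example vector in R^3

print(x @ x)                    # 9.0 = 1^2 + 2^2 + 2^2, the matrix product x^T x
print(np.sqrt(x @ x))           # 3.0, the length ||x||
print(np.linalg.norm(x))        # 3.0 again, computed directly by NumPy
\end{verbatim}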
Definition. The inner product (also called the dot product) of two vectors $\vec{x}, \vec{y} \in \mathbb{R}^n$, written $\langle \vec{x}, \vec{y} \rangle$ or $\vec{x} \cdot \vec{y}$, is defined by
$$\langle \vec{x}, \vec{y} \rangle = \vec{x}^T \vec{y} = \sum_{i=1}^{n} x_i y_i.$$
Since matrix multiplication is linear, inner products satisfy
$$\langle \vec{x}, \vec{y}_1 + \vec{y}_2 \rangle = \langle \vec{x}, \vec{y}_1 \rangle + \langle \vec{x}, \vec{y}_2 \rangle
\qquad\text{and}\qquad
\langle \vec{x}_1, a\vec{y} \rangle = a\langle \vec{x}_1, \vec{y} \rangle.$$
(Similar formulas hold in the first coordinate, since $\langle \vec{x}, \vec{y} \rangle = \langle \vec{y}, \vec{x} \rangle$.)
Now we can write
$$\|\vec{v}_1 - \vec{v}_2\|^2 = \langle \vec{v}_1 - \vec{v}_2, \vec{v}_1 - \vec{v}_2 \rangle = \langle \vec{v}_1, \vec{v}_1 \rangle - 2\langle \vec{v}_1, \vec{v}_2 \rangle + \langle \vec{v}_2, \vec{v}_2 \rangle = \|\vec{v}_1\|^2 - 2\langle \vec{v}_1, \vec{v}_2 \rangle + \|\vec{v}_2\|^2,$$
so Equation (1) holds if and only if
$$\langle \vec{v}_1, \vec{v}_2 \rangle = 0.$$
Answer to Question 1: Vectors $\vec{v}_1$ and $\vec{v}_2$ in $\mathbb{R}^n$ are orthogonal if and only if $\langle \vec{v}_1, \vec{v}_2 \rangle = 0$.
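As a quick numerical check of this criterion on the two vectors from the beginning of these notes (a short Python/NumPy sketch; the printed values are exact up to floating-point rounding):
\begin{verbatim}
import numpy as np

v1 = np.array([1.0, 1.0, 1.0])
v2 = np.array([1.0, 3.0, 1.0])

print(v1 @ v2)   # 5.0, which is not 0, so v1 and v2 are not orthogonal

# Equation (1) fails as well: 3 + 11 is not equal to 4.
print(np.linalg.norm(v1)**2 + np.linalg.norm(v2)**2)   # 14.0
print(np.linalg.norm(v1 - v2)**2)                      # 4.0
\end{verbatim}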
Exercise 1: Which of the following pairs of vectors are orthogonal to one another? Draw pictures
to check your answers.
i) $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$, $\begin{pmatrix} -2 \\ 1 \end{pmatrix}$
ii) $\begin{pmatrix} 1 \\ 1 \\ 3 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}$
iii) $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$
Exercise 2: Find two orthogonal vectors in $\mathbb{R}^6$ all of whose coordinates are non-zero.
Definition. Given a subspace $S \subset \mathbb{R}^n$, the orthogonal complement of $S$, written $S^\perp$, is the subspace consisting of all vectors $\vec{v} \in \mathbb{R}^n$ that are orthogonal to every $\vec{s} \in S$.
Theorem 1. If $S$ is a subspace of $\mathbb{R}^n$ and $\dim(S) = k$, then $\dim(S^\perp) = n - k$.
The basic idea here is that every vector in $\mathbb{R}^n$ can be built up from vectors in $S$ and vectors in $S^\perp$, and these subspaces do not overlap. Think about the case of $\mathbb{R}^3$: the orthogonal complement of a line (a 1-dimensional subspace) is a plane (a 2-dimensional subspace) and vice versa.
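Here is a small numerical illustration of this dimension count, a Python sketch assuming SciPy's null_space routine; the line spanned by $(1, 1, 1)$ is just an example:
\begin{verbatim}
import numpy as np
from scipy.linalg import null_space

# S is the line in R^3 spanned by (1, 1, 1); it is the row space of this 1 x 3 matrix.
A = np.array([[1.0, 1.0, 1.0]])

B = null_space(A)    # columns of B form an orthonormal basis of the orthogonal complement
print(B.shape[1])    # 2: the complement of a line in R^3 is a plane, so 1 + 2 = 3
\end{verbatim}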
Key Example: Given an $m \times n$ matrix $A \in \mathbb{R}^{m \times n}$, the orthogonal complement of the row space Row(A) is precisely $N(A)$.
Why is this? The definition of matrix multiplication shows that being in the nullspace of $A$ is exactly the same as being orthogonal to every row of $A$: recall that if
$$A = \begin{pmatrix} \vec{a}_1 \\ \vec{a}_2 \\ \vdots \\ \vec{a}_m \end{pmatrix}$$
and $\vec{x} \in \mathbb{R}^n$, then the product $A\vec{x}$ is given by
$$A\vec{x} = \begin{pmatrix} \vec{a}_1 \cdot \vec{x} \\ \vec{a}_2 \cdot \vec{x} \\ \vdots \\ \vec{a}_m \cdot \vec{x} \end{pmatrix}.$$
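This can be checked numerically as well; the following Python sketch (assuming NumPy and SciPy, with a made-up $2 \times 3$ matrix) verifies that every row of $A$ is orthogonal to a basis of $N(A)$:
\begin{verbatim}
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # example 2 x 3 matrix of rank 2

N = null_space(A)                 # columns of N span N(A); here N has a single column
print(np.allclose(A @ N, 0))      # True: each entry of A N is a row of A dotted with
                                  # a nullspace vector, and all these dot products vanish
\end{verbatim}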
Now, notice how the theorem fits with what we know about the fundamental subspaces: if $\dim N(A) = k$, then there are $k$ free variables and $n - k$ pivot variables in the system $A\vec{x} = \vec{0}$. Hence $\dim \mathrm{Row}(A) = n - k$. So the dimensions of $N(A)$ and its orthogonal complement Row(A) add to $n$, as claimed by the Theorem.
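A quick sanity check of this count (Python with NumPy/SciPy; the random matrix is just a stand-in for a generic example):
\begin{verbatim}
import numpy as np
from scipy.linalg import null_space

A = np.random.rand(3, 5)                 # a random 3 x 5 matrix, almost surely of rank 3

dim_row = np.linalg.matrix_rank(A)       # dimension of Row(A)
dim_null = null_space(A).shape[1]        # dimension of N(A)
print(dim_row + dim_null == A.shape[1])  # True: the dimensions add up to n = 5
\end{verbatim}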
This argument actually proves the Theorem in general: every subspace $S$ in $\mathbb{R}^n$ has a basis $\vec{s}_1, \vec{s}_2, \ldots, \vec{s}_m$ (for some finite $m$), and $S$ is then equal to the row space of the matrix
$$A = \begin{pmatrix} \vec{s}_1^T \\ \vec{s}_2^T \\ \vdots \\ \vec{s}_m^T \end{pmatrix}.$$
The statement that S contains a finite basis deserves some explanation, and will be considered in
detail below.
Another Key Example: Given an $m \times n$ matrix $A \in \mathbb{R}^{m \times n}$, the orthogonal complement of the column space Col(A) is precisely $N(A^T)$.
This follows by the same sort of argument as for the first Key Example.
The following theorem should seem geometrically obvious, but it is annoyingly difficult to prove
directly.
Theorem 2. If $V$ and $W$ are subspaces of $\mathbb{R}^n$ and $V = W^\perp$, then $W = V^\perp$ as well.
Proof. One half of this statement really is easy: if $V$ is the orthogonal complement of $W$, this means $V$ consists of all vectors $\vec{v} \in \mathbb{R}^n$ such that $\vec{v} \cdot \vec{w} = 0$ for all $\vec{w} \in W$. Now if $\vec{w} \in W$, then this means $\vec{w}$ is definitely perpendicular to every $\vec{v} \in V$ (i.e. $\vec{v} \cdot \vec{w} = 0$), and hence $W \subset V^\perp$. But why must every vector that is orthogonal to all of $V$ actually lie in $W$? We can prove this using what we know about dimensions. Say $V$ is $k$-dimensional. Then the dimension of $V^\perp$ is $n - k$ by Theorem 1. But Theorem 1 also tells us that the dimension of $W$ is $n - k$ (because $V = W^\perp$). So $W$ is an $(n-k)$-dimensional subspace of the $(n-k)$-dimensional space $V^\perp$, and from Section 5 of the Notes on Linear Independence, Bases, and Dimension, we know that $W$ must in fact be all of $V^\perp$.
Corollary. For any matrix $A$, $\mathrm{Col}(A) = N(A^T)^\perp$.
Note that this statement has a nice implication for linear systems: the column space Col(A) consists of all vectors $\vec{b}$ such that $A\vec{x} = \vec{b}$ has a solution. If you want to check whether $A\vec{x} = \vec{b}$ has a solution, you can now just check whether or not $\vec{b}$ is perpendicular to all vectors in $N(A^T)$. Sometimes this is easy to check, for instance if you have a basis for $N(A^T)$. (Note that if a vector $\vec{w}$ is perpendicular to each vector in a basis for some subspace $V$, then $\vec{w}$ is in fact perpendicular to all linear combinations of these basis vectors, so $\vec{w} \in V^\perp$.)
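For a concrete (made-up) example, the following Python sketch, assuming SciPy's null_space, uses a basis of $N(A^T)$ to decide whether $A\vec{x} = \vec{b}$ is solvable for two different right-hand sides:
\begin{verbatim}
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])           # rank 1, so Col(A) is the line through (1, 2, 3)

Z = null_space(A.T)                  # columns of Z form a basis of N(A^T)

b_good = np.array([2.0, 4.0, 6.0])   # lies on the line through (1, 2, 3)
b_bad  = np.array([1.0, 0.0, 0.0])   # does not

print(np.allclose(Z.T @ b_good, 0))  # True:  A x = b_good has a solution
print(np.allclose(Z.T @ b_bad, 0))   # False: A x = b_bad has no solution
\end{verbatim}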
This raises a question: how can we find a basis for $N(A^T)$? Row reduction gives rise to an equation
$$EA = R,$$
where $R$ is the reduced echelon form of $A$ and $E$ is a product of elementary matrices and permutation matrices (corresponding to the row operations performed on $A$). Say $R$ has $k$ rows of zeros. Note that the dimension of $N(A^T)$ is precisely the number of rows of zeros in $R$ (why?), so we are looking for an independent set of vectors in $N(A^T)$ of size $k$. If you look at the last $k$ rows in the matrix equation $EA = R$, you'll see that this equation says that the last $k$ rows of $E$ lie in the left nullspace $N(A^T)$. Moreover, these vectors are independent, because $E$ is a product of invertible matrices, hence invertible (so its rows are independent).
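One way to carry this out in practice is to row reduce the augmented matrix $[A \mid I]$: the row operations that turn $A$ into $R$ simultaneously turn $I$ into an invertible $E$ with $EA = R$. A short SymPy sketch along these lines (the $3 \times 3$ matrix is a made-up example of rank 2, so $k = 1$):
\begin{verbatim}
from sympy import Matrix, eye

A = Matrix([[1, 2, 3],
            [2, 4, 6],
            [1, 1, 1]])
m, n = A.shape

# Row reduce [A | I]; sympy's rref also normalizes the E block, but E A = R still holds.
aug, _ = A.row_join(eye(m)).rref()
R, E = aug[:, :n], aug[:, n:]

# Each zero row of R picks out a row of E lying in the left nullspace N(A^T).
for i in range(m):
    if all(entry == 0 for entry in R.row(i)):
        y = E.row(i).T
        print(y.T, (A.T * y).T)   # A^T y is the zero vector
\end{verbatim}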
We now consider in detail the question of why every subspace of $\mathbb{R}^n$ has a basis.
Theorem 3. If $S$ is a subspace of $\mathbb{R}^n$, then $S$ has a basis containing at most $n$ elements. Equivalently, $\dim(S) \le n$.
Proof. First, recall that every set of $n + 1$ (or more) vectors in $\mathbb{R}^n$ is linearly dependent, since they form the columns of a matrix with more columns than rows. So every sufficiently large set of vectors in $S$ is dependent. Let $k$ be the smallest number between $1$ and $n + 1$ such that every set of $k$ vectors in $S$ is dependent. If $k = 1$, then every vector $\vec{s} \in S$ forms a dependent set (all by itself), so $S$ must contain only the zero vector. In this case, the zero vector forms a spanning set for $S$. So we'll assume $k > 1$. Then there is a set $\vec{s}_1, \ldots, \vec{s}_{k-1}$ of vectors in $S$ which is linearly independent, and every larger set in $S$ is dependent.
We claim that this set actually spans S (and hence is a basis for S). The proof will be by
contradiction, meaning that we’ll consider what would happen if this set did not span S, and we’ll
see that this would lead to a contradiction.
If this set did not span $S$, then there would be a vector $\vec{s} \in S$ that is not a linear combination of the vectors $\vec{s}_1, \ldots, \vec{s}_{k-1}$. We claim that this makes the set
$$\vec{s}_1, \ldots, \vec{s}_{k-1}, \vec{s}$$
linearly independent. Say
$$c_1\vec{s}_1 + \cdots + c_{k-1}\vec{s}_{k-1} + c_k\vec{s} = \vec{0}. \tag{2}$$
We will prove that all the scalars $c_i$ must be zero. If $c_k$ were non-zero, then we could solve the above equation for $\vec{s}$, yielding
$$\vec{s} = -\frac{c_1}{c_k}\vec{s}_1 - \cdots - \frac{c_{k-1}}{c_k}\vec{s}_{k-1}.$$
But that's impossible, since $\vec{s}$ is not a linear combination of the vectors $\vec{s}_1, \ldots, \vec{s}_{k-1}$! So $c_k$ is zero, and equation (2) becomes
$$c_1\vec{s}_1 + \cdots + c_{k-1}\vec{s}_{k-1} = \vec{0}.$$
Since $\vec{s}_1, \ldots, \vec{s}_{k-1}$ is independent, the rest of the $c_i$ must be zero as well. We have now shown that $\vec{s}_1, \ldots, \vec{s}_{k-1}, \vec{s}$ is a linearly independent set in $S$, but this contradicts the assumption that all sets of size $k$ in $S$ are dependent.