Miscellaneous Results, Solving Equations,
and Generalized Inverses
Copyright © 2012
Dan Nettleton (Iowa State University)
Statistics 611
Result A.7:
Suppose S and T are vector spaces. If S ⊆ T and dim(S) = dim(T),
then S = T.
Proof of Result A.7:
Let k denote the common dimension of S and T. Let a₁, …, aₖ be
a basis for S.
Then a₁, …, aₖ are linearly independent (LI) vectors in T, and thus form a basis for T by
V4. Because S and T have a common basis, S = T.
Result A.8:
Suppose A (m×n) and b (m×1) satisfy
Ax + b = 0 ∀ x ∈ Rⁿ.
Then A = 0 (m×n) and b = 0 (m×1).
Proof of Result A.8:
When x = 0, Ax + b = 0 becomes A0 + b = 0 =⇒ b = 0.
Next, let the columns of A be a₁, …, aₙ, so that A = [a₁, …, aₙ].
When x = eᵢ (the ith column of the n×n identity matrix), Ax = 0 becomes
[a₁, …, aᵢ, …, aₙ]eᵢ = 0 =⇒ aᵢ = 0.
This is true for all i = 1, …, n. ∴ A = 0.
Alternatively, note that Ax = 0 ∀ x ∈ Rⁿ =⇒ C(A) = {0} =⇒ A = 0.
Corollary A.1:
If B (m×n) and C (m×n) satisfy Bx = Cx ∀ x ∈ Rⁿ, then B = C.
Proof of Corollary A.1:
Bx = Cx ∀ x ∈ Rⁿ
=⇒ Bx − Cx = 0 ∀ x ∈ Rⁿ
=⇒ (B − C)x = 0 ∀ x ∈ Rⁿ
=⇒ B − C = 0 by Result A.8
∴ B = C.
Corollary A.2:
Suppose A (m×n) has full column rank. Then
AB = AC =⇒ B = C.
Proof of Corollary A.2:
Result A.3 implies that N(A) = {0}. Thus, Ax = 0 =⇒ x = 0.
Now
AB = AC =⇒ AB − AC = 0
=⇒ A(B − C) = 0
=⇒ A times each column of B − C is 0
=⇒ each column of B − C is 0
=⇒ B − C = 0 =⇒ B = C.
Lemma A.1:
C′C = 0 =⇒ C = 0.
Proof of Lemma A.1:
Let cᵢ denote the ith column of C. Then the ith diagonal element of
C′C is cᵢ′cᵢ.
∵ C′C = 0, cᵢ′cᵢ = 0 ∀ i = 1, …, n.
Now cᵢ′cᵢ is the sum of the squared elements of cᵢ, so cᵢ′cᵢ = 0 ∀ i =⇒ cᵢ = 0 ∀ i =⇒ C = 0.
Another Result on the Rank of a Product
Suppose rank(A) = n for A (m×n) and rank(C) = k for C (k×l). Then, for B (n×k),
rank(B) = rank(ABC).
Proof: HW problem.
Corollaries:
If A is full-column rank, rank(AB) = rank(B).
If C is full-row rank, rank(BC) = rank(B).
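The rank result and its two corollaries can be checked numerically on a small case. The sketch below is only an illustration, not a proof: the matrices A, B, C and the helper functions `matmul` and `rank` are hypothetical choices made for this example, and exact rational arithmetic is used so that no rounding enters the rank computation.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def rank(M):
    """Rank via Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


# Hypothetical example matrices (not from the slides).
A = [[1, 0], [0, 1], [1, 1]]   # 3x2, full column rank (rank 2)
B = [[1, 2], [2, 4]]           # 2x2, rank 1
C = [[1, 0, 1], [0, 1, 1]]     # 2x3, full row rank (rank 2)

ABC = matmul(matmul(A, B), C)
print(rank(B), rank(ABC))                # both ranks agree
print(rank(matmul(A, B)) == rank(B))     # corollary: A full column rank
print(rank(matmul(B, C)) == rank(B))     # corollary: C full row rank
```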
Solving Equations:
Consider a system of linear equations
Ax = c,
where A (m×n) is a known matrix and c (m×1) is a known vector.
We seek a solution vector x that satisfies Ax = c.
If m = n, so that A is square, and if A is nonsingular, then
A⁻¹Ax = A⁻¹c =⇒ x = A⁻¹c
is the unique solution to Ax = c.
If A (m×n) is singular or not square, Ax = c may have no solution,
a unique solution, or infinitely many solutions.
A system of equations Ax = c is consistent if there exists a
solution x∗ such that Ax∗ = c.
A system of equations Ax = c is inconsistent if Ax ≠ c ∀ x ∈ Rⁿ.
Result A.9:
A system of equations Ax = c is consistent iff c ∈ C(A).
Provide an example of A (m×n) and c (m×1) such that Ax = c is inconsistent.
Provide an example of A (m×n) and c (m×1) such that Ax = c is consistent.
\[
A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad
c = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.
\]
Then c ∉ C(A) = {x ∈ R³ : x₃ = 0}. Thus Ax = c is inconsistent.
\[
A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad
c = \begin{bmatrix} 4 \\ 2 \\ 0 \end{bmatrix}.
\]
Here Ax = c is consistent because c ∈ C(A). The unique solution to Ax = c is
\[
x^* = \begin{bmatrix} 4 \\ 2 \end{bmatrix}.
\]
\[
A = \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 3 \\ 0 & 1 & 3 \end{bmatrix}, \quad
c = \begin{bmatrix} 7 \\ 3 \\ 3 \end{bmatrix}.
\]
Here c ∈ C(A); hence, Ax = c is consistent.
There are infinitely many solutions, given by
{x ∈ R³ : x₁ + 3x₃ = 7, x₂ + 3x₃ = 3}.
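The three examples above can be checked mechanically. The sketch below relies on a standard equivalent form of Result A.9 that the slides do not state explicitly: c ∈ C(A) exactly when appending c to A as an extra column does not raise the rank. The helper names `rank` and `is_consistent` are hypothetical.

```python
from fractions import Fraction


def rank(M):
    """Rank via Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_consistent(A, c):
    """Ax = c is consistent iff rank([A, c]) == rank(A), i.e. c is in C(A)."""
    augmented = [row + [ci] for row, ci in zip(A, c)]
    return rank(augmented) == rank(A)


A1 = [[1, 0], [0, 1], [0, 0]]
A2 = [[1, 0, 3], [0, 1, 3], [0, 1, 3]]

print(is_consistent(A1, [0, 0, 1]))   # inconsistent example
print(is_consistent(A1, [4, 2, 0]))   # consistent, unique solution
print(is_consistent(A2, [7, 3, 3]))   # consistent, infinitely many solutions
print(rank(A2), len(A2[0]))           # rank is below the number of columns
```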
Generalized Inverse
A matrix G is a generalized inverse (GI) of a matrix A iff
AGA = A.
Every matrix has at least one GI. We will use A⁻ to denote a GI of
a matrix A.
If A is nonsingular, then A⁻¹ is the unique GI of A.
Proof:
AA⁻¹A = IA = A, so A⁻¹ is a GI of A.
Conversely, AGA = A =⇒ A⁻¹(AGA)A⁻¹ = A⁻¹AA⁻¹
=⇒ G = A⁻¹.
Result A.10:
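Suppose rank(A) = r for A (m×n). If A can be partitioned as
\[
A = \begin{bmatrix} C & D \\ E & F \end{bmatrix},
\]
where C is r×r, D is r×(n−r), E is (m−r)×r, F is (m−r)×(n−r), and rank(C) = r, then the n×m matrix
\[
G = \begin{bmatrix} C^{-1} & 0 \\ 0 & 0 \end{bmatrix}
\]
is a GI of A.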
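Proof of Result A.10:
\[
AG = \begin{bmatrix} C & D \\ E & F \end{bmatrix}
     \begin{bmatrix} C^{-1} & 0 \\ 0 & 0 \end{bmatrix}
   = \begin{bmatrix} I & 0 \\ EC^{-1} & 0 \end{bmatrix}.
\]
\[
\therefore\; AGA = \begin{bmatrix} I & 0 \\ EC^{-1} & 0 \end{bmatrix}
                   \begin{bmatrix} C & D \\ E & F \end{bmatrix}
                 = \begin{bmatrix} C & D \\ E & EC^{-1}D \end{bmatrix}.
\]
We need to show EC⁻¹D = F.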
First note that
rank(C) = r =⇒ rank([C, D]) = r
∵ [C, D] has at least r LI columns and, having only r rows, at most r LI rows.
(rank([C, D]) ≥ r and rank([C, D]) ≤ r =⇒ rank([C, D]) = r.)
Now
\[
\operatorname{rank}\begin{bmatrix} C & D \\ E & F \end{bmatrix} = r
\]
=⇒ each row of [E, F] is a linear combination (LC) of the rows of [C, D].
Thus ∃ a matrix K such that
K[C, D] = [E, F]
⇐⇒ [KC, KD] = [E, F]
⇐⇒ KC = E, KD = F.
Now KC = E ⇐⇒ K = EC⁻¹. Together with KD = F, this implies
EC⁻¹D = F.
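Result A.10 can be illustrated on a small matrix whose leading r×r block is nonsingular. The sketch below is a minimal check under stated assumptions: the 3×3 matrix A (rank 2, third row equal to the sum of the first two) and the helpers `matmul`, `inverse`, and `gi_from_leading_block` are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse(M):
    """Invert a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def gi_from_leading_block(A, r):
    """Build G = [[C^-1, 0], [0, 0]] (n x m), assuming rank(A) = r and that
    the leading r x r block C of A is nonsingular."""
    m, n = len(A), len(A[0])
    C_inv = inverse([row[:r] for row in A[:r]])
    G = [[Fraction(0)] * m for _ in range(n)]
    for i in range(r):
        for j in range(r):
            G[i][j] = C_inv[i][j]
    return G


# Hypothetical example: row 3 = row 1 + row 2, so rank(A) = 2.
A = [[1, 2, 3], [0, 1, 1], [1, 3, 4]]
G = gi_from_leading_block(A, 2)
AGA = matmul(matmul(A, G), A)
print(AGA == A)   # the defining property AGA = A
```

Note that the function does not verify the rank assumption; if the leading block were nonsingular but rank(A) exceeded r, the resulting G would generally fail AGA = A.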
Permutation Matrix
A matrix P (n×n) is a permutation matrix if the rows of P are the same as the
rows of I (n×n), but not necessarily in the same order.
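Example:
\[
P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
\[
\text{If } A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{bmatrix},
\text{ then } PA = \begin{bmatrix} 5 & 6 & 7 & 8 \\ 1 & 2 & 3 & 4 \\ 9 & 10 & 11 & 12 \end{bmatrix}.
\]
The rows of PA are a permutation of the rows of A.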
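Example:
\[
P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}.
\]
\[
\text{Then } BP = \begin{bmatrix} 2 & 1 & 3 \\ 5 & 4 & 6 \end{bmatrix}.
\]
The columns of BP are a permutation of the columns of B.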
A Permutation Matrix is Nonsingular
The rows (and columns) of a permutation matrix are the same as
those of the identity matrix. Thus, a permutation matrix has full rank
and is therefore nonsingular.
Furthermore, if P = [p₁, …, pₙ] is a permutation matrix, then P⁻¹ = P′ because
\[
p_i' p_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases}
\]
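The two permutation examples and the identity P⁻¹ = P′ can be reproduced with a few lines of code. This is a minimal sketch: P, A, and B are taken from the examples above, while the cyclic permutation R and the helpers `matmul` and `transpose` are hypothetical additions used to check P′P = I on a permutation matrix that is not symmetric.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


P = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
B = [[1, 2, 3], [4, 5, 6]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

PA = matmul(P, A)   # premultiplying by P permutes rows
BP = matmul(B, P)   # postmultiplying by P permutes columns
print(PA)
print(BP)

# Hypothetical non-symmetric permutation matrix (a cyclic shift of the rows of I).
R = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
print(matmul(transpose(P), P) == I3)   # P'P = I
print(matmul(transpose(R), R) == I3)   # R'R = I
print(matmul(R, transpose(R)) == I3)   # RR' = I
```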
Result A.11:
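Suppose rank(A) = r for A (m×n). There exist permutation matrices P (m×m) and Q (n×n) such that
\[
PAQ = \begin{bmatrix} C & D \\ E & F \end{bmatrix},
\]
where C is r×r with rank(C) = r. Furthermore,
\[
G = Q \begin{bmatrix} C^{-1} & 0 \\ 0 & 0 \end{bmatrix} P
\]
is a GI of A.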
Proof of Result A.11:
Because rank(A) = r, there exists a set of r rows of A that are LI.
Let P be a permutation matrix such that the first r rows of PA are LI.
Let H be the matrix consisting of the first r rows of PA. Then
rank(H) = r.
This implies that ∃ a set of r columns of H that are LI.
Let Q be a permutation matrix such that the first r columns of HQ are LI.
Then the submatrix consisting of the first r rows and first r
columns of PAQ has rank r.
Thus we can partition PAQ as
\[
PAQ = \begin{bmatrix} C & D \\ E & F \end{bmatrix}, \quad \text{where } C \text{ is } r \times r \text{ and } \operatorname{rank}(C) = r.
\]
By Result A.10,
\[
\begin{bmatrix} C^{-1} & 0 \\ 0 & 0 \end{bmatrix}
\]
is a GI for PAQ.
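\[
PAQ = \begin{bmatrix} C & D \\ E & F \end{bmatrix}
\iff A = P^{-1} \begin{bmatrix} C & D \\ E & F \end{bmatrix} Q^{-1}.
\]
\[
\begin{aligned}
\therefore\; A \, Q \begin{bmatrix} C^{-1} & 0 \\ 0 & 0 \end{bmatrix} P \, A
&= P^{-1} \begin{bmatrix} C & D \\ E & F \end{bmatrix} Q^{-1} Q
   \begin{bmatrix} C^{-1} & 0 \\ 0 & 0 \end{bmatrix}
   P P^{-1} \begin{bmatrix} C & D \\ E & F \end{bmatrix} Q^{-1} \\
&= P^{-1} \begin{bmatrix} C & D \\ E & F \end{bmatrix}
   \begin{bmatrix} C^{-1} & 0 \\ 0 & 0 \end{bmatrix}
   \begin{bmatrix} C & D \\ E & F \end{bmatrix} Q^{-1} \\
&= P^{-1} \begin{bmatrix} C & D \\ E & F \end{bmatrix} Q^{-1} = A.
\end{aligned}
\]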
Use Result A.11 to find a GI for
\[
A = \begin{bmatrix} 1 & 1 & 3 & 4 \\ 2 & 2 & 6 & 8 \\ 3 & 3 & 0 & 12 \end{bmatrix}.
\]
\[
\text{Let } P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\text{ Then } PA = \begin{bmatrix} 1 & 1 & 3 & 4 \\ 3 & 3 & 0 & 12 \\ 2 & 2 & 6 & 8 \end{bmatrix}.
\]
\[
\text{Let } Q = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]
\[
\text{Then } PAQ = \begin{bmatrix} 1 & 3 & 1 & 4 \\ 3 & 0 & 3 & 12 \\ 2 & 6 & 2 & 8 \end{bmatrix}.
\]
Note that
\[
\begin{bmatrix} 1 & 3 \\ 3 & 0 \end{bmatrix}^{-1}
= \begin{bmatrix} 0 & \tfrac{1}{3} \\ \tfrac{1}{3} & -\tfrac{1}{9} \end{bmatrix}.
\]
Thus,
\[
\begin{bmatrix} 0 & \tfrac{1}{3} & 0 \\ \tfrac{1}{3} & -\tfrac{1}{9} & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]
is a GI for PAQ. A GI for A is
\[
\begin{aligned}
&\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 0 & \tfrac{1}{3} & 0 \\ \tfrac{1}{3} & -\tfrac{1}{9} & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \\
&= \begin{bmatrix} 0 & \tfrac{1}{3} & 0 \\ 0 & 0 & 0 \\ \tfrac{1}{3} & -\tfrac{1}{9} & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \\
&= \begin{bmatrix} 0 & 0 & \tfrac{1}{3} \\ 0 & 0 & 0 \\ \tfrac{1}{3} & 0 & -\tfrac{1}{9} \\ 0 & 0 & 0 \end{bmatrix}.
\end{aligned}
\]
To verify, note that
\[
AG = \begin{bmatrix} 1 & 1 & 3 & 4 \\ 2 & 2 & 6 & 8 \\ 3 & 3 & 0 & 12 \end{bmatrix}
\begin{bmatrix} 0 & 0 & \tfrac{1}{3} \\ 0 & 0 & 0 \\ \tfrac{1}{3} & 0 & -\tfrac{1}{9} \\ 0 & 0 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 2 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
and
\[
(AG)A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 3 & 4 \\ 2 & 2 & 6 & 8 \\ 3 & 3 & 0 & 12 \end{bmatrix}
= \begin{bmatrix} 1 & 1 & 3 & 4 \\ 2 & 2 & 6 & 8 \\ 3 & 3 & 0 & 12 \end{bmatrix} = A.
\]
Algorithm for finding a GI of A:
1. Find an r × r nonsingular submatrix of A, where r = rank(A). Call
this matrix W.
2. Compute (W⁻¹)′.
3. Replace each element of W in A with the corresponding element
of (W⁻¹)′.
4. Replace all other elements in A with zeros.
5. Transpose the resulting matrix to get A⁻.
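The five steps above can be sketched in code as follows. This is an illustrative implementation under stated assumptions, not a definitive one: the function name `generalized_inverse` and its `rows`/`cols` arguments (0-based indices of the rows and columns that form W) are hypothetical, and the caller is assumed to supply a nonsingular W of size rank(A), which the code does not check. The example uses the 3×4 matrix from the Result A.11 exercise with W taken from rows 1 and 3 and columns 1 and 3.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def inverse(M):
    """Invert a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def generalized_inverse(A, rows, cols):
    """GI of A from the submatrix W = A[rows, cols] (assumed nonsingular, r x r)."""
    m, n = len(A), len(A[0])
    W = [[A[i][j] for j in cols] for i in rows]        # step 1
    W_inv_t = transpose(inverse(W))                    # step 2
    filled = [[Fraction(0)] * n for _ in range(m)]     # step 4: zeros elsewhere
    for a, i in enumerate(rows):                       # step 3
        for b, j in enumerate(cols):
            filled[i][j] = W_inv_t[a][b]
    return transpose(filled)                           # step 5


A = [[1, 1, 3, 4], [2, 2, 6, 8], [3, 3, 0, 12]]
G = generalized_inverse(A, rows=[0, 2], cols=[0, 2])
print(G)
print(matmul(matmul(A, G), A) == A)   # checks AGA = A
```

With this choice of W, the function reproduces the GI obtained in the Result A.11 exercise; a different nonsingular 2×2 submatrix would generally give a different GI.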
Result A.12:
Let Ax = c be a consistent system of equations, and let G be any GI of
A. Then Gc is a solution to Ax = c, i.e., AGc = c.
Proof of Result A.12:
∵ Ax = c is consistent, ∃ a solution x∗ such that Ax∗ = c.
∴ AGc = AGAx∗ = Ax∗ = c.
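Result A.12 can be illustrated with the matrix A and the GI G from the worked example. In the sketch below, the right-hand sides `c` and `c_bad` and the helper `matvec` are hypothetical choices: c = A(1, 1, 1, 1)′ is consistent by construction, whereas c_bad violates the constraint that its second entry equal twice its first (row 2 of A is twice row 1), so no solution exists and Gc_bad fails to solve the system.

```python
from fractions import Fraction as F


def matvec(X, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]


A = [[1, 1, 3, 4], [2, 2, 6, 8], [3, 3, 0, 12]]
G = [[0, 0, F(1, 3)], [0, 0, 0], [F(1, 3), 0, F(-1, 9)], [0, 0, 0]]

# Consistent right-hand side: c = A x for x = (1, 1, 1, 1)'.
c = matvec(A, [1, 1, 1, 1])
x = matvec(G, c)
print(x, matvec(A, x) == c)          # Gc solves Ax = c

# Inconsistent right-hand side: the conclusion AGc = c need not hold.
c_bad = [1, 0, 0]
print(matvec(A, matvec(G, c_bad)) == c_bad)
```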
Result A.13:
Let Ax = c be a consistent system of equations, and let G be any GI of
A. Then x̃ is a solution to Ax = c iff ∃ z such that x̃ = Gc + (I − GA)z.
Proof of Result A.13
(⇐=):
A[Gc + (I − GA)z]
= AGc + (A − AGA)z
= c + (A − A)z (by Result A.12 and AGA = A)
= c.
∴ Gc + (I − GA)z is a solution to Ax = c ∀ z (of appropriate dimension).
(=⇒):
If Ax̃ = c, then we have
x̃ = Gc + x̃ − Gc
= Gc + x̃ − GAx̃
= Gc + (I − GA)x̃
∴ ∃ z (namely z = x̃) such that x̃ = Gc + (I − GA)z.
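Both directions of Result A.13 can be checked numerically with the matrix A and GI G from the worked example. This is a sketch under stated assumptions: the right-hand side c = A(1, 1, 1, 1)′, the test vectors z, and the helpers `matmul`, `matvec`, and `solution` are hypothetical choices for illustration.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def matvec(X, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]


A = [[1, 1, 3, 4], [2, 2, 6, 8], [3, 3, 0, 12]]
G = [[0, 0, F(1, 3)], [0, 0, 0], [F(1, 3), 0, F(-1, 9)], [0, 0, 0]]
c = [9, 18, 18]                      # consistent: c = A (1, 1, 1, 1)'

n = len(A[0])
GA = matmul(G, A)
I_minus_GA = [[int(i == j) - GA[i][j] for j in range(n)] for i in range(n)]
Gc = matvec(G, c)


def solution(z):
    """Return Gc + (I - GA) z."""
    return [g + h for g, h in zip(Gc, matvec(I_minus_GA, z))]


# (<=) every z yields a solution.
for z in ([0, 0, 0, 0], [1, 2, 3, 4], [-1, 0, 5, 2]):
    x = solution(z)
    print(x, matvec(A, x) == c)

# (=>) a known solution is recovered by taking z equal to that solution.
x_tilde = [1, 1, 1, 1]
print(solution(x_tilde) == x_tilde)
```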
Prove that a consistent system of equations
Ax = c
has a unique solution if and only if A is full-column rank and infinitely
many solutions if and only if A is less than full-column rank.
Proof:
First, suppose A is full-column rank. Then,
Aw = 0 ⇐⇒ w = 0.
Suppose x₁ and x₂ are any two solutions to Ax = c. Then
Ax₁ = Ax₂ ⇐⇒ Ax₁ − Ax₂ = 0
⇐⇒ A(x₁ − x₂) = 0
⇐⇒ x₁ − x₂ = 0
⇐⇒ x₁ = x₂.
Thus, the solution to Ax = c is unique when A has full-column rank.
Now suppose rank(A) < n for A (m×n).
Then ∃ w ≠ 0 such that Aw = 0.
Let x∗ denote any solution to
Ax∗ = c.
Then for any d ∈ R,
A(x∗ + dw) = Ax∗ + Adw
= Ax∗ + dAw
= c + d0
= c.
Thus, {x∗ + dw : d ∈ R} is an infinite set of solutions for Ax = c.
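The construction in this proof can be illustrated with the 3×4 matrix from the worked GI example, which has rank 2 and is therefore less than full-column rank. In the sketch below, the particular solution `x_star`, the null vector `w` (columns 1 and 2 of A are equal, so their difference maps to 0), and the helper `matvec` are hypothetical choices for illustration.

```python
def matvec(X, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]


A = [[1, 1, 3, 4], [2, 2, 6, 8], [3, 3, 0, 12]]
c = [9, 18, 18]

x_star = [6, 0, 1, 0]    # one particular solution of Ax = c
w = [1, -1, 0, 0]        # nonzero vector with Aw = 0

print(matvec(A, x_star) == c)
print(matvec(A, w))

# Every x* + d w is again a solution; distinct d give distinct solutions.
solutions = [[xi + d * wi for xi, wi in zip(x_star, w)] for d in range(-2, 3)]
for x in solutions:
    print(x, matvec(A, x) == c)
```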