MATRIX RANK

On the first pass through multiple regression, it’s convenient to assume that the n-by-p
matrix X has full column rank. Let’s see exactly what this means. Recall that we’ve
assumed n > p; this means that X is a “tall skinny” matrix with more rows than columns.
Also, we can write X in the form $X = [\,\mathbf{1}\ \ x_1\ \ x_2\ \ x_3\ \ \cdots\ \ x_K\,]$ to identify its
columns. We will use here the accounting p = K + 1.
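The condition for full column rank is stated thus:
\[
\underset{n \times p}{X} \text{ is full column rank}
\iff
\Big\{\, \underset{n \times p}{X}\ \underset{p \times 1}{a} = \underset{n \times 1}{0}
\ \Rightarrow\ \underset{p \times 1}{a} = \underset{p \times 1}{0} \,\Big\}
\]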
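The product X a is really a linear combination of the columns, as
\[
\underset{n \times p}{X}\ \underset{p \times 1}{a}
= [\,\mathbf{1}\ \ x_1\ \ x_2\ \ x_3\ \ \cdots\ \ x_K\,]
\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_K \end{pmatrix}
= \mathbf{1} a_0 + x_1 a_1 + x_2 a_2 + x_3 a_3 + \cdots + x_K a_K
\]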
The condition $\underset{n \times p}{X}\ \underset{p \times 1}{a} = \underset{n \times 1}{0}$ says that some linear combination of the columns is zero.
The matrix X is said to be full column rank if and only if the only linear combination of
its columns that is zero is formed by the vector of zero coefficients. A number of
observations should be made.
(1) We are illustrating these notions for regression design matrices, and these have an
initial column 1. The definitions of rank have no particular relationship to this 1
column.
(2) Finding a non-zero a for which X a = 0 shows that X does not have full column
rank.
(3) If n < p (a condition which we are disallowing) then X cannot have full column
rank by this definition. As a quick illustration, suppose that
\[
X = \begin{pmatrix} 1 & 4 & 2 \\ 1 & 0 & 7 \end{pmatrix}.
\]
The product X a is
\[
\begin{pmatrix} 1 & 4 & 2 \\ 1 & 0 & 7 \end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix}
= \begin{pmatrix} 1 \\ 1 \end{pmatrix} a_0
+ \begin{pmatrix} 4 \\ 0 \end{pmatrix} a_1
+ \begin{pmatrix} 2 \\ 7 \end{pmatrix} a_2
= \begin{pmatrix} a_0 + 4a_1 + 2a_2 \\ a_0 + 7a_2 \end{pmatrix}.
\]
There are many choices for a for which this is 0. For example
$a = (-7,\ 1.25,\ 1)^{T}$ will do, since $-7 + 4(1.25) + 2(1) = 0$ and $-7 + 7(1) = 0$.
(4) In the regression context, column rank deficiency is often detected easily. Here
is a prototype example:
\[
\begin{pmatrix}
1 & 13 & 1 & 0 \\
1 & 9 & 1 & 0 \\
1 & 10 & 0 & 1 \\
1 & 4 & 0 & 1 \\
1 & 2 & 1 & 0 \\
1 & 0 & 0 & 1 \\
1 & 8 & 1 & 0 \\
1 & 6 & 1 & 0 \\
1 & 3 & 1 & 0 \\
1 & 11 & 0 & 1
\end{pmatrix}
\]
The last two columns might, for example, be gender indicators. Column x2 (the
third column) could be the dummy variable for male subjects and column x3 (the
final column) could be the dummy variable for female subjects. Note that
$x_2 + x_3 = \mathbf{1}$.
(5) If X has a column rank deficiency, then some columns are exact linear
combinations of other columns. Columns can be removed until the matrix that
remains has full column rank. The column rank of X is defined as the maximum
number of (selected) columns which, if considered a matrix by themselves, would
have full column rank. The matrix
\[
\begin{pmatrix}
1 & 2 & 4 \\
1 & 3 & 6 \\
1 & 9 & 18 \\
1 & 0 & 0 \\
1 & 4 & 8
\end{pmatrix}
\]
has column rank 2. If we eliminate either the right column or the middle column,
the resulting matrix would have full column rank 2.
The matrix
\[
\begin{pmatrix}
1 & 3 & 2 \\
1 & 3 & 2 \\
1 & 3 & 2 \\
1 & 3 & 2 \\
1 & 3 & 2
\end{pmatrix}
\]
has column rank 1.
(6) There is a related concept called row rank, and it concerns linear combinations of
the form $\underset{1 \times n}{c}\ \underset{n \times p}{X}$. It can be shown that row rank is exactly equal to column rank.
Moreover $\operatorname{rank}\big(\underset{a \times b}{M}\big) \le \min(a, b)$.
In the context of regression independent variable matrices X, this is
another way of saying that we don’t want to consider n < p. In such a
case, the rank could be at most n, and if n < p, we could not have full
column rank.
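Observation (2) can be checked by direct arithmetic. The sketch below is only an illustration: the helper names `matvec` and `certifies_deficiency` are hypothetical (they are not part of these notes), and the vectors a are one possible choice for each of the matrices in observations (3), (4), and (5). Exact fractions are used so that "equals zero" is not blurred by rounding.

```python
from fractions import Fraction

def matvec(X, a):
    """Return X a as a list, computed with exact rational arithmetic."""
    return [sum(Fraction(str(x)) * Fraction(str(c)) for x, c in zip(row, a))
            for row in X]

def certifies_deficiency(X, a):
    """True if a is a non-zero vector with X a = 0 (hypothetical helper).

    Such a vector shows that X does not have full column rank.
    """
    return any(c != 0 for c in a) and all(v == 0 for v in matvec(X, a))

# Observation (3): the 2-by-3 matrix with n < p.
X3 = [[1, 4, 2],
      [1, 0, 7]]
a3 = [-7, 1.25, 1]

# Observation (4): intercept, a numeric column, and two dummy columns
# that add up to the column of ones.
X4 = [[1, 13, 1, 0], [1, 9, 1, 0], [1, 10, 0, 1], [1, 4, 0, 1], [1, 2, 1, 0],
      [1, 0, 0, 1], [1, 8, 1, 0], [1, 6, 1, 0], [1, 3, 1, 0], [1, 11, 0, 1]]
a4 = [1, 0, -1, -1]   # 1 - x2 - x3 = 0

# Observation (5): the third column is twice the second.
X5 = [[1, 2, 4], [1, 3, 6], [1, 9, 18], [1, 0, 0], [1, 4, 8]]
a5 = [0, 2, -1]       # 2*x1 - x2 = 0

for X, a in [(X3, a3), (X4, a4), (X5, a5)]:
    print(a, certifies_deficiency(X, a))
```

A vector that fails the check (for instance a = (1, 0, 0) with the matrix of observation (5)) proves nothing by itself; full column rank requires that no non-zero a works.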
Here are some additional examples. These three matrices all have full column rank.
\[
\begin{pmatrix}
1 & 23 \\
1 & 19 \\
1 & 28 \\
1 & 20 \\
1 & 34 \\
1 & 17 \\
1 & 22
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & 23 & 10.4 \\
1 & 19 & 13.2 \\
1 & 28 & 12.8 \\
1 & 20 & 16.6 \\
1 & 34 & 19.4 \\
1 & 17 & 17.0 \\
1 & 22 & 9.6
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & 23 & 10.4 & -2.46 \\
1 & 19 & 13.2 & 0.23 \\
1 & 28 & 12.8 & -1.77 \\
1 & 20 & 16.6 & 1.28 \\
1 & 34 & 19.4 & 3.22 \\
1 & 17 & 17.0 & -2.40 \\
1 & 22 & 9.6 & 0.71
\end{pmatrix}
\]
These do not have full column rank.
1
1

1

1
1

1
1

23
19
28
20
34
17
22
23 
19 
28 

20 
34 

17 
22 
1
1

1

1
1

1
1

1
1

1

1
1

1
1

23
19
28
20
34
17
22
15
11
15
14
13
11
12
38 
30 
43 

34 
47 

28 
34 
1
1

1

1
1

1
1

23
19
28
20
34
17
22
15
11
15
14
13
11
12
238
330
413
374
407
328
334
8.4
6.6
7.8
7.2
6.5
6.0
9.3
-2.46
0.23
-1.77
1.28
3.22
-2.40
0.71
23
19
28
20
34
17
22
46
38
56
40
68
34
44
10.4 
13.2 
12.8 

16.6 
19.4 

17.0 
9.6 
1
1

1

1
1

1
1

1
1
1
0
0
0
0
0
0 
0

0
0

1
1 
65
74
58
61
70
72
66
0
0
0
1
1
0
0
0.31
0.45
0.42
0.36
0.49
0.28
0.33
╔╗
╚╝
╔╗
╚╝
1
1

1

1
1

1
1

1
1

1

1
1

1
1

╔╗
╚╝
╔╗
╚╝
╔╗
╚╝
23 6 
19 6 
28 6 

20 6 
34 6 

17 6 
22 6 
1
1
1
0
0
0
0
0
0
0
1
1
0
0
0
0 
0

0
0

0
0 
7
-1 
4

6
-2 

5
3 
The final example is 7-by-9. It has full row rank 7 (and thus rank 7), but it does not have
full column rank. We would not use it in a regression analysis.
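Column rank can also be computed mechanically by row reduction. The following is a minimal sketch, not a method given in these notes: the function names `rank` and `has_full_column_rank` are hypothetical, the matrices are a few of the 7-row examples as read from this section, and exact fractions stand in for the tolerance-based rank routines found in numerical libraries.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over exact fractions."""
    A = [[Fraction(str(x)) for x in row] for row in M]
    r = 0                                   # number of pivots found so far
    ncols = len(A[0]) if A else 0
    for c in range(ncols):
        # look for a row at or below position r with a non-zero entry in column c
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue                        # column c adds nothing new
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        r += 1
        if r == len(A):
            break                           # rank cannot exceed the number of rows
    return r

def has_full_column_rank(M):
    """True when the rank equals the number of columns."""
    return rank(M) == len(M[0])

x = [23, 19, 28, 20, 34, 17, 22]
y = [10.4, 13.2, 12.8, 16.6, 19.4, 17.0, 9.6]
z = [15, 11, 15, 14, 13, 11, 12]

full_2 = [[1, xi] for xi in x]                             # intercept and x
full_3 = [[1, xi, yi] for xi, yi in zip(x, y)]             # intercept, x, y
dup = [[1, xi, xi] for xi in x]                            # x repeated
summed = [[1, xi, zi, xi + zi] for xi, zi in zip(x, z)]    # last column is x + z
const = [[1, xi, 6] for xi in x]                           # last column is 6 times the ones
wide = [[1, 4, 2], [1, 0, 7]]                              # n < p

for name, M in [("full_2", full_2), ("full_3", full_3), ("dup", dup),
                ("summed", summed), ("const", const), ("wide", wide)]:
    print(name, rank(M), has_full_column_rank(M))

# Row rank equals column rank: transposing does not change the rank.
summed_T = [list(col) for col in zip(*summed)]
print(rank(summed) == rank(summed_T))
```

Exact arithmetic is a deliberate choice for this small illustration; with measured data, near-dependence among columns is usually judged with a numerical tolerance instead.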
gs2011