Textbook: “Matrix algebra useful for statistics”, Searle

Webpage:
1. Course notes: http://mail.thu.edu.tw/~wenwei/cgi, then click on 統計教材 (Statistics Teaching Materials) and then click on Math Algebra (Word, PDF).
2. Online grades: http://mail.thu.edu.tw/~wenwei, then click on Online Grade: 2006, Summer, Basic Statistics.
Objective: introduce basic concepts and skills in matrix algebra. In addition, some applications of matrix algebra in statistics are described.
Section 1. Introduction and Matrix Operations
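Definition of r × c matrix:
An $r \times c$ matrix $A$ is a rectangular array of $rc$ real numbers arranged in $r$ horizontal rows and $c$ vertical columns:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1c} \\ a_{21} & a_{22} & \cdots & a_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rc} \end{bmatrix}.$$
The i'th row of $A$ is
$$\operatorname{row}_i(A) = \begin{bmatrix} a_{i1} & a_{i2} & \cdots & a_{ic} \end{bmatrix}, \quad i = 1, 2, \ldots, r,$$
and the j'th column of $A$ is
$$\operatorname{col}_j(A) = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{rj} \end{bmatrix}, \quad j = 1, 2, \ldots, c.$$
We often write $A$ as
$$A = [a_{ij}] = A_{r \times c}.$$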
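Matrix addition:
Let
$$A = A_{r \times c} = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1c} \\ a_{21} & a_{22} & \cdots & a_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rc} \end{bmatrix},$$
$$B = B_{c \times s} = [b_{ij}] = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1s} \\ b_{21} & b_{22} & \cdots & b_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ b_{c1} & b_{c2} & \cdots & b_{cs} \end{bmatrix},$$
$$D = D_{r \times c} = [d_{ij}] = \begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1c} \\ d_{21} & d_{22} & \cdots & d_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ d_{r1} & d_{r2} & \cdots & d_{rc} \end{bmatrix}.$$
Then,
$$A + D = [a_{ij} + d_{ij}] = \begin{bmatrix} (a_{11} + d_{11}) & (a_{12} + d_{12}) & \cdots & (a_{1c} + d_{1c}) \\ (a_{21} + d_{21}) & (a_{22} + d_{22}) & \cdots & (a_{2c} + d_{2c}) \\ \vdots & \vdots & \ddots & \vdots \\ (a_{r1} + d_{r1}) & (a_{r2} + d_{r2}) & \cdots & (a_{rc} + d_{rc}) \end{bmatrix},$$
$$pA = [p\,a_{ij}] = \begin{bmatrix} pa_{11} & pa_{12} & \cdots & pa_{1c} \\ pa_{21} & pa_{22} & \cdots & pa_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ pa_{r1} & pa_{r2} & \cdots & pa_{rc} \end{bmatrix}, \quad p \in R,$$
and the transpose of $A$ is denoted as
$$A^t = A^t_{c \times r} = [a_{ji}] = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{r1} \\ a_{12} & a_{22} & \cdots & a_{r2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1c} & a_{2c} & \cdots & a_{rc} \end{bmatrix}.$$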
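Example 1:
Let
$$A = \begin{bmatrix} 1 & 3 & 1 \\ -4 & 5 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 3 & -7 & 0 \\ 8 & 1 & 1 \end{bmatrix}.$$
Then,
$$A + B = \begin{bmatrix} 1 + 3 & 3 - 7 & 1 + 0 \\ -4 + 8 & 5 + 1 & 0 + 1 \end{bmatrix} = \begin{bmatrix} 4 & -4 & 1 \\ 4 & 6 & 1 \end{bmatrix},$$
$$2A = \begin{bmatrix} 1 \cdot 2 & 3 \cdot 2 & 1 \cdot 2 \\ -4 \cdot 2 & 5 \cdot 2 & 0 \cdot 2 \end{bmatrix} = \begin{bmatrix} 2 & 6 & 2 \\ -8 & 10 & 0 \end{bmatrix}$$
and
$$A^t = \begin{bmatrix} 1 & -4 \\ 3 & 5 \\ 1 & 0 \end{bmatrix}.$$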
Matrix multiplication:
We first define the dot product or inner product of n-vectors.
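Definition of dot product:
The dot product or inner product of the vectors
$$a = \begin{bmatrix} a_1 & a_2 & \cdots & a_c \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_c \end{bmatrix},$$
each with $c$ components, is
$$a \cdot b = a_1 b_1 + a_2 b_2 + \cdots + a_c b_c = \sum_{i=1}^{c} a_i b_i.$$
Example 1:
Let $a = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$ and $b = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}$. Then, $a \cdot b = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32$.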
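Definition of matrix multiplication:
With $A_{r \times c}$ and $B_{c \times s}$ as defined above, the product is the $r \times s$ matrix
$$\begin{aligned}
E = E_{r \times s} = [e_{ij}]
&= \begin{bmatrix} e_{11} & e_{12} & \cdots & e_{1s} \\ e_{21} & e_{22} & \cdots & e_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ e_{r1} & e_{r2} & \cdots & e_{rs} \end{bmatrix} \\
&= \begin{bmatrix} \operatorname{row}_1(A) \cdot \operatorname{col}_1(B) & \operatorname{row}_1(A) \cdot \operatorname{col}_2(B) & \cdots & \operatorname{row}_1(A) \cdot \operatorname{col}_s(B) \\ \operatorname{row}_2(A) \cdot \operatorname{col}_1(B) & \operatorname{row}_2(A) \cdot \operatorname{col}_2(B) & \cdots & \operatorname{row}_2(A) \cdot \operatorname{col}_s(B) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{row}_r(A) \cdot \operatorname{col}_1(B) & \operatorname{row}_r(A) \cdot \operatorname{col}_2(B) & \cdots & \operatorname{row}_r(A) \cdot \operatorname{col}_s(B) \end{bmatrix} \\
&= \begin{bmatrix} \operatorname{row}_1(A) \\ \operatorname{row}_2(A) \\ \vdots \\ \operatorname{row}_r(A) \end{bmatrix} \begin{bmatrix} \operatorname{col}_1(B) & \operatorname{col}_2(B) & \cdots & \operatorname{col}_s(B) \end{bmatrix} \\
&= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1c} \\ a_{21} & a_{22} & \cdots & a_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rc} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1s} \\ b_{21} & b_{22} & \cdots & b_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ b_{c1} & b_{c2} & \cdots & b_{cs} \end{bmatrix} = A_{r \times c} B_{c \times s}.
\end{aligned}$$
That is,
$$e_{ij} = \operatorname{row}_i(A) \cdot \operatorname{col}_j(B) = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ic} b_{cj}, \quad i = 1, \ldots, r, \; j = 1, \ldots, s.$$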
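Example 2:
$$A_{2 \times 2} = \begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}, \quad B_{2 \times 3} = \begin{bmatrix} 0 & 1 & 3 \\ 1 & 0 & -2 \end{bmatrix}.$$
Then,
$$E_{2 \times 3} = \begin{bmatrix} \operatorname{row}_1(A) \cdot \operatorname{col}_1(B) & \operatorname{row}_1(A) \cdot \operatorname{col}_2(B) & \operatorname{row}_1(A) \cdot \operatorname{col}_3(B) \\ \operatorname{row}_2(A) \cdot \operatorname{col}_1(B) & \operatorname{row}_2(A) \cdot \operatorname{col}_2(B) & \operatorname{row}_2(A) \cdot \operatorname{col}_3(B) \end{bmatrix} = \begin{bmatrix} 2 & 1 & -1 \\ -1 & 3 & 11 \end{bmatrix}$$
since
$$\operatorname{row}_1(A) \cdot \operatorname{col}_1(B) = \begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 2, \quad \operatorname{row}_2(A) \cdot \operatorname{col}_1(B) = \begin{bmatrix} 3 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = -1,$$
$$\operatorname{row}_1(A) \cdot \operatorname{col}_2(B) = \begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 1, \quad \operatorname{row}_2(A) \cdot \operatorname{col}_2(B) = \begin{bmatrix} 3 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 3,$$
$$\operatorname{row}_1(A) \cdot \operatorname{col}_3(B) = \begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 3 \\ -2 \end{bmatrix} = -1, \quad \operatorname{row}_2(A) \cdot \operatorname{col}_3(B) = \begin{bmatrix} 3 & -1 \end{bmatrix} \begin{bmatrix} 3 \\ -2 \end{bmatrix} = 11.$$
Example 3:
$$a_{3 \times 1} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}, \quad b_{1 \times 2} = \begin{bmatrix} 4 & 5 \end{bmatrix} \;\Rightarrow\; a_{3 \times 1} b_{1 \times 2} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} \begin{bmatrix} 4 & 5 \end{bmatrix} = \begin{bmatrix} 8 & 10 \\ 4 & 5 \\ 12 & 15 \end{bmatrix}.$$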
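Another expression of matrix multiplication:
$$\begin{aligned}
A_{r \times c} B_{c \times s}
&= \begin{bmatrix} \operatorname{col}_1(A) & \operatorname{col}_2(A) & \cdots & \operatorname{col}_c(A) \end{bmatrix} \begin{bmatrix} \operatorname{row}_1(B) \\ \operatorname{row}_2(B) \\ \vdots \\ \operatorname{row}_c(B) \end{bmatrix} \\
&= \operatorname{col}_1(A) \operatorname{row}_1(B) + \operatorname{col}_2(A) \operatorname{row}_2(B) + \cdots + \operatorname{col}_c(A) \operatorname{row}_c(B) = \sum_{i=1}^{c} \operatorname{col}_i(A) \operatorname{row}_i(B),
\end{aligned}$$
where the $\operatorname{col}_i(A) \operatorname{row}_i(B)$ are $r \times s$ matrices.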
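Example 2 (continued):
$$\begin{aligned}
A_{2 \times 2} B_{2 \times 3}
&= \begin{bmatrix} \operatorname{col}_1(A) & \operatorname{col}_2(A) \end{bmatrix} \begin{bmatrix} \operatorname{row}_1(B) \\ \operatorname{row}_2(B) \end{bmatrix} = \operatorname{col}_1(A) \operatorname{row}_1(B) + \operatorname{col}_2(A) \operatorname{row}_2(B) \\
&= \begin{bmatrix} 1 \\ 3 \end{bmatrix} \begin{bmatrix} 0 & 1 & 3 \end{bmatrix} + \begin{bmatrix} 2 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -2 \end{bmatrix} \\
&= \begin{bmatrix} 0 & 1 & 3 \\ 0 & 3 & 9 \end{bmatrix} + \begin{bmatrix} 2 & 0 & -4 \\ -1 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 1 & -1 \\ -1 & 3 & 11 \end{bmatrix}.
\end{aligned}$$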
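Note:
Heuristically, the matrices $A$ and $B$, written as
$$\begin{bmatrix} \operatorname{row}_1(A) \\ \operatorname{row}_2(A) \\ \vdots \\ \operatorname{row}_r(A) \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \operatorname{col}_1(B) & \operatorname{col}_2(B) & \cdots & \operatorname{col}_s(B) \end{bmatrix},$$
can be thought of as $r \times 1$ and $1 \times s$ vectors. Thus,
$$A_{r \times c} B_{c \times s} = \begin{bmatrix} \operatorname{row}_1(A) \\ \operatorname{row}_2(A) \\ \vdots \\ \operatorname{row}_r(A) \end{bmatrix} \begin{bmatrix} \operatorname{col}_1(B) & \operatorname{col}_2(B) & \cdots & \operatorname{col}_s(B) \end{bmatrix}$$
can be thought of as the multiplication of $r \times 1$ and $1 \times s$ vectors. Similarly,
$$A_{r \times c} B_{c \times s} = \begin{bmatrix} \operatorname{col}_1(A) & \operatorname{col}_2(A) & \cdots & \operatorname{col}_c(A) \end{bmatrix} \begin{bmatrix} \operatorname{row}_1(B) \\ \operatorname{row}_2(B) \\ \vdots \\ \operatorname{row}_c(B) \end{bmatrix}$$
can be thought of as the multiplication of $1 \times c$ and $c \times 1$ vectors.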
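Note:
I. $AB$ is not necessarily equal to $BA$. For instance,
$$A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix} \text{ and } B = \begin{bmatrix} 2 & -1 \\ 0 & 2 \end{bmatrix} \;\Rightarrow\; AB = \begin{bmatrix} 2 & 5 \\ 4 & -4 \end{bmatrix} \neq \begin{bmatrix} 0 & 7 \\ 4 & -2 \end{bmatrix} = BA.$$
II. If $AC = BC$, $A$ might not be equal to $B$. For instance,
$$A = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}, \; B = \begin{bmatrix} 2 & 4 \\ 2 & 3 \end{bmatrix} \text{ and } C = \begin{bmatrix} 1 & -2 \\ -1 & 2 \end{bmatrix} \;\Rightarrow\; AC = \begin{bmatrix} -2 & 4 \\ -1 & 2 \end{bmatrix} = BC \text{ but } A \neq B.$$
III. If $AB = 0$, it is not necessary that $A = 0$ or $B = 0$. For instance,
$$A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \text{ and } B = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \;\Rightarrow\; AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = BA \text{ but } A \neq 0, \; B \neq 0.$$
IV. $A^p = \underbrace{A \cdot A \cdots A}_{p \text{ factors}}$, $A^p \cdot A^q = A^{p+q}$, $(A^p)^q = A^{pq}$. Also, $(AB)^p$ is not necessarily equal to $A^p B^p$.
V. $(AB)^t = B^t A^t$.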
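The cautions in this note are easy to confirm numerically. The following plain-Python sketch uses hypothetical helpers `mat_mul` and `transpose`, and takes the matrices to be those of notes I to III.

```python
def mat_mul(A, B):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*B)]
            for row in A]


def transpose(A):
    return [list(column) for column in zip(*A)]


# I. AB and BA differ in general.
A = [[1, 3], [2, -1]]
B = [[2, -1], [0, 2]]
AB, BA = mat_mul(A, B), mat_mul(B, A)

# II. AC = BC does not force A = B.
A2 = [[1, 3], [0, 1]]
B2 = [[2, 4], [2, 3]]
C = [[1, -2], [-1, 2]]
A2C, B2C = mat_mul(A2, C), mat_mul(B2, C)

# III. A product can be zero although neither factor is zero.
A3 = [[1, 1], [1, 1]]
B3 = [[1, -1], [-1, 1]]
A3B3 = mat_mul(A3, B3)

# V. The transpose of a product reverses the order of the factors.
lhs = transpose(AB)
rhs = mat_mul(transpose(B), transpose(A))
```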
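Trace:
Definition of the trace of a matrix:
The sum of the diagonal elements of an $r \times r$ square matrix is called the trace of the matrix, written $\operatorname{tr}(A)$, i.e., for
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rr} \end{bmatrix},$$
$$\operatorname{tr}(A) = a_{11} + a_{22} + \cdots + a_{rr} = \sum_{i=1}^{r} a_{ii}.$$
Example 4:
Let $A = \begin{bmatrix} 1 & 5 & 6 \\ 4 & 2 & 7 \\ 8 & 9 & 3 \end{bmatrix}$. Then, $\operatorname{tr}(A) = 1 + 2 + 3 = 6$.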
Homework 1
1. Prove $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, where $A$ and $B$ are $r \times c$ and $c \times r$ matrices, respectively.
2.
(a) When does $(A + B)(A - B) = A^2 - B^2$?
(b) When $A^t = A$, prove $\operatorname{tr}(AB) = \operatorname{tr}(AB^t)$.
(c) When $X^t X G X^t X = X^t X$, prove $X^t X G^t X^t X = X^t X$.
Section 2 Special Matrices
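2.1 Symmetric Matrices:
Definition of symmetric matrix:
An $r \times r$ matrix $A_{r \times r}$ is defined as symmetric if $A = A^t$. That is,
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{12} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1r} & a_{2r} & \cdots & a_{rr} \end{bmatrix}, \quad a_{ij} = a_{ji}.$$
Example 1:
$$A = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{bmatrix}$$
is symmetric since $A = A^t$.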
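Example 2:
Let $X_1, X_2, \ldots, X_r$ be random variables. Then, with rows and columns indexed by $X_1, X_2, \ldots, X_r$,
$$V = \begin{bmatrix} \operatorname{Cov}(X_1, X_1) & \operatorname{Cov}(X_1, X_2) & \cdots & \operatorname{Cov}(X_1, X_r) \\ \operatorname{Cov}(X_2, X_1) & \operatorname{Cov}(X_2, X_2) & \cdots & \operatorname{Cov}(X_2, X_r) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_r, X_1) & \operatorname{Cov}(X_r, X_2) & \cdots & \operatorname{Cov}(X_r, X_r) \end{bmatrix} = \begin{bmatrix} \operatorname{Var}(X_1) & \operatorname{Cov}(X_1, X_2) & \cdots & \operatorname{Cov}(X_1, X_r) \\ \operatorname{Cov}(X_1, X_2) & \operatorname{Var}(X_2) & \cdots & \operatorname{Cov}(X_2, X_r) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_1, X_r) & \operatorname{Cov}(X_2, X_r) & \cdots & \operatorname{Var}(X_r) \end{bmatrix}$$
is called the covariance matrix, where $\operatorname{Cov}(X_i, X_j) = \operatorname{Cov}(X_j, X_i)$, $i, j = 1, 2, \ldots, r$, is the covariance of the random variables $X_i$ and $X_j$, and $\operatorname{Var}(X_i)$ is the variance of $X_i$. $V$ is a symmetric matrix. The correlation matrix for $X_1, X_2, \ldots, X_r$ is defined as
$$R = \begin{bmatrix} \operatorname{Corr}(X_1, X_1) & \operatorname{Corr}(X_1, X_2) & \cdots & \operatorname{Corr}(X_1, X_r) \\ \operatorname{Corr}(X_2, X_1) & \operatorname{Corr}(X_2, X_2) & \cdots & \operatorname{Corr}(X_2, X_r) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Corr}(X_r, X_1) & \operatorname{Corr}(X_r, X_2) & \cdots & \operatorname{Corr}(X_r, X_r) \end{bmatrix} = \begin{bmatrix} 1 & \operatorname{Corr}(X_1, X_2) & \cdots & \operatorname{Corr}(X_1, X_r) \\ \operatorname{Corr}(X_1, X_2) & 1 & \cdots & \operatorname{Corr}(X_2, X_r) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Corr}(X_1, X_r) & \operatorname{Corr}(X_2, X_r) & \cdots & 1 \end{bmatrix},$$
where
$$\operatorname{Corr}(X_i, X_j) = \frac{\operatorname{Cov}(X_i, X_j)}{\sqrt{\operatorname{Var}(X_i) \operatorname{Var}(X_j)}} = \operatorname{Corr}(X_j, X_i), \quad i, j = 1, 2, \ldots, r,$$
is the correlation of $X_i$ and $X_j$. $R$ is also a symmetric matrix. For instance, let $X_1$ be the random variable representing the sales amount of some product and $X_2$ be the random variable representing the cost spent on advertisement. Suppose
$$\operatorname{Var}(X_1) = 20, \quad \operatorname{Var}(X_2) = 80, \quad \operatorname{Cov}(X_1, X_2) = 15.$$
Then,
$$V = \begin{bmatrix} 20 & 15 \\ 15 & 80 \end{bmatrix}$$
and
$$R = \begin{bmatrix} 1 & \dfrac{15}{\sqrt{20 \cdot 80}} \\ \dfrac{15}{\sqrt{20 \cdot 80}} & 1 \end{bmatrix} = \begin{bmatrix} 1 & \dfrac{3}{8} \\ \dfrac{3}{8} & 1 \end{bmatrix}.$$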
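The passage from a covariance matrix to the corresponding correlation matrix can be sketched as follows. This is a minimal illustration in plain Python; the function name `correlation_from_covariance` is a hypothetical helper, and V is the covariance matrix of the sales and advertisement example.

```python
import math


def correlation_from_covariance(V):
    """R[i][j] = V[i][j] / sqrt(V[i][i] * V[j][j]) for a covariance matrix V."""
    r = len(V)
    assert all(len(row) == r for row in V), "V must be square"
    assert all(V[i][i] > 0 for i in range(r)), "variances must be positive"
    return [[V[i][j] / math.sqrt(V[i][i] * V[j][j]) for j in range(r)]
            for i in range(r)]


# Var(X1) = 20, Var(X2) = 80, Cov(X1, X2) = 15.
V = [[20, 15], [15, 80]]
R = correlation_from_covariance(V)

# The off-diagonal entry should be close to 3/8, up to floating-point rounding.
off_diagonal = R[0][1]
```

Because the formula only rescales each entry of V by the two standard deviations, the symmetry of V carries over to R.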

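Example 3:
Let $A_{r \times c}$ be an $r \times c$ matrix. Then, both $AA^t$ and $A^t A$ are symmetric since
$$(AA^t)^t = (A^t)^t A^t = AA^t \quad \text{and} \quad (A^t A)^t = A^t (A^t)^t = A^t A.$$
$AA^t$ is an $r \times r$ symmetric matrix while $A^t A$ is a $c \times c$ symmetric matrix. Writing $\operatorname{col}_i^t(A)$ and $\operatorname{row}_i^t(A)$ for the transposes of $\operatorname{col}_i(A)$ and $\operatorname{row}_i(A)$,
$$\begin{aligned}
AA^t &= \begin{bmatrix} \operatorname{col}_1(A) & \operatorname{col}_2(A) & \cdots & \operatorname{col}_c(A) \end{bmatrix} \begin{bmatrix} \operatorname{row}_1(A^t) \\ \operatorname{row}_2(A^t) \\ \vdots \\ \operatorname{row}_c(A^t) \end{bmatrix}
= \begin{bmatrix} \operatorname{col}_1(A) & \operatorname{col}_2(A) & \cdots & \operatorname{col}_c(A) \end{bmatrix} \begin{bmatrix} \operatorname{col}_1^t(A) \\ \operatorname{col}_2^t(A) \\ \vdots \\ \operatorname{col}_c^t(A) \end{bmatrix} \\
&= \operatorname{col}_1(A) \operatorname{col}_1^t(A) + \operatorname{col}_2(A) \operatorname{col}_2^t(A) + \cdots + \operatorname{col}_c(A) \operatorname{col}_c^t(A) = \sum_{i=1}^{c} \operatorname{col}_i(A) \operatorname{col}_i^t(A).
\end{aligned}$$
Also,
$$AA^t = \begin{bmatrix} \operatorname{row}_1(A) \\ \operatorname{row}_2(A) \\ \vdots \\ \operatorname{row}_r(A) \end{bmatrix} \begin{bmatrix} \operatorname{row}_1^t(A) & \operatorname{row}_2^t(A) & \cdots & \operatorname{row}_r^t(A) \end{bmatrix}
= \begin{bmatrix} \operatorname{row}_1(A) \operatorname{row}_1^t(A) & \operatorname{row}_1(A) \operatorname{row}_2^t(A) & \cdots & \operatorname{row}_1(A) \operatorname{row}_r^t(A) \\ \operatorname{row}_2(A) \operatorname{row}_1^t(A) & \operatorname{row}_2(A) \operatorname{row}_2^t(A) & \cdots & \operatorname{row}_2(A) \operatorname{row}_r^t(A) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{row}_r(A) \operatorname{row}_1^t(A) & \operatorname{row}_r(A) \operatorname{row}_2^t(A) & \cdots & \operatorname{row}_r(A) \operatorname{row}_r^t(A) \end{bmatrix}.$$
Similarly,
$$\begin{aligned}
A^t A &= \begin{bmatrix} \operatorname{row}_1^t(A) & \operatorname{row}_2^t(A) & \cdots & \operatorname{row}_r^t(A) \end{bmatrix} \begin{bmatrix} \operatorname{row}_1(A) \\ \operatorname{row}_2(A) \\ \vdots \\ \operatorname{row}_r(A) \end{bmatrix} \\
&= \operatorname{row}_1^t(A) \operatorname{row}_1(A) + \operatorname{row}_2^t(A) \operatorname{row}_2(A) + \cdots + \operatorname{row}_r^t(A) \operatorname{row}_r(A) = \sum_{i=1}^{r} \operatorname{row}_i^t(A) \operatorname{row}_i(A)
\end{aligned}$$
and
$$A^t A = \begin{bmatrix} \operatorname{col}_1^t(A) \\ \operatorname{col}_2^t(A) \\ \vdots \\ \operatorname{col}_c^t(A) \end{bmatrix} \begin{bmatrix} \operatorname{col}_1(A) & \operatorname{col}_2(A) & \cdots & \operatorname{col}_c(A) \end{bmatrix}
= \begin{bmatrix} \operatorname{col}_1^t(A) \operatorname{col}_1(A) & \operatorname{col}_1^t(A) \operatorname{col}_2(A) & \cdots & \operatorname{col}_1^t(A) \operatorname{col}_c(A) \\ \operatorname{col}_2^t(A) \operatorname{col}_1(A) & \operatorname{col}_2^t(A) \operatorname{col}_2(A) & \cdots & \operatorname{col}_2^t(A) \operatorname{col}_c(A) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{col}_c^t(A) \operatorname{col}_1(A) & \operatorname{col}_c^t(A) \operatorname{col}_2(A) & \cdots & \operatorname{col}_c^t(A) \operatorname{col}_c(A) \end{bmatrix}.$$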
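For instance, let
$$A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 0 & 1 \end{bmatrix} \quad \text{and} \quad A^t = \begin{bmatrix} 1 & 3 \\ 2 & 0 \\ -1 & 1 \end{bmatrix}.$$
Then,
$$\begin{aligned}
AA^t &= \begin{bmatrix} \operatorname{col}_1(A) & \operatorname{col}_2(A) & \operatorname{col}_3(A) \end{bmatrix} \begin{bmatrix} \operatorname{row}_1(A^t) \\ \operatorname{row}_2(A^t) \\ \operatorname{row}_3(A^t) \end{bmatrix}
= \begin{bmatrix} \operatorname{col}_1(A) & \operatorname{col}_2(A) & \operatorname{col}_3(A) \end{bmatrix} \begin{bmatrix} \operatorname{col}_1^t(A) \\ \operatorname{col}_2^t(A) \\ \operatorname{col}_3^t(A) \end{bmatrix} \\
&= \operatorname{col}_1(A) \operatorname{col}_1^t(A) + \operatorname{col}_2(A) \operatorname{col}_2^t(A) + \operatorname{col}_3(A) \operatorname{col}_3^t(A) \\
&= \begin{bmatrix} 1 \\ 3 \end{bmatrix} \begin{bmatrix} 1 & 3 \end{bmatrix} + \begin{bmatrix} 2 \\ 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \end{bmatrix} + \begin{bmatrix} -1 \\ 1 \end{bmatrix} \begin{bmatrix} -1 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 3 \\ 3 & 9 \end{bmatrix} + \begin{bmatrix} 4 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 6 & 2 \\ 2 & 10 \end{bmatrix}.
\end{aligned}$$
In addition,
$$\begin{aligned}
A^t A &= \operatorname{row}_1^t(A) \operatorname{row}_1(A) + \operatorname{row}_2^t(A) \operatorname{row}_2(A) \\
&= \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 2 & -1 \end{bmatrix} + \begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix} \begin{bmatrix} 3 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 2 & -1 \\ 2 & 4 & -2 \\ -1 & -2 & 1 \end{bmatrix} + \begin{bmatrix} 9 & 0 & 3 \\ 0 & 0 & 0 \\ 3 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 10 & 2 & 2 \\ 2 & 4 & -2 \\ 2 & -2 & 2 \end{bmatrix}.
\end{aligned}$$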
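The two ways of forming $AA^t$ and $A^tA$ can be compared numerically. The sketch below is plain Python with hypothetical helper names (`transpose`, `mat_mul`, `outer`, `mat_sum`), using the 2 x 3 matrix of the instance above.

```python
def transpose(A):
    return [list(column) for column in zip(*A)]


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*B)]
            for row in A]


def outer(u, v):
    """Column vector u times row vector v."""
    return [[a * b for b in v] for a in u]


def mat_sum(mats):
    """Entrywise sum of a non-empty list of equally sized matrices."""
    total = [row[:] for row in mats[0]]
    for M in mats[1:]:
        total = [[t + m for t, m in zip(t_row, m_row)]
                 for t_row, m_row in zip(total, M)]
    return total


A = [[1, 2, -1], [3, 0, 1]]
At = transpose(A)

AAt = mat_mul(A, At)  # 2 x 2
AtA = mat_mul(At, A)  # 3 x 3

# Sum of col_i(A) col_i^t(A): the columns of A are the rows of A^t.
AAt_from_columns = mat_sum([outer(c, c) for c in At])
# Sum of row_i^t(A) row_i(A).
AtA_from_rows = mat_sum([outer(r, r) for r in A])
```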
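Note:
Suppose $A$ and $B$ are symmetric matrices. Then, $AB$ is not necessarily equal to $BA = (AB)^t$. That is, $AB$ might not be a symmetric matrix.
Example 4:
$$A = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 3 & 7 \\ 7 & 6 \end{bmatrix}.$$
Then,
$$AB = \begin{bmatrix} 17 & 19 \\ 27 & 32 \end{bmatrix} \neq BA = \begin{bmatrix} 17 & 27 \\ 19 & 32 \end{bmatrix}.$$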
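Properties of $AA^t$ and $A^t A$:
(a) $A^t A = 0 \Rightarrow A = 0$, and $\operatorname{tr}(A^t A) = 0 \Rightarrow A = 0$.
(b) $PAA^t = QAA^t \Rightarrow PA = QA$.
[proof]
(a)
Let
$$S = A^t A = \begin{bmatrix} \operatorname{col}_1^t(A) \operatorname{col}_1(A) & \operatorname{col}_1^t(A) \operatorname{col}_2(A) & \cdots & \operatorname{col}_1^t(A) \operatorname{col}_c(A) \\ \operatorname{col}_2^t(A) \operatorname{col}_1(A) & \operatorname{col}_2^t(A) \operatorname{col}_2(A) & \cdots & \operatorname{col}_2^t(A) \operatorname{col}_c(A) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{col}_c^t(A) \operatorname{col}_1(A) & \operatorname{col}_c^t(A) \operatorname{col}_2(A) & \cdots & \operatorname{col}_c^t(A) \operatorname{col}_c(A) \end{bmatrix} = [s_{ij}] = 0.$$
Thus, for $j = 1, 2, \ldots, c$,
$$s_{jj} = \operatorname{col}_j^t(A) \operatorname{col}_j(A) = \begin{bmatrix} a_{1j} & a_{2j} & \cdots & a_{rj} \end{bmatrix} \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{rj} \end{bmatrix} = a_{1j}^2 + a_{2j}^2 + \cdots + a_{rj}^2 = 0$$
$$\Rightarrow a_{1j} = a_{2j} = \cdots = a_{rj} = 0 \Rightarrow A = 0.$$
For the trace,
$$\begin{aligned}
\operatorname{tr}(A^t A) = \operatorname{tr}(S) &= s_{11} + s_{22} + \cdots + s_{cc} \\
&= \operatorname{col}_1^t(A) \operatorname{col}_1(A) + \operatorname{col}_2^t(A) \operatorname{col}_2(A) + \cdots + \operatorname{col}_c^t(A) \operatorname{col}_c(A) \\
&= a_{11}^2 + a_{21}^2 + \cdots + a_{r1}^2 + a_{12}^2 + a_{22}^2 + \cdots + a_{r2}^2 + \cdots + a_{1c}^2 + a_{2c}^2 + \cdots + a_{rc}^2 = 0
\end{aligned}$$
$$\Rightarrow a_{ij}^2 = 0, \; i = 1, 2, \ldots, r; \; j = 1, 2, \ldots, c \Rightarrow a_{ij} = 0 \Rightarrow A = 0.$$
(b)
Since $PAA^t = QAA^t$, $PAA^t - QAA^t = 0$, and
$$\begin{aligned}
(PAA^t - QAA^t)(P^t - Q^t) &= (PA - QA) A^t (P^t - Q^t) \\
&= (PA - QA)(A^t P^t - A^t Q^t) \\
&= (PA - QA)(PA - QA)^t = 0.
\end{aligned}$$
By (a),
$$(PA - QA)^t = 0 \Rightarrow PA - QA = 0 \Rightarrow PA = QA.$$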
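Note:
An $r \times r$ matrix $B_{r \times r}$ is defined as skew-symmetric if $B = -B^t$. That is, $b_{ij} = -b_{ji}$, $b_{ii} = 0$.
Example 5:
$$B = \begin{bmatrix} 0 & 4 & 5 \\ -4 & 0 & 6 \\ -5 & -6 & 0 \end{bmatrix}.$$
Thus,
$$B^t = \begin{bmatrix} 0 & -4 & -5 \\ 4 & 0 & -6 \\ 5 & 6 & 0 \end{bmatrix} \Rightarrow -B^t = \begin{bmatrix} 0 & 4 & 5 \\ -4 & 0 & 6 \\ -5 & -6 & 0 \end{bmatrix} = B.$$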
2.2 Idempotent Matrices:
Definition of idempotent matrices:
A square matrix $K$ is said to be idempotent if $K^2 = K$.
Properties of idempotent matrices:
1. $K^r = K$ for $r$ being a positive integer.
2. $I - K$ is idempotent.
3. If $K_1$ and $K_2$ are idempotent matrices and $K_1 K_2 = K_2 K_1$, then $K_1 K_2$ is idempotent.
[proof:]
1. For $r = 1$, $K^1 = K$. Suppose $K^r = K$ is true; then $K^{r+1} = K^r \cdot K = K \cdot K = K^2 = K$. By induction, $K^r = K$ for $r$ being any positive integer.
2. $(I - K)(I - K) = I - K - K + K^2 = I - K - K + K = I - K$.
3. $(K_1 K_2)(K_1 K_2) = K_1 K_2 K_1 K_2 = K_1 K_1 K_2 K_2$ (since $K_1 K_2 = K_2 K_1$) $= K_1^2 K_2^2 = K_1 K_2$.
Example 1
Let $A_{r \times c}$ be an $r \times c$ matrix for which $A^t A$ is invertible. Then,
$$K = A (A^t A)^{-1} A^t$$
is an idempotent matrix since
$$KK = A (A^t A)^{-1} A^t A (A^t A)^{-1} A^t = A I (A^t A)^{-1} A^t = A (A^t A)^{-1} A^t = K.$$
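The idempotency of $K = A(A^tA)^{-1}A^t$ can be verified exactly on a small case. The sketch below is an illustration under stated assumptions: the 3 x 2 matrix A is an arbitrary choice with invertible $A^tA$, the helper `inverse_2x2` only handles 2 x 2 matrices, and exact fractions from the standard library avoid rounding error.

```python
from fractions import Fraction


def transpose(A):
    return [list(column) for column in zip(*A)]


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*B)]
            for row in A]


def inverse_2x2(M):
    """Exact inverse of a 2 x 2 integer matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    assert det != 0, "matrix is singular"
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


# An arbitrary 3 x 2 matrix whose columns are linearly independent.
A = [[1, 0], [1, 1], [1, 2]]
At = transpose(A)

K = mat_mul(mat_mul(A, inverse_2x2(mat_mul(At, A))), At)  # 3 x 3
KK = mat_mul(K, K)

# Property 2: I - K is idempotent as well.
I_minus_K = [[(1 if i == j else 0) - K[i][j] for j in range(3)] for i in range(3)]
I_minus_K_squared = mat_mul(I_minus_K, I_minus_K)
```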
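Note:
A matrix $A$ satisfying $A^2 = 0$ is called nilpotent, and that for which $A^2 = I$ could be called unipotent.
Example 2:
$$A = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 4 & 10 \\ -1 & -2 & -5 \end{bmatrix} \Rightarrow A^2 = 0 \Rightarrow A \text{ is nilpotent.}$$
$$B = \begin{bmatrix} 1 & 3 \\ 0 & -1 \end{bmatrix} \Rightarrow B^2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \Rightarrow B \text{ is unipotent.}$$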
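Both claims of this example can be checked by squaring the matrices. The short plain-Python sketch below uses a hypothetical helper `mat_mul` and assumes the signs of the entries are as read in Example 2.

```python
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*B)]
            for row in A]


# Nilpotent candidate: rows 2 and 3 are multiples of row 1.
A = [[1, 2, 5], [2, 4, 10], [-1, -2, -5]]
A_squared = mat_mul(A, A)

# Unipotent candidate.
B = [[1, 3], [0, -1]]
B_squared = mat_mul(B, B)
```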
Note:
If $K$ is an idempotent matrix, then $K - I$ might not be idempotent.
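2.3 Orthogonal Matrices:
Definition of orthogonality:
Two $n \times 1$ vectors $u$ and $v$ are said to be orthogonal if
$$u^t v = v^t u = 0.$$
A set of $n \times 1$ vectors $\{x_1, x_2, \ldots, x_n\}$ is said to be orthonormal if
$$x_i^t x_i = 1, \quad x_i^t x_j = 0, \quad i \neq j, \; i, j = 1, 2, \ldots, n.$$
Definition of orthogonal matrix:
An $n \times n$ square matrix $P$ is said to be orthogonal if
$$PP^t = P^t P = I_{n \times n}.$$
Note:
$$\begin{aligned}
PP^t &= \begin{bmatrix} \operatorname{row}_1(P) \operatorname{row}_1^t(P) & \operatorname{row}_1(P) \operatorname{row}_2^t(P) & \cdots & \operatorname{row}_1(P) \operatorname{row}_n^t(P) \\ \operatorname{row}_2(P) \operatorname{row}_1^t(P) & \operatorname{row}_2(P) \operatorname{row}_2^t(P) & \cdots & \operatorname{row}_2(P) \operatorname{row}_n^t(P) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{row}_n(P) \operatorname{row}_1^t(P) & \operatorname{row}_n(P) \operatorname{row}_2^t(P) & \cdots & \operatorname{row}_n(P) \operatorname{row}_n^t(P) \end{bmatrix}
= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \\
&= \begin{bmatrix} \operatorname{col}_1^t(P) \operatorname{col}_1(P) & \operatorname{col}_1^t(P) \operatorname{col}_2(P) & \cdots & \operatorname{col}_1^t(P) \operatorname{col}_n(P) \\ \operatorname{col}_2^t(P) \operatorname{col}_1(P) & \operatorname{col}_2^t(P) \operatorname{col}_2(P) & \cdots & \operatorname{col}_2^t(P) \operatorname{col}_n(P) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{col}_n^t(P) \operatorname{col}_1(P) & \operatorname{col}_n^t(P) \operatorname{col}_2(P) & \cdots & \operatorname{col}_n^t(P) \operatorname{col}_n(P) \end{bmatrix} = P^t P
\end{aligned}$$
$$\Rightarrow \operatorname{row}_i(P) \operatorname{row}_i^t(P) = 1, \quad \operatorname{row}_i(P) \operatorname{row}_j^t(P) = 0, \quad \operatorname{col}_i^t(P) \operatorname{col}_i(P) = 1, \quad \operatorname{col}_i^t(P) \operatorname{col}_j(P) = 0, \quad i \neq j.$$
Thus,
$$\{\operatorname{row}_1^t(P), \operatorname{row}_2^t(P), \ldots, \operatorname{row}_n^t(P)\} \quad \text{and} \quad \{\operatorname{col}_1(P), \operatorname{col}_2(P), \ldots, \operatorname{col}_n(P)\}$$
are both orthonormal sets.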
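Example 1:
(a) Helmert Matrices:
The Helmert matrix of order $n$ has the first row
$$\begin{bmatrix} 1/\sqrt{n} & 1/\sqrt{n} & \cdots & 1/\sqrt{n} \end{bmatrix},$$
and the other $n - 1$ rows ($i = 2, 3, \ldots, n$) have the form
$$\Big[ \underbrace{1/\sqrt{(i-1)i} \;\; 1/\sqrt{(i-1)i} \;\; \cdots \;\; 1/\sqrt{(i-1)i}}_{(i-1) \text{ items}} \;\; -(i-1)/\sqrt{(i-1)i} \;\; \underbrace{0 \;\; \cdots \;\; 0}_{(n-i) \text{ items}} \Big].$$
For example, as $n = 4$, then
$$H_4 = \begin{bmatrix} 1/\sqrt{4} & 1/\sqrt{4} & 1/\sqrt{4} & 1/\sqrt{4} \\ 1/\sqrt{1 \cdot 2} & -1/\sqrt{1 \cdot 2} & 0 & 0 \\ 1/\sqrt{2 \cdot 3} & 1/\sqrt{2 \cdot 3} & -2/\sqrt{2 \cdot 3} & 0 \\ 1/\sqrt{3 \cdot 4} & 1/\sqrt{3 \cdot 4} & 1/\sqrt{3 \cdot 4} & -3/\sqrt{3 \cdot 4} \end{bmatrix}
= \begin{bmatrix} 1/\sqrt{4} & 1/\sqrt{4} & 1/\sqrt{4} & 1/\sqrt{4} \\ 1/\sqrt{2} & -1/\sqrt{2} & 0 & 0 \\ 1/\sqrt{6} & 1/\sqrt{6} & -2/\sqrt{6} & 0 \\ 1/\sqrt{12} & 1/\sqrt{12} & 1/\sqrt{12} & -3/\sqrt{12} \end{bmatrix}.$$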
In statistics, we can use $H$ to find a set of uncorrelated random variables. Suppose $Z_1, Z_2, Z_3, Z_4$ are random variables with
$$\operatorname{Cov}(Z_i, Z_j) = 0, \quad \operatorname{Cov}(Z_i, Z_i) = \sigma^2, \quad i \neq j, \; i, j = 1, 2, 3, 4.$$
Let
$$X = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix} = H_4 Z = \begin{bmatrix} 1/\sqrt{4} & 1/\sqrt{4} & 1/\sqrt{4} & 1/\sqrt{4} \\ 1/\sqrt{2} & -1/\sqrt{2} & 0 & 0 \\ 1/\sqrt{6} & 1/\sqrt{6} & -2/\sqrt{6} & 0 \\ 1/\sqrt{12} & 1/\sqrt{12} & 1/\sqrt{12} & -3/\sqrt{12} \end{bmatrix} \begin{bmatrix} Z_1 \\ Z_2 \\ Z_3 \\ Z_4 \end{bmatrix} = \begin{bmatrix} (1/\sqrt{4})(Z_1 + Z_2 + Z_3 + Z_4) \\ (1/\sqrt{2})(Z_1 - Z_2) \\ (1/\sqrt{6})(Z_1 + Z_2 - 2Z_3) \\ (1/\sqrt{12})(Z_1 + Z_2 + Z_3 - 3Z_4) \end{bmatrix}.$$
Then, for $i \neq j$,
$$\operatorname{Cov}(X_i, X_j) = \sigma^2 \operatorname{row}_i(H_4) \operatorname{row}_j^t(H_4) = 0$$
since $\{\operatorname{row}_1^t(H_4), \operatorname{row}_2^t(H_4), \operatorname{row}_3^t(H_4), \operatorname{row}_4^t(H_4)\}$ is an orthonormal set of vectors. That is, $X_1, X_2, X_3, X_4$ are uncorrelated random variables. Also,
$$X_2^2 + X_3^2 + X_4^2 = \sum_{i=1}^{4} (Z_i - \bar{Z})^2,$$
where
$$\bar{Z} = \frac{\sum_{i=1}^{4} Z_i}{4}.$$
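The sum-of-squares identity can be illustrated on one set of numbers. In the sketch below (plain Python), the values in `z` are arbitrary and stand in for one realisation of $Z_1, \ldots, Z_4$; the rows of $H_4$ are typed in directly from the display above. This is a numerical check, not a derivation.

```python
import math

# Rows of the Helmert matrix of order 4.
H4 = [
    [1 / math.sqrt(4)] * 4,
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0, 0.0],
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6), 0.0],
    [1 / math.sqrt(12), 1 / math.sqrt(12), 1 / math.sqrt(12), -3 / math.sqrt(12)],
]

# Arbitrary illustrative values for Z_1, ..., Z_4.
z = [1.0, 2.0, 4.0, 7.0]

# X = H4 Z, computed row by row.
x = [sum(h * zi for h, zi in zip(row, z)) for row in H4]

z_bar = sum(z) / len(z)
sum_sq_deviations = sum((zi - z_bar) ** 2 for zi in z)  # sum of (Z_i - Z_bar)^2
helmert_side = x[1] ** 2 + x[2] ** 2 + x[3] ** 2        # X_2^2 + X_3^2 + X_4^2
```

The first transformed variable carries the mean, since $X_1 = \sqrt{4}\,\bar{Z}$, which is why it is left out of the sum.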
(b) Givens Matrices:
Let the orthogonal matrix be
$$G = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix}.$$
$G$ is referred to as a Givens matrix of order 2. For a Givens matrix of order 3, there are $\binom{3}{2} = 3$ different forms,
$$G_{12} = \begin{bmatrix} \cos(\theta) & \sin(\theta) & 0 \\ -\sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad G_{13} = \begin{bmatrix} \cos(\theta) & 0 & \sin(\theta) \\ 0 & 1 & 0 \\ -\sin(\theta) & 0 & \cos(\theta) \end{bmatrix}, \quad G_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(\theta) & \sin(\theta) \\ 0 & -\sin(\theta) & \cos(\theta) \end{bmatrix}.$$
The general form of a Givens matrix $G_{ij}$ of order 3 is an identity matrix except for 4 elements: $\cos(\theta)$, $\sin(\theta)$, and $-\sin(\theta)$ are in the i'th and j'th rows and columns. Similarly, for a Givens matrix of order 4, there are $\binom{4}{2} = 6$ different forms,
$$G_{12} = \begin{bmatrix} \cos(\theta) & \sin(\theta) & 0 & 0 \\ -\sin(\theta) & \cos(\theta) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad G_{13} = \begin{bmatrix} \cos(\theta) & 0 & \sin(\theta) & 0 \\ 0 & 1 & 0 & 0 \\ -\sin(\theta) & 0 & \cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad G_{14} = \begin{bmatrix} \cos(\theta) & 0 & 0 & \sin(\theta) \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sin(\theta) & 0 & 0 & \cos(\theta) \end{bmatrix},$$
$$G_{23} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos(\theta) & \sin(\theta) & 0 \\ 0 & -\sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad G_{24} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos(\theta) & 0 & \sin(\theta) \\ 0 & 0 & 1 & 0 \\ 0 & -\sin(\theta) & 0 & \cos(\theta) \end{bmatrix}, \quad G_{34} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos(\theta) & \sin(\theta) \\ 0 & 0 & -\sin(\theta) & \cos(\theta) \end{bmatrix}.$$
For the Givens matrix of order $n$, there are $\binom{n}{2}$ different forms. The general form of $G_{rs} = [g_{ij}]$ is an identity matrix except for 4 elements,
$$g_{rr} = g_{ss} = \cos(\theta), \quad g_{rs} = -g_{sr} = \sin(\theta), \quad r < s.$$
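The general form of $G_{rs}$ can be sketched as a small constructor. The code below is plain Python with hypothetical names (`givens`, `is_identity`); indices r and s are 1-based to match the notes, the angle is called `theta` here as an assumed symbol, and the sign convention places $\sin$ in row r, column s.

```python
import math
from itertools import combinations


def givens(n, r, s, theta):
    """Givens matrix G_rs of order n: identity except for four entries."""
    assert 1 <= r < s <= n
    G = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    c, sn = math.cos(theta), math.sin(theta)
    G[r - 1][r - 1] = c
    G[s - 1][s - 1] = c
    G[r - 1][s - 1] = sn
    G[s - 1][r - 1] = -sn
    return G


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*B)]
            for row in A]


def is_identity(M, tol=1e-12):
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(M)) for j in range(len(M)))


# All index pairs (r, s) with r < s for order 4: one Givens form per pair.
pairs_order_4 = list(combinations(range(1, 5), 2))

# Orthogonality check for one form and an arbitrary angle.
G24 = givens(4, 2, 4, 0.3)
G24_t = [list(column) for column in zip(*G24)]
G24_is_orthogonal = is_identity(mat_mul(G24, G24_t))
```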
2.4 Positive Definite Matrices:
Definition of positive definite matrix:
A symmetric $n \times n$ matrix $A$ satisfying
$$x^t_{1 \times n} A_{n \times n} x_{n \times 1} > 0 \quad \text{for all } x \neq 0$$
is referred to as a positive definite (p.d.) matrix.
Intuition:
If $ax^2 > 0$ for all real numbers $x$, $x \neq 0$, then the real number $a$ is positive. Similarly, when $x$ is an $n \times 1$ vector, $A$ is an $n \times n$ matrix and $x^t A x > 0$ for all $x \neq 0$, the matrix $A$ is "positive".
Note:
A symmetric $n \times n$ matrix $A$ satisfying
$$x^t_{1 \times n} A_{n \times n} x_{n \times 1} \geq 0 \quad \text{for all } x \neq 0$$
is referred to as a positive semidefinite (p.s.d.) matrix.
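Example 1:
Let
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \quad \text{and} \quad l = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$
Thus, with $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = x^t l / n$,
$$\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2 &= \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \\
&= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} - n \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} 1/n \\ 1/n \\ \vdots \\ 1/n \end{bmatrix} \begin{bmatrix} 1/n & 1/n & \cdots & 1/n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \\
&= x^t I x - n \, x^t \left( \frac{l}{n} \right) \left( \frac{l^t}{n} \right) x = x^t I x - x^t \left( \frac{l l^t}{n} \right) x \\
&= x^t \left( I - \frac{l l^t}{n} \right) x.
\end{aligned}$$
Let $A = I - \dfrac{l l^t}{n}$. Then, $A$ is positive semidefinite since for $x \neq 0$,
$$x^t A x = \sum_{i=1}^{n} (x_i - \bar{x})^2 \geq 0.$$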
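The identity between the quadratic form and the sum of squared deviations can be checked exactly with rational arithmetic. The sketch below is plain Python; `centering_matrix` and `quad_form` are hypothetical helper names, and the data vectors are arbitrary illustrations.

```python
from fractions import Fraction


def centering_matrix(n):
    """A = I - l l^t / n, with exact rational entries."""
    return [[(1 if i == j else 0) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def quad_form(x, A):
    """x^t A x for a vector x and a square matrix A."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


x = [2, 4, 9]
A = centering_matrix(len(x))

x_bar = Fraction(sum(x), len(x))
sum_sq_deviations = sum((xi - x_bar) ** 2 for xi in x)
quadratic = quad_form(x, A)

# A non-zero constant vector gives x^t A x = 0, so A is positive
# semidefinite but not positive definite.
constant_case = quad_form([3, 3, 3], A)
```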