Textbook: "Matrix Algebra Useful for Statistics", Searle.

Webpage:
1. Course notes: http://mail.thu.edu.tw/~wenwei/cgi, then click on 統計教材 (statistics course materials) and then on Math Algebra (Word, PDF).
2. Online grades: http://mail.thu.edu.tw/~wenwei, then click on Online Grade: 2006, Summer, Basic Statistics.

Objective: introduce basic concepts and skills in matrix algebra. In addition, some applications of matrix algebra in statistics are described.

Section 1. Introduction and Matrix Operations

Definition of an r × c matrix: An r × c matrix A is a rectangular array of rc real numbers arranged in r horizontal rows and c vertical columns:

$$ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1c} \\ a_{21} & a_{22} & \cdots & a_{2c} \\ \vdots & \vdots & & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rc} \end{bmatrix}. $$

The i'th row of A is

$$ row_i(A) = \begin{bmatrix} a_{i1} & a_{i2} & \cdots & a_{ic} \end{bmatrix}, \quad i = 1, 2, \ldots, r, $$

and the j'th column of A is

$$ col_j(A) = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{rj} \end{bmatrix}, \quad j = 1, 2, \ldots, c. $$

We often write A as $A = [a_{ij}] = A_{r \times c}$.

Matrix addition: Let

$$ A = A_{r \times c} = [a_{ij}], \quad B = B_{c \times s} = [b_{ij}], \quad D = D_{r \times c} = [d_{ij}]. $$

Then

$$ A + D = [a_{ij} + d_{ij}] = \begin{bmatrix} a_{11}+d_{11} & a_{12}+d_{12} & \cdots & a_{1c}+d_{1c} \\ a_{21}+d_{21} & a_{22}+d_{22} & \cdots & a_{2c}+d_{2c} \\ \vdots & \vdots & & \vdots \\ a_{r1}+d_{r1} & a_{r2}+d_{r2} & \cdots & a_{rc}+d_{rc} \end{bmatrix}, \qquad pA = [p\,a_{ij}], \quad p \in \mathbb{R}, $$

and the transpose of A is

$$ A^t = A^t_{c \times r} = [a_{ji}] = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{r1} \\ a_{12} & a_{22} & \cdots & a_{r2} \\ \vdots & \vdots & & \vdots \\ a_{1c} & a_{2c} & \cdots & a_{rc} \end{bmatrix}. $$

Example 1: Let

$$ A = \begin{bmatrix} 1 & 3 & 1 \\ 4 & 5 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 3 & 7 & 0 \\ 8 & 1 & 1 \end{bmatrix}. $$

Then

$$ A + B = \begin{bmatrix} 1+3 & 3+7 & 1+0 \\ 4+8 & 5+1 & 0+1 \end{bmatrix} = \begin{bmatrix} 4 & 10 & 1 \\ 12 & 6 & 1 \end{bmatrix}, $$

$$ 2A = \begin{bmatrix} 2 \cdot 1 & 2 \cdot 3 & 2 \cdot 1 \\ 2 \cdot 4 & 2 \cdot 5 & 2 \cdot 0 \end{bmatrix} = \begin{bmatrix} 2 & 6 & 2 \\ 8 & 10 & 0 \end{bmatrix} \quad \text{and} \quad A^t = \begin{bmatrix} 1 & 4 \\ 3 & 5 \\ 1 & 0 \end{bmatrix}. $$

Matrix multiplication: We first define the dot product, or inner product, of n-vectors.

Definition of dot product: The dot product (inner product) of the n-vectors

$$ a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} $$

is

$$ a \cdot b = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{i=1}^{n} a_i b_i. $$

Example: Let $a = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$ and $b = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}$. Then $a \cdot b = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32$.
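The operations above can be implemented directly from their definitions. The following is a small illustrative sketch (not part of the original notes), in plain Python with a matrix represented as a list of rows, using the matrices of Example 1:

```python
# Basic matrix operations implemented from the definitions of Section 1.

def mat_add(A, D):
    """Entrywise sum [a_ij + d_ij]; A and D must have the same shape."""
    return [[a + d for a, d in zip(rowA, rowD)] for rowA, rowD in zip(A, D)]

def scalar_mult(p, A):
    """Scalar multiple pA = [p * a_ij]."""
    return [[p * a for a in row] for row in A]

def transpose(A):
    """Transpose A^t = [a_ji]: rows become columns."""
    return [list(col) for col in zip(*A)]

def dot(a, b):
    """Dot (inner) product of two n-vectors: sum_i a_i * b_i."""
    return sum(x * y for x, y in zip(a, b))

A = [[1, 3, 1], [4, 5, 0]]
B = [[3, 7, 0], [8, 1, 1]]

print(mat_add(A, B))              # [[4, 10, 1], [12, 6, 1]]
print(scalar_mult(2, A))          # [[2, 6, 2], [8, 10, 0]]
print(transpose(A))               # [[1, 4], [3, 5], [1, 0]]
print(dot([1, 2, 3], [4, 5, 6]))  # 32
```

The list-of-rows representation makes `transpose` a one-liner via `zip(*A)`, since unpacking the rows and re-zipping them groups entries by column.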
Definition of matrix multiplication: For $A_{r\times c}$ and $B_{c\times s}$, the product $E = E_{r\times s} = [e_{ij}] = A_{r\times c} B_{c\times s}$ is

$$ E = \begin{bmatrix} row_1(A)\cdot col_1(B) & row_1(A)\cdot col_2(B) & \cdots & row_1(A)\cdot col_s(B) \\ row_2(A)\cdot col_1(B) & row_2(A)\cdot col_2(B) & \cdots & row_2(A)\cdot col_s(B) \\ \vdots & \vdots & & \vdots \\ row_r(A)\cdot col_1(B) & row_r(A)\cdot col_2(B) & \cdots & row_r(A)\cdot col_s(B) \end{bmatrix} = \begin{bmatrix} row_1(A) \\ row_2(A) \\ \vdots \\ row_r(A) \end{bmatrix} \begin{bmatrix} col_1(B) & col_2(B) & \cdots & col_s(B) \end{bmatrix}. $$

That is,

$$ e_{ij} = row_i(A)\cdot col_j(B) = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ic}b_{cj}, \quad i = 1,\ldots,r, \; j = 1,\ldots,s. $$

Example 2: Let

$$ A_{2\times 2} = \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}, \quad B_{2\times 3} = \begin{bmatrix} 0 & 1 & 3 \\ 1 & 0 & 2 \end{bmatrix}. $$

Then

$$ E_{2\times 3} = \begin{bmatrix} row_1(A)\cdot col_1(B) & row_1(A)\cdot col_2(B) & row_1(A)\cdot col_3(B) \\ row_2(A)\cdot col_1(B) & row_2(A)\cdot col_2(B) & row_2(A)\cdot col_3(B) \end{bmatrix} = \begin{bmatrix} 2 & 1 & 7 \\ 1 & 3 & 11 \end{bmatrix} $$

since

$$ row_1(A)\cdot col_1(B) = \begin{bmatrix} 1 & 2 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = 2, \qquad row_2(A)\cdot col_1(B) = \begin{bmatrix} 3 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = 1, $$

$$ row_1(A)\cdot col_2(B) = \begin{bmatrix} 1 & 2 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = 1, \qquad row_2(A)\cdot col_2(B) = \begin{bmatrix} 3 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = 3, $$

$$ row_1(A)\cdot col_3(B) = \begin{bmatrix} 1 & 2 \end{bmatrix}\begin{bmatrix} 3 \\ 2 \end{bmatrix} = 7, \qquad row_2(A)\cdot col_3(B) = \begin{bmatrix} 3 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ 2 \end{bmatrix} = 11. $$

Example 3:

$$ a_{3\times 1} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad b_{1\times 2} = \begin{bmatrix} 4 & 5 \end{bmatrix}, \quad a_{3\times 1} b_{1\times 2} = \begin{bmatrix} 1 \cdot 4 & 1 \cdot 5 \\ 2 \cdot 4 & 2 \cdot 5 \\ 3 \cdot 4 & 3 \cdot 5 \end{bmatrix} = \begin{bmatrix} 4 & 5 \\ 8 & 10 \\ 12 & 15 \end{bmatrix}. $$

Another expression of matrix multiplication:

$$ A_{r\times c} B_{c\times s} = \begin{bmatrix} col_1(A) & col_2(A) & \cdots & col_c(A) \end{bmatrix} \begin{bmatrix} row_1(B) \\ row_2(B) \\ \vdots \\ row_c(B) \end{bmatrix} = col_1(A)row_1(B) + col_2(A)row_2(B) + \cdots + col_c(A)row_c(B) = \sum_{i=1}^{c} col_i(A) row_i(B), $$

where each $col_i(A) row_i(B)$ is an $r \times s$ matrix.

Example 2 (continued):

$$ A_{2\times 2} B_{2\times 3} = col_1(A) row_1(B) + col_2(A) row_2(B) = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \begin{bmatrix} 0 & 1 & 3 \end{bmatrix} + \begin{bmatrix} 2 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 3 \\ 0 & 3 & 9 \end{bmatrix} + \begin{bmatrix} 2 & 0 & 4 \\ 1 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 7 \\ 1 & 3 & 11 \end{bmatrix}. $$

Note: Heuristically, the matrices A and B, written as

$$ \begin{bmatrix} row_1(A) \\ row_2(A) \\ \vdots \\ row_r(A) \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} col_1(B) & col_2(B) & \cdots & col_s(B) \end{bmatrix}, $$

can be thought of as r × 1 and 1 × s vectors. Thus

$$ A_{r\times c} B_{c\times s} = \begin{bmatrix} row_1(A) \\ row_2(A) \\ \vdots \\ row_r(A) \end{bmatrix} \begin{bmatrix} col_1(B) & col_2(B) & \cdots & col_s(B) \end{bmatrix} $$

can be thought of as the multiplication of an r × 1 vector by a 1 × s vector. Similarly,

$$ A_{r\times c} B_{c\times s} = \begin{bmatrix} col_1(A) & col_2(A) & \cdots & col_c(A) \end{bmatrix} \begin{bmatrix} row_1(B) \\ row_2(B) \\ \vdots \\ row_c(B) \end{bmatrix} $$

can be thought of as the multiplication of a 1 × c vector by a c × 1 vector.

Note:

I. AB is not necessarily equal to BA. For instance, let

$$ A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}. \quad \text{Then} \quad AB = \begin{bmatrix} 2 & 5 \\ 4 & 4 \end{bmatrix} \neq \begin{bmatrix} 4 & 5 \\ 4 & 2 \end{bmatrix} = BA. $$

II. AC = BC does not necessarily imply A = B. For instance, let

$$ A = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & 2 \\ 0 & 1 \end{bmatrix} $$
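Both expressions of the product can be checked numerically. The following is an illustrative sketch (mine, not from the notes) that computes Example 2's product once entrywise via $e_{ij} = row_i(A)\cdot col_j(B)$ and once as the sum of outer products $\sum_i col_i(A) row_i(B)$:

```python
# Matrix product two ways: entrywise dot products, and a sum of outer products.

def matmul(A, B):
    """e_ij = row_i(A) . col_j(B) = sum_k a_ik * b_kj."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def outer(col, row):
    """Outer product of an r x 1 column and a 1 x s row: an r x s matrix."""
    return [[c * r for r in row] for c in col]

def matmul_outer(A, B):
    """A B = sum_{i=1}^{c} col_i(A) row_i(B)."""
    r, s, c = len(A), len(B[0]), len(B)
    total = [[0] * s for _ in range(r)]
    for i in range(c):
        piece = outer([A[k][i] for k in range(r)], B[i])
        total = [[t + p for t, p in zip(tr, pr)]
                 for tr, pr in zip(total, piece)]
    return total

A = [[1, 2], [3, 1]]
B = [[0, 1, 3], [1, 0, 2]]
print(matmul(A, B))        # [[2, 1, 7], [1, 3, 11]]
print(matmul_outer(A, B))  # the same matrix, built column-times-row
```

Agreement of the two results is exactly the identity $AB = \sum_i col_i(A) row_i(B)$ stated above.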
and take

$$ C = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}. \quad \text{Then} \quad AC = \begin{bmatrix} 4 & 8 \\ 1 & 2 \end{bmatrix} = BC \quad \text{but} \quad A \neq B. $$

III. AB = 0 does not imply that A = 0 or B = 0. For instance, let

$$ A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}. \quad \text{Then} \quad AB = 0 = BA \quad \text{but} \quad A \neq 0, \; B \neq 0. $$

IV. $A^p = \underbrace{A A \cdots A}_{p \text{ factors}}$, $\;A^p A^q = A^{p+q}$, $\;(A^p)^q = A^{pq}$. Also, $(AB)^p$ is not necessarily equal to $A^p B^p$.

V. $(AB)^t = B^t A^t$.

Trace:

Definition of the trace of a matrix: The sum of the diagonal elements of an r × r square matrix is called the trace of the matrix, written tr(A); i.e., for

$$ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rr} \end{bmatrix}, \qquad tr(A) = a_{11} + a_{22} + \cdots + a_{rr} = \sum_{i=1}^{r} a_{ii}. $$

Example 4: Let

$$ A = \begin{bmatrix} 1 & 5 & 6 \\ 4 & 2 & 7 \\ 8 & 9 & 3 \end{bmatrix}. \quad \text{Then} \quad tr(A) = 1 + 2 + 3 = 6. $$

Homework 1

1. Prove $tr(AB) = tr(BA)$, where A and B are r × c and c × r matrices, respectively.
2. (a) When does $(A + B)(A - B) = A^2 - B^2$?
   (b) When $A^t = A$, prove that $tr(AB) = tr(AB^t)$.
   (c) When $X^t X G X^t X = X^t X$, prove that $X^t X G^t X^t X = X^t X$.

Section 2. Special Matrices

2.1 Symmetric Matrices:

Definition of symmetric matrix: An r × r matrix $A_{r\times r}$ is defined as symmetric if $A = A^t$; that is,

$$ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{12} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & & \vdots \\ a_{1r} & a_{2r} & \cdots & a_{rr} \end{bmatrix}, \qquad a_{ij} = a_{ji}. $$

Example 1:

$$ A = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{bmatrix} $$

is symmetric since $A = A^t$.

Example 2: Let $X_1, X_2, \ldots, X_r$ be random variables. Then

$$ V = \begin{bmatrix} Cov(X_1,X_1) & Cov(X_1,X_2) & \cdots & Cov(X_1,X_r) \\ Cov(X_2,X_1) & Cov(X_2,X_2) & \cdots & Cov(X_2,X_r) \\ \vdots & \vdots & & \vdots \\ Cov(X_r,X_1) & Cov(X_r,X_2) & \cdots & Cov(X_r,X_r) \end{bmatrix} = \begin{bmatrix} Var(X_1) & Cov(X_1,X_2) & \cdots & Cov(X_1,X_r) \\ Cov(X_1,X_2) & Var(X_2) & \cdots & Cov(X_2,X_r) \\ \vdots & \vdots & & \vdots \\ Cov(X_1,X_r) & Cov(X_2,X_r) & \cdots & Var(X_r) \end{bmatrix} $$

is called the covariance matrix, where $Cov(X_i,X_j) = Cov(X_j,X_i)$, $i,j = 1,2,\ldots,r$, is the covariance of the random variables $X_i$ and $X_j$, and $Var(X_i)$ is the variance of $X_i$. V is a symmetric matrix.
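Homework problem 1 can be sanity-checked numerically before attempting the proof. A small sketch (my own, with arbitrarily chosen matrices) that compares tr(AB) and tr(BA) for a 2 × 3 matrix A and a 3 × 2 matrix B:

```python
# Numerical check of tr(AB) = tr(BA): AB is 2 x 2 and BA is 3 x 3,
# yet the two traces agree.

def matmul(A, B):
    """Matrix product via e_ij = sum_k a_ik * b_kj."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def trace(A):
    """tr(A) = sum of the diagonal elements of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]  # 3 x 2

print(trace(matmul(A, B)))  # 28
print(trace(matmul(B, A)))  # 28
```

The check also reproduces Example 4: `trace([[1, 5, 6], [4, 2, 7], [8, 9, 3]])` returns 6.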
The correlation matrix for $X_1, X_2, \ldots, X_r$ is defined as

$$ R = \begin{bmatrix} Corr(X_1,X_1) & Corr(X_1,X_2) & \cdots & Corr(X_1,X_r) \\ Corr(X_2,X_1) & Corr(X_2,X_2) & \cdots & Corr(X_2,X_r) \\ \vdots & \vdots & & \vdots \\ Corr(X_r,X_1) & Corr(X_r,X_2) & \cdots & Corr(X_r,X_r) \end{bmatrix} = \begin{bmatrix} 1 & Corr(X_1,X_2) & \cdots & Corr(X_1,X_r) \\ Corr(X_1,X_2) & 1 & \cdots & Corr(X_2,X_r) \\ \vdots & \vdots & & \vdots \\ Corr(X_1,X_r) & Corr(X_2,X_r) & \cdots & 1 \end{bmatrix}, $$

where

$$ Corr(X_i,X_j) = \frac{Cov(X_i,X_j)}{\sqrt{Var(X_i)Var(X_j)}} = Corr(X_j,X_i), \quad i,j = 1,2,\ldots,r, $$

is the correlation of $X_i$ and $X_j$. R is also a symmetric matrix.

For instance, let $X_1$ be the random variable representing the sales amount of some product and $X_2$ the random variable representing the cost spent on advertisement. Suppose

$$ Var(X_1) = 20, \quad Var(X_2) = 80, \quad Cov(X_1,X_2) = 15. $$

Then

$$ V = \begin{bmatrix} 20 & 15 \\ 15 & 80 \end{bmatrix} \quad \text{and} \quad R = \begin{bmatrix} 1 & \dfrac{15}{\sqrt{20 \cdot 80}} \\ \dfrac{15}{\sqrt{20 \cdot 80}} & 1 \end{bmatrix} = \begin{bmatrix} 1 & \dfrac{3}{8} \\ \dfrac{3}{8} & 1 \end{bmatrix}. $$

Example 3: Let $A_{r\times c}$ be an r × c matrix. Then both $AA^t$ and $A^tA$ are symmetric since

$$ \left(AA^t\right)^t = \left(A^t\right)^t A^t = AA^t \quad \text{and} \quad \left(A^tA\right)^t = A^t \left(A^t\right)^t = A^tA. $$

$AA^t$ is an r × r symmetric matrix while $A^tA$ is a c × c symmetric matrix.
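The passage from V to R divides each covariance by the product of the two standard deviations. A short sketch (my illustration, not from the notes) reproducing the numbers above:

```python
import math

def corr_matrix(V):
    """R_ij = V_ij / sqrt(V_ii * V_jj), applied entrywise to a covariance matrix."""
    n = len(V)
    return [[V[i][j] / math.sqrt(V[i][i] * V[j][j]) for j in range(n)]
            for i in range(n)]

V = [[20, 15], [15, 80]]
print(corr_matrix(V))  # [[1.0, 0.375], [0.375, 1.0]]  -- and 0.375 = 3/8
```

The diagonal is 1 by construction, since $Corr(X_i, X_i) = Var(X_i)/Var(X_i)$, matching the second form of R above.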
Writing the rows of $A^t$ as the transposed columns of A,

$$ AA^t = \begin{bmatrix} col_1(A) & col_2(A) & \cdots & col_c(A) \end{bmatrix} \begin{bmatrix} col_1^t(A) \\ col_2^t(A) \\ \vdots \\ col_c^t(A) \end{bmatrix} = col_1(A)col_1^t(A) + col_2(A)col_2^t(A) + \cdots + col_c(A)col_c^t(A) = \sum_{i=1}^{c} col_i(A)\,col_i^t(A). $$

Also,

$$ AA^t = \begin{bmatrix} row_1(A) \\ row_2(A) \\ \vdots \\ row_r(A) \end{bmatrix} \begin{bmatrix} row_1^t(A) & row_2^t(A) & \cdots & row_r^t(A) \end{bmatrix} = \begin{bmatrix} row_1(A)row_1^t(A) & row_1(A)row_2^t(A) & \cdots & row_1(A)row_r^t(A) \\ row_2(A)row_1^t(A) & row_2(A)row_2^t(A) & \cdots & row_2(A)row_r^t(A) \\ \vdots & \vdots & & \vdots \\ row_r(A)row_1^t(A) & row_r(A)row_2^t(A) & \cdots & row_r(A)row_r^t(A) \end{bmatrix}. $$

Similarly,

$$ A^tA = \begin{bmatrix} row_1^t(A) & row_2^t(A) & \cdots & row_r^t(A) \end{bmatrix} \begin{bmatrix} row_1(A) \\ row_2(A) \\ \vdots \\ row_r(A) \end{bmatrix} = row_1^t(A)row_1(A) + row_2^t(A)row_2(A) + \cdots + row_r^t(A)row_r(A) = \sum_{i=1}^{r} row_i^t(A)\,row_i(A) $$

and

$$ A^tA = \begin{bmatrix} col_1^t(A) \\ col_2^t(A) \\ \vdots \\ col_c^t(A) \end{bmatrix} \begin{bmatrix} col_1(A) & col_2(A) & \cdots & col_c(A) \end{bmatrix} = \begin{bmatrix} col_1^t(A)col_1(A) & col_1^t(A)col_2(A) & \cdots & col_1^t(A)col_c(A) \\ col_2^t(A)col_1(A) & col_2^t(A)col_2(A) & \cdots & col_2^t(A)col_c(A) \\ \vdots & \vdots & & \vdots \\ col_c^t(A)col_1(A) & col_c^t(A)col_2(A) & \cdots & col_c^t(A)col_c(A) \end{bmatrix}. $$

For instance, let

$$ A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 0 & 1 \end{bmatrix} \quad \text{and} \quad A^t = \begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 1 & 1 \end{bmatrix}. $$

Then

$$ AA^t = col_1(A)col_1^t(A) + col_2(A)col_2^t(A) + col_3(A)col_3^t(A) = \begin{bmatrix} 1 & 3 \\ 3 & 9 \end{bmatrix} + \begin{bmatrix} 4 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 6 & 4 \\ 4 & 10 \end{bmatrix}. $$

In addition,

$$ A^tA = row_1^t(A)row_1(A) + row_2^t(A)row_2(A) = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 2 & 1 \end{bmatrix} + \begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix}\begin{bmatrix} 3 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} + \begin{bmatrix} 9 & 0 & 3 \\ 0 & 0 & 0 \\ 3 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 10 & 2 & 4 \\ 2 & 4 & 2 \\ 4 & 2 & 2 \end{bmatrix}. $$

Note: Let A and B be symmetric matrices. Then AB is not necessarily equal to $BA = (AB)^t$. That is, AB might not be a symmetric matrix.

Example 4:

$$ A = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 3 & 7 \\ 7 & 6 \end{bmatrix}. \quad \text{Then} \quad AB = \begin{bmatrix} 17 & 19 \\ 27 & 32 \end{bmatrix} \neq \begin{bmatrix} 17 & 27 \\ 19 & 32 \end{bmatrix} = BA. $$

Properties of $AA^t$ and $A^tA$:

(a) $A^tA = 0 \Leftrightarrow A = 0$, and $tr(A^tA) = 0 \Leftrightarrow A = 0$.

(b) $PAA^t = QAA^t \Leftrightarrow PA = QA$.

[proof:]

(a) Suppose

$$ S = A^tA = \begin{bmatrix} col_1^t(A)col_1(A) & col_1^t(A)col_2(A) & \cdots & col_1^t(A)col_c(A) \\ col_2^t(A)col_1(A) & col_2^t(A)col_2(A) & \cdots & col_2^t(A)col_c(A) \\ \vdots & \vdots & & \vdots \\ col_c^t(A)col_1(A) & col_c^t(A)col_2(A) & \cdots & col_c^t(A)col_c(A) \end{bmatrix} = [s_{ij}] = 0. $$
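The outer-product expansion of $AA^t$ can be verified numerically with the 2 × 3 matrix used above. A sketch (my illustration, not from the notes):

```python
# Verify AA^t = sum_i col_i(A) col_i^t(A) for A = [[1, 2, 1], [3, 0, 1]].

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def outer(u, v):
    """Outer product of a column vector u and a row vector v."""
    return [[ui * vj for vj in v] for ui in u]

A = [[1, 2, 1], [3, 0, 1]]
direct = matmul(A, transpose(A))  # A A^t computed directly

# Sum of outer products, one term per column of A
r, c = len(A), len(A[0])
total = [[0] * r for _ in range(r)]
for i in range(c):
    col = [A[k][i] for k in range(r)]
    piece = outer(col, col)       # col_i(A) col_i^t(A), an r x r matrix
    total = [[t + p for t, p in zip(trow, prow)]
             for trow, prow in zip(total, piece)]

print(direct)  # [[6, 4], [4, 10]]
print(total)   # the same matrix, assembled term by term
```

Each term $col_i(A)\,col_i^t(A)$ is the r × r matrix displayed in the worked computation above, and their sum matches the direct product exactly.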
Thus, for $j = 1, 2, \ldots, c$,

$$ s_{jj} = col_j^t(A)\,col_j(A) = \begin{bmatrix} a_{1j} & a_{2j} & \cdots & a_{rj} \end{bmatrix} \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{rj} \end{bmatrix} = a_{1j}^2 + a_{2j}^2 + \cdots + a_{rj}^2 = 0, $$

so every entry of the j'th column is zero, and hence $A = 0$. Similarly,

$$ tr(A^tA) = tr(S) = s_{11} + s_{22} + \cdots + s_{cc} = \left(a_{11}^2 + a_{21}^2 + \cdots + a_{r1}^2\right) + \cdots + \left(a_{1c}^2 + a_{2c}^2 + \cdots + a_{rc}^2\right) = 0 $$

$$ \Rightarrow a_{ij}^2 = 0, \; i = 1,\ldots,r, \; j = 1,\ldots,c \Rightarrow a_{ij} = 0 \Rightarrow A = 0. $$

(b) Since $PAA^t = QAA^t$, we have $PAA^t - QAA^t = 0$. Then

$$ \left(PAA^t - QAA^t\right)\left(P^t - Q^t\right) = (PA - QA)\,A^t\left(P^t - Q^t\right) = (PA - QA)(PA - QA)^t = 0. $$

By (a) (applied to $(PA - QA)^t$), $PA - QA = 0$, i.e., $PA = QA$. The converse is immediate.

Note: An r × r matrix $B_{r\times r}$ is defined as skew-symmetric if $B = -B^t$; that is, $b_{ij} = -b_{ji}$ and $b_{ii} = 0$.

Example 5:

$$ B = \begin{bmatrix} 0 & 4 & 5 \\ -4 & 0 & 6 \\ -5 & -6 & 0 \end{bmatrix}. \quad \text{Thus} \quad B^t = \begin{bmatrix} 0 & -4 & -5 \\ 4 & 0 & -6 \\ 5 & 6 & 0 \end{bmatrix} = -B. $$

2.2 Idempotent Matrices:

Definition of idempotent matrix: A square matrix K is said to be idempotent if $K^2 = K$.

Properties of idempotent matrices:

1. $K^r = K$ for r a positive integer.
2. $I - K$ is idempotent.
3. If $K_1$ and $K_2$ are idempotent matrices and $K_1 K_2 = K_2 K_1$, then $K_1 K_2$ is idempotent.

[proof:]

1. For $r = 1$, $K^r = K$ is true. Suppose $K^r = K$ holds; then

$$ K^{r+1} = K^r K = K \cdot K = K^2 = K. $$

By induction, $K^r = K$ for every positive integer r.

2. $(I - K)(I - K) = I - K - K + K^2 = I - K - K + K = I - K$.

3. $(K_1 K_2)(K_1 K_2) = K_1 (K_2 K_1) K_2 = K_1 (K_1 K_2) K_2 = K_1^2 K_2^2 = K_1 K_2$, since $K_1 K_2 = K_2 K_1$.

Example 1: Let $A_{r\times c}$ be an r × c matrix with $A^tA$ nonsingular. Then $K = A\left(A^tA\right)^{-1}A^t$ is an idempotent matrix since

$$ KK = A\left(A^tA\right)^{-1}A^t A\left(A^tA\right)^{-1}A^t = A\left[\left(A^tA\right)^{-1}\left(A^tA\right)\right]\left(A^tA\right)^{-1}A^t = A\,I\left(A^tA\right)^{-1}A^t = K. $$

Note: A matrix A satisfying $A^2 = 0$ is called nilpotent, and one for which $A^2 = I$ could be called unipotent.

Example 2 (the original numbers are partly illegible; the following pair has the stated properties):

$$ A = \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix}, \quad A^2 = 0, \quad \text{so A is nilpotent;} \qquad B = \begin{bmatrix} 1 & 0 \\ 2 & -1 \end{bmatrix}, \quad B^2 = I, \quad \text{so B is unipotent.} $$

Note: If K is an idempotent matrix, then $K - I$ might not be idempotent.

2.3 Orthogonal Matrices:

Definition of orthogonality: Two n × 1 vectors u and v are said to be orthogonal if

$$ u^t v = v^t u = 0. $$

A set of n × 1 vectors $x_1, x_2, \ldots, x_n$ is said to be orthonormal if

$$ x_i^t x_i = 1, \quad x_i^t x_j = 0, \quad i \neq j, \quad i,j = 1,2,\ldots,n. $$

Definition of orthogonal matrix: An n × n square matrix P is said to be orthogonal if $PP^t = P^tP = I_{n\times n}$.
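Example 1's matrix $K = A(A^tA)^{-1}A^t$ (in regression it is often called the hat matrix, a name not used in the notes) can be checked for idempotency numerically. A sketch (my illustration), using a 3 × 2 matrix A whose $A^tA$ is invertible, with the 2 × 2 inverse written out via the determinant formula:

```python
# Check that K = A (A^t A)^{-1} A^t satisfies K^2 = K.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(M):
    """Inverse of a 2 x 2 matrix [[a, b], [c, d]] via the ad - bc formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 0], [1, 1], [1, 2]]           # 3 x 2, columns linearly independent
AtA = matmul(transpose(A), A)          # [[3, 3], [3, 5]], nonsingular
K = matmul(matmul(A, inv2(AtA)), transpose(A))

KK = matmul(K, K)
max_err = max(abs(KK[i][j] - K[i][j]) for i in range(3) for j in range(3))
print(max_err < 1e-12)  # True: K^2 = K up to rounding, so K is idempotent
```

The comparison is made up to floating-point rounding, since `inv2` introduces non-integer entries.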
Note:

$$ PP^t = \begin{bmatrix} row_1(P)row_1^t(P) & row_1(P)row_2^t(P) & \cdots & row_1(P)row_n^t(P) \\ row_2(P)row_1^t(P) & row_2(P)row_2^t(P) & \cdots & row_2(P)row_n^t(P) \\ \vdots & \vdots & & \vdots \\ row_n(P)row_1^t(P) & row_n(P)row_2^t(P) & \cdots & row_n(P)row_n^t(P) \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \begin{bmatrix} col_1^t(P)col_1(P) & col_1^t(P)col_2(P) & \cdots & col_1^t(P)col_n(P) \\ col_2^t(P)col_1(P) & col_2^t(P)col_2(P) & \cdots & col_2^t(P)col_n(P) \\ \vdots & \vdots & & \vdots \\ col_n^t(P)col_1(P) & col_n^t(P)col_2(P) & \cdots & col_n^t(P)col_n(P) \end{bmatrix} = P^tP. $$

Thus

$$ row_i(P)row_i^t(P) = 1, \quad row_i(P)row_j^t(P) = 0, \quad col_i^t(P)col_i(P) = 1, \quad col_i^t(P)col_j(P) = 0, \quad i \neq j, $$

so $\left\{row_1^t(P), row_2^t(P), \ldots, row_n^t(P)\right\}$ and $\left\{col_1(P), col_2(P), \ldots, col_n(P)\right\}$ are both orthonormal sets!

Example 1:

(a) Helmert matrices: The Helmert matrix of order n has the first row

$$ \begin{bmatrix} 1/\sqrt{n} & 1/\sqrt{n} & \cdots & 1/\sqrt{n} \end{bmatrix}, $$

and its i'th row ($i = 2, 3, \ldots, n$) has the form

$$ \begin{bmatrix} \underbrace{\tfrac{1}{\sqrt{(i-1)i}} \;\; \cdots \;\; \tfrac{1}{\sqrt{(i-1)i}}}_{i-1 \text{ items}} & -\tfrac{i-1}{\sqrt{(i-1)i}} & \underbrace{0 \;\; \cdots \;\; 0}_{n-i \text{ items}} \end{bmatrix}. $$

For example, for $n = 4$,

$$ H_4 = \begin{bmatrix} 1/\sqrt{4} & 1/\sqrt{4} & 1/\sqrt{4} & 1/\sqrt{4} \\ 1/\sqrt{2} & -1/\sqrt{2} & 0 & 0 \\ 1/\sqrt{6} & 1/\sqrt{6} & -2/\sqrt{6} & 0 \\ 1/\sqrt{12} & 1/\sqrt{12} & 1/\sqrt{12} & -3/\sqrt{12} \end{bmatrix}. $$

In statistics, we can use H to find a set of uncorrelated random variables. Suppose $Z_1, Z_2, Z_3, Z_4$ are random variables with

$$ Cov(Z_i, Z_j) = 0, \; i \neq j, \qquad Cov(Z_i, Z_i) = \sigma^2, \quad i,j = 1,2,3,4. $$

Let

$$ X = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix} = H_4 Z = \begin{bmatrix} \left(Z_1 + Z_2 + Z_3 + Z_4\right)/\sqrt{4} \\ \left(Z_1 - Z_2\right)/\sqrt{2} \\ \left(Z_1 + Z_2 - 2Z_3\right)/\sqrt{6} \\ \left(Z_1 + Z_2 + Z_3 - 3Z_4\right)/\sqrt{12} \end{bmatrix}. $$

Then

$$ Cov(X_i, X_j) = \sigma^2\, row_i(H_4)\, row_j^t(H_4) = 0, \quad i \neq j, $$

since $\left\{row_1^t(H_4), row_2^t(H_4), row_3^t(H_4), row_4^t(H_4)\right\}$ is an orthonormal set of vectors. That is, $X_1, X_2, X_3, X_4$ are uncorrelated random variables. Also,

$$ X_2^2 + X_3^2 + X_4^2 = \sum_{i=1}^{4} \left(Z_i - \bar{Z}\right)^2, \quad \text{where} \quad \bar{Z} = \frac{\sum_{i=1}^{4} Z_i}{4}. $$

(b) Givens matrices: Let the orthogonal matrix

$$ G = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix}. $$

G is referred to as a Givens matrix of order 2.
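The Helmert construction can be coded directly from the row pattern above. A sketch (my illustration, standard library only) that builds $H_n$ and checks $H H^t = I$ for $n = 4$:

```python
import math

def helmert(n):
    """Helmert matrix of order n: first row all 1/sqrt(n); row i (i >= 2) has
    i-1 entries 1/sqrt((i-1)i), then -(i-1)/sqrt((i-1)i), then n-i zeros."""
    H = [[1 / math.sqrt(n)] * n]
    for i in range(2, n + 1):
        d = math.sqrt((i - 1) * i)
        H.append([1 / d] * (i - 1) + [-(i - 1) / d] + [0.0] * (n - i))
    return H

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

H4 = helmert(4)
HHt = matmul(H4, [list(col) for col in zip(*H4)])  # H4 times its transpose

# HHt should differ from the 4 x 4 identity only by rounding error:
ok = all(abs(HHt[i][j] - (1.0 if i == j else 0.0)) < 1e-12
         for i in range(4) for j in range(4))
print(ok)  # True: the rows of H4 form an orthonormal set
```

The same check passes for any order n, which is exactly the orthogonality $H H^t = H^t H = I$ used in the covariance argument above.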
For a Givens matrix of order 3, there are $\binom{3}{2} = 3$ different forms:

$$ G_{12} = \begin{bmatrix} \cos(\theta) & \sin(\theta) & 0 \\ -\sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad G_{13} = \begin{bmatrix} \cos(\theta) & 0 & \sin(\theta) \\ 0 & 1 & 0 \\ -\sin(\theta) & 0 & \cos(\theta) \end{bmatrix}, \quad G_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(\theta) & \sin(\theta) \\ 0 & -\sin(\theta) & \cos(\theta) \end{bmatrix}. $$

The general form of a Givens matrix $G_{ij}$ of order 3 is an identity matrix except for 4 elements: $\cos(\theta)$, $\sin(\theta)$, and $-\sin(\theta)$ appear in the i'th and j'th rows and columns.

Similarly, for a Givens matrix of order 4, there are $\binom{4}{2} = 6$ different forms:

$$ G_{12} = \begin{bmatrix} \cos\theta & \sin\theta & 0 & 0 \\ -\sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad G_{13} = \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad G_{14} = \begin{bmatrix} \cos\theta & 0 & 0 & \sin\theta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sin\theta & 0 & 0 & \cos\theta \end{bmatrix}, $$

$$ G_{23} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad G_{24} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & 0 & \sin\theta \\ 0 & 0 & 1 & 0 \\ 0 & -\sin\theta & 0 & \cos\theta \end{bmatrix}, \quad G_{34} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos\theta & \sin\theta \\ 0 & 0 & -\sin\theta & \cos\theta \end{bmatrix}. $$

For the Givens matrix of order n, there are $\binom{n}{2}$ different forms. The general form of $G_{rs} = [g_{ij}]$ is an identity matrix except for 4 elements:

$$ g_{rr} = g_{ss} = \cos(\theta), \quad g_{rs} = \sin(\theta), \quad g_{sr} = -\sin(\theta), \quad r < s. $$

2.4 Positive Definite Matrices:

Definition of positive definite matrix: A symmetric n × n matrix A satisfying

$$ x_{1\times n}^t A_{n\times n} x_{n\times 1} > 0 \quad \text{for all} \quad x \neq 0 $$

is referred to as a positive definite (p.d.) matrix.

Intuition: If $ax^2 > 0$ for all real numbers $x \neq 0$, then the real number a is positive. Similarly, if x is an n × 1 vector, A is an n × n matrix, and $x^t A x > 0$, then the matrix A is "positive."

Note: A symmetric n × n matrix A satisfying

$$ x_{1\times n}^t A_{n\times n} x_{n\times 1} \geq 0 \quad \text{for all} \quad x \neq 0 $$

is referred to as a positive semidefinite (p.s.d.) matrix.

Example 1: Let

$$ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \quad \text{and} \quad l = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}. $$

Thus

$$ \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = x^t I x - \frac{1}{n}\left(l^t x\right)^t\left(l^t x\right) = x^t I x - \frac{1}{n}\, x^t l\, l^t x = x^t \left( I - \frac{l\,l^t}{n} \right) x, $$

since $n\bar{x}^2 = \frac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right)^2 = \frac{1}{n}\left(l^t x\right)^2$.

Let

$$ A = I - \frac{l\,l^t}{n}. $$

Then A is positive semidefinite since, for $x \neq 0$,

$$ x^t A x = \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2 \geq 0. $$
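Example 1's matrix $A = I - l\,l^t/n$ is the centering matrix, and the identity $x^t A x = \sum_i (x_i - \bar{x})^2$ can be checked for a concrete vector. A sketch (my illustration, not from the notes):

```python
# Check x^t (I - l l^t / n) x = sum_i (x_i - xbar)^2 for one vector x.

def centering(n):
    """A = I - l l^t / n, where l is the n x 1 vector of ones."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]

def quad_form(A, x):
    """The quadratic form x^t A x for a square matrix A and a vector x."""
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return sum(xi * axi for xi, axi in zip(x, Ax))

x = [2.0, 4.0, 6.0, 8.0]
n = len(x)
xbar = sum(x) / n
direct = sum((xi - xbar) ** 2 for xi in x)   # 20.0 for this x

print(abs(quad_form(centering(n), x) - direct) < 1e-9)  # True
```

Since the quadratic form equals a sum of squares, it is nonnegative for every x, which is exactly the positive semidefiniteness claimed above (it is not positive definite: $x = l$ gives $x^t A x = 0$).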