1. Working with Matrices and Vectors

Defn 1.1. A column of real numbers is called a vector.

Examples:   Y = [Y_1; Y_2; ... ; Y_n],    a = [1; 1; 1].

Since Y has n elements it is said to have order (or dimension) n.

Defn 1.2. A rectangular array of elements with m rows and k columns is called an m x k matrix:

    A = [ a_11  a_12  ...  a_1k ;  a_21  a_22  ...  a_2k ;  ... ;  a_m1  a_m2  ...  a_mk ].

This matrix is said to be of order (or dimension) m x k, where m is the row order (dimension) and k is the column order (dimension).

Examples:

    A = [1 3; 0 4; 2 5]  (3 x 2),    I = [1 0 0; 0 1 0; 0 0 1]  (3 x 3),    B = [1 3; 2 6]  (2 x 2).

Defn 1.3. Matrix addition. If A and B are both m x k matrices, then

    C = A + B = [ a_11 + b_11  ...  a_1k + b_1k ;  ... ;  a_m1 + b_m1  ...  a_mk + b_mk ].

Notation: C_{m x k} = {c_ij} where c_ij = a_ij + b_ij.

Defn 1.4. Matrix subtraction. If A and B are m x k matrices, then C = A - B is defined by C = {c_ij} where c_ij = a_ij - b_ij.

Examples:

    [3 6; 2 1] + [7 -4; -3 2] = [10 2; -1 3]

    [1 1 1; 1 1 0] - [1 1 2; 0 1 -1] = [0 0 -1; 1 0 1]

Defn 1.5. Scalar multiplication. Let a be a scalar and B = {b_ij} be an m x k matrix; then a B = B a = {a b_ij}.

Example:   2 [2 0; 4 1; 3 2] = [4 0; 8 2; 6 4].

Defn 1.6. Transpose. The transpose of the m x k matrix A = {a_ij} is the k x m matrix with elements {a_ji}. The transpose of A is denoted by A^T (or A').

Example:   A = [1 4; 3 0; 2 6],    A^T = [1 3 2; 4 0 6].

Defn 1.7. If a matrix has the same number of rows and columns it is called a square matrix,

    A_{k x k} = [ a_11  ...  a_1k ;  ... ;  a_k1  ...  a_kk ],

and it is said to have order (or dimension) k.

Defn 1.8. A square matrix A = {a_ij} is symmetric if A = A^T, that is, if a_ij = a_ji for all (i, j).

Defn 1.9. Inner product (crossproduct) of two vectors of order n:

    a^T Y = [a_1, a_2, ..., a_n] [Y_1; Y_2; ...; Y_n] = a_1 Y_1 + a_2 Y_2 + ... + a_n Y_n = sum_{j=1}^n a_j Y_j.

Note that a^T Y = Y^T a.

Defn 1.10. Euclidean distance (or length of a vector):

    ||Y|| = (Y^T Y)^{1/2} = ( sum_{j=1}^n Y_j^2 )^{1/2}.

Defn 1.11. Matrix multiplication. The product of an n x k matrix A and a k x m matrix B is the n x m matrix C = {c_ij} with elements

    c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_ik b_kj.

Example:   A = [3 0 -2; 1 -1 4],   B = [1 1; 1 2; 1 3],   C = AB = [1 -3; 4 11].

Defn 1.12. Elementwise multiplication of two matrices of the same order:

    A # B = [ a_11 b_11  ...  a_1m b_1m ;  ... ;  a_k1 b_k1  ...  a_km b_km ].

Example:   [3 1; 2 4; 0 6] # [1 5; 3 4; 2 2] = [3 5; 6 16; 0 12].

Defn 1.13. Kronecker product of two matrices A_{k x m} and B_{n x s}:

    A ⊗ B = [ a_11 B  a_12 B  ...  a_1m B ;  a_21 B  a_22 B  ...  a_2m B ;  ... ;  a_k1 B  a_k2 B  ...  a_km B ],

a matrix of order kn x ms.

Examples:

    [2 4; 0 -2; 3 -1] ⊗ [5 3; 2 1] =
        [ 10  6  20  12
           4  2   8   4
           0  0 -10  -6
           0  0  -4  -2
          15  9  -5  -3
           6  3  -2  -1 ]

    a ⊗ Y = [a_1; a_2; a_3] ⊗ [Y_1; Y_2] = [a_1 Y_1; a_1 Y_2; a_2 Y_1; a_2 Y_2; a_3 Y_1; a_3 Y_2].
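The following short R sketch is not part of the original notes; it simply builds the Kronecker product directly from the block layout in Defn 1.13 and compares it with the built-in kronecker() function, using the same matrices as the example above.

# Not part of the original notes: construct A (x) B block by block
# (Defn 1.13) and compare with kronecker().
a <- matrix(c(2,  4,
              0, -2,
              3, -1), ncol = 2, byrow = TRUE)
b <- matrix(c(5, 3,
              2, 1), 2, 2, byrow = TRUE)

kron.blocks <- function(A, B) {
  out <- matrix(0, nrow(A) * nrow(B), ncol(A) * ncol(B))
  for (i in 1:nrow(A)) {
    for (j in 1:ncol(A)) {
      rows <- (i - 1) * nrow(B) + 1:nrow(B)   # rows occupied by the block a_ij B
      cols <- (j - 1) * ncol(B) + 1:ncol(B)   # columns occupied by the block a_ij B
      out[rows, cols] <- A[i, j] * B
    }
  }
  out
}
all.equal(kron.blocks(a, b), kronecker(a, b))   # TRUE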
#  This code is stored in the file
#
#      matrix.ssc
#
#---------------------------------------
#  Add and subtract matrices
#---------------------------------------

> a <- matrix(c(3, 6, 2, 1), 2, 2, byrow=T)
> a
     [,1] [,2]
[1,]    3    6
[2,]    2    1
> b <- matrix(c(7, -4, -3, 2), 2, 2, byrow=T)
> b
     [,1] [,2]
[1,]    7   -4
[2,]   -3    2
> a + b
     [,1] [,2]
[1,]   10    2
[2,]   -1    3
> a - b
     [,1] [,2]
[1,]   -4   10
[2,]    5   -1

#---------------------------------------
#  Multiplication by a scalar
#---------------------------------------

> c <- matrix(c(2, -1, 3, 0, 4, -2), 2, 3, byrow=T)
> c
     [,1] [,2] [,3]
[1,]    2   -1    3
[2,]    0    4   -2
> d <- 2*c
> d
     [,1] [,2] [,3]
[1,]    4   -2    6
[2,]    0    8   -4

#---------------------------------------
#  Transpose of a matrix
#---------------------------------------

> ct <- t(c)
> ct
     [,1] [,2]
[1,]    2    0
[2,]   -1    4
[3,]    3   -2

#---------------------------------------
#  Matrix multiplication
#---------------------------------------

> a <- matrix(c(3, 0, -2, 1, -1, 4), 2, 3, byrow=T)
> a
     [,1] [,2] [,3]
[1,]    3    0   -2
[2,]    1   -1    4
> b <- matrix(c(1, 1, 1, 2, 1, 3), 3, 2, byrow=T)
> b
     [,1] [,2]
[1,]    1    1
[2,]    1    2
[3,]    1    3
> c <- a %*% b
> c
     [,1] [,2]
[1,]    1   -3
[2,]    4   11

#---------------------------------------
#  Inner product
#---------------------------------------

> x <- c(1, 7, -6, 4)
> y <- c(2, -2, 1, 5)
> crossprod(x, y)
     [,1]
[1,]    2
> x
[1]  1  7 -6  4
> y
[1]  2 -2  1  5
> t(x) %*% y
     [,1]
[1,]    2
> x %*% y
     [,1]
[1,]    2

#---------------------------------------
#  Length of a vector
#---------------------------------------

> ynorm <- sqrt(crossprod(y, y))
> ynorm
         [,1]
[1,] 5.830952

#---------------------------------------
#  Number of elements in a vector
#---------------------------------------

> length(y)
[1] 4

#---------------------------------------
#  Elementwise multiplication
#---------------------------------------

> a <- matrix(c(3, 6, 2, 1), 2, 2, byrow=T)
> a
     [,1] [,2]
[1,]    3    6
[2,]    2    1
> b <- matrix(c(7, -4, -3, 2), 2, 2, byrow=T)
> b
     [,1] [,2]
[1,]    7   -4
[2,]   -3    2
> a*b
     [,1] [,2]
[1,]   21  -24
[2,]   -6    2

#---------------------------------------
#  What happens when the dimensions
#  of the matrices or vectors are
#  not appropriate for the operation
#---------------------------------------

> a <- matrix(c(1, 1, 1, 2), 2, 2, byrow=T)
> b <- matrix(c(3, 0, -2, 1, -1, 4), 2, 3, byrow=T)
> a
     [,1] [,2]
[1,]    1    1
[2,]    1    2
> b
     [,1] [,2] [,3]
[1,]    3    0   -2
[2,]    1   -1    4
> a + b
Error in a + b: Dimension attributes do not match
> b + a
Error in b + a: Dimension attributes do not match
> a %*% b
     [,1] [,2] [,3]
[1,]    4   -1    2
[2,]    5   -2    6
> b %*% a
Error in "%*%.default"(b, a): Number of columns of x should be
        the same as number of rows of y
> a*b
Error in a * b: Dimension attributes do not match

#---------------------------------------
#  Kronecker product
#---------------------------------------

> a <- matrix(c(2, 4, 0, -2, 3, -1), ncol=2, byrow=T)
> a
     [,1] [,2]
[1,]    2    4
[2,]    0   -2
[3,]    3   -1
> b <- matrix(c(5, 3, 2, 1), 2, 2, byrow=T)
> b
     [,1] [,2]
[1,]    5    3
[2,]    2    1
> kronecker(a, b)
     [,1] [,2] [,3] [,4]
[1,]   10    6   20   12
[2,]    4    2    8    4
[3,]    0    0  -10   -6
[4,]    0    0   -4   -2
[5,]   15    9   -5   -3
[6,]    6    3   -2   -1

Defn 1.14. The determinant of an n x n matrix A is

    |A| = sum_{j=1}^n a_ij (-1)^{i+j} |M_ij|    for any row i,

or

    |A| = sum_{i=1}^n a_ij (-1)^{i+j} |M_ij|    for any column j,

where M_ij is the "minor" for a_ij obtained by deleting the i-th row and j-th column from A.

Example: for A = [a_11 a_12; a_21 a_22],

    |A| = a_11 (-1)^{1+1} |a_22| + a_12 (-1)^{1+2} |a_21| = a_11 a_22 - a_12 a_21,

so, for instance,

    |7 2; 4 5| = (7)(5) - (2)(4) = 27.

Example: expanding along the first row of a 3 x 3 matrix,

    | a_11 a_12 a_13; a_21 a_22 a_23; a_31 a_32 a_33 |
        = a_11 (-1)^{1+1} |a_22 a_23; a_32 a_33|
        + a_12 (-1)^{1+2} |a_21 a_23; a_31 a_33|
        + a_13 (-1)^{1+3} |a_21 a_22; a_31 a_32|.

Then

    |1 2 3; 4 5 6; 7 8 9| = (1)(-1)^{1+1}|5 6; 8 9| + (2)(-1)^{1+2}|4 6; 7 9| + (3)(-1)^{1+3}|4 5; 7 8|
                          = (1)(-3) - (2)(-6) + (3)(-3) = 0,

while

    |1 2 3; 4 5 6; 7 8 10| = (1)(2) - (2)(-2) + (3)(-3) = -3.
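The R function below is not part of the original notes; it is a minimal recursive implementation of the cofactor expansion in Defn 1.14 (along the first row), checked against the matrix used in the example above. The comparison with det() assumes a current version of R, where a built-in determinant function is available.

# Not part of the original notes: cofactor expansion along the first row.
cofactor.det <- function(A) {
  n <- nrow(A)
  if (n == 1) return(A[1, 1])
  total <- 0
  for (j in 1:n) {
    Mij <- A[-1, -j, drop = FALSE]                        # minor: delete row 1, column j
    total <- total + A[1, j] * (-1)^(1 + j) * cofactor.det(Mij)
  }
  total
}
w <- matrix(c(1,2,3, 4,5,6, 7,8,10), 3, 3, byrow = TRUE)
cofactor.det(w)   # -3, matching the example above
det(w)            # built-in determinant in current R, same value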
Defn 1.15. A set of n-dimensional vectors Y_1, Y_2, ..., Y_k is linearly independent if there is no set of scalars a_1, a_2, ..., a_k such that

    0 = sum_{j=1}^k a_j Y_j

and at least one a_j is non-zero.

Properties of determinants:
(i)   |A^T| = |A|
(ii)  |A| = product of the eigenvalues of A
(iii) |AB| = |A| |B| when A and B are square matrices of the same order
(iv)  |[P X; 0 Q]| = |P| |Q| when P and Q are square matrices and 0 is a matrix of zeros
(v)   |AB| = |BA| when A and B are square matrices of the same order (so that both products are defined)
(vi)  |cA| = c^k |A| when c is a scalar and A is a k x k matrix

Example: the vectors

    Y_1 = [1; 0; 1],   Y_2 = [1; 2; 1],   Y_3 = [1; 1; 2]

are linearly independent.

Example: the vectors

    Y_1 = [1; 2; 1],   Y_2 = [1; 1; 1],   Y_3 = [1; 0; 1]

are not linearly independent because

    (1) Y_1 + (1) Y_3 + (-2) Y_2 = 0.

Any two of these vectors are linearly independent, and it is said that this set contains two linearly independent vectors.

Defn 1.16. The row rank of a matrix is the number of linearly independent rows, where each row is considered as a vector.

Example:

    A = [1 1 1; 2 5 -1; 0 1 -1]

The row rank of A is 2 because

    (-2)[1 1 1] + (1)[2 5 -1] + (-3)[0 1 -1] = [0 0 0]

and there are no scalars a_1 and a_2 such that

    a_1 [1 1 1] + a_2 [2 5 -1] = [0 0 0]

except for a_1 = a_2 = 0.

Defn 1.17. The column rank of a matrix is the number of linearly independent columns, with each column considered as a vector.

Example: the same matrix A = [1 1 1; 2 5 -1; 0 1 -1] has column rank 2 because

    (-2)[1; 2; 0] + (1)[1; 5; 1] + (1)[1; -1; -1] = [0; 0; 0]

and there are no scalars a_1 and a_2 such that

    a_1 [1; 2; 0] + a_2 [1; 5; 1] = [0; 0; 0]

except a_1 = a_2 = 0.

Result 1.1. The row rank and the column rank of a matrix are equal.

Defn 1.18. The rank of a matrix is either the row rank or the column rank of the matrix.

Defn 1.19. A square matrix A_{k x k} is nonsingular if its rank is equal to the number of rows (or columns). This is equivalent to the condition that A_{k x k} b_{k x 1} = 0_{k x 1} only when b = 0. A matrix that fails to be nonsingular is called singular.

Result 1.2. If B and C are nonsingular matrices and the products with A are defined, then rank(BA) = rank(AC) = rank(A).

Result 1.3. rank(A^T A) = rank(A A^T) = rank(A) = rank(A^T).

Defn 1.20. The identity matrix, denoted by I, is a k x k matrix with ones on the diagonal and zeros elsewhere:

    I = [ 1 0 0 ... 0 0;  0 1 0 ... 0 0;  ... ;  0 0 0 ... 1 0;  0 0 0 ... 0 1 ].

Defn 1.21. The inverse of a square, nonsingular matrix A is the matrix, denoted by A^{-1}, such that A A^{-1} = A^{-1} A = I.

Example:

    [2 4; 1 6]^{-1} = [6/8 -4/8; -1/8 2/8].

Result 1.4.
(i) The inverse of A = [a_11 a_12; a_21 a_22] is

    A^{-1} = (1/|A|) [a_22 -a_12; -a_21 a_11].

(ii) In general, the (i, j) element of A^{-1} is

    (-1)^{i+j} |A_{ji}| / |A|,

where A_{ji} is the matrix obtained by deleting the j-th row and i-th column of A.

Result 1.5. For a k x k matrix A, the following are equivalent:
(i)   A is nonsingular
(ii)  |A| ≠ 0
(iii) A^{-1} exists

Result 1.6. For k x k nonsingular matrices A and B:
(i)   (A^T)^{-1} = (A^{-1})^T
(ii)  (AB)^{-1} = B^{-1} A^{-1}
(iii) |A^{-1}| = 1/|A|
(iv)  A^{-1} is unique and nonsingular
(v)   (A^{-1})^{-1} = A
(vi)  if A is symmetric, then A^{-1} is symmetric

Result 1.7. Inverse of a diagonal matrix:

    diag(a_11, a_22, ..., a_kk)^{-1} = diag(1/a_11, 1/a_22, ..., 1/a_kk).

Result 1.8. If B is a k x k non-singular matrix and B + c c^T is non-singular, then

    (B + c c^T)^{-1} = B^{-1} - (B^{-1} c c^T B^{-1}) / (1 + c^T B^{-1} c).

Result 1.9. Let I_n be an n x n identity matrix and let J_n = 1 1^T be an n x n matrix in which each element is one. Then

    (a I_n + b J_n)^{-1} = (1/a) ( I_n - (b / (a + n b)) J_n ).
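The R sketch below is not part of the original notes; it numerically checks Results 1.8 and 1.9. The particular matrix B, vector c, and constants a and b are arbitrary choices made only for the illustration.

# Not part of the original notes: numerical check of Results 1.8 and 1.9.
set.seed(1)
n <- 4
B  <- crossprod(matrix(rnorm(16), 4, 4)) + diag(4)   # an arbitrary nonsingular matrix
cc <- rnorm(4)                                       # an arbitrary vector c

# Result 1.8: (B + c c')^{-1} = B^{-1} - B^{-1} c c' B^{-1} / (1 + c' B^{-1} c)
lhs  <- solve(B + cc %*% t(cc))
Binv <- solve(B)
rhs  <- Binv - (Binv %*% cc %*% t(cc) %*% Binv) / drop(1 + t(cc) %*% Binv %*% cc)
all.equal(lhs, rhs)                                  # TRUE

# Result 1.9: (a I + b J)^{-1} = (1/a)(I - b/(a + n b) J)
a <- 2; b <- 0.5
I <- diag(n); J <- matrix(1, n, n)
all.equal(solve(a * I + b * J), (1/a) * (I - (b / (a + n * b)) * J))   # TRUE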
Defn 1.22. The trace of a k x k matrix A = {a_ij} is the sum of the diagonal elements:

    tr(A) = sum_{j=1}^k a_jj.

Result 1.10. Let A and B denote k x k matrices and let c be a scalar. Then
(i)   tr(cA) = c tr(A)
(ii)  tr(A + B) = tr(A) + tr(B)
(iii) tr(AB) = tr(BA)
(iv)  tr(B^{-1} A B) = tr(A)
(v)   tr(A A^T) = sum_{i=1}^k sum_{j=1}^k a_ij^2

#  This script is also stored in
#
#      matrix.ssc
#
#---------------------------------------
#  Create an nxn identity matrix
#---------------------------------------

> diag(rep(1,4))
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1

#---------------------------------------
#  Trace of a matrix
#---------------------------------------

> w <- matrix(c(1,2,3,4,5,6,7,8,10), 3, 3, byrow=T)
> w
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8   10
> tr <- sum(diag(w))
> tr
[1] 16

#---------------------------------------
#  Inverse of a matrix
#---------------------------------------

> winv <- solve(w)
> winv
           [,1]      [,2] [,3]
[1,] -0.6666667 -1.333333    1
[2,] -0.6666667  3.666667   -2
[3,]  1.0000000 -2.000000    1
> w %*% winv
             [,1]         [,2]         [,3]
[1,] 1.000000e+00 4.440892e-15 2.664535e-15
[2,] 8.881784e-16 1.000000e+00 8.881784e-16
[3,] 0.000000e+00 0.000000e+00 1.000000e+00

#---------------------------------------
#  Determinant of a matrix
#---------------------------------------

# Build your own function
> determ <- function(M) Re(prod(eigen(M, only.values=T)$values))
> determ(w)
[1] -3

# Another function (V&R, page 101)
> absdet <- function(M) abs(prod(diag(qr(M)$qr)))
> absdet(w)
[1] 3

# Another example
> x1 <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, byrow=T)
> x1
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
> determ(x1)
[1] 3.154999e-15
> absdet(x1)
[1] 1.631688e-15

#---------------------------------------
#  Rank of a matrix: use the "qr"
#  function (V&R on p. 101)
#---------------------------------------

> A <- matrix(c(1, 1,  1,
+               2, 5, -1,
+               0, 1, -1), 3, 3, byrow=T)
> A
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    5   -1
[3,]    0    1   -1
> qr(A)$rank
[1] 2

# Another example
> A <- matrix(c(1, 1,  1,
+               2, 5, -1,
+               0, 1,  1), 3, 3, byrow=T)
> A
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    5   -1
[3,]    0    1    1
> qr(A)$rank
[1] 3

# Another example
> X <- matrix(c(1,1,0,0,
+               1,1,0,0,
+               1,0,1,0,
+               1,0,1,0,
+               1,0,0,1,
+               1,0,0,1), ncol=4, byrow=T)
> X
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    1    0    0
[3,]    1    0    1    0
[4,]    1    0    1    0
[5,]    1    0    0    1
[6,]    1    0    0    1
> qr(X)$rank
[1] 3

# Note that the sum of squares
# and crossproducts matrix has
# the same rank as X
> XtX <- t(X) %*% X
> XtX
     [,1] [,2] [,3] [,4]
[1,]    6    2    2    2
[2,]    2    2    0    0
[3,]    2    0    2    0
[4,]    2    0    0    2
> qr(XtX)$rank
[1] 3

# This is a square symmetric matrix
# but the inverse does not exist
> solve(XtX)
Problem in solve.qr(a): apparently singular matrix
Use traceback() to see the call stack

# Note that the function "rank" in Splus
# is related to sorting.  It computes the
# ranks of the elements of a vector.
# (V&R on page 45)
> rank(c(1.2, 5.1, 3.5, 9.8))
[1] 1 3 2 4
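The R sketch below is not part of the original notes; it checks Result 1.2 (multiplying by a nonsingular matrix does not change the rank) with the rank-2 matrix used above. The matrix B is an arbitrary nonsingular choice made only for this illustration.

# Not part of the original notes: a check of Result 1.2.
A <- matrix(c(1, 1,  1,
              2, 5, -1,
              0, 1, -1), 3, 3, byrow = TRUE)   # rank 2 (see above)
B <- matrix(c(1, 0, 1,
              0, 2, 1,
              1, 1, 1), 3, 3, byrow = TRUE)    # nonsingular (determinant -1)
qr(A)$rank          # 2
qr(B %*% A)$rank    # still 2
qr(A %*% B)$rank    # still 2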
#---------------------------------------
#  Compute row sums or column sums
#---------------------------------------

> sum(w)
[1] 46
> apply(w, 1, sum)
[1]  6 15 25
> apply(w, 2, sum)
[1] 12 15 19
> apply(w, 1, prod)
[1]   6 120 560
> apply(w, 1, mean)
[1] 2.000000 5.000000 8.333333
> apply(w, 1, var)
[1] 1.000000 1.000000 2.333333

Defn 1.23. A square matrix A is said to be orthogonal if

    A A^T = A^T A = I    (then A^{-1} = A^T).

Examples:

    A = [ 1/sqrt(2)   1/sqrt(2) ;  1/sqrt(2)  -1/sqrt(2) ]

    A = [ 1/2   1/2   1/2   1/2 ;
          1/2  -1/2   1/2  -1/2 ;
          1/2   1/2  -1/2  -1/2 ;
          1/2  -1/2  -1/2   1/2 ]

In each case the columns of A are coefficients for orthogonal contrasts.

Defn 1.24. A square matrix P is idempotent if P P = P.

Example:

    P = [ 5/6   2/6  -1/6 ;
          2/6   2/6   2/6 ;
         -1/6   2/6   5/6 ]

Example (linear regression):  Y = X β + ε.  The least squares estimator is

    b = (X^T X)^{-1} X^T Y,

the estimated means are

    Y-hat = X (X^T X)^{-1} X^T Y,

and the residuals are

    e = (I - X (X^T X)^{-1} X^T) Y.

Both X (X^T X)^{-1} X^T and I - X (X^T X)^{-1} X^T are idempotent matrices.
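The R sketch below is not part of the original notes; it checks the idempotency of the matrices in the regression example. The design matrix X (an intercept plus x = -1, 0, 1) is an illustrative choice; with it, X (X'X)^{-1} X' reproduces the 3 x 3 matrix P displayed above.

# Not part of the original notes: the "hat" matrix is idempotent.
X <- cbind(1, c(-1, 0, 1))                 # illustrative design matrix
P <- X %*% solve(t(X) %*% X) %*% t(X)      # X (X'X)^{-1} X'
round(P, 4)                                # entries 5/6, 2/6, -1/6, ...
all.equal(P %*% P, P)                      # P P = P
M <- diag(3) - P                           # I - X (X'X)^{-1} X'
all.equal(M %*% M, M)                      # also idempotent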
Defn 1.25. Let A be a k x k matrix and let Y be a vector of order k; then

    Y^T A Y = sum_{i=1}^k sum_{j=1}^k Y_i Y_j a_ij

is called a quadratic form.

Defn 1.26. A k x k matrix A is said to be positive definite if Y^T A Y > 0 for any Y = (Y_1, ..., Y_k)^T ≠ 0.

Defn 1.27. A k x k matrix A is said to be nonnegative definite (or positive semi-definite) if Y^T A Y ≥ 0 for any Y = (Y_1, ..., Y_k)^T.

Eigenvalues and Eigenvectors

Defn 1.28. For a k x k matrix A, the scalars λ_1, λ_2, ..., λ_k satisfying the polynomial equation |A - λ I| = 0 are called the eigenvalues (or characteristic roots) of A.

Defn 1.29. Corresponding to any eigenvalue λ_i is an eigenvector (or characteristic vector) u_i ≠ 0 satisfying A u_i = λ_i u_i.

Comment: eigenvectors are not unique.
(i)  If u_i is an eigenvector for λ_i, then c u_i is also an eigenvector for any scalar c ≠ 0.
(ii) We will adopt the following conventions (for real symmetric matrices):
        u_i^T u_i = 1  for all i = 1, ..., k
        u_i^T u_j = 0  for all i ≠ j
(iii) Even with (ii), eigenvectors are not unique.  If u_i is an eigenvector satisfying (ii), then -u_i is also an eigenvector satisfying (ii).  If λ_i = λ_j, then there are an infinite number of choices for u_i and u_j.

Example:

    A = [1.96 0.72; 0.72 1.54]

The eigenvalues are solutions to

    0 = |A - λ I| = |1.96-λ  0.72; 0.72  1.54-λ|
      = (1.96 - λ)(1.54 - λ) - (0.72)^2
      = λ^2 - 3.5 λ + 2.5,

a quadratic a λ^2 + b λ + c with a = 1, b = -3.5, c = 2.5.  Using the solutions to a quadratic equation,

    λ = ( -b ± sqrt(b^2 - 4ac) ) / (2a) = ( 3.5 ± sqrt(12.25 - 10) ) / 2,

so λ_1 = 2.5 and λ_2 = 1.

Find the eigenvectors:  A u_i = λ_i u_i.  For λ_1 = 2.5,

    [1.96 0.72; 0.72 1.54] [u_11; u_12] = 2.5 [u_11; u_12]

    1.96 u_11 + 0.72 u_12 = 2.5 u_11
    0.72 u_11 + 1.54 u_12 = 2.5 u_12    =>    u_12 = 0.75 u_11,

so u_1 = [c; 0.75 c] for some c.  To satisfy our convention we must have

    1 = u_1^T u_1 = c^2 + 0.5625 c^2.

Consequently, c = 0.8 or c = -0.8, and

    u_1 = [0.8; 0.6]    or    u_1 = [-0.8; -0.6].

Find an eigenvector for λ_2 = 1:

    [1.96 0.72; 0.72 1.54] [u_21; u_22] = (1) [u_21; u_22]

    1.96 u_21 + 0.72 u_22 = u_21
    0.72 u_21 + 1.54 u_22 = u_22    =>    u_22 = -(4/3) u_21,

so u_2 = [c; -(4/3) c].  To satisfy our convention, we must have

    1 = u_2^T u_2 = c^2 + (16/9) c^2.

Consequently, c = 0.6 or c = -0.6, and

    u_2 = [0.6; -0.8]    or    u_2 = [-0.6; 0.8].

In either case, u_1^T u_2 = 0.

Result 1.11. For a k x k symmetric matrix A with elements that are real numbers:
(i)   every eigenvalue of A is a real number
(ii)  rank(A) = number of non-zero eigenvalues
(iii) if A is non-negative definite, then λ_i ≥ 0 for all i = 1, 2, ..., k
(iv)  if A is positive definite, then λ_i > 0 for all i = 1, 2, ..., k
(v)   trace(A) = sum_{i=1}^k a_ii = sum_{i=1}^k λ_i
(vi)  |A| = prod_{i=1}^k λ_i
(vii) if A is idempotent (A A = A), then the eigenvalues are either zero or one

Result 1.12. Spectral decomposition. The spectral decomposition of a k x k symmetric matrix A with eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_k and eigenvectors u_1, u_2, ..., u_k (with u_i^T u_i = 1 and u_i^T u_j = 0) is

    A = λ_1 u_1 u_1^T + λ_2 u_2 u_2^T + ... + λ_k u_k u_k^T = U D U^T,

where

    D = diag(λ_1, λ_2, ..., λ_k)

and

    U = [ u_1 | u_2 | ... | u_k ]

is an orthogonal matrix.

Result 1.13. If A is a k x k symmetric nonsingular matrix with spectral decomposition

    A = sum_{i=1}^k λ_i u_i u_i^T = U D U^T,

then

(i)  A^{-1} = sum_{i=1}^k λ_i^{-1} u_i u_i^T = U D^{-1} U^T;

(ii) the square root matrix

        A^{1/2} = sum_{i=1}^k sqrt(λ_i) u_i u_i^T = U D^{1/2} U^T

     has the properties
        (a) A^{1/2} A^{1/2} = A
        (b) A^{1/2} A^{-1/2} = A^{-1/2} A^{1/2} = I
        (c) A^{1/2} is symmetric;

(iii) the inverse square root matrix

        A^{-1/2} = sum_{i=1}^k (1/sqrt(λ_i)) u_i u_i^T = U D^{-1/2} U^T

     has the properties
        (a) A^{-1/2} A^{-1/2} = A^{-1}
        (b) A^{-1/2} A^{1/2} = A^{1/2} A^{-1/2} = I
        (c) A^{-1/2} is symmetric.

In parts (ii) and (iii), A should be positive definite to ensure that λ_k > 0.

Result 1.14. Singular value decomposition. Any p x q matrix A of rank r can be expressed as

    A = L [Δ 0; 0 0] M^T,

where
(i)  L_{p x p} and M_{q x q} are orthogonal matrices, and
(ii) Δ_{r x r} is a diagonal matrix whose diagonal elements (the singular values) are the square roots of the positive (non-zero) eigenvalues of A^T A and A A^T.

Note that A^T A and A A^T are non-negative definite, and suitable L and M matrices can always be found, but they are not unique.
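The R sketch below is not part of the original notes; it computes the square root matrix of Result 1.13(ii) from the spectral decomposition, using the 2 x 2 matrix from the worked example above, and verifies properties (a) and (c).

# Not part of the original notes: A^{1/2} = U D^{1/2} U' from Result 1.13(ii).
A <- matrix(c(1.96, 0.72,
              0.72, 1.54), 2, 2, byrow = TRUE)
eA <- eigen(A)
U  <- eA$vectors
A.half <- U %*% diag(sqrt(eA$values)) %*% t(U)   # U D^{1/2} U'
A.half
all.equal(A.half %*% A.half, A)                  # A^{1/2} A^{1/2} = A
all.equal(A.half, t(A.half))                     # A^{1/2} is symmetric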
#---------------------------------------
#  Eigenvalues & eigenvectors
#---------------------------------------

> A <- matrix(c(1.96, .72, .72, 1.54), 2, 2, byrow=T)
> A
     [,1] [,2]
[1,] 1.96 0.72
[2,] 0.72 1.54
> EA <- eigen(A)
> EA
$values:
[1] 2.5 1.0

$vectors:
     [,1] [,2]
[1,]  0.8 -0.6
[2,]  0.6  0.8

#---------------------------------------
#  Singular value decomposition
#---------------------------------------

> A <- matrix(c(2,0,1,1, 0,2,1,1, 1,1,1,1), ncol=4, byrow=T)
> A
     [,1] [,2] [,3] [,4]
[1,]    2    0    1    1
[2,]    0    2    1    1
[3,]    1    1    1    1
> svdA <- svd(A)
> svdA
$d:
[1] 3.464102 2.000000 0.000000

$v:
     [,1]          [,2] [,3]
[1,]  0.5  7.071068e-01  0.5
[2,]  0.5 -7.071068e-01  0.5
[3,]  0.5  1.226910e-16 -0.5
[4,]  0.5  9.065285e-17 -0.5

$u:
          [,1]          [,2]       [,3]
[1,] 0.5773503  7.071068e-01  0.4082483
[2,] 0.5773503 -7.071068e-01  0.4082483
[3,] 0.5773503  7.597547e-17 -0.8164966

> svdA$u %*% t(svdA$u)
             [,1]         [,2]         [,3]
[1,] 1.000000e+00 5.116079e-17 2.775558e-17
[2,] 5.116079e-17 1.000000e+00 3.405750e-17
[3,] 2.775558e-17 3.405750e-17 1.000000e+00
> t(svdA$v) %*% svdA$v
             [,1]         [,2]         [,3]
[1,] 1.000000e+00 9.310586e-18 3.089976e-18
[2,] 9.310586e-18 1.000000e+00 3.244475e-17
[3,] 3.089976e-18 3.244475e-17 1.000000e+00
> svdA$u %*% diag(svdA$d) %*% t(svdA$v)
             [,1]         [,2] [,3] [,4]
[1,] 2.000000e+00 1.557456e-16    1    1
[2,] 9.074772e-17 2.000000e+00    1    1
[3,] 1.000000e+00 1.000000e+00    1    1

# The squared singular values are the eigenvalues
# of A A^T and of A^T A
> diag(svdA$d) %*% diag(svdA$d)
     [,1] [,2] [,3]
[1,]   12    0    0
[2,]    0    4    0
[3,]    0    0    0
> eigen(A %*% t(A))$values
[1] 1.200000e+01 4.000000e+00 4.440892e-16
> eigen(t(A) %*% A)$values
[1]  1.200000e+01  4.000000e+00 -2.167786e-16 -3.238078e-15

# An example where the singular values
# are the eigenvalues
> A <- matrix(c(1.96, .72, .72, 1.54), 2, 2, byrow=T)
> A
     [,1] [,2]
[1,] 1.96 0.72
[2,] 0.72 1.54
> svdA <- svd(A)
> svdA
$d:
[1] 2.5 1.0

$v:
     [,1] [,2]
[1,]  0.8 -0.6
[2,]  0.6  0.8

$u:
     [,1] [,2]
[1,]  0.8 -0.6
[2,]  0.6  0.8
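The R sketch below is not part of the original notes; it uses the 3 x 4 matrix A from the first singular value decomposition example above to illustrate that the rank of a matrix can also be read off its SVD as the number of non-zero singular values (Result 1.14). The tolerance used to decide what counts as "zero" is an arbitrary choice.

# Not part of the original notes: rank from the singular values.
A <- matrix(c(2,0,1,1, 0,2,1,1, 1,1,1,1), ncol = 4, byrow = TRUE)
d <- svd(A)$d
d                            # 3.46, 2.00, and one numerically zero value
sum(d > 1e-8 * max(d))       # 2, the rank of A (the same answer as qr(A)$rank)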
#---------------------------------------
#  Trace and determinant of a matrix
#---------------------------------------

> A <- matrix(c(1, 1,  1,
+               2, 5, -1,
+               0, 1,  1), 3, 3, byrow=T)
> A
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    5   -1
[3,]    0    1    1
> traceA <- sum(diag(A))
> traceA
[1] 7
> eigenA <- eigen(A)
> eigenA
$values:
[1] 5.336912+0.0000000i 0.831544-0.6578603i 0.831544+0.6578603i

$vectors:
             [,1]                 [,2]                 [,3]
[1,] 0.2852888+0i 1.7077352+0.3055786i 1.7077352-0.3055786i
[2,] 1.0054394+0i 0.7100770-0.2551929i 0.7100770+0.2551929i
[3,] 0.2318330+0i 0.6234268-0.9197348i 0.6234268+0.9197348i

> traceA <- sum(eigenA$values)
> traceA
[1] 7+0i
> Re(traceA)
[1] 7
> detA <- Re(prod(eigenA$values))
> detA
[1] 6

# An example where the eigenvalues
# are real numbers
> A <- matrix(c(1, 1,  1,
+               2, 5, -1,
+               0, 1, -1), 3, 3, byrow=T)
> A
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    5   -1
[3,]    0    1   -1
> eigenA <- eigen(A)
> eigenA
$values:
[1]  5.372281e+00 -3.722813e-01 -1.405092e-16

$vectors:
          [,1]      [,2]       [,3]
[1,] 0.2608539 -3.269275 -0.8164966
[2,] 0.9858217  1.730136  0.4082483
[3,] 0.1547047  2.756228  0.4082483

> traceA <- sum(eigenA$values)
> traceA
[1] 5
> detA <- Re(prod(eigenA$values))
> detA
[1] 2.810183e-16

#---------------------------------------
#  Eigenvalues of a square symmetric matrix
#---------------------------------------

> A <- matrix(c(4, 2, -1, 2, 6, -4, -1, -4, 9), 3, 3, byrow=T)
> A
     [,1] [,2] [,3]
[1,]    4    2   -1
[2,]    2    6   -4
[3,]   -1   -4    9
> EA <- eigen(A)
> EA
$values:
[1] 12.245772  4.433349  2.320879

$vectors:
           [,1]      [,2]       [,3]
[1,] -0.2347350 0.7321107  0.6394634
[2,] -0.5764345 0.4248579 -0.6980108
[3,]  0.7827022 0.5324563 -0.3222848

> SVDA <- svd(A)
> SVDA
$d:
[1] 12.245772  4.433349  2.320879

$v:
           [,1]      [,2]       [,3]
[1,] -0.2347350 0.7321107  0.6394634
[2,] -0.5764345 0.4248579 -0.6980108
[3,]  0.7827022 0.5324563 -0.3222848

$u:
           [,1]      [,2]       [,3]
[1,] -0.2347350 0.7321107  0.6394634
[2,] -0.5764345 0.4248579 -0.6980108
[3,]  0.7827022 0.5324563 -0.3222848

#---------------------------------------
#  An example of a square symmetric
#  matrix that is not positive definite
#---------------------------------------

> W <- matrix(c(4, 2, -1, 2, 6, -4, -1, -4, -9), 3, 3, byrow=T)
> W
     [,1] [,2] [,3]
[1,]    4    2   -1
[2,]    2    6   -4
[3,]   -1   -4   -9
> EW <- eigen(W)
> EW
$values:
[1]   8.151345   2.865783 -10.017128

$vectors:
           [,1]        [,2]      [,3]
[1,]  0.4665008  0.88381658 0.0352886
[2,]  0.8550024 -0.46079428 0.2379907
[3,] -0.2266009  0.08085101 0.9706262

> t(EW$vectors) %*% EW$vectors
             [,1]         [,2]         [,3]
[1,] 1.000000e+00 1.353084e-16 1.665335e-16
[2,] 1.353084e-16 1.000000e+00 6.938894e-17
[3,] 1.665335e-16 6.938894e-17 1.000000e+00

> SVDW <- svd(W)
> SVDW
$d:
[1] 10.017128  8.151345  2.865783

$v:
          [,1]       [,2]        [,3]
[1,] 0.0352886  0.4665008  0.88381658
[2,] 0.2379907  0.8550024 -0.46079428
[3,] 0.9706262 -0.2266009  0.08085101

$u:
           [,1]       [,2]        [,3]
[1,] -0.0352886  0.4665008  0.88381658
[2,] -0.2379907  0.8550024 -0.46079428
[3,] -0.9706262 -0.2266009  0.08085101

#---------------------------------------
#  Inverse of a matrix
#---------------------------------------

> A <- matrix(c(1.96, .72, .72, 1.54), 2, 2, byrow=T)
> Ainv <- solve(A)
> Ainv
       [,1]   [,2]
[1,]  0.616 -0.288
[2,] -0.288  0.784
> A %*% Ainv
             [,1]         [,2]
[1,] 1.000000e+00 1.638772e-16
[2,] 7.548758e-17 1.000000e+00

# Use the spectral decomposition
# to compute the inverse of a matrix
> Aev <- eigen(A)$vectors
> Aeval <- eigen(A)$values
> Ainv2 <- Aev %*% diag(1/Aeval) %*% t(Aev)
> Ainv2
       [,1]   [,2]
[1,]  0.616 -0.288
[2,] -0.288  0.784

#---------------------------------------
#  Solutions to linear equations
#---------------------------------------

> x <- c(1, 1)
> x
[1] 1 1
> b <- solve(A, x)
> b
[1] 0.328 0.496
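The R sketch below is not part of the original notes; it simply verifies the solution returned by solve(A, x) and compares it with computing A^{-1} x explicitly.

# Not part of the original notes: check the solution of A b = x.
A <- matrix(c(1.96, 0.72, 0.72, 1.54), 2, 2, byrow = TRUE)
x <- c(1, 1)
b <- solve(A, x)
b                  # 0.328 0.496
A %*% b            # recovers x = (1, 1)
solve(A) %*% x     # same answer; solve(A, x) avoids forming the inverse explicitly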
2. Vector Spaces

Euclidean space.  A vector of order 3,

    x = [x_1; x_2; x_3],

represents a point in 3-dimensional Euclidean space (denoted by R^3).  A vector x = [x_1; x_2] of order 2 represents a point in a plane.  Note that any point in the plane can be represented as

    x = x_1 [1; 0] + x_2 [0; 1],

where [1; 0] and [0; 1] are basis vectors.  The entire plane is denoted by R^2.  Similarly, any x in R^3 can be expressed as

    x = [x_1; x_2; x_3] = x_1 [1; 0; 0] + x_2 [0; 1; 0] + x_3 [0; 0; 1],

where [1; 0; 0], [0; 1; 0], [0; 0; 1] are basis vectors for R^3.  A vector of order n, x = [x_1; x_2; ...; x_n], represents a point in n-dimensional Euclidean space (denoted by R^n).

R^n is a special case of a more general concept of a vector space.

Defn 2.1. A set of vectors, denoted by S, is a vector space if for every pair of vectors x_i and x_j in S we have
(i)  x_i + x_j is a vector in S, and
(ii) a x_i is in S for any real scalar a.

Defn 2.2. If every vector in some vector space S can be expressed as a linear combination

    a_1 x_1 + a_2 x_2 + ... + a_k x_k

of a set of k vectors x_1, x_2, ..., x_k, this set of vectors is said to span the vector space S.

Defn 2.3. If a set of vectors x_1, x_2, ..., x_k spans S and the vectors are linearly independent, then the set is called a basis for S.

Comments:
(i)   The number of vectors in a basis for a vector space S is called the dimension of S (dim(S)).
(ii)  0 belongs to every vector space in R^n.
(iii) A vector space can have many bases.

Example:

    x_1 = [1; 1; 1],   x_2 = [1; -1; 0],   x_3 = [1; 0; -1],   x_4 = [0; 1; -1]

span R^3, but are not a basis for R^3.

    x_1 = [1; 1; 1],   x_2 = [1; -1; 0],   x_3 = [1; 0; -1]

are a basis for R^3.  Note that

    (1/3) x_1 + (1/3) x_2 + (1/3) x_3 = [1; 0; 0]
    (1/3) x_1 - (2/3) x_2 + (1/3) x_3 = [0; 1; 0]
    (1/3) x_1 + (1/3) x_2 - (2/3) x_3 = [0; 0; 1]

so that

    [a; b; c] = a [1; 0; 0] + b [0; 1; 0] + c [0; 0; 1]
              = ((a + b + c)/3) x_1 + ((a - 2b + c)/3) x_2 + ((a + b - 2c)/3) x_3.

Example:

    x_1 = [1; 2; 1],   x_2 = [1; 0; -1],   x_3 = [3; 2; -1]

do not span R^3.  Any two of these vectors provide a basis for a 2-dimensional subspace of R^3.  Note that x_3 = x_1 + 2 x_2, which implies that x_1 = x_3 - 2 x_2 and x_2 = 0.5 (x_3 - x_1).  Then, for any z = a x_1 + b x_2 we have

    z = a (x_3 - 2 x_2) + b x_2 = (b - 2a) x_2 + a x_3

and

    z = a x_1 + (b/2)(x_3 - x_1) = (a - b/2) x_1 + (b/2) x_3.

This 2-dimensional subspace of R^3 is the vector space consisting of all vectors of the form

    z = a [1; 2; 1] + b [1; 0; -1] = [a + b; 2a; a - b] = a x_1 + b x_2.

Random vectors

Defn 2.4. A random vector Y = [Y_1; Y_2; ...; Y_n] is a vector whose elements are random variables.

Mean vectors:

    E(Y) = μ = [μ_1; μ_2; ...; μ_n] = [E(Y_1); E(Y_2); ...; E(Y_n)],

where

    μ_i = E(Y_i) = integral of y f_i(y) dy over the real line
          if Y_i is a continuous random variable with density function f_i(y), or

    μ_i = E(Y_i) = sum over all possible y values of y p_i(y)
          if Y_i is a discrete random variable with probability function p_i(y).

Covariance matrix:

    Σ = Var(Y) = [ σ_1^2  σ_12  ...  σ_1n ;
                   σ_21   σ_2^2 ...  σ_2n ;
                   ...  ;
                   σ_n1   σ_n2  ...  σ_n^2 ]

with variances

    σ_i^2 = Var(Y_i) = E(Y_i - μ_i)^2
          = integral of (y - μ_i)^2 f_i(y) dy      if Y_i is continuous
          = sum over all y of (y - μ_i)^2 p_i(y)   if Y_i is discrete

and covariances

    σ_ij = Cov(Y_i, Y_j) = E[(Y_i - μ_i)(Y_j - μ_j)],

where

    σ_ij = double integral of (y - μ_i)(v - μ_j) f_ij(y, v) dy dv
           if Y_i and Y_j are continuous random variables with joint density function f_ij(y, v),

and

    σ_ij = sum over all y and all v of (y - μ_i)(v - μ_j) P_ij(y, v)
           if Y_i and Y_j are discrete random variables with joint probability function
           P_ij(y, v) = Pr(Y_i = y, Y_j = v).

Result 2.1. Let Y = [Y_1; ...; Y_n] be a random vector with μ = E(Y) and Σ = Var(Y), let

    A_{p x n} = [ a_11 ... a_1n ; ... ; a_p1 ... a_pn ]

be a matrix of non-random elements, and let c = [c_1; ...; c_n] and d = [d_1; ...; d_p] be vectors of non-random elements. Then
(i)   E(AY + d) = A μ + d
(ii)  Var(AY + d) = A Σ A^T
(iii) E(c^T Y) = c^T μ
(iv)  Var(c^T Y) = c^T Σ c
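The R sketch below is not part of the original notes; it is a small simulation check of Result 2.1. The mean vector μ, covariance matrix Σ, matrix A and vector d are arbitrary choices, and the random vector Y is generated from independent normals via the Cholesky factor of Σ.

# Not part of the original notes: simulation check of Result 2.1.
set.seed(42)
mu    <- c(1, 2)
Sigma <- matrix(c(4, 1,
                  1, 2), 2, 2)
A <- matrix(c(1,  1,
              1, -1), 2, 2, byrow = TRUE)
d <- c(0, 3)

n <- 100000
Z <- matrix(rnorm(2 * n), n, 2)
Y <- sweep(Z %*% chol(Sigma), 2, mu, "+")   # each row is one realization of Y
W <- sweep(Y %*% t(A), 2, d, "+")           # rows of W are realizations of A Y + d

colMeans(W)            # close to A mu + d
var(W)                 # close to A Sigma A'
A %*% mu + d           # (3, 2)
A %*% Sigma %*% t(A)   # [[8, 2], [2, 4]]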