Preliminary Linear Algebra 1 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 1 / 100 Notation ∀ for all ∃ there exists 3 such that ∴ therefore ∵ because end of proof (QED) c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 2 / 100 Notation A =⇒ B A implies B A ⇐⇒ B A if and only if (iff) B a∈B a is an element of the set B A⊂B A is a proper subset of B A⊆B A is a subset of B Rn Euclidean n-space c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 3 / 100 Matrix Notation a11 a12 · · · a1n a21 a22 · · · a2n A = [aij ] = . is a matrix with m rows and n .. .. . m×n . . m×n . . . . am1 am2 · · · amn columns. The entry in the ith row and jth column of A is aij . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 4 / 100 Square Matrices and Vectors Matrixm×n A is said to be square if m = n. A matrix with one column is called a vector. A matrix with one row is called a row vector. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 5 / 100 Notation In these STAT611 notes, bold uppercase letters are typically used to denote matrices, and bold lowercase letters are typically used to denote vectors. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 6 / 100 Examples 0 or 0n is a vector of zeros. 1 or 1n is a vector of ones. I or In or In×n is the identity matrix. For example, 1 0 0 I3 = 0 1 0 = diag(1, 1, 1). 0 0 1 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 7 / 100 Special Types of Square Matrices A square matrixn×n A is upper triangular if aij = 0, ∀ i > j. A square matrixn×n A is lower triangular if aij = 0, ∀ i < j. A square matrixn×n A is diagonal if aij = 0, ∀ i 6= j. Write one example for each of these types of matrices. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 8 / 100 Matrix Transpose Ifm×n A = [aij ], the transpose of A, denoted A0 , is the matrixn×m B = [bij ], where bij = aji , ∀ i = 1, . . . , m; j = 1, . . . , n. That is, B = A0 is the matrix whose columns are the rows of A and whose rows are the columns of A. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 10 / 100 A Symmetric Matrix A square matrixn×n A is symmetric if A = A0 . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 11 / 100 Examples Find the transpose of " 4 −2 3 # . 7 Provide an example of a symmetric matrix. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 12 / 100 Matrix Multiplication Suppose a01 A = [aij ] = ... = [a(1) , . . . , a(n) ] m×n a0m and 0 b1 .. B = [bij ] = . = [b(1) , . . . , b(k) ]. n×k b0n Then A B = C = [cij = m×n n×k n X m×k ail blj ] = [cij = a0i b(j) ] l=1 a01 B n X = [Ab(1) , . . . , Ab(k) ] = ... = a(l) b0l . l=1 a0m B c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 14 / 100 Matrix Multiplication Suppose A= " # 1 2 3 4 and B= Work out AB using AB = c Copyright 2012 Dan Nettleton (Iowa State University) " # 5 6 . 7 8 (l) 0 l=1 a bl . Pn Statistics 611 15 / 100 Transpose of a Matrix Product (AB)0 = B0 A0 The transpose of a product is the product of the transposes in reverse order. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 17 / 100 Scalar Multiplication of a Matrix If c ∈ R, then c times the matrix A is the matrix whose i, jth element is c times the i, jth element of A; i.e., ca11 ca12 · · · ca1n ca21 ca22 · · · ca2n cm×n A = c[aij ] = [caij ] = . . . . . .. .. .. .. m×n m×n cam1 cam2 · · · camn c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 18 / 100 Linear Combination If x1 , . . . , xn ∈ Rm and c1 , . . . , cn ∈ R, then n X ci xi = c1 x1 + · · · + cn xn i=1 is a linear combination (LC) of x1 , . . . , xn . The coefficients of the LC are c1 , . . . , cn . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 19 / 100 Linear Independence and Linear Dependence A set of vectors x1 , . . . , xn is linearly independent (LI) if n X ci xi = 0 ⇐⇒ c1 = · · · = cn = 0. i=1 A set of vectors x1 , . . . , xn is linearly dependent (LD) if ∃ c1 , . . . , cn not all 0 3 n X ci xi = 0. i=1 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 20 / 100 Prove or disprove: any set of vectors that contains the vector 0 is LD. Prove or disprove: The following set of vectors is LI. 7 9 −5 , 4 , −6 . 3 1 7 c Copyright 2012 Dan Nettleton (Iowa State University) 1 Statistics 611 21 / 100 Fact V1: The nonzero vectors x1 , . . . , xn are LD ⇐⇒ xj is a LC of x1 , . . . , xj−1 for some j ∈ {2, . . . , n}. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 24 / 100 Orthogonality The two vectors x, y are orthogonal to each other if their inner product is zero, i.e., 0 0 xy=yx= n X xi yi = 0. i=1 The length of a vector, also known as its Euclidean norm, is kxk := √ v u n uX 0 xx=t x2 . i i=1 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 26 / 100 The vectors x1 , . . . , xn are mutually orthogonal if x0i xj = 0, ∀ i 6= j. The vectors x1 , . . . , xn are mutually orthonormal if x0i xj = 0 ∀ i 6= j, and kxi k = 1 c Copyright 2012 Dan Nettleton (Iowa State University) ∀ i = 1, . . . , n. Statistics 611 27 / 100 Write down a set of mutually orthogonal but not mutually orthonormal vectors. Write down a set of mutually orthonormal vectors. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 28 / 100 Orthogonal Matrix A square matrix with mutually orthonormal columns is called an orthogonal matrix. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 30 / 100 Show that if Q is orthogonal, then Q0 Q = I. Show that if Q is orthogonal and x is any vector of appropriate dimension, then kQxk = kxk. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 31 / 100 An orthogonal matrix Q is sometimes called a rotation matrix because if a vector x is premultiplied by Q, the result (Qx) is the vector x rotated n×1 to a new position in n×n Rn . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 34 / 100 Vector Space A vector space S is a set of vectors that is closed under addition (i.e., if x1 ∈ S, x2 ∈ S, then x1 + x2 ∈ S) and closed under scalar multiplication (i.e., if c ∈ R, x ∈ S, then cx ∈ S). In other words, c1 x1 + c2 x2 ∈ S c Copyright 2012 Dan Nettleton (Iowa State University) ∀ c1 , c2 ∈ R; x1 , x2 ∈ S. Statistics 611 35 / 100 Is {x ∈ Rn : kxk = 1} a vector space? Is {x ∈ Rn : 10 x = 0} a vector space? Is {n×m Ax : x ∈ Rm } a vector space? c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 36 / 100 Generators of a Vector Space A vector space S is said to be generated by a set of vectors x1 , . . . , xn if x ∈ S =⇒ x = n X ci xi for some c1 , . . . , cn ∈ R. i=1 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 40 / 100 Span of a Set of Vectors The span of vectors x1 , . . . , xn is the set of all LC of x1 , . . . , xn , i.e., span{x1 , . . . , xn } = ( n X ) ci xi : c1 , . . . , cn ∈ R . i=1 span{x1 , . . . , xn } is the vector space generated by x1 , . . . , xn . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 41 / 100 Find a set of vectors that generates the space {x ∈ R3 : 10 x = 0}; i.e., find a set of vectors whose span is S = {x ∈ R3 : 10 x = 0}. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 42 / 100 Basis of a Vector Space If a vector space S is generated by LI vectors x1 , . . . , xn , then x1 , . . . , xn form a basis for S. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 44 / 100 Fact V2: Suppose a1 , . . . , an form a basis for a vector space S. If b1 , . . . , bk are LI vectors in S, then k ≤ n. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 45 / 100 Proof of Fact V2: Because a1 , . . . , an form a basis for S and b1 ∈ S, b1 = Pn i=1 ci ai for some c1 , . . . , cn ∈ R. Thus, a1 , . . . , an , b1 are LD by Fact V1. Again, using V1, we have aj a LC of b1 , a1 , . . . , aj−1 for some j ∈ {1, 2, . . . , n}. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 46 / 100 Thus, b1 , a1 , . . . , aj−1 , aj+1 , . . . , an generate S. It follows that b1 , a1 , . . . , aj−1 , aj+1 , . . . , an , b2 is a LD set of vectors by V1. Again by V1, one of the vectors b1 , b2 , a1 , . . . , aj−1 , aj+1 , . . . , an is a LC of the preceding vectors. It is not b2 ∵ b1 , . . . , bk are LI. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 47 / 100 Thus b1 , b2 and n − 2 of a1 , . . . , an generate S. If k > n, we can continue adding b vectors and deleting a vectors to get b1 , . . . , bn generates S. However, then V1 would imply b1 , . . . , bn+1 are LD. This contradicts LI of b1 , . . . , bk ∴ k ≤ n. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 48 / 100 Fact V3: If {a1 , . . . , an } and {b1 , . . . , bk } each provide a basis for a vector space S, then n = k. Proof: By V2, we have k ≤ n and n ≤ k. ∴ k = n. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 49 / 100 Dimension of a Vector Space A basis for a vector space is not unique, but the number of vectors in the basis, known as dimension of the vector space, is unique. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 50 / 100 Find dim(S), the dimension of vector space S, for S = {x ∈ R3 : 10 x = 0}. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 51 / 100 Consider the set {0}. Is this a vector space? If so, what is its n×1 dimension? c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 53 / 100 Fact V4: Suppose a1 , . . . , an are LI vectors in a vector space S with dimension n. Then a1 , . . . , an form a basis for S. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 55 / 100 Fact V5: If a1 , . . . , ak are LI vectors in an n-dimensional vector space S, then there exists a basis for S that contains a1 , . . . , ak . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 57 / 100 Fact V6: If a1 , . . . , ak are LI and orthonormal vectors in Rn , then there exist ak+1 , . . . , an such that a1 , . . . , an are LI and orthornormal. Proof: HW problem. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 59 / 100 Rank of a Matrix It can be shown that the (maximum) number of LI rows of a matrixm×n A is the same as the (maximum) number of LI columns ofm×n A. This number of LI rows or columns is known as the rank ofm×n A and is denoted rank(m×n A) or r(m×n A). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 60 / 100 If r(m×n A) = m,m×n A is said to have full row rank. If r(m×n A) = n,m×n A is said to have full column rank. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 61 / 100 Inverse of a Matrix If r(n×n A) = n, there exists a matrixn×n B such thatn×n A n×n B =n×n I. Such a matrix B is called the inverse of A and is denoted A−1 . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 62 / 100 Prove that r(n×n A) = n ⇐⇒ ∃n×n B 3n×n A n×n B =n×n I. Prove thatn×n A n×n B =n×n I =⇒n×n B n×n A =n×n I Thus AA−1 = A−1 A = I. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 63 / 100 Singular / Nonsingular Matrix If r(n×n A) = n,n×n A is said to be nonsingular. If r(n×n A) < n,n×n A is said to be singular. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 68 / 100 Inverse of a Nonsingular 2 × 2 Matrix " a b #−1 = c d " a b c d " 1 ad−bc d −b −c a # if ad − bc 6= 0. # is singular if ad − bc = 0. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 69 / 100 Column Space of a Matrix A, denoted by C(A), is the vector space The column space of a matrixm×n generated by the columns ofm×n A; i.e., C(A) = {Ax : x ∈ Rn }. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 70 / 100 dim(C(A)) = r(m×n A) because the largest possible subset of LI columns of A is a basis for C(A). m×n c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 71 / 100 Let 1 0 3 A= 2 1 4 . 4 2 8 Find r(A). Give a basis for C(A). Characterize C(A). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 72 / 100 Result A.1: rank(AB) ≤ min{rank(A), rank(B)} c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 76 / 100 Proof of Result A.1: Let b1 , . . . , bn denote the columns of B so that B = [b1 , . . . , bn ]. Then AB = [Ab1 , . . . , Abn ]. This implies that the columns of AB are in C(A). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 77 / 100 dim(C(A)) is rank(A). There does not exist a set of LI vectors in C(A) with more than rank(A) vectors by Fact V2. It follows that rank(AB) ≤ rank(A). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 78 / 100 It remains to show that rank(AB) ≤ rank(B). Note that rank(AB) is the same as rank((AB)0 ) = rank(B0 A0 ). Our previous argument shows that rank(B0 A0 ) ≤ rank(B0 ) = rank(B). c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 79 / 100 Provide an example where rank(AB) < min{rank(A), rank(B)}. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 80 / 100 Result A.2: (a) If A = BC, then C(A) ⊆ C(B), (b) If C(A) ⊆ C(B), then there exist C such that A = BC. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 82 / 100 Null Space of a Matrix The null space of a matrix A, denoted N (A) is defined as N (A) = {y : Ay = 0}. Note that N (A) is the set of vectors orthogonal to every row of A. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 85 / 100 A vector in N (A) can also be seen as a vector of coefficients corresponding to a LC of the columns of A that is 0. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 86 / 100 Note that if A has dimension m × n, then the vectors in C(A) have dimension m and the vectors in N (A) have dimension n. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 87 / 100 Is the null space of a matrixm×n A a vector space? c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 88 / 100 Find the null space of 1 2 2 4 A= . −1 −2 3 6 c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 90 / 100 Result A.3: rank(m×n A) = n ⇐⇒ N (A) = {0}. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 92 / 100 Theorem A.1: If the matrix A is m × n with rank r, then dim(N (A)) = n − r, or more elegantly, dim(N (m×n A)) + dim(C(m×n A)) = n. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 94 / 100 Proof of Theorem A.1: Let k = dim(N (A)). Results A.3 covers the case where k = 0. Suppose now that k > 0. Let u1 , . . . , uk form a basis for N (A). Then Aui = 0 ∀ i = 1, . . . , k. m×n By Fact V5, there exist uk+1 , . . . , un such that u1 , . . . , un form a basis for Rn . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 95 / 100 We will now argue that the n − k vectors Auk+1 , . . . , Aun form a basis for C(A). If so, then dim(N (A)) + dim(C(A)) = k + n − k = n, i.e., dim(N (A)) = n − dim(C(A)) = n − r. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 96 / 100 First note that Aui ∈ C(A) ∀ i = k + 1, . . . , n. Now note that ck+1 Auk+1 + · · · + cn Aun = 0 =⇒ A(ck+1 uk+1 + · · · + cn un ) = 0 =⇒ ck+1 uk+1 + · · · + cn un ∈ N (A) =⇒ ∃ c1 , . . . , ck ∈ R 3 c1 u1 + · · · + ck uk = n X cj u j j=k+1 =⇒ c1 u1 + · · · + ck uk − ck+1 uk+1 − · · · − cn un = 0 =⇒ c1 = · · · = cn = 0 by LI of u1 , . . . , un . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 97 / 100 Therefore, Auk+1 , . . . , Aun are LI. Now let U = [u1 , . . . , un ]. Because u1 , . . . , un are LI and a basis for Rn , ∃ U−1 3 UU−1 = I. Let x ∈ Rn be arbitrary and define z = U−1 x. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 98 / 100 Then Ax = AUU−1 x = AUz = [Au1 , . . . , Auk , Auk+1 , . . . , Aun ]z = [0, . . . , 0, Auk+1 , . . . , Aun ]z = zk+1 Auk+1 + · · · + zn Aun . Therefore, any vector in C(A) can be written as a LC of Auk+1 , . . . , Aun . c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 99 / 100 It follows that Auk+1 , . . . , Aun is a basis for C(A). ∴ n − k = r and k + r = n. c Copyright 2012 Dan Nettleton (Iowa State University) Statistics 611 100 / 100