Fundamental content after midterm - A brief review
MAT2040 Linear Algebra

Brief Review- Determinants

1. (Minors and Cofactors) Let $A = (a_{ij})_{n \times n}$ be a square matrix and let $M_{ij}$ denote the $(n-1) \times (n-1)$ matrix obtained by deleting the row and column containing $a_{ij}$. Then $\det(M_{ij})$ is called the minor of $a_{ij}$, and the cofactor $A_{ij}$ of $a_{ij}$ is given by
$$A_{ij} = (-1)^{i+j} \det(M_{ij}).$$

2. (Cofactor expansion formula for the determinant; the expansion can be along any row or any column) $\det(A)$ is a scalar defined recursively as
$$\det(A) = \begin{cases} a_{11}, & \text{if } n = 1, \\ a_{11}A_{11} + a_{12}A_{12} + \cdots + a_{1n}A_{1n}, & \text{if } n > 1, \end{cases} \tag{1}$$
where $A_{1j} = (-1)^{1+j}\det(M_{1j})$, $j = 1, \ldots, n$.

3. The determinant of a triangular matrix equals the product of its diagonal elements. Also, $\det(A^T) = \det(A)$.

4. The determinant of a square matrix with a zero row or zero column equals zero.

5. The determinant of a square matrix with two identical rows or columns equals zero.

(Determinants of the Three Elementary Matrices)
(1) $I_n \xrightarrow{R_i \leftrightarrow R_j} E_{R_iR_j}$: $\det(E_{R_iR_j}) = -\det(I_n) = -1$, and $\det(E_{R_iR_j}A) = -\det(A) = \det(E_{R_iR_j})\det(A)$.
(2) $I_n \xrightarrow{R_i \to \alpha R_i\ (\alpha \neq 0)} E_{\alpha R_i}$: $\det(E_{\alpha R_i}) = \alpha\det(I_n) = \alpha$, and $\det(E_{\alpha R_i}A) = \alpha\det(A) = \det(E_{\alpha R_i})\det(A)$ ($\alpha \neq 0$).
(3) $I_n \xrightarrow{R_j \to \beta R_i + R_j} E_{\beta R_i + R_j}$: $\det(E_{\beta R_i + R_j}) = \det(I_n) = 1$, and $\det(E_{\beta R_i + R_j}A) = \det(A) = \det(E_{\beta R_i + R_j})\det(A)$.

Method to compute the determinant: use row operations (or column operations) to reduce the matrix to upper or lower triangular form.

Theorem. $A$ is singular if and only if $\det(A) = 0$.

Theorem. If $A$ and $B$ are square matrices of the same size, then $\det(AB) = \det(A)\det(B)$.

Adjoint matrix. The adjoint matrix $\operatorname{adj}(A)$ satisfies $A\,\operatorname{adj}(A) = \det(A)I$. If $\det(A) \neq 0$, then $A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A)$.

Cramer's Rule. Let $A = [a_1, \ldots, a_n]$ ($a_1, \ldots, a_n$ are the column vectors of $A$) be a nonsingular $n \times n$ matrix, let $b \in \mathbb{R}^n$, and let $A_i = [a_1, \ldots, a_{i-1}, b, a_{i+1}, \ldots, a_n]$. Then the unique solution of $Ax = b$ is given by
$$x_i = \frac{\det(A_i)}{\det(A)}, \quad i = 1, 2, \ldots, n.$$
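The slides contain no code; as an illustrative numerical check (the matrix and right-hand side below are arbitrary example values), the following minimal NumPy sketch verifies Cramer's rule against a direct solve.

```python
import numpy as np

# Illustrative 3x3 nonsingular system (arbitrary example values).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

det_A = np.linalg.det(A)
assert abs(det_A) > 1e-12  # Cramer's rule requires a nonsingular A

# Cramer's rule: replace column i of A by b, then x_i = det(A_i) / det(A).
x_cramer = np.empty(3)
for i in range(3):
    A_i = A.copy()
    A_i[:, i] = b
    x_cramer[i] = np.linalg.det(A_i) / det_A

# Compare with the standard solver.
print(np.allclose(x_cramer, np.linalg.solve(A, b)))  # True
```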
Brief Review- Linear transformation

1. Definition of a linear transformation: Let $V, W$ be two vector spaces. A mapping $L$ from $V$ to $W$ is said to be a linear transformation if the following condition is satisfied:
$$L(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 L(v_1) + \alpha_2 L(v_2), \quad \forall \alpha_1, \alpha_2 \in \mathbb{R},\ \forall v_1, v_2 \in V. \tag{$*$}$$
$V$ is called the domain of the linear transformation, and $W$ is called the codomain.

2. The kernel of $L$, denoted $\ker(L)$, is defined as $\ker(L) = \{v \in V \mid L(v) = 0_W\}$.

3. Definition of the range: Let $S$ be a subspace of $V$. The image of $S$, denoted $L(S)$, is defined by
$$L(S) = \{w \in W \mid \exists\, v \in S \text{ s.t. } L(v) = w\}.$$
The image of the entire vector space $V$, i.e., $L(V)$, is called the range of $L$.

(Matrix representation of a linear transformation between Euclidean vector spaces w.r.t. the standard bases) If $L$ is a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$, there is an $m \times n$ matrix $A$ such that $L(x) = Ax$ for each $x \in \mathbb{R}^n$. In fact, the $j$th column vector of $A = [a_1, \ldots, a_n]$ is given by $a_j = L(e_j)$, $j = 1, 2, \ldots, n$, where $E_n = \{e_1, \ldots, e_n\}$ is the standard basis of $\mathbb{R}^n$.

(Matrix representation for general vector spaces) If $\mathcal{V} = \{v_1, v_2, \ldots, v_n\}$ is a basis for the vector space $V$, $\mathcal{W} = \{w_1, w_2, \ldots, w_m\}$ is a basis for the vector space $W$, and $L$ is a linear transformation from $V$ to $W$, then there is an $m \times n$ matrix $A$ such that
$$[L(u)]_{\mathcal{W}} = A[u]_{\mathcal{V}}, \quad \forall u \in V.$$
In fact, the $j$th column of $A$ is given by $a_j = [L(v_j)]_{\mathcal{W}}$, and $A = [a_1, \ldots, a_n]$.

Brief Review- Orthogonality

Let $x = [x_1, \ldots, x_n]^T,\ y = [y_1, \ldots, y_n]^T \in \mathbb{R}^n$. Then
$$\langle x, y \rangle = x^T y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n,$$
$$\|x\| = \sqrt{x^T x} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2},$$
$$x^T y = \|x\|\,\|y\| \cos\theta, \quad 0 \le \theta \le \pi.$$
Two vectors $x, y \in \mathbb{R}^n$ are said to be orthogonal if $x^T y = 0$, denoted $x \perp y$.

(Orthogonal subspaces in $\mathbb{R}^n$) Two subspaces $X$ and $Y$ of $\mathbb{R}^n$ are said to be orthogonal if $x^T y = 0$ for all $x \in X$, $y \in Y$; denoted $X \perp Y$. The orthogonal complement of $Y$ is
$$Y^\perp = \{x \in \mathbb{R}^n \mid x^T y = 0,\ \forall y \in Y\}.$$

For $A \in \mathbb{R}^{m \times n}$:
(1) $\operatorname{Null}(A) = \operatorname{Col}(A^T)^\perp = \operatorname{Row}(A)^\perp$
(2) $\operatorname{Null}(A^T) = \operatorname{Col}(A)^\perp = \operatorname{Row}(A^T)^\perp$

Theorem. If $S$ is a subspace of $\mathbb{R}^n$, then $\dim S + \dim S^\perp = n$. Furthermore, if $\{u_1, \ldots, u_r\}$ is a basis for $S$ and $\{u_{r+1}, \ldots, u_n\}$ is a basis for $S^\perp$, then $\{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\}$ is a basis for $\mathbb{R}^n$, and $\mathbb{R}^n = S \oplus S^\perp$.

(Definition of a least squares solution) Given a linear system $Ax = b$ ($A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$), a vector $\hat{x} \in \mathbb{R}^n$ that satisfies the minimum-residual condition
$$\|r(\hat{x})\| = \min_{x} \|r(x)\|$$
is called a least squares solution of $Ax = b$, where $r(x) = b - Ax$.

(Normal equations for the linear system) Given the linear system $Ax = b$ ($A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$), let the projection of $b$ onto the subspace $\operatorname{Col}(A)$ be $p$. Then there exists a vector $\hat{x} \in \mathbb{R}^n$ such that $p = A\hat{x} \in \operatorname{Col}(A)$, $b - A\hat{x} \in \operatorname{Col}(A)^\perp = \operatorname{Null}(A^T)$, and $\|b - Ax\| \ge \|b - A\hat{x}\|$ for any $x \in \mathbb{R}^n$. The condition $b - A\hat{x} \in \operatorname{Col}(A)^\perp = \operatorname{Null}(A^T)$ gives
$$A^T A \hat{x} = A^T b,$$
which is called the normal equation; it is an $n \times n$ linear system.

Figure: Projection of $b$ onto the column space $\operatorname{Col}(A)$.

(Unique solution condition for the normal equations) If $A$ is an $m \times n$ matrix of rank $n$, the normal equations $A^T A \hat{x} = A^T b$ have a unique solution
$$\hat{x} = (A^T A)^{-1} A^T b,$$
and $\hat{x}$ is the unique least squares solution of the linear system $Ax = b$. The projection vector is given by
$$p = A\hat{x} = A(A^T A)^{-1} A^T b,$$
where $P = A(A^T A)^{-1} A^T$ is called the projection matrix.
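To make the normal equations concrete, here is a minimal NumPy sketch (with made-up data, not from the slides) that solves an overdetermined system both via $A^T A\hat{x} = A^T b$ and via the library least squares routine, and checks that the residual lies in $\operatorname{Null}(A^T)$.

```python
import numpy as np

# Overdetermined system: 4 equations, 2 unknowns (arbitrary example data).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0, 4.0])

# Normal equations: A^T A x_hat = A^T b (A has full column rank here).
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Reference answer from the built-in least squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_ref))              # True

# Residual b - A x_hat is orthogonal to Col(A), i.e., lies in Null(A^T).
print(np.allclose(A.T @ (b - A @ x_hat), 0))  # True
```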
1. (Orthogonal set in $\mathbb{R}^n$) Let $\{u_1, u_2, \ldots, u_m\}$ be nonzero vectors in $\mathbb{R}^n$. If $u_i^T u_j = \langle u_i, u_j \rangle = 0$ whenever $i \neq j$, then $\{u_1, u_2, \ldots, u_m\}$ is said to be an orthogonal set.

2. Theorem. An orthogonal set of nonzero vectors is linearly independent.

3. (Orthonormal set in $\mathbb{R}^n$) Let $\{u_1, u_2, \ldots, u_m\}$ be nonzero vectors in $\mathbb{R}^n$. $\{u_1, u_2, \ldots, u_m\}$ is said to be an orthonormal set if
$$\langle u_i, u_j \rangle = u_i^T u_j = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases}$$

4. (Orthonormal basis for $\mathbb{R}^n$) $B = \{u_1, \ldots, u_n\}$ is an orthonormal basis for $\mathbb{R}^n$ if $B = \{u_1, \ldots, u_n\}$ is an orthonormal set in $\mathbb{R}^n$.

5. (Orthogonal matrix) Let $Q \in \mathbb{R}^{n \times n}$. $Q$ is said to be an orthogonal matrix if the column vectors of $Q$ form an orthonormal set in $\mathbb{R}^n$ (and hence an orthonormal basis for $\mathbb{R}^n$).

6. (Equivalent conditions for an orthogonal matrix) Let $Q \in \mathbb{R}^{n \times n}$. $Q$ is an orthogonal matrix if and only if $Q^T Q = I_n$; equivalently, $Q$ is an orthogonal matrix if and only if $Q^{-1} = Q^T$.

7. (Properties of an orthogonal matrix) Let $Q \in \mathbb{R}^{n \times n}$ be an orthogonal matrix. Then
(a) $\|Qx\| = \|x\|$, $\forall x \in \mathbb{R}^n$;
(b) $\langle Qx, Qy \rangle = \langle x, y \rangle$, $\forall x, y \in \mathbb{R}^n$.

Brief Review- Gram-Schmidt process in $\mathbb{R}^n$

Question: Given a linearly independent set $\{u_1, \ldots, u_m\}$ in $\mathbb{R}^n$, how can we find an orthonormal set $\{v_1, \ldots, v_m\}$ such that $\operatorname{Span}(u_1, \ldots, u_m) = \operatorname{Span}(v_1, \ldots, v_m)$?

(Gram-Schmidt process for $m = 3$; see the code sketch after this list)
Step 1: Normalize $u_1$ to get $v_1$, i.e., $v_1 = \frac{u_1}{\|u_1\|}$.
Step 2: Project $u_2$ onto $\operatorname{Span}(v_1)$ to get $p_1 = \langle u_2, v_1 \rangle v_1$. Then $r_1 = u_2 - p_1 \perp \operatorname{Span}(u_1)$. Set $v_2 = \frac{r_1}{\|r_1\|} = \frac{u_2 - p_1}{\|u_2 - p_1\|}$; then $\{v_1, v_2\}$ is an orthonormal set and $\operatorname{Span}(v_1, v_2) = \operatorname{Span}(u_1, u_2)$.

Figure: Gram-Schmidt for two vectors: $v_1 = u_1/\|u_1\|$, $p_1 = \langle u_2, v_1 \rangle v_1$, $v_2 = (u_2 - p_1)/\|u_2 - p_1\|$.

Step 3: Project $u_3$ onto $\operatorname{Span}(v_1, v_2)$ to get $p_2 = \langle u_3, v_1 \rangle v_1 + \langle u_3, v_2 \rangle v_2$. Then $r_2 = u_3 - p_2 \perp \operatorname{Span}(v_1, v_2)$. Set $v_3 = \frac{r_2}{\|r_2\|} = \frac{u_3 - p_2}{\|u_3 - p_2\|}$; then $\{v_1, v_2, v_3\}$ is an orthonormal set and $\operatorname{Span}(v_1, v_2, v_3) = \operatorname{Span}(u_1, u_2, u_3)$.

Figure: Gram-Schmidt for three vectors: $p_2 = \langle u_3, v_1 \rangle v_1 + \langle u_3, v_2 \rangle v_2$, $v_3 = (u_3 - p_2)/\|u_3 - p_2\|$.
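A minimal NumPy sketch of the classical Gram-Schmidt process described above (illustrative only; the input columns are assumed linearly independent, and the example vectors are arbitrary):

```python
import numpy as np

def gram_schmidt(U):
    """Classical Gram-Schmidt: maps the columns of U (assumed linearly
    independent) to orthonormal columns spanning the same subspace."""
    m = U.shape[1]
    V = np.zeros_like(U, dtype=float)
    for j in range(m):
        # Subtract the projection of u_j onto Span(v_1, ..., v_{j-1}).
        r = U[:, j].astype(float)
        for i in range(j):
            r = r - (U[:, j] @ V[:, i]) * V[:, i]
        # r is nonzero by the independence assumption; normalize it.
        V[:, j] = r / np.linalg.norm(r)
    return V

# Arbitrary example: three linearly independent columns in R^3.
U = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
V = gram_schmidt(U)
print(np.allclose(V.T @ V, np.eye(3)))  # True: columns are orthonormal
```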
Brief Review- Eigenvalues and Eigenvectors

1. Let $A$ be a square matrix of size $n \times n$ ($A \in \mathbb{R}^{n \times n}$ or $A \in \mathbb{C}^{n \times n}$). If there exist a scalar $\lambda$ ($\lambda \in \mathbb{R}$ or $\lambda \in \mathbb{C}$) and a nonzero vector $x$ such that $Ax = \lambda x$, then $\lambda$ is called an eigenvalue (or characteristic value) and $x$ is called an eigenvector (or characteristic vector) w.r.t. $\lambda$.

2. (Characteristic polynomial) Let $A$ be an $n \times n$ matrix ($A \in \mathbb{R}^{n \times n}$ or $A \in \mathbb{C}^{n \times n}$) and let $\lambda$ be a variable. Then $p_A(\lambda) = \det(A - \lambda I)$ is the characteristic polynomial, of degree $n$. The roots of the characteristic polynomial are the eigenvalues of $A$; the number of eigenvalues, counted with multiplicity, is $n$.

3. (Product and sum of eigenvalues) Let $A = (a_{ij})_{n \times n}$ be a square matrix ($A \in \mathbb{R}^{n \times n}$ or $A \in \mathbb{C}^{n \times n}$) with eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, n$). Then
(1) $\det(A) = \prod_{i=1}^{n} \lambda_i$, and (2) $\sum_{i=1}^{n} a_{ii} = \sum_{i=1}^{n} \lambda_i$, where $\sum_{i=1}^{n} a_{ii} = a_{11} + a_{22} + \cdots + a_{nn} = \operatorname{Trace}(A)$ is called the trace of $A$.

1. (Diagonalizable) An $n \times n$ matrix $A$ is said to be diagonalizable if there exist a nonsingular matrix $X$ and a diagonal matrix $D$ such that $X^{-1} A X = D$.

2. Theorem (Sufficient and necessary condition for diagonalization). An $n \times n$ matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors.

3. Corollary. Let $A$ be a matrix of size $n \times n$. If $A$ has $n$ distinct eigenvalues, then $A$ is diagonalizable. However, if $A$ has an eigenvalue of multiplicity $\ge 2$, then $A$ may or may not be diagonalizable.

Brief Review- Diagonalization of a real symmetric matrix

1. Theorem. Let $A \in \mathbb{R}^{n \times n}$ be a real symmetric matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then
(1) $\lambda_i \in \mathbb{R}$, $\forall i = 1, \ldots, n$. (The eigenvalues of a real symmetric matrix are real numbers.)
(2) If $\lambda_i \neq \lambda_j$, $x_i$ is an eigenvector w.r.t. $\lambda_i$, and $x_j$ is an eigenvector w.r.t. $\lambda_j$, then $x_i$ and $x_j$ are orthogonal. (For real symmetric matrices, eigenvectors belonging to different eigenvalues are orthogonal.)

2. Theorem (Spectral theorem (eigendecomposition theorem) for real symmetric matrices). If $A$ is a real symmetric matrix, then there exists an orthogonal matrix $Q$ that diagonalizes $A$, i.e., $Q^T A Q = Q^{-1} A Q = \Lambda$ ($\Lambda$ a diagonal matrix).

Brief Review- Quadratic forms

Let $x \in \mathbb{R}^n$ and let $A \in \mathbb{R}^{n \times n}$ be symmetric. Then
(1) The quadratic form $f(x) = x^T A x$ is called positive definite if $f(x) > 0$ for any $x \neq 0$; correspondingly, $A$ is called a positive definite matrix.
(2) The quadratic form $f(x) = x^T A x$ is called positive semidefinite if $f(x) \ge 0$ for all $x$; correspondingly, $A$ is called a positive semidefinite matrix.
(3) The quadratic form $f(x) = x^T A x$ is called indefinite if $f(x)$ takes both positive and negative values.

Negative definite and negative semidefinite forms are defined analogously:
(4) The quadratic form $f(x) = x^T A x$ is called negative definite if $f(x) < 0$ for any $x \neq 0$; correspondingly, $A$ is called a negative definite matrix.
(5) The quadratic form $f(x) = x^T A x$ is called negative semidefinite if $f(x) \le 0$ for all $x$; correspondingly, $A$ is called a negative semidefinite matrix.

Theorem. Let $A \in \mathbb{R}^{n \times n}$ be symmetric. Then $A$ is positive definite if and only if all its eigenvalues are positive.

Proof. Since $A$ is symmetric, by the spectral theorem for real symmetric matrices there exists an orthogonal matrix $Q$ such that $Q^{-1} A Q = Q^T A Q = D$, where $D$ is a diagonal matrix. Let $\hat{x} = Q^T x$; then $x = Q\hat{x}$ and
$$x^T A x = (Q\hat{x})^T A Q\hat{x} = \hat{x}^T Q^T A Q \hat{x} = \hat{x}^T D \hat{x}.$$
Since $Q$ is invertible and $\hat{x} = Q^T x$, we have
$$x^T A x > 0,\ \forall x \neq 0 \iff \hat{x}^T D \hat{x} > 0,\ \forall \hat{x} \neq 0.$$
Thus $A$ is positive definite $\iff$ the diagonal entries of $D$ are all positive $\iff$ all eigenvalues of $A$ are positive.

Remark.
1. Let $A \in \mathbb{R}^{n \times n}$ be symmetric. Then $A$ is negative definite if and only if all its eigenvalues are negative.
2. Let $A \in \mathbb{R}^{n \times n}$ be symmetric. Then $A$ is indefinite if and only if its eigenvalues have both positive and negative signs.
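As an illustrative check of the eigenvalue criterion (a sketch with arbitrary example matrices; the classifier function and its tolerance are our own illustrative additions, while `np.linalg.eigvalsh`/`np.linalg.eigh` are NumPy's standard routines for symmetric matrices):

```python
import numpy as np

def classify_quadratic_form(A, tol=1e-12):
    """Classify the symmetric matrix A by the signs of its eigenvalues."""
    # eigvalsh is for symmetric matrices; its eigenvalues are real.
    eigenvalues = np.linalg.eigvalsh(A)
    if np.all(eigenvalues > tol):
        return "positive definite"
    if np.all(eigenvalues < -tol):
        return "negative definite"
    if np.all(eigenvalues >= -tol):
        return "positive semidefinite"
    if np.all(eigenvalues <= tol):
        return "negative semidefinite"
    return "indefinite"

# Arbitrary symmetric examples: eigenvalues {1, 3} and {-1, 3}.
print(classify_quadratic_form(np.array([[2.0, -1.0], [-1.0, 2.0]])))  # positive definite
print(classify_quadratic_form(np.array([[1.0, 2.0], [2.0, 1.0]])))    # indefinite

# Spectral theorem check: Q^T A Q is diagonal for an orthogonal Q.
A = np.array([[2.0, -1.0], [-1.0, 2.0]])
lam, Q = np.linalg.eigh(A)
print(np.allclose(Q.T @ A @ Q, np.diag(lam)))  # True
```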