A Brief Outline of Math 355

Lecture 1. The geometry of linear equations; elimination with matrices.

• A system of m linear equations in n unknowns can be thought of geometrically as m hyperplanes in R^n. Some basic questions in this course are:
  i. Are there any points in R^n where all the hyperplanes intersect?
  ii. How many such points are there?
Geometrically, we saw that in R^2, two lines can intersect either in a single point, everywhere (i.e. they are the same line), or nowhere (i.e. they are parallel).

• We also emphasized the column picture, where we look for a solution as a linear combination of the columns of our matrix. The columns are viewed as vectors in R^m.

• In order to (attempt to) solve a system of linear equations, we convert the equations to a matrix, then use row operations to reduce the matrix A to an upper triangular matrix U, from which it is easy to back-substitute and find any solutions. Allowable row operations are:
  i. Add a multiple of one row to another
  ii. Exchange rows
  iii. Multiply a row by a nonzero number

Lecture 2. Multiplication and inverse matrices.

• Matrix multiplication (i.e. AB = C, with A an m × n matrix and B an n × p matrix) can be thought of in four ways:
  1. One entry at a time: The entry c_{i,j} is the inner product of the ith row of A with the jth column of B.
  2. A row at a time: The ith row of C is a linear combination of the rows of B, with the coefficients of the linear combination being the ith row of A.
  3. A column at a time: The ith column of C is a linear combination of the columns of A, with the coefficients of the linear combination being the ith column of B.
  4. A whole matrix at a time: Multiply the kth column of A with the kth row of B to get an m × p matrix. Add up all n such matrices to get AB.

• Given a square, invertible matrix A, take the augmented matrix [ A | I ] and row reduce A to reduced row echelon form (which will be the identity matrix I) to get [ I | A^{-1} ].

Lecture 3. Factorization into A = LU; transposes, permutations, spaces.

• Using the "row at a time" idea of multiplication, we can translate row operations into matrix algebra. For example, if we are row reducing a 3 × 3 matrix A and want to subtract 2 times row 1 from row 2, the corresponding elimination matrix is

  E_{2,1} = [ 1  0  0 ]
            [-2  1  0 ]
            [ 0  0  1 ]

and the row operation is carried out by forming E_{2,1}A.

• We prefer to record elimination in the form A = LU, since the inverses of the elimination matrices are particularly easy to find (the inverse of the matrix that subtracts 2 times row 1 from row 2 is the one that adds 2 times row 1 to row 2), and the product L = E_{2,1}^{-1} E_{3,1}^{-1} E_{3,2}^{-1} is lower triangular, with entries that are just the multipliers from elimination. (A small computational sketch appears at the end of this lecture's notes.)

• If A = (a_{i,j}), then the (i, j)th entry of A^T, the transpose of A, is a_{j,i}.

• The matrix that permutes the rows of another matrix can be found by performing the permutation on the identity matrix. There are n! permutation matrices of size n × n, the inverse of a permutation matrix is its transpose (i.e. P P^T = P P^{-1} = I), and the product of permutation matrices (or the transpose of a permutation matrix) is another permutation matrix.

• A vector space is a collection V of objects (called vectors) which can be added or multiplied by a (real) number, with the result still in V. A subspace of a vector space V is a subset of V which is itself a vector space. For example, the column space of an m × n matrix is a subspace of R^m.
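Referring back to the A = LU discussion above, here is a minimal NumPy sketch (not from lecture) of elimination matrices and the LU factorization; the matrix A below is made up for the example.

    import numpy as np

    # A made-up 3x3 matrix with nonzero pivots (no row exchanges needed).
    A = np.array([[ 2.,  1.,  1.],
                  [ 4., -6.,  0.],
                  [-2.,  7.,  2.]])

    # Elimination matrices: E21 subtracts 2*(row 1) from row 2,
    # E31 subtracts -1*(row 1) from row 3, E32 subtracts -1*(row 2) from row 3.
    E21 = np.array([[1., 0., 0.], [-2., 1., 0.], [0., 0., 1.]])
    E31 = np.array([[1., 0., 0.], [ 0., 1., 0.], [1., 0., 1.]])
    E32 = np.array([[1., 0., 0.], [ 0., 1., 0.], [0., 1., 1.]])

    U = E32 @ E31 @ E21 @ A                      # upper triangular
    L = np.linalg.inv(E21) @ np.linalg.inv(E31) @ np.linalg.inv(E32)

    print(U)                                     # the pivots sit on the diagonal
    print(L)                                     # lower triangular; entries are the multipliers
    print(np.allclose(A, L @ U))                 # True: A = LU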
Lecture 4. R^n; column space and null space; solving Ax = 0: pivot variables, special solutions.

• Our prime example of a vector space is R^n; that is, we take our vectors to be n-tuples of real numbers, and perform addition and scalar multiplication componentwise.

• The column space of a matrix A is the set of all linear combinations of the columns of A. Equivalently, it is the set of all b so that Ax = b has a solution. We denote the column space by C(A).

• The null space of a matrix A is the set of all x so that Ax = 0. We denote the null space by N(A).

• To find the null space of a matrix A:
  i. Use Gauss-Jordan elimination to convert A to reduced row echelon form, R. You will have r pivot variables and n − r free variables.
  ii. Set the first free variable equal to 1 and the rest equal to 0, then solve for the pivot variables. This is the first special solution.
  iii. Repeat the previous step with each of the other free variables to find n − r linearly independent special solutions.
  iv. These special solutions form a basis for N(A) (so any linear combination of the special solutions is in the null space).

Lecture 5. Solving Ax = b: row reduced form R; independence, span, basis and dimension.

• Algorithm for the complete solution to Ax = b (a small computational sketch appears at the end of this lecture's notes):
  i. Use row operations on the augmented matrix [ A | b ] to change A to R.
  ii. Set the free variables to zero and solve for the pivot variables to find x_particular.
  iii. Find the null space of A, N(A).
  iv. The complete solution to Ax = b is x = x_p + x_n, where x_n is any vector in N(A).
So a complete solution to such a problem consists of finding some x_p (remember there are infinitely many choices whenever N(A) contains nonzero vectors) and N(A), plus writing x = x_p + x_n.

• Given an m × n matrix A with rank r, there are three special cases:
  i. r = n < m: Then N(A) = {0}, and Ax = b has either 0 or 1 solution.
  ii. r = m < n: Then dim(C(A)) = r = m, so C(A) = R^m, and so Ax = b always has a solution. Also, dim(N(A)) = n − r > 0, so there are in fact infinitely many solutions.
  iii. r = m = n: Then N(A) = {0} and Ax = b always has a solution. Thus there is always a unique solution to Ax = b.

• A set of vectors v_1, v_2, ..., v_n is linearly independent if a_1 v_1 + a_2 v_2 + ... + a_n v_n = 0 forces a_1 = a_2 = ... = a_n = 0. The algorithm for checking whether vectors are independent is to create a matrix A with the vectors as columns. If N(A) = {0}, then the vectors are independent; otherwise, they are dependent.

• The span of a set of vectors is the set of all linear combinations of those vectors. We say that v_1, v_2, ..., v_n span a vector space V if V = span{v_1, v_2, ..., v_n}. For example, the span of the columns of a matrix is the column space.

• A set of vectors is a basis for a vector space V if
  i. The vectors are linearly independent, and
  ii. The vectors span V.

• The dimension of a vector space V is the number of vectors in any basis for V (recall, we showed that any two bases for V have the same number of vectors). We also showed that if dim(V) = n, then any n linearly independent vectors in V form a basis for V.
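Referring back to the algorithms above, here is a small sketch of the special solutions and a particular solution, assuming SymPy is available; the matrix A and right-hand side b are made up for illustration.

    from sympy import Matrix

    # Made-up system: 3 equations, 4 unknowns, with b chosen to be consistent.
    A = Matrix([[1, 2, 2, 2],
                [2, 4, 6, 8],
                [3, 6, 8, 10]])
    b = Matrix([1, 5, 6])

    R, pivot_cols = A.rref()          # reduced row echelon form and pivot columns
    print(R, pivot_cols)              # the rank r is the number of pivot columns

    # Special solutions: a basis for the null space N(A).
    for s in A.nullspace():
        print(s.T)

    # Particular solution: row reduce the augmented matrix [A | b], set the
    # free variables to zero, and read x_particular off the pivot rows.
    print(A.row_join(b).rref()[0])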
Lecture 6. The four fundamental subspaces; matrix spaces, polynomial spaces.

• We took an m × n matrix A and looked at the column space (C(A)), the null space (N(A)), the row space (C(A^T)), and the left null space (N(A^T)). The natural questions to ask when looking at subspaces are:
  1. What is a basis?
  2. What is the dimension?
We answer these questions here. Suppose we have a matrix A. If we take the augmented matrix [ A | I ] and use row operations to reduce A to reduced row echelon form, R, then we call the matrix on the right E, for elimination matrix (note the matrix E's relationship to the elimination matrices from chapter 1). That is, we use row reduction to go from

  [ A | I ] → [ R | E ].

Now we can easily read off the rank r of the matrix A by counting the pivots in R, as well as calculate:
  Dimension of C(A): just the rank, r.
  Basis for C(A): the r columns of A that correspond to the pivot columns of R.
  Dimension of C(A^T): also the rank, r.
  Basis for C(A^T): the first r rows of R (since row operations do not change the row space, and the first r rows of R are the pivot rows).
  Dimension of N(A): the number of free variables, which is n − r.
  Basis for N(A): we find n − r solutions to the system Rx = 0 by setting one free variable equal to 1 at a time, leaving the rest equal to zero, and solving.
  Dimension of N(A^T): since this is just the null space of A^T, which has r pivots and m − r free variables, this must have dimension m − r.
  Basis for N(A^T): take the bottom m − r rows of E.

• The space M_{m×n} of m × n matrices can also be considered a vector space, even though matrices are not traditionally thought of as "vectors". We also inspected the subspaces of upper triangular, symmetric and diagonal matrices. Be able to find bases for these spaces. As an example, a basis for M_{3×3} consists of the nine matrices with a single entry equal to 1 and all other entries 0:

  [ 1 0 0 ]   [ 0 1 0 ]         [ 0 0 0 ]
  [ 0 0 0 ] , [ 0 0 0 ] , ... , [ 0 0 0 ] .
  [ 0 0 0 ]   [ 0 0 0 ]         [ 0 0 1 ]

• The space P_n of polynomials of degree ≤ n is also a vector space, of dimension n + 1, with basis {1, x, x^2, ..., x^n}.

Lecture 7. Graphs, networks, incidence matrices.

• A graph consists of nodes and edges. If these were more serious notes, there'd be an example drawn.

• An incidence matrix for an oriented graph with m edges and n nodes is an m × n matrix with entries

  a_{i,j} = −1 if edge i leaves node j,
             1 if edge i enters node j,
             0 otherwise.

• Each of the four fundamental subspaces has a physical interpretation, starting with interpreting the vector x as the potential at each node:
  The column space: The vector e = Ax represents the possible potential differences.
  The null space: This is the stationary solution, when there is no potential difference.
  The left null space: The set of y so that A^T y = 0 are those currents which satisfy Kirchhoff's current law, which says that the net flow of current at any node must be 0.
  The row space: The pivot rows correspond to a spanning tree in the graph (i.e. a subgraph that has no loops, but contains every node).

• An incidence matrix has another interesting interpretation: the dimension of N(A^T) is the number of (independent) loops in the graph, while the rank is the number of nodes minus 1. Hence

  # loops = dim(N(A^T)) = m − r = # edges − (# nodes − 1).

This is Euler's formula.

Lecture 8. Orthogonal vectors and subspaces; projections onto subspaces.

• Two vectors x and y are orthogonal if x^T y = 0.

• Two subspaces S and T are orthogonal if s^T t = 0 for every vector s ∈ S and t ∈ T.

• Two subspaces S and T of R^n are orthogonal complements if
  i. S and T are orthogonal, and
  ii. dim S + dim T = n.

• The row space and the null space of an m × n matrix are orthogonal complements in R^n.

• The column space and the left null space of an m × n matrix are orthogonal complements in R^m.

• To project a vector b onto the subspace generated by a, we use the projection matrix P, given by

  P = (a a^T) / (a^T a).

Then the projection of b is just Pb.
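A minimal NumPy check of the projection formula above; the vectors a and b are made up.

    import numpy as np

    a = np.array([[1.], [2.], [2.]])     # column vector spanning the line
    b = np.array([[1.], [1.], [1.]])     # vector to be projected

    P = (a @ a.T) / (a.T @ a)            # projection matrix a a^T / (a^T a)
    p = P @ b                            # projection of b onto the line through a

    print(p.T)
    print(np.allclose(P, P.T), np.allclose(P, P @ P))   # P^T = P and P^2 = P
    print((a.T @ (b - p)).item())                       # the error b - p is orthogonal to a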
• More generally, we define a projection matrix to be any matrix P so that
  i. P^T = P, and
  ii. P^2 = P.

Lecture 9. Projection matrices and least squares; orthogonal matrices and Gram-Schmidt.

• We use projections to solve least squares problems. That is, in the event that there is no x so that Ax = b, we find an x̂ so that ||Ax̂ − b||^2 is as small as possible.

• We solve least squares using the projection matrix P = A(A^T A)^{-1} A^T, in the sense that Pb = Ax̂.

• In practice, to solve the least squares problem, you solve the equation A^T A x̂ = A^T b. This has a unique solution exactly when A^T A is invertible, which is true precisely when A has independent columns.

• A set of vectors q_1, q_2, ..., q_n is orthonormal if

  q_i^T q_j = 1 if i = j,
              0 if i ≠ j.

• Any matrix (rectangular or square) Q with orthonormal columns has the property Q^T Q = I. If Q is also square, then Q^T = Q^{-1}.

• If we have a least squares problem with an orthogonal matrix Q (i.e. one with orthonormal columns), then the projection equation simplifies to x̂ = Q^T b, so in particular the ith coordinate of x̂ is x̂_i = q_i^T b.

• The Gram-Schmidt process takes a set of vectors a, b, c, ..., z (ok, I don't mean precisely 26 vectors, but I don't want to involve subscripts either, so bear with me), and converts them into orthogonal vectors A, B, C, ..., Z, and then into orthonormal vectors q_1, q_2, q_3, ..., q_n, so that all of the different sets of vectors have the same span. Here is the algorithm:
  1. We define the orthogonal vectors recursively:

     A = a,
     B = b − (A^T b / A^T A) A,
     C = c − (A^T c / A^T A) A − (B^T c / B^T B) B,
     ...
     Z = z − (A^T z / A^T A) A − (B^T z / B^T B) B − ... − (Y^T z / Y^T Y) Y.

  2. We normalize the vectors:

     q_1 = A/||A||,  q_2 = B/||B||,  q_3 = C/||C||,  ...,  q_n = Z/||Z||.
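A minimal NumPy sketch of the Gram-Schmidt recursion just described; the input vectors are made up, and this is the classical (not the numerically preferred modified) version of the algorithm.

    import numpy as np

    def gram_schmidt(vectors):
        # Classical Gram-Schmidt, following the recursion in the notes:
        # subtract from each new vector its projections onto the earlier
        # orthogonal vectors, then normalize everything at the end.
        orthogonal = []
        for v in vectors:
            w = v.astype(float)
            for u in orthogonal:
                w = w - (u @ v) / (u @ u) * u
            orthogonal.append(w)
        return [w / np.linalg.norm(w) for w in orthogonal]

    # Made-up independent vectors playing the roles of a, b, c.
    a = np.array([1., 1., 0.])
    b = np.array([1., 0., 1.])
    c = np.array([0., 1., 1.])

    Q = np.column_stack(gram_schmidt([a, b, c]))
    print(np.allclose(Q.T @ Q, np.eye(3)))   # True: the columns are orthonormal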
Lecture 10. Properties of determinants; determinant formulas and cofactors.

We deduced that three properties of the determinant completely determine the determinant. We used these three properties to prove that seven more properties hold, and then used all of this to deduce formulas for the determinant.

• The determinant is a function that eats square (real valued) matrices and gives a (real) number. The three defining properties of the determinant are:
  1. det I = 1.
  2. Exchanging two rows of a matrix changes the sign of the determinant.
  3. The determinant is linear in each row. This means:
     a. Multiplying a row by a number multiplies the determinant by the same number. For example, if A has rows r_1, ..., r_n and A' is the same matrix but with row i replaced by t·r_i, then det A' = t · det A.
     b. Adding a vector to a row of a matrix is "additive" (this doesn't seem like the right word to use, but I don't think a correct and simple word exists...). For example, if A_r has row i equal to r_i, A_s is the same matrix but with row i equal to s_i, and A is the same matrix but with row i equal to r_i + s_i (all other rows identical), then det A = det A_r + det A_s.

• We then used the previous three properties to deduce seven more properties that the determinant must satisfy:
  4. If two rows of A are equal, then det A = 0.
  5. Subtracting k × (row i) from row j does not change the determinant.
  6. If A has a row of zeros, then det A = 0.
  7. The determinant of an upper triangular matrix U with diagonal entries d_1, d_2, ..., d_n (the pivots) is the product of the pivots: det U = d_1 d_2 ... d_n.
  8. det A = 0 if and only if A is singular.
  9. det(AB) = (det A)(det B).
  10. det A = det A^T.

• We then used the above 10 properties to determine 3 formulas for the determinant of a matrix:
  Long formula with n! terms: By expanding a matrix using property 3b and eliminating the terms with rows of 0 using property 6, we got

    det A = Σ ± a_{1,α} a_{2,β} a_{3,γ} ... a_{n,ω},

  where the sum runs over all n! permutations (α, β, γ, ..., ω) of (1, 2, 3, ..., n), and the sign is determined by whether the permutation is even or odd.
  Cofactor expansion: The cofactor of a_{i,j}, denoted c_{i,j}, is

    c_{i,j} = (−1)^{i+j} det( the (n − 1) × (n − 1) matrix with row i and column j removed ).

  Then we concluded that det A = a_{1,1} c_{1,1} + a_{1,2} c_{1,2} + ... + a_{1,n} c_{1,n}, and referred to this as the cofactor expansion along row 1. A similar formula holds expanding along any row or column.
  Row reduction: Using properties 5 and 7, we concluded that after row reducing a matrix to A = LU, det A = det U = the product of the pivots (with a sign change for each row exchange, if any were needed). This is the most computationally efficient method in general, though cofactors are also very useful in computing by hand.

Lecture 11. Applications of the determinant: Cramer's rule, inverse matrices, and volume; eigenvalues and eigenvectors.

• We defined C to be the cofactor matrix of A: that is, c_{i,j} is the cofactor associated with a_{i,j}. This allowed us to write

  A^{-1} = (1 / det A) C^T

(note this equation only holds if A is invertible).

• Cramer's Rule gives us an explicit way of solving for each coordinate of the solution of Ax = b. In particular,

  x_1 = det B_1 / det A,  x_2 = det B_2 / det A,  ...,  x_n = det B_n / det A,

where B_i is the matrix A with the ith column replaced by the vector b.

• We also saw that the volume of an n-dimensional parallelepiped with edges a_1, a_2, ..., a_n is the absolute value of the determinant of the matrix A with columns a_1, a_2, ..., a_n.

• An eigenvalue of a matrix A is a number λ so that there exists a nonzero vector x (called the eigenvector) with Ax = λx.

• To find the eigenvalues of A, we solve the characteristic equation det(A − λI) = 0, which will be an nth degree polynomial (and so will have n not-necessarily-distinct, not-necessarily-real roots).

• To find the eigenvectors, we take the eigenvalues λ_1, ..., λ_n, and let x_i be a vector in the null space of A − λ_i I. (This is a little imprecise, since if two eigenvalues are the same, the null space of A − λI may contain more than one linearly independent vector.)

• If we have n independent eigenvectors and put them as the columns of a matrix S, then S^{-1}AS = Λ and A = SΛS^{-1}, where Λ is the diagonal matrix with the eigenvalues along the diagonal:

  Λ = [ λ_1   0   ...   0  ]
      [  0   λ_2  ...   0  ]
      [ ...               ]
      [  0    0   ...  λ_n ].

You can remember this equation since Ax_i = x_i λ_i corresponds to multiplying A on the right by the column x_i, giving AS = SΛ.

• Note that if a matrix can be diagonalized, then A^k = SΛ^k S^{-1}, where Λ^k is easily computed: it is the diagonal matrix with entries λ_1^k, λ_2^k, ..., λ_n^k.

• A matrix is diagonalizable if and only if it has n independent eigenvectors. If all of the eigenvalues are different, then the matrix is sure to be diagonalizable. However, if a matrix has repeated eigenvalues, then it may or may not be diagonalizable.

• Solved the equation u_{k+1} = Au_k, given the initial vector u_0, by noting that u_k = A^k u_0.

• To actually compute u_k:
  i. Find the eigenvalues λ_1, ..., λ_n and eigenvectors x_1, ..., x_n of A,
  ii. Write u_0 = c_1 x_1 + c_2 x_2 + ... + c_n x_n = Sc, where S is the eigenvector matrix and c = [c_1, ..., c_n]^T is the solution of Sc = u_0.
  iii. Then u_k = SΛ^k c.
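A quick NumPy check of the recipe above for computing u_k = SΛ^k c; the matrix A and the initial vector u_0 are made up.

    import numpy as np

    A = np.array([[0.9, 0.2],
                  [0.1, 0.8]])
    u0 = np.array([1.0, 0.0])
    k = 50

    lam, S = np.linalg.eig(A)            # eigenvalues and eigenvector matrix S
    c = np.linalg.solve(S, u0)           # coefficients in u0 = S c
    uk = S @ (lam**k * c)                # u_k = S Lambda^k c

    print(uk)
    print(np.allclose(uk, np.linalg.matrix_power(A, k) @ u0))   # agrees with A^k u0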
Lecture 12. Diagonalization and powers of A; differential equations and e^{At}.

• Solved linear systems of differential equations of the form

  du_1/dt = a_{1,1} u_1 + a_{1,2} u_2 + ... + a_{1,n} u_n
  du_2/dt = a_{2,1} u_1 + a_{2,2} u_2 + ... + a_{2,n} u_n
  ...
  du_n/dt = a_{n,1} u_1 + a_{n,2} u_2 + ... + a_{n,n} u_n,

which we wrote in the decidedly more compact form du/dt = Au. We typically are also given an initial condition u(0).

• To solve:
  i. Find the eigenvalues λ_1, ..., λ_n and eigenvectors x_1, ..., x_n of A,
  ii. The solution is u(t) = c_1 e^{λ_1 t} x_1 + c_2 e^{λ_2 t} x_2 + ... + c_n e^{λ_n t} x_n, where c = [c_1, ..., c_n]^T is found by noting that u(0) = Sc.

• This can also be written u(t) = S e^{Λt} S^{-1} u(0).

• We noted that the exponential of a matrix is defined by

  e^{At} = I + At + (At)^2/2! + (At)^3/3! + ... = S e^{Λt} S^{-1},

with the second equality holding only if A is diagonalizable.

• We also saw that e^{Λt} is the diagonal matrix with entries e^{λ_1 t}, e^{λ_2 t}, ..., e^{λ_n t} along the diagonal.

• You can change a single 2nd order equation into a system of 1st order equations: rewrite y'' + by' + ky = 0 by setting

  u = [ y' ]        u' = [ y'' ] = [ -b  -k ] [ y' ]
      [ y  ],  so        [ y'  ]   [  1   0 ] [ y  ].

This can also be used to reduce an nth order differential equation to a system of n first order equations.

Lecture 13. Markov matrices, Fourier series.

• A Markov matrix is one where
  i. All entries are ≥ 0, and
  ii. The entries in each column add to 1.

• If A is a Markov matrix, then λ = 1 is an eigenvalue, and |λ_i| ≤ 1 for all other eigenvalues. Hence the steady state will be some multiple of the eigenvector x_1 corresponding to λ_1 = 1.

• Given an orthonormal basis q_1, q_2, ..., q_n, we can write any v as v = x_1 q_1 + x_2 q_2 + ... + x_n q_n. Since the q_i's are orthonormal, multiplying the equation on the left by q_i^T leaves us with q_i^T v = x_i.

• The Fourier series for a function f(x) is the expansion

  f(x) = a_0 + a_1 cos x + b_1 sin x + a_2 cos 2x + b_2 sin 2x + ....

• We define the inner product for these functions as

  f^T g = ∫_0^{2π} f(x) g(x) dx,

and observe that 1, cos x, sin x, cos 2x, sin 2x, ... is an orthogonal basis (each of the sines and cosines has squared norm π, and the constant function has squared norm 2π, so it is easy to make the basis orthonormal). Hence to find b_2 (for example), we use the above and observe

  b_2 = (1/π) ∫_0^{2π} f(x) sin 2x dx.

Lecture 14. Symmetric Matrices.

• If you have a symmetric matrix (that is, A = A^T, or for a complex matrix, A = Ā^T), then
  1. The eigenvalues of A are real, and
  2. The eigenvectors of A can be chosen to be orthogonal.

• Then a symmetric matrix A can be factored as A = QΛQ^T (compare to the usual case A = SΛS^{-1}). This is called the spectral theorem.

• Multiplying out the factorization above, we get

  A = λ_1 q_1 q_1^T + λ_2 q_2 q_2^T + ... + λ_n q_n q_n^T,

where each q_i q_i^T = q_i q_i^T / (q_i^T q_i) is an orthogonal projection matrix (recall q_i^T q_i = 1). So every symmetric matrix is a linear combination of orthogonal projection matrices.

• For a symmetric matrix, the number of positive pivots is the same as the number of positive eigenvalues.

• A positive definite matrix is a symmetric matrix where all the eigenvalues are positive (which is the same as all the pivots being positive).
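A small NumPy check of the spectral theorem above, using a made-up symmetric matrix.

    import numpy as np

    A = np.array([[2., 1., 0.],
                  [1., 2., 1.],
                  [0., 1., 2.]])        # a made-up symmetric matrix

    lam, Q = np.linalg.eigh(A)          # eigh: real eigenvalues, orthonormal eigenvectors
    print(lam)                                          # all real
    print(np.allclose(Q.T @ Q, np.eye(3)))              # Q is orthogonal
    print(np.allclose(Q @ np.diag(lam) @ Q.T, A))       # A = Q Lambda Q^T

    # A as a combination of projection matrices lam_i * q_i q_i^T
    B = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(3))
    print(np.allclose(B, A))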
Lecture 15. Complex matrices and the Fast Fourier Transform.

• A complex number z can be written in three ways:
  i. z = x + iy. It can be viewed on the complex plane as the point (x, y), making the obvious identification with R^2.
  ii. z = r(cos θ + i sin θ). In this case, r is called the modulus (fancy word for "length") of z, and θ is called the "argument". It can be viewed on the complex plane as the endpoint of the vector leaving the origin with length r and angle θ.
  iii. z = re^{iθ}. See above for the terminology. This form has the same geometric interpretation as ii, but is more widely used. For example, 2i = 2e^{iπ/2}.

• The complex conjugate of a complex number z is found by switching the sign on the imaginary part of z, or graphically by reflecting z over the real axis, and is denoted by z̄: if z = x + iy = re^{iθ}, then z̄ = x − iy = re^{−iθ}. A number z is real if and only if z = z̄.

• The length of a complex number z is

  (z z̄)^{1/2} = √((x + iy)(x − iy)) = √(x^2 + y^2) = r.

• Given a complex vector z = (z_1, z_2, ..., z_n)^T ∈ C^n, we noticed that the squared length was given by z̄^T z, and so defined the Hermitian as the transpose of the conjugate: for vectors, z^H := z̄^T; for complex matrices, A^H := Ā^T.

• We use the Hermitian to translate words we used for real matrices and vectors into words for complex matrices and vectors:

  Squared length of x:        x^T x for R-valued,  x^H x for C-valued.
  Inner product:              x^T y for R-valued,  x^H y for C-valued.
  A symmetric (Hermitian):    A = A^T for R-valued,  A = A^H for C-valued.
  q_1, ..., q_n orthonormal:  q_i^T q_j = 0 if i ≠ j and 1 if i = j for R-valued,
                              q_i^H q_j = 0 if i ≠ j and 1 if i = j for C-valued.

Notice that the only difference is that the transpose is always exchanged for a Hermitian, and that when dealing with real vectors/matrices, each pair of definitions agrees.

• The nth Fourier matrix F_n is defined as

  F_n = [ 1   1         1           ...  1               ]
        [ 1   ω         ω^2         ...  ω^{n−1}         ]
        [ 1   ω^2       ω^4         ...  ω^{2(n−1)}      ]
        [ ...                                            ]
        [ 1   ω^{n−1}   ω^{2(n−1)}  ...  ω^{(n−1)(n−1)}  ],

where ω is an nth root of unity, that is, a solution of x^n − 1 = 0. More specifically, ω = e^{2πi/n}.

• The columns of F_n are orthogonal (so F_n^H F_n = nI; dividing each column by √n makes them orthonormal), and multiplication by F_n can be carried out very quickly (this is the Fast Fourier Transform).

Lecture 16. Positive definite matrices and minima; Similar matrices and Jordan form.

• We looked at four equivalent definitions for an n × n symmetric matrix A being positive definite:
  i. λ_1 > 0, λ_2 > 0, ..., λ_n > 0.
  ii. Each of the n leading subdeterminants is strictly positive. The mth leading subdeterminant is the determinant of the m × m matrix in the top left corner of A.
  iii. Each of the pivots of A is strictly positive. (Careful: this does not mean that the diagonal of A is positive. It means that if A = LU, then the elements on the diagonal of U are positive!)
  iv. x^T A x > 0 for all x ≠ 0.

• We define positive semidefinite by replacing all the incidences of the words "strictly positive" above by "positive or zero". The terms negative definite and negative semidefinite are defined the same way, just replacing "positive" by "negative" in the definition.

• The function produced by x^T A x is called a quadratic form. When A is 2 × 2, this corresponds to a conic section.

• If A is positive definite, then the graph of x^T A x is a paraboloid (a bowl opening upward). More generally, let f : R^n → R (think: f(x_1, x_2, ..., x_n) = y). If ∇f(a) = 0 (where ∇f = (∂f/∂x_1, ∂f/∂x_2, ..., ∂f/∂x_n)), then f(a) is a minimum if the matrix of second derivatives

  f''(a) = [ ∂^2f/∂x_1^2     ∂^2f/∂x_1∂x_2  ...  ∂^2f/∂x_1∂x_n ]
           [ ∂^2f/∂x_2∂x_1   ∂^2f/∂x_2^2    ...  ∂^2f/∂x_2∂x_n ]
           [ ...                                               ]
           [ ∂^2f/∂x_n∂x_1   ∂^2f/∂x_n∂x_2  ...  ∂^2f/∂x_n^2   ]

is positive definite (where each second derivative is evaluated at a). Compare this to calculus, where a is a minimum if f'(a) = 0 and f''(a) > 0.

• Positive definite matrices act like positive numbers: if A and B are positive definite matrices, then so are A^{-1} and A + B. Also, A^T A is positive definite for any m × n matrix A with rank n (since x^T A^T A x = (Ax)^T(Ax) = ||Ax||^2 > 0 for x ≠ 0).
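A small NumPy sketch checking the equivalent tests for positive definiteness above, on a made-up symmetric matrix.

    import numpy as np

    A = np.array([[ 2., -1.,  0.],
                  [-1.,  2., -1.],
                  [ 0., -1.,  2.]])     # a made-up symmetric matrix

    # Test i: all eigenvalues positive.
    print(np.all(np.linalg.eigvalsh(A) > 0))

    # Test ii: all leading subdeterminants positive.
    print(all(np.linalg.det(A[:m, :m]) > 0 for m in range(1, 4)))

    # Test iv: x^T A x > 0 for a sample of random nonzero vectors x.
    rng = np.random.default_rng(0)
    xs = rng.standard_normal((100, 3))
    print(np.all(np.einsum('ij,jk,ik->i', xs, A, xs) > 0))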
• Two n × n matrices A and B are similar if there is an invertible matrix M so that B = M^{-1}AM.

• An example to remember is that every diagonalizable matrix is similar to a diagonal matrix (A = SΛS^{-1}).

• Similar matrices have the same eigenvalues, and represent the same linear transformation in different coordinates.

• We found a 'good' representative for each family of matrices (that is to say, a family of matrices is the set of all matrices you can get by conjugating the matrix by an invertible matrix, i.e. the set of matrices similar to each other), which we called the Jordan canonical form.

• A Jordan block is the matrix with a single eigenvalue λ repeated along the diagonal and 1's just above the diagonal:

  J_λ = [ λ  1  0  ...  0 ]
        [ 0  λ  1  ...  0 ]
        [ 0  0  λ  ...  0 ]
        [ ...             ]
        [ 0  0  0  ...  λ ].

• Every matrix A is similar to a Jordan canonical matrix, which is block diagonal with Jordan blocks down the diagonal:

  J = [ J_{λ_1}    0      ...    0     ]
      [   0      J_{λ_2}  ...    0     ]
      [ ...                            ]
      [   0        0      ...  J_{λ_k} ].

Lecture 17. Singular Value Decomposition; Linear transformations and their matrices, coordinates.

• The singular value decomposition works for all matrices, and decomposes the m × n matrix A into A = UΣV^T, where U is an m × m orthogonal matrix, Σ is an m × n "diagonal" matrix with all entries ≥ 0, and V is an n × n orthogonal matrix. (A short computational sketch appears at the end of these notes.)

• The columns of V are the eigenvectors of A^T A (which, recall, is positive semidefinite), and the nonzero diagonal entries of Σ (the singular values) are the square roots of the associated eigenvalues.

• The columns of U are the eigenvectors of AA^T, and again the nonzero diagonal entries of Σ are square roots of the eigenvalues.

• We can also look at the columns v_1, ..., v_n of V and the columns u_1, ..., u_m of U in the following way. Let r be the rank of A. Then
  – v_1, ..., v_r are an orthonormal basis for the row space C(A^T),
  – u_1, ..., u_r are an orthonormal basis for the column space C(A),
  – v_{r+1}, ..., v_n are an orthonormal basis for the null space N(A),
  – u_{r+1}, ..., u_m are an orthonormal basis for the left null space N(A^T).

• A linear transformation is a function T : R^n → R^m so that
  i. T(u + v) = T(u) + T(v), and
  ii. T(cv) = cT(v).

• Given coordinates, that is, a basis for R^n and a basis for R^m, every linear transformation T is uniquely associated with a matrix A.

• Translating a linear transformation into a matrix:
  1. You will be given a linear transformation T : R^n → R^m, as well as a basis v_1, ..., v_n of R^n and a basis w_1, ..., w_m of R^m (in practice, you may decide the bases).
  2. Evaluate T on the basis elements of R^n, and write the results in the coordinates of R^m:

     T(v_1) = a_{1,1} w_1 + a_{2,1} w_2 + ... + a_{m,1} w_m
     T(v_2) = a_{1,2} w_1 + a_{2,2} w_2 + ... + a_{m,2} w_m
     ...
     T(v_n) = a_{1,n} w_1 + a_{2,n} w_2 + ... + a_{m,n} w_m.

  3. Now A = (a_{i,j}) will be the matrix representation of the linear transformation in the given bases.

• Two matrices represent the same linear transformation in different coordinates precisely when they are similar, which is one good reason to use eigenvectors as coordinates (so that the matrix of the linear transformation is diagonal).

Lecture 18. Change of basis.

• A natural question is: if a linear transformation T : R^n → R^n has matrix A with respect to the basis v_1, ..., v_n, and matrix B with respect to the basis u_1, ..., u_n, then what is the relationship between A and B?

• Denote R^n by V or U, depending on the basis used. Then we want to find a matrix M so that the diagram

        A
    V -----> V
    ^        ^
   M|        |M
    U -----> U
        B

commutes. That is to say, B = M^{-1}AM.

• But M may itself be interpreted as a linear transformation (a change of basis): its ith column records the coefficients m_{1,i}, m_{2,i}, ..., m_{n,i} obtained when one basis is written in terms of the other,

  u_i = m_{1,i} v_1 + m_{2,i} v_2 + ... + m_{n,i} v_n,

and we use this to find the matrix M.
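As promised above, a short NumPy sketch of the singular value decomposition and the four-subspace bases it provides; the matrix A is made up, and full_matrices=True is used so that U and V come out square.

    import numpy as np

    A = np.array([[1., 2., 0.],
                  [2., 4., 0.]])                  # made-up, rank 1, m = 2, n = 3

    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    r = int(np.sum(s > 1e-10))                    # numerical rank

    Sigma = np.zeros_like(A)
    Sigma[:len(s), :len(s)] = np.diag(s)
    print(np.allclose(U @ Sigma @ Vt, A))         # A = U Sigma V^T

    V = Vt.T
    print(U[:, :r])      # orthonormal basis for the column space C(A)
    print(V[:, :r])      # orthonormal basis for the row space C(A^T)
    print(V[:, r:])      # orthonormal basis for the null space N(A)
    print(U[:, r:])      # orthonormal basis for the left null space N(A^T)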