Matrix Multiplication
Stephen Boyd
EE103, Stanford University
October 8, 2015

Outline
- Matrix multiplication
- Composition of linear functions
- Matrix powers
- QR factorization

Matrix multiplication
- can multiply an m × p matrix A and a p × n matrix B to get the m × n matrix C = AB:
  \[ C_{ij} = \sum_{k=1}^{p} A_{ik} B_{kj} = A_{i1}B_{1j} + \cdots + A_{ip}B_{pj} \]
  for i = 1, ..., m, j = 1, ..., n
- to get C_{ij}: move along the ith row of A and the jth column of B
- example:
  \[ \begin{bmatrix} -1.5 & 3 & 2 \\ 1 & -1 & 0 \end{bmatrix}
     \begin{bmatrix} -1 & -1 \\ 0 & -2 \\ 1 & 0 \end{bmatrix}
   = \begin{bmatrix} 3.5 & -4.5 \\ -1 & 1 \end{bmatrix} \]

Special cases of matrix multiplication
- scalar-vector product (with the scalar on the right!): xα
- inner product: a^T b
- matrix-vector multiplication: Ax
- outer product:
  \[ ab^T = \begin{bmatrix} a_1 b_1 & a_1 b_2 & \cdots & a_1 b_n \\ a_2 b_1 & a_2 b_2 & \cdots & a_2 b_n \\ \vdots & \vdots & & \vdots \\ a_m b_1 & a_m b_2 & \cdots & a_m b_n \end{bmatrix} \]

Properties
- (AB)C = A(BC), so both can be written ABC
- A(B + C) = AB + AC
- (AB)^T = B^T A^T
- AI = A; IA = A
- AB = BA does not hold in general

Block matrices
- block matrices can be multiplied using the same formula, e.g.,
  \[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}
     \begin{bmatrix} E & F \\ G & H \end{bmatrix}
   = \begin{bmatrix} AE + BG & AF + BH \\ CE + DG & CF + DH \end{bmatrix} \]
  (provided the products all make sense)

Column interpretation
- write B = [ b_1  b_2  \cdots  b_n ], where b_i is the ith column of B
- then we have
  \[ AB = A \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix}
         = \begin{bmatrix} Ab_1 & Ab_2 & \cdots & Ab_n \end{bmatrix} \]
- so AB is a 'batch' multiply of A times the columns of B

Inner product interpretation
- with a_i^T the rows of A and b_j the columns of B, we have
  \[ AB = \begin{bmatrix} a_1^T b_1 & a_1^T b_2 & \cdots & a_1^T b_n \\ a_2^T b_1 & a_2^T b_2 & \cdots & a_2^T b_n \\ \vdots & \vdots & & \vdots \\ a_m^T b_1 & a_m^T b_2 & \cdots & a_m^T b_n \end{bmatrix} \]
- so the matrix product is all inner products of rows of A and columns of B, arranged in a matrix

Gram matrix
- the Gram matrix of an m × n matrix A (with columns a_1, ..., a_n) is
  \[ G = A^T A = \begin{bmatrix} a_1^T a_1 & a_1^T a_2 & \cdots & a_1^T a_n \\ a_2^T a_1 & a_2^T a_2 & \cdots & a_2^T a_n \\ \vdots & \vdots & & \vdots \\ a_n^T a_1 & a_n^T a_2 & \cdots & a_n^T a_n \end{bmatrix} \]
- the Gram matrix gives all inner products of the columns of A
- example: G = A^T A = I means the columns of A are orthonormal

Complexity
- computing C_{ij} = (AB)_{ij} is an inner product of p-vectors, so the total required flops is (mn)(2p) = 2mnp
- multiplying two 1000 × 1000 matrices requires 2 billion flops ...
- ... and can be done in well under a second on current computers
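The slides themselves contain no code, but a short NumPy sketch (my addition, not part of the original deck) connects the formulas above to computation: it checks the 2 × 3 by 3 × 2 example, the column and inner-product interpretations, the Gram matrix, and times a 1000 × 1000 product to back up the claim on the Complexity slide.

```python
import time
import numpy as np

# the worked example from the slides: (2 x 3) times (3 x 2)
A = np.array([[-1.5,  3.0, 2.0],
              [ 1.0, -1.0, 0.0]])
B = np.array([[-1.0, -1.0],
              [ 0.0, -2.0],
              [ 1.0,  0.0]])

C = A @ B
print(C)          # [[ 3.5 -4.5]
                  #  [-1.   1. ]]

# column interpretation: the jth column of AB is A times the jth column of B
for j in range(B.shape[1]):
    assert np.allclose(C[:, j], A @ B[:, j])

# inner product interpretation: (AB)_ij = (ith row of A) . (jth column of B)
assert np.isclose(C[0, 1], A[0, :] @ B[:, 1])

# Gram matrix of B: all inner products of the columns of B
G = B.T @ B
assert np.isclose(G[0, 1], B[:, 0] @ B[:, 1])

# complexity: a 1000 x 1000 product costs about 2 * 1000^3 = 2e9 flops
n = 1000
X, Y = np.random.randn(n, n), np.random.randn(n, n)
t0 = time.time()
Z = X @ Y
print(f"{n} x {n} multiply took {time.time() - t0:.3f} s")
```

On a recent machine the timed product typically finishes in tens of milliseconds, consistent with the "well under a second" remark above.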
Composition of linear functions
- A is an m × p matrix, B is p × n
- define f : R^p → R^m as f(u) = Au, and g : R^n → R^p as g(v) = Bv
- f and g are linear functions
- their composition is h : R^n → R^m, h(x) = f(g(x))
- we have h(x) = f(g(x)) = A(Bx) = (AB)x, so
  - the composition of linear functions is linear
  - the associated matrix is the product of the matrices of the functions

Matrix powers
- for A square, A^2 means AA, and the same for higher powers
- with the convention A^0 = I we have A^k A^l = A^{k+l}
- negative powers come later; fractional powers are treated in other courses

Directed graph
- an n × n matrix A can represent a directed graph:
  A_{ij} = 1 if there is an edge from vertex j to vertex i, and A_{ij} = 0 otherwise
- example (a directed graph on vertices 1, ..., 5; figure omitted):
  \[ A = \begin{bmatrix} 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix} \]

Paths in directed graph
- (A^2)_{ij} = \sum_{k=1}^{n} A_{ik} A_{kj} = number of paths of length 2 from vertex j to vertex i
- for the example,
  \[ A^2 = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 2 \\ 1 & 0 & 1 & 2 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix} \]
  e.g., there are two paths of length 2 from vertex 4 to vertex 3 (via 3 and via 5); a numerical check appears after the QR slides below
- more generally, (A^ℓ)_{ij} = number of paths of length ℓ from vertex j to vertex i

Gram-Schmidt in matrix notation
- run Gram-Schmidt on the columns a_1, ..., a_k of the n × k matrix A
- if the columns are linearly independent, we get orthonormal vectors q_1, ..., q_k
- define the n × k matrix Q with columns q_1, ..., q_k
- then Q^T Q = I
- from the Gram-Schmidt algorithm,
  \[ a_i = (q_1^T a_i) q_1 + \cdots + (q_{i-1}^T a_i) q_{i-1} + \|\tilde q_i\| q_i = R_{1i} q_1 + \cdots + R_{ii} q_i \]
  with R_{ij} = q_i^T a_j for i < j, and R_{ii} = \|\tilde q_i\|
- defining R_{ij} = 0 for i > j, we have A = QR
- R is upper triangular, with positive diagonal entries

QR factorization
- A = QR is called the QR factorization of A
- the factors satisfy Q^T Q = I, with R upper triangular and having positive diagonal entries
- it can be computed using the Gram-Schmidt algorithm (or some variations of it)
- it has a huge number of uses, which we'll see soon
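As the numerical check promised above, here is a small NumPy sketch (again my addition, not from the original slides) that reproduces A^2 for the 5-vertex graph and the two length-2 paths from vertex 4 to vertex 3.

```python
import numpy as np

# adjacency matrix of the 5-vertex directed graph from the slides;
# A[i, j] = 1 means an edge from vertex j+1 to vertex i+1 (Python is 0-indexed)
A = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 0, 1, 1, 1],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 1, 0]])

A2 = A @ A                          # same as np.linalg.matrix_power(A, 2)
print(A2)                           # matches the A^2 shown on the slide

# (A^2)_{34} in the slides' 1-based indexing is A2[2, 3] here:
# two paths of length 2 from vertex 4 to vertex 3 (4 -> 3 -> 3 and 4 -> 5 -> 3)
print(A2[2, 3])                     # 2

# longer paths come from higher powers, e.g. length-5 paths from vertex 1 to vertex 1
print(np.linalg.matrix_power(A, 5)[0, 0])
```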
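Finally, a minimal sketch of QR factorization via classical Gram-Schmidt, following the formulas on the Gram-Schmidt slide. This is an illustration rather than course-provided code: it assumes the columns of A are linearly independent, and in practice one would call numpy.linalg.qr, which uses a more numerically robust method.

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR factorization A = QR via classical Gram-Schmidt.

    Assumes the columns of A are linearly independent. Returns Q (orthonormal
    columns, Q^T Q = I) and R (upper triangular with positive diagonal).
    """
    A = np.asarray(A, dtype=float)
    n, k = A.shape
    Q = np.zeros((n, k))
    R = np.zeros((k, k))
    for i in range(k):
        q_tilde = A[:, i].copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ A[:, i]      # R_ji = q_j^T a_i for j < i
            q_tilde -= R[j, i] * Q[:, j]     # remove the component along q_j
        R[i, i] = np.linalg.norm(q_tilde)    # positive when columns are independent
        Q[:, i] = q_tilde / R[i, i]
    return Q, R

# quick check on a random tall matrix
A = np.random.randn(6, 3)
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: orthonormal columns
print(np.allclose(Q @ R, A))             # True: A = QR
print(np.allclose(R, np.triu(R)))        # True: R is upper triangular
```

The double loop makes the correspondence with the slide's R_{ij} = q_i^T a_j explicit; modified Gram-Schmidt would instead project against the partially orthogonalized vector, which behaves better numerically.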