Permutation Matrices and Row Swaps

Learning Goals: students gain a basic understanding of permutation matrices, and see how to handle row swaps in solving systems.

Sometimes elimination requires that we swap rows. We have already seen that the elementary matrix accomplishing a single swap is I with two rows exchanged. Let's look at the general case of reordering the rows at will.

A permutation matrix is a matrix P that, when multiplied to give PA, reorders the rows of A. Suppose row j of A is to land in row k of PA. Then the kth row of P must be all zeroes except for a 1 in the jth position. This is because the kth row of PA is a combination of the rows of A, weighted by the entries of the kth row of P; since we want just row j, we put a 1 in the jth position of row k and zeroes elsewhere. The same holds for every row of P: each row contains a single 1. So does each column, since each row of A must go somewhere in PA. So:

Theorem: a permutation matrix is any matrix with a single 1 in each row and in each column. It is simply I with its rows (or columns) rearranged.

We can see that since there are n! orderings of the rows, there are n! permutation matrices of size n × n.

Every permutation matrix is invertible, and its inverse is again a permutation matrix, because we can simply put the rows back into their original order. The curious thing is that $P^{-1} = P^T$ for any permutation matrix. This is easy to check: when we multiply P by $P^T$, the 1 in a row of P meets the 1 in a column of $P^T$ exactly when the row and column numbers match, so $PP^T = I$.

Another way to see this is to think of creating P by swapping one pair of rows at a time. Let $P_{ij}$ be the matrix that swaps rows i and j. It is I with exactly two rows swapped, and in fact also with those same two columns swapped, so $P = P^T = P^{-1}$ for a single row swap. Now any general P can be reached by swapping rows one pair at a time, so $P = P_1 P_2 \cdots P_r$. Then
$$P^{-1} = (P_1 P_2 \cdots P_r)^{-1} = P_r^{-1} P_{r-1}^{-1} \cdots P_2^{-1} P_1^{-1} = P_r^T \cdots P_1^T = (P_1 \cdots P_r)^T = P^T.$$
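As a quick illustration (a minimal NumPy sketch, not part of the text's development; the helper name perm_matrix and its 0-indexed ordering convention are our own), we can build a permutation matrix from a row ordering and check both facts: $PP^T = I$, and PA reorders the rows of A.

```python
import numpy as np

def perm_matrix(order):
    """Row i of P has its single 1 in column order[i], so row i of PA
    is row order[i] of A.  `order` is a 0-indexed permutation."""
    n = len(order)
    P = np.zeros((n, n))
    P[np.arange(n), order] = 1.0   # one 1 in each row and each column
    return P

P = perm_matrix([0, 2, 1, 3])            # the single swap P_23
A = np.arange(16.0).reshape(4, 4)

print(np.allclose(P @ P.T, np.eye(4)))   # True: P^{-1} = P^T
print(P @ A)                             # rows 2 and 3 of A exchanged
```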
Permutations and factoring

How can we handle row swaps in our LU factorization? Let's say we are rolling merrily along in our elimination when, all of a sudden, we find a zero in the pivot position and have to swap rows i and j. So we start with A and use a bunch of L's to get to LLLLA; now we need to multiply by $P_{ij}$ in order to proceed. So we might end up with something like LLPLLLPLPLLLA. If we could get all the L's together, we could turn them into one big L as before. But how can we get the P's out from among the L's?

It would have been nice had we swapped around all the rows before we started elimination. Then we wouldn't have had this problem! Had we done that, we would now be in the position LLLLLLLLPPPPPA = U. We could then combine all the P's into one big permutation matrix, and we already know that the L's go on the other side. We would then have the factorization PA = LU. So let's take this as our guide for moving the P's out from among the L's: we will try to move all the P's rightward, next to the A.

First, look at a single PL pair in the string. The L has its sole off-diagonal entry in row i and column j, where j < i. The P swaps two rows, k and l. The key observation is that both k and l are larger than j: L eliminates in column j, and the next swap happens in some column past j, so it must swap rows past j. (Either k or l could equal i, though.)

Now, a very common trick in mathematics is "conjugation," where we conjugate b by a by forming $aba^{-1}$. Let's try that here: we would like $PLP^{-1}$. But we can't just put in a $P^{-1}$; we have to cancel it out as well. So the string LLLPLLL turns into $LLLPLP^{-1}PLL$. We also know that $P^{-1} = P$ for a swap, so $PLP^{-1}$ is the same as PLP. The first P swaps rows k and l; the second swaps columns k and l. This is why it is important that k and l are both larger than j: the column swap doesn't move the column where the off-diagonal entry is located. The first P might change where the entry sits within that column, but the column it is in doesn't change. Note that when the first P swaps the rows of L, the ones on the diagonal get swapped too, so row k gets a 1 in column l and vice versa; the second P then swaps them back into position. Here are two examples:

$$\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}
\begin{pmatrix} 1&0&0&0\\ 2&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}
\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}
= \begin{pmatrix} 1&0&0&0\\ 2&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}$$

$$\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}
\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 2&0&1&0\\ 0&0&0&1 \end{pmatrix}
\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}
= \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 2&0&0&1 \end{pmatrix}$$

In the first example, the rows being swapped do not involve the row of the off-diagonal entry, so nothing at all happens. In the second, the off-diagonal entry's row is involved, so the entry changes rows. This is the general case. So now we can say exactly how to move the P's. If L has its off-diagonal entry in a row not affected by P, then $PL = (PLP)P = LP$, and the P simply moves to the right past L. If the off-diagonal entry is in a row affected by P, then the entry moves to the other row affected by P; the same thing happens to the other L whose off-diagonal entry sits in that other row. So in the final analysis, all we end up doing is swapping two entries of L! We can therefore move all the P's rightward next to A and finish the factorization as usual.

Theorem: if A is a non-singular square matrix, it can be factored as PA = LU, where P is a permutation matrix, L is lower triangular with ones on the diagonal, and U is upper triangular. We may, of course, pull the pivots out of U to get PA = LDU.

This factorization is not, in general, unique, because different swaps are possible: we can swap a missing pivot with any lower row that has a leading non-zero entry.

Example: find PA = LU for
$$A = \begin{pmatrix} 2&-3&4&3\\ 6&-9&12&8\\ 4&-5&6&7\\ 2&1&-1&8 \end{pmatrix}$$

There is no need for a row swap to get things going, so simply eliminate in column 1:
$$P \text{ so far} = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}
\qquad L \text{ so far} = \begin{pmatrix} 1&0&0&0\\ 3&1&0&0\\ 2&&1&0\\ 1&&&1 \end{pmatrix}
\qquad U \text{ so far} = \begin{pmatrix} 2&-3&4&3\\ 0&0&0&-1\\ 0&1&-2&1\\ 0&4&-5&5 \end{pmatrix}$$

The pivot position (2, 2) now holds a zero, so we need a row swap. Let's swap rows 2 and 3. Remember to do the same in L and U!
$$P \text{ so far} = \begin{pmatrix} 1&0&0&0\\ 0&0&1&0\\ 0&1&0&0\\ 0&0&0&1 \end{pmatrix}
\qquad L \text{ so far} = \begin{pmatrix} 1&0&0&0\\ 2&1&0&0\\ 3&&1&0\\ 1&&&1 \end{pmatrix}
\qquad U \text{ so far} = \begin{pmatrix} 2&-3&4&3\\ 0&1&-2&1\\ 0&0&0&-1\\ 0&4&-5&5 \end{pmatrix}$$

Now eliminate in column 2. The multipliers (0 for row 3 and 4 for row 4) go into L as usual:
$$P \text{ so far} = \begin{pmatrix} 1&0&0&0\\ 0&0&1&0\\ 0&1&0&0\\ 0&0&0&1 \end{pmatrix}
\qquad L \text{ so far} = \begin{pmatrix} 1&0&0&0\\ 2&1&0&0\\ 3&0&1&0\\ 1&4&&1 \end{pmatrix}
\qquad U \text{ so far} = \begin{pmatrix} 2&-3&4&3\\ 0&1&-2&1\\ 0&0&0&-1\\ 0&0&3&1 \end{pmatrix}$$

Now we must do another row swap, of rows 3 and 4. Remember to make the same swap in the off-diagonal elements of L!
$$P = \begin{pmatrix} 1&0&0&0\\ 0&0&1&0\\ 0&0&0&1\\ 0&1&0&0 \end{pmatrix}
\qquad L = \begin{pmatrix} 1&0&0&0\\ 2&1&0&0\\ 1&4&1&0\\ 3&0&0&1 \end{pmatrix}
\qquad U = \begin{pmatrix} 2&-3&4&3\\ 0&1&-2&1\\ 0&0&3&1\\ 0&0&0&-1 \end{pmatrix}$$

We're now done, since U is upper triangular. The remaining element of L, $l_{43}$, is zero, since we don't need to do another elimination step. Check that PA = LU.
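The check can also be done mechanically; here is a small NumPy verification sketch (an illustration only), using the P, L, and U just found:

```python
import numpy as np

A = np.array([[2., -3.,  4.,  3.],
              [6., -9., 12.,  8.],
              [4., -5.,  6.,  7.],
              [2.,  1., -1.,  8.]])

P = np.array([[1., 0., 0., 0.],    # picks out rows 1, 3, 4, 2 of A
              [0., 0., 1., 0.],
              [0., 0., 0., 1.],
              [0., 1., 0., 0.]])
L = np.array([[1., 0., 0., 0.],
              [2., 1., 0., 0.],
              [1., 4., 1., 0.],
              [3., 0., 0., 1.]])
U = np.array([[2., -3.,  4.,  3.],
              [0.,  1., -2.,  1.],
              [0.,  0.,  3.,  1.],
              [0.,  0.,  0., -1.]])

print(np.allclose(P @ A, L @ U))   # True: PA = LU
```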
Any matrix can be row reduced in this way, swapping elements of L and U as necessary. This doesn't add anything to the operation count, because a swap is really just a matter of record-keeping; we don't do any arithmetic. We just need to know how to solve a system given in PA = LU form.

There is a shorthand notation for keeping track of P. Instead of moving things around in the entire matrix I, most of which is zero, it is easier to start with the vector (1, 2, 3, …, n). This vector says that the rows currently come in their natural order. As we proceed, we swap elements of this vector (think of it as a column, and perform the same swaps on it as we do on the rows of A, U, and L during elimination). In the elimination above, the first row swap produces (1, 3, 2, 4) and the second then gives (1, 3, 4, 2). This vector tells us in which column to place each 1 in the matrix P.

Solving PA = LU factored systems

So if we've factored PA = LU, how do we solve a system Ax = b? It's easy! If Ax = b, then PAx = Pb; in fact this is "if and only if," because P is invertible. So let b′ = Pb. Then, since PA = LU, we can proceed just as before, using b′ instead of b: we solve Lc = b′ by forward substitution and then Ux = c by back substitution. (A code sketch of this procedure appears at the end of these notes.)

Getting b′ doesn't even require that we multiply out Pb. If we have kept track of P in vector form, we just reorder b in the order given by that vector. For instance, with b = (4, 8, −7, 2) and the permutation vector (1, 3, 4, 2) above, we get b′ = (4, −7, 2, 8).

Reading: 2.7

Problems: 2.7: 1, 2, 3, 4 (the proof should allow for matrices of arbitrary size!), 6, 7, 11, 13, 14, 16, 17, 18, 19, 22, 36, 37, 40, and:

Factor
$$A = \begin{pmatrix} 1&0&1&3\\ 2&-2&3&8\\ 2&-2&3&11\\ 1&-6&9&10 \end{pmatrix}$$
into PA = LU, and use the factorization to solve Ax = b for
$$b_1 = \begin{pmatrix} 4\\ 7\\ 10\\ 13 \end{pmatrix} \qquad\text{and}\qquad b_2 = \begin{pmatrix} 0\\ 1\\ 1\\ 8 \end{pmatrix}$$
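Finally, here is a minimal NumPy/SciPy sketch of the solve procedure described above (an illustration, not a required method), using the factors from the worked example, the permutation vector (1, 3, 4, 2), and the vector b = (4, 8, −7, 2) from the text; the variable names are our own.

```python
import numpy as np
from scipy.linalg import solve_triangular

# Factors of the worked example (PA = LU); P is stored as the
# permutation vector (1, 3, 4, 2), written 0-indexed here.
perm = [0, 2, 3, 1]
L = np.array([[1., 0., 0., 0.],
              [2., 1., 0., 0.],
              [1., 4., 1., 0.],
              [3., 0., 0., 1.]])
U = np.array([[2., -3.,  4.,  3.],
              [0.,  1., -2.,  1.],
              [0.,  0.,  3.,  1.],
              [0.,  0.,  0., -1.]])

b = np.array([4., 8., -7., 2.])
b_prime = b[perm]                             # b' = Pb = (4, -7, 2, 8)
c = solve_triangular(L, b_prime, lower=True)  # forward substitution: Lc = b'
x = solve_triangular(U, c)                    # back substitution:   Ux = c

A = np.array([[2., -3.,  4.,  3.],
              [6., -9., 12.,  8.],
              [4., -5.,  6.,  7.],
              [2.,  1., -1.,  8.]])
print(np.allclose(A @ x, b))                  # True: x solves Ax = b
```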