Chapter 9 The Transitive Closure, All Pairs Shortest Paths

This chapter studies two related problems that answer the following questions.

1. Can I get there from here? (Is there a path from u to v?)
2. What is the shortest path from u to v?

Kleene, Warshall, and Floyd all studied these problems.

Kleene: the Kleene closure, regular languages
Warshall: transitive closure
Floyd: all-pairs shortest paths

There are many applications: networks, air routes, computer connections, flowcharts, electrical connections, passwords, etc.

9.2.1 Definition of transitive closure and adjacency matrix

Definitions:
* S is a finite set of elements.
* A binary relation on S is a subset of S X S; call it A. That si is related to sj is written si A sj.
* A can be represented by an adjacency matrix, which is an important relation in itself: a_ij = true if si A sj, false otherwise.
* Equivalence relations and partial orders are additional examples of interesting relations.
* The relation can be viewed as a directed graph, as we looked at in the previous chapter: G = (S, A), where S is the set of vertices and A is the set of ordered pairs (edges).
* Zero matrix: all entries 0.
* Notation: ∧ and, ∨ or, ¬ not, + binary or, ∑ multi-way or (inclusive or; exclusive or is not used in this chapter).
* Identity matrix: all entries 0 except the diagonal, which is 1.
* A relation is transitive iff for all x, y, z in S, xAy and yAz imply xAz.
* In subscript form: sj A si and si A sk imply sj A sk (getting to one value lets you get to another).
* Reflexive transitive closure (RTC): let S be a set, let A be a binary relation on S, and let G = (S, A). The RTC is the binary relation R defined by: si R sj iff there is a path from si to sj in G.
* R is reflexive since there is a path of length zero from each vertex to itself.
* We are to study methods of finding the transitive closure.
* Assume the number of elements of S is n.
* Assume the number of elements of A is m.
* A relation and its transitive closure:

        0 1 0 0 1            1 1 1 1 1
        0 0 0 1 0            0 1 1 1 0
    A = 0 1 0 0 0        R = 0 1 1 1 0
        0 0 1 0 0            0 1 1 1 0
        0 0 0 1 0            0 1 1 1 1

9.2.2 Finding the Reachability Matrix by Depth-first search

* Let R ultimately be the reachability matrix.
* A solution: run a depth-first search from x and, for every vertex y that is found, assign 1 to r_xy (x R y). Each search fills one row of R, so R is built one row at a time.
* Modification: add the intermediate vertices of a row to R as they are seen.
* As seen in chapter 4, the worst case running time is O(nm) using an adjacency list structure.

9.2.3 Transitive Closure by Shortcuts

    // Input:  A and n, where A is an n X n boolean matrix
    // Output: R, the boolean matrix for the transitive closure of A
    // Arrays are assumed to be allocated (n+1) X (n+1); index 0 is unused.
    static void simpleTransitiveClosure(boolean[][] A, int n, boolean[][] R) {
        // Copy A into R and set the diagonal to true.
        for (int i = 1; i <= n; i++) {
            System.arraycopy(A[i], 1, R[i], 1, n);
            R[i][i] = true;
        }
        boolean changed = true;
        // Repeat while any entry of R changed during one complete pass.
        while (changed) {
            changed = false;
            for (int i = 1; i <= n; i++)
                for (int j = 1; j <= n; j++)
                    for (int k = 1; k <= n; k++)
                        if (!R[i][j] && R[i][k] && R[k][j]) {
                            R[i][j] = true;   // shortcut i -> j through k
                            changed = true;
                        }
        }
    }

Consider

        0 0 0 0 0
        1 0 1 0 0
    A = 0 0 0 1 0
        0 1 0 0 1
        1 0 0 0 0

Clearly, each complete pass of the triple loop takes O(n^3) time. On this A one pass is not enough: r[3][1] (path 3, 4, 2, 1) is not set until the second pass.

9.3.1 Warshall's Algorithm

* Consider the transitive relation as though it were a digraph.
* Consider Algorithm 8.1: it computes the transitive closure with O(n^3) work per pass.
* Changing the order of the subscripts (making k the outermost loop) makes the algorithm run faster by not reconsidering some of the triples. Consider the example.

    // input:  A and n, where A is an n X n adjacency matrix
    // output: R, the n X n transitive closure matrix of A
    // Arrays are assumed to be allocated (n+1) X (n+1); index 0 is unused.
    static void transitiveClosure(boolean[][] A, int n, boolean[][] R) {
        // Copy A into R and set all main diagonal entries r[i][i] to true.
        for (int i = 1; i <= n; i++) {
            System.arraycopy(A[i], 1, R[i], 1, n);
            R[i][i] = true;
        }
        for (int k = 1; k <= n; k++)
            for (int i = 1; i <= n; i++)
                for (int j = 1; j <= n; j++)
                    R[i][j] = R[i][j] | (R[i][k] & R[k][j]);
    }

9.3.2 Warshall's Algorithm for Bit Matrices

* Uses logical OR on whole rows.
* Does at most n^2 row ORs, but if a row of the matrix doesn't fit in a word, each row OR takes n/c word ORs, where c is the word size.
  So the algorithm still does on the order of n^3/c word ORs, yielding an algorithm in O(n^3).

    // input:  A and n, where A is an n X n adjacency matrix
    // output: R, the n X n transitive closure matrix of A
    // Arrays are assumed to be allocated (n+1) X (n+1); index 0 is unused.
    static void transitiveClosure(boolean[][] A, int n, boolean[][] R) {
        // Copy A into R and set all main diagonal entries r[i][i] to true.
        for (int i = 1; i <= n; i++) {
            System.arraycopy(A[i], 1, R[i], 1, n);
            R[i][i] = true;
        }
        for (int k = 1; k <= n; k++)
            for (int i = 1; i <= n; i++)
                if (R[i][k])                  // test r[i][k], not r[i][j]
                    bitwiseOR(R[i], R[k], n);
    }

    // Stand-in for a word-at-a-time OR on packed bit rows: a = a OR b.
    static void bitwiseOR(boolean[] a, boolean[] b, int n) {
        for (int j = 1; j <= n; j++)
            a[j] |= b[j];
    }

9.5 Computing Transitive Closure by Matrix Operations

* Let A be the relation on S.
* A^2 is the matrix of paths of length 2:

  $$a_{ij}^{(2)} = \sum_{k=1}^{n} (a_{ik} \wedge a_{kj})$$

* In a graph of n vertices, if there is a path from vertex v to vertex w, then there is a path from v to w of length at most n-1.
* In general, the Boolean product C = AB is

  $$c_{ij} = \sum_{k=1}^{n} (a_{ik} \wedge b_{kj})$$

* Requires work to compute the matrices.

Properties that ease our life:

* Absorption of +: A + A = A
* Commutative of +: A + B = B + A
* Associative of + and *: A + (B + C) = (A + B) + C and A * (B * C) = (A * B) * C
* Distributive of * over +: A * (B + C) = (A * B) + (A * C) and (B + C) * A = (B * A) + (C * A)
* Identity: IA = AI = A

For s >= n-1:

    R = I + A + A^2 + A^3 + ... + A^s

where the paths of length at least one are A + A^2 + ... + A^s = A(I + A + A^2 + ... + A^(s-1)).

Claim: I + A + A^2 + ... + A^s = (I + A)^s. Verify this!

Proof: induct on s.

* s = 0: the left side is I and the right side is (I + A)^0 = I. OK.
* s > 0: if I + A + A^2 + ... + A^(s-1) = (I + A)^(s-1), then

      (I + A)^s = (I + A)^(s-1) (I + A)
                = (I + A)^(s-1) I + (I + A)^(s-1) A
                = (I + A + ... + A^(s-1)) + (A + A^2 + ... + A^s)
                = I + A + A^2 + ... + A^s     (by absorption)

* Thm 9.7 Let A be an n X n Boolean matrix representing a binary relation. Then R, the transitive closure of A, is (I + A)^s for s >= n-1.

How much work does this require?

* I + A requires n operations to insert 1's on the diagonal.
* s >= n; let it be 2^⌈lg(n-1)⌉ + 1, i.e. s-1 is 2^⌈lg(n-1)⌉.
* Then (I + A)^(s-1) is computed in ⌈lg(n-1)⌉ matrix multiplications (repeated squaring).
* So R is computed in ⌈lg(n-1)⌉ + 1 matrix multiplications.
* Each multiplication requires O(n^3) operations, so R can be computed in O(n^3 lg n).

9.6.1 Kronrod's Algorithm

It is used to multiply Boolean matrices.
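As a baseline for this section, here is a minimal sketch of the straightforward Boolean product (one row union of B per 1 in A) and of the closure computed as a power of I + A by repeated squaring. The class and method names, the `fromRows` helper, and the 0-based indexing are assumptions made for illustration. The sketch squares until the exponent is at least n-1, which suffices by Thm 9.7, rather than following the exact count of multiplications given above.

```java
final class BooleanMatrixClosure {
    /** Hypothetical helper: builds a matrix from rows such as "01001". */
    static boolean[][] fromRows(String... rows) {
        boolean[][] m = new boolean[rows.length][rows[0].length()];
        for (int i = 0; i < rows.length; i++)
            for (int j = 0; j < rows[i].length(); j++)
                m[i][j] = rows[i].charAt(j) == '1';
        return m;
    }

    /** Boolean product C = A * B: c[i][j] = OR over k of (a[i][k] AND b[k][j]). */
    static boolean[][] multiply(boolean[][] a, boolean[][] b) {
        int n = a.length;
        boolean[][] c = new boolean[n][n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                if (a[i][k])                       // row k of B is unioned into row i of C
                    for (int j = 0; j < n; j++)
                        c[i][j] |= b[k][j];
        return c;
    }

    /** Reflexive transitive closure as (I + A)^s for some s >= n-1. */
    static boolean[][] closure(boolean[][] a) {
        int n = a.length;
        boolean[][] r = new boolean[n][];
        for (int i = 0; i < n; i++) {              // r = I + A
            r[i] = a[i].clone();
            r[i][i] = true;
        }
        // Invariant: r = (I + A)^len. Square until len >= n-1.
        for (int len = 1; len < n - 1; len *= 2)
            r = multiply(r, r);
        return r;
    }
}
```

For the 5 X 5 relation A of Section 9.2.1, `closure` needs two squarings and returns the matrix R shown there.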
C = A x B

Row i of A determines which rows of B are to be unioned to provide row i of C, and several rows of B will be unioned over and over. Kronrod's algorithm does all possible combinations of unions in groups ahead of time; the entries of A then determine which unions to use, thus yielding C.

Divide B into groups containing t rows. Compute all possible unions within each group.

    Groups: B1 ... Bt, Bt+1 ... B2t, ...

Suppose t = 4 and A and B are 12 x 12. All combinations of B1, B2, B3, B4 are computed. All unions can be found by doing 11 unions:

    B1 ∪ B2, B1 ∪ B3, B1 ∪ B4, B2 ∪ B3, B2 ∪ B4, B3 ∪ B4
    B1 ∪ B2 ∪ B3, B1 ∪ B2 ∪ B4, B2 ∪ B3 ∪ B4, B1 ∪ B3 ∪ B4
    B1 ∪ B2 ∪ B3 ∪ B4
    - 11 unions

Similar for B5...B8 and B9...B12.

Suppose A is:

    column  1 2 3 4 5 6 7 8 9 0 1 2
    -------------------------------
    A1 :    1 0 1 1 0 1 0 1 0 0 0 1
    A2 :    1 0 1 1 1 0 0 1 0 1 0 1
    A3 :    1 0 1 1 1 0 0 1 1 0 1 1
    A4 :    ...
    A5 :    ...
    A6 :    0 1 0 1
    A7 :    1 0 1 1 1 0 0 1 1 1 1 0
    A8 :    ...
    A9 :    ...
    A10:    ...
    A11:    ...
    A12:    ...

For the first row of C, 2 more unions are required: (B1 ∪ B3 ∪ B4) ∪ (B6 ∪ B8) ∪ (B12).

We can do fewer than n^2 unions if t is chosen correctly.

Let there be $g = \lceil n/t \rceil$ groups. For each group there are 2^t sets of rows to be combined, including the empty set and each of the single terms B1, B2, etc. A total of 2^t - 1 - t unions are done for each group; no unions are needed for the empty set or for the single rows B1, B2, B3, B4.

There are $\lceil n/t \rceil$ groups, so

$$\left\lceil \frac{n}{t} \right\rceil (2^t - 1 - t)$$

unions are done for the combinations.

Each row of C is computed by doing at most $\lceil n/t \rceil - 1$ additional unions, so at most

$$n \left( \left\lceil \frac{n}{t} \right\rceil - 1 \right)$$

additional unions are done for all rows.

The grand total of unions, for all combinations and for C, is the sum:

$$\left\lceil \frac{n}{t} \right\rceil (2^t - 1 - t) + n \left( \left\lceil \frac{n}{t} \right\rceil - 1 \right)$$

Considering the high order terms:

$$\frac{n 2^t}{t} + \frac{n^2}{t}$$

* if t = 1 => O(n^2): no improvement.
* if t = n => O(2^n): worse.

We want to minimize $\frac{n 2^t}{t} + \frac{n^2}{t}$. The term $\frac{n 2^t}{t}$ is dominant unless t < lg n, in which case $\frac{n^2}{t}$ is dominant. While $\frac{n^2}{t}$ is dominant we want t as large as possible, but not larger than lg n.
Let t be lg n, which makes

$$\frac{n 2^t}{t} + \frac{n^2}{t} = \frac{n^2}{\lg n} + \frac{n^2}{\lg n} = \frac{2n^2}{\lg n}$$

which is less than n^2 once lg n > 2; thus, an improvement.

IMPLEMENTATION

For each group we store 2^t = 2^(lg n) sets; use the array "UNIONS" to hold the sets. The index into "UNIONS" is a t-bit binary number b1b2...bt. The bits indicate which rows of B are included in UNIONS[i]:

    index  value  union                    index  value  union
    0      0000   empty set, no B's        8      1000   B1
    1      0001   B4                       9      1001   B1 ∪ B4
    2      0010   B3                       10     1010   B1 ∪ B3
    3      0011   B3 ∪ B4                  11     1011   B1 ∪ B3 ∪ B4
    4      0100   B2                       12     1100   B1 ∪ B2
    5      0101   B2 ∪ B4                  13     1101   B1 ∪ B2 ∪ B4
    6      0110   B2 ∪ B3                  14     1110   B1 ∪ B2 ∪ B3
    7      0111   B2 ∪ B3 ∪ B4             15     1111   B1 ∪ B2 ∪ B3 ∪ B4

2^t entries of UNIONS are required for the unions of the first t rows.

Break each row of A into segments of t entries each. A(i,j) is the jth segment of t entries in the ith row. For the first segment (for example i = j = 1), A(i,j) is the correct index into UNIONS. For example, if A(i,j) = 1101 = 13 => UNIONS(13), which is B1 ∪ B2 ∪ B4, the correct union of rows of B for this part of A. In general, the correct index into UNIONS is A(i,j) plus a multiple of 2^t, namely A(i,j) + (j-1)2^t.
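The implementation notes above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the textbook's code: the class and method names are hypothetical, indices are 0-based, rows are plain boolean arrays rather than packed words, and the union table is rebuilt for each group instead of being kept in one UNIONS array at offset (j-1)2^t. Each table entry is built from a smaller entry with one row union, and the leftmost row of a group is the high-order bit, as in the table.

```java
final class KronrodSketch {
    /** Boolean product C = A * B using groups of t rows of B and a table of their unions. */
    static boolean[][] multiply(boolean[][] a, boolean[][] b, int t) {
        int n = a.length;
        int groups = (n + t - 1) / t;                  // g = ceiling(n / t)
        boolean[][] c = new boolean[n][n];
        for (int g = 0; g < groups; g++) {
            int first = g * t;                         // first row of B in this group
            int size = Math.min(t, n - first);         // the last group may be short
            // unions[x] = union of the rows of B selected by the bits of x.
            boolean[][] unions = new boolean[1 << size][n];
            for (int x = 1; x < (1 << size); x++) {
                int low = Integer.numberOfTrailingZeros(x);   // lowest set bit of x
                int rest = x & (x - 1);                        // x with that bit cleared
                int row = first + (size - 1 - low);            // row of B for that bit
                for (int j = 0; j < n; j++)
                    unions[x][j] = unions[rest][j] | b[row][j];
            }
            // Read each row's segment of A as a binary index and union in that entry.
            for (int i = 0; i < n; i++) {
                int index = 0;
                for (int k = 0; k < size; k++)
                    index = (index << 1) | (a[i][first + k] ? 1 : 0);
                for (int j = 0; j < n; j++)
                    c[i][j] |= unions[index][j];
            }
        }
        return c;
    }
}
```

Rebuilding the table per group keeps only 2^t rows in memory at a time; storing all groups in one array, as the notes describe, trades that space for simpler indexing.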