Lecture 7

Network Flows

• We consider a network with directed edges. Every edge has a capacity.
• If there is an edge from i to j, there is also an edge from j to i (possibly of capacity 0).
• There is a source s and a destination t.
• The objective is to find the maximum flow that can be sent from s to t.

Flow Maximization

• The flow on edge (u, v) is $x_{uv}$.
• Maximize the total output flow d from the source, such that:
– The flow on every edge is nonnegative and upper bounded by the edge capacity: $0 \le x_{uv} \le C_{uv}$.
– Input flow = output flow at every node u other than the source and the destination (flow conservation): $\sum_{v:(v,u)\in E} x_{vu} = \sum_{v:(u,v)\in E} x_{uv}$.
– At the source: $\sum_{v:(v,s)\in E} x_{vs} - \sum_{v:(s,v)\in E} x_{sv} = -d$.
– At the destination: $\sum_{v:(v,t)\in E} x_{vt} - \sum_{v:(t,v)\in E} x_{tv} = d$.

[Figure: example network from Source to Destination with edge labels 1, 1.5, 0.5 and 2.5.]

Applications of Network Flow

• Routing applications - MIRA
• Finding a maximum matching in an undirected bipartite graph
– A matching is a set of edges such that no two edges in the set share a vertex.
– A bipartite graph is a graph whose vertex set can be partitioned into two sets A and B such that every edge has one endpoint in A and the other in B.

Network flow algorithms can be used to find a maximum matching in a bipartite graph.

• As long as the link capacities are integers, there exists at least one maximum flow which allocates integer flows to the links, and there are network flow algorithms which attain this integer flow allocation.
• Consider the bipartite graph with unit capacity on every edge. All edges are directed from vertices in A to vertices in B.
• Add a source and a destination node.
• The source has a unit capacity edge to every vertex in A.
• The destination has a unit capacity edge from every vertex in B.
• Any integral flow is a matching and vice versa (an integral flow gives flow value 0 or 1 to every edge; the corresponding matching consists of the edges between A and B with flow value 1).
• So max flow gives a max matching.

[Figure: bipartite graph with the added source s and destination t.]

Scheduling on Uniform Parallel Machines

• There is a set of J jobs and M parallel machines.
• Every job has a start day and a deadline day.
• Job j needs $p_j$ days to finish if it is served continuously.
• A machine can work on only one job at a time.
• A job can be served by only one machine at a time.
• Preemption is allowed: the service of a job can be interrupted, and jobs can switch machines.
• Is there a schedule that completes every job by its deadline?

Network Flow Formulation

• Sort the arrival days and the deadlines.
– Example: let the arrival day be 1 for job 1, 3 for job 2, and 4 for job 3; the deadline is 2 for job 1, 5 for job 2, and 7 for job 3. The sorted list is 1, 2, 3, 4, 5, 7.
• Form the non-overlapping intervals between consecutive arrival and deadline epochs.
– The intervals are (1, 2), (3, 4), (4, 5), (5, 7). The interval (2, 3) is left out because no job can be processed in it.
• There is a vertex for each job and a vertex for each interval.
• A job can be processed in an interval iff the interval lies between its arrival day and its deadline.
• There is an edge from a job vertex to an interval vertex iff the job can be processed in the interval. The capacity of the edge is the length of the interval.
• There is a source with an edge to every job. The capacity of the edge to job j is $p_j$ (the number of machine-days required by job j).
• There is a destination with an edge from every interval. The capacity of such an edge is M times the length of the interval.

[Figure: flow network for the example above. The source s has edges of capacity p1, p2, p3 to jobs 1, 2, 3; each job has edges to the intervals in which it can be processed; the intervals (1, 2), (3, 4), (4, 5), (5, 7) have edges to t labelled 3, 3, 3, 6.]

• There exists a schedule iff the max flow is $p_1 + p_2 + p_3$.

Flows and Cuts

• A cut is a partition of the vertex set into two disjoint sets S and T such that the source belongs to S and the destination belongs to T.
• The capacity of a cut is the total capacity of the edges crossing the cut from S to T.
– $C[S, T] = \sum_{(u,v)\in E,\, u\in S,\, v\in T} C_{uv}$
• A minimum cut is a cut with minimum capacity among all cuts.
• The value of any flow is less than or equal to the capacity of any cut.
– The entire flow from the source to the destination must cross the cut.
• Hence, max flow $\le$ min cut capacity.
• We need to show that max flow = min cut.
• We will use the concept of residual graphs to prove this result.

Combinatorial Implications of Max Flow Min Cut Theorem

• The maximum number of edge-disjoint paths from the source to the destination equals the minimum number of edges whose removal from the network disconnects the source from the destination.
• In a bipartite graph, the maximum cardinality of any matching equals the minimum cardinality of any vertex cover.

Residual Network

• Given a flow, the residual capacity of an edge (u, v) is the maximum additional flow that can be sent from node u to node v using the edges (u, v) and (v, u).
• This value is $r_{uv} = C_{uv} - x_{uv} + x_{vu}$.
• The residual network is the network consisting of the edges with positive residual capacity.

[Figure: Original Network with edge labels (3, 4), (2, 2), (2, 3), (5, 5), (0, 1), and the corresponding Residual Network.]

• An augmenting path is a path in the residual network from the source to the destination.

Proof of Max Flow Min Cut Theorem

• Suppose there is no augmenting path from the source to the destination.
• Let S be the set of nodes which can be reached from the source in the residual network. Let T be the remainder of the nodes.
• Clearly $s \in S$ and $t \in T$, so S, T constitute a cut in the graph.
• There is no edge from a vertex in S to a vertex in T in the residual graph.
• Hence any edge (u, v) with $u \in S$ and $v \in T$ has 0 residual capacity: $r_{uv} = 0$.
• This means that $C_{uv} - x_{uv} + x_{vu} = 0$.
• Note that $x_{uv} \le C_{uv}$ and $x_{vu} \ge 0$.
• It follows that $x_{uv} = C_{uv}$ and $x_{vu} = 0$ for any edge crossing the cut.
• One can see intuitively, as well as argue formally, that the net flow from the source to the destination is the flow from S to T minus the flow from T to S, i.e., $\sum_{(u,v)\in E,\, u\in S,\, v\in T} x_{uv} - \sum_{(u,v)\in E,\, u\in S,\, v\in T} x_{vu}$.
• The first term is $C[S, T] = \sum_{(u,v)\in E,\, u\in S,\, v\in T} C_{uv}$ (the capacity of the cut). The second term is 0.
• Hence, the net flow from the source to the destination is equal to the capacity of the cut.
• Since the value of any flow is upper bounded by the capacity of any cut, and hence by the capacity of the S, T cut, this flow is the max flow.
• Since the capacity of any cut is lower bounded by the value of any flow, and hence by the value of this flow, the S, T cut is the minimum cut.
• It follows that max flow = min cut.
• All we still need to show is that there exists a flow for which the residual network has no augmenting path.
• We will present an algorithm which computes such a flow in finite time. By the previous argument, such a flow is the max flow.

Generic Augmenting Path Algorithm

• Note that the capacity of every edge (u, v) in the residual network is the maximum additional flow from u to v, using both the edges between u and v.
• Consider a path P from s to t in the residual graph. The flow from s to t can be augmented by the minimum residual capacity along this augmenting path.

Initially, the flow is 0
Construct the residual network
while the residual network has a directed path from s to t (an augmenting path)
    Identify an augmenting path P from s to t
    Let C be the capacity of the augmenting path P
    Send C units of flow along P
    Update the residual network

[Figure: worked example on a four-node network with source 1 and destination 4. Panel captions: Original Network/Residual Network with 0 flow; Route 4 units along 1-3-4; Route 1 unit along 1-2-3-4; Residual Network; Route 1 unit along 1-2-4; No augmenting path.]

Complexity Analysis

• When the algorithm terminates, the residual network has no augmenting path. By the previous argument, the output of this algorithm is a max flow.
• We will argue that the algorithm terminates in O(VEC) time if every link has an integer capacity, where C now denotes the maximum capacity of a link (not the path capacity used in the algorithm above).
• An augmenting path can be found in O(V + E).
• A residual network can be computed in O(E).
• Every time there is an augmentation, the flow increases by at least 1 unit, because all residual capacities remain integers.
• Since every edge capacity is upper bounded by C, and there are O(V) edges crossing the cut $\{s\}, V \setminus \{s\}$, the max flow is O(VC).
• So there are O(VC) augmentations.
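As a rough illustration of the generic augmenting path algorithm and of the bound on the number of augmentations, here is a minimal Python sketch. The function name `max_flow_generic`, the dictionary input format, the depth-first path search, and the four-node test network are all assumptions made for this example; the lecture does not prescribe any of them.

```python
from collections import defaultdict


def max_flow_generic(capacity, s, t):
    """Generic augmenting path sketch (assumes s != t and integer capacities).

    capacity: dict mapping (u, v) to the capacity of edge (u, v).
    Returns (flow value, number of augmentations performed).
    """
    # residual[u][v] holds r_uv = C_uv - x_uv + x_vu; with zero flow it is C_uv.
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += 0  # make the reverse edge explicit (capacity 0)

    def find_path():
        # Depth-first search using only edges of positive residual capacity.
        parent = {s: None}
        stack = [s]
        while stack:
            u = stack.pop()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            return None
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        return path[::-1]

    value = 0
    augmentations = 0
    while True:
        path = find_path()
        if path is None:
            break  # no augmenting path: the current flow is a max flow
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck  # less room left from u to v
            residual[v][u] += bottleneck  # the flow can be cancelled from v to u
        value += bottleneck
        augmentations += 1
    return value, augmentations


# Hypothetical four-node network: source 1, destination 4.
capacity = {(1, 2): 2, (1, 3): 4, (2, 3): 3, (2, 4): 1, (3, 4): 5}
value, augmentations = max_flow_generic(capacity, 1, 4)
print(value, augmentations)
```

On this network the max flow is 6 (the cut around node 1 has capacity 2 + 4), so with integer capacities no more than 6 augmentations can occur, whichever paths the search happens to pick.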
• Each augmentation costs the path search plus the residual network update: O(V + E) + O(E), which is O(E) when V = O(E).
• The overall complexity is O(VEC).
• The proof can be extended to rational capacities. Verify that the complexity for rational capacities is O(VEC/c), where c is the least positive capacity in the network.
• However, the algorithm may not terminate for irrational capacities, and the successive flow values may converge to a value strictly less than the max flow.
• There are other algorithms which do converge to the max-flow value for irrational capacities, and the max-flow min-cut theorem holds regardless.

Slow Speed of Convergence

• Convergence takes a long time for large capacity values.

[Figure: four-node network with source 1 and destination 4; edge labels C, C, C, C and 1.]

• First route 1 unit through 1-2-3-4.
• Then route 1 unit through 1-3-2-4.
• Again 1-2-3-4, and so on. Repeat the cycle.
• The max flow is reached only after O(C) augmentations.

Capacity Scaling Algorithm

• The above problem goes away if we route traffic along the maximum capacity augmenting path. In the example we then need just 2 iterations!
• It can be shown that routing traffic along the maximum capacity augmenting path reduces the number of iterations to O(E log C), as opposed to O(VC).
• However, we need O(E log E) time to locate the maximum capacity augmenting path, as opposed to O(E) for locating any augmenting path.
• But by choosing augmenting paths with "sufficiently large" capacity, though not the largest capacity, the algorithm runs in overall $O(E^2 \log C)$ time.
• $G(\Delta, x)$ is the subgraph of the residual network for flow x in which only edges with residual capacity not less than $\Delta$ are included.

1. Start with 0 flow (x = 0) and $\Delta = 2^{\lfloor \log_2 C \rfloor}$.
2. Compute the residual capacity graph $G(\Delta, x)$.
3. If the source is connected to the destination in $G(\Delta, x)$,
• choose any augmenting path and identify its capacity.
• Augment the existing flow with the capacity of this path.
• Go back to Step 2.
4. Else,
• Set $\Delta = \Delta/2$.
• If $\Delta \ge 1$, go to Step 2; else stop.

Complexity Analysis

• Consider the number of augmentations done with a particular value of $\Delta$.
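Before bounding that count, it may help to see the four steps of the capacity scaling algorithm written out. The following Python sketch is one possible reading of them under stated assumptions: integer capacities, s different from t, a breadth-first path search, and a hypothetical function name `capacity_scaling_max_flow` that also records how many augmentations each value of the threshold (called `delta` in the code) performs.

```python
from collections import defaultdict, deque


def capacity_scaling_max_flow(capacity, s, t):
    """Capacity scaling sketch. Returns (flow value, {delta: augmentations})."""
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += 0  # reverse edge with capacity 0

    def find_path(delta):
        # Breadth-first search restricted to edges with residual capacity
        # at least delta, i.e. a search in the subgraph G(delta, x).
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r >= delta and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return None
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        return path[::-1]

    # Step 1: zero flow, delta = largest power of two not exceeding max capacity.
    max_cap = max(capacity.values(), default=0)
    delta = 1
    while delta * 2 <= max_cap:
        delta *= 2

    value = 0
    per_phase = {}
    while delta >= 1:
        count = 0
        while True:
            path = find_path(delta)           # Steps 2 and 3
            if path is None:
                break
            bottleneck = min(residual[u][v] for u, v in path)
            for u, v in path:
                residual[u][v] -= bottleneck
                residual[v][u] += bottleneck
            value += bottleneck
            count += 1
        per_phase[delta] = count
        delta //= 2                           # Step 4
    return value, per_phase


# The slow-convergence network with C = 1000 (source 1, destination 4).
C = 1000
capacity = {(1, 2): C, (1, 3): C, (2, 3): 1, (2, 4): C, (3, 4): C}
value, per_phase = capacity_scaling_max_flow(capacity, 1, 4)
print(value, sum(per_phase.values()))
```

With the threshold starting at 512, the unit-capacity edge between nodes 2 and 3 is invisible to the search, so the two large paths are found immediately and only 2 augmentations are needed for a flow of 2000.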
• First note that one is trying the value $\Delta$ because, with the current flow, there is no path from the source to the destination with capacity $2\Delta$.
• Let S be the set of nodes reachable from the source in $G(2\Delta, x)$, where x is the flow when we start with $\Delta$.
• Clearly, the edges crossing the cut (S, V-S) have residual capacity less than $2\Delta$.
• The residual capacity of the cut is at most $2E\Delta$.
• This means that the max flow in the residual network is upper bounded by $2E\Delta$.
• Note that the difference between the max flow and the current flow is upper bounded by the max flow in the residual network.
• This means that the current flow cannot be increased by more than $2E\Delta$.
• Any augmentation in the $\Delta$-phase increases the flow by at least $\Delta$.
• So we can perform at most 2E such augmentations.
• For each value of $\Delta$, there are at most 2E augmentations. Each augmentation has complexity O(E) as before.
• We try O(log C) values of $\Delta$.
• So the overall complexity is $O(E^2 \log C)$.
• However, the complexities of both the previous algorithms depend on the maximum capacity value C. The O(VEC) bound is pseudo-polynomial; the $O(E^2 \log C)$ bound is polynomial in the number of bits of C, but it is still not a polynomial in the number of edges and vertices alone.

Polynomial Complexity Algorithms

• Complexity is a polynomial in the number of edges and vertices
• Shortest Augmenting Path algorithm
• Generic preflow-push algorithms
– FIFO pre-flow push algorithms
– Highest capacity pre-flow push algorithms

Shortest Augmenting Path Algorithm

• The Shortest Augmenting Path Algorithm forwards the flow along the shortest path from the source to the destination in the residual network.
• Note that if we repeatedly use the shortest path to forward the flow, the length of the shortest path never decreases.
• It can be shown that during E augmentations, the length of the shortest path is guaranteed to increase.
• The length of the shortest path cannot increase beyond V.
• Hence, there are at most VE augmentations.

Distance Labels

• There is a function d from the node space to the set of nonnegative integers.
• d(i) is a nonnegative integer for every node i
• d(t) = 0
• $d(i) \le d(j) + 1$ for every edge (i, j) in the residual network
• An edge (u, v) in the residual network is an admissible edge if d(u) = d(v) + 1.
• An admissible path is one which consists of only admissible edges.

Properties of Distance Labels

• The distance label of a node is less than or equal to the length of the shortest path from the node to the destination in the residual network (here the length of a path is the number of hops in the path).
• Observe that an admissible path from the source to the destination is indeed a shortest path.
• Hence, if the distance label of the source is at least V, then there is no augmenting path in the residual network. As we have proved earlier, the flow is then the maximum flow.

Algorithm Description

• Start with distance labels equal to the shortest path lengths to the destination in the original network.
• Always route flow along an admissible path in the residual network.
• The additional flow routed each time equals the capacity of the chosen admissible path.
• While constructing an admissible path, change the distance labels as necessary.
• Note that the residual network changes as flows are updated. The efficiency of the algorithm comes from the fact that not all distance labels need to be changed every time the residual network is updated by a flow augmentation.

Admissible paths are constructed as follows:

• The source node selects an admissible outgoing edge.
• The end node of this edge selects an admissible outgoing edge, and so on.
• Suppose no outgoing edge is admissible for some node u on the path, i.e., d(u) < d(v) + 1 for all of its outgoing edges (u, v).
• In that case, the label of u is increased to $\min_{v:(u,v)\in \text{residual network}} \{d(v) + 1\}$.
• Now the edge (w, u) in the path constructed so far is no longer admissible, since d(u) has increased.
• So (w, u) is removed from the path, and the process searches for another admissible edge from w.
• If none is found, then the label of w is increased in the same way, and so on.
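The label conditions and the relabel step described above can be tried out on a small residual network. The following Python sketch is illustrative only: the dictionary-of-dictionaries representation, the four helper names, and the example network are assumptions, and nodes that cannot reach the destination are simply given the label n (the number of nodes).

```python
from collections import deque


def initial_distance_labels(residual, t):
    """Hop distances to t, found by breadth-first search along reversed edges.

    residual: dict node -> {neighbor: residual capacity}; every node,
    including t, must appear as a key.
    """
    n = len(residual)
    d = {i: n for i in residual}  # n marks "cannot reach t"
    d[t] = 0
    queue = deque([t])
    while queue:
        j = queue.popleft()
        for i in residual:
            if residual[i].get(j, 0) > 0 and d[i] == n:
                d[i] = d[j] + 1
                queue.append(i)
    return d


def is_valid(residual, d, t):
    # d(t) = 0 and d(i) <= d(j) + 1 on every edge of the residual network.
    return d[t] == 0 and all(
        d[i] <= d[j] + 1
        for i in residual
        for j, r in residual[i].items()
        if r > 0
    )


def admissible_edges(residual, d):
    # Residual edges (i, j) with d(i) = d(j) + 1.
    return [
        (i, j)
        for i in residual
        for j, r in residual[i].items()
        if r > 0 and d[i] == d[j] + 1
    ]


def relabel(residual, d, u):
    # Raise d(u) to the minimum of d(v) + 1 over residual edges (u, v).
    n = len(residual)
    d[u] = min((d[v] + 1 for v, r in residual[u].items() if r > 0), default=n)
    return d[u]


# Hypothetical residual network with destination 4.
residual = {1: {2: 2, 3: 4}, 2: {3: 3, 4: 1}, 3: {4: 5}, 4: {}}
d = initial_distance_labels(residual, 4)
print(d, admissible_edges(residual, d))

# Pretend edge (2, 4) has been saturated: node 2 loses its only admissible edge.
residual[2][4] = 0
residual[4][2] = 1
print(relabel(residual, d, 2), is_valid(residual, d, 4))
```

After the relabel, node 2 has label 2, so the edge (1, 2) stops being admissible while (2, 3) becomes admissible, which is exactly the retreat situation described in the bullets.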
• Once an admissible path is computed (the path under construction reaches the destination), the existing flow is augmented by the capacity of the path.
• The residual network is modified.
• One then looks for a new admissible path.
• The algorithm terminates only when the distance label of the source reaches V.

Example

[Figure: example residual network from s to t with distance labels on the nodes, shown before and after relabeling.]

• (s, 1) is admissible.
• But there is no admissible edge from node 1.
• So node 1 is relabeled to 4.
• Consequently, s is relabeled to 5.
• The algorithm terminates.

Proof of Correctness

• Suppose the distance labels satisfy the required distance properties at termination. Then termination occurs only when there is no augmenting path from the source to the destination. Why? A valid distance label never exceeds the number of hops on a shortest path to t, and a path without repeated nodes has at most V - 1 hops, so a source label of V or more rules out any path from s to t.
• This means that at termination the output is the max flow, as argued earlier.
• We need to argue that the distance properties hold at termination.
• In fact, we will argue that neither relabeling nor augmentation violates the distance properties.
• Relabeling increases the distance label of a node.
• A violation would take place if, after relabeling node i, $d(i) > d(j) + 1$ for some edge (i, j) in the residual network.
• But we know that after relabeling $d(i) = \min_{j:(i,j)\in \text{residual network}} \{d(j) + 1\}$, so this cannot happen.
• For an edge (k, i) entering i, the condition $d(k) \le d(i) + 1$ continues to hold because d(i) has only increased.
• Also, when we route additional flow, the residual network changes. But this change is restricted to the augmenting path.
• Some edges disappear along this path. This does not cause any violation.
• However, some edges appear as well!
• But if an edge (j, i) appears, then flow has been routed along (i, j).
• This means (i, j) is admissible, i.e., d(i) = d(j) + 1.
• It follows that $d(j) \le d(i) + 1$.
• Hence, distance labels are always valid!

Complexity Analysis

• A combinatorial argument shows that the complexity is $O(V^2 E)$.
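Putting the pieces together, the shortest augmenting path algorithm (advance along admissible edges, relabel and retreat when stuck, augment on reaching t, stop when the source label reaches V) might be sketched in Python as follows. The function name, the adjacency-matrix representation with nodes numbered 0 to n-1, and the test networks are assumptions for illustration; this simple version scans all n neighbors at every step, so it does not claim to achieve the complexity bound quoted above.

```python
from collections import deque


def shortest_augmenting_path(n, capacity, s, t):
    """Max-flow value by the shortest augmenting path method (assumes s != t).

    n: number of nodes, labelled 0 .. n-1.
    capacity: dict mapping (u, v) to an integer capacity.
    """
    r = [[0] * n for _ in range(n)]  # r[u][v] is the residual capacity
    for (u, v), c in capacity.items():
        r[u][v] += c

    # Initial labels: hop distance to t in the original network
    # (breadth-first search along reversed edges); n means "cannot reach t".
    d = [n] * n
    d[t] = 0
    queue = deque([t])
    while queue:
        j = queue.popleft()
        for i in range(n):
            if r[i][j] > 0 and d[i] == n:
                d[i] = d[j] + 1
                queue.append(i)

    value = 0
    path = [s]  # nodes of the admissible path under construction
    while d[s] < n:
        u = path[-1]
        if u == t:
            # Augment by the capacity of the admissible path, then start over.
            edges = list(zip(path, path[1:]))
            bottleneck = min(r[a][b] for a, b in edges)
            for a, b in edges:
                r[a][b] -= bottleneck
                r[b][a] += bottleneck
            value += bottleneck
            path = [s]
            continue
        # Advance: pick any admissible edge (u, v), i.e. d(u) = d(v) + 1.
        nxt = None
        for v in range(n):
            if r[u][v] > 0 and d[u] == d[v] + 1:
                nxt = v
                break
        if nxt is not None:
            path.append(nxt)
            continue
        # Relabel u, then retreat one step unless u is the source.
        d[u] = min((d[v] + 1 for v in range(n) if r[u][v] > 0), default=n)
        if u != s:
            path.pop()
    return value


# Hypothetical four-node network: source 0, destination 3.
capacity = {(0, 1): 2, (0, 2): 4, (1, 2): 3, (1, 3): 1, (2, 3): 5}
print(shortest_augmenting_path(4, capacity, 0, 3))  # max-flow value: 6
```

Each relabel strictly raises a label, and a node whose label has reached n can never rejoin an admissible path starting at a source with a smaller label, which is why the loop is guaranteed to stop.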