Minimum Spanning Trees and Kruskal’s Algorithm
CLRS 23

Minimum Spanning Trees (MST)
• A common problem in communications networks and circuit design: connecting a set of nodes by a network of minimum total length
• Represent the problem as an undirected graph where edge weights denote wire lengths
[Figure: example weighted undirected graph]

Formally,
• Given a connected undirected graph G = (V, E), a spanning tree is an acyclic subset of edges T that connects all the vertices.
• Assuming each edge (u,v) of G has a numeric weight (or cost) w(u,v), the cost of a spanning tree T is the sum of the weights of the edges in T:
  $w(T) = \sum_{(u,v) \in T} w(u,v)$
• A minimum spanning tree (MST) is a spanning tree of minimum weight.

Minimum Spanning Trees (MST)
• It never pays to have cycles, since removing an edge from a cycle keeps the network connected
• The resulting connection graph is connected, undirected and acyclic, hence a free tree
• The MST is not necessarily unique
[Figure: two different MSTs of the example graph, each with total weight = 37]

Generic MST Algorithm
• Two main algorithms for computing MSTs
  – Kruskal’s
  – Prim’s
• Both are greedy algorithms
• The greedy strategy is captured by a generic approach:
  – Grow the MST one edge at a time
  – Which edge to select?

Generic MST Algorithms
• Maintain a set of edges A
• Loop invariant: A is a subset of some MST
• Safe edge: an edge (u,v) is safe for A if A ∪ {(u,v)} is also a subset of some MST

Generic-MST(G, w)
    A = empty set
    while A does not form a spanning tree do
        find an edge (u,v) that is safe for A
        A = A ∪ {(u,v)}
    return A

How to recognize safe edges?
Definitions first:
• Cut: Let S be a subset of the vertices V. A cut (S, V-S) is a partition of the vertices into two disjoint sets S and V-S.
[Figure: the example graph partitioned by a cut (S, V-S)]
• We say that an edge crosses the cut (S, V-S) if one of its endpoints is in S and the other is in V-S.
• We say that a cut respects a set A of edges if no edge in A crosses the cut.
• An edge is a light edge crossing a cut if its weight is the minimum of any edge crossing the cut. Ties are possible, so a cut may have more than one light edge.
[Figures: the same cut, illustrating a crossing edge, a set A respected by the cut, and a light edge]

How to recognize safe edges?
Theorem: Let G = (V, E) be a connected undirected graph with a real-valued weight function w defined on E. Let A be a subset of E that is included in some MST for G. Let (S, V-S) be any cut of G that respects A and let (u,v) be a light edge crossing (S, V-S). Then edge (u,v) is safe for A.

Proof
• Let T be an MST of G that includes A. If T contains (u,v), we are done, so suppose it does not.
[Figure: T drawn across the cut, with (u,v) crossing the cut and (x,y) an edge of T that also crosses it]
• Add (u,v) to T, creating a cycle.
• There must be at least one more edge on this cycle that crosses the cut: (x,y).
• The edge (x,y) is not in A, since the cut respects A.
• Remove (x,y) to restore a spanning tree T’ = (T - {(x,y)}) ∪ {(u,v)}.
• w(T’) = w(T) + w(u,v) - w(x,y) ≤ w(T), because (u,v) is a light edge and so w(u,v) ≤ w(x,y).
  – T is an MST, so w(T’) = w(T) and T’ is also an MST.
  – A ∪ {(u,v)} is a subset of T’, because (x,y) is not in A. Hence (u,v) is safe for A.
[Figure: the resulting tree T’]

Kruskal’s Algorithm
• Add edges to A in increasing order of weight.
  – If the edge being considered introduces a cycle, skip it.
  – Otherwise add it to A.
• Notice that the edges in A form a forest, whose trees keep getting merged together.

Correctness
• Say (u,v) is the edge to be added next, and it does not introduce a cycle in A.
• Let A’ denote the vertex set of the tree of A that contains vertex u. Consider the cut (A’, V-A’).
• No edge of A crosses this cut, so the cut respects A. The edge (u,v) crosses it, since v is not in A’ (otherwise adding (u,v) would create a cycle).
• Every lighter edge was considered earlier and was either added to A or skipped because both of its endpoints were already in the same tree. Either way it does not cross the cut, so (u,v) is a light edge crossing it.
• Thus, (u,v) is safe.
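To make the cut definitions and the safe-edge theorem concrete, here is a minimal Python sketch. The small graph, the vertex names, and the helper names crossing_edges, respects and light_edge are hypothetical and chosen only for illustration; they do not come from the slides. Edges are stored as (u, v, weight) tuples and a cut is given by its vertex set S.

```python
# Hypothetical example graph: (u, v, weight) tuples, undirected.
EDGES = [
    ("a", "b", 4), ("a", "c", 8), ("b", "c", 3),
    ("b", "d", 7), ("c", "e", 2), ("d", "e", 5),
]


def crossing_edges(edges, S):
    """Edges with exactly one endpoint in S, i.e. edges crossing (S, V-S)."""
    return [(u, v, w) for (u, v, w) in edges if (u in S) != (v in S)]


def respects(A, S):
    """True if no edge of A crosses the cut (S, V-S)."""
    return all((u in S) == (v in S) for (u, v, _) in A)


def light_edge(edges, S):
    """A minimum-weight edge crossing the cut (ties broken by list order)."""
    return min(crossing_edges(edges, S), key=lambda e: e[2])


S = {"a", "b"}           # one side of the cut
A = [("a", "b", 4)]      # edges chosen so far

if respects(A, S):
    # By the theorem, if A is contained in some MST, this edge is safe for A.
    print(light_edge(EDGES, S))
```

With this cut, the crossing edges are (a,c), (b,c) and (b,d), so the sketch selects (b,c). Note that the code only finds a light edge; the theorem's guarantee additionally requires that A is already contained in some MST.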
How to efficiently detect if adding (u,v) creates a cycle in A
• Can be done with a data structure called Disjoint-Set Union-Find
• Supports three operations:
  – CreateSet(u): create a set containing a single item u
  – FindSet(u): return the set that contains item u
  – Union(u,v): merge the set containing u with the set containing v
• Suffices to know: it takes O(n log n + m) time to carry out any sequence of m union-find operations on a set of size n.

Kruskal(G=(V,E), w) {
    A = empty set
    for each (u in V)
        CreateSet(u)                           // create a set for each vertex
    Sort E in increasing order by weight       // Θ(E log E) for sorting the edges
    for each ((u,v) from the sorted list E) {  // O(V log V + E) for a sequence of O(E) union-find operations
        if (FindSet(u) != FindSet(v)) {        // u and v are in different trees
            Add (u,v) to A
            Union(u,v)
        }
    }
    return A
}
Total: O(E log E), since E >= V-1

Prim’s Algorithm
• Edges in A always form a tree (partial MST)
• Start from a vertex r and grow until the tree spans all vertices
• Let S denote the set of vertices which are on this partial MST
  – A is the set of edges connecting the vertices in S
• At each stage a light edge crossing (S, V-S) is added to A
[Figure: the example graph with root r, the partial tree S, and the remaining vertices V-S]

Prim’s Algorithm
Prim(G=(V,E), w, r) {
    for each (u in V) {
        key[u] = infinity
        pred[u] = NIL
    }
    key[r] = 0
    PQ = V    // add all vertices to priority queue PQ, ordered by key
    while (PQ is not empty) do {
        u = PQ.Extract-Min()
        for each (v in Adj[u])
            if (v is in PQ and w(u,v) < key[v]) then {
                pred[v] = u
                key[v] = w(u,v)
                PQ.DecreaseKey(v, key[v])    // reorganizes the PQ
            }
    }
    // on exit, the MST edges are (pred[v], v) for every v other than r
}
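The two algorithms above can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the slides' own code. The edge list is assumed to match the example graph of CLRS Figure 23.1, with vertex names a to i taken from the textbook (the slides show only the weights). The DisjointSet class is a common forest-based union-find with path compression and union by size, which is not necessarily the structure behind the O(n log n + m) bound quoted above. Python's heapq has no DecreaseKey, so prim pushes a fresh entry whenever a key drops and skips stale entries on extraction.

```python
import heapq

# Assumed edge list of the CLRS Figure 23.1 example graph: (u, v, weight).
EDGES = [
    ("a", "b", 4), ("a", "h", 8), ("b", "c", 8), ("b", "h", 11),
    ("c", "d", 7), ("c", "f", 4), ("c", "i", 2), ("d", "e", 9),
    ("d", "f", 14), ("e", "f", 10), ("f", "g", 2), ("g", "h", 1),
    ("g", "i", 6), ("h", "i", 7),
]
VERTICES = sorted({x for u, v, _ in EDGES for x in (u, v)})


class DisjointSet:
    """Union-find forest with path compression and union by size."""

    def __init__(self, items):                 # CreateSet for every item
        self.parent = {x: x for x in items}
        self.size = {x: 1 for x in items}

    def find(self, x):                         # FindSet
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:          # path compression
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):                     # Union
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx                   # attach smaller tree to larger
        self.size[rx] += self.size[ry]


def kruskal(vertices, edges):
    """Return the MST edges chosen by Kruskal's algorithm."""
    ds = DisjointSet(vertices)
    mst = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if ds.find(u) != ds.find(v):           # u and v are in different trees
            mst.append((u, v, w))
            ds.union(u, v)
    return mst


def prim(vertices, edges, root):
    """Return the MST edges (pred[v], v, key[v]) found by Prim's algorithm."""
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    key = {v: float("inf") for v in vertices}
    pred = {v: None for v in vertices}
    key[root] = 0
    in_tree = set()
    heap = [(0, root)]
    while heap:
        _, u = heapq.heappop(heap)
        if u in in_tree:
            continue                           # stale entry, already extracted
        in_tree.add(u)
        for v, w in adj[u]:
            if v not in in_tree and w < key[v]:
                key[v] = w
                pred[v] = u
                heapq.heappush(heap, (w, v))   # stands in for DecreaseKey
    return [(pred[v], v, key[v]) for v in vertices if pred[v] is not None]


kruskal_mst = kruskal(VERTICES, EDGES)
prim_mst = prim(VERTICES, EDGES, "a")
print(sum(w for _, _, w in kruskal_mst), sum(w for _, _, w in prim_mst))
```

On this edge list both functions return 8 edges with total weight 37, the figure quoted earlier, although the two edge sets may differ because the MST is not unique.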