CSCI 333: Greedy algorithms

advertisement
CSCI 333: Kruskal’s Algorithm
Overview
Prim’s algorithm builds upon a single partial MST, at each step adding
an edge connecting the vertex nearest to but not already in the
current partial minimum spanning tree.
Kruskal’s algorithm, on the other hand, maintains a set of partial
minimum spanning trees, and repeatedly adds the shortest edge in the
graph whose vertices are in different partial minimum spanning trees.
High-level Prim’s algorithm:
Let: F be a subset of edges, Y be a subset of vertices, and G be the
graph such that G=<E,V>
F=0
Y = {v1}
while (the instance is not solved) {
select a vertex in V-Y that is nearest to Y
add the vertex to Y
add the edge to F
if (Y == V)
done
}
Kruskal’s high-level algorithm:
F=0
while (the instance is not solved) {
select the next locally optimal edge
if (the edge connects two vertices in disjoint subsets) {
merge the subsets;
add the edge to F;
}
}
if (all the subsets are merged)
done
Kruskal’s algorithm for the Minimum Spanning Tree problem starts by
creating disjoint subsets of V, one for each vertex and containing only
that vertex. It then inspects the edges according to non-decreasing
weight. If an edge connects two vertices in disjoint subsets, the edge
is added and the subsets are merged into one set. This process is
repeated until all the subsets are merged into one set.
Kurskal’s Algorithm
In Kruskal’s algorithm, we have to handle multiple sets: the nodes in
each connected component. We need to carry out 2 operations:
find(x), where x is a node, and merge(A,B) to merge two disjoint sets.
We will use a disjoint set abstract data type, which consists of data
types index and set pointer, and routines initial, find, merge, and
equal. If we declare
index i ;
set pointer p, q ;
Then
 initial(n) initializes n disjoint subsets, each of which contains
exactly one of the indices between 1 and n.
 p = find (i) makes p point to the set containing index i.
 merge (p, q) merges the two sets, to which p and q point, into
the set.
 equal (p, q) returns true if p and q both point to the same set.
Data structures for Disjoint Sets are covered in Appendix C (page
590). For an on-line discussion click here.
void kruskal (int n, int m, set_of_edges E, set_of_edges&
F)
{
index I, j;
set_pointer p, q;
edge e;
sort the m edges in E by weight in non-decreasing order
F = 0
initial(n)
while (number of edges in F is less than n-1) {
e = edge with least weight not yet considered;
I, j = indices of vertices connected by e
p = find(i)
q = find (j)
if ( ! equal(p,q)) {
merge (p, q)
add e to F
}
}
}
To understand Kruskal’s algorithm better consider an example.
First, we sort the edges by increasing order of weight [(v1, v2), (v3,
v5), (v1, v3), (v2, v3), (v4, v5), (v2, v4)]. Then disjoint sets are
created. After that we select edge (v1, v2) because it is the least
weight, then edge (v3, v5), then edge (v1, v3), edge (v2, v3) is not
added because it will create a cycle. Then edge (v3, v4) is selected.
Time Complexity
Let n be the number of vertices, and m the number of edges.
The edges can be sorted in worse-case time  (m lg m) using
mergesort as described in chapter 2.
The disjoint sets can be initialized in time  (n).
Each pass through the while loop, we do two find operations and
potentially one merge operation. In the worse case, there are m
passes through the while loop. According to Appendix C (page 597),
the worse case number of comparisons done in m passes through a
loop containing a constant number of calls to equal, find and merge is
 (m lg m).
Therefore the time complexity is  (m lg m) where m is the
number of edges.
In the case of a fully connected graph, where every vertex is
connected to every other vertex,
m = (n)(n-1)/2 =  (n2)
Therefore, in the worse case, Kruskal’s algorithm is order:
 (n2 ln n2) =  (n2 ln n)
Comparing Prim’s Algorithm with Kruskal’s Algorithm
A given graph G can be fully connected or a connected graph G can
have as few as |V|-1 edges, hence the relationship between the
number of edges and the number of vertices in a connected graph is
|V|-1 <= |E| <= (|V|-1)|V|/2.
If a graph has large number of edges, i.e. |E| ~= |V|2, then the
running time of Kruskal’s Algorithm becomes  (|V|2log |V|) therefore
Prim’s algorithm is better.
If a graph has small number of edges, i.e. |E| ~= |V|, then the
running time of Kruskal’s Algorithm becomes  (|V| log |V|) therefore
Kruskal’s algorithm is better.
Proof
Again, a proof is necessary is establish that an optimal solution is
produced by a greedy algorithm. The proof by induction, similar to the
proof for Prim’s algorithm, is on page 154 of your text.
Work problems 6 and 9 on page 182 of your text.
Download