UBI529
3. Distributed Graph
Algorithms
Distributed Algorithms Models
• Interprocess communication method: accessing shared memory, point-to-point or broadcast messages, or remote procedure calls.
• Timing model: synchronous or asynchronous models.
• Failure models: reliable or faulty behavior; Byzantine failures (a failed processor can behave arbitrarily).
We assume
A distributed network, modeled as a graph: nodes are processors and
edges are communication links.
• Nodes can communicate directly (only) with their neighbors through
the edges.
• Nodes have unique processor identities.
• Synchronous model: Time is measured in rounds (time steps).
• One message (typically of size O(log n)) can be sent through an edge
in a time step. A node can send messages simultaneously through all its
edges at once in a round.
• No failure of nodes or edges. No malicious nodes.
2.1 Vertex and Tree Coloring
• Vertex Coloring
• Sequential Vertex Coloring Algorithms
• Distributed Synchronous Vertex Coloring Algorithm
• Distributed Tree Coloring Algorithms
Preliminaries
Vertex Coloring Problem: Given an undirected graph G = (V,E), assign a
color cv to each vertex v ∈ V such that if e = (v,w) ∈ E, then cv ≠ cw. The
aim is to use the minimum number of colors.
Definition 2.1.1 : Given an undirected graph G, the chromatic number χ(G)
is the minimum number of colors needed to color it. A vertex k-coloring uses
exactly k colors. If χ(G) = k, G is k-colorable but not (k-1)-colorable.
Calculating χ(G) is NP-hard. The 3-coloring decision problem is NP-complete.
Applications :
Assignment of radio frequencies : colors represent frequencies and
transmitters are the vertices; two transmitters are neighbors if they
interfere.
University course scheduling : vertices are courses; an edge joins two
courses that share a student.
Fast register allocation in compilers : vertices are variables; two
variables are neighbors if they can be live at the same time.
Sequential Algorithm for Vertex Coloring
Algorithm 2.1.1 : Sequential Vertex Coloring
Input : G with v1,v2, ..., vn
Output : Vertex Coloring f : VG -> {1,2,3,..}
1. for i = 1 to n do
2.    f(vi) := smallest color number that does not conflict with any of
      the already colored neighbors of vi
3. return vertex coloring f
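The sequential coloring above can be sketched in Python. The function name `greedy_coloring` and the adjacency-dict representation are assumptions for illustration, not from the slides:

```python
def greedy_coloring(adj, order):
    """Color vertices in the given order; each vertex gets the
    smallest color not already used by a colored neighbor."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

# Path graph 1-2-3: two colors suffice.
adj = {1: [2], 2: [1, 3], 3: [2]}
print(greedy_coloring(adj, [1, 2, 3]))  # {1: 1, 2: 2, 3: 1}
```

Since a vertex with d colored neighbors always finds a free color in {1, …, d+1}, this never uses more than Δ+1 colors.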
Vertex Coloring Algorithms
Definition 2.1.2 : The number of neighbors of a vertex v is called the
degree δ(v) of v. The maximum degree over all vertices of a graph G is
called the graph degree Δ(G) = Δ.
Theorem 2.1.1 : The algorithm is correct and terminates in O(n)
steps. The algorithm uses at most Δ+1 colors.
Proof: Correctness and termination are straightforward. Since each
node has at most Δ neighbors, there is always at least one color free in
the range {1, …, Δ+1}.
Remarks:
• For many graphs coloring can be done with much less than Δ +1 colors.
• This algorithm is not distributed; only one processor is active at a
time. But the same idea can be reused as a “local” coloring subroutine
in a distributed algorithm.
Heuristic Vertex Coloring Algorithm : Largest Degree First
Idea (two observations): A vertex of large degree is more difficult
to color than a vertex of smaller degree. Also, a vertex with more
colored neighbors will be more difficult to color later.
Algorithm 2.1.2 : Largest Degree First Algorithm
Input : G with v1,v2, ..., vn
Output : Vertex Coloring f : VG -> {1,2,3,..}
1. while there are uncolored vertices of G
2.    among the uncolored maximum-degree vertices,
      choose the vertex v with the maximum colored degree
3.    assign the smallest possible color k to v : f(v) := k
4. return vertex coloring f
The coloring order in the diagram is v3,v1,v2,v4,v8,v6,v7,v5
Colored degree : the number of different colors used on the neighbors of v
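The largest-degree-first heuristic can be sketched as follows; `largest_degree_first` is a hypothetical helper name, and ties are broken arbitrarily:

```python
def largest_degree_first(adj):
    """Repeatedly pick, among uncolored vertices of maximum degree,
    one whose neighbors already use the most distinct colors."""
    color = {}
    uncolored = set(adj)
    while uncolored:
        def colored_degree(v):
            return len({color[u] for u in adj[v] if u in color})
        max_deg = max(len(adj[v]) for v in uncolored)
        cand = [v for v in uncolored if len(adj[v]) == max_deg]
        v = max(cand, key=colored_degree)   # tie-break by colored degree
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
        uncolored.remove(v)
    return color
```

On a triangle all three vertices end up with distinct colors, as they must.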
Coloring Trees : A Distributed Algorithm
Lemma 2.1.1: χ(Tree) ≤ 2.
Proof: If the distance of a node to the root is odd (even), color it 1 (0).
An odd node has only even neighbors and vice versa.
If we assume that each node knows its parent (the root has no parent) and
its children in the tree, this constructive proof gives a very simple algorithm.
Algorithm 2.1.3 [Slow tree coloring]:
1. Root sends color 0 to its children. (Root is colored 0.)
2. When receiving a message x from its parent, a node u picks color cu = 1 − x and sends cu to its children.
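A centralized sketch of the slow tree coloring: each node takes color 1 − x where x is its parent's color, i.e. the parity of its depth. The name `color_tree` and the children-dict representation are assumptions:

```python
from collections import deque

def color_tree(children, root):
    """Root gets 0; a node receiving x from its parent picks 1 - x.
    Equivalently, a node's color is its depth parity."""
    color = {root: 0}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in children.get(u, []):
            color[v] = 1 - color[u]     # the message from the parent
            q.append(v)
    return color
```

The number of message rounds equals the height of the tree, matching the remark on the next slide.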
Distributed Tree Coloring
Remarks:
• By the proof of Lemma 2.1.1, Algorithm 2.1.3 is correct.
• The time complexity of the algorithm is the height of the tree.
• When the root is chosen randomly, this can be up to the diameter of
the tree.
2.2 Distributed Tree based Communication
Algorithms
• Broadcast
• Convergecast
• BFS Tree Construction
Broadcast
Broadcasting means sending a message from a source node to all other
nodes of the network.
Two basic broadcasting approaches are flooding and
spanning tree-based broadcast.
Flooding:
A source node s wants to send a message to all
nodes in the network. s simply forwards the message over all its edges.
Any vertex v != s, upon receiving the message for
the first time (over an edge e) forwards it on every
other edge.
Upon receiving the message again it does nothing.
Broadcast
Definition 2.2.1 [Broadcast]: A broadcast operation is initiated by a
single processor, the source. The source wants to send a message to all
other nodes in the system.
Definition 2.2.2 [Distance, Radius, Diameter]:
• The distance between two nodes u, v in an undirected graph is the
number of hops of a minimum path between u and v.
• The radius of a node u in a graph is the maximum distance between u
and any other node. The radius of a graph is the minimum radius of any
node in the graph.
• The diameter of a graph is the maximum distance between two
arbitrary nodes.
Broadcast
Theorem 2.2.1 [Lower Bound]: The message complexity of a broadcast
is at least n-1. The radius of the graph is a lower bound for the time
complexity.
Proof: Every node must receive the message.
Remarks:
• You can use a pre-computed spanning tree to do the broadcast with
tight message complexity.
• If the spanning tree is a breadth-first spanning tree (for a given
source), then also the time complexity is tight.
Definition 2.2.3 : A graph (system/network) is clean if the nodes do
not know the topology of the graph.
Theorem 2.2.2 [Clean Lower Bound]: For a clean network, the number
of edges is a lower bound for the broadcast message complexity.
Proof: If you do not try every edge, you might miss a whole part of the
graph behind it.
Flooding
Algorithm 2.2.1 [Flooding]: The source sends the message to all
neighbors. Each node receiving the message the first time forwards to
all (other) neighbors.
Remarks:
• If node v receives the message first from node u, then node v calls
node u “parent”. This parent relation defines a spanning tree T. If the
flooding algorithm is executed in a synchronous system, then T is a
breadth-first spanning tree (with respect to the root).
• More interestingly, also in asynchronous systems the flooding
algorithm terminates after r time units, where r is the radius of the
source. (But note that the constructed spanning tree need not be
breadth-first.)
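A round-by-round simulation of synchronous flooding; `flood` and the adjacency-dict format are illustrative assumptions. The parent pointers recorded on first reception form the BFS spanning tree described in the remark:

```python
def flood(adj, s):
    """Synchronous flooding from source s. Returns (parent, dist):
    parent[v] is the neighbor v first heard the message from, and
    dist[v] is the round in which v received it (its distance to s)."""
    parent = {s: None}
    dist = {s: 0}
    frontier = [s]
    t = 0
    while frontier:
        t += 1
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in parent:     # first reception: adopt parent
                    parent[v] = u
                    dist[v] = t
                    nxt.append(v)
        frontier = nxt
    return parent, dist
```

Each edge carries the message at most twice, giving the Θ(|E|) message bound of the next slide.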
Flooding Analysis
Theorem : The message complexity of flooding is Θ(|E|) and the time
complexity is O(D), where D is the diameter of G.
Proof: The message complexity follows from the fact that each edge
delivers the message at least once and at most twice (once in each
direction). To show the time complexity, we use induction on t to show
that after t time units, the message has already reached every
vertex at a distance of t or less from the source.
Broadcast Over a Rooted Spanning Tree
Suppose processors already have information about a rooted spanning
tree of the communication topology:
• tree: connected graph with no cycles
• spanning tree: contains all processors
• rooted: there is a unique root node
Implemented via parent and children local variables at each processor,
which indicate which incident channels lead to the parent and children
in the rooted spanning tree.
Broadcast Over a Rooted Spanning Tree: A Simple Algorithm
1. root initially sends msg to its children
2. when a node receives msg from its parent
   • sends msg to its children
   • terminates (sets a local boolean to true)
Synchronous model:
• time is the depth of the spanning tree, which is at most n - 1
• number of messages is n - 1, since one message is sent over each
spanning tree edge
Asynchronous model:
• same time and message complexities
Tree Broadcast
Assume that a spanning tree has been constructed.
Theorem: For every n-vertex graph G with a spanning tree T rooted at
r0, the message complexity of broadcast is n−1 and the time complexity
is depth(T).
A broadcast algorithm can be used to construct a spanning tree in G.
The message complexity of broadcast is asymptotically equivalent to
the message complexity of spanning tree construction.
Using a breadth-first spanning tree, we get the
optimal message and time complexities for broadcast.
Convergecast
Again, suppose a rooted spanning tree has already been computed by
the processors
• parent and children variables at each processor
Do the opposite of broadcast:
• leaves send messages to their parents
• non-leaves wait to get a message from each child, then send the
combined info to their parent
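A minimal convergecast sketch, assuming the combining operation is summation (any associative combiner works); `convergecast` and the children-dict format are hypothetical:

```python
def convergecast(children, root, value):
    """Leaves report their value; an internal node combines its own
    value with its children's reports and forwards the result upward.
    Returns what the root ends up with."""
    def up(u):
        return value[u] + sum(up(c) for c in children.get(u, []))
    return up(root)
```

The recursion mirrors the message flow: a node "sends" only after all of its children have reported.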
Convergecast
[Figure: convergecast on a spanning tree rooted at a. Solid arrows show
parent-child relationships; dotted lines are non-tree edges. Each node
forwards the set of labels collected from its subtree (e.g. {b,d},
{c,f,h}) toward the root.]
Finding a Spanning Tree Given a Root
A distinguished processor is known, to serve as the root.
root sends M to all its neighbors
when a non-root first gets M
• set the sender as its parent
• send "parent" msg to sender
• send M to all other neighbors
when get M otherwise
• send "reject" msg to sender
Use the "parent" and "reject" msgs to set the children variables and to
know when to terminate.
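The rules above can be simulated centrally (here in synchronous, FIFO order, which yields the BFS tree); `build_tree` is a hypothetical name:

```python
from collections import deque

def build_tree(adj, root):
    """Each node, on first receiving M, makes the sender its parent
    (replying "parent"); later copies of M are answered "reject".
    The children sets are derived from the "parent" replies."""
    parent = {root: None}
    children = {v: [] for v in adj}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in parent:         # v replies "parent" to u
                parent[v] = u
                children[u].append(v)
                q.append(v)
            # otherwise v replies "reject": u learns (u,v) is non-tree
    return parent, children
```

In an asynchronous execution the order of receptions can differ, so the resulting tree need not be BFS, exactly as the next slide illustrates.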
Execution of Spanning Tree Alg.
[Figure: two executions on the same graph (nodes a–h) rooted at a.]
Both models: O(m) messages, O(diam) time.
Synchronous: always gives a breadth-first search (BFS) tree.
Asynchronous: not necessarily a BFS tree.
2.3 Distributed Minimum Spanning Tree
Algorithms
Minimum Spanning Tree
Minimum spanning tree. Given a connected graph G = (V, E) with real-valued edge weights ce, an MST is a subset of the edges T ⊆ E such
that T is a spanning tree whose sum of edge weights is minimized.
[Figure: a weighted graph G = (V, E) and an MST T with Σe∈T ce = 50.]
Cayley's Theorem. There are n^(n-2) spanning trees of Kn.
So we can't solve MST by brute force.
Applications
MST is a fundamental problem with diverse applications.
• Network design
  – telephone, electrical, hydraulic, TV cable, computer, road
• Approximation algorithms for NP-hard problems
  – traveling salesperson problem, Steiner tree
• Indirect applications
  – max bottleneck paths
  – LDPC codes for error correction
  – image registration with Renyi entropy
  – learning salient features for real-time face verification
  – reducing data storage in sequencing amino acids in a protein
  – model locality of particle interactions in turbulent fluid flows
  – autoconfig protocol for Ethernet bridging to avoid cycles in a network
• Cluster analysis
Greedy Algorithms
Kruskal's algorithm. Start with T = ∅. Consider edges in ascending
order of cost. Insert edge e into T unless doing so would create a cycle.
Reverse-Delete algorithm. Start with T = E. Consider edges in
descending order of cost. Delete edge e from T unless doing so would
disconnect T.
Prim's algorithm. Start with some root node s and greedily grow a tree
T from s outward. At each step, add the cheapest edge e to T that has
exactly one endpoint in T.
Remark. All three algorithms produce an MST.
Greedy Algorithms
Simplifying assumption. All edge costs ce are distinct.
Cut property. Let S be any subset of nodes, and let e be the min cost
edge with exactly one endpoint in S. Then the MST contains e.
Cycle property. Let C be any cycle, and let f be the max cost edge
belonging to C. Then the MST does not contain f.
[Figure: e is the min cost edge leaving S, so e is in the MST;
f is the max cost edge on cycle C, so f is not in the MST.]
Cycles and Cuts
Cycle. Set of edges of the form a-b, b-c, c-d, …, y-z, z-a.
Example (on nodes 1–8): Cycle C = 1-2, 2-3, 3-4, 4-5, 5-6, 6-1
Cutset. A cut is a subset of nodes S. The corresponding cutset D is
the subset of edges with exactly one endpoint in S.
Example: Cut S = { 4, 5, 8 },
Cutset D = 5-6, 5-7, 3-4, 3-5, 7-8
Cycle-Cut Intersection
Claim. A cycle and a cutset intersect in an even number of edges.
Example: Cycle C = 1-2, 2-3, 3-4, 4-5, 5-6, 6-1
Cutset D = 3-4, 3-5, 5-6, 5-7, 7-8
Intersection = 3-4, 5-6
Pf. (by picture) Each time the cycle crosses from S to V-S it must
eventually cross back, so the crossings come in pairs.
Greedy Algorithms
Simplifying assumption. All edge costs ce are distinct.
Cut property. Let S be any subset of nodes, and let e be the min cost
edge with exactly one endpoint in S. Then the MST T* contains e.
Pf. (exchange argument)
• Suppose e does not belong to T*, and let's see what happens.
• Adding e to T* creates a cycle C in T*.
• Edge e is both in the cycle C and in the cutset D corresponding to S
  ⇒ there exists another edge, say f, that is in both C and D.
• T' = T* ∪ { e } - { f } is also a spanning tree.
• Since ce < cf, cost(T') < cost(T*).
• This is a contradiction. ▪
Greedy Algorithms
Simplifying assumption. All edge costs ce are distinct.
Cycle property. Let C be any cycle in G, and let f be the max cost edge
belonging to C. Then the MST T* does not contain f.
Pf. (exchange argument)
• Suppose f belongs to T*, and let's see what happens.
• Deleting f from T* disconnects T*; let S be one side of the resulting cut.
• Edge f is both in the cycle C and in the cutset D corresponding to S
  ⇒ there exists another edge, say e, that is in both C and D.
• T' = T* ∪ { e } - { f } is also a spanning tree.
• Since ce < cf, cost(T') < cost(T*).
• This is a contradiction. ▪
Prim's Algorithm: Proof of Correctness
Prim's algorithm. [Jarník 1930, Dijkstra 1957, Prim 1959]
• Initialize S = {any node}.
• Apply the cut property to S.
• Add the min cost edge in the cutset corresponding to S to T, and add
the one new explored node u to S.
Implementation: Prim's Algorithm
Implementation. Use a priority queue a la Dijkstra.
• Maintain the set of explored nodes S.
• For each unexplored node v, maintain attachment cost a[v] = cost of the
cheapest edge from v to a node in S.
• O(n2) with an array; O(m log n) with a binary heap.

Prim(G, c) {
   foreach (v ∈ V) a[v] ← ∞
   Initialize an empty priority queue Q
   foreach (v ∈ V) insert v onto Q
   Initialize set of explored nodes S ← ∅
   while (Q is not empty) {
      u ← delete min element from Q
      S ← S ∪ { u }
      foreach (edge e = (u, v) incident to u)
         if ((v ∉ S) and (ce < a[v]))
            decrease priority a[v] to ce
   }
}
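A runnable Prim sketch using Python's `heapq`. Instead of decrease-key it uses lazy deletion (stale heap entries are skipped on pop), a common workaround since `heapq` has no decrease-key; `prim` and the adjacency format are assumptions:

```python
import heapq

def prim(adj, s):
    """adj[u] -> list of (v, w) pairs. Returns (MST edges, total weight).
    Stale heap entries for already-explored nodes are skipped."""
    explored = {s}
    edges, total = [], 0
    heap = [(w, s, v) for v, w in adj[s]]
    heapq.heapify(heap)
    while heap and len(explored) < len(adj):
        w, u, v = heapq.heappop(heap)
        if v in explored:
            continue                    # lazy deletion of stale entry
        explored.add(v)
        edges.append((u, v))
        total += w
        for x, wx in adj[v]:
            if x not in explored:
                heapq.heappush(heap, (wx, v, x))
    return edges, total
```

Each edge is pushed at most twice, so this runs in O(m log n), matching the binary-heap bound on the slide.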
Kruskal's Algorithm: Proof of Correctness
Kruskal's algorithm. [Kruskal, 1956]
• Consider edges in ascending order of weight.
• Case 1: If adding e to T creates a cycle, discard e according to the
cycle property.
• Case 2: Otherwise, insert e = (u, v) into T according to the cut
property, where S = set of nodes in u's connected component.
[Figure: Case 1, e closes a cycle; Case 2, e crosses the cut around S.]
Implementation: Kruskal's Algorithm
Implementation. Use the union-find data structure.
• Build the set T of edges in the MST.
• Maintain a set for each connected component.
• O(m log n) for sorting and O(m α(m, n)) for union-find.
  (m ≤ n2, so log m is O(log n); α(m, n) is essentially a constant.)

Kruskal(G, c) {
   Sort edge weights so that c1 ≤ c2 ≤ ... ≤ cm
   T ← ∅
   foreach (u ∈ V) make a set containing singleton u
   for i = 1 to m {
      (u,v) = ei
      if (u and v are in different sets) {   // different connected components?
         T ← T ∪ { ei }
         merge the sets containing u and v   // merge two components
      }
   }
   return T
}
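A self-contained Kruskal sketch with a small union-find (path halving plus union by size); `kruskal` and the `(w, u, v)` edge format are illustrative choices:

```python
def kruskal(n, edges):
    """edges: list of (w, u, v) with vertices 0..n-1.
    Returns (MST edge list, total weight)."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    T, total = [], 0
    for w, u, v in sorted(edges):           # ascending weight order
        ru, rv = find(u), find(v)
        if ru != rv:                        # different components
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru                 # union by size
            size[ru] += size[rv]
            T.append((u, v))
            total += w
    return T, total
```

Sorting dominates the running time, giving O(m log n) overall, as stated above.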
Distributed Spanning tree construction
For a graph G = (V,E), a spanning tree is a maximal connected acyclic subgraph T = (V,E’),
E’ ⊆ E, such that if one more edge is added, then the subgraph is no longer a tree.
Used for broadcasting in a network.
Chang-Roberts algorithm {The root is known}
Uses signals and acks, similar to the termination detection algorithm.
Uses the same rule for sending acknowledgments.
[Figure: a six-node graph with node 0 designated as the root.]
Question: What if the root is not designated?
Chang-Roberts Spanning Tree Alg
program   probe-echo
define    N : integer {no. of neighbors}
          C, D : integer
initially parent := i; C := 0; D := 0

{for the initiator}
send probes to each neighbor;
D := no. of neighbors;
do D != 0 ∧ echo -> D := D - 1 od    {D = 0 signals the end}

{for a non-initiator process i > 0}
do probe ∧ parent = i ∧ C = 0 ->
      C := 1; parent := sender;
      if i is not a leaf -> send probes to non-parent neighbors;
                            D := no. of non-parent neighbors
      fi
[] echo -> D := D - 1
[] probe ∧ sender != parent -> send echo to sender
[] C = 1 ∧ D = 0 -> send echo to parent; C := 0
od
Graph traversal
Consider web-crawlers, exploration of social networks,
graph layouts for visualization or drawing etc.
Many applications of exploring an unknown graph by a visitor
(a token or mobile agent or a robot). The goal of traversal
is to visit every node at least once, and return to the starting point.
- How efficiently can this be done?
- What is the guarantee that all nodes will be visited?
- What is the guarantee that the algorithm will terminate?
Graph traversal and Spanning Tree Formation
Tarry’s algorithm is one of the oldest (1895).
Rule 1. Send the token towards each neighbor exactly once.
Rule 2. If Rule 1 is not applicable, then send the token to the parent.
[Figure: a seven-node graph (nodes 0–6) rooted at 0.]
A possible route is: 0 1 2 5 3 1 4 6 2 6 4 1 3 5 2 1 0
The nodes and their parent pointers generate a spanning tree
that may not be a DFS tree.
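Tarry's two rules can be simulated centrally; `tarry` is a hypothetical helper, and "towards each neighbor exactly once" is modeled by consuming one outgoing channel per use, with the parent channel kept for last:

```python
def tarry(adj, root):
    """Token traversal. Rule 1: forward the token over an unused channel
    (preferring non-parent channels); Rule 2: otherwise return it to the
    parent. Returns the token's route and the parent pointers."""
    unused = {u: list(adj[u]) for u in adj}   # channels u has not used yet
    parent = {root: None}
    route = [root]
    u = root
    while True:
        choices = [v for v in unused[u] if v != parent[u]]
        if choices:                            # Rule 1
            v = choices[0]
            unused[u].remove(v)
        elif parent[u] is not None:            # Rule 2
            v = parent[u]
            if v in unused[u]:
                unused[u].remove(v)
        else:
            break                              # token back at root: done
        if v not in parent:
            parent[v] = u                      # first visit sets parent
        u = v
        route.append(u)
    return route, parent
```

Every edge ends up traversed once in each direction, so the route makes 2m hops and ends at the root.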
Distributed MST
Def MST Fragment : In a weighted graph G = (V,E,w), a tree T in G is
called an MST fragment of G iff there exists an MST of G such that T is
a subgraph of that MST.
Def MWOE : An edge e is an outgoing edge of an MST fragment T iff
exactly one of its endpoints belongs to T. The minimum weight outgoing
edge is denoted MWOE(T).
Lemma : Consider an MST fragment T of a graph G = (V, E, w). Let
e = MWOE(T). Then T ∪ {e} is an MST fragment as well.
Proof : Let TM be an MST containing T. If TM contains e we are done.
Otherwise, let e’ be an edge of TM that connects T to the rest of TM.
Clearly, e’ is an outgoing edge of T, so w(e’) ≥ w(e). Adding e to TM
creates a graph C with a cycle through e and e’. Discarding e’ from C
yields a new spanning tree T’M with w(T’M) ≤ w(TM), so T’M is an MST
containing T ∪ {e}.
Minimum Spanning Tree
Given a weighted graph G = (V, E), generate a spanning tree T = (V, E’) such that
the sum of the weights of all the edges is minimum.
Applications
• On the Euclidean plane, approximate solutions to the traveling
salesman problem (the shortest route that visits a collection of
cities and returns to the starting point),
• Lease phone lines to connect the different offices with a minimum cost,
• Visualizing multidimensional data (how entities are related to each other).
We are interested in distributed algorithms only.
Sequential algorithms for MST
Review (1) Prim’s algorithm and (2) Kruskal’s algorithm.
Theorem. If the weight of every edge is distinct, then the MST is unique.
[Figure: a weighted graph split into two fragments T1 and T2, with e a
minimum weight edge crossing between them.]
Gallagher-Humblet-Spira (GHS) Algorithm
GHS builds the MST bottom-up, in the style of Borůvka's algorithm:
the MST is recursively constructed from fragments joined by an edge of
least cost.
[Figure: two fragments joined by the least-cost edge between them.]
Challenges
[Figure: the example graph split into fragments T1 and T2.]
Challenge 1. How will the nodes in a given fragment identify the edge
to be used to connect with a different fragment?
A root node in each fragment is the coordinator.
Challenges
[Figure: the same fragments T1 and T2; node 0 in T1 has an edge e of
weight 8 into T2 and an edge of weight 4 within T1.]
Challenge 2. How will a node in T1 determine if a given edge
connects to a node of a different tree T2 or of the same tree T1? Why
will node 0 choose the edge e with weight 8, and not the edge with
weight 4?
Nodes in a fragment acquire the same name before augmentation.
Two main steps
Each fragment has a level. Initially each node is a fragment at level 0.
(MERGE) Two fragments at the same level L combine to form a fragment of
level L+1
(ABSORB) A fragment at level L is absorbed by another fragment at level L’ (L
< L’)
Least weight outgoing edge
To test if an edge is outgoing, each node
sends a test message through a candidate edge.
The receiving node may send accept or reject.
The root broadcasts initiate in its own
fragment, collects the reports from the other nodes
about eligible edges using a convergecast, and
determines the least weight outgoing edge.
[Figure: node 0 in T1 sends test over candidate edges; an edge into T2
is answered accept, an edge within T1 is answered reject.]
Accept or reject?
Let i send test to j.
Case 1. If name(i) = name(j), then send reject.
Case 2. If name(i) ≠ name(j) ∧ level(i) ≤ level(j), then send accept.
Case 3. If name(i) ≠ name(j) ∧ level(i) > level(j),
then wait until level(j) ≥ level(i).
Levels can only increase.
Question: Can fragments wait forever and lead to a deadlock?
Delayed response
[Figure: fragment A (level 5) sends test to fragment B (level 3), which
has just received join and initiate messages.]
B is about to change its level to 5, so B does not
send an accept response to A in response to the test.
The major steps
Repeat:
• Test edges as outgoing or not
• Determine the lwoe - it becomes a tree edge
• Send join (or respond to a join)
• Update level & name & identify the new coordinator
until done.
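The fragment-merging invariant behind these steps can be shown with a centralized, Borůvka-style sketch: every fragment picks its MWOE and fragments joined by chosen edges combine. This is only the invariant, not the GHS message protocol (no levels, names, or test/join messages); `fragment_mst` is a hypothetical name, and distinct edge weights are assumed:

```python
def fragment_mst(n, edges):
    """Centralized sketch of fragment merging. edges: list of (w, u, v)
    with vertices 0..n-1 and distinct weights. Repeats until a single
    fragment (the MST) remains."""
    frag = list(range(n))              # fragment id of each vertex
    T = set()
    while len({frag[v] for v in range(n)}) > 1:
        mwoe = {}                      # fragment id -> lightest outgoing edge
        for w, u, v in edges:
            if frag[u] != frag[v]:     # outgoing: endpoints in different fragments
                for f in (frag[u], frag[v]):
                    if f not in mwoe or (w, u, v) < mwoe[f]:
                        mwoe[f] = (w, u, v)
        for w, u, v in mwoe.values():  # join fragments over chosen edges
            if frag[u] != frag[v]:
                T.add((u, v))
                old, new = frag[v], frag[u]
                frag = [new if f == old else f for f in frag]
    return T
```

By the MWOE lemma each added edge keeps every fragment an MST fragment, and the number of fragments at least halves per round, so O(log n) rounds suffice.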
Classification of edges
• Basic (initially all edges are basic)
• Branch (all tree edges)
• Rejected (not a tree edge)
Branch and rejected are stable attributes.
Wrapping it up
Merge (L = L’): Fragments T (level L) and T’ (level L’) send (join, L, T)
and (join, L’, T’) to each other over their common lwoe.
The edge through which the join message is sent changes its status to
branch, and becomes a tree edge.
Each root broadcasts an (initiate, L+1, name) message to the nodes in
its own fragment.
Wrapping it up
Absorb (L > L’): The fragment T’ at level L’ sends (join, L’, T’) over
its lwoe to the fragment T at level L. T’ then receives an initiate
message, which indicates that the fragment at level L’ has been absorbed
by the other fragment at level L. They collectively search for the lwoe.
The edge through which the join message was sent changes its status to
branch.
Example
[Figures: the sample weighted graph evolving under GHS. Level-0
fragments first merge pairwise over their least weight outgoing edges;
the resulting fragments then grow by further merge and absorb steps
until a single fragment, the MST, remains.]
Message complexity
At least two messages (test + reject) must pass through each
rejected edge. The upper bound is 2|E| messages.
At each of the log N levels, a node can receive at most (1) one
initiate message and (2) one accept message (3) one join
message (4) one test message not leading to a rejection, and
(5) one changeroot message.
So, the total number of messages has an upper bound of
2|E| + 5N log N.
MST Algorithms: Theory
Deterministic comparison-based algorithms:
• O(m log n)          [Jarník, Prim, Dijkstra, Kruskal, Borůvka]
• O(m log log n)      [Cheriton-Tarjan 1976, Yao 1975]
• O(m β(m, n))        [Fredman-Tarjan 1987]
• O(m log β(m, n))    [Gabow-Galil-Spencer-Tarjan 1986]
• O(m α(m, n))        [Chazelle 2000]
Holy grail: O(m).
Notable:
• O(m) randomized     [Karger-Klein-Tarjan 1995]
• O(m) verification   [Dixon-Rauch-Tarjan 1992]
Euclidean:
• 2-d: O(n log n)     (compute the MST of the edges in the Delaunay triangulation)
• k-d: O(k n2)        (dense Prim)
Distributed MST Algorithms
Gallager, Humblet, & Spira ’83:
   time: O(n log n)
   messages: O(|E| + n log n) (optimal)
Chin & Ting ’85: O(n log log n) time
Gafni ’85: O(n log* n)
Awerbuch ’87: O(n), existentially optimal
Garay, Kutten, & Peleg ’98: O(D + n^0.61), where D is the diameter
Kutten & Peleg ’98: O(D + √n · log* n)
Elkin ’04: Õ(μ + √n), where μ is called the MST radius
   – Cannot detect termination unless μ is given as input.
Peleg & Rabinovich (’99) showed a lower bound of Ω̃(√n) for the running time.
Distributed Graph Algs : Other areas of interest
Distributed Cycle/Knot Detection
Distributed Center Finding
Distributed Connected Dominating Set Construction in MANETs, WSNs
Distributed Clustering based on Graph Partitioning
References
• Douglas West, Introduction to Graph Theory, Prentice Hall, 2000 (basics)
• Gross and Yellen, Graph Theory and Its Applications, CRC Press, 1998 (basics)
• J. Welch, Distributed Algorithms Course Notes, TAMU (flooding and tree algorithms)
• G. Pandurangan, CS590A Course Notes, Fall 2007, Purdue University
• Roger Wattenhofer, Distributed Computing Principles Course Notes, ETH (coloring algorithms)
• Kleinberg and Tardos, Algorithm Design, Addison-Wesley, 2005 (MST part dependent)
• Sukumar Ghosh, 22C:166 Distributed Systems and Algorithms Course, University of Iowa (routing part heavily dependent)