Random Walks on Distributed Networks
Masafumi Yamashita (Kyushu Univ., Japan)

Table of Contents
- What is a random walk?
- Applications
  - Markov chain Monte Carlo
  - Searching a distributed network for information
  - Self-stabilizing mutual exclusion
- Random walks on distributed networks
- Random walks using local information
- Random walks on dynamic graphs
- Open problems

What is a random walk? (1)
Given a graph G = (V,E) and a transition probability matrix P = (P_ij), a random walk starting at u is a sequence x = x_0, x_1, ..., where
  Pr(x_0 = u) = 1,  Pr(x_k = j | x_{k-1} = i) = P_ij.
[Figure: three transition matrices on a small example graph — the simple random walk P_0; a symmetric matrix S (S^T = S, doubly stochastic), giving the uniform distribution; and a matrix tuned for quick traversal.]

What is a random walk? (2)
Motivation: random walks with different properties emerge from different transition probability matrices P.
Goal: design a good matrix P meeting the requirements.
- Simple --- simple random walk P_0
- Uniform distribution --- symmetric matrix S
- Quick traversal --- global info on G necessary
Goodness measures:
- Hitting time --- average # of steps to reach a given vertex
- Cover time --- average # of steps to visit all vertices
- Target stationary distribution --- probability of visiting each vertex
- Mixing time --- average # of steps to converge to the stationary distribution

Known about simple random walks (1)
Hitting time = Θ(n^3):
- For any G, H(G) = O(n^3).
- There exists G such that H(G) = Ω(n^3); in fact H(G) = (1-o(1))(4/27)n^3 (Brightwell+Winkler 1990).
Cover time = Θ(n^3):
- C(G) = (1+o(1))(4/27)n^3 (Feige 1995).
[Figure: the lollipop graph L_n — a clique K_{2n/3} attached to a path X_{n/3}.]

Known about simple random walks (2)
Stationary distribution π = (π_i): the vector satisfying πP = π.
- The probability that the token stays at i converges to π_i.
- π_i = d(i)/2m (m = |E|) --- not uniform.
Uniform distribution requested? P symmetric => π is uniform. (Why?) With π uniform, π_i P_ij = π_j P_ji since P_ij = P_ji (detailed balance).
Any distribution requested? Metropolis-Hastings algorithm (MCMC).

Table of Contents
- What is a random walk?
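The claim that the simple random walk converges to π_i = d(i)/2m is easy to see empirically. A minimal simulation sketch (the small example graph and the step count are my own choices, not from the slides):

```python
import random
from collections import Counter

# Simple random walk P0 on a small undirected graph: from i, move to a
# neighbor chosen uniformly at random. The empirical visit frequencies
# should approach the stationary distribution pi_i = d(i) / 2m.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}  # example graph (assumption)
m = sum(len(nbrs) for nbrs in adj.values()) // 2          # number of edges

random.seed(1)
visits = Counter()
x = 0
steps = 200_000
for _ in range(steps):
    x = random.choice(adj[x])  # uniform neighbor: P0_ij = 1/d(i)
    visits[x] += 1

for i in sorted(adj):
    print(i, round(visits[i] / steps, 3), "vs", round(len(adj[i]) / (2 * m), 3))
```

With degrees (2, 3, 3, 2) and m = 5, the frequencies should settle near (0.2, 0.3, 0.3, 0.2) — not uniform, as the slide notes.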
- Applications
  - Markov chain Monte Carlo
  - Searching a distributed network for information
  - Self-stabilizing mutual exclusion
- Random walks on distributed networks
- Random walks using local information
- Random walks on dynamic graphs
- Open problems

Appl 1: Markov Chain Monte Carlo
Given a set S and a probability distribution π, pick elements from S following π.
1. Design G = (S,E) and P such that πP = π.
2. Run a random walk x: x_0, x_1, ..., x_i, ... and pick x_i (i must be larger than the mixing time).
Challenge: design P with a small mixing time.

Appl 2: Searching a network for a file
Naive algorithm (for node j, search initiated by i):
1. Upon receiving message search(i,f):
2. if j has f, then send back message found(j,f) to i;
3. otherwise, select a neighbor k and forward search(i,f) to k.
How to determine neighbor k?
- Depth-first search (deterministic selection): fast --- H(G) = Θ(n+m) --- but may not work correctly on a dynamic network.
- Simple random walk (select k uniformly at random): slow --- H(G) = Θ(n^3).
Challenge: design P based on local info that performs well on a dynamic network.

Appl 3: Self-stabilizing mutual exclusion (1)
Token circulation in a unidirectional ring. Token ring algorithm A: upon receiving message token, enter the critical section, then forward the token to the next node. The token is the privilege.
How robust is A? => A cannot recognize the situations below.
[Figure: illegal ring configurations, e.g. with no token or with multiple tokens.]

Appl 3: Self-stabilizing mutual exclusion (2)
Self-stabilizing systems (Dijkstra 1974): tolerate a finite number of transient failures; equivalently, work correctly from any initial configuration.
Idea for a self-stabilizing algorithm A (for node u with state s', reading the state s of its predecessor; assumption: u can read s):
- Set of states Σ = {0, 1, ..., n-2}.
- If s' ≠ (s+1) mod (n-1), then s' := (s+1) mod (n-1).
- u has a token iff s' ≠ (s+1) mod (n-1).
[Figure: an example execution; the node states around the ring before and after a step.]

Appl 3: Self-stabilizing mutual exclusion (3)
Correctness of A?
1. Every configuration contains a token.
2. The # of tokens never increases.
3. The # of tokens may not decrease.
Theorem 1. For an anonymous unidirectional ring, there is a deterministic self-stabilizing mutual exclusion algorithm only if n is prime.
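The ring algorithm's state-update rule and the first two correctness claims above (every configuration contains a token; the number of tokens never increases) can be exercised in simulation. A minimal sketch, my own rendering with an assumed ring size n = 7 and a central daemon that fires one token-holding node per step:

```python
import random

# Each node u keeps a state s' in {0, ..., n-2}, reads its predecessor's
# state s, holds a token iff s' != (s+1) mod (n-1), and fires by setting
# s' := (s+1) mod (n-1).
n = 7  # ring size (assumption; prime, in line with Theorem 1)

def tokens(states):
    """Indices of nodes currently holding a token."""
    return [u for u in range(n)
            if states[u] != (states[u - 1] + 1) % (n - 1)]

random.seed(0)
states = [random.randrange(n - 1) for _ in range(n)]  # arbitrary initial conf
prev = len(tokens(states))
for _ in range(500):
    holders = tokens(states)
    assert holders, "every configuration contains a token"
    u = random.choice(holders)                  # central daemon: fire one node
    states[u] = (states[u - 1] + 1) % (n - 1)   # u's token moves on (or merges)
    assert len(tokens(states)) <= prev, "token count never increases"
    prev = len(tokens(states))
print("final number of tokens:", prev)
```

The first assertion holds because if no node held a token, the states would increase by 1 (mod n-1) all the way around the ring, forcing n ≡ 0 (mod n-1), which is false; the second holds because a firing node loses its token while at most its successor can gain one.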
That is, for general n the problem is unsolvable deterministically.

Appl 3: Self-stabilizing mutual exclusion (4)
Israeli-Jalfon algorithm B for a general graph (1990) --- idea:
- Each node u has a register s_u; u has a token iff s_u ≥ max{s_v : v a neighbor of u}.
- If u has a token, it selects a neighbor u' at random and transfers the token to u'.
- Tokens thus perform random walks in the network.
[Figure: example register values around u and u', illustrating a token transfer.]

Appl 3: Self-stabilizing mutual exclusion (5)
Correctness of B:
1. Every configuration contains a token.
2. The # of tokens never increases.
3. Since two simple random walks eventually meet, the # of tokens eventually decreases to 1.
4. Since a simple random walk eventually visits every node, all nodes enjoy the privilege.
Theorem 2. The problem is solvable probabilistically.
Performance (under static networks):
1. Waiting time: hitting or cover time = Θ(n^3).
2. Convergence time: hitting time = Θ(n^3).
3. Fairness: the stationary distribution is not uniform.
Challenge: design P with good hitting/cover times.

Table of Contents
- What is a random walk?
- Applications
  - Markov chain Monte Carlo
  - Searching a distributed network for information
  - Self-stabilizing mutual exclusion
- Random walks on distributed networks
- Random walks using local information
- Random walks on dynamic graphs
- Open problems

Random walks on distributed networks
Design issues in sequential and distributed applications:

  appl.        dynamic    hitting   cover   local   stationary   mixing
               networks   time      time    info    distrib      time
  MCMC         ☓          △         ☓       △       ◎            ◎
  distr comp   ◎          ◎         ◎       ◎       △            △

Random walks using local info on static graphs (1)
Impact of using the degree info of neighbors.
- Guarantee the uniform distribution? -> symmetric transition probability matrix S, S_ij = 1 / max{d(i), d(j)}.
- Guarantee an arbitrary distribution π? -> Metropolis walks (Metropolis et al. 1953, Hastings 1970):
  M_ij = 1/d(i) if d(i) ≥ d(j)(π_i/π_j), and M_ij = π_j/(π_i d(j)) otherwise.
Theorem 3. The stationary distribution of M is π.
(Why?) π_i M_ij = π_i/d(i) = π_j M_ji when d(i) ≥ d(j)(π_i/π_j); the other case is symmetrical. By the detailed balance condition.
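Theorem 3 and the detailed-balance argument are easy to check numerically. A minimal sketch (the graph, the target π, and the step count are my own example choices): it builds M from purely local quantities, verifies detailed balance, and confirms empirically that the walk's visit frequencies approach π.

```python
import random

# Metropolis walk for a requested stationary distribution pi, using only
# local info (own degree, proposed neighbor's degree, pi values):
#   M_ij = 1/d(i)                 if d(i) >= d(j) * pi_i / pi_j
#        = pi_j / (pi_i * d(j))   otherwise,
# for each neighbor j of i, with the leftover mass as a self-loop at i.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}  # example graph (assumption)
pi = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}                     # target distribution (assumption)

def M(i, j):
    if len(adj[i]) * pi[j] >= len(adj[j]) * pi[i]:
        return 1 / len(adj[i])
    return pi[j] / (pi[i] * len(adj[j]))

# Detailed balance pi_i M_ij = pi_j M_ji implies pi M = pi.
for i in adj:
    for j in adj[i]:
        assert abs(pi[i] * M(i, j) - pi[j] * M(j, i)) < 1e-12

# Simulate: propose a uniform neighbor j, accept with prob
# min(1, pi_j d(i) / (pi_i d(j))); otherwise take the self-loop.
random.seed(2)
visits = {i: 0 for i in adj}
x, steps = 0, 200_000
for _ in range(steps):
    j = random.choice(adj[x])
    if random.random() * pi[x] * len(adj[j]) < pi[j] * len(adj[x]):
        x = j
    visits[x] += 1
print({i: round(visits[i] / steps, 3) for i in adj})
```

Note that with π uniform the two branches collapse to M_ij = 1/max{d(i), d(j)}, i.e., the symmetric matrix S above.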
Random walks using local info (2)
Impact of using global info.
- The cover time of simple random walks on trees is O(n^2) (Aleliunas et al. 1979).
- Given G, extract a spanning tree T and have the token circulate along T (walk P_T).
[Figure: a graph G, a spanning tree T, and the transition probabilities of P_T.]
Theorem 4. For any G, H(G,P_T) = C(G,P_T) = O(n^2). (Epple; cf. Ikeda et al. 2009)
Theorem 5. For any P, H(X_n,P) = C(X_n,P) = Ω(n^2). (Ikeda et al. 2009)
Corollary 1. P_T is best possible. Recall that H(G,P_0) = C(G,P_0) = Θ(n^3) in the worst case for the simple random walk P_0.

Random walks using local info (3)
What is the essential local info?
Given G, let D_ij = d(j)^(-1/2) / Σ_{k∈N(i)} d(k)^(-1/2). (D is a Gibbs distribution.)
Theorem 6. For any G, H(G,D) = O(n^2) and C(G,D) = O(n^2 log n). (Ikeda et al. 2009)
(Why?) If j ∈ N(i), then H(i,j) ≤ max{d(i), d(j)} ≤ n. For all i,j there is a path X connecting i and j such that the sum of the degrees of the nodes on X is less than 3n; thus H(i,j) ≤ 6n^2. The cover time then follows by Matthews' theorem.
Corollary 2. With respect to the hitting time, D is best possible.
Open: close the gap on the cover time.

Random walks using local info (4)
Impact of the stationary distribution π: let f = max_i π_i / min_i π_i.
Theorem 7. For any f, there are G (resp. G') and π (resp. π') such that H(G,P) = Ω(f n^2) (resp. C(G',P) = Ω(f n^2 log n)) for any P. (Nonaka et al. 2010)
Performance of Metropolis walks M:
Theorem 8. For any G and π, H(G,M) = O(f n^2) and C(G,M) = O(f n^2 log n). (Nonaka et al. 2010)
Corollary 3. For any G and π, M is best possible.

Random walks using local info (5)
Are simple random walks really bad?
- Yes: H(G,P_T) = C(G,P_T) = O(n^2) for all G, while H(L_n,P_0) = C(L_n,P_0) = Θ(n^3) on the lollipop L_n.
- But they are not so bad on trees:
Theorem 9. For all trees T, H(T,P_0)/H(T,P*) = O(n^(1/2)) and C(T,P_0)/C(T,P*) = O((n log n)^(1/2)), where P* is the best P (global info available) for T. (Nonaka et al. 2011)

Simple random walks on dynamic graphs (1)
Let G_t = (V_t, E_t) be the graph at time t ≥ 0. In general both V_t and E_t may change; here, assume V_t = V for all t ≥ 0. Consider simple random walks on {G_t}, under two models:
1. Choosing before checking E_t (CBC): if the node chosen is not adjacent in G_t, the token takes the self-loop.
2. Choosing after checking E_t (CAC).
Let G = (V,E) be connected. At any time t, each edge in E is up (i.e., in E_t) with probability p.
Theorem 10. (CBC) When G is K_n, H({G_t},P_0) = n/p and C({G_t},P_0) = n H_n / p, where H_n is the n-th harmonic number.
(Why?) Identify it with a simple random walk on K_n that takes a self-loop with probability 1 - p.

Random walks on dynamic graphs (2)
Theorem 11. (CAC) When G is K_n, H({G_t},P_0) = n / (1 - q^(n-1)) and C({G_t},P_0) = n H_n / (1 - q^(n-1)), where q = 1 - p.
(Why?) Roughly the same as Theorem 10, but the success probability is 1 - (1-p)^(n-1).
Theorem 12. (CBC) For general G, H({G_t},P_0) = H(G,P_0)/p and C({G_t},P_0) = C(G,P_0)/p. If p is constant, the order of the performance of the simple random walk on {G_t} is exactly the same as on G.
Theorem 13. (CAC) For any general G, F(G,P_0) / (1 - q^Δ) ≤ F({G_t},P_0) ≤ F(G,P_0) / (1 - q^δ), where F ∈ {H, C}, q = 1 - p, Δ is the max degree, and δ is the min degree. (Koba et al. 2010)

Random walks on dynamic graphs (3)
Edge-Markovian graph: {E_t} is a Markov chain following a probability transition matrix Q. A Markovian graph is Bernoulli if all rows of Q are identical. Assume a self-loop at every node.
Theorem 14. (CBC) For any connected Bernoulli graph {G_t}, H({G_t},P_0) = O(n^3) and C({G_t},P_0) = O(n^3 log n). (Avin et al. 2008) Simple random walks are still OK.
Theorem 15. (CBC) There is a Markovian (but not Bernoulli) graph {G_t} such that H({G_t},P_0) = Ω(2^n). (Avin et al. 2008) Simple random walks may not perform well.
Theorem 16. (CBC) For any connected dynamic graph {G_t} of max degree Δ, C({G_t},P_0) = O(Δ^2 n^3 ln^2 n), where P_0 treats G as K_{Δ+1} (i.e., the max lazy chain). (Avin et al. 2008) Observe why this does not contradict Theorem 15.

Table of Contents
- What is a random walk?
- Applications
  - Markov chain Monte Carlo
  - Searching a distributed network for information
  - Self-stabilizing mutual exclusion
- Random walks on distributed networks
- Random walks using local information
- Random walks on dynamic graphs
- Open problems

Open problems
We are interested in and working on:
1. Understanding the impact of local info --- closing the gap between Ω(n^2) and O(n^2 log n) on the cover time.
2. Analyzing the performance of random walks using local info on dynamic graphs.
3. Analyzing the cover ratio of random walks on dynamic graphs with a variable vertex set.
4. Analyzing the performance of multiple random walks on dynamic graphs.
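Several of the dynamic-graph bounds above can be probed empirically. As a closing illustration, a minimal sketch of the CBC walk of Theorem 10 on K_n with i.i.d. edge availability (n, p, and the trial count are my own choices): a direct geometric-distribution argument gives the exact hitting time (n-1)/p here, matching the theorem's n/p up to lower-order terms.

```python
import random

# CBC on the complete graph K_n: at each step the token picks a uniformly
# random other node and moves only if that edge is currently up (each edge
# is in E_t independently with probability p); otherwise it self-loops.
n, p = 10, 0.5  # example parameters (assumption)

random.seed(3)

def hit(target=n - 1):
    """Steps for the token to walk from node 0 to `target`."""
    x, steps = 0, 0
    while x != target:
        j = random.choice([k for k in range(n) if k != x])  # choose before checking
        if random.random() < p:                             # is edge (x, j) up in E_t?
            x = j
        steps += 1
    return steps

trials = 20_000
avg = sum(hit() for _ in range(trials)) / trials
print(avg, "expected about", (n - 1) / p)
```

Each step hits the target with probability p/(n-1), so the hitting time is geometric with mean (n-1)/p; this is exactly the "simple random walk on K_n with self-loop probability 1 - p" identification used in the proof sketch of Theorem 10.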