Random Graphs
Liang Li
April 9, 2014
Liang Li | The University of Tennessee, Department of EECS

Outline
Objectives
  Internet Topology
  Melting points [2]
History
Definition
  With high probability (whp)
  Near clique
  Scale free
Models
  Erdős-Rényi Model
  Edgar Gilbert Model
Results with classical random graphs
  Giant component
Probability methods
  Ramsey Number Bound
  Hamiltonian paths
Watts and Strogatz Small world Model
  "Kevin Bacon game" and "Erdős number"
  Small World Model
  generating
  clustering coefficient
Barabási and Albert Preferential attachment Model
  generating
  properties
Applications
open problems
  Some open problems
Homework problems
  The average Clustering coefficient
  Prove or Disprove

Objectives
Internet Topology
When you send or receive data over the internet, your computer doesn't really care how the data travels. The media (wire, optic fibre, ox cart) and route (via Hong Kong or Champaign-Urbana) are irrelevant so long as we don't mind waiting. Of course, we do mind, so in general routers try to route packets over the fastest link and shortest distance.
A program called traceroute finds out where data is flowing by sending out suicidal packets of information that self-destruct after they have seen a set number of computers. Of course some computers don't care if the packet dies, some respond with nonsense, and some respond too quickly or too slowly.

Figure 1: The internet topology in 2001, taken from https://www.fractalus.com/steve/stuff/ipmap/
https://www.fractalus.com/steve/stuff/ipmap/layout2.gif
https://www.fractalus.com/steve/stuff/ipmap/net-anim.gif

Melting points [2]
Think of a solid as a three-dimensional grid of molecules, with neighboring molecules joined by bonds.
1. Adding energy excites molecules and breaks bonds.
2. Bonds break at random as the temperature (energy level) rises.
3. Broken bonds let the molecules take other forms, such as a liquid or a gas.

Figure 2: Melting points

History
Timeline:
1943: Szele, Hamilton path
1947: Erdős, Ramsey number
1959-1961: Erdős, random graphs
1998: Watts, small world model
1999: Barabási and Albert, BA model
2002, 2003, 2006, 2014: survey articles (Albert, Watts, Newman, Remco)

The theory of random graphs was founded by Erdős and Rényi (1959, 1960, 1961a,b) after Erdős (1947, 1959, 1961) had discovered that probabilistic methods [6, 7] were often useful in tackling extremal problems in graph theory [3].
The small world model [4] of Watts and Strogatz (1998) and the preferential attachment model [5] of Barabási and Albert (1999) [1] have led to an explosion of research [8].

Definition
With high probability (whp)
We say that a random graph on n vertices has a certain property Q with high probability if $\lim_{n \to \infty} \Pr(\text{graph has } Q) = 1$.
Near clique
An undirected graph is a near clique if adding one additional edge would make it a clique.
Scale free
The degree distribution is almost independent of the size of the graph, and the proportion of vertices with degree k is close to proportional to $k^{-\tau}$, that is, $P(k) \sim k^{-\tau}$, with typically $2 < \tau < 3$ for real networks [11]. Equivalently, the number of vertices of degree k satisfies $N_k \sim c_n k^{-\tau}$ [12].
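To make the scale-free definition concrete, the following is a minimal sketch that counts the vertices of each degree and fits a straight line to log N_k against log k. The function names (`degree_counts`, `loglog_slope`, `estimate_tau`) are illustrative and not from the slides. A least-squares fit on a log-log plot is only a crude estimator of τ, used here as an assumption for illustration.

```python
import math


def degree_counts(degrees):
    """Return a dict mapping each degree k to N_k, the number of vertices with that degree."""
    counts = {}
    for k in degrees:
        counts[k] = counts.get(k, 0) + 1
    return counts


def loglog_slope(counts):
    """Least-squares slope of log N_k against log k.

    Only degrees k >= 1 with N_k >= 1 are used; at least two distinct
    degrees are needed, otherwise the slope is undefined.
    """
    points = [(math.log(k), math.log(c))
              for k, c in counts.items() if k >= 1 and c >= 1]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def estimate_tau(degrees):
    """Crude estimate of the exponent tau in N_k ~ c_n * k^(-tau)."""
    return -loglog_slope(degree_counts(degrees))


if __name__ == "__main__":
    # Synthetic degree sequence with N_1 = 64, N_2 = 8, N_4 = 1, i.e. N_k = 64 * k^(-3).
    degrees = [1] * 64 + [2] * 8 + [4] * 1
    print(estimate_tau(degrees))
```

On real data the low-degree and high-degree ends of the distribution usually deviate from a pure power law, so an estimate like this should be read as a rough indication only.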
Models
Erdős-Rényi Model
$G(n, M)$ consists of all graphs with vertex set $V = \{1, 2, \ldots, n\}$ having M edges, in which all graphs have the same probability.
Thus, with the notation $N = \binom{n}{2}$ and $0 \le M \le N$, $G(n, M)$ has $\binom{N}{M}$ elements and every element occurs with probability $\binom{N}{M}^{-1}$.
The random variable $G_M$ denotes a graph generated in this way.

Edgar Gilbert Model
$G(n, P(\text{edge}) = p)$ consists of all graphs with vertex set $V = \{1, 2, \ldots, n\}$ in which the edges are chosen independently and with probability p.
In other words, if $G_0$ is a graph with vertex set V and it has m edges, then
$P(\{G_0\}) = P(G = G_0) = p^m (1 - p)^{N - m}$.
The random variable $G_p$ denotes a graph generated in this way.
For $M \simeq pN$, these two models are almost interchangeable [8].

Results with classical random graphs
Giant component
Erdős and Rényi discovered that there was a sharp threshold for the appearance of many properties [1].
Let $c > 0$ be a constant and set $p = c/n$.
• If $c < 1$, most of the connected components of the graph are small, with the largest having only $O(\log n)$ vertices, where the O symbol means that there is a constant $C < \infty$ such that the probability that the largest component has at most $C \log n$ vertices tends to 1 as $n \to \infty$.
• If $c > 1$, there is a constant $\theta(c) > 0$ such that the largest component has $\sim \theta(c) n$ vertices and the second largest component is $O(\log n)$. Here $X_n \sim b_n$ means that $X_n / b_n$ converges to 1 in probability as $n \to \infty$.

Probability methods
Ramsey Number Bound
The Ramsey number [13] $R(m, n)$ gives the solution to the party problem, which asks for the minimum number of guests $R(m, n)$ that must be invited so that at least m will know each other or at least n will not know each other.
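For the smallest interesting case, the party problem can be checked by exhaustive search. The sketch below tries every acquaintance pattern on a given number of guests and reports whether each one forces m mutual acquaintances or m mutual strangers. The names `has_forced_group` and `party_number` are illustrative, and the search is only feasible for very small parties because the number of patterns doubles with every pair of guests.

```python
from itertools import combinations, product


def has_forced_group(guests, m):
    """True if every acquaintance pattern on `guests` people contains
    m mutual acquaintances or m mutual strangers (assumes m >= 2)."""
    pairs = list(combinations(range(guests), 2))
    groups = list(combinations(range(guests), m))
    for pattern in product((False, True), repeat=len(pairs)):
        knows = dict(zip(pairs, pattern))
        forced = False
        for group in groups:
            # A group is uniform if all its pairs know each other or none do.
            values = {knows[pair] for pair in combinations(group, 2)}
            if len(values) == 1:
                forced = True
                break
        if not forced:
            return False  # found a party with no uniform group of size m
    return True


def party_number(m, max_guests=6):
    """Smallest number of guests (up to max_guests) that forces a uniform
    group of size m, or None if no such number is found in that range."""
    for guests in range(m, max_guests + 1):
        if has_forced_group(guests, m):
            return guests
    return None


if __name__ == "__main__":
    print(party_number(3))
```

With m = 3 the search examines at most 2^15 patterns on six guests, which runs quickly; larger values of m are far out of reach for this approach.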
In the language of graph theory, the Ramsey number is the minimum number of vertices $v = R(m, n)$ such that every undirected simple graph of order v contains a clique of order m or an independent set of order n.

The bound below uses the observation that $P\left(\bigcup_i A_i\right) \le \sum_i P(A_i)$.
Theorem (Erdős (1947))
If $\binom{n}{m} 2^{1 - \binom{m}{2}} < 1$, then $R(m, m) > n$.

Proof. [2]
Define a probability model on graphs with vertex set $\{1, \ldots, n\}$ by letting each edge appear independently with probability 0.5. If the probability of the event Q = "no m-clique or independent m-set" is positive, then the desired graph exists.
Each possible m-clique occurs with probability $2^{-\binom{m}{2}}$, since obtaining the complete graph requires obtaining all its edges, and they occur independently. Hence the probability of having at least one m-clique is bounded by $\binom{n}{m} 2^{-\binom{m}{2}}$. The same bound holds for independent m-sets. Hence the probability of "not Q" is bounded by $2 \binom{n}{m} 2^{-\binom{m}{2}} = \binom{n}{m} 2^{1 - \binom{m}{2}}$, and the given inequality guarantees that $P(Q) > 0$.

Hamiltonian paths
A random variable is a function assigning a real number to each element of a probability space. We use $X = k$ to denote the event consisting of all elements where the variable X has the value k.
The expectation $E(X)$ of a random variable X is the weighted average $\sum_k k \, P(X = k)$.
The pigeonhole property of the expectation is the statement that there exists an element of the probability space for which the value of X is as large as (or as small as) $E(X)$.

Theorem (Szele (1943))
Some n-vertex tournament has at least $n!/2^{n-1}$ Hamiltonian paths.
Proof. [2]
Generate a tournament on $\{1, \ldots, n\}$ randomly by choosing $i \to j$ or $j \to i$ with equal probability for each pair $\{i, j\}$.
Let X be the number of Hamiltonian paths; X is the sum of $n!$ indicator variables for the possible Hamiltonian paths. Each Hamiltonian path occurs with probability $1/2^{n-1}$, so $E(X) = n!/2^{n-1}$. In some tournament, X is at least as large as the expectation.
This simple bound using expectation gives almost the right answer for the maximum number of Hamiltonian paths in an n-vertex tournament; Alon [14] proved that it is at most $n!/(2 - o(1))^n$.

Watts and Strogatz Small world Model
"Kevin Bacon game" and "Erdős number"

Bacon number    Number of actors
0               1
1               1,673
2               130,851
3               349,031
4               84,615
5               6,718
6               788
7               107
8               11
Table 1: Bacon number

The average Kevin Bacon number is 2.94; the average Erdős number is 4.7, in a graph with 337,454 authors and 496,489 edges.
Facebook released two papers in November 2011 reporting that, among 721 million users with 69 billion friendship links, the average distance is 4.74.

Small World Model
The $G_p$ graphs have small diameters, but have very few triangles (while in social networks, if A and B are friends and A and C are friends, it is fairly likely that B and C are also friends).
To construct a network with small diameter and a positive density of triangles ($K_3$), Watts and Strogatz started from a ring lattice with n vertices and k edges per vertex and rewired each edge at random with probability p, so that the construction interpolates between regularity ($p = 0$) and disorder ($p = 1$).

generating
Figure 3: Generating small world graphs [15]
• Disallow self-edges.
• Disallow multiple edges.

clustering coefficient
Let $L(p)$ denote the average distance between two randomly chosen vertices, and define the clustering coefficient $C(p)$ to be the fraction of connections that exist between the $\binom{k}{2}$ pairs of neighbors of a site.
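As a rough illustration of the clustering coefficient on the regular starting point (p = 0), the sketch below builds a ring lattice and measures, for each vertex, the fraction of pairs of its neighbors that are themselves joined by an edge. The function names are illustrative, the rewiring step is not implemented, and assigning 0 to vertices with fewer than two neighbors is a convention assumed here rather than taken from the slides.

```python
from itertools import combinations


def ring_lattice(n, k):
    """Adjacency sets of a ring of n vertices, each joined to its k nearest
    neighbors (k/2 on each side). Assumes k is even and k < n."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def local_clustering(adj, v):
    """Fraction of pairs of neighbors of v that are themselves adjacent."""
    neighbors = adj[v]
    possible = len(neighbors) * (len(neighbors) - 1) // 2
    if possible == 0:
        return 0.0  # assumed convention for vertices of degree 0 or 1
    actual = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return actual / possible


def average_clustering(adj):
    """Average of the local clustering coefficients over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


if __name__ == "__main__":
    for k in (4, 6, 20):
        print(k, average_clustering(ring_lattice(200, k)))
```

Running it for increasing k shows the lattice value of the clustering coefficient growing with k, which is the regime the slides describe for large k.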
Local clustering coefficient of node v:
$C(v) = \frac{|\text{actual edges between neighbors of } v|}{|\text{possible edges between neighbors of } v|}$
The clustering coefficient for the whole graph is the average of the local values.
Figure 4: $C(v) = 4/6$

A graph is considered small world if:
• its average clustering coefficient is significantly higher than that of a random graph constructed on the same vertex set, and
• it has approximately the same mean shortest path length as its corresponding random graph.
The regular graph has $L(0) \sim n/2k$ and $C(0) \approx 3/4$ if k is large, while the disordered one has $L(1) \sim (\log n)/(\log k)$ and $C(1) \sim k/n$. Here $L(p)$ decreases quickly near 0, while $C(p)$ changes slowly, so there is a broad interval of p over which $L(p)$ is almost as small as $L(1)$, yet $C(p)$ is far from 0 [1].
• Small-world networks tend to contain cliques and near-cliques.
• Most pairs of nodes will be connected by at least one short path.

Barabási and Albert Preferential attachment Model
The BA model is an algorithm for generating random scale-free networks using a preferential attachment mechanism. It incorporates two important general concepts:
Growth means that the number of nodes in the network increases over time.
Preferential attachment means that the more connected a node is, the more likely it is to receive new links.

generating
The network begins with an initial connected network of $m_0$ nodes.
New nodes are added to the network one at a time. Each new node is connected to existing nodes with a probability that is proportional to the number of links that the existing nodes already have:
$p_i = \frac{k_i}{\sum_j k_j}$
where $k_i$ is the degree of node i and the sum is taken over all pre-existing nodes j.

properties
• The BA model is scale free.
Its degree distribution follows a power law of the form $p(k) \sim k^{-3}$.
• The average path length increases approximately logarithmically with the size of the network: $\ell \sim \frac{\ln N}{\ln \ln N}$.
• The clustering coefficient scales with network size as $C \sim N^{-0.75}$.
For example, on the web, new pages are more likely to link to very well known sites such as Google or Wikipedia than to pages that hardly anyone knows. If someone selects a new page to link to by randomly choosing an existing link, the probability of selecting a particular page would be proportional to its degree.

Applications
• World Wide Web
• Internet
• Movie actor collaboration network
• Cellular networks
• Ecological networks
• Phone call network
• Citation network
• Networks in linguistics
• Power and neural networks
• ...

open problems
Random structures: as models of real-world networks, such as the Internet, social networks or biological networks, they leave a lot to be desired.
Figure 5: The internet topology in 2001, taken from https://www.fractalus.com/steve/stuff/ipmap/

Some open problems
Is it true that whp $G_m$ has $\delta(G_m)/2$ Hamilton cycles? [19] It is known to be true as long as $\delta(G_m)/2 = o(\text{average degree})$.
What is the expected time for a random walk to get within distance d of every vertex?
More problems [20]: Ramsey theory, graph coloring problems, ...

Homework problems
The average Clustering coefficient
Figure 6:

Prove or Disprove
When p is constant, almost every $G_p$ has diameter 2 (and $G_p$ is connected).
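Before attempting a proof, the statement can be explored numerically. The sketch below samples graphs from the Gilbert model with a fixed p and measures the diameter by breadth-first search from every vertex. The function names and the parameters n, p, trials and seed are illustrative choices. An experiment of this kind only suggests what to expect for moderate n and does not settle the problem either way.

```python
import random
from collections import deque


def gilbert_graph(n, p, rng):
    """Sample adjacency sets of G_p on n vertices: each pair is an edge
    independently with probability p."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def diameter(adj):
    """Largest shortest-path distance over all pairs of vertices,
    or None if the graph is disconnected."""
    n = len(adj)
    worst = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < n:
            return None  # some vertex is unreachable from source
        worst = max(worst, max(dist.values()))
    return worst


def fraction_with_diameter_two(n, p, trials, seed=0):
    """Fraction of sampled graphs whose diameter is exactly 2."""
    rng = random.Random(seed)
    hits = sum(diameter(gilbert_graph(n, p, rng)) == 2 for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (10, 20, 50, 100):
        print(n, fraction_with_diameter_two(n, 0.5, trials=20))
```

Comparing the printed fractions for growing n, and for different constant values of p, gives a feel for how quickly the property sets in.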
References
[1] R. Durrett. Random Graph Dynamics (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge University Press, New York, NY, 2006.
[2] D. B. West. Introduction to Graph Theory (2nd Edition). Prentice Hall, 2001.
[3] http://en.wikipedia.org/wiki/Random_graph
[4] D. J. Watts, S. H. Strogatz (1998). "Collective dynamics of 'small-world' networks". Nature 393 (6684): 440-442.
[5] R. Albert, A.-L. Barabási (2002). "Statistical mechanics of complex networks". Reviews of Modern Physics 74: 47-97.
[6] P. Erdős, A. Rényi (1959). "On Random Graphs. I". Publicationes Mathematicae 6: 290-297.
[7] P. Erdős, A. Rényi (1960). "The Evolution of Random Graphs". Magyar Tud. Akad. Mat. Kutató Int. Közl. 5: 17-61.
[8] B. Bollobás, O. M. Riordan (2003). "Mathematical results on scale-free random graphs". In "Handbook of Graphs and Networks" (S. Bornholdt and H. G. Schuster (eds)), Wiley VCH, Weinheim, 1st ed.
[9] http://en.wikipedia.org/wiki/Small-world_network#Properties_of_small-world_networks
[10] http://en.wikipedia.org/wiki/Barab%C3%A1si%E2%80%93Albert_model
[11] http://en.wikipedia.org/wiki/Scale-free_network
[12] R. van der Hofstad (2014). "Random Graphs and Complex Networks". Department of Mathematics and Computer Science, Eindhoven University of Technology.
[13] http://mathworld.wolfram.com/RamseyNumber.html
[14] N. Alon (1990). "The maximum number of Hamiltonian paths in tournaments". Combinatorica 10 (4): 319-324.
[15] http://cs.brynmawr.edu/Courses/cs380/spring2013/section02/slides/06_SmallWorldNetworks.pdf
[16] http://en.wikipedia.org/wiki/Barab%C3%A1si%E2%80%93Albert_model#Clustering_coefficient
[17] R. Albert, A.-L. Barabási (2002). "Statistical mechanics of complex networks". Reviews of Modern Physics 74 (1): 47-97.
[18] D. J. Watts, S. H. Strogatz (June 1998). "Collective dynamics of 'small-world' networks". Nature 393 (6684): 440-442.
[19] http://www.math.cmu.edu/~af1p/Talks/RandomGraphs/rgtalk.pdf
[20] F. R. K. Chung (1997). "Open problems of Paul Erdős in graph theory". Journal of Graph Theory 25 (1): 3-36.

Questions?