Random-Graph Theory

The Erdős-Rényi model

In mathematical terms a network is represented by a graph. A graph is a pair of sets $G=\{P,E\}$, where $P$ is a set of $N$ nodes (or vertices or points) $P_1, P_2, \ldots, P_N$ and $E$ is a set of edges (or links or lines), each connecting two elements of $P$.

A particularly rich source of ideas has been the study of random graphs, graphs in which the edges are distributed randomly. Networks with a complex topology and unknown organizing principles often appear random; thus random-graph theory is regularly used in the study of complex networks.

Classic first article: In their classic first article on random graphs, Erdős and Rényi define a random graph as $N$ labeled nodes connected by $n$ edges, which are chosen randomly from the $N(N-1)/2$ possible edges. In total there are $C_{N(N-1)/2}^{\,n}$ graphs with $N$ nodes and $n$ edges, forming a probability space in which every realization is equiprobable.

The binomial model: Here we start with $N$ nodes, every pair of nodes being connected with probability $p$. Consequently the total number of edges is a random variable with the expectation value $E(n)=pN(N-1)/2$. If $G_0$ is a graph with nodes $P_1, P_2, \ldots, P_N$ and $n$ edges, the probability of obtaining it by this graph construction process is $P(G_0)=p^{n}(1-p)^{N(N-1)/2-n}$.

Figure: We can notice the emergence of trees (a tree of order 3, drawn with long-dashed lines) and cycles (a cycle of order 3, drawn with short-dashed lines) in the graph, and a connected cluster that unites half of the nodes at $p=0.15=1.5/N$ (that is, $N=10$).

Random-graph theory studies the properties of the probability space associated with graphs with $N$ nodes as $N\to\infty$. In this respect Erdős and Rényi used the definition that almost every graph has a property $Q$ if the probability of having $Q$ approaches 1 as $N\to\infty$.
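The binomial model described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names `sample_gnp`, `expected_edges`, and `graph_probability` are hypothetical, and the values $N=10$, $p=0.15$ are taken from the figure caption. It draws graphs pair by pair, compares the average number of edges with $E(n)=pN(N-1)/2$, and checks that $P(G_0)$ summed over all graphs is 1.

```python
import math
import random


def sample_gnp(num_nodes, p, rng):
    """Return the edge list of one graph drawn from the binomial model."""
    edges = []
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            # Each of the N(N-1)/2 pairs is linked independently with probability p.
            if rng.random() < p:
                edges.append((i, j))
    return edges


def expected_edges(num_nodes, p):
    """Expectation value E(n) = p N (N-1) / 2."""
    return p * num_nodes * (num_nodes - 1) / 2


def graph_probability(num_nodes, p, num_edges):
    """P(G0) = p^n (1-p)^(N(N-1)/2 - n) for one specific graph with n edges."""
    max_edges = num_nodes * (num_nodes - 1) // 2
    return p ** num_edges * (1 - p) ** (max_edges - num_edges)


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary fixed seed (an assumption, for reproducibility)
    N, p = 10, 0.15
    counts = [len(sample_gnp(N, p, rng)) for _ in range(2000)]
    print("mean number of edges:", sum(counts) / len(counts))
    print("expected number of edges:", expected_edges(N, p))

    # There are C(M, n) graphs with n edges, each with probability P(G0),
    # so summing over n must give total probability 1.
    M = N * (N - 1) // 2
    total = sum(math.comb(M, n) * graph_probability(N, p, n) for n in range(M + 1))
    print("total probability:", total)
```

The double loop visits all $N(N-1)/2$ pairs, which is fine for small $N$; for large sparse graphs a faster sampling scheme would be preferable.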
For many properties $Q$ there is a critical probability $p_c(N)$ such that

\[
\lim_{N\to\infty} P_{N,p}(Q) =
\begin{cases}
0, & \text{if } p(N)/p_c(N) \to 0 \\
1, & \text{if } p(N)/p_c(N) \to \infty
\end{cases}
\]

Subgraphs

The first property of random graphs to be studied by Erdős and Rényi was the appearance of subgraphs. The simplest examples of subgraphs are cycles, trees, and complete subgraphs. Most generally we can ask whether there is a critical probability that marks the appearance of arbitrary subgraphs consisting of $k$ nodes and $l$ edges. The expected number of such subgraphs, $X$, is

\[
E(X) = C_N^{k}\,\frac{k!}{a}\,p^{l} \simeq \frac{N^{k}p^{l}}{a},
\qquad
\lim_{N\to\infty} P_p(X=r) = e^{-\lambda}\,\frac{\lambda^{r}}{r!}
\qquad (*)
\]

where $a$ is the number of graphs that are isomorphic to one another and $\lambda = c^{l}/a$ is the mean number of subgraphs when $p(N)=cN^{-k/l}$. Thus, indeed, the critical probability at which almost every graph contains a subgraph with $k$ nodes and $l$ edges is $p_c(N)=cN^{-k/l}$. A few important special cases directly follow from Eq. (*):
(a) The critical probability of having a tree of order $k$ is $p_c(N)=cN^{-k/(k-1)}$;
(b) The critical probability of having a cycle of order $k$ is $p_c(N)=cN^{-1}$;
(c) The critical probability of having a complete subgraph of order $k$ is $p_c(N)=cN^{-2/(k-1)}$.

Figure: The threshold probabilities at which different subgraphs appear in a random graph (here $p\sim N^{z}$). For $pN^{3/2}\to 0$ the graph consists of isolated nodes and edges. For $p\sim N^{-3/2}$ trees of order 3 appear, while for $p\sim N^{-4/3}$ trees of order 4 appear. At $p\sim N^{-1}$ trees of all orders are present, and at the same time cycles of all orders appear. The probability $p\sim N^{-2/3}$ marks the appearance of complete subgraphs of order 4 and $p\sim N^{-1/2}$ corresponds to complete subgraphs of order 5. As $z$ approaches 0, the graph contains complete subgraphs of increasing order.

Graph evolution

It is instructive to look at the results discussed above from a different point of view. Consider a random graph with $N$ nodes and assume that the connection probability $p(N)$ scales as $N^{z}$, where $z$ is a tunable parameter that can take any value between $-\infty$ and 0.

In terms of the average degree $\langle k\rangle = pN$: if $0<\langle k\rangle<1$, almost surely all clusters are either trees or clusters containing exactly one cycle. Although cycles are present, almost all nodes belong to trees.
The mean number of clusters is of order $N-n$, where $n$ is the number of edges, i.e., in this range when a new edge is added the number of clusters decreases by 1. The largest cluster is a tree, and its size is proportional to $\ln N$.

When $\langle k\rangle$ passes the threshold $\langle k\rangle_c=1$, the structure of the graph changes abruptly. While for $\langle k\rangle<1$ the greatest cluster is a tree, at $\langle k\rangle=\langle k\rangle_c=1$ it has approximately $N^{2/3}$ nodes and has a rather complex structure. Moreover, for $\langle k\rangle>1$ the greatest (giant) cluster has $[1-f(\langle k\rangle)]N$ nodes, where $f(x)$ is a function that decreases exponentially from $f(1)=1$ to 0 for $x\to\infty$. Thus a finite fraction $S=1-f(\langle k\rangle)$ of the nodes belongs to the largest cluster.

Degree distribution

In a random graph with connection probability $p$ the degree $k_i$ of a node $i$ follows a binomial distribution with parameters $N-1$ and $p$:

\[
P(k_i=k) = C_{N-1}^{k}\,p^{k}(1-p)^{N-1-k}.
\]

To find the degree distribution of the graph, we need to study the number of nodes with degree $k$, $X_k$. Our main goal is to determine the probability that $X_k$ takes on a given value, $P(X_k=r)$. The expectation value of the number of nodes with degree $k$ is

\[
E(X_k) = N\,P(k_i=k) = \lambda_k, \qquad \lambda_k = N\,C_{N-1}^{k}\,p^{k}(1-p)^{N-1-k}.
\]

The distribution of the $X_k$ values, $P(X_k=r)$, approaches a Poisson distribution,

\[
P(X_k=r) = e^{-\lambda_k}\,\frac{\lambda_k^{r}}{r!}.
\]

Figure: The degree distribution that results from the numerical simulation of a random graph. We generated a single random graph with $N=10\,000$ nodes and connection probability $p=0.0015$, and calculated the number of nodes with degree $k$, $X_k$. The plot compares $X_k/N$ with the expectation value of the Poisson distribution, $E(X_k)/N=P(k_i=k)$, and we can see that the deviation is small.

Connectedness and diameter

\[
d \propto \frac{\ln N}{\ln\langle k\rangle}
\]

• If $\langle k\rangle=pN<1$, a typical graph is composed of isolated trees and its diameter equals the diameter of a tree.
• If $\langle k\rangle>1$, a giant cluster appears. The diameter of the graph equals the diameter of the giant cluster if $\langle k\rangle>3.5$, and is proportional to $\ln N/\ln\langle k\rangle$.
• If $\langle k\rangle>\ln N$, almost every graph is totally connected. The diameters of the graphs having the same $N$ and $\langle k\rangle$ are concentrated on a few values around $\ln N/\ln\langle k\rangle$.

\[
\ell_{rand} \sim \frac{\ln N}{\ln\langle k\rangle}
\]

Clustering coefficient

\[
C_{rand} = p = \frac{\langle k\rangle}{N}
\]

PERCOLATION THEORY

Figure: Illustration of bond percolation in 2D. The nodes are placed on a $25\times 25$ square lattice, and two nodes are connected by an edge with probability $p$. For $p=0.315$ (left), which is below the percolation threshold $p_c=0.5$, the connected nodes form isolated clusters. For $p=0.525$ (right), which is above the percolation threshold, the largest cluster percolates.

(1) The percolation probability $P$, denoting the probability that a given node belongs to the infinite cluster:

\[
P = P_p(|C|=\infty) = 1 - \sum_{s<\infty} P_p(|C|=s),
\]

where $P_p(|C|=s)$ denotes the probability that the cluster at the origin has size $s$. Obviously

\[
P
\begin{cases}
= 0, & \text{if } p<p_c \\
> 0, & \text{if } p>p_c
\end{cases}
\]

(2) The average cluster size $\langle s\rangle$, defined as

\[
\langle s\rangle = E_p(|C|) = \sum_{s=1}^{\infty} s\,P_p(|C|=s),
\]

giving the expectation value of cluster sizes. Because $\langle s\rangle$ is infinite when $P>0$, in this case it is useful to work with the average size of the finite clusters by taking away from the system the infinite ($|C|=\infty$) cluster:

\[
\langle s\rangle^{f} = E_p(|C|,\,|C|<\infty) = \sum_{s<\infty} s\,P_p(|C|=s).
\]

(3) The cluster size distribution $n_s$, defined as the probability of a node's having a fixed position in a cluster of size $s$ (for example, being its left-hand end, if this position is uniquely defined):

\[
n_s = \frac{1}{s}\,P_p(|C|=s).
\]

Example of a Cayley tree with coordination number $z=3$.
• Percolation threshold: $p_c=1/(z-1)$
• Percolation probability:
\[
P =
\begin{cases}
0, & \text{if } p<p_c=1/2 \\
(2p-1)/p^{2}, & \text{if } p>p_c=1/2
\end{cases}
\]
• Mean cluster size: $\langle s\rangle = \frac{3}{4}\,(p_c-p)^{-1}$
• Cluster size distribution: $P_p(|C|=s) = \frac{1}{s}\,C_{2s}^{\,s-1}\,p^{s-1}(1-p)^{s+1}$

Parallels between random-graph theory and percolation

(1) For $p<p_c=1/N$
• The probability of a giant cluster in a graph, and of an infinite cluster in percolation, is equal to 0.
• The clusters of a random graph are trees, while the clusters in percolation have a fractal structure and a perimeter proportional to their volume.
• The largest cluster in a random graph is a tree with $\ln N$ nodes, while in general for percolation $P_p(|C|=s)\sim e^{-s/\xi}$, suggesting that the size of the largest cluster scales as $\ln N$.

(2) For $p=p_c=1/N$
• A unique giant cluster or an infinite cluster appears.
• The size of the giant cluster is $N^{2/3}$, while for infinite-dimensional percolation $P_p(|C|=s)\sim s^{-3/2}$; thus the size of the largest cluster scales as $N^{2/3}$.

(3) For $p>p_c=1/N$
• The size of the giant cluster is $[f(p_cN)-f(pN)]N$, where $f$ is an exponentially decreasing function with $f(1)=1$. The size of the infinite cluster is $PN\propto(p-p_c)N$.
• The giant cluster has a complex structure containing cycles, while the infinite cluster is no longer fractal, but compact.
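
The three regimes around $p_c=1/N$ can be explored numerically on the random-graph side. The following is a minimal sketch, not a reference implementation: it assumes the binomial model with $p=\langle k\rangle/N$, uses a simple union-find structure to track clusters, and the function name `largest_cluster_fraction`, the size $N=1000$, and the seed are all illustrative choices.

```python
import random


def largest_cluster_fraction(num_nodes, mean_degree, rng):
    """Fraction of nodes in the largest cluster of one binomial random graph."""
    p = mean_degree / num_nodes  # <k> = pN, so p_c = 1/N corresponds to <k> = 1
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            # Link each pair with probability p and merge the two clusters.
            if rng.random() < p:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j

    # Count the nodes attached to each root to get the cluster sizes.
    sizes = {}
    for node in range(num_nodes):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / num_nodes


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary fixed seed (an assumption, for reproducibility)
    N = 1000
    for mean_degree in (0.5, 1.0, 2.0):
        fraction = largest_cluster_fraction(N, mean_degree, rng)
        print(f"<k> = {mean_degree}: largest cluster holds {fraction:.3f} of the nodes")
```

For a single finite graph the numbers fluctuate from run to run, but below the threshold the largest cluster should hold only a small fraction of the nodes, while well above it a finite fraction of the nodes should belong to one cluster.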