Random Graph Theory

Random-Graph Theory: The Erdős-Rényi model
In mathematical terms a network is represented by a graph. A graph is a
pair of sets G={P,E}, where P is a set of N nodes (or vertices or points)
$P_1, P_2, \ldots, P_N$ and E is a set of edges (or links or lines), each of
which connects two elements of P.
Why random-graph theory matters for networks:
A particularly rich source of ideas has been the study of random graphs,
graphs in which the edges are distributed randomly. Networks with a
complex topology and unknown organizing principles often appear
random; thus random-graph theory is regularly used in the study of
complex networks.
The Erdős-Rényi model
Classic first article:
In their classic first article on random graphs, Erdős and Rényi define a
random graph as N labeled nodes connected by n edges, which are chosen
randomly from the N(N-1)/2 possible edges. In total there are
$C_{N(N-1)/2}^{n}$ graphs with N nodes and n edges, forming a probability
space in which every realization is equiprobable.
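The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sample_gnn` and `count_graphs`, the seed, and the example sizes (10 nodes, 7 edges) are hypothetical and do not come from the original article.

```python
import math
import random

def sample_gnn(num_nodes, num_edges, rng):
    """Pick num_edges distinct edges uniformly at random from all node pairs."""
    all_pairs = [(i, j) for i in range(num_nodes) for j in range(i + 1, num_nodes)]
    return rng.sample(all_pairs, num_edges)

def count_graphs(num_nodes, num_edges):
    """Number of labeled graphs with num_nodes nodes and num_edges edges."""
    return math.comb(num_nodes * (num_nodes - 1) // 2, num_edges)

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
edges = sample_gnn(10, 7, rng)
print(len(edges), "edges sampled out of", 10 * 9 // 2, "possible")
print("equiprobable realizations:", count_graphs(10, 7))
```

Because `rng.sample` draws without replacement, every set of n edges is equally likely, which matches the equiprobable probability space described above.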
The binomial model:
Here we start with N nodes, every pair of nodes being connected with probability p.
Consequently the total number of edges is a random variable with the expectation value
E(n) = p[N(N-1)/2]. If $G_0$ is a graph with nodes $P_1, P_2, \ldots, P_N$ and n edges, the
probability of obtaining it by this graph construction process is
$P(G_0) = p^n (1-p)^{N(N-1)/2 - n}$.
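A minimal sketch of the binomial model, assuming nothing beyond the description above. The helper names `sample_gnp` and `graph_probability`, the seed, and the parameters N = 200, p = 0.05 are illustrative choices, not values from the source.

```python
import random

def sample_gnp(num_nodes, p, rng):
    """Connect every pair of nodes independently with probability p."""
    return [(i, j)
            for i in range(num_nodes)
            for j in range(i + 1, num_nodes)
            if rng.random() < p]

def graph_probability(num_nodes, num_edges, p):
    """P(G0) = p^n (1-p)^(N(N-1)/2 - n) for one specific graph with n edges."""
    num_pairs = num_nodes * (num_nodes - 1) // 2
    return p ** num_edges * (1 - p) ** (num_pairs - num_edges)

rng = random.Random(1)  # fixed seed, chosen only for reproducibility
N, p = 200, 0.05
edge_counts = [len(sample_gnp(N, p, rng)) for _ in range(50)]
mean_edges = sum(edge_counts) / len(edge_counts)
expected_edges = p * N * (N - 1) / 2   # E(n) = p N (N-1) / 2
print("mean edge count over 50 graphs:", mean_edges)
print("expectation value E(n):", expected_edges)
```

Averaged over many realizations, the edge count should settle near E(n), while any single graph can deviate from it.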
We can notice the emergence of trees (a tree of order 3, drawn with long-dashed lines)
and cycles (a cycle of order 3, drawn with short-dashed lines) in the graph, and a
connected cluster that unites half of the nodes at p = 0.15 = 1.5/N.
Random-graph theory studies the properties of the probability space associated with
graphs with N nodes as N→∞.
In this respect Erdős and Rényi used the definition that almost every graph has a
property Q if the probability of having Q approaches 1 as N→∞.
In terms of a critical probability $p_c(N)$:
$$
\lim_{N\to\infty} P_{N,p}(Q) =
\begin{cases}
0, & \text{if } p(N)/p_c(N) \to 0 \\
1, & \text{if } p(N)/p_c(N) \to \infty
\end{cases}
$$
Subgraphs
The first property of random graphs to be studied by Erdős and Rényi was
the appearance of subgraphs.
The simplest examples of subgraphs are cycles, trees, and complete
subgraphs.
Most generally we can ask whether there is a critical probability that marks
the appearance of arbitrary subgraphs consisting of k nodes and l edges.
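Thus the expected number of such subgraphs is:
$$E(X) = C_N^k \frac{k!}{a} p^l \simeq \frac{N^k p^l}{a} \qquad (*)$$
where a is the number of graphs, among the k! permutations of the k nodes, that are
isomorphic to each other. For $p(N) = cN^{-k/l}$ this mean is the finite number
$\lambda = c^l/a$, and the number of subgraphs follows a Poisson distribution:
$$\lim_{N\to\infty} P_p(X = r) = e^{-\lambda} \frac{\lambda^r}{r!}$$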
Thus, indeed, the critical probability at which almost every graph contains a
subgraph with k nodes and l edges is $p_c(N) = cN^{-k/l}$.
A few important special cases directly follow from Eq. (*):
(a) The critical probability of having a tree of order k is $p_c(N) = cN^{-k/(k-1)}$;
(b) The critical probability of having a cycle of order k is $p_c(N) = cN^{-1}$;
(c) The critical probability of having a complete subgraph of order k is
$p_c(N) = cN^{-2/(k-1)}$.
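The three special cases come from inserting the edge count l of each subgraph type into the exponent -k/l. The short sketch below does this with exact fractions; the function names are hypothetical and serve only to illustrate the arithmetic.

```python
from fractions import Fraction

def threshold_exponent(k, l):
    """Exponent z such that p_c(N) ~ N^z for a subgraph with k nodes and l edges."""
    return Fraction(-k, l)

def tree_exponent(k):
    """A tree of order k has k - 1 edges."""
    return threshold_exponent(k, k - 1)

def cycle_exponent(k):
    """A cycle of order k has k edges."""
    return threshold_exponent(k, k)

def complete_exponent(k):
    """A complete subgraph of order k has k(k-1)/2 edges."""
    return threshold_exponent(k, k * (k - 1) // 2)

for k in (3, 4, 5):
    print("order", k,
          "tree:", tree_exponent(k),
          "cycle:", cycle_exponent(k),
          "complete:", complete_exponent(k))
```

For a complete subgraph the exponent -k / (k(k-1)/2) reduces to -2/(k-1), which is case (c).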
The threshold probabilities at which different subgraphs appear in a random graph. For
$pN^{3/2} \to 0$ the graph consists of isolated nodes and edges. For $p \sim N^{-3/2}$ trees of
order 3 appear, while for $p \sim N^{-4/3}$ trees of order 4 appear. At $p \sim N^{-1}$ trees of all
orders are present, and at the same time cycles of all orders appear. The probability
$p \sim N^{-2/3}$ marks the appearance of complete subgraphs of order 4 and $p \sim N^{-1/2}$
corresponds to complete subgraphs of order 5. As the exponent z in $p \sim N^z$ approaches 0,
the graph contains complete subgraphs of increasing order.
Graph evolution
It is instructive to look at the results discussed above from a different point of view.
Consider a random graph with N nodes and assume that the connection probability
p(N) scales as $N^z$, where z is a tunable parameter that can take any value between -∞
and 0.
If 0 < <k> < 1, where <k> = pN is the average degree, almost surely all clusters are
either trees or clusters containing exactly one cycle. Although cycles are present,
almost all nodes belong to trees. The mean number of clusters is of order N - n, where
n is the number of edges, i.e., in this range when a new edge is added the number of
clusters decreases by 1. The largest cluster is a tree, and its size is proportional
to ln N. When <k> passes the threshold <k>c = 1, the structure of the graph changes
abruptly. While for <k> < 1 the largest cluster is a tree, for <k> = <k>c = 1 it has
approximately $N^{2/3}$ nodes and has a rather complex structure.
Moreover for <k> >1 the greatest (giant) cluster has [1-f(<k>)]N nodes, where f(x) is a
function that decreases exponentially from f(1)=1 to 0 for x→∞. Thus a finite fraction
S=1-f(<k>) of the nodes belongs to the largest cluster.
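The abrupt change at <k> = 1 can be seen in a small simulation. The sketch below rests on two assumptions that are not in the text above: it uses the standard self-consistency equation S = 1 - exp(-<k> S) for the giant-cluster fraction (equivalently f(<k>) = 1 - S), and it picks N = 1000 and the three mean degrees purely for illustration. Function names and the seed are hypothetical.

```python
import math
import random

def largest_cluster_fraction(num_nodes, mean_degree, rng):
    """Simulate G(N, p) with p = <k>/N and return the largest cluster's share of nodes."""
    p = mean_degree / num_nodes
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            if rng.random() < p:
                parent[find(i)] = find(j)   # merge the two clusters

    sizes = {}
    for node in range(num_nodes):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / num_nodes

def giant_fraction_theory(mean_degree, iterations=200):
    """Fixed-point iteration for S = 1 - exp(-<k> S) (assumed standard result)."""
    s = 1.0
    for _ in range(iterations):
        s = 1.0 - math.exp(-mean_degree * s)
    return s

rng = random.Random(2)  # fixed seed, chosen only for reproducibility
for mean_degree in (0.5, 1.5, 3.0):
    simulated = largest_cluster_fraction(1000, mean_degree, rng)
    print(mean_degree, round(simulated, 3), round(giant_fraction_theory(mean_degree), 3))
```

Below the threshold the largest cluster holds only a vanishing share of the nodes; above it the simulated share should approach the fixed-point value as N grows.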
Degree distribution
In a random graph with connection probability p the degree $k_i$ of a node i follows a
binomial distribution with parameters N-1 and p:
$$P(k_i = k) = C_{N-1}^{k} p^k (1-p)^{N-1-k}.$$
To find the degree distribution of the graph, we need to study the number of
nodes with degree k, $X_k$. Our main goal is to determine the probability that $X_k$
takes on a given value, $P(X_k = r)$.
The expectation value of the number of nodes with degree k is:
$$E(X_k) = N P(k_i = k) = \lambda_k, \quad \text{where} \quad \lambda_k = N C_{N-1}^{k} p^k (1-p)^{N-1-k}.$$
The distribution of the $X_k$ values, $P(X_k = r)$, approaches a Poisson distribution,
$$P(X_k = r) = e^{-\lambda_k} \frac{\lambda_k^r}{r!}.$$
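The quantities $\lambda_k$ and $P(X_k = r)$ are easy to evaluate numerically. The sketch below works in log space, an implementation choice made here to avoid overflow in the binomial coefficient; the function names are hypothetical, and N = 10,000 with p = 0.0015 are example parameters.

```python
import math

def log_binomial_pmf(n, k, p):
    """log of C(n, k) p^k (1-p)^(n-k), using log-gamma to avoid overflow."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)

def expected_nodes_with_degree(num_nodes, p, k):
    """lambda_k = N * P(k_i = k), with k_i following Binomial(N - 1, p)."""
    return num_nodes * math.exp(log_binomial_pmf(num_nodes - 1, k, p))

def poisson_pmf(mean, r):
    """P(X_k = r) = exp(-lambda_k) * lambda_k^r / r!, for mean > 0."""
    return math.exp(-mean + r * math.log(mean) - math.lgamma(r + 1))

N, p = 10_000, 0.0015   # example parameters
lam = {k: expected_nodes_with_degree(N, p, k) for k in range(0, 41)}
print("lambda_15 =", round(lam[15], 1))
print("P(X_40 = 0) =", round(poisson_pmf(lam[40], 0), 4))
```

Summing $\lambda_k$ over all k returns N, since every node has exactly one degree.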
The degree distribution that results from the numerical simulation of a random graph.
We generated a single random graph with N = 10,000 nodes and connection probability
p = 0.0015, and calculated the number of nodes with degree k, $X_k$. The plot compares
$X_k/N$ with the expectation value of the Poisson distribution $E(X_k)/N = P(k_i = k)$, and we
can see that the deviation is small.
Connectedness and diameter
$$d \sim \frac{\ln(N)}{\ln(\langle k \rangle)}$$
• If <k>=pN<1, a typical graph is composed of isolated trees and its diameter equals the
diameter of a tree.
• If <k> > 1, a giant cluster appears. The diameter of the graph equals the diameter of
the giant cluster if <k> > 3.5, and is proportional to ln(N)/ln(<k>).
• If <k> > ln(N), almost every graph is totally connected. The diameters of the graphs
having the same N and <k> are concentrated on a few values around ln(N)/ln(<k>).
The average path length of a random graph scales in the same way:
$$l_{rand} \sim \frac{\ln(N)}{\ln(\langle k \rangle)}$$
Clustering coefficient
$$C_{rand} = p = \frac{\langle k \rangle}{N}$$
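As a rough numerical check of $C_{rand} = p$, the sketch below averages the local clustering coefficient over the nodes of one simulated graph. The parameters N = 300 and p = 0.1, the seed, and the function name are hypothetical. Skipping nodes with fewer than two neighbors is a convention assumed here.

```python
import random

def average_clustering(adj_sets):
    """Mean over nodes of (linked neighbor pairs) / (all neighbor pairs)."""
    local_values = []
    for neighbors in adj_sets:
        nbrs = sorted(neighbors)
        if len(nbrs) < 2:
            continue  # local clustering is undefined for degree 0 or 1
        pairs = len(nbrs) * (len(nbrs) - 1) // 2
        linked = sum(1
                     for a in range(len(nbrs))
                     for b in range(a + 1, len(nbrs))
                     if nbrs[b] in adj_sets[nbrs[a]])
        local_values.append(linked / pairs)
    return sum(local_values) / len(local_values) if local_values else 0.0

rng = random.Random(4)  # fixed seed, chosen only for reproducibility
N, p = 300, 0.1
adj_sets = [set() for _ in range(N)]
for i in range(N):
    for j in range(i + 1, N):
        if rng.random() < p:
            adj_sets[i].add(j)
            adj_sets[j].add(i)

C = average_clustering(adj_sets)
print("measured C:", round(C, 3), "p:", p)
```

In a random graph two neighbors of a node are linked with the same probability p as any other pair, which is why the measured value should stay close to p.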
PERCOLATION THEORY
Illustration of bond percolation in 2D. The nodes are placed on a 25×25 square
lattice, and two nodes are connected by an edge with probability p. For p=0.315
(left), which is below the percolation threshold pc=0.5, the connected nodes
form isolated clusters. For p=0.525 (right), which is above the percolation
threshold, the largest cluster percolates.
(1) The percolation probability P, denoting the probability that a given node belongs to
the infinite cluster:
$$P = P_p(|C| = \infty) = 1 - \sum_{s<\infty} P_p(|C| = s),$$
where $P_p(|C| = s)$ denotes the probability that the cluster at the origin has size s.
Obviously
$$
P \begin{cases}
= 0, & \text{if } p < p_c \\
> 0, & \text{if } p > p_c
\end{cases}
$$
(2) The average cluster size <s>, defined as
$$\langle s \rangle = E_p(|C|) = \sum_{s=1}^{\infty} s P_p(|C| = s),$$
giving the expectation value of cluster sizes. Because
<s> is infinite when P>0, in this case it is useful to work with the average size of the
finite clusters by taking away from the system the infinite (|C| = ∞) cluster:
$$\langle s^f \rangle = E_p(|C|, |C| < \infty) = \sum_{s<\infty} s P_p(|C| = s).$$
(3) The cluster size distribution $n_s$, defined as the probability of a node's having a fixed
position in a cluster of size s (for example, being its left-hand end, if this
position is uniquely defined),
$$n_s = \frac{1}{s} P_p(|C| = s).$$
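The three quantities defined above can be estimated on a finite lattice. In the sketch below, the largest cluster stands in for the infinite cluster, which is an assumption that only makes sense as a finite-size approximation. The function names and the seed are hypothetical; the 25×25 lattice and the two bond probabilities, 0.315 and 0.525, are illustrative.

```python
import random
from collections import Counter

def bond_percolation_clusters(side, p, rng):
    """Open each nearest-neighbor bond of a side x side lattice with probability p
    and return the list of cluster sizes."""
    parent = list(range(side * side))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for row in range(side):
        for col in range(side):
            node = row * side + col
            if col + 1 < side and rng.random() < p:   # bond to the right neighbor
                parent[find(node)] = find(node + 1)
            if row + 1 < side and rng.random() < p:   # bond to the lower neighbor
                parent[find(node)] = find(node + side)
    return list(Counter(find(node) for node in range(side * side)).values())

def summarize(cluster_sizes):
    """Return (largest-cluster fraction, node-weighted mean size of the other clusters)."""
    total = sum(cluster_sizes)
    largest = max(cluster_sizes)
    finite = sorted(cluster_sizes)[:-1]   # drop one copy of the largest cluster
    # A node sits in a particular cluster of size s with probability s / total,
    # so sum(s * P(|C| = s)) becomes sum(s * s) / total over the finite clusters.
    mean_finite = sum(s * s for s in finite) / total
    return largest / total, mean_finite

rng = random.Random(5)  # fixed seed, chosen only for reproducibility
for p in (0.315, 0.525):
    fraction, mean_finite = summarize(bond_percolation_clusters(25, p, rng))
    print(p, round(fraction, 3), round(mean_finite, 2))
```

The cluster size distribution $n_s$ corresponds here to the number of clusters of size s divided by the number of nodes, which can be read off the same list of sizes.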
Example of a Cayley tree
with coordination number
z=3.
• Percolation threshold:
$$p_c = \frac{1}{z-1}$$
• Percolation probability:
$$
P = \begin{cases}
0, & \text{if } p < p_c = 1/2 \\
(2p-1)/p^2, & \text{if } p > p_c = 1/2
\end{cases}
$$
• Mean cluster size:
$$\langle s \rangle = \frac{3}{4} (p_c - p)^{-1}$$
• Cluster size distribution:
$$P_p(|C| = s) = \frac{1}{s} C_{2s}^{s-1} p^{s-1} (1-p)^{s+1}$$
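The cluster size distribution and the percolation probability for z = 3 are linked by P = 1 - Σ P_p(|C| = s), so one can be checked numerically against the other. The sketch below does that; the function names and the truncation point `max_size` are hypothetical choices, and the log-space evaluation is an implementation detail chosen here to avoid overflow.

```python
import math

def cluster_size_probability(s, p):
    """P_p(|C| = s) = (1/s) C(2s, s-1) p^(s-1) (1-p)^(s+1), for 0 < p < 1."""
    # log C(2s, s-1) = log (2s)! - log (s-1)! - log (s+1)!
    log_comb = math.lgamma(2 * s + 1) - math.lgamma(s) - math.lgamma(s + 2)
    log_prob = (-math.log(s) + log_comb
                + (s - 1) * math.log(p) + (s + 1) * math.log(1 - p))
    return math.exp(log_prob)

def percolation_probability_from_sum(p, max_size=5000):
    """P = 1 - sum over finite s of P_p(|C| = s), truncated at max_size."""
    return 1.0 - sum(cluster_size_probability(s, p) for s in range(1, max_size + 1))

def percolation_probability_closed_form(p):
    """P = 0 below p_c = 1/2 and (2p - 1) / p^2 above it."""
    return 0.0 if p <= 0.5 else (2 * p - 1) / p ** 2

for p in (0.3, 0.6, 0.75):
    print(p,
          round(percolation_probability_from_sum(p), 6),
          round(percolation_probability_closed_form(p), 6))
```

Close to p = 1/2 the terms decay slowly, so the truncation at `max_size` would need to be raised for the two columns to agree there.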
Parallels between random-graph theory and percolation
(1) For p<pc=1/N
• The probability of a giant cluster in a graph, and of an infinite cluster in percolation, is
equal to 0.
• The clusters of a random graph are trees, while the clusters in percolation have a
fractal structure and a perimeter proportional to their volume.
• The largest cluster in a random graph is a tree with ln(N) nodes, while in general for
percolation $P_p(|C| = s) \sim e^{-s/\xi}$, suggesting that the size of the largest cluster scales as
ln(N).
(2) For p=pc=1/N
• A unique giant cluster or an infinite cluster appears.
• The size of the giant cluster is $N^{2/3}$; while for infinite-dimensional
percolation $P_p(|C| = s) \sim s^{-3/2}$, thus the size of the largest cluster scales as $N^{2/3}$.
(3) For p>pc=1/N
• The size of the giant cluster is $(f(p_c N) - f(pN))N$, where f is an exponentially decreasing
function with f(1)=1. The size of the infinite cluster is $PN \propto (p - p_c)N$.
• The giant cluster has a complex structure containing cycles, while the infinite cluster
is no longer fractal, but compact.