PPTX

advertisement
Network Science
Class 5: BA model
(Sept 15, 2014)
Albert-László Barabási
With
Roberta Sinatra
www.BarabasiLab.com
Section 1
Introduction
Section 1
Hubs represent the most striking difference between a random and a
scale-free network. Their emergence in many real systems raises
several fundamental questions:
•Why does the random network model of Erdős and Rényi fail to
reproduce the hubs and the power laws observed in many real
networks?
• Why do so different systems as the WWW or the cell converge to a
similar scale-free architecture?
Section 2
Growth and preferential attachment
BA MODEL: Growth
ER model:
the number of nodes, N, is fixed (static models)
networks expand through the addition
of new nodes
Barabási & Albert, Science 286, 509 (1999)
BA MODEL: Preferential attachment
ER model: links are added randomly to the network
New nodes prefer to connect to the more connected nodes
Barabási & Albert, Science 286, 509 (1999)
Network Science: Evolving Network Models February 14, 2011
Section 2: Growth and Preferential Sttachment
The random network model differs from real networks in two important
characteristics:
Growth: While the random network model assumes that the number of
nodes is fixed (time invariant), real networks are the result of a growth
process that continuously increases.
Preferential Attachment: While nodes in random networks randomly choose
their interaction partner, in real networks new nodes prefer to link to the more
connected nodes.
Barabási & Albert, Science 286, 509 (1999)
Network Science: Evolving Network Models February 14, 2011
Section 3
The Barabási-Albert model
Origin of SF networks: Growth and preferential attachment
(1) Networks continuously expand by the
addition of new nodes
WWW : addition of new documents
(2) New nodes prefer to link to highly
connected nodes.
GROWTH:
add a new node with m links
PREFERENTIAL ATTACHMENT:
the probability that a node connects to a node
with k links is proportional to k.
WWW : linking to well known sites
 ( ki ) 
ki
 jk j
P(k) ~k-3
Barabási & Albert, Science 286, 509 (1999)
Network Science: Evolving Network Models February 14, 2011
Section 4
Section 4
Linearized Chord Diagram
Section 4
Degree dynamics
All nodes follow the same growth law
¶ki
ki
µ P(ki ) = A
¶t
å kj
j
Use:
å
¶ki
ki
=
¶t
2t
j
k j = 2mt
During a unit time (time step): Δk=m  A=m
¶ki
¶t
=
ki
2t
æ t öb
ki (t) = mç ÷
è ti ø
b=
k
ò
m
¶ki
=
ki
t
ò
ti
¶t
2t
1
2
β: dynamical exponent
A.-L.Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999)
Network Science: Evolving Network Models February 14, 2011
All nodes follow the same growth law
k(t)~t ½
Degree (k)
SF model:
time
(first mover advantage)
Section 5.3
Section 5
Degree distribution
Degree distribution
æ t öb
ki (t) = mç ÷
è ti ø
b=
1
2
A node i can come with equal probability any time between ti=m0 and t, hence:
1
P(t i ) =
m0 + t
1
P(t i < t ) =
m0 + t
t
ò dt
i
0
¶P(ki (t) < k)
2m2 t 1
-g
\P(k) =
=
~
k
¶k
mo + t k3
=
t
m0 + t
γ=3
A.-L.Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999)
Network Science: Evolving Network Models February 14, 2011
Degree distribution
æ t æb
ki (t) = mæ æ
æt i æ
b=
1
2
2m2 t 1
-g
P(k) =
~
k
mo + t k 3
γ=3
(i) The degree exponent is independent of m.
(ii) As the power-law describes systems of rather different ages and sizes, it is
expected that a correct model should provide a time-independent degree
distribution. Indeed, asymptotically the degree distribution of the BA model is
independent of time (and of the system size N)
 the network reaches a stationary scale-free state.
(iii) The coefficient of the power-law distribution is proportional to m2.
A.-L.Barabási, R. Albert and H. Jeong, Physica A 272, 173 (1999)
Network Science: Evolving Network Models February 14, 2011
The mean field theory offers the correct scaling, BUT it provides the
wrong coefficient of the degree distribution.
So assymptotically it is correct (k ∞), but not correct in details
(particularly for small k).
To fix it, we need to calculate P(k) exactly, which we will do next using a
rate equation based approach.
Network Science: Evolving Network Models February 14, 2011
MFT - Degree Distribution: Rate Equation
< N(k,t) >= tP(K,t)
Number of nodes with degree k at time t.
Since at each timestep we add one node, we have N=t (total number of nodes =number of timesteps)
P(k) =
k
å
j
kj
=
k
2mt
2m: each node adds m links, but each link contributed to the degree of 2 nodes
Total number of
k-nodes
Number of links added to degree k nodes after the arrival of a new node:
Nr. of degree k-1 nodes that acquire
a new link, becoming degree k
Nr. of degree k nodes that acquire a
new link, becoming degree k+1
k -1
P(k -1,t)
2
Preferential
attachment
k
P(k, t)
2
(N +1)P(k,t +1) = NP(k,t) +
# k-nodes at time t+1
k
k
´ NP(k,t) ´ m = P(k,t)
2mt
2
# k-nodes
at time t
k -1
k
P(k -1,t) - P(k,t)
2
2
Gain of knodes via
k-1 k
Loss of knodes via
k k+1
New node adds
m new links to
other nodes
MFT - Degree Distribution: Rate Equation
k -1
k
(N +1)P(k,t +1) = NP(k,t) +
P(k -1,t) - P(k,t)
2
2
# k-nodes at time t+1
# k-nodes
at time t
Gain of knodes via
k-1 k
Loss of knodes via
k k+1
We do not have k=0,1,...,m-1 nodes in the network (each node arrives with degree m)
 We need a separate equation for degree m modes
(N +1)P(m,t +1) = NP(m,t) + 1
# m-nodes at time t+1
# mnodes at
time t
-
Add one
m-degeree
node
m
P(m,t)
2
Loss of an
m-node via
m m+1
Network Science: Evolving Network Models February 14, 2011
MFT - Degree Distribution: Rate Equation
k -1
k
P(k -1,t) - P(k,t)
2
2
m
(N +1)P(m,t +1) = NP(m,t) + 1 - P(m,t)
2
(N +1)P(k,t +1) = NP(k,t) +
k>m
We assume that there is a stationary state in the N=t∞ limit, when P(k,∞)=P(k)
(N +1)P(k,t +1) - NP(k,t) ®NP(k,¥ ) + P(k,¥ ) - NP(k,¥ ) = P(k,¥ ) = P(k)
(N +1)P(m,t +1) - NP(m,t) ®P(m)
k -1
k
P(k -1) - P(k)
2
2
m
P(m) = 1 - P(m)
2
P(k) =
k -1
P(k -1)
k+2
2
P(m) =
2+m
P(k) =
k>m
Network Science: Evolving Network Models February 14, 2011
MFT - Degree Distribution: Rate Equation
k -1
P(k) =
P(k -1)
k+2

k
P(k +1) =
P(k)
k+2
2
m+ 2
m
2m
P(m+1) =
P(m) =
m+ 3
(m+ 2)(m+ 3)
P(m) =
P(m+ 2) =
m+1
2m(m+1)
P(m+1) =
m+ 4
(m+ 2)(m+ 3)(m+ 4)
P(m+ 3) =
m+ 2
2m(m+1)
P(m+ 2) =
m+ 5
(m+ 3)(m+ 4)(m+ 5)
...
2m(m+1)
P(k) =
k(k +1)(k + 2)
Krapivsky, Redner, Leyvraz, PRL 2000
Dorogovtsev, Mendes, Samukhin, PRL 2000
Bollobas et al, Random Struc. Alg. 2001
m+3  k
P(k) ~ k-3
for large k
Network Science: Evolving Network Models February 14, 2011
MFT - Degree Distribution: A Pretty Caveat
Start from eq.
P(k) =
k -1
k
P(k -1) - P(k)
2
2
2P(k) = (k -1)P(k -1) - kP(k) = -P(k -1) - k[P(k) - P(k -1)]
P(k) - P(k -1)
¶P(k)
2P(k) = -P(k -1) - k
= -P(k -1) - k
k - (k -1)
¶k
P(k) = -
Its solution is:
1 ¶[kP(k)]
2
¶k
P(k) ~ k-3
Dorogovtsev and Mendes, 2003
Network Science: Evolving Network Models February 14, 2011
Degree distribution
æ t æb
ki (t) = mæ æ
æt i æ
1
b=
2
γ=3
2m(m+1)
P(k) =
k(k +1)(k + 2)
P(k) ~ k-3
for large k
(i) The degree exponent is independent of m.
(ii) As the power-law describes systems of rather different ages and sizes, it is
expected that a correct model should provide a time-independent degree
distribution. Indeed, asymptotically the degree distribution of the BA model is
independent of time (and of the system size N)
 the network reaches a stationary scale-free state.
(iii) The coefficient of the power-law distribution is proportional to m2.
Network Science: Evolving Network Models February 14, 2011
NUMERICAL SIMULATION OF THE BA MODEL
P(k) =
2m(m+1)
k(k +1)(k + 2)
Section 6
absence of growth and preferential
attachment
MODEL A
growth
preferential attachment
Π(ki) : uniform
¶ki
m
= AP(ki ) =
¶t
m0 + t -1
ki (t) = mln(
m0 + t -1
)+m
m + t i -1
e
k
P(k) = exp(- ) ~ e- k
m
m
MODEL B
growth
preferential attachment
ki
1
N ki
1
 A (ki ) 


t
N
N  1 2t N
N
2( N  1)
2
2 ( N 1)
ki (t ) 
t  Ct
~
t
N ( N  2)
N
pk : power law (initially) 
 Gaussian  Fully Connected
Do we need both growth and
preferential attachment?
YEP.
Network Science: Evolving Network Models February 14, 2011
Section 7
Measuring preferential attachment
Section 7
Measuring preferential attachment
ki
k i
  ( ki ) ~
t
t
Plot the change in the degree Δk during
a fixed time Δt for nodes with degree k.
To reduce noise, plot the integral of Π(k) over k:
(k ) 
 ( K )
K k
No pref. attach:
κ~k
Linear pref. attach:
κ~k2
(Jeong, Neda, A.-L. B, Europhys Letter 2003; cond-mat/0104131)
Section 7
Measuring preferential attachment
citation
network
Internet
Plots shows the integral of
Π(k) over k:
(k ) 
 ( K )
K k
No pref. attach:
κ~k
neurosci
actor
collab
collab.
Linear pref. attach:
κ~k2
 (k )  A  k  ,   1
Network Science: Evolving Network Models February 14, 2011
Section 8
Nonlinear preferenatial attachment
Section 8
Nonlinear preferential attachment
α=0: Reduces to Model A discussed in Section 5.4. The degree distribution follows the
simple exponential function.
α=1: Barabási-Albert model, a scale-free network with degree exponent 3.
0<α<1: Sublinear preferential attachment. New nodes favor the more connected
nodes over the less connected nodes. Yet, for the bias is not sufficient to generate a
scale-free degree distribution. Instead, in this regime the degrees follow the stretched
exponential distribution:
Section 8
Nonlinear preferential attachment
α=0: Reduces to Model A discussed in Section 5.4. The degree distribution follows the
simple exponential function.
α=1: Barabási-Albert model, a scale-free network with degree exponent 3.
α>1: Superlinear preferential attachment. The tendency to link to highly connected
nodes is enhanced, accelerating the “rich-gets-richer” process. The consequence of this
is most obvious for , when the model predicts a winner-takes-all phenomenon: almost
all nodes connect to a single or a few super-hubs.
Section 8
Nonlinear preferential attachment
The growth of the hubs. The nature of preferential attachment affects the degree of the
largest node. While in a scale-free network the biggest hub grows as (green curve), for
sublinear preferential attachment this dependence becomes logarithmic (red curve). For
superlinear preferential attachment the biggest hub grows linearly with time, always grabbing
a finite fraction of all links (blue curve)). The symbols are provided by a numerical simulation;
the dotted lines represent the analytical predictions.
Section 9
The origins of preferential
attachment
hen ce i t i s i n her en t ly local an d r an dom . Un li ke t he Bar abási -Alber t
m odel, i t 9
lacks a bui lt -i n
Section
EXISTING
NETWORK
(k) f un
ct i on . Yet
n ext w e show
t hat i t gen er Link
selection
model
at es pr ef er en t i al at t achm en t .
p
Link selection model -- perhaps the simplest example of a
local
ofn ode
generating
. t he
Weor
st arrandom
t by w r i t i nmechanism
g t he pr obabi li tcapable
y qk t hat
at t he en d of a r an preferential
attachment.
dom ly chosen
li n k has degr ee k as
Growth: at each time stepqk weCkp
add
a new node to the (5.26)
k
network.
Equat i on (5.26) capt ur es t w o ef f ect s:
TARGET
u
CHOOSE TARGET
Link selection: we select a link at random and connect the
• node
The hi gher t he degr ee of a n ode, t he hi gher t he chan ce t hat i t i s lonew
to one of nodes at the two ends of the selected
Figure 5.14
cat ed at t he en d of t he chosen li n k.
link.
Copying Model
• The m or e degr ee-k n odes ar e i n t he n et w or k (i .e., t he hi gher i s p k),
To show
that this simple mechanism generates linear
t he m or e li kely t hat a degr ee k n ode i s at t he en d of t he li n k.
preferential attachment, we write the probability that the
node
at the
of a randomly
link
In (5.26)
C canend
be calculat
ed usi n g t hechosen
n or m ali zat
i on has
con didegree
t i on q =k1,as
k
obt ai n i n g C=1/ k . Hen ce t he pr obabi li t y t o f i n d a degr ee-k n ode at t he en d
of a r an dom ly chosen li n k i s
qk =
kpk
,
k
(5.27)
1-p
u
u
CHOOSE ONE OF THE
OUTGOING LINKS OF TAR
The m ai n st eps of t he copyi ng m odel.
node connect s wi t h pr obabi li t y p t o a r an
chosen t ar get node u, or wi t h pr obabi li t y
one of t he nodes t he t ar get u poi nt s t o. I
wor ds, wi t h pr obabi lt y 1-p t he new node c
li nk of i t s t ar get u.
Section 9
Copying model
(a) Random Connection: with probability p the new node
links to u.
(b) Copying: with probability we randomly choose an
outgoing link of node u and connect the new node to the
selected link's target. Hence the new node “copies” one of
the links of an earlier node
(a) the probability of selecting a node is 1/N.
(b) is equivalent with selecting a node linked to a randomly
selected link. The probability of selecting a degree-k node
through the copying process of step (b) is k/2L for undirected
networks.
The likelihood that the new node will connect to a degree-k
node follows preferential attachment
Social networks: Copy your friend’s friends.
Citation Networks: Copy references from papers we read.
Protein interaction networks: gene duplication,
Section 9
Optimization model
Section 9
Optimization model
Section 9
Optimization model
Section 9
Optimization model
Section 9
Section 10
Diameter and clustering coefficient
Section 10
Diameter
Bollobas, Riordan, 2002
Section 10
Clustering coefficient
Reminder: for a random graph we have:
Crand =
<k>
~ N -1
N
What is the functional form of C(N)?
m (ln N) 2
C=
8
N
Konstantin Klemm, Victor M. Eguiluz,
Growing scale-free networks with small-world behavior,
Phys. Rev. E 65, 057102 (2002), cond-mat/0107607
CLUSTERING COEFFICIENT OF THE BA MODEL
C=
Nr(◃ )
k(k -1)
2
1
2
C=
2
6
Denote the probability to have a link between node i and j with P(i,j)
The probability that three nodes i,j,l form a triangle is P(i,j)P(i,l)P(j,l)
The expected number of triangles in which a node l with degree kl participates is thus:
Nrl (◃ ) =
N
N
i =1
j =1
ò di ò dj P(i, j )P(i,l)P( j,l)
We need to calculate P(i,j).
Network Science: Evolving Network Models February 14, 2011
CLUSTERING COEFFICIENT OF THE BA MODEL
Calculate P(i,j).
Node j arrives at time tj=j and the probability that it
will link to node i with degree ki already in the
network is determined by preferential attachment:
æ t æ1/ 2
æ j æ1/ 2
ki (t) = mæ æ = mæ æ
æi æ
æt i æ
N
N
m (ln N) 2
C=
8
N
N
N
ò di ò dj (ij )
i =1
ki ( j )
j
åk
=m
ki ( j )
2mj
l
l =1
1
m
P(i, j ) = (ij ) 2
2
Where we used that the arrival time of node
j is tj=j and the arrival time of node is ti=i
m3
Nrl (◃ ) = ò di ò dj P(i, j )P(i,l)P( j,l) =
8
i =1
j =1
m3
(ln N) 2
C = 8l
kl (kl -1) /2
P(i, j ) = mP(ki ( j )) = m
-
1
2
j =1
(il)
-
1
2
( jl)
-
1
2
m3
=
8l
æN æ1/ 2 Which is the degree of node l
kl (t) = mæ æ
æ l æ at current time, at time t=N
There is a factor of two difference... Where does it come from?
N
di
ò i
i =1
N
dj m3
ò j = 8l (ln N)2
j =1
Let us approximate:
kl (kl -1) » kl 2 = m2
N
l
Network Science: Evolving Network Models February 14, 2011
Section 10
Clustering coefficient
Reminder: for a random graph we have:
Crand =
<k>
~ N -1
N
What is the functional form of C(N)?
m (ln N) 2
C=
8
N
Konstantin Klemm, Victor M. Eguiluz,
Growing scale-free networks with small-world behavior,
Phys. Rev. E 65, 057102 (2002), cond-mat/0107607
Section 11: Summary
The network grows, but the degree distribution is stationary.
Section 11: Summary
The network grows, but the degree distribution is stationary.
Section 11: Summary
Download