Degree distribution is scale free

LECTURE 2
1. Complex Network Models
2. Properties of Protein-Protein Interaction
Networks
Complex Network Models:
Average path length L, clustering coefficient C, and degree
distribution P(k) help us understand the structure of a
network.
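As an illustration, the three quantities can be computed directly from an adjacency list. The sketch below is only one possible way to do it; the helper names (`ring_lattice`, `average_path_length`, `clustering_coefficient`, `degree_distribution`) and the 20-node ring lattice with 4 neighbours per node are assumptions chosen for the example.

```python
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def average_path_length(adj):
    """Mean shortest-path length over reachable pairs, using BFS from every node."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering_coefficient(adj):
    """Average over nodes of (links among neighbours) / (possible links among neighbours)."""
    values = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            values.append(0.0)
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values)


def degree_distribution(adj):
    """P(k): fraction of nodes that have degree k."""
    counts = {}
    for nbrs in adj.values():
        counts[len(nbrs)] = counts.get(len(nbrs), 0) + 1
    return {k: c / len(adj) for k, c in sorted(counts.items())}


lattice = ring_lattice(20, 4)
print("L =", average_path_length(lattice))
print("C =", clustering_coefficient(lattice))
print("P(k) =", degree_distribution(lattice))
```

For a ring lattice every node has the same degree, so P(k) has a single entry, which is the delta-type distribution of a regular network.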
Some well-known types of Network Models are as
follows:
•Regular Coupled Networks
•Random Graphs
•Small world Model
•Scale-free Model
•Hierarchical Networks
Regular networks
Examples: diamond crystal and graphite crystal (both diamond and
graphite are carbon)
Regular network (A ring lattice)
Average path length L is
high
Clustering coefficient C is
high
Degree distribution is delta
type.
[Plot: P(k) versus k, with a single peak of height 1]
Random Graph
Erdős and Rényi introduced the concept of the random
graph in 1959.
Random Graph
N = 10
Emax = N(N-1)/2 = 45
[Figure: example random graphs for p = 0, p = 0.1, p = 0.15 and p = 0.25]
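A minimal sketch of this construction: each of the Emax possible edges is included independently with probability p. The function name `gnp_random_graph` and the `seed` argument are assumptions made for the example.

```python
import random
from itertools import combinations


def gnp_random_graph(n, p, seed=None):
    """Keep each of the n(n-1)/2 possible edges independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


n = 10
e_max = n * (n - 1) // 2  # maximum number of edges, 45 for N = 10
for p in (0, 0.1, 0.15, 0.25):
    edges = gnp_random_graph(n, p, seed=1)
    print(f"p={p}: {len(edges)} edges (expected about {p * e_max:.1f})")
```

The number of edges fluctuates around p × Emax from one run to the next; only p = 0 gives a fixed result (no edges).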
Random Graph
Average path length L is
Low
Clustering coefficient C is
low
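Degree distribution is
Poisson, with an exponentially decaying tail:
$P(k) \approx e^{-\lambda} \lambda^{k} / k!$, where λ is the average degree.
[Figure: random graph with p = 0.25]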
Random Graph
Usually, to compare a real network with a
random network, we first generate a random
network of the same size, i.e. with the same
number of nodes and edges.
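One simple way to obtain such a same-size random network is to draw the required number of edges uniformly from all possible node pairs. This is a sketch only; the function name `gnm_random_graph` is an assumption, and the example sizes (300 nodes, 287 edges) are those of the E. coli network discussed later in this lecture.

```python
import random
from itertools import combinations


def gnm_random_graph(n, m, seed=None):
    """Random graph with exactly n nodes and m distinct edges."""
    rng = random.Random(seed)
    possible_edges = list(combinations(range(n), 2))
    return rng.sample(possible_edges, m)  # sampling without replacement


null_model = gnm_random_graph(300, 287, seed=0)
print(len(null_model), "edges in the random counterpart")
```

Properties such as L and C can then be measured on the random counterpart and compared with the values of the real network.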
Besides Erdős-Rényi random graphs, there
are other types of random graphs:
•A random graph can be constructed so that it
matches the degree distribution or some other
topological property of a given graph
•Geometric random graphs
Small world model (Watts and Strogatz)
Often, soon after meeting a stranger, one is surprised to
find that the two have a friend in common, so they both
cheer:
“What a small world!”
Small world model (Watts and Strogatz)
Begin with a nearest-neighbor coupled network
Randomly rewire each edge of the network with some
probability p
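The two steps can be sketched as follows. This is an illustrative implementation under stated assumptions, not necessarily the exact procedure of Watts and Strogatz: here a rewired edge keeps one endpoint and moves the other to a uniformly chosen node, avoiding self-loops and duplicate edges. The function name `watts_strogatz` and its parameters are assumptions.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Ring lattice with k neighbours per node, each edge rewired with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}

    # Step 1: nearest-neighbor coupled network (ring lattice).
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)

    # Step 2: visit each lattice edge (v, w) and rewire it with probability p.
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if w in adj[v] and rng.random() < p:
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj


small_world = watts_strogatz(20, 4, 0.1, seed=1)
print(sum(len(nbrs) for nbrs in small_world.values()) // 2, "edges")
```

With p = 0 the result is the regular lattice; as p grows, the graph moves toward a random graph while keeping the same number of edges.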
Small world model (Watts and Strogatz)
Average path length L is
Low
Clustering coefficient C is
high
Degree distribution is
exponential type.
[Plot: P(k) versus k]
Scale-free model (Barabási and Albert)
Start with a small number of nodes; at every time step,
a new node is introduced and is connected to already
existing nodes following Preferential Attachment
(the probability that a new node connects to a given node
is higher for high-degree nodes)
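A minimal sketch of growth with preferential attachment. Several details are assumptions made for the example and are not specified above: the starting graph is a complete graph on m + 1 nodes, each new node attaches to m distinct existing nodes, and the attachment probability is taken to be proportional to degree. The function name `barabasi_albert` is also an assumption.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed seed graph: complete graph on m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Every node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` picks a node with probability proportional to degree.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for target in targets:
            edges.append((target, new))
            stubs.extend((target, new))
    return edges


edges = barabasi_albert(200, 2, seed=0)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print("largest degree:", max(degree.values()))
```

Early nodes tend to accumulate many links and become hubs, which is what produces the heavy tail of the degree distribution.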
Average path length L is
Low
Clustering coefficient C is
not clearly known.
Degree distribution is
power-law type.
$P(k) \sim k^{-\gamma}$
[Plot: P(k) versus k on log-log axes, with lines for γ = 2 and γ = 3]
Scale-free networks exhibit robustness
Robustness – The ability of complex systems to maintain their
function even when the structure of the system changes significantly
Tolerant to random removal of nodes (mutations)
Vulnerable to targeted attack of hubs (mutations) – Drug
targets
Scale-free model (Barabási and Albert)
The term “scale-free” refers to any functional form
f(x) that remains unchanged to within a
multiplicative factor under a rescaling of the
independent variable x, i.e. f(ax) = bf(x).
The only solutions to f(ax) = bf(x) are power-law forms
($P(k) \sim k^{-\gamma}$), and
hence “power-law” is referred to as “scale-free”.
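A short sketch of why a power law follows from f(ax) = bf(x). Two assumptions are added here that the slide does not state: f is differentiable, and the factor b depends only on a, written b(a).

```latex
\begin{align*}
f(ax) &= b(a)\, f(x) \\
x\, f'(ax) &= b'(a)\, f(x) && \text{(differentiate with respect to } a\text{)} \\
x\, f'(x) &= b'(1)\, f(x) && \text{(set } a = 1\text{)} \\
f(x) &= f(1)\, x^{b'(1)} = f(1)\, x^{-\gamma}, && \gamma \equiv -b'(1)
\end{align*}
```

Conversely, a power law satisfies the condition with $b = a^{-\gamma}$, since $(ax)^{-\gamma} = a^{-\gamma} x^{-\gamma}$.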
Hierarchical Graphs
NETWORK BIOLOGY: UNDERSTANDING THE CELL’S FUNCTIONAL ORGANIZATION
Albert-László Barabási & Zoltán N. Oltvai
NATURE REVIEWS | GENETICS VOLUME 5 | FEBRUARY 2004 | 101
The starting point of this construction
is a small cluster of four densely
linked nodes (see the four central
nodes in figure). Next, three replicas of
this module are generated and the
three external nodes of the replicated
clusters connected to the central node
of the old cluster, which produces a
large 16-node module. Three replicas
of this 16-node module are then
generated and the 12 peripheral nodes
connected to the central node of the
old module, which produces a new
module of 64 nodes.
Hierarchical Graphs
The hierarchical network model seamlessly integrates a scale-free topology with
an inherent modular structure by generating a network that has a power-law
degree distribution with degree exponent γ = 1 + ln4/ln3 = 2.26 and a large,
system-size independent average clustering coefficient <C> ~ 0.6. The most
important signature of hierarchical modularity is the scaling of the clustering
coefficient, which follows $C(k) \sim k^{-1}$, a straight line of slope -1 on a log-log plot.
Comparison of random, scale-free and hierarchical networks
protein-protein interaction
Typical protein-protein interaction
A protein binds with one or several other proteins in
order to perform different biological functions. The resulting
assemblies are called protein complexes.
protein-protein interaction
This complex transports oxygen
from the lungs to cells all
over the body through
blood circulation.
PROTEIN-PROTEIN INTERACTIONS
by Catherine Royer
Biophysics Textbook Online
Network of interactions and complexes
•Protein-protein interaction data are usually produced by
laboratory experiments (yeast two-hybrid, pull-down
assay, etc.)
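[Figure: a detected complex with bait protein A and interacting proteins B, C, D, E and F, converted to binary interactions in two ways. Spoke approach: A is linked to each of B, C, D, E and F. Matrix approach: every pair of proteins in the complex is linked.]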
•The results of the experiments are converted to binary
interactions.
•The binary interactions can be represented as a
network/graph where a node represents a protein and an edge
represents an interaction.
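The conversion of a detected complex into binary interactions can be sketched as below, under the usual reading of the two approaches: the spoke approach links the bait to each interacting protein, and the matrix approach links every pair of proteins in the complex. The function names `spoke_model` and `matrix_model` are assumptions; the labels A to F follow the slide's example with A as the bait.

```python
from itertools import combinations


def spoke_model(bait, preys):
    """One interaction between the bait and each co-detected protein."""
    return [(bait, prey) for prey in preys]


def matrix_model(bait, preys):
    """One interaction between every pair of proteins in the complex."""
    return list(combinations([bait, *preys], 2))


bait, preys = "A", ["B", "C", "D", "E", "F"]
print(len(spoke_model(bait, preys)), "spoke interactions")    # 5
print(len(matrix_model(bait, preys)), "matrix interactions")  # 15
```

The matrix approach always yields at least as many interactions as the spoke approach, so the choice affects the degree and clustering of the resulting network.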
Network of interactions
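List of interactions:
AtpB AtpA
AtpG AtpE
AtpA AtpH
AtpB AtpH
AtpG AtpH
AtpE AtpH
[Figure: corresponding network]
Adjacency matrix (rows and columns in the order AtpB, AtpG, AtpA, AtpE, AtpH, which is consistent with the list):
00101
00011
10001
01001
11110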
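The step from a list of interactions to an adjacency matrix can be sketched as follows. The pairing of the protein names into six interactions and the row order (AtpB, AtpG, AtpA, AtpE, AtpH) are a reading of the slide that reproduces its matrix, and should be treated as an assumption; the function name `adjacency_matrix` is also illustrative.

```python
# Assumed pairing of the slide's interaction list.
interactions = [
    ("AtpB", "AtpA"),
    ("AtpG", "AtpE"),
    ("AtpA", "AtpH"),
    ("AtpB", "AtpH"),
    ("AtpG", "AtpH"),
    ("AtpE", "AtpH"),
]
# Assumed row and column order.
proteins = ["AtpB", "AtpG", "AtpA", "AtpE", "AtpH"]


def adjacency_matrix(edges, nodes):
    """Symmetric 0/1 matrix: entry [i][j] is 1 if nodes i and j interact."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # interactions are undirected
    return matrix


for row in adjacency_matrix(interactions, proteins):
    print("".join(str(x) for x in row))
```

The row sums of the matrix give the degree of each protein; here AtpH has degree 4 and every other protein has degree 2.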
The yeast protein interaction network evolves rapidly and contains
few redundant duplicate genes, by A. Wagner.
Mol. Biol. Evol. 2001
S. cerevisiae: 985 proteins and 899 interactions
Giant component consists of 466 proteins
Average degree ~ 2
Clustering coefficient = 0.022
Degree distribution is scale free
An E. coli interaction network from DIP
(http://dip.mbi.ucla.edu/).
E. coli: 300 proteins and 287 interactions
The components of this graph were determined by applying the
depth-first search algorithm
There are 62 components in total
The giant component contains 93 proteins
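A minimal sketch of finding the components of a graph with depth-first search, here in an iterative form with an explicit stack. The function name `connected_components` and the toy protein labels are hypothetical; this is not the DIP data.

```python
def connected_components(nodes, edges):
    """Return the components of an undirected graph, largest first."""
    adj = {v: [] for v in nodes}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search from an unvisited node collects one component.
        seen.add(start)
        stack, component = [start], []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical toy network: 7 proteins, 3 interactions.
toy_nodes = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
toy_edges = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
parts = connected_components(toy_nodes, toy_edges)
print(len(parts), "components; giant component has", len(parts[0]), "proteins")
```

The first component in the returned list is the giant component, and isolated proteins appear as components of size 1.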
[Plot: log(number of nodes) versus log(degree) for the E. coli network]
Average degree ~ 1.913
Clustering coefficient = 0.29
Degree distribution ~ scale free
Lethality and Centrality in protein networks by
H. Jeong, S. P. Mason, A.-L. Barabási, Z. N. Oltvai
Nature, May 2001
S. cerevisiae: 1870 proteins and 2240 interactions
Almost all proteins are connected
Degree distribution is scale free
PPI network based on the MIPS database, consisting of 4546 proteins and
12319 interactions
Average degree 5.42
Clustering coefficient = 0.18
Giant component consists of 4385 proteins
Degree distribution ~ scale free
[Plot: degree distribution of the MIPS-based PPI network]
# of proteins   # of interactions   Average degree   Clustering coefficient   Giant component   Degree distribution
985             899                 ~2               0.022                    Exists, 47.3%     Power law
300             287                 1.913            0.29                     Exists, 31%       Almost power law
1870            2240                ______           ______                   Exists, ~100%     Power law
4546            12319               5.42             0.18                     Exists, ~96%      Not exactly power law
A complete PPI network tends to be a connected graph
and tends to have a power-law degree distribution
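The average degree and giant component share reported for these networks can be checked from the node, edge and giant component counts, using average degree = 2E/N. This is a small arithmetic sketch; the tuple layout and function names are assumptions, and only the networks for which a giant component size is stated are included.

```python
# (label, proteins N, interactions E, proteins in giant component)
networks = [
    ("Wagner 2001, S. cerevisiae", 985, 899, 466),
    ("DIP, E. coli", 300, 287, 93),
    ("MIPS", 4546, 12319, 4385),
]


def average_degree(n_nodes, n_edges):
    """Each edge contributes to the degree of two nodes."""
    return 2 * n_edges / n_nodes


def giant_component_share(n_nodes, n_giant):
    """Percentage of all proteins that lie in the giant component."""
    return 100 * n_giant / n_nodes


for label, n, e, giant in networks:
    print(f"{label}: average degree {average_degree(n, e):.3f}, "
          f"giant component {giant_component_share(n, giant):.1f}%")
```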
We learnt
1. Properties of some complex network
models
2. Properties of Protein-Protein Interaction
Networks