Robustness, clustering & evolutionary conservation Stefan Wuchty Center of Network Research

Department of Physics
University of Notre Dame
Complex systems
Made of many non-identical elements
connected by diverse interactions.
NETWORK
GENOME: protein-gene interactions
PROTEOME: protein-protein interactions
METABOLISM: biochemical reactions (e.g. the citrate cycle)
PROTEOME: protein-protein interactions
Yeast protein network
Nodes: proteins
Links: physical interactions (binding)
P. Uetz, et al. Nature, 2000; Ito et al., PNAS, 2001; …
Topology of the protein network
$P(k) \sim (k + k_0)^{-\gamma} \exp\left(-\frac{k + k_0}{k_\tau}\right)$
H. Jeong, S.P. Mason, A.-L. Barabasi & Z.N. Oltvai, Nature, 2001
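As a rough illustration of how such a degree distribution could be inspected, the sketch below tabulates the empirical P(k) from an undirected edge list and evaluates a power law with exponential cutoff of the form (k + k0)^(-gamma) * exp(-(k + k0)/k_tau). The function names and the edge-list input format are assumptions made for this example, and no parameter values are implied for the yeast data.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Return {k: fraction of nodes with degree k} for an undirected edge list.

    Assumes each interaction is listed once; self-interactions are ignored.
    """
    degree = Counter()
    for u, v in edges:
        if u == v:
            continue  # skip self-interactions
        degree[u] += 1
        degree[v] += 1
    n_nodes = len(degree)
    counts = Counter(degree.values())
    return {k: c / n_nodes for k, c in sorted(counts.items())}


def cutoff_power_law(k, k0, gamma, k_tau):
    """Unnormalised (k + k0)^(-gamma) * exp(-(k + k0) / k_tau)."""
    return (k + k0) ** (-gamma) * math.exp(-(k + k0) / k_tau)
```

Comparing the two on a log-log scale is one simple way to judge whether the cutoff form is plausible for a given network.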
Robustness
Complex systems maintain their basic functions
even under errors and failures
(cell → mutations; Internet → router breakdowns)
[Figure: S versus fraction of removed nodes, f, under node failure; both axes run from 0 to 1, with the threshold fc marked on the f axis]
Robustness of scale-free networks
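Failures: topological error tolerance
γ ≤ 3: fc = 1 (R. Cohen et al., PRL, 2000)
Attacks
[Figure: S versus fraction of removed nodes, f, for failures and attacks, with the threshold fc marked on the f axis; R. Albert et al., Nature, 2000]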
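The difference between failures and attacks can be sketched with a small node-removal experiment. The code below is a minimal illustration, assuming S is measured as the size of the largest connected cluster relative to the original number of nodes; the function names are hypothetical, and a random removal order stands in for failures while a degree-ranked order stands in for attacks.

```python
from collections import defaultdict


def largest_cluster_fraction(edges, removed=()):
    """Size of the largest connected cluster after deleting `removed` nodes,
    as a fraction of the original node count (assumed definition of S)."""
    removed = set(removed)
    nodes = set()
    adj = defaultdict(set)
    for u, v in edges:
        nodes.update((u, v))
        if u == v or u in removed or v in removed:
            continue
        adj[u].add(v)
        adj[v].add(u)
    if not nodes:
        return 0.0
    seen, best = set(), 0
    for start in nodes - removed:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one cluster
            x = stack.pop()
            size += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        best = max(best, size)
    return best / len(nodes)


def attack_order(edges):
    """Nodes sorted from most to least connected (a simple attack strategy)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree, key=degree.get, reverse=True)


def removal_curve(edges, order):
    """S after removing the first 1, 2, ... nodes of `order`."""
    return [largest_cluster_fraction(edges, order[:i + 1])
            for i in range(len(order))]
```

Passing a shuffled node list to `removal_curve` gives a failure curve, and passing `attack_order(edges)` gives an attack curve for the same network.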
Yeast protein network
- lethality and topological position -
Highly connected proteins are more essential (lethal)...
H. Jeong et al., Nature, 2001
Modules in biological systems
Metabolic networks
E. Ravasz et al., Science, 2002
Protein networks
Can we identify the modules?
$O_T(i,j) = \frac{J(i,j)}{\min(k_i, k_j)}$
J(i,j): number of nodes that both i and j link to; +1 if there is a direct (i,j) link
Metabolism: E. Ravasz et al., Science, 2002
Protein interactions: Rives and Galitski, PNAS, 2003
Spirin and Mirny, PNAS, 2003
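A minimal sketch of the topological overlap, taking J(i,j) as the number of common neighbours plus one for a direct link and dividing by the smaller of the two degrees. The adjacency-set representation and the function names are assumptions made for this example.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list (self-links ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def topological_overlap(adj, i, j):
    """O_T(i, j) = J(i, j) / min(k_i, k_j)."""
    common = len(adj[i] & adj[j])          # nodes both i and j link to
    direct = 1 if j in adj[i] else 0       # +1 for a direct (i, j) link
    k_min = min(len(adj[i]), len(adj[j]))
    return (common + direct) / k_min if k_min else 0.0
```

The resulting overlap matrix could then be passed to a hierarchical clustering routine to look for modules.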
Open questions
Does the application of standard clustering
algorithms reflect real modules well?
Since, e.g., one protein can be part of more
than one protein complex, overlapping
clustering algorithms should give better
results.
Motifs
Small subnetworks that appear in real world networks
significantly more often than in random graphs.
(Milo et al., Science, 2002; Conant and Wagner, Nature Gen., 2003,
Shen-Orr et al., Nature Gen., 2002, Milo et al, Science, 2004)
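To make the comparison with random graphs concrete, the sketch below counts one simple motif (the triangle) in an undirected network and compares it with degree-preserving randomisations obtained by repeated edge swaps. This is an illustrative stand-in, not the procedure of the cited papers; the function names, the swap count, and the choice of the triangle as motif are all assumptions.

```python
import random
import statistics


def unique_edges(edges):
    """Deduplicated undirected edges as 2-tuples, without self-links."""
    return [tuple(e) for e in {frozenset(e) for e in edges} if len(e) == 2]


def count_triangles(edges):
    edge_list = unique_edges(edges)
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # each triangle is seen once from each of its three edges
    return sum(len(adj[u] & adj[v]) for u, v in edge_list) // 3


def randomize(edges, n_swaps, rng):
    """Degree-preserving rewiring: swap endpoints of random edge pairs."""
    edge_list = unique_edges(edges)
    if len(edge_list) < 2:
        return edge_list
    edge_set = {frozenset(e) for e in edge_list}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        # reject swaps that create self-links or duplicate edges
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def triangle_zscore(edges, n_random=100, swaps_per_edge=10, seed=0):
    """Return (real count, mean random count, z-score)."""
    rng = random.Random(seed)
    real = count_triangles(edges)
    n_swaps = swaps_per_edge * len(unique_edges(edges))
    counts = [count_triangles(randomize(edges, n_swaps, rng))
              for _ in range(n_random)]
    mean, sd = statistics.mean(counts), statistics.pstdev(counts)
    z = (real - mean) / sd if sd > 0 else float("nan")
    return real, mean, z
```

A large positive z-score would mark the subgraph as over-represented relative to the randomised ensemble; directed motifs such as the feed-forward loop would need a directed counterpart of the same idea.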
From the particular to the universal
A.-L. Barabasi & Z. Oltvai, Science, 2002
Topology and Evolution
S. Wuchty, Z. Oltvai & A.-L. Barabasi, Nature Genetics, 2003
Topology and evolution
- General distribution of orthologs: E = N(o)/N(p)
- Degree-dependent distribution of orthologs: ek = Nk(o)/Nk
- Orthologous excess retention: ERk = ek/E
S. Wuchty, Genome Res., 2004
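The excess retention can be computed directly from a degree table and a list of proteins with orthologs. The sketch below assumes N(p) is the number of proteins in the network and N(o) the number of those with an ortholog; the input format and function name are hypothetical.

```python
from collections import defaultdict


def excess_retention(degree, with_ortholog):
    """Return {k: ER_k} with ER_k = e_k / E.

    degree:        dict protein -> number of interactions k
    with_ortholog: set of proteins that have an ortholog
    """
    n_p = len(degree)
    n_o = sum(1 for p in degree if p in with_ortholog)
    if n_p == 0 or n_o == 0:
        return {}
    E = n_o / n_p                       # overall fraction of orthologs
    n_k = defaultdict(int)              # proteins with degree k
    n_k_o = defaultdict(int)            # ... of which have an ortholog
    for protein, k in degree.items():
        n_k[k] += 1
        if protein in with_ortholog:
            n_k_o[k] += 1
    return {k: (n_k_o[k] / n_k[k]) / E for k in sorted(n_k)}
```

Values of ER_k above 1 indicate that proteins of degree k are retained more often than expected from the overall ortholog fraction.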
Clustering in protein interaction networks
high clustering = high quality of interaction
Goldberg and Roth, PNAS, 2003
$$C_{vw} = -\log \sum_{i=|N(v) \cap N(w)|}^{\min(|N(v)|,\,|N(w)|)} \frac{\binom{|N(v)|}{i} \binom{N - |N(v)|}{|N(w)| - i}}{\binom{N}{|N(w)|}}$$
Does that also hold for evolutionary
conservation?
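A minimal sketch of the hypergeometric clustering coefficient Cvw: the negative logarithm of the probability that two proteins with |N(v)| and |N(w)| neighbours share at least the observed number of neighbours by chance. Taking N as the total number of proteins in the network and using the natural logarithm are assumptions here, since the slide states neither.

```python
import math


def mutual_clustering(n_v, n_w, n_common, n_total):
    """C_vw = -log P(overlap >= n_common) under a hypergeometric model.

    n_v, n_w:  neighbourhood sizes |N(v)| and |N(w)|
    n_common:  observed overlap |N(v) ∩ N(w)|
    n_total:   N, assumed to be the number of proteins in the network
    """
    upper = min(n_v, n_w)
    denominator = math.comb(n_total, n_w)
    numerator = sum(
        math.comb(n_v, i) * math.comb(n_total - n_v, n_w - i)
        for i in range(n_common, upper + 1)
    )
    p_value = numerator / denominator
    return -math.log(p_value)  # natural log (assumption)
```

For example, two proteins with two neighbours each that share both of them in a network of four proteins give a p-value of 1/6, so Cvw = log 6.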
Protein-protein interaction data
are highly flawed:
90% false positives, 50% false negatives
Von Mering et al., Nature, 2002
How stable are these results?
Something else?
Eisen et al., PNAS, 1998
Open question
??
Wuchty et al., submitted, 2004
Plasmodium falciparum
• Eukaryotic organism
• Malaria parasite
• Genome size 23 Mb, 14 chromosomes
• 5300 genes (estimated, Hall et al., Nature, 2002;
Gardner et al., Nature, 2002)
• No protein interaction data available
• Co-expression data available (Bozdech et al.,
PLoS, 2003; LeRoch et al., Science, 2003)
• 868 orthologs with yeast (InParanoid, Remm et
al., J. Mol. Biol., 2001)
Inferred protein interaction network
in P. falciparum
• 667 nodes, 3,564 weighted interactions
• Clustering
- Iteratively prune edges, starting with the least weighted
link, until the modularity reaches a maximum.
- Quality of clusters is assessed by their modularity
$Q = \sum_i (e_{ii} - a_i^2)$
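The pruning procedure can be sketched as follows. The code removes edges in order of increasing weight, treats the connected components of what remains as clusters, and scores each partition with Q = sum_i (e_ii - a_i^2) on the full network. It assumes the usual reading of the symbols (e_ii: fraction of edges inside cluster i; a_i: fraction of edge ends attached to cluster i) and counts all edges equally when computing Q; both are assumptions, as are the function names.

```python
from collections import defaultdict


def components(nodes, edges):
    """Map every node to a label identifying its connected component."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    label = {}
    for start in nodes:
        if start in label:
            continue
        label[start] = start
        stack = [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in label:
                    label[y] = start
                    stack.append(y)
    return label


def modularity(edges, label):
    """Q = sum_i (e_ii - a_i^2), evaluated on the unpruned edge list."""
    m = len(edges)
    e_in = defaultdict(float)   # fraction of edges inside each cluster
    a = defaultdict(float)      # fraction of edge ends in each cluster
    for u, v in edges:
        a[label[u]] += 0.5 / m
        a[label[v]] += 0.5 / m
        if label[u] == label[v]:
            e_in[label[u]] += 1.0 / m
    return sum(e_in[c] - a[c] ** 2 for c in a)


def prune_for_modularity(weighted_edges):
    """weighted_edges: list of (u, v, weight). Returns (best Q, labels)."""
    all_edges = [(u, v) for u, v, _ in weighted_edges]
    nodes = {x for e in all_edges for x in e}
    by_weight = sorted(weighted_edges, key=lambda e: e[2])  # weakest first
    best_label = components(nodes, all_edges)
    best_q = modularity(all_edges, best_label)
    for cut in range(1, len(by_weight) + 1):
        kept = [(u, v) for u, v, _ in by_weight[cut:]]
        label = components(nodes, kept)
        q = modularity(all_edges, label)
        if q > best_q:
            best_q, best_label = q, label
    return best_q, best_label
```

This version scans every pruning level and keeps the global maximum of Q rather than stopping at the first peak, which is a design choice of the sketch and not necessarily what was done for the network described here.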
All edges shown with Cvw > 1.
Color code: red: Cvw > 4, yellow: Cvw > 3,
green: Cvw > 2, blue: Cvw > 1
What does that mean?
Validation of results?
Co-expression patterns
Bozdech et al. PLoS, 2003
replication
exo/proteasome
DNA processing
translation
RNA processing
ribosome
Wuchty, Barabasi, Ferdig and Adams, in preparation
What's next?
• Uncovering evolutionary cores of interactions
in other organisms.
• Application of a Maximum Set Cover
Algorithm to predict protein interactions
(Huang, Kaanan, Wuchty, Izaguirre and
Cheng, submitted) to unfold the interactome
using the evolutionary cores and
experimentally derived interactions.
http://www.nd.edu/~swuchty
swuchty@nd.edu
THX!