LECTURE 2

1. Complex Network Models
2. Properties of Protein-Protein Interaction Networks

Complex Network Models

Average path length L, clustering coefficient C, and degree distribution P(k) help us understand the structure of a network. Some well-known types of network models are as follows:
•Regular coupled networks
•Random graphs
•Small-world model
•Scale-free model
•Hierarchical networks

Regular networks

[Figure: diamond crystal and graphite crystal] Both diamond and graphite are carbon.

Regular network (a ring lattice)
•Average path length L is high
•Clustering coefficient C is high
•Degree distribution is delta type
[Figure: plot of P(k) against k]

Random Graph

Erdős and Rényi introduced the concept of the random graph around 40 years ago.

Example with N = 10 nodes: the maximum number of edges is $E_{max} = N(N-1)/2 = 45$, and each possible edge is present with probability p.
[Figure: random graphs for p = 0, 0.1, 0.15 and 0.25]

•Average path length L is low
•Clustering coefficient C is low
•Degree distribution is Poisson type, with an exponentially decaying tail: $P(k) \approx e^{-\langle k \rangle} \langle k \rangle^{k} / k!$, where $\langle k \rangle$ is the average degree
[Figure: degree distribution for p = 0.25]

To compare a real network with a random network, we usually first generate a random network of the same size, i.e. with the same number of nodes and edges.

Besides Erdős-Rényi random graphs there are other types of random graphs:
•A random graph can be constructed so that it matches the degree distribution or some other topological property of a given graph
•Geometric random graphs

Small-world model (Watts and Strogatz)

Often, soon after meeting a stranger, one is surprised to find that the two have a common friend, so they both cheer: “What a small world!”

Construction:
•Begin with a nearest-neighbor coupled network
•Randomly rewire each edge of the network with some probability p

Properties:
•Average path length L is low
•Clustering coefficient C is high
•Degree distribution is exponential type
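The contrast between the regular ring lattice (high L, high C) and the small-world model (low L, high C) can be checked numerically. The following is a minimal sketch in plain Python, assuming a ring lattice in which each of n nodes is linked to its k nearest neighbours. The function names and the values n = 20, k = 4, p = 0.1 are illustrative choices, not taken from the lecture.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Regular ring lattice: each node is linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    return adj

def rewire(adj, p, rng):
    """Rewire each edge with probability p, keeping the number of edges fixed."""
    n = len(adj)
    new = {u: set(vs) for u, vs in adj.items()}
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for u, v in edges:
        if rng.random() < p:
            # New endpoint: any node that is not u and not already a neighbour of u.
            candidates = [w for w in range(n) if w != u and w not in new[u]]
            if candidates:
                w = rng.choice(candidates)
                new[u].discard(v)
                new[v].discard(u)
                new[u].add(w)
                new[w].add(u)
    return new

def clustering(adj):
    """Average clustering coefficient C (nodes with degree < 2 contribute 0)."""
    total = 0.0
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def average_path_length(adj):
    """Average shortest-path length L over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(0)              # fixed seed so the run is repeatable
ring = ring_lattice(20, 4)          # regular network
small_world = rewire(ring, 0.1, rng)

for name, g in [("ring lattice", ring), ("rewired, p=0.1", small_world)]:
    print(name, "L =", round(average_path_length(g), 3), "C =", round(clustering(g), 3))
```

For the unrewired lattice with k = 4 the clustering coefficient is exactly 0.5. Rerunning with several values of p is one way to see how L and C respond to rewiring.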
Scale-free model (Barabási and Albert)

Start with a small number of nodes; at every time step, a new node is introduced and connected to already existing nodes following preferential attachment (a new node is more likely to be connected to high-degree nodes).
•Average path length L is low
•Clustering coefficient C is not clearly known
•Degree distribution is power-law type: $P(k) \sim k^{-\gamma}$
[Figure: log-log plot of P(k) against k for γ = 2 and γ = 3]

Scale-free networks exhibit robustness
•Robustness: the ability of complex systems to maintain their function even when the structure of the system changes significantly
•Tolerant to random removal of nodes (mutations)
•Vulnerable to targeted attack on hubs (mutations): drug targets

Scale-free model (Barabási and Albert)

The term “scale-free” refers to any functional form $f(x)$ that remains unchanged to within a multiplicative factor under a rescaling of the independent variable x, i.e. $f(ax) = b f(x)$. Power-law forms ($P(k) \sim k^{-\gamma}$) are the only solutions to $f(ax) = b f(x)$, and hence “power-law” is referred to as “scale-free”. For example, $f(x) = x^{-\gamma}$ gives $f(ax) = a^{-\gamma} f(x)$, so $b = a^{-\gamma}$.

Hierarchical Graphs

Source: NETWORK BIOLOGY: UNDERSTANDING THE CELL’S FUNCTIONAL ORGANIZATION, Albert-László Barabási & Zoltán N. Oltvai, NATURE REVIEWS | GENETICS, VOLUME 5, FEBRUARY 2004, 101

The starting point of this construction is a small cluster of four densely linked nodes (see the four central nodes in the figure). Next, three replicas of this module are generated and the three external nodes of the replicated clusters are connected to the central node of the old cluster, which produces a large 16-node module. Three replicas of this 16-node module are then generated and the 12 peripheral nodes are connected to the central node of the old module, which produces a new module of 64 nodes.
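The preferential attachment rule of the scale-free model described earlier can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the exact Barabási-Albert procedure: the seed network is assumed to be a complete graph on m + 1 nodes, each new node is assumed to attach to exactly m distinct existing nodes, and the names `preferential_attachment` and `degree_counts` are hypothetical.

```python
import random
from collections import Counter

def preferential_attachment(n, m, rng):
    """Grow a network to n nodes; each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    # Assumed seed: a complete graph on m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Every node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` is a degree-proportional draw over nodes.
    stubs = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

def degree_counts(edges):
    """Map each node to its degree."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

rng = random.Random(1)                      # fixed seed for repeatability
edges = preferential_attachment(500, 2, rng)
deg = degree_counts(edges)
histogram = Counter(deg.values())           # number of nodes with each degree

print("nodes:", len(deg), "edges:", len(edges))
print("largest degrees (hubs):", sorted(deg.values(), reverse=True)[:5])
for k in sorted(histogram)[:8]:
    print("k =", k, "count =", histogram[k])
```

Plotting the logarithm of the counts against the logarithm of k is one way to inspect whether the tail is roughly a straight line, as expected for a power law.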
Hierarchical Graphs

The hierarchical network model seamlessly integrates a scale-free topology with an inherent modular structure by generating a network that has a power-law degree distribution with degree exponent $\gamma = 1 + \ln 4 / \ln 3 \approx 2.26$ and a large, system-size-independent average clustering coefficient $\langle C \rangle \sim 0.6$. The most important signature of hierarchical modularity is the scaling of the clustering coefficient, which follows $C(k) \sim k^{-1}$, a straight line of slope -1 on a log-log plot.

[Figure: comparison of random, scale-free and hierarchical networks; source: Barabási & Oltvai, 2004]

Protein-protein interaction

Typical protein-protein interaction: a protein binds with one or several other proteins in order to perform different biological functions; the resulting assemblies are called protein complexes.

[Figure: example protein complex] This complex transports oxygen from the lungs to cells all over the body through blood circulation.
Source: PROTEIN-PROTEIN INTERACTIONS by Catherine Royer, Biophysics Textbook Online

Network of interactions and complexes
•Usually protein-protein interaction data are produced by laboratory experiments (yeast two-hybrid, pull-down assay, etc.)
[Figure: detected complex data with bait protein A and interacted proteins B, C, D, E, F, converted to binary interactions by the spoke approach and by the matrix approach]
•The results of the experiments are converted to binary interactions.
•The binary interactions can be represented as a network/graph where a node represents a protein and an edge represents an interaction.
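The conversion of a detected complex into binary interactions can be sketched as follows. The sketch assumes the usual reading of the two approaches: the spoke approach links the bait to each interacted protein, and the matrix approach links every pair of proteins in the complex. The function names are hypothetical; the proteins A to F follow the example complex above.

```python
from itertools import combinations

def spoke_edges(bait, preys):
    """Spoke approach (as assumed here): link the bait to every interacted protein."""
    return {frozenset((bait, prey)) for prey in preys}

def matrix_edges(bait, preys):
    """Matrix approach (as assumed here): link every pair of proteins in the complex."""
    members = [bait, *preys]
    return {frozenset(pair) for pair in combinations(members, 2)}

def to_adjacency(edges):
    """Represent binary interactions as a graph: node = protein, edge = interaction."""
    adj = {}
    for edge in edges:
        u, v = tuple(edge)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

bait, preys = "A", ["B", "C", "D", "E", "F"]
spoke = spoke_edges(bait, preys)
matrix = matrix_edges(bait, preys)

print(len(spoke), "binary interactions under the spoke approach")
print(len(matrix), "binary interactions under the matrix approach")
print("neighbours of A (spoke):", sorted(to_adjacency(spoke)["A"]))
```

Under these assumptions every spoke interaction is also a matrix interaction, so the choice of approach changes how dense the resulting network is.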
Network of interactions

List of interactions:
AtpB  AtpA
AtpG  AtpE
AtpA  AtpH
AtpB  AtpH
AtpG  AtpH
AtpE  AtpH

[Figure: corresponding network]

Adjacency matrix:
0 0 1 0 1
0 0 0 1 1
1 0 0 0 1
0 1 0 0 1
1 1 1 1 0

S. cerevisiae
Source: The yeast protein interaction network evolves rapidly and contains few redundant duplicate genes, by A. Wagner. Mol. Biol. Evol. 2001
•985 proteins and 899 interactions
•Giant component consists of 466 proteins
•Average degree ~ 2
•Clustering coefficient = 0.022
•Degree distribution is scale free

E. coli
Source: an E. coli interaction network from DIP (http://dip.mbi.ucla.edu/)
•300 proteins and 287 interactions
•The components of this graph were determined by applying the depth-first search algorithm
•There are 62 components in total
•Giant component consists of 93 proteins
•Average degree ~ 1.913
•Clustering coefficient = 0.29
•Degree distribution ~ scale free
[Figure: degree distribution, Log(No. of Nodes) against Log(Degree)]

S. cerevisiae
Source: Lethality and Centrality in protein networks, by H. Jeong, S. P. Mason, A.-L. Barabasi, Z. N. Oltvai. Nature, May 2001
•1870 proteins and 2240 interactions
•Almost all proteins are connected
•Degree distribution is scale free

PPI network based on the MIPS database
•4546 proteins and 12319 interactions
•Average degree 5.42
•Clustering coefficient = 0.18
•Giant component consists of 4385 proteins
•Degree distribution ~ scale free
[Figure: log-log plot of the degree distribution]

Summary of the four networks:

# of proteins   # of interactions   Average degree   Clustering coefficient   Giant component   Degree distribution
985             899                 ~2               0.022                    Exists, 47.3%     Power law
300             287                 1.913            0.29                     Exists, 31%       Almost power law
1870            2240                ______           ______                   Exists, ~100%     Power law
4546            12319               5.42             0.18                     Exists, ~96%      Not exactly power law

A complete PPI network tends to be a connected graph and tends to have a power-law degree distribution.

We learnt
1. Properties of some complex network models
2. Properties of protein-protein interaction networks
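As a closing sketch, the quantities compared across the PPI networks (average degree, connected components, giant component share) can be computed from an adjacency matrix. The example below uses the 5 x 5 adjacency matrix from the "Network of interactions" slide and an iterative depth-first search. The function names are hypothetical, and the formula average degree = 2E/N is the standard one for an undirected graph rather than something stated in the slides.

```python
def adjacency_from_rows(rows):
    """Build neighbour sets from adjacency-matrix rows given as strings of 0/1."""
    return {i: {j for j, c in enumerate(row) if c == "1"} for i, row in enumerate(rows)}

def components(adj):
    """Connected components found by iterative depth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def average_degree(n_nodes, n_edges):
    """Average degree of an undirected graph: each edge adds 2 to the degree sum."""
    return 2 * n_edges / n_nodes

rows = ["00101", "00011", "10001", "01001", "11110"]
adj = adjacency_from_rows(rows)
n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
comps = components(adj)
giant_share = max(len(c) for c in comps) / len(adj)

print("edges:", n_edges, "components:", len(comps), "giant component share:", giant_share)
print("average degree:", average_degree(len(adj), n_edges))

# The same formula applied to the node and edge counts quoted for two of the networks.
print("E. coli (300 proteins, 287 interactions):", round(average_degree(300, 287), 3))
print("MIPS (4546 proteins, 12319 interactions):", round(average_degree(4546, 12319), 2))
```

The last two lines apply 2E/N to the quoted counts, which gives a quick consistency check on the tabulated average degrees.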