Complex Networks
June 2006
L. Lacasa, B. Luque and J.C. Nuño
Departamentos de Matemática Aplicada Aeronáuticos y Montes
Universidad Politécnica de Madrid

Society
Nodes: individuals
Links: social relationships (family, work, friends, etc.)
Social networks: many individuals with diverse social interactions between them.

Social networks
• Contacts and Influences – Pool & Kochen (1958)
  – How great is the chance that two people chosen at random from the population will have a friend in common?
  – How far are people aware of the available lines of contact?
• The Small-World Problem – Milgram (1967)
  – How many intermediaries are needed to move a letter from person A to person B through a chain of acquaintances?
  – Letter-sending experiment: starting in Nebraska/Kansas, with a target person in Boston.

Social networks: Milgram's experiment
160 letters, sent from Wichita (Kansas) and Omaha (Nebraska) to a target person in Sharon (Mass.).
Instructions to participants: "If you do not know the target person on a personal basis, do not try to contact him directly. Instead, mail this folder to a personal acquaintance who is more likely than you to know the target person."
Milgram, Psych Today 2, 60 (1967)
"Six degrees of separation"

¡El mundo es un pañuelo! C'est petit le monde! What a small world!
In simple terms, the Small World concept describes the fact that, despite their often large size, most networks have a relatively short path between any two nodes.

The Erdős number
Pál Erdős (1913-1996)
He authored or co-authored 1,475 mathematical papers, collaborating on them with a total of 493 distinct co-authors.
Only one mathematician in history wrote more pages of original mathematics than Erdős: in the 18th century, the Swiss Leonhard Euler, father of thirteen children, wrote eighty volumes of mathematical results.

Erdős numbers of famous scientists
http://www.oakland.edu/enp/
Erdős number 1: 504 collaborators. Erdős number 2: 6593 collaborators.

Name                     Field                           Erdős number
Walter Alvarez           geology                         7
Rudolf Carnap            philosophy                      4
Jule G. Charney          meteorology                     4
Noam Chomsky             linguistics                     4
Freeman J. Dyson         quantum physics                 2
George Gamow             nuclear physics and cosmology   5
Stephen Hawking          relativity and cosmology        4
Pascual Jordan           quantum physics                 4
Theodore von Kármán      aeronautical engineering        4
John Maynard Smith       biology                         4
Oskar Morgenstern        economics                       4
J. Robert Oppenheimer    nuclear physics                 4
Roger Penrose            relativity and cosmology        3
Jean Piaget              psychology                      3
Karl Popper              philosophy                      4
Claude E. Shannon        electrical engineering          3
Arnold Sommerfeld        atomic physics                  5
Edward Teller            nuclear physics                 4
George Uhlenbeck         atomic physics                  2
John A. Wheeler          nuclear physics                 3

Erdős numbers of Nobel laureates in physics

Name                     Year   Erdős number
Max von Laue             1914   4
Albert Einstein          1921   2
Niels Bohr               1922   5
Louis de Broglie         1929   5
Werner Heisenberg        1932   4
Paul A. Dirac            1933   4
Erwin Schrödinger        1933   8
Enrico Fermi             1938   3
Ernest O. Lawrence       1939   6
Otto Stern               1943   3
Isidor I. Rabi           1944   4
Wolfgang Pauli           1945   3
Frits Zernike            1953   6
Max Born                 1954   3
Willis E. Lamb           1955   3
John Bardeen             1956   5
Walter H. Brattain       1956   6
William B. Shockley      1956   6
Chen Ning Yang           1957   4
Tsung-dao Lee            1957   5
Emilio Segrè             1959   4
Owen Chamberlain         1959   5
Robert Hofstadter        1961   5
Eugene Wigner            1963   4
Richard P. Feynman       1965   4
Julian S. Schwinger      1965   4
Hans A. Bethe            1967   4
Luis W. Alvarez          1968   6
Murray Gell-Mann         1969   3
John Bardeen             1972   5
Leon N. Cooper           1972   6
John R. Schrieffer       1972   5
Aage Bohr                1975   5
Ben Mottelson            1975   5
Leo J. Rainwater         1975   7
Steven Weinberg          1979   4
Sheldon Lee Glashow      1979   2
Abdus Salam              1979   3
S. Chandrasekhar         1983   4
Norman F. Ramsey         1989   3

Erdős number

Erdős number   People
0              1
1              504
2              6593
3              33605
4              83642
5              87760
6              40014
7              11591
8              3146
9              819
10             244
11             68
12             23
13             5

• Graph: a pair of sets G = {P, E}, where P is a set of nodes and E is a set of edges, each connecting 2 elements of P.
• Degree of a node: the number of edges incident on node i.
[Figure: example node i with degree 5]

Types of edges
• Directed: edges have a direction and only go one way (citations, one-way streets).
• Undirected: no direction (committee membership, two-way streets).
• Weighted: not all edges are equal (friendships).
• Degree: number of edges connected to a node.
• In-degree: number of incoming edges.
• Out-degree: number of outgoing edges.

Network parameters
Diameter: maximum distance between any pair of nodes.
Characteristic path length
Connectivity: number of neighbours of a given node, k := degree. P(k) := probability of having k neighbours.
Clustering: are the neighbours of a node also neighbours of each other?

Characteristic path length (GLOBAL property)
• L(i, j) is the number of edges in the shortest path between vertices i and j (geodesic path).
[Figure: example with L(i, j) = 2]
• The characteristic path length L of a graph is the average of L(i, j) over every possible pair (i, j).
Networks with small values of L are said to have the "Small World property".

Bacon's Game
[Figure: example chains linking actors to Kevin Bacon through shared films (Austin Powers: The Spy Who Shagged Me, Let's Make It Legal, Wild Things, What Price Glory, Monsieur Verdoux, A Few Good Men) and co-stars (Robert Wagner, Barry Norton). Source: Internet Movie Database, http://www.cs.virginia.edu/oracle/]

Why Kevin Bacon?
Measure the average distance between Kevin Bacon and all other actors.
Kevin Bacon: No. of movies: 46. No. of actors: 1811. Average separation: 2.79.

Is Kevin Bacon the most connected actor? NO!

Rank   Name                Average distance   # of movies   # of links
1      Rod Steiger         2.537527           112           2562
2      Donald Pleasence    2.542376           180           2874
3      Martin Sheen        2.551210           136           3501
4      Christopher Lee     2.552497           201           2993
5      Robert Mitchum      2.557181           136           2905
6      Charlton Heston     2.566284           104           2552
7      Eddie Albert        2.567036           112           3333
8      Robert Vaughn       2.570193           126           2761
9      Donald Sutherland   2.577880           107           2865
10     John Gielgud        2.578980           122           2942
11     Anthony Quinn       2.579750           146           2978
12     James Earl Jones    2.584440           112           3787
…
876    Kevin Bacon         2.786981           46            1811
…

[Figure: actor network highlighting #1 Rod Steiger, #2 Donald Pleasence, #3 Martin Sheen and #876 Kevin Bacon]

[Figure: a tree network and a random network]
Random network: the typical distance between any two nodes in a random graph scales as the logarithm of the number of nodes. The Small World property alone is therefore not an indication of a particular organizing principle.

Random graphs – Erdős & Rényi (1960)
• Start with N nodes and, for each pair of nodes, add a link between them with probability p.
• For large N, there is a giant connected component if the average connectivity (number of links per node) is larger than 1.
• The average path length L in the giant component scales as $L \sim \ln N$, where L is the minimal number of links one needs to follow to go from one node to another, on average.

Erdős–Rényi model (1960)
Many properties of these graphs appear quite suddenly, at a threshold value of $p = p_{ER}(N)$:
– If $p_{ER} \sim c/N$ with $c < 1$, then almost all vertices belong to isolated trees.
– Cycles of all orders appear at $p_{ER} \sim 1/N$.
Degree distribution (Poisson distribution):
$$P(k) \simeq e^{-pN}\,\frac{(pN)^k}{k!} = e^{-\langle k\rangle}\,\frac{\langle k\rangle^k}{k!}$$

Random Graphs Model
Given N nodes, connect each pair with probability p:
– P(k) ~ Poisson distribution.
– <k> = pN.
– Most nodes have degree ~ <k>.
– <L> ≈ log(N) / log(<k>).
– Small World property.

Asymptotic behavior
Lattice: $L(N) \sim N^{1/d}$
Random graph: $L(N) \sim \log N$

• For many years the typical explanation for the Small-World property was random graphs.
  – Low diameter: the expected distance between two nodes is $\log_{\langle k\rangle} N$, where <k> is the average out-degree and N the number of nodes.
  – When pairs of vertices are selected uniformly at random, they are connected by a short path with high probability.
• But this explanation has some inaccuracies.
  – If A and B have a common friend C, it is more likely that they themselves will be friends (clustering).
  – Many real-world networks exhibit this clustering property. Random networks are NOT clustered.

Clustering coefficient
Local property: $C(v) = \dfrac{\text{number of links between the neighbors of } v}{n(n-1)/2}$, where n is the number of neighbors of v.
[Figure: example node with C(v) = 4/6]
C is the average over all C(v).
Clustering: my friends will know each other with high probability! (Typical example: social networks.)

Asymptotic behavior
Lattice: $L(N) \sim N^{1/d}$, $C(N) \sim \text{const.}$
Random graph: $L(N) \sim \log N$, $C(N) \sim N^{-1}$

Power grid, NW USA-Canada: N = 4941, kmax = 19, kaver = 2.67, L = 18.7, C = 0.08, D = 46
Caenorhabditis elegans neural system: N = 282, kmax = 14, kaverage = 9, L = 2.65, C = 0.28, D = 3
Duncan J. Watts & Steven H. Strogatz, Nature 393, 440-442 (1998)

Real-life networks are clustered (large C) but have a small average distance L.

Network       N        L      Lrand   C       Crand
WWW           153127   3.1    3.35    0.11    0.00023
Actors        225226   3.65   2.99    0.79    0.00027
Power Grid    4941     18.7   12.4    0.080   0.005
C. elegans    282      2.65   2.25    0.28    0.05

Structured network
• high clustering
• large diameter
• regular
N = 1000, k = 10, D = 100, L = 49.51, C = 0.67

Small-world network
• high clustering
• small diameter
• almost regular
N = 1000, k = 8-13, D = 14, L = 11.1, C = 0.63

Random network
• small clustering
• small diameter
N = 1000, k = 5-18, D = 5, L = 4.46, C = 0.01

Duncan J. Watts & Steven H. Strogatz, Nature 393, 440-442 (1998)

Watts-Strogatz Model
[Figure: C(p) and L(p) plotted against the rewiring probability p, going from a regular network through the small-world (SW) regime to a random network. C(p): clustering coefficient. L(p): average path length.]
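The regular, small-world and random regimes of the Watts-Strogatz model can be explored numerically. The following is a minimal sketch, not the authors' code: it builds a ring lattice, rewires each edge with probability p, and measures the clustering coefficient C and the characteristic path length L as defined in the slides above. The function names (`ring_lattice`, `rewire`, `clustering`, `path_length`) are hypothetical, the rewiring rule is a simplified variant of the published procedure, and L is averaged over reachable pairs only (an assumption that matters if rewiring disconnects the graph).

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Regular ring: every node is linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, k, p, rng):
    """Return a copy in which each lattice edge (i, i+step) is, with
    probability p, replaced by an edge from i to a uniformly chosen node.
    Self-loops and duplicate edges are avoided (simplified Watts-Strogatz)."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    n = len(adj)
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if j in adj[i] and rng.random() < p:
                candidates = [m for m in range(n) if m != i and m not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def clustering(adj):
    """Average of C(v) = links between neighbours / (n(n-1)/2).
    Nodes with fewer than two neighbours contribute C(v) = 0."""
    total = 0.0
    for nbrs in adj.values():
        n = len(nbrs)
        if n < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += links / (n * (n - 1) / 2)
    return total / len(adj)


def path_length(adj):
    """Average shortest-path length over all reachable ordered pairs (BFS)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    lattice = ring_lattice(200, 10)
    for p in (0.0, 0.01, 0.1, 1.0):
        g = rewire(lattice, 10, p, rng)
        print(f"p={p:<5} C={clustering(g):.3f} L={path_length(g):.2f}")
```

For small p one would expect L to fall quickly while C stays close to its lattice value, which is the small-world regime described above. The sizes used here (200 nodes, k = 10) are arbitrary and chosen only to keep the run short.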
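The random-graph statements from the Erdős–Rényi slides (<k> = pN, a giant component when the average connectivity exceeds 1, and <L> of the order of log(N) / log(<k>)) can also be checked on a small example. This is a rough sketch under stated assumptions: the helper names (`erdos_renyi`, `bfs_distances`, `giant_component`, `mean_path_length`) are hypothetical, and log(N) / log(<k>) is only an order-of-magnitude estimate, so the measured value should be close to it rather than equal.

```python
import math
import random
from collections import deque


def erdos_renyi(n, p, rng):
    """G(N, p): link every pair of nodes independently with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def bfs_distances(adj, source):
    """Shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def giant_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for s in adj:
        if s not in seen:
            comp = set(bfs_distances(adj, s))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best


def mean_path_length(adj, nodes):
    """Average shortest-path length between nodes of one connected component."""
    total = pairs = 0
    for s in nodes:
        dist = bfs_distances(adj, s)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    n, p = 500, 0.02  # arbitrary illustrative values, <k> close to 10
    g = erdos_renyi(n, p, random.Random(0))
    mean_k = sum(len(nbrs) for nbrs in g.values()) / n
    giant = giant_component(g)
    print("mean degree:", mean_k, "vs pN =", p * n)
    print("giant component size:", len(giant), "of", n)
    print("measured <L>:", mean_path_length(g, giant))
    print("log(N)/log(<k>):", math.log(n) / math.log(mean_k))
```

Lowering p to c/N with c < 1 in the same script should leave only small components, consistent with the threshold behaviour described for the Erdős–Rényi model.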