It's a Small World After All
Math Models, Spring 2000
Kim Dressel, Angie Heimkes, Eric Larson, Kyle Pinion, Jason Rebhahn

The Small-World Phenomenon

Two individuals from different parts of the state, the country, or even the world meet. After conversing for some time, the two discover that they share a common acquaintance. Upon this discovery, one of them remarks, “It’s a small world.” This is the basic idea of the Small-World Phenomenon: “the idea that even on a planet with billions of people, everyone is connected in a tight network.” The phenomenon is probably more commonly known as Six Degrees of Separation.

Stanley Milgram, a social psychologist at Harvard University, first studied the idea that people are connected through indirect networks. In 1967, Milgram performed his first real-world study of the Small-World Phenomenon by giving letters to various people in Kansas and Nebraska, with the goal of eventually getting each letter to a certain individual in Boston. The participants were instructed to pass the letter on to whomever they knew on a first-name basis who would be most likely to know the target recipient in Boston. Milgram found that it took an average of about five intermediaries, or six total transitions, to pass a letter from the sources in Kansas and Nebraska to the target in Boston. The findings also held when going from a white sender to a black recipient, as it required only a couple of people to “bridge” the divide between the races. From this average of six total transitions, the idea of six degrees of separation came about.

Milgram’s findings sparked interest in the phenomenon among other scientists. For a long time, the Small-World Phenomenon was studied mostly by sociologists and psychologists. Different network models were proposed as frameworks for studying the problem analytically.
However, no model with strong mathematical backing was developed to explain the phenomenon. Then, in 1998, Steven Strogatz, a mathematician at Cornell University, and Duncan Watts, a postdoctoral fellow in the social sciences at Columbia University, studied the phenomenon further and developed what came to be known as the Watts-Strogatz Model.

The Watts-Strogatz Model of the Small-World Network is based upon regular and random networks. In a regular network, each point is directly linked to its four nearest neighbors, so getting from one point in the network to the other side requires several intermediary links. In a random network, each point has a connection to a distant point, minimizing the number of links needed to reach across the network. A Small-World Network combines the two: each point has its four local connections plus a long-distance connection. A message can therefore be passed to local neighbors or passed on through long-distance connections, simulating the real world.

The Watts-Strogatz Model was the most refined model of its time, and it provided compelling evidence that the Small-World Phenomenon is pervasive in a wide range of networks arising in nature and technology. Some examples of networks to which the Small-World Model has been applied include the power grid of the western United States, the neural network of the worm C. elegans, and the Six Degrees of Kevin Bacon, which states that every actor and actress, past and present, can be connected to Kevin Bacon through other actors and actresses with whom they have performed in a movie.

Other areas of interest to which the Small-World Model has spread, and can hopefully be applied, include economics, physics, biochemistry, and neurophysiology. Just as a message may be passed from person to person in the world, so may a disease. A better understanding of this process through the Small-World Model may therefore help prevent the spread of disease throughout the world.
Applying the Small-World Model to neurophysiology may also help address certain brain disorders, such as epileptic seizures.

The Small-World Model also applies very well to the World Wide Web. Just as there are connections between people in the world, there are connections, or links, between web sites on the World Wide Web. The estimated size of the WWW is 800 million documents, and it is growing continuously and rapidly. The Northern Light search engine covers the largest share of the WWW, yet its total coverage is only 38%. Since the Small-World Network applies so well to the WWW, it would be great if search engines could make use of it to create more efficient searches over a larger portion of the web.

Jon Kleinberg, a professor at Cornell University, took great interest in Milgram’s research and the Watts-Strogatz Model. He determined that although the Watts-Strogatz Model was well constructed, it was insufficient to explain the algorithmic aspects of Milgram’s Small-World Phenomenon. So Kleinberg developed his own model to represent the Small-World Network.

DEFINITIONS AND TERMS

A lattice drawing is an n x n grid in which the nodes represent individuals in a social network. It can be defined mathematically as $\{(i, j) : i \in \{1, 2, \ldots, n\},\ j \in \{1, 2, \ldots, n\}\}$.

Lattice distance is defined to be $D((i, j), (k, l)) = |k - i| + |l - j|$; that is, the distance between two points is the sum of the horizontal and vertical distances between them. Movement in the diagonal direction is not allowed in this model. Figure 1 is an example of a lattice drawing with n = 6.

Figure 1

Short-range contacts: for a constant p > 0, the node u has a directed edge to every other node within lattice distance p.

Long-range contacts: for constants q ≥ 0 and r ≥ 0, the node u also has directed edges to q other nodes, chosen using independent random trials. In Figure 2, p = 1 and q = 2. The value of r is not directly seen in this diagram and will be defined shortly.
Also, in Figure 2, u is the current message holder, and v and w are its two long-range contacts.

Figure 2

The decentralized algorithm A is the method by which a message is transmitted from one message holder to the next. First, the long-range contact(s) of the current message holder are determined. Second, the message is transferred to the available node closest to the target node. That is, the message goes to whichever contact is closest to the target in lattice distance: a long-range contact if one of them is closest, and otherwise one of the short-range contacts.

The inverse rth-power distribution is defined as follows: the ith directed edge from u has endpoint v with probability proportional to $[D(u, v)]^{-r}$. To obtain a probability distribution, this is divided by the appropriate normalizing constant, so that the inverse rth-power distribution is

$$\frac{[D(u, v)]^{-r}}{\sum_{v \neq u} [D(u, v)]^{-r}}.$$

Here the value of r controls how quickly the probability of a long-range contact falls off with distance. If we let r = 0, the probability that you know somebody one hundred miles away would be the same as the probability that you know somebody ten thousand miles away. As the value of r increases, knowing someone one hundred miles away becomes more and more likely than knowing someone ten thousand miles away. For our model we have selected a value of r = 2.

Performance in this system is measured by the average number of steps it takes to get from the source to the target. Let X be the number of steps; its expectation is $E(X) = \sum_{i=1}^{\infty} \Pr(X \ge i)$.

For j > 0, phase j is defined as $\{x : 2^j < D(x, t) \le 2^{j+1}\}$, where t is the target. These are essentially bands in which a particular message holder can be located. The farther the message is from the target, the wider the band for that phase. Similarly, for j > 0, ball j is defined as $\{x : D(x, t) \le 2^j\}$. It consists of all the phases below j, back to the target.
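The definitions above can be tried out in a short simulation. What follows is a minimal Python sketch, not code from the original project. The function names (lattice_distance, short_range_contacts, long_range_contacts, greedy_route) and the seed parameter are our own assumptions for illustration. Long-range contacts are drawn at the moment the message reaches a node rather than fixed in advance, which gives the same behavior here because the message never visits a node twice.

```python
import random


def lattice_distance(u, v):
    """Lattice distance D(u, v): horizontal plus vertical separation."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def short_range_contacts(u, n, p=1):
    """All other nodes of the n x n lattice within lattice distance p of u."""
    return [
        (i, j)
        for i in range(max(1, u[0] - p), min(n, u[0] + p) + 1)
        for j in range(max(1, u[1] - p), min(n, u[1] + p) + 1)
        if 0 < lattice_distance(u, (i, j)) <= p
    ]


def long_range_contacts(u, n, q=1, r=2.0, rng=random):
    """Draw q long-range contacts of u from the inverse rth-power distribution."""
    others = [(i, j) for i in range(1, n + 1) for j in range(1, n + 1) if (i, j) != u]
    weights = [lattice_distance(u, v) ** -r for v in others]
    # random.choices rescales the weights to sum to 1, which plays the role
    # of the normalizing constant in the definition.
    return rng.choices(others, weights=weights, k=q)


def greedy_route(source, target, n, p=1, q=1, r=2.0, seed=None):
    """Number of steps the decentralized algorithm takes from source to target."""
    rng = random.Random(seed)
    holder, steps = source, 0
    while holder != target:
        contacts = short_range_contacts(holder, n, p)
        contacts += long_range_contacts(holder, n, q, r, rng)
        # Pass the message to the contact closest to the target.
        holder = min(contacts, key=lambda v: lattice_distance(v, target))
        steps += 1
    return steps
```

For example, greedy_route((1, 1), (6, 6), n=6, q=0) uses only short-range contacts and so takes exactly 10 steps, the lattice distance between the two corners; with q = 1 a long-range contact can only shorten the route. In terms of this sketch, ball j is the set of nodes x with lattice_distance(x, t) <= 2**j, where t is the target.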
Thus ball j has only an outer limit. With both of these terms, once a message enters a particular phase or ball, it cannot return to a phase or ball that is farther from the target in lattice distance.

Now that we have defined X and phase j, and writing $X_j$ for the number of steps the message spends in phase j, we can go on to explain the proof of the theorem. The theorem behind the model states that there is a decentralized algorithm A and a constant c, independent of n, so that when r = 2 and p = q = 1, the expected delivery time of A is at most $c(\log n)^2$.

First of all, we will find a lower bound for the probability that u, the message holder, chooses v as its long-range contact, which requires an upper bound on the normalizing sum. Keep in mind that long-range contacts are determined at random and, since q = 1, only one is allowed. Writing d(u, v) for the lattice distance D(u, v), the probability that u chooses v as its long-range contact is

$$\frac{d(u, v)^{-2}}{\sum_{v \neq u} d(u, v)^{-2}}.$$

To find the upper bound on the denominator we have

$$\sum_{v \neq u} d(u, v)^{-2} \le \sum_{j=1}^{2n-2} (4j)(j^{-2}).$$

We set the upper limit of the sum to 2n − 2 because we are dealing with a finite lattice structure, so a long-range contact is at a distance of at most (n − 1) + (n − 1) = 2n − 2 from the message holder. The factor 4j is the number of nodes at lattice distance exactly j from u, that is, the nodes on the outer edge of a "diamond" of radius j centered at u. Near the border of the lattice there are fewer such nodes, which is why this is an upper bound. The factor $j^{-2}$ is the inverse-square weight given to each node at distance j. Moving on, we can rewrite the right-hand side and compare the sum with an integral:

$$\sum_{j=1}^{2n-2} (4j)(j^{-2}) = 4\sum_{j=1}^{2n-2} j^{-1} \le 4 + 4\int_{1}^{2n-2} \frac{1}{x}\,dx = 4 + 4\ln(2n - 2) \le 4\ln(6n).$$

The last step holds because $4 + 4\ln(2n - 2) = 4\ln(e(2n - 2))$ and 2e < 6. This bound grows like ln(n); the constants do not affect the order of growth.

Next, to find the lower bound on the probability, we put this bound back into our original expression to get

$$\frac{d(u, v)^{-2}}{\sum_{v \neq u} d(u, v)^{-2}} \ge \frac{d(u, v)^{-2}}{4\ln(6n)} = \frac{1}{4\ln(6n)\,d(u, v)^{2}}.$$

This is a lower bound on the probability that u chooses v as its long-range contact.

Mathematical Background

In this segment, I will give a mathematical background for some of the terminology used in our proofs. I will define a geometric series, probability, a discrete random variable, and logarithms.

First of all, a series is called geometric if each term in the series is obtained from the preceding one by multiplying it by a common ratio. In general, a geometric series is of the form

$$a + ar + ar^2 + ar^3 + \cdots = \sum_{n=1}^{\infty} ar^{n-1} = \frac{a}{1 - r},$$

where a is the first term, r is the common ratio (not the exponent r of the model), and the sum converges when |r| < 1.

Probability is used to mean the chance that a particular event (or set of events) will occur, expressed on a linear scale from 0 (impossibility) to 1 (certainty).

A discrete random variable assumes each of its values with a certain probability. The probability of each outcome must be between 0 and 1, and the probabilities must sum to 1. An example would be the number of heads that occur in a given number of coin tosses.

Lastly, log n denotes the logarithm base 2, while ln n denotes the natural logarithm, base e.

Number of Nodes in Bj

As explained before, $B_j$ is the set of all nodes, from the target outward, that fall within lattice distance $2^j$ of the target. The set is defined as

$$B_j = \{x : D(x, t) \le 2^j\}.$$

The number of nodes in $B_j$ can be approximated as

$$\#(B_j) \approx 1 + \sum_{i=1}^{2^j} 4i.$$

The 1 represents the target itself, shown as the red dot in the picture. The index i is the lattice distance of a diamond-shaped boundary from the target, and 4i is the number of nodes on that boundary. For example, the first boundary has 4*1 = 4 nodes and the second has 4*2 = 8 nodes. This is shown in the picture above, where the higher phases are substantially thicker.
The sum above can be evaluated in closed form:

$$\#(B_j) \approx 1 + \sum_{i=1}^{2^j} 4i = 1 + 2^{j+1}(2^j + 1).$$

This count assumes that the whole diamond lies inside the lattice. When the target is near an edge or corner, part of each diamond is cut off, but at least i of the nodes at distance i remain (as long as $2^j < n$), which is the count used in Kleinberg's proof. This gives the lower bound

$$\#(B_j) \ge 1 + \sum_{i=1}^{2^j} i = 1 + 2^{j-1}(2^j + 1) = 1 + 2^{2j-1} + 2^{j-1} > 2^{2j-1}.$$

Therefore the number of nodes contained in $B_j$ is at least $\#(B_j) > 2^{2j-1}$.

Probability that a Node will be in Bj

The next proof gives a lower bound on the probability that the message enters $B_j$ in a given step, that is, that the current message holder u in phase j has a long-range contact in $B_j$. This probability satisfies

$$P(\text{enter } B_j) \ge \frac{2^{2j-1}}{4\ln(6n)\,2^{2j+4}}.$$

The value $2^{2j-1}$ is the lower bound on the number of nodes in $B_j$, obtained in the last proof. The factor 4 ln(6n) is the upper bound on the normalizing sum in the probability that v is chosen as a long-range contact, proven earlier in the paper. The factor $2^{2j+4}$ bounds $d(u, v)^2$: a message holder u in phase j is within distance $2^{j+1}$ of the target, and every node v in $B_j$ is within $2^j$ of the target, so $d(u, v) \le 2^{j+1} + 2^j < 2^{j+2}$ and therefore $d(u, v)^{-2} > 2^{-(2j+4)}$. The expression simplifies to

$$P(\text{enter } B_j) \ge \frac{1}{2^7 \ln(6n)} = \frac{1}{128\ln(6n)}.$$

Therefore we can conclude that the probability that the message enters $B_j$ is at least 1/(128 ln(6n)).

Proof of Expectation

$X_j$ represents the number of steps spent in phase j. The expectation can be expressed as

$$EX = E\left(\sum_{j=0}^{\log n} X_j\right) = \sum_{j=0}^{\log n} EX_j.$$

We will break this equation apart piece by piece in order to obtain our desired result. The first part we look at is the expectation of $X_j$, which can be written as

$$EX_j = \sum_{i=1}^{\infty} \Pr[X_j \ge i].$$

Here $\Pr[X_j \ge i]$ denotes the probability that the message spends at least i steps in phase j. By the law of total probability, this can be broken down as

$$\Pr[X_j \ge i] = \Pr[X_j \ge i \mid X_j > 0]\Pr[X_j > 0] + \Pr[X_j \ge i \mid X_j = 0]\Pr[X_j = 0].$$

The second term is 0, because a message that spends no steps in phase j cannot spend i ≥ 1 steps there. Since $\Pr[X_j > 0]$ is less than or equal to 1, we have $\Pr[X_j \ge i] \le \Pr[X_j \ge i \mid X_j > 0]$, and our expression simplifies to the following:
$$\Pr[X_j \ge i \mid X_j > 0] \le \left(1 - \frac{1}{128\ln(6n)}\right)^{i-1}.$$

Here 1/(128 ln(6n)) is the lower bound on the probability that the message enters $B_j$, and so leaves phase j, in a given step; one minus this value bounds the probability that it stays. It is raised to the power i − 1 because we are already assuming that the first step is contained in phase j, so the message must fail to leave in each of the next i − 1 steps. The expectation that we were analyzing before can therefore be bounded as

$$EX_j = \sum_{i=1}^{\infty} \Pr[X_j \ge i] \le \sum_{i=1}^{\infty} \left(1 - \frac{1}{128\ln(6n)}\right)^{i-1}.$$

This is a geometric series with first term a = 1 and common ratio 1 − 1/(128 ln(6n)). Using the formula a/(1 − r) given before, its sum is 128 ln(6n), so $EX_j \le 128\ln(6n)$. Once we substitute this back into our original equation, summing over the 1 + log n phases, we get

$$EX \le (1 + \log n)(128\ln(6n)) \le c(\log n)^2$$

for a suitable constant c.

Well, Is the World Really That Small?

There are some questions one could ask, such as "What if there were different values for p and q?"; "What if there were different definitions for long-range and short-range contacts?"; "What if different formulas were used to determine the decentralized algorithm and the probability?"; and "What if you could move diagonally on the n x n grid and not just up and down and left and right?" There are many different models someone could derive by looking at this from different perspectives.

References

1) L. Adamic. "The Small World Web". Manuscript available at http://www.parx.xerox.com/istl/groups/iea/www/smallworld.html
2) Sandra Blakeslee. "Mathematics Prove That it is a Small World"
3) Dr. Steve Deckleman. His Mathematical Knowledge
4) Jon Kleinberg. "The Small World Phenomenon: An Algorithmic Perspective"
5) Stanley Milgram. "The Small World Problem". Psychology Today 1, 61 (1967)
6) Beth Salnier. "Small World"
7) R. Albert, H. Jeong, A.-L. Barabási. "Diameter of the World-Wide Web". Nature 401, 130 (1999)