Random Graphs
Drew Masters
March 9th, 2016

What is a random graph?
● A graph in which properties such as the number of vertices and the edges between them are determined in some random way
● Can be produced by a graph-generating model (constructive) or by randomly sampling from a collection of graphs (sampling)
● Consists of n vertices and m edges

History
● First papers published in 1959
● Paul Erdős and Alfréd Rényi
● Edgar Gilbert
● First papers that formally defined a random graph
● Papers independent of each other

Paul Erdős
● Hungarian
● Mathematician
● Known for his social practice of mathematics
  ○ engaged more than 500 collaborators
● Erdős number

Alfréd Rényi
● Hungarian
● Mathematician
● Founded Mathematics Research Institute of the Hungarian Academy of Sciences

Edgar Gilbert
● American
● Mathematician
● Researcher at Bell Laboratories
● Known for his contributions to coding theory

Random Graph Models
● Obtained by starting with a set of n isolated vertices and adding successive edges between the vertices at random
● Random graph models
  ○ Edgar Gilbert: G(n,p)
  ○ Erdős-Rényi model: G(n,M)

Edgar Gilbert Model
● Denoted G(n,p)
  ○ n = number of isolated vertices
  ○ p = probability that each possible edge occurs, independently of the others
● Probability of obtaining a particular graph = $p^m (1-p)^{N-m}$
  ○ m = number of edges
  ○ $N = \binom{n}{2}$
● A graph is constructed by connecting nodes randomly.
● As p increases from 0 to 1, the model is more likely to produce graphs with more edges

Erdős-Rényi Model
● Denoted G(n,M)
  ○ M = number of edges
● Assigns equal probability to all graphs with exactly M edges
  ○ Every element occurs with probability $1 / \binom{N}{M}$, where $N = \binom{n}{2}$
● A graph is chosen uniformly at random from the collection of all graphs which have n nodes and M edges.

Comparison of the main models
● G(n,p) model is the more commonly used model
  ○ G(n,M) model is not as easy to deal with mathematically
● Equivalence of G(n,M) and G(n,p) is done by setting $M = \binom{n}{2} p$.
  ○ As n approaches infinity, G(n,p) should behave similarly to G(n,M)
  ○ Law of large numbers says G(n,p) will contain approximately the same number of edges as G(n,M)

Characteristics of G(n,p)
● Expected number of edges = $\binom{n}{2} p$
● Expected degree: z (also called c or k in some papers)
  ○ z = (n-1) * p
● Clustering coefficient: cc(v)
  ○ Average probability two neighbours of a given vertex are also neighbours of one another
  ○ Expected cc(v) of G(n,p) = p

Evolution of G(n,p)
● np << 1
  ○ G(n,p) most likely does not have any connected components larger than O(log(n)) in size
  ○ Components consist of trees
● np = c where c < 1
  ○ cycles start appearing
  ○ almost all vertices connected in trees
● np = 1
  ○ G(n,p) most likely has a largest component of the order $n^{2/3}$
● np > 1
  ○ G(n,p) most likely has a unique giant component containing a positive fraction of the vertices
  ○ No other component will contain more than O(log(n)) vertices

Dual phase evolution
● Process that promotes the emergence of large-scale order in complex systems
● Arises because a connectivity avalanche occurs in graphs as the number of edges increases
● Features necessary for dual phase evolution to occur
  ○ underlying network
  ○ phase shifts
  ○ selection and variation
  ○ system memory

Probabilities of the whole graph
● Probability the graph is connected: $P_N = 1 - \sum_{k=1}^{N-1} \binom{N-1}{k-1} P_k \, q^{k(N-k)}$
  ○ N = number of vertices
  ○ p = probability that an edge exists
  ○ q = probability that an edge does not exist = 1 - p
  ○ $P_N$ = probability N vertices are connected
● Probability two vertices are connected in the graph
[Table from Gilbert paper]

Diameter of random graphs
● Diameter: longest shortest path between two vertices
● Expected diameter of a random graph = $\ln n / \ln c$
  ○ n is the number of vertices
  ○ c is the average degree of a vertex

Rado Graph
● Unique countably infinite graph R such that for every finite graph G and every vertex v of G, every embedding of G-v as an induced subgraph of R can be extended to an embedding of G into R.
● Contains all finite and countably infinite graphs as induced subgraphs
● For any finite disjoint sets of vertices U and V, there exists a vertex x connected to everything in U, and to nothing in V

Random Regular Graph
● Graph selected from $G_{n,r}$
  ○ $G_{n,r}$ denotes the probability space of all r-regular graphs on n vertices
  ○ 3 ≤ r < n and nr is even
● Possible to prove that certain properties of a random r-regular graph almost surely hold
● Non-trivial to implement

Threshold Functions
● Function m*(n) for a monotone increasing property P in random graph G such that, as n approaches infinity:
  ○ $\Pr[G(n,m) \text{ has } P] \to 0$ if $m / m^*(n) \to 0$
  ○ $\Pr[G(n,m) \text{ has } P] \to 1$ if $m / m^*(n) \to \infty$
● Every non-trivial monotone graph property has a threshold
  ○ Proved by Bollobás and Thomason in 1987

Other Models
● Barabási-Albert
  ○ Model for generating random scale-free networks using a preferential attachment mechanism
● Watts-Strogatz
  ○ Model produces graphs with small-world properties including short average path lengths and high clustering
● Stochastic Block
  ○ Model produces graphs containing communities

Barabási-Albert Model
● Scale-free networks are widely observed in natural and human-made systems
  ○ scale-free network: a network whose degree distribution follows a power law
● Incorporates two concepts
  ○ growth: number of nodes in network increases over time
  ○ preferential attachment: the more connected a node is, the more likely it is to receive new edges
● Each node added to the network is connected to m existing vertices, where m is less than the initial number of vertices in the graph. An existing vertex is chosen with probability proportional to the number of links it already has.
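
The growth and preferential-attachment steps of the Barabási-Albert model can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `barabasi_albert`, the `seed` argument, and the choice of a star on m + 1 vertices as the starting graph are all made up for this example (the slides only require that m be less than the initial number of vertices).

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=None):
    """Return the edge list of a graph grown to n nodes by preferential attachment.

    Hypothetical helper for illustration; each new node attaches to m
    distinct existing nodes chosen with probability proportional to degree.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    # `endpoints` lists every node once per incident edge, so a uniform
    # draw from it selects a node with probability proportional to its degree.
    endpoints = []
    # Assumption: start from a star on m + 1 nodes (node 0 is the centre),
    # so every starting node already has degree >= 1.
    for v in range(1, m + 1):
        edges.append((0, v))
        endpoints += [0, v]
    # Growth: add the remaining nodes one at a time.
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:  # redraw until m distinct targets are found
            chosen.add(rng.choice(endpoints))
        for target in chosen:
            edges.append((new, target))
            endpoints += [new, target]
    return edges


edges = barabasi_albert(1000, 3, seed=42)
degree = Counter(v for edge in edges for v in edge)
print("edges:", len(edges))
print("largest degrees:", [d for _, d in degree.most_common(5)])
```

Keeping a flat list of edge endpoints avoids recomputing degree-based weights at every step; a few early nodes typically end up with much larger degree than the rest, which is the hub formation the model is meant to capture.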

Watts-Strogatz Model
● Created because Erdős-Rényi graphs lack two properties that are found in many real-world networks
  ○ Do not generate local clustering and triadic closures
  ○ Do not account for formation of hubs
● Algorithm
  ○ Given N nodes, average degree K, and a special parameter β
    ■ 0 ≤ β ≤ 1
    ■ N >> K >> ln(N) >> 1
    ■ NK/2 edges
  ○ Construct regular ring lattice
  ○ Rewire the edges
    ■ choose a new end point with probability β
    ■ avoid self-loops and link duplication

Stochastic Block Model
● Communities
  ○ Subsets of vertices that are connected with one another with particular edge densities
● Parameters
  ○ n = number of vertices
  ○ Partition of vertex set into disjoint subsets $C_1, \ldots, C_r$
  ○ Symmetric r x r matrix P of inter-community edge probabilities

Network Probability Matrix
● Probability structure of a network based on the historical presence or absence of edges in a network
● Example: individuals in a social network
● Simulated by varying the probabilities that certain nodes will communicate

Percolation theory
● Describes the behavior of connected clusters in a random graph
● Suppose some liquid is poured on top of a porous material. Will the liquid be able to make its way from hole to hole and reach the bottom?
  ○ Modelled as a 3D network of n x n x n vertices
  ○ Each edge is open (allows liquid through) with probability p or closed with probability q = 1 - p
  ○ Problem is called bond percolation
● Question: For a given p, what is the probability that a path exists between top and bottom?
● The Erdős-Rényi process is unweighted link percolation on the complete graph

Other Applications
● Answer questions about properties of typical graphs
  ○ Property A almost always implies property B
● Study small world phenomena
● Used to model various types of networks
  ○ social networks
  ○ internet
  ○ power grid
  ○ networks of collaborators
  ○ neural networks
  ○ food web

Open questions
● Chromatic polynomial of a random graph
  ○ number of proper colorings of a random graph given q colors
● Community structure
● Link prediction

Homework
● What is the expected diameter of a random graph with 7 billion vertices and an average degree of 1000?
● Consider G(1000000, 0.00000001). What kind of components should you expect? What about G(1000000, 0.001)?
● Convert G(1000000, 0.00000001) (G(n,p)) to G(n,M).

References
[1] https://en.wikipedia.org/wiki/Random_graph
[2] https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model
[3] https://en.wikipedia.org/wiki/Watts_and_Strogatz_model
[4] https://en.wikipedia.org/wiki/Network_probability_matrix
[5] https://en.wikipedia.org/wiki/Percolation_theory
[6] https://en.wikipedia.org/wiki/Community_structure
[7] https://en.wikipedia.org/wiki/Chromatic_polynomial
[8] https://en.wikipedia.org/wiki/Rado_graph
[9] https://en.wikipedia.org/wiki/Random_regular_graph
[10] https://en.wikipedia.org/wiki/Paul_Erd%C5%91s
[11] https://en.wikipedia.org/wiki/Edgar_Gilbert
[12] https://en.wikipedia.org/wiki/Alfr%C3%A9d_R%C3%A9nyi
[13] https://en.wikipedia.org/wiki/Barab%C3%A1si%E2%80%93Albert_model
[14] https://en.wikipedia.org/wiki/Stochastic_block_model
[15] https://en.wikipedia.org/wiki/Dual-phase_evolution
[16] “The Link Prediction Problem for Social Networks” by David Liben-Nowell and Jon Kleinberg
[17] “Random Graphs as Models of Networks” by M. E. J. Newman
[18] “Random Graphs” by Edgar Gilbert
[19] “On random graphs I.” by Paul Erdős and Alfréd Rényi
[20] “Lecture 4” put together by Sriram Pemmaraju
[21] “The Rado graph and the Urysohn space” by Peter J. Cameron