Power Laws
By Cameron Megaw
3/11/2013

What is a Power Law?
A power law is a distribution of the form:
$$p(x) = C x^{-\alpha}$$
Taking logarithms of both sides gives the equivalent form:
$$\ln p(x) = -\alpha \ln x + c$$
where $c = \ln C$, so a power law appears as a straight line of slope $-\alpha$ on log-log scales.

Example: The size of cities in the US (population 1000 or more)
• Highly right skewed
• The largest city has 8 million people
• Most cities have far fewer people

Measuring Power Laws: Sampling Errors
• 1 million random numbers from a power law distribution
• Exponent $\alpha = 2.5$
• Data is binned in intervals of size 0.1
• Linear scales produce a smooth curve
• Log-log scales have noisy data in the tail
  • Result of sampling errors
  • Corresponding bins have few samples (if any)
  • Fractional fluctuations in the bin counts are large

Solution 1: Throw out the data in the tail of the curve
• Statistically significant information is lost
• Some distributions only follow a power law in their tail
• Not recommended

Solution 2: Vary the width of the bins
• Normalize the data: divide each bin count by the width of the bin
  • Results in a count per unit interval of x
• Vary the bin size by a fixed multiplier (for example 2)
  • Bins become: 1 to 1.1, 1.1 to 1.3, 1.3 to 1.7 and so on
• Called logarithmic binning

Solution 3: Calculate the cumulative distribution function (power-law forms of which are known as Zipf's law or a Pareto distribution)
$$P(x) = \int_x^{\infty} C x'^{-\alpha}\,dx' = \frac{C}{\alpha-1}\,x^{-(\alpha-1)}$$
Here $P(x)$ is the probability that a value is greater than or equal to $x$.
• No need to bin the data
• Information on individual values is preserved
• Eliminates the noise in the tail

Measuring Power Laws: Unknown Exponent
1. Method of least squares:
• Most common method
• Fits the line of best fit on log-log scales
• Introduces systematic biases in the value of the exponent
• Estimated $\alpha = 2.26 \pm 0.02$ (actual 2.5)
2. Use the maximum likelihood formula:
$$\alpha = 1 + n\left[\sum_{i=1}^{n} \ln\frac{x_i}{x_{\min}}\right]^{-1}$$
• An (asymptotically) unbiased estimator
• Calculate an error estimate
  • Standard bootstrap resampling
  • Jackknife resampling
• Estimated $\alpha = 2.500 \pm 0.002$

Mathematics of Power Laws: Calculating C
The constant $C$ follows from normalization (this requires $\alpha > 1$):
$$1 = \int_{x_{\min}}^{\infty} p(x)\,dx = C\int_{x_{\min}}^{\infty} x^{-\alpha}\,dx = \frac{C}{1-\alpha}\left[x^{1-\alpha}\right]_{x_{\min}}^{\infty}$$
$$C = (\alpha-1)\,x_{\min}^{\alpha-1}$$

Mathematics of Power Laws: Moments
$$\langle x^m \rangle = \int_{x_{\min}}^{\infty} x^m p(x)\,dx = \frac{\alpha-1}{\alpha-1-m}\,x_{\min}^{m}$$
• All moments $\langle x^m \rangle$ exist for $m < \alpha - 1$ and diverge otherwise:
  • Mean: $\langle x \rangle = \frac{\alpha-1}{\alpha-2}\,x_{\min}$
  • Second moment: $\langle x^2 \rangle = \frac{\alpha-1}{\alpha-3}\,x_{\min}^{2}$
• The intensity of solar flares has an exponent of 1.4: is the average intensity infinite?
  • All data sets have a finite upper bound
  • A larger sample gives a non-negligible chance of increasing the upper bound

Mathematics of Power Laws: Largest Value
For a sample of size $n$ we can estimate the largest value in the sample:
$$\langle x_{\max} \rangle = n\,x_{\min}\,B\!\left(n, \frac{\alpha-2}{\alpha-1}\right) \sim n^{1/(\alpha-1)} \quad \text{as } n \to \infty$$
where $B$ is the beta function.
This estimate enables the calculation of moments for data sets whose moments would otherwise diverge:
$$\langle x^m \rangle = \int_{x_{\min}}^{x_{\max}} x^m p(x)\,dx$$

Mathematics of Power Laws: Scale Free Distribution
• A distribution is said to be scale free if, for any $b$:
$$p(bx) = g(b)\,p(x)$$
• The unit of measure does not affect the shape of the distribution
• If 2 kB files are 1/4 as common as 1 kB files, then 2 MB files are 1/4 as common as 1 MB files
• The scale-free property is unique to power law distributions
  • Scale free implies power law and vice versa

Mechanisms for Generating Power Laws
Some examples:
• Combinations of exponentials
• Inverses of quantities
• Random walks
• The Yule process
• Critical phenomena

The Topology of the Internet

Some Key Questions
• What does the Internet look like?
• Are there any topological properties that stay constant in time?
• How can I generate Internet-like graphs for simulation?
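Before turning to the measured Internet graphs, the maximum likelihood estimate of the exponent described earlier can be checked numerically. The following is a minimal sketch, not the procedure used to produce the figures quoted in the slides. It assumes samples are drawn by inverse-transform sampling with $x_{\min} = 1$ and uses a smaller sample (20,000 rather than 1 million) to keep the run short. The function names `sample_power_law`, `mle_exponent`, and `bootstrap_error` are hypothetical.

```python
import math
import random
import statistics


def sample_power_law(n, alpha, x_min=1.0, seed=0):
    """Draw n samples from p(x) = C x^-alpha for x >= x_min (inverse transform)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the power below is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


def mle_exponent(xs, x_min=1.0):
    """Maximum likelihood estimate: alpha = 1 + n / sum(ln(x_i / x_min))."""
    logs = [math.log(x / x_min) for x in xs if x >= x_min]
    return 1.0 + len(logs) / sum(logs)


def bootstrap_error(xs, x_min=1.0, rounds=50, seed=1):
    """Standard bootstrap: re-estimate alpha on resamples drawn with replacement."""
    rng = random.Random(seed)
    logs = [math.log(x / x_min) for x in xs if x >= x_min]
    n = len(logs)
    estimates = []
    for _ in range(rounds):
        resample = rng.choices(logs, k=n)  # same size as the original sample
        estimates.append(1.0 + n / sum(resample))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    data = sample_power_law(20000, alpha=2.5)
    print("alpha estimate:", mle_exponent(data))
    print("bootstrap error:", bootstrap_error(data))
```

Resampling the precomputed logarithms rather than the raw values gives the same estimates and avoids recomputing `math.log` in every bootstrap round.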
Internet Instances
• Three inter-domain topologies
  • November 1997, April 1998, and December 1998
• One router topology from 1995

Metrics

Outdegree of a Node and its Rank
Power Law 1: The outdegree $d_v$ of a node $v$ is proportional to the rank of the node, $r_v$, to the power of a constant $R$:
$$d_v = C r_v^{R}$$
Nodes are ranked in order of decreasing outdegree, and $N$ is the number of nodes. By setting $d_N = 1$ it can be shown that $C = 1/N^{R}$.

Inter-domain topologies
• Correlation coefficient above 0.974
• Exponents -0.81, -0.82, -0.74
Router
• Correlation coefficient 0.948
• Exponent -0.48

Outdegree of a Node and its Rank: Power Law Analysis
The exponent is relatively fixed for the three inter-domain topologies
• Topological property is fixed in time
• Can be used to generate models or test authenticity
Significant difference in exponent value for the router topology
• Can characterize different families of graphs
The rank exponent $R$ can be used to estimate the number of edges $E$:
$$E = \frac{N}{2(R+1)}\left(1 - \frac{1}{N^{R+1}}\right)$$

Frequency of the Outdegree
Power Law 2: The frequency, $f_d$, of an outdegree, $d$, is proportional to the outdegree to the power of a constant $O$:
$$f_d = C d^{O}$$

Inter-domain topologies
• Correlation coefficient above 0.968
• Exponents -2.15, -2.16, and -2.2
Router
• Correlation coefficient 0.966
• Exponent -2.48

Frequency of the Outdegree: Power Law Analysis
The exponent is relatively fixed for the three inter-domain topologies
• Topological property is fixed in time
• Could be used to generate models or test authenticity
Similar exponent value for the router topology
• Could suggest a fundamental property of the network

Eigenvalues and their Ordering
Power Law 3: The eigenvalues, $\lambda_i$, of a graph (sorted in decreasing order) are proportional to the order, $i$, to the power of a constant $\varepsilon$:
$$\lambda_i = C i^{\varepsilon}$$

Inter-domain topologies
• Correlation coefficient 0.99
• Exponents -0.47, -0.50, and -0.48
Router
• Correlation coefficient 0.99
• Exponent -0.1777

Eigenvalues and their Ordering: Power Law Analysis
Eigenvalues are closely related to many topological properties
• Graph diameter
• Number of edges
• Number of spanning trees…
The exponent is relatively fixed for the three inter-domain topologies
• Topological property seems fixed in time
• Can be used to generate models
Significant difference in the exponent value for the router topology
• Can characterize different families of graphs

Hop Plot Exponent
Approximation 1: The total number of pairs of nodes, $P(h)$, within $h$ hops can be approximated by:
$$P(h) = \begin{cases} c\,h^{H}, & h \ll \delta \\ N^{2}, & h \ge \delta \end{cases}$$
where $c = N + 2E$, $H$ is the hop-plot exponent, and $\delta$ is the diameter of the graph.

Inter-domain topologies
• First 4 hops
• Correlation coefficient above 0.96
• Exponents 4.6, 4.7, 4.86 (positive, since $P(h)$ grows with $h$)
Router
• First 12 hops
• Correlation coefficient 0.98
• Exponent 2.8

Hop Plot Exponent: Power Law Analysis
• The exponent is relatively fixed for the three inter-domain topologies
  • Topological property seems fixed in time
  • Can be used to generate models
• Significant difference in the exponent value for the router topology
  • Can characterize different families of graphs

The Effective Diameter
How many hops to reach a "sufficiently large" part of the network?
• Too small a broadcast will not reach the target
• Too large a broadcast can clog the network
• A good guess is the hop count at which the two regimes of the hop-plot intersect, that is, where $c\,h^{H} = N^{2}$
The effective diameter:
$$\delta_{ef} = \left(\frac{N^{2}}{N + 2E}\right)^{1/H}$$
For the inter-domain instances
• 80% of node pairs were within $\delta_{ef}$ hops
• 90% were within $\lceil \delta_{ef} \rceil$ hops

Average Neighborhood Size
Estimate from the average outdegree $\bar{d}$:
$$NN'(h) = \bar{d}\,(\bar{d} - 1)^{h-1}$$
Estimate from the hop-plot exponent:
$$NN(h) = \frac{P(h)}{N} - 1 = \frac{c}{N}\,h^{H} - 1$$

Conclusions
Power laws and Internet topology
• Can assess the realism of synthetic graphs
• Provide important parameters for graph generators
• Help with network protocols
• Help answer "what if" questions
  • What would the diameter be if the number of nodes doubles?
  • What would the average neighborhood size be?

Questions?
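The "what if" questions in the conclusions can be explored directly with the edge-count, effective-diameter, and neighborhood-size formulas. The sketch below is illustrative only: the node count of 3000 is a hypothetical value, not one of the measured instances; the rank exponent -0.81 is taken from the first inter-domain topology; and the hop-plot exponent is taken as the positive value 4.6, since $P(h)$ grows with $h$. The function names are hypothetical.

```python
def estimate_edges(n_nodes, rank_exponent):
    """E = N / (2 (R + 1)) * (1 - 1 / N^(R + 1))."""
    r1 = rank_exponent + 1.0
    return n_nodes / (2.0 * r1) * (1.0 - 1.0 / n_nodes ** r1)


def effective_diameter(n_nodes, n_edges, hop_exponent):
    """delta_ef = (N^2 / (N + 2E))^(1 / H)."""
    return (n_nodes ** 2 / (n_nodes + 2.0 * n_edges)) ** (1.0 / hop_exponent)


def avg_neighborhood_size(h, n_nodes, n_edges, hop_exponent):
    """NN(h) = (c / N) h^H - 1, with c = N + 2E."""
    c = n_nodes + 2.0 * n_edges
    return c / n_nodes * h ** hop_exponent - 1.0


if __name__ == "__main__":
    R, H = -0.81, 4.6            # exponents quoted for an inter-domain topology
    for n in (3000, 6000):       # hypothetical node counts: baseline and doubled
        e = estimate_edges(n, R)
        d = effective_diameter(n, e, H)
        print(n, "nodes:", round(e), "edges, effective diameter", round(d, 2),
              "neighborhood size at 2 hops", round(avg_neighborhood_size(2, n, e, H), 1))
```

Holding the exponents fixed while changing the node count relies on the slides' observation that the exponents stay roughly constant over time for the inter-domain graphs; it would not be justified across different families of graphs.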