The Power-Law of Top500
Matei Ripeanu
(matei@cs.uchicago.edu)
Most natural and technical phenomena are characterized by highly unbalanced distributions:
there are few powerful, devastating earthquakes and countless unnoticeable ones; there are few
machines with a peak FLOPS rate larger than 1 TFLOPS while millions of machines work at
around 1 MFLOPS. Many of these events (city sizes, incomes, word frequency [1, 2]) fit
power-law distributions: the number of events of a certain size is proportional to the size of
the event to a negative constant power.

[Figure 1 plot: x-axis Rank (log scale), 1 to 1000; y-axis GFLOPS (log scale), 1 to 10000; one series per year, 1995 to 2001]
Figure 1: Peak processing rate (GFLOPS) for world’s fastest supercomputers in Top500 list from
1995 to 2001. Each series of points represents one year on this log-log plot.

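
For reference, the power-law statement above and the rank plot of Figure 1 can be written compactly as follows. The symbols N(s), P(r), a, and C are notation assumed here for illustration; only the coefficient k appears in the original figures. The second line shows why a power law appears as a straight line on a log-log plot.

```latex
\[
N(s) \propto s^{-a}, \quad a > 0
\qquad \text{(number of events of size } s\text{)}
\]
\[
P(r) \approx C\, r^{k}, \quad k < 0
\quad\Longleftrightarrow\quad
\log P(r) \approx \log C + k \log r
\qquad \text{(peak rate of the machine ranked } r\text{)}
\]
```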
There are many dimensions of variation
for entities participating in the Internet:
from the obvious ones like CPU speed, available disk space and network bandwidth, to
more elaborate ones such as inter-failure time, node trustworthiness, or reliability. We
conjecture that these follow similar, highly heterogeneous distributions. Preliminary
results support our intuition: the size of Internet autonomous systems [3], the bandwidth of
nodes in the Gnutella network [4, 5], and the CPU power of machines in the Top500 list [6] all
follow power-law distributions (or at least highly variable distributions that can be well
approximated as power-laws).
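
One simple way to check whether a ranked measurement is well approximated by a power law is to fit a straight line to the data on log-log axes. The sketch below does this with an ordinary least-squares fit. It is an illustration only, not the procedure used for the figures in this note: the function name `fit_power_law` and the synthetic input are assumptions, and no real Top500 data is included.

```python
import math

def fit_power_law(values):
    """Fit value(rank) ~ c * rank**k by least squares in log-log space.

    `values` must be positive and sorted from largest to smallest,
    so that values[0] has rank 1. Returns the pair (k, c).
    """
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                      # slope of the line on the log-log plot
    c = math.exp(mean_y - k * mean_x)  # intercept, converted back from log space
    return k, c

# Synthetic (made-up) data: 500 "machines" following an exact power law.
synthetic = [1000.0 * rank ** -0.7 for rank in range(1, 501)]
k, c = fit_power_law(synthetic)
print(f"k = {k:.3f}, c = {c:.1f}")
```

On real measurements the points will scatter around the line, so the fitted k is only as meaningful as the quality of the fit.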
We use an example to depict the characteristics of this distribution: the peak FLOPS rate of the
world’s most powerful supercomputers follows a power-law distribution for all years for which
data is available (see Figure 1). Even more interesting, perhaps, is the heavy-tail property: if
we assume that the same distribution extends to machines beyond the Top500, aggregating the
computers in the tail yields a machine comparable in power to the top ones. Analyzing the
data for the last seven years reveals a fascinating trend: the heavy-tail property becomes more
accentuated (Figure 2 shows that the power constant of these distributions gets closer to zero).
If this trend persists, interest will continue to shift from building large machines to
large-scale integrations of less powerful systems.

[Figure 2 plot: x-axis year, 1995 to 2001; y-axis coefficient k, from -0.84 to -0.66]
Figure 2: Evolution of the power-law distribution coefficient k over time. As k gets closer to 0,
the distribution is more heavy-tailed. Note that the magnitude of k decreases by 2% per year on
average.
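
The aggregation argument can be illustrated numerically. The sketch below assumes, as the text does, that the same power law extends beyond rank 500, and compares the combined rate of the tail with the combined rate of the top 500 machines. The function name `tail_to_top_ratio`, the population of one million machines, and the two values of k (the ends of the range plotted for k, not measured values) are all assumptions made for illustration.

```python
def rate(rank, k, c=1.0):
    """Peak rate of the machine at `rank` under the model c * rank**k."""
    return c * rank ** k

def tail_to_top_ratio(k, n_machines, top_n=500):
    """Aggregate rate of ranks top_n+1..n_machines divided by that of ranks 1..top_n."""
    top = sum(rate(r, k) for r in range(1, top_n + 1))
    tail = sum(rate(r, k) for r in range(top_n + 1, n_machines + 1))
    return tail / top

# Hypothetical population size; the power law is assumed to hold for all ranks.
N_MACHINES = 1_000_000

for k in (-0.84, -0.66):
    ratio = tail_to_top_ratio(k, N_MACHINES)
    print(f"k = {k:+.2f}: tail/top aggregate ratio = {ratio:.1f}")
```

Under these assumptions the tail outweighs the top 500 in aggregate, and the ratio grows as k moves closer to zero, which is the trend the text describes.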
References
[1] M. Schroeder, Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise. W.H. Freeman
and Company, 1991.
[2] N. Shiode and M. Batty, "Power Law Distributions in Real and Virtual Worlds," presented at
INET 2000, Yokohama, Japan, 2000.
[3] H. Tangmunarunkit, S. Doyle, R. Govindan, S. Jamin, S. Shenker, and W. Willinger, "Does AS
Size Determine Degree in AS Topology?," ACM Computer Communication Review, 2001.
[4] S. Saroiu, P. K. Gummadi, and S. D. Gribble, "A Measurement Study of Peer-to-Peer File
Sharing Systems," presented at Multimedia Computing and Networking Conference (MMCN), San
Jose, CA, USA, 2002.
[5] M. Ripeanu, I. Foster, and A. Iamnitchi, "Mapping the Gnutella Network: Properties of
Large-Scale Peer-to-Peer Systems and Implications for System Design," Internet Computing
Journal, vol. 6, 2002.
[6] "TOP500 Supercomputer Sites - http://www.top500.org/," 2002.