A Statistical Derivation of the Significant-Digit Law

THE SIGNIFICANT-DIGIT LAW
A perhaps surprising corollary of the general law (4) is that (cf. Hill, 1995b) the significant digits are dependent and not independent as one might expect. From (2) it follows that the (unconditional) probability that the second digit is 2 is $\approx 0.109$, but by (4) the (conditional) probability that the second digit is 2, given that the first digit is 1, is $\approx 0.115$. This dependence among significant digits decreases rapidly as the distance between the digits increases, and it follows easily from the general law (4) that the distribution of the $n$th significant digit approaches the uniform distribution on $\{0, 1, \ldots, 9\}$ exponentially fast as $n \to \infty$. [This article will concentrate on decimal (base-10) representations and significant digits; the corresponding analog of (3) for other bases $b > 1$ is simply $\mathrm{Prob}(\text{mantissa (base } b) \le t/b) = \log_b t$ for all $t \in [1, b)$.]
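The two probabilities quoted above can be checked numerically. The sketch below assumes that the general law (4), which is stated earlier in the article and not reproduced here, has the usual form $P(D_1 = d_1, \ldots, D_k = d_k) = \log_{10}(1 + 1/N)$, where $N$ is the integer written with the digits $d_1 d_2 \cdots d_k$. The function names are illustrative only.

```python
from math import log10

def prob_leading_digits(digits):
    """Probability that the leading significant digits equal `digits`.

    Assumption: the general law (4) reads log10(1 + 1/N), where N is the
    integer whose decimal digits are d1 d2 ... dk.
    """
    n = int("".join(str(d) for d in digits))
    return log10(1 + 1 / n)

def nth_digit_prob(n, d):
    """Unconditional probability that the n-th significant digit (n >= 2) is d."""
    # Sum over every possible block of the first n-1 digits (first digit nonzero).
    prefixes = range(10 ** (n - 2), 10 ** (n - 1))
    return sum(log10(1 + 1 / (10 * p + d)) for p in prefixes)

# Unconditional probability that the second digit is 2.
p_second_is_2 = nth_digit_prob(2, 2)

# Conditional probability that the second digit is 2 given the first digit is 1.
p_cond = prob_leading_digits((1, 2)) / prob_leading_digits((1,))

# Largest deviation from the uniform value 1/10 for the n-th digit.
def max_deviation(n):
    return max(abs(nth_digit_prob(n, d) - 0.1) for d in range(10))
```

Under this assumption, `p_second_is_2` and `p_cond` round to the values 0.109 and 0.115 given in the text, and `max_deviation(n)` shrinks by roughly a factor of ten with each further digit, which is the exponential approach to uniformity.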
EMPIRICAL EVIDENCE
Of course, many tables of numerical data do not follow this logarithmic distribution (lists of telephone numbers in a given region typically begin with the same few digits), and even "neutral" data such as square-root tables of integers are not good fits. However, a surprisingly diverse collection of empirical data does seem to obey the significant-digit law.
Newcomb (1881) noticed "how much faster the first pages [of logarithmic tables] wear out than the last ones," and after several short heuristics, concluded the equiprobable-mantissae law. Some 57 years later the physicist Frank Benford rediscovered the law and supported it with over 20,000 entries from 20 different tables including such diverse data as areas of 335 rivers, specific heats of 1389 chemical compounds, American League baseball statistics and numbers gleaned from Reader's Digest articles and front pages of newspapers. Although Diaconis and Freedman (1979, page 363) offer convincing evidence that Benford manipulated round-off errors to obtain a better fit to the logarithmic law, even the unmanipulated data are a remarkably good fit. Newcomb's article having been overlooked, the law also became known as Benford's law.
Since Benford's popularization of the law, an abundance of additional empirical evidence has appeared. In physics, for example, Knuth (1969) and Burke and Kincanon (1991) observed that of the most commonly used physical constants (e.g., the constants such as speed of light and force of gravity listed on the inside cover of an introductory physics textbook), about 30% have leading significant digit 1. Becker (1982) observed that the decimal parts of failure (hazard) rates often have a logarithmic distribution, and Buck, Merchant and Perez (1993), in studying the values of the 477 radioactive half-lives of unhindered $\alpha$ decays which have been accumulated throughout the present century and which vary over many orders of magnitude, found that the frequency of occurrence of the first digits of both measured and calculated values of the half-lives is in "good agreement" with Benford's law.
In scientific calculations the assumption of logarithmically distributed mantissae "is widely used and well established" (Feldstein and Turner, 1986, page 241), and as early as a quarter-century ago, Hamming (1970, page 1609) called the appearance of the logarithmic distribution in floating-point numbers "well-known." Benford-like input is often a common assumption for extensive numerical calculations (Knuth, 1969), but Benford-like output is also observed even when the input has random (non-Benford) distributions. Adhikari and Sarkar (1968) observed experimentally "that when random numbers or their reciprocals are raised to higher and higher powers, they have log distribution of most significant digit in the limit." Schatte (1988, page 443) reports that "In the course of a sufficiently long computation in floating-point arithmetic, the occurring mantissas have nearly logarithmic distribution."
Extensive evidence of the significant-digit law has also surfaced in accounting data. Varian (1972) studied land usage in 777 tracts in the San Francisco Bay area and concluded "As can be seen, both the input data and the forecasts are in fairly good accord with Benford's Law." Nigrini and Wood (1995) show that the 1990 census populations of the 3141 counties in the United States "follow Benford's Law very closely," and Nigrini (1996) calculated that the digital frequencies of income tax data reported to the Internal Revenue Service of interest received and interest paid are an extremely good fit to Benford. Ley (1995) found "that the series of one-day returns on the Dow-Jones Industrial Average Index (DJIA) and the Standard and Poor's Index (S&P) reasonably agrees with Benford's law."
All these statistics aside, the author also highly recommends that the justifiably skeptical reader perform a simple experiment, such as randomly selecting numerical data from front pages of several local newspapers, "or a Farmer's Almanack" as Knuth (1969) suggests.
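For a reader who wants to tally such an experiment by machine, the following is one possible sketch. It assumes the usual first-digit form of the law, $P(D_1 = d) = \log_{10}(1 + 1/d)$. The data set used here, the first 1000 powers of 2, is a hypothetical stand-in chosen only because it needs no external file; it is not one of the data sources discussed in the article, and numbers collected from a newspaper would be passed to the same function.

```python
from collections import Counter
from decimal import Decimal
from math import log10

def first_digit(value):
    """First significant decimal digit of a nonzero number."""
    digits = Decimal(str(abs(value))).as_tuple().digits
    return digits[0]

def first_digit_frequencies(values):
    """Observed relative frequency of each leading digit 1..9."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Assumed first-digit form of the law: log10(1 + 1/d).
expected = {d: log10(1 + 1 / d) for d in range(1, 10)}

# Hypothetical stand-in data; replace with numbers gathered by hand.
sample = [2 ** n for n in range(1000)]
observed = first_digit_frequencies(sample)

for d in range(1, 10):
    print(d, round(observed[d], 3), round(expected[d], 3))
```

The comparison is deliberately informal (a side-by-side table, not a goodness-of-fit test), which matches the spirit of the suggested experiment.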
T. P. HILL
CLASSICAL EXPLANATIONS
Since the empirical significant-digit law (4) does not specify a well-defined statistical experiment or sample space, most attempts to prove the law have been purely mathematical (deterministic) in nature, attempting to show that the law "is a built-in characteristic of our number system," as Weaver (1963) called it. The idea was to prove first that the set of real numbers satisfies (4), and then suggest that this explains the empirical statistical evidence.
A common starting point has been to try to establish (4) for the positive integers $\mathbb{N}$, beginning with the prototypical set $\{D_1 = 1\} = \{1, 10, 11, 12, 13, 14, \ldots, 19, 100, 101, \ldots\}$, the set of positive integers with leading significant digit 1. The source of difficulty and much of the fascination of the problem is that this set $\{D_1 = 1\}$ does not have a natural density among the integers, that is,

$$\lim_{n \to \infty} \frac{1}{n}\, \#\big(\{D_1 = 1\} \cap \{1, 2, \ldots, n\}\big)$$
does not exist, unlike the sets of even integers or primes which have natural densities 1/2 and 0, respectively. It is easy to see that the empirical density of $\{D_1 = 1\}$ oscillates repeatedly between 1/9 and 5/9, and thus it is theoretically possible to assign any number in $[1/9, 5/9]$ as the "probability" of this set. Flehinger (1966) used a reiterated-averaging technique to define a generalized density which assigns the "correct" Benford value $\log_{10} 2$ to $\{D_1 = 1\}$, Cohen (1976) showed that "any generalization of natural density which applies to the [significant digit sets] and which satisfies one additional condition must assign the value $\log_{10} 2$ to [$\{D_1 = 1\}$]" and Jech (1992) found necessary and sufficient conditions for a finitely-additive set function to be the log function. None of these solutions, however, resulted in a true (countably additive) probability, the difficulty being exactly the same as that in the foundational problem of "picking an integer at random" (cf. de Finetti, 1972, pages 86 and 98-99), namely, if each singleton integer occurs with equal probability, then countable additivity implies that the whole space must have probability zero or infinity.
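The oscillation of the empirical density between 1/9 and 5/9 mentioned above is easy to reproduce with exact arithmetic. The sketch below is only an illustration; the function names are not from the article. It counts the integers in $\{1, \ldots, n\}$ with leading digit 1 and evaluates the density at the cutoffs $n = 10^k - 1$ (where it is lowest) and $n = 2 \cdot 10^k - 1$ (where it is highest).

```python
from fractions import Fraction

def count_leading_one(n):
    """Number of integers in {1, ..., n} whose first decimal digit is 1."""
    count = 0
    power = 1
    while power <= n:
        # At this magnitude the integers with leading digit 1 are
        # power, power + 1, ..., 2 * power - 1.
        count += min(n, 2 * power - 1) - power + 1
        power *= 10
    return count

def empirical_density(n):
    """Exact proportion of {1, ..., n} with leading digit 1."""
    return Fraction(count_leading_one(n), n)

low_points = [empirical_density(10 ** k - 1) for k in range(1, 7)]
high_points = [empirical_density(2 * 10 ** k - 1) for k in range(1, 7)]
```

Every entry of `low_points` equals 1/9 exactly, while the entries of `high_points` decrease toward 5/9, so the sequence of densities has no limit.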
These discrete-summability arguments have been extended via various integration schemes, Fourier analysis and Banach measures to continuous densities on the positive reals, where $\{D_1 = 1\}$ is now the set of positive numbers with first significant digit 1, that is,

$$\{D_1 = 1\} = \bigcup_{n=-\infty}^{\infty} [1, 2) \times 10^n. \tag{5}$$
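As a small illustration of (5), the sketch below tests membership in the union $\bigcup_n [1, 2) \times 10^n$ by rescaling and compares the answer with a direct reading of the first significant digit. It uses Python's `decimal` module and expects inputs as decimal strings or integers so that no binary rounding intervenes; the helper names are illustrative assumptions.

```python
from decimal import Decimal

def first_significant_digit(x):
    """First significant digit D1 of a positive number given as a string or int."""
    d = Decimal(x)
    if d <= 0:
        raise ValueError("expected a positive number")
    return d.as_tuple().digits[0]

def in_union_of_decades(x):
    """True if x lies in [1, 2) x 10^n for some integer n, as in (5)."""
    d = Decimal(x)
    if d <= 0:
        raise ValueError("expected a positive number")
    # adjusted() is the exponent of the leading digit, so this lands in [1, 10).
    rescaled = d.scaleb(-d.adjusted())
    return Decimal(1) <= rescaled < Decimal(2)

examples = ["1", "1.999", "2", "19", "20", "0.0123", "0.02", "150", "1000"]
agreement = all(
    in_union_of_decades(x) == (first_significant_digit(x) == 1) for x in examples
)
```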
One popular assumption in this context has been that of scale invariance, which corresponds to the intuitively attractive idea that any universal law should be independent of units (e.g., metric or English). The problem here, however, as Knuth (1969) observed, is that there is no scale-invariant Borel probability measure on the positive reals since then the probability of the set $(0, 1)$ would equal that of $(0, s)$ for all $s$, which again would contradict countable additivity. (Raimi, 1976, has a good review of many of these arguments.) Just as with the density proofs for the integers, none of these methods yielded either a true probabilistic law or any statistical insights.
Attempts to prove the law based on various urn schemes for picking significant digits at random have been equally unsuccessful in general, although in some restricted settings log-limit laws have been established. Adhikari and Sarkar (1968) proved that powers of a uniform $(0, 1)$ random variable satisfy Benford's law in the limit, Cohen and Katz (1984) showed that a prime chosen at random with respect to the zeta distribution satisfies the logarithmic significant-digit law and Schatte (1988) established convergence to Benford's law for sums and products of certain nonlattice i.i.d. variables.
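The limit law for powers of a uniform $(0, 1)$ variable can be made concrete with a short calculation. The closed form used below is a direct computation made for this illustration and is not quoted from Adhikari and Sarkar: for $U$ uniform on $(0, 1)$ and an integer $k \ge 1$, summing the lengths of the intervals on which $U^k$ has first digit $d$ gives $P(D_1(U^k) = d) = \big((d+1)^{1/k} - d^{1/k}\big) / \big(10^{1/k} - 1\big)$, which tends to $\log_{10}(1 + 1/d)$ as $k \to \infty$.

```python
from math import log10

def first_digit_prob_of_power(d, k):
    """P(first significant digit of U**k is d) for U uniform on (0, 1).

    Closed form derived for this sketch (an assumption, not a quoted result):
    ((d+1)^(1/k) - d^(1/k)) / (10^(1/k) - 1).
    """
    r = 1.0 / k
    return ((d + 1) ** r - d ** r) / (10 ** r - 1)

def benford(d):
    """Limiting value log10(1 + 1/d)."""
    return log10(1 + 1 / d)

# Gap between the exact probability and the logarithmic limit for digit 1.
gaps = [abs(first_digit_prob_of_power(1, k) - benford(1)) for k in (1, 10, 100, 1000)]
```

For $k = 1$ the formula gives 1/9 for every digit, as it must for a uniform variable, and the entries of `gaps` decrease steadily toward zero.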
THE NATURAL PROBABILITY SPACE
The task of putting the significant-digit law into a proper countably additive probability framework is actually rather easy. Since the conclusion of the law (4) is simply a statement about the significant-digit functions (random variables) $D_1, D_2, \ldots$, let the sample space be $\mathbb{R}^+$, the set of positive reals, and let the sigma algebra of events simply be the $\sigma$-field generated by $\{D_1, D_2, \ldots\}$ [or equivalently, generated by the single function $x \mapsto \text{mantissa}(x)$]. It is easily seen that this $\sigma$-algebra, which will be denoted $\mathcal{M}$ and will be called the (decimal) mantissa $\sigma$-algebra, is a sub-$\sigma$-field of the Borels and that in fact

$$S \in \mathcal{M} \iff S = \bigcup_{n=-\infty}^{\infty} B \times 10^n \quad \text{for some Borel } B \subseteq [1, 10), \tag{6}$$

which is just the obvious generalization of the representation (5) for $\{D_1 = 1\}$.
The mantissa $\sigma$-algebra $\mathcal{M}$, although quite simple, has several interesting properties:

(7)
(i) every nonempty set in $\mathcal{M}$ is infinite with accumulation points at 0 and at $+\infty$;
(ii) $\mathcal{M}$ is closed under scalar multiplication ($s > 0$, $S \in \mathcal{M} \Rightarrow sS \in \mathcal{M}$);
(iii) $\mathcal{M}$ is closed under integral roots ($m \in \mathbb{N}$, $S \in \mathcal{M} \Rightarrow S^{1/m} \in \mathcal{M}$), but not powers;
(iv) $\mathcal{M}$ is self-similar in the sense that if $S \in \mathcal{M}$, then $10^m S = S$ for every integer $m$

(where $aS$ and $S^a$ denote the sets $\{as : s \in S\}$ and $\{s^a : s \in S\}$, respectively).
Property (i) implies that finite intervals such as $[1, 2)$ are not in $\mathcal{M}$ (i.e., are not expressible in terms of the significant digits alone; e.g., significant digits alone cannot distinguish between the numbers 2 and 20) and thus the countable-additivity contradictions associated with scale invariance disappear.
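The remark that significant digits alone cannot distinguish 2 from 20 can be seen directly by computing the digit sequence $(D_1, D_2, \ldots)$. The following is a minimal sketch with an illustrative helper name, taking positive numbers as decimal strings and returning the first few significant digits.

```python
from decimal import Decimal

def significant_digits(x, k=5):
    """First k significant decimal digits (D1, ..., Dk) of a positive number."""
    d = Decimal(x)
    if d <= 0:
        raise ValueError("expected a positive number")
    digits = d.as_tuple().digits          # coefficient digits, leading digit first
    padded = tuple(digits[:k]) + (0,) * max(0, k - len(digits))
    return padded

# 2, 20 and 0.002 differ only by powers of 10, so their digit sequences agree.
same = significant_digits("2") == significant_digits("20") == significant_digits("0.002")
```

Any event described through these digits therefore contains 20 and 0.002 whenever it contains 2, which is why a bounded interval such as $[1, 2)$ cannot be such an event.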
Properties (i), (ii) and (iv) follow easily by (6), but (iii) warrants a closer inspection. The square root of a set in $\mathcal{M}$ may consist of two "parts," and similarly for higher roots. For example, if

$$S = \{D_1 = 1\} = \bigcup_{n=-\infty}^{\infty} [1, 2) \times 10^n,$$

then

$$S^{1/2} = \bigcup_{n=-\infty}^{\infty} [1, \sqrt{2}) \times 10^n \;\cup\; \bigcup_{n=-\infty}^{\infty} [\sqrt{10}, \sqrt{20}) \times 10^n \in \mathcal{M},$$

but

$$S^2 = \bigcup_{n=-\infty}^{\infty} [1, 4) \times 10^{2n} \notin \mathcal{M},$$

since it has gaps which are too large and thus cannot be written in terms of $\{D_1, D_2, \ldots\}$. Just as property (ii) is the key to the hypothesis of scale invariance, property (iv) is the key to a hypothesis of base invariance, which will be described below.
(Although the space $\mathbb{R}^+$ is emphasized above, the analogous mantissa $\sigma$-algebra on the positive integers $\mathbb{N}$ is essentially the same and as such removes the countable-additivity density problem on $\mathbb{N}$ since nonempty finite sets are not in the domain of the probability function.)
SCALE AND BASE INVARIANCE
With the proper measurability structure now identified, a rigorous notion of scale invariance is easy to state. Recall (7)(ii) that $\mathcal{M}$ is closed under scalar multiplication.
DEFINITION 1. A probability measure $P$ on $(\mathbb{R}^+, \mathcal{M})$ is scale invariant if $P(S) = P(sS)$ for all $s > 0$ and all $S \in \mathcal{M}$.
In fact, scale invariance characterizes the general significant-digit law (4).
THEOREM 1 (Hill, 1995a). A probability measure $P$ on $(\mathbb{R}^+, \mathcal{M})$ is scale invariant if and only if

$$P\Big(\bigcup_{n=-\infty}^{\infty} [1, t) \times 10^n\Big) = \log_{10} t \quad \text{for all } t \in [1, 10). \tag{8}$$
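One direction of Theorem 1 can be illustrated numerically: the logarithmic measure in (8) assigns the same value to $S$ and to $sS$. The sketch below represents a set $\bigcup_n [a, b) \times 10^n$ by its pieces inside $[1, 10)$. The "uniform mantissa" measure included for contrast is a hypothetical comparison introduced here, not something taken from the article.

```python
from math import log10, isclose

def mantissa_of(s):
    """Rescale s > 0 by powers of 10 into [1, 10)."""
    while s >= 10:
        s /= 10
    while s < 1:
        s *= 10
    return s

def scaled_pieces(a, b, s):
    """Intervals in [1, 10) generating s * (union of [a, b) x 10^n), for 1 <= a < b <= 10."""
    shift = mantissa_of(s)                 # only the mantissa of s matters, by (7)(iv)
    lo, hi = a * shift, b * shift
    if hi <= 10:
        return [(lo, hi)]
    if lo >= 10:
        return [(lo / 10, hi / 10)]
    return [(lo, 10.0), (1.0, hi / 10)]    # the scaled interval wraps past 10

def log_measure(pieces):
    """Logarithmic measure from (8): each piece [lo, hi) has mass log10(hi / lo)."""
    return sum(log10(hi / lo) for lo, hi in pieces)

def uniform_mantissa_measure(pieces):
    """Hypothetical contrast: mass proportional to length within [1, 10)."""
    return sum(hi - lo for lo, hi in pieces) / 9

# S = {D1 = 1} corresponds to a = 1, b = 2.
log_values = [log_measure(scaled_pieces(1, 2, s)) for s in (1, 2, 3.7, 0.05, 7, 123.0)]
uniform_values = [uniform_mantissa_measure(scaled_pieces(1, 2, s)) for s in (1, 2)]
```

Every entry of `log_values` equals $\log_{10} 2$ up to rounding, whereas `uniform_values` changes from 1/9 to 2/9 when the set is doubled, so the contrast measure is not scale invariant.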
One possible drawback to a hypothesis of scale invariance in tables of "universal constants," however, is the special role played by the constant 1. For example, consider the two physical laws $f = ma$ and $e = mc^2$. Both laws involve universal constants, but the force equation constant 1 is not recorded in most tables, whereas the speed of light constant $c$ is. If a "complete" list of universal physical constants also included the 1's, it seems plausible that this special constant might occur with strictly positive frequency. However, that would violate scale invariance, since then the constant 2 (and all other constants) would occur with this same positive probability.
Instead, suppose it is assumed that any reasonable universal significant-digit law should be base invariant, that is, should be equally valid when rewritten in terms of bases other than 10. In fact, all of the classical arguments supporting Benford's law carry over mutatis mutandis (Raimi, 1976, page 536) to other bases. As will be seen shortly, the hypothesis of base invariance characterizes mixtures of Benford's law and a Dirac probability measure on the special constant 1, which may occur with positive probability.
To motivate the definition of base invariance, consider the set $\{D_1 = 1\}$ of positive numbers with leading significant digit 1 (base 10). This same set of numbers can also be written as [cf. (5)]

$$\{D_1 = 1\} = \bigcup_{n=-\infty}^{\infty} [1, 2) \times 100^n \;\cup\; \bigcup_{n=-\infty}^{\infty} [10, 20) \times 100^n,$$
that is, $\{D_1 = 1\}$ is also the set of positive numbers whose leading significant digit (base 100) is in the set $\{1, 10, 11, \ldots, 19\}$. In general, every set of real numbers $S$ (base 10) in $\mathcal{M}$ is exactly the same set as the set of real numbers $S^{1/2}$ (base 100) in $\mathcal{M}$. Thus if a probability is base invariant, the measure of any given set of real numbers (in the mantissa $\sigma$-algebra $\mathcal{M}$) should be the same for all bases and, in particular, for bases which are powers of the original base. This suggests the following natural definition [recall that $\mathcal{M}$ is also closed under integral roots, property (7)(iii)].
DEFINITION 2. A probability measure $P$ on $(\mathbb{R}^+, \mathcal{M})$ is base invariant if $P(S) = P(S^{1/n})$ for all positive integers $n$ and all $S \in \mathcal{M}$.

Next, observe that the set of numbers

$$S_1 = \{D_1 = 1,\ D_j = 0 \text{ for all } j > 1\} = \{\ldots, 0.01, 0.1, 1, 10, 100, \ldots\} = \bigcup_{n=-\infty}^{\infty} \{1\} \times 10^n \in \mathcal{M}$$

has [by (6)] no nonempty $\mathcal{M}$-measurable subsets, so the Dirac delta measure $\delta_1$ of this set is well defined. [Here $\delta_1(S) = 1$ if $S \supseteq S_1$ and $= 0$ otherwise, for all $S \in \mathcal{M}$.] Letting $P_L$ denote the logarithmic probability distribution on $(\mathbb{R}^+, \mathcal{M})$ given in (8), a complete characterization for base-invariant significant-digit probability measures can now be given.

THEOREM 2 (Hill, 1995a). A probability measure $P$ on $(\mathbb{R}^+, \mathcal{M})$ is base invariant if and only if

$$P = qP_L + (1 - q)\delta_1 \quad \text{for some } q \in [0, 1].$$

From Theorems 1 and 2 it is easily seen that scale invariance implies base invariance, but not conversely (e.g., $\delta_1$ is clearly base but not scale invariant).

The proof of Theorem 1 follows easily from the fact that scale invariance corresponds to invariance under irrational rotations $x \to (x + s) \pmod 1$ on the circle, and the unique invariant probability measure under this transformation is well known to be the uniform (Lebesgue) measure, which in turn corresponds to the log mantissa distribution. Proof of Theorem 2 is slightly

at 0. [A number of basic questions concerning invariance under multiplication are still open, such as Furstenberg's 25-year-old conjecture that the uniform distribution on $[0, 1)$ is the only atomless probability distribution invariant under both $2x \pmod 1$ and $3x \pmod 1$.]

RANDOM SAMPLES FROM RANDOM DISTRIBUTIONS

Theorems 1 and 2 may be clean mathematically, but they hardly help explain the appearance of Benford's law empirically. What do 1990 census populations of U.S. counties have in common with 1880 users of logarithm tables, numerical data from front-page newspaper articles of the 1930s collected by Benford or universal physical constants examined by Knuth in the 1960s? Why should these tables be logarithmic or, equivalently, scale or base invariant? Many tables are not of this form, including even Benford's individual tables (as he noted), but as Raimi (1969) pointed out, "what came closest of all, however, was the union of all his tables." Combine the molecular weight tables with baseball statistics and areas of rivers, and then there is a good fit. Many of the previous explanations of Benford's law have hypothesized some universal table of constants, Raimi's (1985, page 217) "stock of tabular data in the world's libraries" or Knuth's (1969) "some imagined set of real numbers," and tried to prove why certain specific sets of real observations were representative of either this mystical universal table or the set of all real numbers.

What seems more natural is to think of data as coming from many different distributions, as was clearly the case in Benford's (1938) study in his "effort to collect data from as many fields as possible and to include a wide variety of types" (page 552); "the range of subjects studied and tabulated was as wide as time and energy permitted" (page 554).

Recall that a (real Borel) random probability measure (r.p.m.) $\mathbb{M}$ is a random vector [on an underlying probability space $(\Omega, \mathcal{F}, P)$] taking values which are Borel probability measures on $\mathbb{R}$ and which is regular in the sense that for each Borel set $B \subset \mathbb{R}$, $\mathbb{M}(B)$ is a random variable (cf. Kallenberg, 1983).

DEFINITION 3. The expected distribution measure of a r.p.m. $\mathbb{M}$ is the probability measure $E\mathbb{M}$ (on the
since base
morecomplicated,
Theorem22 is
complicated,since
slightlymore
Borel
subsets
of
~
)
defined
by
of
defined
subsets
by
Borel
DR)
invariance
multito invariance
invarianceunder
under multicorrespondsto
invariancecorresponds
plication x
x -+
-> nx
used here
here
nx (mod
The key
key tool
tool used
(mod 1).
1). The
plication
B c ~
(9)
for
forall
all Borel
Borel B
lR
E(M(B))
(9) (EM)(B)
(EM)(B) == E(M(B))
(Hill,
probais that
Borel probathat aa Borel
Proposition4.1)
4.1) is
(Hill, 1995a,
1995a, Proposition
[where
E(·)
denotesexpectaand throughout,
here and
bility Q
underthe
the mappings
on [0,
is invariant
invariantunder
expectaE(.) denotes
mappings
throughout,
[wherehere
1) is
bility
Q on
[0, 1)
tion
probability
P on
the underlying
to P
on the
tionwith
withrespect
is aa convex
convex
probability
nx
underlying
if 'and
if Q
respectto
forall
all nn if
and only
nx (mod
onlyif
Q is
(mod1)
1) for
space].
mass
combination
point mass
space].
of uniform
uniformmeasure
measure and
and point
combinationof
For example, if $M$ is a random probability which is $U[0, 1]$ with probability 1/2 and otherwise is an exponential distribution with mean 1, then $EM$ is simply the continuous distribution with density $f(x) = (1 + e^{-x})/2$ for $0 \le x \le 1$ and $f(x) = e^{-x}/2$ for $x > 1$.
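The mixture in this example can be checked numerically. The following is a minimal sketch, not part of the original paper; the function names `em_density`, `em_cdf` and `draw_from_em` are hypothetical and chosen only for illustration. It writes the density of $EM$ as the average of the two component densities and draws from $EM$ by first drawing a realization of $M$.

```python
import math
import random

def em_density(x: float) -> float:
    """Density of EM when M is U[0,1] w.p. 1/2 and Exp(mean 1) otherwise."""
    if x < 0:
        return 0.0
    uniform_part = 1.0 if x <= 1.0 else 0.0      # density of U[0, 1]
    return 0.5 * uniform_part + 0.5 * math.exp(-x)

def em_cdf(x: float) -> float:
    """(EM)((-inf, x]) = E[M((-inf, x])]: the average of the two c.d.f.s."""
    if x < 0:
        return 0.0
    return 0.5 * min(x, 1.0) + 0.5 * (1.0 - math.exp(-x))

def draw_from_em(rng: random.Random) -> float:
    """Draw a realization of M first, then one value from that realization."""
    if rng.random() < 0.5:
        return rng.random()          # the realization of M is U[0, 1]
    return rng.expovariate(1.0)      # the realization of M is Exp(mean 1)
```

A Monte Carlo frequency such as the fraction of draws not exceeding 1 should then be close to `em_cdf(1.0)`, which is the value (9) assigns to $B = (-\infty, 1]$.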
The next definition plays a central role in this section and formalizes the concept of the following natural process which mimics Benford's data-collection procedure: pick a distribution at random and take a sample of size $k$ from this distribution; then pick a second distribution at random and take a sample of size $k$ from this second distribution and so forth.
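The sampling process just described can be sketched in a few lines. This is an illustrative sketch only: a realization of the r.p.m. is represented here by a function that returns one draw from it, and the names `m_random_k_samples` and `make_example_rpm` are hypothetical. The r.p.m. used is the $U[0, 1]$ / exponential mixture from the example above.

```python
import random
from typing import Callable, Iterator

# A realized distribution is represented by a sampler: a function that
# returns one draw from it (an assumption made for this sketch).
Sampler = Callable[[], float]

def m_random_k_samples(draw_distribution: Callable[[], Sampler],
                       k: int, blocks: int) -> Iterator[float]:
    """Yield X_1, X_2, ...: per block, pick a distribution, then k draws."""
    for _ in range(blocks):
        sample_once = draw_distribution()   # pick a distribution at random
        for _ in range(k):
            yield sample_once()             # sample of size k from it

def make_example_rpm(rng: random.Random) -> Callable[[], Sampler]:
    """Illustrative r.p.m.: U[0,1] w.p. 1/2, otherwise Exp(mean 1)."""
    def draw_distribution() -> Sampler:
        if rng.random() < 0.5:
            return rng.random
        return lambda: rng.expovariate(1.0)
    return draw_distribution

rng = random.Random(0)
xs = list(m_random_k_samples(make_example_rpm(rng), k=3, blocks=4))
print(len(xs))  # 12
```

Consecutive values inside one block come from the same realized distribution, which is why such sequences are in general not independent.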
DEFINITION 4. For an r.p.m. $M$ and positive integer $k$, a sequence of $M$-random $k$-samples is a sequence of random variables $X_1, X_2, \ldots$ on $(\Omega, \mathcal{F}, P)$ so that for some i.i.d. sequence $M_1, M_2, M_3, \ldots$ of r.p.m.'s with the same distribution as $M$ and for each $j = 1, 2, \ldots$,

(10) given $M_j = P$, the random variables $X_{(j-1)k+1}, \ldots, X_{jk}$ are i.i.d. with d.f. $P$;

and

(11) $X_{(j-1)k+1}, \ldots, X_{jk}$ are independent of $\{M_i, X_{(i-1)k+1}, \ldots, X_{ik}\}$ for all $i \ne j$.

The following lemma shows the somewhat curious structure of such sequences.

LEMMA 1. Let $X_1, X_2, \ldots$ be a sequence of $M$-random $k$-samples for some $k$ and some r.p.m. $M$. Then:

(i) the $\{X_n\}$ are a.s. identically distributed with distribution $EM$, but are not in general independent;

(ii) given $\{M_1, M_2, \ldots\}$, the $\{X_n\}$ are a.s. independent, but are not in general identically distributed.

PROOF. The first part of (ii) follows easily by (10) and (11); the second part follows since whenever $M_i \ne M_j$, $X_{ik}$ will not have the same distribution as $X_{jk}$. The first part of (i) follows by conditioning on $M_j$:

$P(X_j \in B) = E[M_j(B)] = E[M(B)]$ for all Borel $B \subset \mathbb{R}$,

where the last equality follows since $M_j$ has the same distribution as $M$. The second part of (i) follows from the fact that i.i.d. samples from a distribution may give information about the distribution, as seen in the next example. □

In general, sequences of $M$-random $k$-samples are not independent, not exchangeable, not Markov, not martingale and not stationary sequences.

EXAMPLE. Let $M$ be a random measure which is the Dirac probability measure $\delta(1)$ at 1 with probability 1/2, and which is $(\delta(1) + \delta(2))/2$ otherwise, and let $k = 3$. Then $P(X_2 = 2) = 1/4$, but $P(X_2 = 2 \mid X_1 = 2) = 1/2$, so $X_1, X_2$ are not independent. Since

$P((X_1, X_2, X_3, X_4) = (1, 1, 1, 2)) = 9/64 > 3/64 = P((X_1, X_2, X_3, X_4) = (2, 1, 1, 1))$,

the $\{X_n\}$ are not exchangeable; since

$P(X_3 = 1 \mid X_1 = X_2 = 1) = 9/10 > 5/6 = P(X_3 = 1 \mid X_2 = 1)$,

the $\{X_n\}$ are not Markov; since

$E(X_2 \mid X_1 = 2) = 3/2$,

the $\{X_n\}$ are not a martingale; and since

$P((X_1, X_2, X_3) = (1, 1, 1)) = 9/16 > 15/32 = P((X_2, X_3, X_4) = (1, 1, 1))$,

the $\{X_n\}$ are not stationary.

The next lemma is simply the statement of the intuitively plausible fact that the empirical distribution of $M$-random $k$-samples converges to the expected distribution of $M$; that this is not completely trivial follows from the independence-identically distributed dichotomy stated in Lemma 1. If $k = 1$, it is just the Bernoulli case of the strong law of large numbers.

LEMMA 2. Let $M$ be a r.p.m., and let $X_1, X_2, \ldots$ be a sequence of $M$-random $k$-samples for some $k$. Then

$\lim_{n \to \infty} \dfrac{\#\{i \le n: X_i \in B\}}{n} = E[M(B)]$ a.s. for all Borel $B \subset \mathbb{R}$.

PROOF. Fix $B$ and $j \in \mathbb{N}$, and let

$Y_j = \#\{m, 1 \le m \le k: X_{(j-1)k+m} \in B\}$.

Clearly,

(12) $\lim_{n \to \infty} \dfrac{\#\{i \le n: X_i \in B\}}{n} = \lim_{m \to \infty} \dfrac{\sum_{j=1}^{m} Y_j}{km}$ (if the limit exists).
By (10), given $M_j$, $Y_j$ is binomially distributed with parameters $k$ and $E[M_j(B)]$, so by (9),

(13) $EY_j = E(E(Y_j \mid M_j)) = kE[M(B)]$ a.s. for all $j$,

since $M_j$ has the same distribution as $M$. By (11), the $\{Y_j\}$ are independent. Since they have [via (13)] identical means $kE[M(B)]$ and are uniformly bounded [so $\sum_{j=1}^{\infty} \mathrm{Var}(Y_j)/j^2 < \infty$], it follows (cf. Loève, 1977, page 250) that

(14) $\lim_{m \to \infty} \dfrac{\sum_{j=1}^{m} Y_j}{m} = kE[M(B)]$ a.s.,

and the conclusion follows by (12) and (14). □

An even shorter proof can be based on the observation that the variables $X_i, X_{k+i}, X_{2k+i}, \ldots$ are i.i.d. for all $1 \le i \le k$, but the argument given above can be easily modified to show that the assumption that each $M_j$ is sampled exactly $k$ times is not essential; if the $j$th r.p.m. is sampled $K_j$ times, where the $\{K_j\}$ are independent uniformly bounded $\mathbb{N}$-valued random variables (which are also independent of the rest of the process), then the same conclusion holds.

A NEW STATISTICAL DERIVATION

The stage is now set to give a new statistical limit law (Theorem 3 below) which is a central-limit-like theorem for significant digits. Roughly speaking, this law says that if probability distributions are selected at random and random samples are then taken from each of these distributions in any way so that the overall process is scale (or base) neutral, then the significant-digit frequencies of the combined sample will converge to the logarithmic distribution. This theorem helps explain and predict the appearance of the logarithmic distribution in significant digits of tabulated data.

DEFINITION 5. A sequence of random variables $X_1, X_2, \ldots$ has scale-neutral mantissa frequency if

$n^{-1}\,|\#\{i \le n: X_i \in S\} - \#\{i \le n: X_i \in sS\}| \to 0$ a.s.

for all $s > 0$ and all $S \in \mathcal{M}$, and has base-neutral mantissa frequency if

$n^{-1}\,|\#\{i \le n: X_i \in S\} - \#\{i \le n: X_i \in S^{1/m}\}| \to 0$ a.s.

for all $m \in \mathbb{N}$ and $S \in \mathcal{M}$.

For example, if $\{X_n\}$, $\{Y_n\}$ and $\{Z_n\}$ are the sequences of (constant) random variables defined by $X_n \equiv 1$, $Y_n \equiv 2$ and $Z_n = 2^n$, then $\{X_n\}$ has base- but not scale-neutral mantissa frequency, $\{Y_n\}$ has neither and (by Theorem 1 above and Theorem 1 of Diaconis, 1977) $\{Z_n\}$ has both.

Mathematical examples of scale-neutral and scale-biased processes are easy to construct, as will be described below. For a real-life example, pick a beverage-producing company in continental Europe at random and look at the metric volumes of a sample of $k$ of its products; then pick a second company and so forth. Since product volumes in this case are probably closely related to liters, this (random $k$-sample) process is most likely not scale neutral and conversion to another unit such as gallons would probably yield a radically different set of first-digit frequencies. On the other hand, if species of mammals in Europe are selected at random and their metric volumes sampled, it seems less likely that this second process is related to the choice of units.

Similarly, base-neutral and base-biased processes are also easy to construct mathematically. The question of base-neutrality is most interesting when the units in question are universally agreed upon, such as the numbers of things. For real-life examples, picking cities at random and looking at the number of fingers of $k$-samples of people from those cities is certainly heavily base-10 dependent (that is where base 10 originated), whereas picking cities at random and looking at the number of leaves of $k$-samples of trees from those cities is probably less base dependent. As will be seen in the next theorem, scale and base neutrality of random $k$-samples are essentially equivalent to scale and base unbiasedness of the underlying r.p.m. $M$.

DEFINITION 6. An r.p.m. $M$ is scale unbiased if its expected distribution $EM$ is scale invariant on $(\mathbb{R}^+, \mathcal{M})$ and is base unbiased if $EM$ is base invariant on $(\mathbb{R}^+, \mathcal{M})$. [Recall that $\mathcal{M}$ is a sub-$\sigma$-algebra of the Borels, so every Borel probability on $\mathbb{R}$ (such as $EM$) induces a unique probability on $(\mathbb{R}^+, \mathcal{M})$.]

A crucial point here is that the definition of scale and base unbiased does not require that individual realizations of $M$ be scale or base invariant; in fact it is often the case [see Benford's (1938) data and example below] that none of the realizations is scale invariant, but only that the sampling process on the average does not favor one scale over another.

Now for the main new statistical result: here $M(t)$ denotes the random variable $M(D_t)$, where $D_t = \bigcup_{n=-\infty}^{\infty} [1, t) \times 10^n$ is the set of positive numbers with mantissae in $[1/10, t/10)$. [Thus in light of the representation (6), $M(t)$ may be viewed as the random cumulative distribution function for the mantissae of the r.p.m. $M$.]
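As a concrete reading of the set $D_t$, the sketch below tests membership of a positive number in $D_t$ and computes the fraction of a finite sample that falls in $D_t$. It is an illustration under stated assumptions rather than anything from the paper: the helper names `significand`, `in_D_t` and `empirical_M_t` are hypothetical, and the significand is scaled to $[1, 10)$, which is ten times the mantissa in $[1/10, 1)$ used in the text.

```python
import math
from typing import Iterable

def significand(x: float) -> float:
    """Return s in [1, 10) with x = s * 10**n for some integer n (x > 0)."""
    if x <= 0:
        raise ValueError("x must be positive")
    n = math.floor(math.log10(x))
    s = x / 10.0 ** n
    # Guard against floating-point rounding near exact powers of ten.
    if s >= 10.0:
        s /= 10.0
    elif s < 1.0:
        s *= 10.0
    return s

def in_D_t(x: float, t: float) -> bool:
    """True if x lies in D_t, i.e. its significand is in [1, t)."""
    return significand(x) < t

def empirical_M_t(samples: Iterable[float], t: float) -> float:
    """Fraction of the samples lying in D_t (an empirical analogue of M(t))."""
    values = list(samples)
    return sum(1 for x in values if in_D_t(x, t)) / len(values)
```

For instance, `empirical_M_t([1.5, 25, 0.4, 900], 3)` counts 1.5 and 25 (significands 1.5 and 2.5) but not 0.4 or 900.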
THEOREM 3 (Log-limit law for significant digits). Let $M$ be an r.p.m. on $(\mathbb{R}^+, \mathcal{M})$. The following are equivalent:

(i) $M$ is scale unbiased;

(ii) $M$ is base unbiased and $EM$ is atomless;

(iii) $E[M(t)] = \log_{10} t$ for all $t \in [1, 10)$;

(iv) every $M$-random $k$-sample has scale-neutral mantissa frequency;

(v) $EM$ is atomless, and every $M$-random $k$-sample has base-neutral mantissa frequency;

(vi) for every $M$-random $k$-sample $X_1, X_2, \ldots$,

$n^{-1}\,\#\{i \le n: \mathrm{mantissa}(X_i) \in [1/10, t/10)\} \to \log_{10} t$ a.s. for all $t \in [1, 10)$.
PROOF. (i) $\Leftrightarrow$ (iii). Immediate by Definitions 1 and 6 and Theorem 1.

(ii) $\Leftrightarrow$ (iii). It follows easily from (6) that the Borel probability $EM$ is atomless if and only if it is atomless on $\mathcal{M}$. That (ii) is equivalent to (iii) then follows easily by Definitions 2 and 6 and Theorem 2.

(iii) $\Leftrightarrow$ (iv). By Lemma 2,

$A_n := n^{-1}\,\#\{i \le n: X_i \in S\} \to E[M(S)]$ a.s.,

and

$B_n := n^{-1}\,\#\{i \le n: X_i \in sS\} \to E[M(sS)]$ a.s.,

so $|A_n - B_n| \to 0$ a.s. if and only if $EM(S) = EM(sS)$, which by Definition 1 and Theorem 1 is equivalent to (iii).

(iii) $\Leftrightarrow$ (v). Similar, using Lemma 2, Definition 2 and Theorem 2.

(iii) $\Leftrightarrow$ (vi). Immediate by Lemma 2. □
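The implication (iii) $\Rightarrow$ (vi) can be illustrated by simulation. The sketch below assumes a simple r.p.m. that is not taken from the paper: each realization is the point mass at $10^U$ with $U$ uniform on $[0, 1)$. No realization is scale invariant, but $E[M(t)] = P(10^U < t) = \log_{10} t$, so (iii) holds. The function name `first_digit_frequencies` is hypothetical.

```python
import math
import random
from typing import Dict

def first_digit_frequencies(k: int, blocks: int, seed: int = 0) -> Dict[int, float]:
    """First-digit frequencies of an M-random k-sample for the assumed r.p.m."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(blocks):
        value = 10.0 ** rng.random()    # realization of M: point mass at value
        digit = min(int(value), 9)      # value is in [1, 10); guard rounding
        counts[digit] += k              # a k-sample from a point mass repeats it
    n = k * blocks
    return {d: counts[d] / n for d in range(1, 10)}

freqs = first_digit_frequencies(k=3, blocks=200_000)
for d in range(1, 10):
    # Compare the empirical frequency with log10(1 + 1/d).
    print(d, round(freqs[d], 4), round(math.log10(1 + 1 / d), 4))
```

With a few hundred thousand blocks the two printed columns should agree to about two decimal places, although every individual realization concentrates on a single number.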
One of the points of Theorem 3 is that there are many (natural) sampling procedures which lead to the log distribution, helping explain how the different empirical evidence of Newcomb, Benford, Knuth and Nigrini all led to the same law. This may also help explain why sampling the numbers from newspaper front pages (Benford, 1938, page 556), or almanacs or extensive accounting data often tends toward the log distribution, since in each of these cases various distributions are being sampled in a presumably unbiased way. Perhaps the first article in the newspaper has statistics about population growth, the second article about stock prices and the third about forest acreage. None of these individual distributions itself may be unbiased, but the mixture may well be.
Justification of the hypothesis of scale or base unbiasedness is akin to justification of the hypothesis of independence (and identical distribution) in applying the strong law of large numbers or central limit theorem to real-life processes: neither hypothesis can be proved, yet in many real-life sampling procedures, they appear to be reasonable assumptions. Conversely, Theorem 3 suggests a straightforward test for unbiasedness of data: simply test goodness-of-fit to the logarithmic distribution.
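One common way to carry out such a goodness-of-fit test is Pearson's chi-square statistic on first digits. The paper does not prescribe a particular test, so the sketch below is only one possible choice; the names `first_digit` and `benford_chi_square` are hypothetical, and the critical value 15.507 is the upper 5% point of the chi-square distribution with 8 degrees of freedom.

```python
import math
from typing import Iterable

# Logarithmic first-digit probabilities log10(1 + 1/d), d = 1, ..., 9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
CHI2_CRIT_8DF_5PCT = 15.507   # upper 5% point, chi-square with 8 d.f.

def first_digit(x: float) -> int:
    """First significant digit of a nonzero number, via scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(data: Iterable[float]) -> float:
    """Pearson chi-square statistic of first digits against the log law."""
    values = [x for x in data if x != 0]
    counts = {d: 0 for d in range(1, 10)}
    for x in values:
        counts[first_digit(x)] += 1
    n = len(values)
    return sum((counts[d] - n * BENFORD[d]) ** 2 / (n * BENFORD[d])
               for d in range(1, 10))

powers_of_two = [float(2 ** n) for n in range(1, 1001)]
three_digit_integers = list(range(100, 1000))
print(benford_chi_square(powers_of_two) < CHI2_CRIT_8DF_5PCT)         # True
print(benford_chi_square(three_digit_integers) < CHI2_CRIT_8DF_5PCT)  # False
```

A statistic below the critical value means the logarithmic distribution is not rejected at the 5% level; it does not by itself prove that the sampling process is unbiased.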
Many standard constructions of r.p.m.'s are automatically scale and base neutral, and thus satisfy the log-limit significant-digit law. Consider the problem of generating a random variable $X$ (or r.p.m.) on $[1, 10)$. If the units chosen are desired to be just as likely stock per dollars as dollars per stock [or Benford's (1938) "candles per watt" versus "watts per candle"], then the distribution generated should be reciprocal invariant, so for example its $\log_{10}$ should be symmetric about 1/2. So first set $F(1) = 0$ and $F(10^-) = 1$; next pick $F(\sqrt{10})$ randomly [according to, say, uniform measure on $(0, 1)$] since $\sqrt{10}$ is the reciprocal-invariant point $t = 10/t$; then pick $F(10^{1/4})$ and $F(10^{3/4})$, independently and uniformly on $(0, F(\sqrt{10}))$ and $(F(\sqrt{10}), 1)$, respectively, and continue in this manner. This classical construction of Dubins and Freedman (1967, Lemma 9.28) is known to generate an r.p.m. a.s. whose expected distribution $EM$ is the logarithmic probability $P_L$ of (8), and hence by Theorem 3 is scale and base unbiased, even though with probability 1 every distribution generated this way will be both scale and base biased. On the average, this r.p.m. is unbiased, so the log-limit significant-digit law will apply to all $M$-random $k$-samples. [The construction described above using uniform measure is not crucial. Any base measure on $(0, 1)$ symmetric about 1/2 will have the same property (Dubins and Freedman, 1967, Theorem 9.29).]
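The recursive construction above can be sketched on the $\log_{10}$ scale, writing $G(u) = F(10^u)$ for $u \in [0, 1]$ and truncating the recursion at a finite depth. This is a simplified illustration with hypothetical names (`dubins_freedman_cdf`, `average_cdf`), not the full Dubins-Freedman construction. Averaging many realizations at the dyadic points should give values close to $u$, that is, $E[F(t)] \approx \log_{10} t$.

```python
import random
from typing import List

def dubins_freedman_cdf(depth: int, rng: random.Random) -> List[float]:
    """One random c.d.f. G at the points j / 2**depth, j = 0, ..., 2**depth."""
    size = 2 ** depth
    g = [0.0] * (size + 1)
    g[size] = 1.0                        # F(1) = 0 and F(10-) = 1
    step = size
    while step > 1:
        half = step // 2
        for lo in range(0, size, step):
            hi = lo + step
            # The midpoint value is uniform between its two neighbours.
            g[lo + half] = rng.uniform(g[lo], g[hi])
        step = half
    return g

def average_cdf(depth: int, realizations: int, seed: int = 0) -> List[float]:
    """Pointwise average of many realizations, an estimate of E[G(u)]."""
    rng = random.Random(seed)
    size = 2 ** depth
    total = [0.0] * (size + 1)
    for _ in range(realizations):
        for j, value in enumerate(dubins_freedman_cdf(depth, rng)):
            total[j] += value
    return [value / realizations for value in total]

avg = average_cdf(depth=3, realizations=20_000)
for j, value in enumerate(avg):
    print(j / 8, round(value, 3))   # second column should be close to the first
```

Each single realization is typically far from the diagonal, which matches the remark that every generated distribution is biased while the average is not.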
Also, many significant-digit data sets other than random $k$-samples have scale- or base-neutral mantissa frequency, in which case combining such data together with unbiased random $k$-samples (as did Benford, perhaps, in combining data from mathematical tables with that from newspaper statistics) will still result in convergence to the logarithmic distribution. For example, if certain data represents (deterministic) periodic sampling of a geometric process (e.g., $X_n = 2^n$), then by Theorem 1 of Diaconis (1977), this deterministic process is a strong Benford sequence, which implies that its limiting frequency (separately or averaged with unbiased random $k$-samples) will satisfy (4).
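The behaviour of the geometric sequence $X_n = 2^n$ is easy to inspect directly. The following sketch (function names hypothetical) counts leading digits of $2^1, \ldots, 2^N$ with exact integer arithmetic and reports the largest gap between the observed first-digit frequencies and $\log_{10}(1 + 1/d)$.

```python
import math
from collections import Counter

def leading_digit_counts(n_terms: int) -> Counter:
    """Count the first digits of 2**1, ..., 2**n_terms using exact integers."""
    counts: Counter = Counter()
    power = 1
    for _ in range(n_terms):
        power *= 2
        counts[int(str(power)[0])] += 1
    return counts

def max_deviation_from_log_law(n_terms: int) -> float:
    """Largest |observed frequency - log10(1 + 1/d)| over d = 1, ..., 9."""
    counts = leading_digit_counts(n_terms)
    return max(abs(counts[d] / n_terms - math.log10(1 + 1 / d))
               for d in range(1, 10))

# n_terms is kept modest so the decimal strings stay short.
print(max_deviation_from_log_law(3000) < 0.01)   # True
```

This checks only first digits; the strong Benford property cited from Diaconis (1977) concerns the whole mantissa distribution.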
An interesting open problem is to determine which common distributions (or mixtures thereof) satisfy Benford's law, that is, are scale or base invariant or which have mantissas with logarithmic distributions. For example, the standard Cauchy distribution is close to satisfying Benford's law (cf. Raimi, 1976) and the standard Gaussian is not, but perhaps certain natural mixtures of some common distributions are.
Of course there are many r.p.m.'s and sampling processes which do not satisfy the log-limit law (and hence are necessarily both scale and base biased), such as the (a.s.) constant uniform distribution on $[1, 10)$ or (for some reason not yet well understood by the author) the r.p.m. constructed via Dubins-Freedman with base probability uniform measure on the horizontal bisector of the rectangle, which has expected log distribution a renormalized arc-sin distribution (Dubins and Freedman, 1967, Theorem 9.21).
APPLICATIONS

The statistical log-limit significant-digit law Theorem 3 may help justify some of the recent applications of Benford's law, several of which will now be described.
In scientific calculating, if the distribution of input data into a central processing station is known, then this information can be used to design a computer which is optimal (in any of a number of ways) with respect to that distribution. Thus if the computer users are like the log-table users of Newcomb or the taxpayers of Nigrini's study, their data represent an unbiased (as to units, base, reciprocity, ...) random mixture of various distributions, in which case it will (by Theorem 3) necessarily follow Benford's law. Once a specific input distribution has been identified, in this case the logarithmic distribution, then that information can be exploited to improve computer design. Feldstein and Turner (1986) show that

under the assumption of the logarithmic distribution of numbers, floating-point addition and subtraction can result in overflow or underflow with alarming frequency ... and lead to the suggestion of a long word format which will reduce the risks to acceptable levels.
Schatte (1988) concludes that under assumption of logarithmic input, base b = 2^3 is optimal with respect to minimizing storage space. Knuth (1969), after having "established the logarithmic law for integers by direct calculation," leaves as an exercise (page 228) determining the desirability of hexadecimal versus binary with respect to different objectives. Barlow and Bareiss (1985) conclude that the logarithmic computer has smaller error confidence intervals for roundoff errors than a floating point computer with the same computer word size and approximately the same number range.
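The claim that an unbiased random mixture of distributions follows Benford's law can be checked numerically. The sketch below is only an illustration under stated assumptions, not the construction used in Theorem 3: the three component distributions, the seed and the sample size are arbitrary choices, and "unbiased as to units" is modelled in the simplest possible way, by multiplying each draw by an independent scale factor 10^U with U uniform on [0, 1). With such a scale factor the mantissa of the product is logarithmically distributed whatever the underlying distribution is.

```python
import math
import random


def first_digit(x):
    """Return the first significant decimal digit of a positive float."""
    # Scientific notation puts the leading significant digit first.
    return int(f"{x:.15e}"[0])


def draw_from_mixture(rng):
    """Draw one value from a randomly chosen, randomly rescaled distribution.

    The three components are arbitrary illustrative choices (assumptions).
    """
    kind = rng.choice(("uniform", "lognormal", "pareto"))
    if kind == "uniform":
        x = rng.uniform(1.0, 7.0)
    elif kind == "lognormal":
        x = rng.lognormvariate(0.0, 1.0)
    else:
        x = rng.paretovariate(2.0)
    # Assumed model of "unbiased as to units": a scale factor whose
    # base-10 logarithm is uniform on [0, 1).
    scale = 10.0 ** rng.uniform(0.0, 1.0)
    return scale * x


rng = random.Random(1995)  # fixed seed so the run is reproducible
n = 100_000
counts = [0] * 10
for _ in range(n):
    counts[first_digit(draw_from_mixture(rng))] += 1

observed = {d: counts[d] / n for d in range(1, 10)}
expected = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(d, round(observed[d], 4), round(expected[d], 4))
```

The printed columns compare the simulated first-digit frequencies with log10(1 + 1/d). Removing the random scale factor leaves a mixture that is no longer unbiased in this sense, and the agreement is then not guaranteed.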
A second modern application of Benford's law is to mathematical modelling, where goodness-of-fit against the logarithmic distribution has been suggested (cf. Varian, 1972) as a test of reasonableness of output of a proposed model, a sort of "Benford-in-Benford-out" criterion. In Nigrini and Wood's (1995) census tabulations, for example, the 1990 census populations of the counties in the United States follow the logarithmic significant-digit law very closely, so it seems reasonable that mathematical models for predicting future populations of the counties should also be a close fit to Benford. If not, perhaps a different model should be considered.
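A goodness-of-fit check of this kind can be sketched in a few lines. The example below assumes a Pearson chi-square statistic on first significant digits only, compared with the 5% critical value for 8 degrees of freedom; the text does not say which statistic Varian or Nigrini used, so the statistic, the threshold and the function names are assumptions made for illustration. The two data sets (powers of 2, and the integers 1 to 999) are also illustrative choices.

```python
import math

# First-digit law: P(first significant digit = d) = log10(1 + 1/d).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CRITICAL_5PCT_8DF = 15.507


def first_digit(x):
    """Return the first significant decimal digit of a nonzero number."""
    if x == 0:
        raise ValueError("zero has no first significant digit")
    if isinstance(x, int):
        return int(str(abs(x))[0])  # exact for arbitrarily large integers
    return int(f"{abs(x):.15e}"[0])


def chi_square_vs_benford(data):
    """Pearson chi-square statistic of first-digit counts against Benford."""
    counts = {d: 0 for d in range(1, 10)}
    for x in data:
        counts[first_digit(x)] += 1
    n = sum(counts.values())
    return sum(
        (counts[d] - n * BENFORD[d]) ** 2 / (n * BENFORD[d])
        for d in range(1, 10)
    )


# Illustrative inputs: powers of 2 are close to Benford; the integers
# 1..999 have equally frequent first digits and are not.
powers_of_two = [2 ** k for k in range(1000)]
uniform_digits = list(range(1, 1000))

stat_powers = chi_square_vs_benford(powers_of_two)
stat_uniform = chi_square_vs_benford(uniform_digits)

print("powers of 2 consistent with Benford:", stat_powers < CRITICAL_5PCT_8DF)
print("1..999 consistent with Benford:", stat_uniform < CRITICAL_5PCT_8DF)
```

A statistic above the critical value says only that the first digits are unlikely to have come from the logarithmic distribution; whether that points to a poor model, or to manipulated data, is a separate judgment.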
As one final example, Nigrini has amassed a vast collection of U.S. tax and accounting data including 91,022 observations of IRS-reported interest income (Nigrini, 1996), and share volumes (at the rate of 200-350 million per day) on the New York Stock Exchange (Nigrini, 1995), and in most of these cases the logarithmic distribution is an excellent fit (perhaps exactly because each is an unbiased mixture of data from different distributions). He postulates that Benford is often a reasonable distribution to expect for the significant digits of large accounting data sets and has proposed a goodness-of-fit test against Benford to detect fraud. In an article in the Wall Street Journal in July 1995 (Berton, 1995) it was announced that the District Attorney's office in Brooklyn, New York, using Nigrini's Benford goodness-of-fit tests, has detected and charged groups at seven New York companies with fraud. The Dutch IRS has expressed interest in using this Benford test to detect income tax fraud, and Nigrini has submitted proposals to the U.S. IRS.
ACKNOWLEDGMENTS
The author is grateful to the Free University of Amsterdam and especially Professor Piet Holewijn for their support and hospitality during the summer of 1995, and also is grateful to Pieter Allaart, David Gilat, Ralph Raimi and Peter Schatte for a number of suggestions, to Klaas van Harn for several corrections and valuable advice concerning notation and to an anonymous Associate Editor for excellent ideas for improving the exposition. This research was partially supported by NSF Grant DMS-95-03375.
REFERENCES
ADHIKARI, A. and SARKAR, B. (1968). Distribution of most significant digit in certain functions whose arguments are random variables. Sankhyā Ser. B 30 47-58.
BARLOW, J. and BAREISS, E. (1985). On roundoff error distributions in floating point and logarithmic arithmetic. Computing 34 325-347.
BECKER, P. (1982). Patterns in listings of failure-rate and MTTF values and listings of other data. IEEE Transactions on Reliability R-31 132-134.
BENFORD, F. (1938). The law of anomalous numbers. Proceedings of the American Philosophical Society 78 551-572.
BERTON, L. (1995). He's got their number: scholar uses math to foil financial fraud. Wall Street Journal, July 10.
BUCK, B., MERCHANT, A. and PEREZ, S. (1993). An illustration of Benford's first digit law using alpha decay half lives. European J. Phys. 14 59-63.
BURKE, J. and KINCANON, E. (1991). Benford's law and physical constants: the distribution of initial digits. Amer. J. Phys. 59 952.
COHEN, D. (1976). An explanation of the first digit phenomenon. J. Combin. Theory Ser. A 20 367-370.
COHEN, D. and KATZ, T. (1984). Prime numbers and the first digit phenomenon. J. Number Theory 18 261-268.
DE FINETTI, B. (1972). Probability, Induction and Statistics. Wiley, New York.
DIACONIS, P. (1977). The distribution of leading digits and uniform distribution mod 1. Ann. Probab. 5 72-81.
DIACONIS, P. and FREEDMAN, D. (1979). On rounding percentages. J. Amer. Statist. Assoc. 74 359-364.
DUBINS, L. and FREEDMAN, D. (1967). Random distribution functions. Proc. Fifth Berkeley Symp. Math. Statist. Probab. 183-214. Univ. California Press, Berkeley.
FELDSTEIN, A. and TURNER, P. (1986). Overflow, underflow, and severe loss of significance in floating-point addition and subtraction. IMA J. Numer. Anal. 6 241-251.
FLEHINGER, B. (1966). On the probability that a random number has initial digit A. Amer. Math. Monthly 73 1056-1061.
HAMMING, R. (1970). On the distribution of numbers. Bell System Technical Journal 49 1609-1625.
HILL, T. (1995a). Base-invariance implies Benford's law. Proc. Amer. Math. Soc. 123 887-895.
HILL, T. (1995b). The significant-digit phenomenon. Amer. Math. Monthly 102 322-327.
JECH, T. (1992). The logarithmic distribution of leading digits and finitely additive measures. Discrete Math. 108 53-57.
KALLENBERG, O. (1983). Random Measures. Academic Press, New York.
KNUTH, D. (1969). The Art of Computer Programming 2 219-229. Addison-Wesley, Reading, MA.
LEY, E. (1995). On the peculiar distribution of the U.S. stock indices digits. Amer. Statist. To appear.
LOÈVE, M. (1977). Probability Theory 1, 4th ed. Springer, New York.
NEWCOMB, S. (1881). Note on the frequency of use of the different digits in natural numbers. Amer. J. Math. 4 39-40.
NIGRINI, M. (1995). Private communication.
NIGRINI, M. J. (1996). A taxpayer compliance application of Benford's law. Journal of the American Taxation Association 18 72-91.
NIGRINI, M. and WOOD, W. (1995). Assessing the integrity of tabulated demographic data. Preprint, Univ. Cincinnati and St. Mary's Univ.
RAIMI, R. (1969). The peculiar distribution of first digits. Scientific American December 109-119.
RAIMI, R. (1976). The first digit problem. Amer. Math. Monthly 83 521-538.
RAIMI, R. (1985). The first digit phenomenon again. Proceedings of the American Philosophical Society 129 211-219.
SCHATTE, P. (1988). On mantissa distributions in computing and Benford's law. J. Inform. Process. Cybernet. 24 443-455.
VARIAN, H. (1972). Benford's law. Amer. Statist. 23 65-66.
WEAVER, W. (1963). Lady Luck: The Theory of Probability 270-277. Doubleday, New York.