THE SIGNIFICANT-DIGIT LAW

A perhaps surprising corollary of the general law (4) is that (cf. Hill, 1995b) the significant digits are dependent and not independent as one might expect. From (2) it follows that the (unconditional) probability that the second digit is 2 is $\approx 0.109$, but by (4) the (conditional) probability that the second digit is 2, given that the first digit is 1, is $\approx 0.115$. This dependence among significant digits decreases rapidly as the distance between the digits increases, and it follows easily from the general law (4) that the distribution of the $n$th significant digit approaches the uniform distribution on $\{0, 1, \ldots, 9\}$ exponentially fast as $n \to \infty$. This article will concentrate on decimal (base-10) representations and significant digits; the corresponding analog of (3) for other bases $b > 1$ is simply $\operatorname{Prob}(\text{mantissa (base } b) \le t/b) = \log_b t$ for all $t \in [1, b)$.]
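The two probabilities quoted above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it assumes the general law (4) has its usual form, $P(D_1 = d_1, \ldots, D_k = d_k) = \log_{10}\bigl(1 + 1/(d_1 d_2 \cdots d_k)\bigr)$ with $d_1 d_2 \cdots d_k$ read as a decimal integer, and the function names are made up for this example.

```python
import math


def joint_prob(digits):
    """P(D1=d1, ..., Dk=dk) under the assumed form of the general law (4)."""
    if not 1 <= digits[0] <= 9 or any(not 0 <= d <= 9 for d in digits[1:]):
        raise ValueError("first digit must be 1..9, later digits 0..9")
    n = int("".join(str(d) for d in digits))  # digits read as one integer
    return math.log10(1 + 1 / n)


def second_digit_prob(d2):
    """Unconditional P(D2 = d2): sum the joint law over the first digit."""
    return sum(joint_prob((d1, d2)) for d1 in range(1, 10))


def second_given_first(d2, d1):
    """Conditional P(D2 = d2 | D1 = d1)."""
    return joint_prob((d1, d2)) / joint_prob((d1,))


p_uncond = second_digit_prob(2)
p_cond = second_given_first(2, 1)
print(round(p_uncond, 3), round(p_cond, 3))  # prints 0.109 0.115
```

The gap between the two values is the dependence between the first and second digits; repeating the comparison for later digit positions shows how quickly it shrinks.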
EMPIRICAL EVIDENCE

Of course, many tables of numerical data do not follow this logarithmic distribution: lists of telephone numbers in a given region typically begin with the same few digits, and even "neutral" data such as square-root tables of integers are not good fits. However, a surprisingly diverse collection of empirical data does seem to obey the significant-digit law.

Newcomb (1881) noticed "how much faster the first pages [of logarithmic tables] wear out than the last ones," and after several short heuristics, concluded the equiprobable-mantissae law. Some 57 years later the physicist Frank Benford rediscovered the law and supported it with over 20,000 entries from 20 different tables including such diverse data as areas of 335 rivers, specific heats of 1389 chemical compounds, American League baseball statistics and numbers gleaned from Reader's Digest articles and front pages of newspapers. Although Diaconis and Freedman (1979, page 363) offer convincing evidence that Benford manipulated round-off errors to obtain a better fit to the logarithmic law, even the unmanipulated data are a remarkably good fit. Newcomb's article having been overlooked, the law also became known as Benford's law.

Since Benford's popularization of the law, an abundance of additional empirical evidence has appeared. In physics, for example, Knuth (1969) and Burke and Kincanon (1991) observed that of the most commonly used physical constants (e.g., the constants such as speed of light and force of gravity listed on the inside cover of an introductory physics textbook), about 30% have leading significant digit 1. Becker (1982) observed that the decimal parts of failure (hazard) rates often have a logarithmic distribution, and Buck, Merchant and Perez (1993), in studying the values of the 477 radioactive half-lives of unhindered $\alpha$ decays which have been accumulated throughout the present century and which vary over many orders of magnitude, found that the frequency of occurrence of the first digits of both measured and calculated values of the half-lives is in "good agreement" with Benford's law.

In scientific calculations the assumption of logarithmically distributed mantissae "is widely used and well established" (Feldstein and Turner, 1986, page 241), and as early as a quarter-century ago, Hamming (1970, page 1609) called the appearance of the logarithmic distribution in floating-point numbers "well-known." Benford-like input is often a common assumption for extensive numerical calculations (Knuth, 1969), but Benford-like output is also observed even when the input has random (non-Benford) distributions. Adhikari and Sarkar (1968) observed experimentally "that when random numbers or their reciprocals are raised to higher and higher powers, they have log distribution of most significant digit in the limit." Schatte (1988, page 443) reports that "In the course of a sufficiently long computation in floating-point arithmetic, the occurring mantissas have nearly logarithmic distribution."

Extensive evidence of the significant-digit law has also surfaced in accounting data. Varian (1972) studied land usage in 777 tracts in the San Francisco Bay area and concluded "As can be seen, both the input data and the forecasts are in fairly good accord with Benford's Law." Nigrini and Wood (1995) show that the 1990 census populations of the 3141 counties in the United States "follow Benford's Law very closely," and Nigrini (1996) calculated that the digital frequencies of income tax data reported to the Internal Revenue Service of interest received and interest paid are an extremely good fit to Benford. Ley (1995) found "that the series of one-day returns on the Dow-Jones Industrial Average Index (DJIA) and the Standard and Poor's Index (S&P) reasonably agrees with Benford's law."

All these statistics aside, the author also highly recommends that the justifiably skeptical reader perform a simple experiment, such as randomly selecting numerical data from front pages of several local newspapers, "or a Farmer's Almanack" as Knuth (1969) suggests.

T. P. HILL

CLASSICAL EXPLANATIONS

Since the empirical significant-digit law (4) does not specify a well-defined statistical experiment or sample space, most attempts to prove the law have been purely mathematical (deterministic) in nature, attempting to show that the law "is a built-in characteristic of our number system," as Weaver (1963) called it. The idea was to prove first that the set of real numbers satisfies (4), and then suggest that this explains the empirical statistical evidence.
A common starting point has been to try to establish (4) for the positive integers $\mathbb{N}$, beginning with the prototypical set $\{D_1 = 1\} = \{1, 10, 11, 12, 13, 14, \ldots, 19, 100, 101, \ldots\}$, the set of positive integers with leading significant digit 1. The source of difficulty and much of the fascination of the problem is that this set $\{D_1 = 1\}$ does not have a natural density among the integers, that is,

$$\lim_{n \to \infty} \frac{1}{n}\, \#\bigl(\{D_1 = 1\} \cap \{1, 2, \ldots, n\}\bigr)$$

does not exist, unlike the sets of even integers or primes which have natural densities 1/2 and 0, respectively. It is easy to see that the empirical density of $\{D_1 = 1\}$ oscillates repeatedly between 1/9 and 5/9, and thus it is theoretically possible to assign any number in $[1/9, 5/9]$ as the "probability" of this set. Flehinger (1966) used a reiterated-averaging technique to define a generalized density which assigns the "correct" Benford value $\log_{10} 2$ to $\{D_1 = 1\}$, Cohen (1976) showed that "any generalization of natural density which applies to the [significant digit sets] and which satisfies one additional condition must assign the value $\log_{10} 2$ to [$\{D_1 = 1\}$]" and Jech (1992) found necessary and sufficient conditions for a finitely-additive set function to be the log function. None of these solutions, however, resulted in a true (countably additive) probability, the difficulty being exactly the same as that in the foundational problem of "picking an integer at random" (cf. de Finetti, 1972, pages 86 and 98-99), namely, if each singleton integer occurs with equal probability, then countable additivity implies that the whole space must have probability zero or infinity.
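The oscillation of the empirical density of $\{D_1 = 1\}$ among the integers can be checked directly. The following sketch (the helper names are illustrative, not from the article) counts the integers up to $n$ with leading digit 1 and evaluates the density at $n = 10^k - 1$, where it equals 1/9, and at $n = 2 \cdot 10^k - 1$, where it approaches 5/9.

```python
from fractions import Fraction


def count_leading_one(n):
    """Number of integers m in {1, ..., n} whose leading decimal digit is 1."""
    count = 0
    power = 1
    while power <= n:
        # the integers power, ..., 2*power - 1 are exactly those with
        # leading digit 1 and the same number of digits as power
        count += min(n, 2 * power - 1) - power + 1
        power *= 10
    return count


def density(n):
    """Empirical density of {D1 = 1} among {1, ..., n}, as an exact fraction."""
    return Fraction(count_leading_one(n), n)


for k in range(1, 7):
    low = density(10**k - 1)       # end of a block with no leading 1s
    high = density(2 * 10**k - 1)  # end of a block consisting only of leading 1s
    print(k, float(low), float(high))
```

Because the density keeps returning to both values, no limit exists, which is the obstruction described above.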
These discrete-summability arguments have been extended via various integration schemes, Fourier analysis and Banach measures to continuous densities on the positive reals, where $\{D_1 = 1\}$ is now the set of positive numbers with first significant digit 1, that is,

$$\{D_1 = 1\} = \bigcup_{n=-\infty}^{\infty} [1, 2) \times 10^n. \tag{5}$$

One popular assumption in this context has been that of scale invariance, which corresponds to the intuitively attractive idea that any universal law should be independent of units (e.g., metric or English). The problem here, however, as Knuth (1969) observed, is that there is no scale-invariant Borel probability measure on the positive reals since then the probability of the set $(0, 1)$ would equal that of $(0, s)$ for all $s$, which again would contradict countable additivity. (Raimi, 1976, has a good review of many of these arguments.) Just as with the density proofs for the integers, none of these methods yielded either a true probabilistic law or any statistical insights.
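The contradiction behind Knuth's observation can be spelled out in a few lines. The derivation below is a sketch, assuming (hypothetically) that $P$ is a scale-invariant Borel probability measure on $(0, \infty)$, and using only the continuity of measures along monotone sequences of sets, which is a consequence of countable additivity.

```latex
\begin{align*}
c &:= P\bigl((0,1)\bigr) = P\bigl(s \cdot (0,1)\bigr) = P\bigl((0,s)\bigr)
   \quad \text{for all } s > 0, \\
1 &= P\bigl((0,\infty)\bigr) = \lim_{n\to\infty} P\bigl((0,n)\bigr) = c
   \quad \text{since } (0,n) \uparrow (0,\infty), \\
0 &= P(\emptyset) = \lim_{n\to\infty} P\bigl((0,1/n)\bigr) = c
   \quad \text{since } (0,1/n) \downarrow \emptyset .
\end{align*}
```

The last two lines force $c = 1$ and $c = 0$ at once, so no such $P$ exists.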
Attempts to prove the law based on various urn schemes for picking significant digits at random have been equally unsuccessful in general, although in some restricted settings log-limit laws have been established. Adhikari and Sarkar (1968) proved that powers of a uniform (0, 1) random variable satisfy Benford's law in the limit, Cohen and Katz (1984) showed that a prime chosen at random with respect to the zeta distribution satisfies the logarithmic significant-digit law and Schatte (1988) established convergence to Benford's law for sums and products of certain nonlattice i.i.d. variables.

THE NATURAL PROBABILITY SPACE

The task of putting the significant-digit law into a proper countably additive probability framework is actually rather easy. Since the conclusion of the law (4) is simply a statement about the significant-digit functions (random variables) $D_1, D_2, \ldots$, let the sample space be $\mathbb{R}^+$, the set of positive reals, and let the sigma algebra of events simply be the $\sigma$-field generated by $\{D_1, D_2, \ldots\}$ [or equivalently, generated by the single function $x \mapsto \text{mantissa}(x)$]. It is easily seen that this $\sigma$-algebra, which will be denoted $\mathcal{M}$ and will be called the (decimal) mantissa $\sigma$-algebra, is a sub-$\sigma$-field of the Borels and that in fact

$$S \in \mathcal{M} \iff S = \bigcup_{n=-\infty}^{\infty} B \times 10^n \quad \text{for some Borel } B \subseteq [1, 10), \tag{6}$$

which is just the obvious generalization of the representation (5) for $\{D_1 = 1\}$.

The mantissa $\sigma$-algebra $\mathcal{M}$, although quite simple, has several interesting properties:

(7)
(i) every nonempty set in $\mathcal{M}$ is infinite with accumulation points at 0 and at $+\infty$;
(ii) $\mathcal{M}$ is closed under scalar multiplication ($s > 0$, $S \in \mathcal{M} \Rightarrow sS \in \mathcal{M}$);
(iii) $\mathcal{M}$ is closed under integral roots ($m \in \mathbb{N}$, $S \in \mathcal{M} \Rightarrow S^{1/m} \in \mathcal{M}$), but not powers;
(iv) $\mathcal{M}$ is self-similar in the sense that if $S \in \mathcal{M}$, then $10^m S = S$ for every integer $m$

(where $aS$ and $S^a$ denote the sets $\{as : s \in S\}$ and $\{s^a : s \in S\}$, respectively).

Property (i) implies that finite intervals such as $[1, 2)$ are not in $\mathcal{M}$ (i.e., are not expressible in terms of the significant digits alone; e.g., significant digits alone cannot distinguish between the numbers 2 and 20) and thus the countable-additivity contradictions associated with scale invariance disappear. Properties (i), (ii) and (iv) follow easily by (6), but (iii) warrants a closer inspection. The square root of a set in $\mathcal{M}$ may consist of two "parts," and similarly for higher roots. For example, if

$$S = \{D_1 = 1\} = \bigcup_{n=-\infty}^{\infty} [1, 2) \times 10^n,$$

then

$$S^{1/2} = \bigcup_{n=-\infty}^{\infty} [1, \sqrt{2}) \times 10^n \;\cup\; \bigcup_{n=-\infty}^{\infty} [\sqrt{10}, \sqrt{20}) \times 10^n \in \mathcal{M},$$

but

$$S^2 = \bigcup_{n=-\infty}^{\infty} [1, 4) \times 10^{2n} \notin \mathcal{M},$$

since it has gaps which are too large and thus cannot be written in terms of $\{D_1, D_2, \ldots\}$. Just as property (ii) is the key to the hypothesis of scale invariance, property (iv) is the key to a hypothesis of base invariance, which will be described below.

(Although the space $\mathbb{R}^+$ is emphasized above, the analogous mantissa $\sigma$-algebra on the positive integers $\mathbb{N}$ is essentially the same and as such removes the countable-additivity density problem on $\mathbb{N}$ since nonempty finite sets are not in the domain of the probability function.)

SCALE AND BASE INVARIANCE

With the proper measurability structure now identified, a rigorous notion of scale invariance is easy to state. Recall (7)(ii) that $\mathcal{M}$ is closed under scalar multiplication.

DEFINITION 1. A probability measure $P$ on $(\mathbb{R}^+, \mathcal{M})$ is scale invariant if $P(S) = P(sS)$ for all $s > 0$ and all $S \in \mathcal{M}$.

In fact, scale invariance characterizes the general significant-digit law (4).

THEOREM 1 (Hill, 1995a). A probability measure $P$ on $(\mathbb{R}^+, \mathcal{M})$ is scale invariant if and only if

$$P\Bigl(\bigcup_{n=-\infty}^{\infty} [1, t) \times 10^n\Bigr) = \log_{10} t \quad \text{for all } t \in [1, 10). \tag{8}$$

One possible drawback to a hypothesis of scale invariance in tables of "universal constants," however, is the special role played by the constant 1. For example, consider the two physical laws $f = ma$ and $e = mc^2$. Both laws involve universal constants, but the force equation constant 1 is not recorded in most tables, whereas the speed of light constant $c$ is. If a "complete" list of universal physical constants also included the 1's, it seems plausible that this special constant might occur with strictly positive frequency. However, that would violate scale invariance, since then the constant 2 (and all other constants) would occur with this same positive probability.
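Theorem 1 can be illustrated with a small simulation; this is a sketch for intuition, not part of the article's argument. It assumes that a sample with the logarithmic law (8) can be generated as $10^U$ with $U$ uniform on $[0, 1)$, uses an arbitrary seed and an arbitrary rescaling factor `s = 2.54` (a hypothetical change of units), and compares first-digit frequencies before and after rescaling. A sample uniform on $[1, 10)$ is included as a contrast, since its first-digit frequencies change markedly under the same rescaling.

```python
import math
import random


def first_digit(x):
    """Leading significant decimal digit of a positive float."""
    return int(f"{x:.15e}"[0])  # scientific notation starts with that digit


def digit_freqs(values):
    """Relative frequencies of leading digits 1..9 in a list of positive floats."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return [c / len(values) for c in counts[1:]]


def max_dev(freqs, target):
    """Largest absolute gap between observed and target frequencies."""
    return max(abs(f - t) for f, t in zip(freqs, target))


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
N = 200_000
s = 2.54                # hypothetical change of units

log_sample = [10 ** rng.random() for _ in range(N)]     # mantissa follows (8)
unif_sample = [1 + 9 * rng.random() for _ in range(N)]  # uniform on [1, 10)

benford = [math.log10(1 + 1 / d) for d in range(1, 10)]

dev_log = max_dev(digit_freqs(log_sample), benford)
dev_log_scaled = max_dev(digit_freqs([s * x for x in log_sample]), benford)
freq1_unif = digit_freqs(unif_sample)[0]
freq1_unif_scaled = digit_freqs([s * x for x in unif_sample])[0]

print(dev_log, dev_log_scaled)        # both small
print(freq1_unif, freq1_unif_scaled)  # roughly 1/9 versus roughly 0.44
```

For the uniform sample the frequency of leading digit 1 moves from about 1/9 to about $10/22.86$, because $sX$ is uniform on $[2.54, 25.4)$ and only $[10, 20)$ has leading digit 1.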
reasonis assumed any reasonthat any assumed that Instead, it is suppose it Instead,suppose shouldbe be base base law should able law universalsignificant-digit able universal significant-digit valid when when should be invariant, be equally equally valid that is, is, should invariant,that 10. In In fact, fact, otherthan than 10. in terms ofbases bases other rewritten in termsof rewritten Benford's all supportingBenford's ofthe the classical classical arguments all of argumentssupporting 1976, mutandis (Raimi, (Raimi, 1976, law mutatis mutandis over mutatis law carry carry over As will seen shortly, will be be seen shortly, page bases. As to other otherbases. 536) to page 536) characterizes invariance characterizes the of base base invariance the hypothesis hypothesisof Dirac probability law and and aa Dirac probability mixtures of Benford's Benford'slaw mixturesof occur constant1, whichmay mayoccur measure 1,which on the thespecial specialconstant measureon with withpositive probability. positiveprobability: conofbase base invariance, invariance,conTo of definition To motivate motivatethe thedefinition with numberswith 1} of {D11 = I} of positive sider set {D positivenumbers sider the the set set This same same set leading digit 10). This (base 10). digit11 (base leadingsignificant significant as written of numbers can also [ct: (5)] be written as be can numbers also of [cf.(5)] 00 00 U [1,2) 1}== U {D1= loon {D x loon [1, 2) x 1 = I} n=-oo n=-oo 00 00 uU U [10, x 100, loon, 20) x [10,20) n=-oo n=-oo 358 358 HILL T. T. P. P. HILL ofpositive that is, 1} is is also also the the set set of positivenumbers numbers that is, {D {D11 = I} whose digit in the is in significant digit(base (base 100) 100) is the whoseleading leadingsignificant set 11, ... , 19}. In ofreal In general, general,every everyset set of real set {I, {1, 10, 10,11,...,19}. set is exactly exactlythe the same same set (base 10) 10) in in ./ numbers S (base numbers 8 JI is 1/ 2 (base as JI. in X. 
of real (base 100) 100) in real numbers numbers8S1'2 as the the set set of measure invariant,the themeasure is base Thus is base invariant, Thus if ifaa probability probability uof (in the themantissa mantissauset of ofreal real numbers numbers(in ofany anygiven givenset algebra JI) should be the bases and, and, in in forall all bases shouldbe the same same for algebra.,X) original particular, for bases which powersof ofthe theoriginal forbases whichare are powers particular, base. This natural following naturaldefinition definition suggeststhe the following base. This suggests roots, under integral integralroots, [recall is also also closed closed under [recall that that JI ./ is property (7)(iii)]. property (7)(iii)]. at basic questions in0. [A questionsconcerning concerninginnumberof of basic at o. [A number variance under multiplication are open, such such are still still open, multiplication variance under as 25-year-old conjecture that unitheuniconjecture thatthe Furstenberg's 25-year-old as Furstenberg's is the theonly onlyatomless atomlessprobform on probon [0, [0, 1) 1) is formdistribution distribution ability invariant 2x(mod1) 1) distribution underboth both2x(mod invariantunder abilitydistribution and and 3x(mod 3x(mod1).] 1).] FROM RANDOM RANDOM SAMPLES SAMPLES FROM RANDOM RANDOM DISTRIBUTIONS DISTRIBUTIONS Theorems be clean clean mathematically, mathematically, and 22 may maybe Theorems11 and but they of the appearance appearance of but hardly help help explain explain the they hardly census Benford's What do do 1990 1990 census Benford'slaw law empirically. empirically.What in common populations of have in with commonwith of U.S. U.S. counties countieshave populations probability measure P on on 2. A A probability measure P DEFINITION 2. 
data from numerical of logarithm tables, 1880 users of logarithm tables, numerical data from 1880 users 1 n = p(Sl/n) forall all if P(S) (~+ JI) is P( 8) = P( 8 / ) for invariantif (R', , X) is base base invariant collected ofthe the 1930s 1930s collected newspaperarticles articlesof front-page newspaper front-page positive integers JI. integersnn and and all all 8S EE X. positive by Benford or universal constantsexamexamphysicalconstants Benfordor universalphysical by ined by Knuth in the 1960s? Why should these in should these 1960s? Why Knuth the ined by Next, observe numbers set of ofnumbers observethat thatthe the set Next, tables be be logarithmic or, scale or base base scale or tables equivalently, logarithmic or,equivalently, forall all jj >> I} 8Si= D j = 00 for 1} includinvariant? notof ofthis thisform, form,includManytables tables are are not invariant?Many =1,1, Di {D,1 = 1 = {D ing he noted), noted), tables (as (as he individualtables ing even even Benford's Benford'sindividual = ...} {..., , 0.01, = {... 1, 10, 10, 100, 100,...} 0.01, 0.1, 0.1, 1, closbut as pointed out, came closout,"what "whatcame but (1969) pointed as Raimi Raimi (1969) 00 00 est unionof ofall his tables." tables." est of ofall, was the the union all his however,was all, however, = U Ion eEE JI {1} xX lon U {I} basewith baseCombine weighttables tables with molecularweight Combinethe the molecular n=-oo n=-oo ball statistics there of rivers, rivers,and and then thenthere and areas areas of ball statisticsand has [by JI-measurable so no nonempty .-measurable subsets, subsets,so nonempty has [by(6)] (6)] no is of previousexplanations explanationsof Many of ofthe the previous is aa good goodfit. fit.Many the of this set is well defined. defined. 
the Dirac delta measure $\delta_1$ of this set is well [Here $\delta_1(S) = 1$ if $1 \in S$ and $= 0$ otherwise, for all $S \in \mathcal{M}$.] Letting $P_L$ denote the logarithmic probability distribution on $(\mathbb{R}^+, \mathcal{M})$ given in (8), a complete characterization for base-invariant significant-digit probability measures can now be given.

THEOREM 2 (Hill, 1995a). A probability measure $P$ on $(\mathbb{R}^+, \mathcal{M})$ is base invariant if and only if
\[ P = qP_L + (1-q)\delta_1 \quad \text{for some } q \in [0,1]. \]

From Theorems 1 and 2 it is easily seen that scale invariance implies base invariance, but not conversely (e.g., $\delta_1$ is clearly base invariant but not scale invariant).

The proof of Theorem 1 follows easily from the fact that scale invariance corresponds to invariance under irrational rotations $x \to (x+s) \pmod 1$ on the circle, and the unique invariant probability measure under this transformation is well known to be the uniform (Lebesgue) measure, which in turn corresponds to the log mantissa distribution. Proof of Theorem 2 is slightly more complicated, since base invariance corresponds to invariance under multiplication $x \to nx \pmod 1$. The key tool used here (Hill, 1995a, Proposition 4.1) is that a Borel probability $Q$ on $[0,1)$ is invariant under the mappings $nx \pmod 1$ for all $n$ if and only if $Q$ is a convex combination of uniform measure and point mass at 0.

... Benford's law have hypothesized some universal table of constants, Raimi's (1985, page 217) "stock of tabular data in the world's libraries" or Knuth's (1969) "some imagined set of real numbers," and tried to prove why certain specific sets of real observations were representative of either this mystical universal table or the set of all real numbers. What seems more natural is to think of data as coming from many different distributions, as was clearly the case in Benford's (1938) study in his "effort to collect data from as many fields as possible and to include a wide variety of types" (page 552); "the range of subjects studied and tabulated was as wide as time and energy permitted" (page 554).

Recall that a (real Borel) random probability measure (r.p.m.) $M$ is a random vector [on an underlying probability space $(\Omega, \mathcal{F}, P)$] taking values which are Borel probability measures on $\mathbb{R}$ and which is regular in the sense that for each Borel set $B \subset \mathbb{R}$, $M(B)$ is a random variable (cf. Kallenberg, 1983).

DEFINITION 3. The expected distribution measure of an r.p.m. $M$ is the probability measure $EM$ (on the Borel subsets of $\mathbb{R}$) defined by
\[ (EM)(B) = E(M(B)) \quad \text{for all Borel } B \subset \mathbb{R} \tag{9} \]
[where here and throughout, $E(\cdot)$ denotes expectation with respect to $P$ on the underlying probability space].

For example, if $M$ is a random probability which is $U[0,1]$ with probability 1/2 and otherwise is an exponential distribution with mean 1, then $EM$ is simply the continuous distribution with density $f(x) = (1 + e^{-x})/2$ for $0 \le x \le 1$ and $= e^{-x}/2$ for $x > 1$.
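As a sanity check on the example just given (an r.p.m. that is $U[0,1]$ with probability 1/2 and otherwise exponential with mean 1), the following minimal Python sketch compares the stated density of $EM$ with the average of the two distribution functions, which is what (9) prescribes. The function names and the midpoint-rule integrator are illustrative choices, not part of the original paper.

```python
import math

def em_density(x):
    """Density of EM when M is U[0,1] w.p. 1/2 and Exp(mean 1) w.p. 1/2."""
    if x < 0:
        return 0.0
    uniform_part = 1.0 if x <= 1 else 0.0
    return 0.5 * uniform_part + 0.5 * math.exp(-x)

def em_cdf(x):
    """EM((-inf, x]) = E[M((-inf, x])]: the average of the two CDFs, as in (9)."""
    if x <= 0:
        return 0.0
    uniform_cdf = min(x, 1.0)
    expo_cdf = 1.0 - math.exp(-x)
    return 0.5 * uniform_cdf + 0.5 * expo_cdf

def midpoint_integral(f, a, b, steps=20000):
    """Plain midpoint rule; accurate enough for a numerical sanity check."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    # Integrate the density piecewise, since it jumps at x = 1.
    mass_01 = midpoint_integral(em_density, 0.0, 1.0)
    mass_15 = midpoint_integral(em_density, 1.0, 5.0)
    print(mass_01, em_cdf(1.0))
    print(mass_01 + mass_15, em_cdf(5.0))
```

The two printed pairs should agree to several decimal places, which illustrates that the expected distribution of a mixture-type r.p.m. is simply the corresponding mixture of distributions.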
(1 The next next definition definition plays plays aa central central role role in in this this secsecThe tion and and formalizes formalizes the the concept concept of of the the following following natnattion ural process process which which mimics mimics Benford's Benford's data-collection data-collection ural procedure: pick pick aa distribution distribution at at random random and and take take aa procedure: sample of of size size k k from from this this distribution; distribution; then then pick pick aa sample second distribution distribution at at random random and and take take aa sample sample of of second size k k from from this this second second distribution distribution and and so so forth. forth. size DEFINITION 4. 4. For For an an r.p.m. r.p.m. MI M and and positive positive inteinteDEFINITION ger k, k, aa sequence of M-random M-random k-samples k-samples is is aa seseger quence of X 2 , ••• . .. on on (0,!7, variables Xl' ofrandom random variables s, P) X1, X2, (fQ, P) so that that for for some some i.i.d. i.i.d. sequence sequence Ml, M I , M2, M2 , M3, M3 , ... of of so r.p.m.'s with with the the same same distribution distribution as as MR M and and for for each each r.p.m.'s = 1,2, ... .. 1, 2, ..., jj = In In general, general, sequences sequences of of M-random M-random k-samples k-samples are are not not independent, independent, not not exchangeable, exchangeable, not not Markov, Markov, not not martingale and and not not stationary stationary sequences. sequences. martingale EXAMPLE. EXAMPLE. Let Let MI M be be aa random random measure measure which which is is the Dirac Dirac probability probability measure measure 5(1) 5(1) at at 1 1 with with probprobthe ability ability 1/2, 1/2, and and which which is is (8(1) (5(1) + + 8(2))/2 5(2))/2 otherwise, otherwise, and and let let k k = = 3. 3. Then Then P(X2 P(X 2 == 2) 2) == 1/4, 1/4, but but P(X2 P(X 2 == = = 22 dI X1 Xl = 2) 2) = 1/2, 1/2, so so X1, Xl' X2 X 2 are are not not independent. 
independent. Since Since P((X1, P«X l' X2, X 2 , X3, X 3 , X4) X 4) and and (11) XU-1)k+l'···' jk ... X Xjk (11) X(j_1)k+, {M {Ai,i , X(i-l)k+l' ... ***,, X(i-l)k+l? of independent independent of : i all for X } for all i =f. j. j. ik Xik} are are The lemma curious showsthe thesomewhat somewhatcurious lemmashows The following following structure ofsuch such sequences. structureof sequences. LEMMA X 2 , ••• M... be be aa sequence of M1. Let Let Xl' sequence of LEMMA1. X1, X2, random for some M. some kk and some r.p.m. r.p.m. M. and some k-samples for random k-samples Then: Then: (i) n} are distributed with with the {X are a.s. a.s. identically identically distributed (i) the {Xn} distribution indein general but are are not not in distributionEM, general indeEM, but pendent; pendent; (ii) the{X are a.s. a.s. indepenindepen* *},}, the {M1, given{M M2, (ii) given {Xn} I, M 2 , ... n } are but are not in general identically disdent, disin identically but are not general dent, tributed. tributed. PROOF. of(ii) The first firstpart followseasily (10) easilyby by(10) PROOF. The (ii) follows partof and (11); the second part follows since whenever and (11); the second part followssince whenever [MIj, M j, X same distribution distribution will not not have have the the same Xik ik will mji =f. M as X jk. The first part of (i) follows conditioning as Xjk. 
The firstpart of (i) followsby by conditioning on on M j : MIj: P(X j(B)] B) = E[M E[MIj(B)] P(Xj j EE B) ==(2, (2, 1, 1, 1, 1)), 1, 1)), the the {X are not not exchangeable; exchangeable; since since {X,}n} are P(X3 1) P( X 3 = = 11 1I X1 Xl = = X2 X2 = = 1) = 1 1X2 1), = 9/10 9/10 >> 5/6 5/6 = = P(X3 P(X 3 = = 11 X2 = = 1), since the {X are not not Markov; Markov; since {X n}} are the E(X2 I X1 = 2) = 3/2, since the {X are not not aa martingale; martingale; and and since the {X,}n} are = (1, 1, 1)) P((X1,I , X2, (1, 1, 1)) P«X X 2 , X3) X 3) = = (1, = P«X = 9/16 1, 1)), 9/16>> 15/32 15/32= P((X2, X3, (1, 1, 1)), = X 4) = 2, X 3 , X4) notstationary. the the {X are not stationary. {X,, n }} are of the the is simply the statement statementof The next lemma lemma is The next simplythe distrithat the factthat the empirical intuitively distriintuitivelyplausible plausible fact empirical exto the theexbution k-samples ofM-random butionof MI-random k-samplesconverges convergesto that not of this-is pected distribution of M; that this"'is not completely distribution pected MR; completely trivial followsfrom fromthe the independence-identically trivial follows independence-identically = 1, If kk = in Lemma 1. If statedin Lemma 1. distributed dichotomy stated distributed dichotomy 1, law of case of of the it just the law of the Bernoulli Bernoullicase the strong it is is just strong large numbers. largenumbers. ... be be 2. Let M be be aa r.p.m., and let let Xl' LEMMA2. Let M LEMMA r.p.m.,and X1, X X22 ••• k. Then Then some k. aa sequence for some M-random k-samples sequence of of M-random k-samplesfor < n:X . #{i:::: n:Xii EE B} B} Im-----Ilim#i n~oo n B c ~. = E[M( a.s. for all Borel Borel B = B)] a.s. for all E[M(B)] R8. B and jj EE N, and let let PROOF. and Fix Band PROOF. Fix NkJ, = E[M( = B)] for B c ~, Borel B forall all Borel R8, E[M(B)] where j has MI1 has the the since M followssince the last last equality wherethe equalityfollows same as of(i) folThe second as M. 
secondpart M. The (i) foldistribution partof same distribution lows distrii.i.d. samples fromaa distrifactthat thati.i.d. samplesfrom fromthe thefact lowsfrom bution about thedistribution, information aboutthe distribution, butionmay giveinformation maygive as in the thenext nextexample. as seen seen in example. 0LI 1, 1,2)) 1, 2)) 1, = = 9/64 9/64 >> 3/64 3/64 == P((X1, P«X I , X2, X 2 , X3, X 3 , X4) X 4) ... random variables given Mj Mj = P, the the random variables (10) = P, (10) given with are i.i.d. X(j-l)k+l' . .. , X jk are i.i.d. with d.£ P; d.f. P; Xjk ...1 X(j-1)k+ = = (1, (1, < k: = #{m, < m k:X(j-l)k+m B}. m :::: #{m,11 ::: YYjj = X(j1)k+m EE B}. Clearly, Clearly, (12) (12) B} =lim #{i < n: Xi EE B} #{i:::: I. L7=1 ljY j . n:mXi I1m = 1m - - n-*oo n-*oo n~oo nn n~oo km km limitexists) (if thelimit (ifthe exists) T. P HILL T. P. HILL 360 360 is binomially By (10), (10), given given M distributed By j' Y binomially distributed MIj, Yjj is so by (9), with parameters kk and j(B)], so by (9), withparameters and E[M E[Mj(B)], (13) = E(E(Y EY E(E(Yj j EYjj = IM j» Mj)) n, then = 2 has baseX then{X 2n, 2 and and Zn = X n_== base=1,1, Y {Xn} Yn n == 2 n } has has but not {Y frequency, } not scale-neutral mantissafrequency, scale-neutralmantissa but {YnI n has of and Theorem Theorem11 of above and neither and (by (byTheorem Theorem11 above neitherand has both. 1977) {Zn} Diaconis, 1977) Diaconis, both. {Zn} has and of scale-neutral scale-neutral and Mathematical examples of examples Mathematical = kE[M(B)] all j, = a.s. j, a.s. for forall kE[M(B)] will scale-biased processes are as as will are easy easy to to construct, construct, scale-biasedprocesses the same distribution as M. MR. since sinceM MD j has has the same distribution as pick be described below. For real-lifeexample, example,pick For aa real-life be describedbelow. By (11), independent.Since Since they they By (11), the the {Y are independent. 
{Yj} j } are aa beverage-producing beverage-producing company in continental EucontinentalEucompanyin have [via means kE[M(B)] kE[M(B)] and and are are have [via (13)] (13)] identical identicalmeans volumesof of the metric metricvolumes look at at the rope at randomand and look rope at random uniformly bounded bounded [so LOO(Var(Yj )/ j2) << 00], it foloo],it foluniformly [so EZ(Var(Yj)/j2) aa sample products; then pick aa second second of its its products; then pick of kk of sample of lows page 250) lows (CL (cf.Loeve, Loeve,1977, 1977,page 250) that that in company product volumes volumesin so forth. Since product forth.Since and so companyand this case are probably closely related to liters, this liters, this closely related to this case are probably . L7=I Y j (14) 11m a.s., lim j=i kE[M(B)] a.s., (14) (random process is not scale scale mostlikely likelynot is most k-sample)process (randomk-sample) m = kE[M(B)] mo m as galgalunitsuch such as neutral to another anotherunit and conversion conversionto neutraland and by (12) and (14). and the the conclusion conclusionfollows followsby (12) and (14). D D2 set of of different lons probably yield set yieldaa radically radicallydifferent lonswould wouldprobably ifspecies species first-digit frequencies. On otherhand, hand,if On the the other frequencies. first-digit An proof can be based An even on the the obserobsercan be based on even shorter shorterproof of randomand and in Europe selectedat at random are selected ofmammals mammalsin Europe are that the the variables are vation that variables Xi, vation Xi' X k+i' X 2k+i' ...... are X2k+i, Xk+i, their less likely likely it seems seems less sampled,it metricvolumes volumessampled, theirmetric i.i.d. i.i.d. 
for but the the argument argumentgiven given forallIs all 1 < ii s< k, k, but that process is to the the choice of is related relatedto choiceof that this this second secondprocess the asabove be easily to show show that that the asabove can can be modifiedto easily modified units. units. is sampled timesis is sumption j is thateach each M MII sampledexactly exactlykk times sumptionthat Similarly, base-neutral and base-biased processes and base-biased processes base-neutral Similarly, K ij times, times, not jth r.p.m. ifthe is sampled r.p.m.is sampledK notessential; essential;if the jth The quesare mathematically. The quesmathematically. are also also easy to construct construct easy to where the uniformly bounded are independent bounded where the{K uniformly independent {Kjj}} are tion base-neutrality is when is most whenthe the mostinteresting interesting tionof ofbase-neutrality N-valued random are also also indepen(whichare indepenRN-valued randomvariables variables(which units universally agreed in question upon,such such agreedupon, unitsin are universally questionare dent rest of process), then the process), thenthe the same same concondentof ofthe the rest ofthe as For real-life examples, real-lifeexamples, numbersof of things. things.For as the the numbers clusion clusionholds. holds. 
the numnumpicking cities lookingat at the at random randomand and looking cities at picking ber of people from of people fromthose those ber of fingers of k-samples k-samplesof fingersof A NEW A NEW STATISTICAL STATISTICAL DERIVATION DERIVATION cities base-10 dependent (that cities is is certainly dependent(that heavilybase-10 certainlyheavily is where base 10 originated), whereas picking cities cities whereas picking is base 10 originated), where The The stage is now now set set to to give new statistical statistical give aa new stage is and looking at the numberof ofleaves leaves of of at random randomand the number lookingat limit below) which limitlaw whichis is aa central-limitcentral-limit- at law (Theorem (Theorem33 below) k-samples of trees from those cities is probably less less from those cities is probably of trees k-samples like digits. forsignificant like theorem theoremfor digits.Roughly Roughlyspeakspeaksignificant in the base dependent. be seen As will will be seen in the next nexttheotheodependent.As ing, probability distributions if probability distributions base law says that if ing, this this law says that rem, scale and base neutrality of random k-samples k-samples of random neutrality and base rem, scale are are at random randomand and random randomsamples are selected selected at samples are are to scale scale and and base base unbiunbiare essentially essentiallyequivalent equivalentto then in in any thentaken fromeach each of ofthese these distributions distributions taken from any of the underlying r.p.m. M. asedness the MI. asedness of underlying r.p.m. way process is base) is scale scale (or so that that the the overall overall process (or base) way so the neutral, then ofthe thenthe the significant-digit-frequencies frequenciesof neutral, significant-digit if unbiased if DEFINITION 6. 6. An An r.p.m. is scale scale unbiased Mris r.p.m. 
M combined the logarithmic to the combinedsample will converge convergeto logarithmic sample will on its expected distribution EM is scale invariant on is scale invariant distribution EMA its expected distribution. This preand preThis theorem theoremhelps distribution. helps explain explain and ifEM EM is is base base invariinvariis base base unbiased unbiasedif and is (DR, JI) X/) and dict distribution distribution (~+, ofthe the logarithmic dictthe the appearance logarithmic appearanceof ant on (~+, JI). [Recall that JI is a sub-o--algebra is a that /# ant on sub-u-algebra (DR+, X). [Recall in digits in significant oftabulated tabulateddata. data. significant digitsof of probability on DR(such on ~ (such ofthe the Borels, so every Borel probability everyBorel Borels,so as EM) induces a unique probability on (~+, JI).] on as induces a /#).] (DR, unique probability EM) A sequence of random random variables variables DEFINITION 5. 5. A sequence of ~ 11,, X 2, scale-neutral mantissa frequency frequency if if ... has has scale-neutral 2 , ••• scale A point here of is that the definition definition ofscale A crucial hereis thatthe crucialpoint n,II#{i S n: Xi E - #{i S n: Xi E sS}l sS}1 -~ 0O a.s. E S} n: XiE n l#{i<n:Xi S}-#{ and base unbiased thatindividual unbiaseddoes does not notrequire individual and base requirethat in fact fact realizations be scale base invariant; or base ofM scale or invariant;in realizationsof M be for 0 and and all all S S EE JI, and has has base-neutral base-neutral for all all ss >> 0 X, and it is often the case [see Benford's (1938) data and data and it is the case Benford's (1938) often [see mantissa frequency if if mantissa frequency is scale scale example below] that realizationsis thatnone noneof ofthe therealizations examplebelow] n-II#{i Xi E n-1l#{iS< n: n: Xi S} E S} on the the invariant, but only process on thesampling thatthe samplingprocess invariant,but onlythat another. average scale over overanother. 
does not notfavor favorone one scale averagedoes -#{i Xi ~ a.s. E sI/m}I~O -#{i S a.s. <n:n : Xi sl/m}Ij>O result:here hereM(t) Now for M(t) statisticalresult: forthe themain mainnew newstatistical Now where D denotes the random variable M(D denotesthe randomvariable and for forall all mEN m E RNI and S S EE JI. /. Dtt == M(Dt), t ), where U~=_oo[l, Ion is the positive numbers with the set of ofpositive numbers with t) x lon 0n _o0[j, t) in light ofthe the repmantissae in [1/10, the mantissaein are the repif {X and {Zn} lightof For example, n}, and t/10).[Thus [Thus in [1/10,t/10). For example,if {Yn, {ZnJ are n }, {Y {Xnl, the random random resentation M(t) may be viewed viewedas as the resentation(6), sequences random variables maybe definedby (6), M(t) variablesdefined of(constant) by sequencesof (constant)random (13) m~oo 1° THE SIGNIFICANT-DIGIT SIGNIFICANT-DIGITLAW LAW cumulative function forthe the mantissae mantissae cumulativedistribution distribution functionfor of ofthe the r.p.m. r.p.m.M.] M.] THEOREM digits). THEOREM3 3 (Log-limit (Log-limitlaw law for forsignificant significant digits). Let M (~+, .4). following are Let M be be an an r.p.m. r.p.m.on on (DR, X6').The The following are equivalent: equivalent: (i) M is is scale unbiased; (i) scale unbiased; (ii) M is is base base unbiased unbiased and and EM is atomless; (ii) EM is atomless; = 10gIO (iii) E[M(t)] log10 tt for all tt EE [1, [1, 10); 10); (iii) E[M(t)] = for all M-random k-sample scale-neutral (iv) (iv) every every M-random k-sample has has scale-neutral mantissa frequency; frequency; mantissa (v) M-random kk(v) EM EM is is atomless, atomless, and and every every M-random sample has frequency; has base-neutral base-neutral mantissa mantissa frequency; sample (vi) for everyM-random k-sample Xl' * * *,, (vi) for every M-random k-sample X 2 , ••• Xl, X2, I < n: nn-1#{i n: mantissa(X #{i :::; [1/10, tl10)} mantissa(Xii )) EE [1/10, t/10)} -> a.s. for all tt eE [1, [1, 10). 10). 
~ 10gIO for all loglo tt a.s. PROOF. (i) by Definitions 11 and Immediateby Definitions and PROOF. (i) {:> X (iii). (iii). Immediate 1. 66 and and Theorem Theorem1. (ii) thatthe theBorel Borel (ii) {:> X (iii). (iii). It It follows followseasily easilyfrom from(6) (6) that ifit it is probability EM ifand and only is atomatomEM is is atomless atomlessif probability onlyif (ii) is less That (ii) is equivalent follows less on on .4. to (iii) thenfollows X#.That equivalentto (iii) then easily by Definitions 22 and and 66 and 2. Definitions and Theorem Theorem2. easilyby (iii) X (iv). Lemma2, (iii) {:> (iv). By By Lemma 2, < n: An := n-II#{i Xi EE S}I S} n:Xi An:= n-ll#{i:::; ~ E[M(S)] E[M(S)] a.s., a.s., and and < n: n-1l#{i:::: n: Xi B n := n-II#{i Xi EE sS}I sS}1 Bn:= ~ E[M(sS)] E[M(sS)] a.s., a.s., ifand ifEM(S) so IAn 0 a.s. a.s. if and only - B nId~ 0 S) == EM( sS), so onlyifEM( EM(sS), IAn-Bn which by by Definition Definition1 and and Theorem Theorem11 is is equivalent which equivalent to (iii). to (iii). (iii) Lemma 2, Definition22 (iii) {:> X (v). (v). Similar, Similar,using using Lemma 2, Definition and 2. and Theorem Theorem2. D (iii) by Lemma 2. D Immediateby Lemma 2. (iii) {:> X (vi). (vi). Immediate One points of Theorem33 is is that that there thereare are the points One of ofthe ofTheorem many (natural) procedures which to whichlead lead to many (natural)sampling samplingprocedures the log helping howthe the differdifferthe log distribution, distribution, helpingexplain explainhow ent Newcomb, Benford, Knuth Knuth entempirical evidenceof ofNewcomb, empiricalevidence Benford, and Nigrini all law. This This may also and Nigrini all led led to to the the same same hiw. 
may also help thenumbers numbersfrom fromnewsnewshelpexplain explainwhy whysampling samplingthe paper front pages (Benford, page 556), or alalfrontpages paper (Benford,1938, 1938, page 556), or manacs data often oftentends tends or extensive extensiveaccounting manacs or accountingdata in each these toward since the log since in each of of these towardthe log distribution, distribution, in aa cases are being sampled cases various variousdistributions distributions are being sampledin presumably unbiased the first firstarticle article unbiasedway. presumably way.Perhaps Perhapsthe in population in the about population the newspaper has statistics newspaperhas statisticsabout growth, prices and stockprices and the second second article articleabout about stock growth,the indithe third None of ofthese these indithe thirdabout about forest forestacreage. acreage. None the vidual distributions itself be unbiased, but the itselfmay vidual distributions maybe unbiased,but mixture may be. well be. mixture maywell 361 base ununhypothesis ofscale scale or orbase Justification of of Justification ofthe thehypothesis biasedness is justification of is akin tojustification ofthe the hypothesis hypothesis biasedness akin to of in in apidenticaldistribution) distribution) apofindependence independence(and (and identical or central central law of of large large numbers numbersor plying the strong stronglaw plying the hypothneitherhypothlimittheorem theoremto to real-life real-lifeprocesses: limit processes: neither esis be proved, proved, yet in many sampling esis can can be yet in manyreal-life real-lifesampling assumpto be reasonableassumpprocedures, theyappear appear to procedures, they be reasonable straighttions. Conversely, Theorem Theorem33 suggests suggestsaa straighttions. Conversely, forward data-simply test test forwardtest test for forunbiasedness unbiasednessof ofdata-simply goodness-of-fit to distribution. to the the logarithmic logarithmic distribution. 
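The goodness-of-fit check suggested above can be sketched in a few lines of Python. The sketch below uses a Pearson chi-square statistic on first digits only, which is one possible choice and is not prescribed by the paper; the function names and the two toy data sets (the powers $2^n$ and the integers 1 to 999) are assumptions made purely for illustration. The 5% critical value 15.507 for 8 degrees of freedom is the standard tabulated one.

```python
import math

# First-digit probabilities under the logarithmic law: log10(1 + 1/d).
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Upper 5% point of the chi-square distribution with 8 degrees of freedom.
CRITICAL_5PCT_8DF = 15.507

def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])

def chi_square_vs_benford(values):
    """Pearson chi-square statistic of first-digit counts against Benford."""
    counts = [0] * 9
    for v in values:
        counts[first_digit(v) - 1] += 1
    total = len(values)
    return sum(
        (counts[i] - total * BENFORD[i]) ** 2 / (total * BENFORD[i])
        for i in range(9)
    )

if __name__ == "__main__":
    powers_of_two = [2 ** n for n in range(1, 1001)]   # geometric sequence
    consecutive = list(range(1, 1000))                 # digits equally frequent
    print("2^n  :", chi_square_vs_benford(powers_of_two))
    print("1-999:", chi_square_vs_benford(consecutive))
```

The chi-square reference distribution presumes independent observations, so for a deterministic sequence such as $2^n$ the statistic is only a descriptive measure of distance from the logarithmic distribution, not a formal test.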
Many standard constructions of r.p.m.'s are automatically scale and base neutral, and thus satisfy the log-limit significant-digit law. Consider the problem of generating a random variable $X$ (or r.p.m.) on $[1, 10)$. If the units chosen are desired to be just as likely stock per dollars as dollars per stock [or Benford's (1938) "candles per watt" versus "watts per candle"], then the distribution generated should be reciprocal invariant, so for example its $\log_{10}$ should be symmetric about 1/2. So first set $F(1) = 0$ and $F(10^-) = 1$; next pick $F(\sqrt{10})$ randomly [according to, say, uniform measure on $(0, 1)$] since $\sqrt{10}$ is the reciprocal-invariant point $t = 10/t$; then pick $F(10^{1/4})$ and $F(10^{3/4})$, independently and uniformly on $(0, F(\sqrt{10}))$ and $(F(\sqrt{10}), 1)$, respectively, and continue in this manner. This classical construction of Dubins and Freedman (1967, Lemma 9.28) is known to generate an r.p.m. a.s. whose expected distribution $EM$ is the logarithmic probability $P_L$ of (8), and hence by Theorem 3 is scale and base unbiased, even though with probability 1 every distribution generated this way will be both scale and base biased. On the average, this r.p.m. is unbiased, so the log-limit significant-digit law will apply to all $M$-random $k$-samples. [The construction described above using uniform measure is not crucial. Any base measure on $(0, 1)$ symmetric about 1/2 will have the same property (Dubins and Freedman, 1967, Theorem 9.29).]

Also, many significant-digit data sets other than random $k$-samples have scale- or base-neutral mantissa frequency, in which case combining such data together with unbiased random $k$-samples (as did Benford, perhaps, in combining data from mathematical tables with that from newspaper statistics) will still result in convergence to the logarithmic distribution.
For example, if certain data represents (deterministic) periodic sampling of a geometric process (e.g., $X_n = 2^n$), then by Theorem 1 of Diaconis (1977), this deterministic process is a strong Benford sequence, which implies that its limiting frequency (separately or averaged with unbiased random $k$-samples) will satisfy (4).

An interesting open problem is to determine which common distributions (or mixtures thereof) satisfy Benford's law, that is, are scale or base invariant or which have mantissas with logarithmic distributions. For example, the standard Cauchy distribution is close to satisfying Benford's law (cf. Raimi, 1976) and the standard Gaussian is not, but perhaps certain natural mixtures of some common distributions are.

Of course there are many r.p.m.'s and sampling processes which do not satisfy the log-limit law (and hence are necessarily both scale and base biased), such as the (a.s.) constant uniform distribution on $[1, 10)$ or (for some reason not yet well understood by the author) the r.p.m. constructed via Dubins-Freedman with base probability uniform measure on the horizontal bisector of the rectangle, which has expected log distribution a renormalized arc-sin distribution (Dubins and Freedman, 1967, Theorem 9.21).

APPLICATIONS

The statistical log-limit significant-digit law (Theorem 3) may help justify some of the recent applications of Benford's law, several of which will now be described.

In scientific calculating, if the distribution of input data into a central processing station is known, then this information can be used to design a computer which is optimal (in any of a number of ways) with respect to that distribution. Thus if the computer users are like the log-table users of Newcomb or the taxpayers of Nigrini's study, their data represent an unbiased (as to units, base, reciprocity, ...) random mixture of various distributions, in which case it will (by Theorem 3) necessarily follow Benford's law.
Once a specific input distribution has been identified, in this case the logarithmic distribution, then that information can be exploited to improve computer design. Feldstein and Turner (1986) show that "under the assumption of the logarithmic distribution of numbers, floating-point addition and subtraction can result in overflow or underflow with alarming frequency ... and lead to the suggestion of a long word format which will reduce the risks to acceptable levels." Schatte (1988) concludes that under assumption of logarithmic input, base $b = 2^3$ is optimal with respect to minimizing storage space. Knuth (1969), after having "established the logarithmic law for integers by direct calculation," leaves as an exercise (page 228) determining the desirability of hexadecimal versus binary with respect to different objectives.
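To give a feel for how the logarithmic law enters such base comparisons, here is a small Python sketch of one possible cost measure: the expected number of leading zero bits in the first digit of a normalized base-$2^m$ mantissa. It relies only on the base-$b$ analog of the law, under which the first digit equals $d$ with probability $\log_b(1 + 1/d)$. This measure and the function names are illustrative assumptions; it is not claimed to be the criterion used by Schatte or the solution to Knuth's exercise.

```python
import math

def leading_digit_prob(d, base):
    """Logarithmic-law probability that the first base-`base` digit equals d."""
    return math.log(1 + 1 / d, base)

def expected_leading_zero_bits(base_bits):
    """Expected unused leading bits in the first digit of a base 2**base_bits
    mantissa whose leading digit follows the logarithmic law."""
    base = 2 ** base_bits
    total = 0.0
    for d in range(1, base):
        wasted = base_bits - d.bit_length()   # zero bits above the top 1-bit
        total += wasted * leading_digit_prob(d, base)
    return total

if __name__ == "__main__":
    for bits in (1, 3, 4):
        print(2 ** bits, expected_leading_zero_bits(bits))
```

Under this measure binary wastes no leading bits, while a hexadecimal first digit is 1, 2-3, 4-7 or 8-15 with probability 1/4 each, so on average 1.5 of its four bits are leading zeros.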
and Bareiss Bareiss (1985) (1985) tives.Barlow Barlowand computer conclude that the the logarithmic logarithmiccomputer concludethat confidenceintervals intervals smaller error error confidence has smaller has for errors point than aa floating floatingpoint forroundoff roundoff errorsthan computer word computerword withthe the same same computer computerwith the thesame same number number size size and and approximately approximately range. range. A law is is ofBenford's Benford'slaw A second modernapplication applicationof secondmodern to where goodness-of-fit goodness-of-fit to mathematical mathematicalmodelling, modelling,where against distribution has been sugsughas been the logarithmic logarithmic distribution against the gested testof of reasonableness reasonableness Varian, 1972) 1972) as as aa test gested (c£ (cf. Varian, of proposed model, of aa proposed sortof of"Benford"Benfordmodel,aa sort of output outputof in-Benford-out" criteria. Wood's In Nigrini Nigrini and and Wood's criteria. In in-Benford-out" 1990 (1995) forexample, example,the the 1990 tabulations,for (1995) census census tabulations, census populations of in the the United of the the counties countiesin United census populations States logarithmic law the significant-digit logarithmiclaw States follow followthe significant-digit very it seems reasonablethat thatmathematmathematso it seemsreasonable veryclosely, closely,so ofthe the ical predicting future futurepopulations populationsof forpredicting ical models modelsfor counties be aa close fitto If not, not, countiesshould shouldalso also be closefit to Benford. Benford.If perhaps aa different model be considered. modelshould considered. different shouldbe perhaps Nigrini has vast As one finalexample, has amassed amassed aa vast example,Nigrini As one final collection tax and and accounting data includincludcollectionof of U.S. U.S. 
tax accountingdata ining interest IRS-reported interestinofIRS-reported ing 91,022 91,022observations observationsof come the and share share volumes volumes (at (at the come (Nigrini, 1996), and (Nigrini,1996), rate per day) New York the New millionper on the York day) on rate of of 200-350 200-350 million in most Stock mostof ofthese these StockExchange 1995),and and in (Nigrini,1995), Exchange(Nigrini, fit cases distribution is an excellent excellentfit the logarithmic distribution is an logarithmic cases the (perhaps because each is an an unbiased unbiasedmixmixeach is exactlybecause (perhapsexactly ture distributions). He posdata from fromdifferent different He posof data distributions). ture of tulates reasonable distridistrithat Benford is often oftenaa reasonable tulates that Benfordis bution to digits of large the significant bution to expect forthe large significant digitsof expectfor accounting proposed aa goodnessgoodnessand has data sets sets and has proposed accountingdata In an artiof-fit Benford to fraud. In to detect an artidetectfraud. testagainst of-fittest against Benford in July cle Journal in 1995 (Berton, in the StreetJournal (Berton, July1995 cle in the Wall WallStreet 1995) thatthe the District DistrictAttorney's it was was announced announcedthat Attorney's 1995) it BenNew York, using Nigrini's office in Brooklyn, York,using Nigrini'sBenofficein Brooklyn,New tests, ford has detected detectedand and charged fordgoodness-of-fit charged tests,has goodness-of-fit withfraud. fraud. groups New York companieswith at seven seven New Yorkcompanies groupsat in using The this has expressed interestin The Dutch DutchIRS IRS has usingthis expressedinterest and Nigrini Benford Nigrini tax fraud, to detect detectincome incometax Benfordtest testto fraud,and IRS. has proposals to to the the U.S. U.S. IRS. 
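A goodness-of-fit test of first significant digits against the logarithmic distribution can be sketched as follows. This is only an illustration: the text does not specify the form of Nigrini's test, so the Pearson chi-square statistic used here is an assumption, and the names `first_digit` and `benford_chi_square` are hypothetical helpers introduced for this example.

```python
import math
from collections import Counter

# First-digit probabilities under the logarithmic law:
# P(D1 = d) = log10(1 + 1/d) for d = 1, ..., 9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the first significant (nonzero) decimal digit of a nonzero number."""
    # Scientific notation puts the first significant digit in position 0,
    # e.g. 0.00123 -> '1.2300000000000000e-03'.
    return int(f"{abs(x):.16e}"[0])


def benford_chi_square(values):
    """Pearson chi-square statistic of observed first digits against Benford.

    Assumption: a plain chi-square test with 8 degrees of freedom; zeros are
    skipped because they have no first significant digit.
    """
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items()
    )


if __name__ == "__main__":
    powers = [2 ** k for k in range(1, 1001)]  # first digits close to Benford
    uniform = list(range(1, 1000))             # first digits equally frequent
    print(benford_chi_square(powers))
    print(benford_chi_square(uniform))
```

The statistic would be compared with a chi-square critical value on 8 degrees of freedom (about 15.51 at the 5% level); a data set whose statistic far exceeds that value is a poor fit to Benford and, in the auditing setting described above, a candidate for closer inspection.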
ACKNOWLEDGMENTS

The author is grateful to the Free University of Amsterdam and especially Professor Piet Holewijn for their support and hospitality during the summer of 1995, and also is grateful to Pieter Allaart, David Gilat, Ralph Raimi and Peter Schatte for a number of suggestions, to Klaas van Harn for several corrections and valuable advice concerning notation and to an anonymous Associate Editor for excellent ideas for improving the exposition. This research was partially supported by NSF Grant DMS-95-03375.

REFERENCES

ADHIKARI, A. and SARKAR, B. (1968). Distribution of most significant digit in certain functions whose arguments are random variables. Sankhyā Ser. B 30 47-58.
BARLOW, J. and BAREISS, E. (1985). On roundoff error distributions in floating point and logarithmic arithmetic. Computing 34 325-347.
BECKER, P. (1982). Patterns in listings of failure-rate and MTTF values and listings of other data. IEEE Transactions on Reliability R-31 132-134.
BENFORD, F. (1938). The law of anomalous numbers. Proceedings of the American Philosophical Society 78 551-572.
BERTON, L. (1995). He's got their number: scholar uses math to foil financial fraud. Wall Street Journal, July 10.
BUCK, B., MERCHANT, A. and PEREZ, S. (1993). An illustration of Benford's first digit law using alpha decay half lives. European J. Phys. 14 59-63.
BURKE, J. and KINCANON, E. (1991). Benford's law and physical constants: the distribution of initial digits. Amer. J. Phys. 59 952.
COHEN, D. (1976). An explanation of the first digit phenomenon. J. Combin. Theory Ser. A 20 367-370.
COHEN, D. and KATZ, T. (1984). Prime numbers and the first digit phenomenon. J. Number Theory 18 261-268.
DE FINETTI, B. (1972). Probability, Induction and Statistics. Wiley, New York.
DIACONIS, P. (1977). The distribution of leading digits and uniform distribution mod 1. Ann. Probab. 5 72-81.
DIACONIS, P. and FREEDMAN, D. (1979). On rounding percentages. J. Amer. Statist. Assoc. 74 359-364.
DUBINS, L. and FREEDMAN, D. (1967). Random distribution functions. Proc. Fifth Berkeley Symp. Math. Statist. Probab. 183-214. Univ. California Press, Berkeley.
FELDSTEIN, A. and TURNER, P. (1986). Overflow, underflow, and severe loss of significance in floating-point addition and subtraction. IMA J. Numer. Anal. 6 241-251.
FLEHINGER, B. (1966). On the probability that a random number has initial digit A. Amer. Math. Monthly 73 1056-1061.
HAMMING, R. (1970). On the distribution of numbers. Bell System Technical Journal 49 1609-1625.
HILL, T. (1995a). Base-invariance implies Benford's law. Proc. Amer. Math. Soc. 123 887-895.
HILL, T. (1995b). The significant-digit phenomenon. Amer. Math. Monthly 102 322-327.
JECH, T. (1992). The logarithmic distribution of leading digits and finitely additive measures. Discrete Math. 108 53-57.
KALLENBERG, O. (1983). Random Measures. Academic Press, New York.
KNUTH, D. (1969). The Art of Computer Programming 2 219-229. Addison-Wesley, Reading, MA.
LEY, E. (1995). On the peculiar distribution of the U.S. stock indices digits. Amer. Statist. To appear.
LOÈVE, M. (1977). Probability Theory 1, 4th ed. Springer, New York.
NEWCOMB, S. (1881). Note on the frequency of use of the different digits in natural numbers. Amer. J. Math. 4 39-40.
NIGRINI, M. (1995). Private communication.
NIGRINI, M. J. (1996). A taxpayer compliance application of Benford's law. Journal of the American Taxation Association 18 72-91.
NIGRINI, M. and WOOD, W. (1995). Assessing the integrity of tabulated demographic data. Preprint, Univ. Cincinnati and St. Mary's Univ.
RAIMI, R. (1969). The peculiar distribution of first digits. Scientific American December 109-119.
RAIMI, R. (1976). The first digit problem. Amer. Math. Monthly 102 322-327.
RAIMI, R. (1985). The first digit phenomenon again. Proceedings of the American Philosophical Society 129 211-219.
SCHATTE, P. (1988). On mantissa distributions in computing and Benford's law. J. Inform. Process. Cybernet. 24 443-455.
VARIAN, H. (1972). Benford's law. Amer. Statist. 23 65-66.
WEAVER, W. (1963). Lady Luck: The Theory of Probability 270-277. Doubleday, New York.