Department of Applied Mathematics, University of Venice

WORKING PAPER SERIES

Marco Corazza, Andrea Ellero and

Alberto Zorzi

What Sequences obey Benford's Law?

Working Paper n. 185/2008

November 2008

ISSN: 1828-6887

This Working Paper is published under the auspices of the Department of Applied

Mathematics of the Ca’ Foscari University of Venice. Opinions expressed herein are

those of the authors and not those of the Department. The Working Paper series is

designed to divulge preliminary or incomplete work, circulated to favour discussion

and comments. Citation of this paper should consider its provisional nature.

What Sequences obey Benford’s Law?

Marco Corazza Andrea Ellero

<corazza@unive.it> <ellero@unive.it>

Alberto Zorzi

<albzorzi@unive.it>

Dipartimento di Matematica Applicata

Università Ca’ Foscari di Venezia

Abstract. We propose a new necessary and sufficient condition to test whether a sequence is Benford (base-b) or not, and apply this characterization to some kinds of sequences, (re)obtaining some well-known results, such as the fact that the sequence of powers of 2 is Benford (base-10).

Keywords: Benford’s law, equidistributed sequences, ergodic endomorphisms

JEL Classification Numbers: C02, C83

MathSci Classification Numbers: 28D99, 60F99

Correspondence to:

Marco Corazza Dept. of Applied Mathematics, University of Venice

Dorsoduro 3825/e

30123 Venezia, Italy

Phone: [++39] (041)-234-6921

Fax: [++39] (041)-522-1756

E-mail: corazza@unive.it

1 The importance of being one

If we consider the most significant digit of the powers of two, $2^1, 2^2, 2^3, \dots, 2^n$, it turns out that the frequencies are not the same for all digits: for example, among the first 1000 powers of two, those starting with digit 1 appear most often (30.1%), followed by those with first digit 2 (17.6%), then those with 3 (12.5%), and so on, with frequencies decreasing down to 4.5% for digit 9. In fact, it is possible to prove (see Section 3) that the probability for a generic term of the sequence to display $d$ as the most significant digit is

$$\log_{10}\left(1 + \frac{1}{d}\right),$$

which means that the sequence of powers of 2 obeys Benford’s Law [5].
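The frequencies quoted above can be reproduced with a short script. The following is a minimal Python sketch, not part of the original paper; the function name `first_digit_frequencies` is hypothetical and introduced only for illustration. It relies on Python's exact integer arithmetic, so the first character of the decimal string of each power is its true leading digit.

```python
import math
from collections import Counter


def first_digit_frequencies(n):
    """Relative frequency of each leading decimal digit among 2^1, ..., 2^n."""
    # Python integers are exact, so str(2 ** k)[0] is the true first digit.
    counts = Counter(int(str(2 ** k)[0]) for k in range(1, n + 1))
    return {d: counts[d] / n for d in range(1, 10)}


if __name__ == "__main__":
    freqs = first_digit_frequencies(1000)
    for d in range(1, 10):
        expected = math.log10(1 + 1 / d)
        print(f"digit {d}: observed {freqs[d]:.3f}   log10(1 + 1/d) = {expected:.3f}")
```

The observed column should stay close to the Benford column for every digit, which is the empirical fact the rest of the paper explains.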

In general, a sequence of real numbers represented in base $b$ is said to be Benford (base-b) if the probability of observing digit $d$ as the first digit of a term of the sequence is

$$\log_b\left(1 + \frac{1}{d}\right)$$

for each integer $d$ such that $1 \le d < b$ [8]. For an overview of Benford’s Law and a discussion of its possible applications see e.g. [5].¹

The main result we propose in this paper is a new necessary and sufficient condition to test whether a sequence is Benford (base-b) or not. We then apply this characterization to some kinds of sequences, (re)obtaining, for example, that the sequence of powers of 2 is Benford (base-10) but not Benford (base-4).

We also show how the proposed characterization is related in a natural way to Birkhoff’s ergodic theorem.

2 Benford (base-b) sequences: a necessary and sufficient condition

We first show how to define the most significant digit of a number by means of elementary functions. Given a real number $x$, we use the floor and ceiling functions defined, respectively, by $\lfloor x \rfloor = \max\{n \in \mathbb{Z} : n \le x\}$ and $\lceil x \rceil = \min\{n \in \mathbb{Z} : n \ge x\}$. In this way, the fractional part of $x$ is $x \bmod 1 = x - \lfloor x \rfloor$.

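Lemma 1 If $x$ is a positive real number, then its first digit in base $b$ is $\lfloor b^{(\log_b x) \bmod 1} \rfloor$.

Proof. Since $\lfloor \log_b x \rfloor \le \log_b x < \lfloor \log_b x \rfloor + 1$, we have $b^{\lfloor \log_b x \rfloor} \le x < b^{\lfloor \log_b x \rfloor + 1}$ and hence $1 \le x / b^{\lfloor \log_b x \rfloor} < b$. Therefore $\lfloor x / b^{\lfloor \log_b x \rfloor} \rfloor$ is the most significant digit of $x$. To complete the proof it is sufficient to observe that $x = b^{\log_b x} = b^{\lfloor \log_b x \rfloor}\, b^{(\log_b x) \bmod 1}$, because each real number is the sum of its integer and fractional parts; thus $x / b^{\lfloor \log_b x \rfloor} = b^{(\log_b x) \bmod 1}$. ⋄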
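Lemma 1 translates almost literally into code. The sketch below is an illustration under stated assumptions rather than a robust implementation: the function name `first_digit` is hypothetical, and the computation uses floating-point logarithms, so for inputs whose significand is extremely close to an integer (for instance exact powers of $b$) rounding may shift the result by one.

```python
import math


def first_digit(x, b=10):
    """First significant digit of x > 0 in base b, following Lemma 1."""
    if x <= 0 or b < 2:
        raise ValueError("x must be positive and b must be an integer >= 2")
    log_b_x = math.log(x) / math.log(b)
    fractional_part = log_b_x - math.floor(log_b_x)  # (log_b x) mod 1
    # b ** fractional_part lies in [1, b); its integer part is the first digit.
    return math.floor(b ** fractional_part)
```

For example, `first_digit(314.15)` evaluates to 3, and `first_digit(45, 4)` evaluates to 2, in agreement with the base-4 representation 231 of the number 45.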

¹ As observed in [8] (Remark 9.2.5), the definition of Benford (base-b) sequence we use here is less restrictive than the definition of b-Benford sequence given, for example, in [1, 7].


The next result provides a way to count the number of terms of a sequence whose first significant digit is not greater than $d$; it will be used later on to prove a general criterion to test whether a sequence is Benford (base-b) or not.

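Lemma 2 If $x_1, x_2, \dots, x_n$ are positive real numbers and if $1 \le d < b$, then

$$\#\{x_k : 1 \le k \le n,\ \text{first digit of } x_k \le d\} = \sum_{k=1}^{n} \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil.$$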

Proof. According to Lemma 1, the first digit of $x_k$ is $\le d$ if and only if $b^{(\log_b x_k) \bmod 1} < d + 1$, that is, if and only if $(\log_b x_k) \bmod 1 < \log_b(d+1)$. To complete the proof we just need to prove that the function

$$f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil$$

can be rewritten as

$$f_{d+1}(y) = \begin{cases} 1 & \text{if } y \in [0, \log_b(d+1)) \\ 0 & \text{if } y \in [\log_b(d+1), 1]. \end{cases}$$

In fact, the function $g(y) = \frac{1}{b^{y}} - \frac{1}{d+1}$ is strictly decreasing on $[0,1]$ since $g'(y) = -\frac{1}{b^{y}} \log b < 0$. Moreover $0 < g(0) = \frac{d}{d+1} < 1$, $-1 < g(1) = \frac{1}{b} - \frac{1}{d+1} \le 0$ and $g(\log_b(d+1)) = 0$. ⋄
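The counting identity of Lemma 2 can be checked exactly on integer sequences. The following Python sketch is only an illustration and makes two assumptions that are not in the paper: the terms $x_k$ are positive integers, and $b^{(\log_b x_k) \bmod 1}$ is computed as the exact rational $x_k / b^{\lfloor \log_b x_k \rfloor}$, an identity taken from the proof of Lemma 1. All function names are hypothetical.

```python
import math
from fractions import Fraction


def scaled_significand(x, b):
    """Return x / b**floor(log_b x) exactly, for a positive integer x and base b >= 2."""
    power = 1
    while power * b <= x:  # find the largest power of b not exceeding x
        power *= b
    return Fraction(x, power)


def count_by_ceiling(xs, d, b):
    """Right-hand side of Lemma 2: sum of ceil(1 / b**((log_b x_k) mod 1) - 1 / (d + 1))."""
    return sum(math.ceil(1 / scaled_significand(x, b) - Fraction(1, d + 1)) for x in xs)


def count_directly(xs, d, b):
    """Left-hand side of Lemma 2: number of terms whose first base-b digit is <= d."""
    return sum(1 for x in xs if math.floor(scaled_significand(x, b)) <= d)
```

Using rationals avoids the rounding problems that floating-point logarithms would cause when a term sits exactly on a digit boundary, for example $x_k = 2$ with $d = 1$ in base 10, where the quantity inside the ceiling is exactly zero.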

Now it is possible to prove the following theorem, which characterizes a Benford (base-b)

sequence.

Theorem 1 The sequence $x_k$ of positive real numbers is Benford (base-b) if and only if

$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil = \log_b(d+1)$$

for each integer $d$ such that $1 \le d < b$.

Proof. Consider a generic term $x_k$ of the sequence. Using Lemma 2, we have that the probability that the first digit of $x_k$ is not greater than $d$ is given by

$$\mathrm{Prob}(\text{first digit of } x_k \le d) = \lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil.$$

Observe now that, since

$$\mathrm{Prob}(\text{first digit of } x_k \le d) = \sum_{i=1}^{d} \mathrm{Prob}(\text{first digit of } x_k = i)$$

and

$$\mathrm{Prob}(\text{first digit of } x_k = i) = \mathrm{Prob}(\text{first digit of } x_k \le i) - \mathrm{Prob}(\text{first digit of } x_k \le i-1),$$

the definition of Benford (base-b) sequence given in Section 1 allows us to claim that $x_k$ is Benford (base-b) if and only if

$$\mathrm{Prob}(\text{first digit of } x_k \le d) = \log_b(d+1)$$

for each integer $d$ such that $1 \le d < b$. This completes the proof. ⋄
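The last step of the proof rests on a telescoping sum, which may be worth writing out. Assuming the digit probabilities of Section 1, the cumulative probability is

```latex
\begin{align*}
\mathrm{Prob}(\text{first digit of } x_k \le d)
  &= \sum_{i=1}^{d} \log_b\!\left(1 + \frac{1}{i}\right)
   = \sum_{i=1}^{d} \bigl[\log_b(i+1) - \log_b i\bigr] \\
  &= \log_b(d+1) - \log_b 1
   = \log_b(d+1).
\end{align*}
```

Conversely, if the cumulative probabilities equal $\log_b(d+1)$ for every $d$, subtracting two consecutive ones gives $\log_b(i+1) - \log_b i = \log_b(1 + 1/i)$ for the probability of digit $i$.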

The theorem above suggests a way to verify whether a sequence is Benford or not, but computing the rather complicated limit it involves seems to be a difficult task. The following lemma suggests, as a way to deal with the limit, computing a suitably defined integral; we will show how to use this idea in the next section.

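Lemma 3 Let $d$ be an integer such that $1 \le d < b$ and let $f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil$. If the sequence $y_k$ is contained in the interval $[0,1]$ and if

$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(y_k) = \int_0^1 f_{d+1}(y)\,dy$$

then

$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left\lceil \frac{1}{b^{y_k}} - \frac{1}{d+1} \right\rceil = \log_b(d+1).$$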

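Proof. It is sufficient to observe that

$$f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil = \begin{cases} 1 & \text{if } y \in [0, \log_b(d+1)) \\ 0 & \text{if } y \in [\log_b(d+1), 1], \end{cases}$$

thus $\int_0^1 f_{d+1}(y)\,dy = \log_b(d+1)$. ⋄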
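The value of the integral in Lemma 3 can be checked numerically. The sketch below, which is illustrative only and uses hypothetical function names, approximates $\int_0^1 f_{d+1}(y)\,dy$ with a midpoint Riemann sum; since $f_{d+1}$ is a step function with a single jump, the error is at most of the order of one cell width.

```python
import math


def f(y, d, b):
    """f_{d+1}(y) = ceil(1 / b**y - 1 / (d + 1)) from Lemma 3."""
    return math.ceil(1 / b ** y - 1 / (d + 1))


def midpoint_integral(d, b, cells=100_000):
    """Midpoint Riemann sum of f_{d+1} over [0, 1] with the given number of cells."""
    return sum(f((j + 0.5) / cells, d, b) for j in range(cells)) / cells


if __name__ == "__main__":
    for d in range(1, 10):
        print(f"d = {d}: integral ~ {midpoint_integral(d, 10):.5f}   "
              f"log10(d + 1) = {math.log10(d + 1):.5f}")
```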

3 Some Benford (base-b) sequences

As a possible application of Theorem 1, we give a new kind of proof of a well-known theorem, due to Diaconis [3]: a sequence of positive real numbers $x_k$ is Benford (base-b) if the sequence of their logarithms $\log_b x_k$ is uniformly distributed modulo 1. We recall that a sequence $y_k$ is uniformly distributed modulo 1, or equidistributed in $[0,1]$, if

$$\lim_{n \to +\infty} \frac{\#\{y_k : 1 \le k \le n,\ a \le y_k \le b\}}{n} = b - a$$

for each interval $[a, b] \subseteq [0,1]$ (in this definition only, $a$ and $b$ denote the endpoints of the interval and $b$ is not the base). Another, equivalent way to define an equidistributed sequence is the following (see for example [9, Problem 162]): the sequence $y_k$ is equidistributed if and only if

$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f(y_k) = \int_0^1 f(y)\,dy \qquad (1)$$

for every function $f$ which is Riemann integrable on $[0,1]$. It is now an easy matter to prove the following well-known result, stating that the exponentials of equidistributed sequences are Benford.


Theorem 2 ([3, 8]) A sequence $x_k$ of positive real numbers is Benford (base-b) if the sequence $y_k = \log_b x_k$ is uniformly distributed modulo 1.

Proof. For a fixed integer $d$ consider the function $f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil$, which is Riemann integrable on $[0,1]$, and the sequence $y_k$. Applying (1) we obtain

$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(y_k) = \int_0^1 f_{d+1}(y)\,dy.$$

By Lemma 3 we have therefore

$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left\lceil \frac{1}{b^{y_k}} - \frac{1}{d+1} \right\rceil = \log_b(d+1).$$

Thus the sequence $x_k$ is Benford (base-b) due to Theorem 1. ⋄

The previous theorem makes it easy to use known results on equidistributed sequences to state that related geometric sequences are Benford (base-b).

For example, the sequence $y_k = (k\alpha) \bmod 1$ is equidistributed if $\alpha$ is an irrational number, as independently proved by Bohl, Sierpinski and Weyl (see e.g. [8], Theorem 12.3.2); hence, the sequence $x_k = r^k$ is Benford (base-b) if $\log_b r$ is irrational ([8], Theorem 9.2.6). Incidentally, this property proves the claim made at the beginning of Section 1: the sequence $x_k = 2^k$ is Benford (base-10), since $\log_{10} 2$ is irrational. The same sequence is instead clearly not Benford (base-4), as one can easily observe.²
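The failure in base 4 is easy to see numerically. Since $\log_4 2 = 1/2$ is rational, the first digits of the powers of 2 in base 4 simply alternate between 2 and 1. The Python sketch below is an illustration with hypothetical function names; it extracts the first base-$b$ digit by exact integer division and compares the observed frequencies with $\log_4(1 + 1/d)$.

```python
import math


def first_digit_in_base(x, b):
    """First digit of the positive integer x written in base b (exact integer arithmetic)."""
    while x >= b:
        x //= b  # drop the last base-b digit
    return x


def digit_frequencies(n, b):
    """Relative frequency of each first digit 1, ..., b-1 among 2^1, ..., 2^n in base b."""
    counts = [0] * b
    for k in range(1, n + 1):
        counts[first_digit_in_base(2 ** k, b)] += 1
    return {d: counts[d] / n for d in range(1, b)}


if __name__ == "__main__":
    for d, freq in digit_frequencies(1000, 4).items():
        expected = math.log(1 + 1 / d, 4)
        print(f"digit {d}: observed {freq:.3f}   log4(1 + 1/d) = {expected:.3f}")
```

Digit 1 happens to appear with frequency 1/2, which equals $\log_4 2$, but digit 2 also appears with frequency 1/2 instead of $\log_4(3/2) \approx 0.292$, and digit 3 never appears although $\log_4(4/3) \approx 0.208$, so the sequence is not Benford (base-4).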

A more general context where Lemma 3 applies concerns ergodic theory. Consider an ergodic endomorphism $T$ defined on a probability space with domain $[0,1]$, and the function $f_{d+1}$ defined as in Lemma 3. Birkhoff’s ergodic theorem (see e.g. [2], Appendix 3) claims that for almost every $y_1 \in [0,1]$,

$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(T^k(y_1)) = \int_0^1 f_{d+1}(y)\,dy,$$

while Lemma 3 tells us that

$$\int_0^1 f_{d+1}(y)\,dy = \log_b(d+1).$$

Thus, by Theorem 1, we obtain that a sequence $x_k$ of positive real numbers is Benford (base-b) if $y_k = \log_b x_k$ is generated via an ergodic endomorphism. More precisely, $x_k$ is Benford (base-b) as soon as there exists an ergodic endomorphism $T$ such that $T^k y_1 = y_k$ for all $k$ and the equality of Birkhoff’s ergodic theorem holds for the initial term $y_1$ and for each integer $d$ such that $1 \le d < b$.
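As a rough numerical illustration of the ergodic averages above, one can take an irrational rotation of the unit interval, a standard example of an ergodic map with respect to Lebesgue measure. The choices below are assumptions made only for this sketch and do not come from the paper: the map $T(y) = (y + \alpha) \bmod 1$ with $\alpha = \sqrt{2} - 1$, the starting point $y_1 = 0.1$, and all function names.

```python
import math

ALPHA = math.sqrt(2) - 1  # irrational rotation angle (illustrative choice)


def f(y, d, b):
    """f_{d+1}(y) = ceil(1 / b**y - 1 / (d + 1)), as in Lemma 3."""
    return math.ceil(1 / b ** y - 1 / (d + 1))


def rotation(y):
    """T(y) = (y + ALPHA) mod 1, an irrational rotation of [0, 1)."""
    return (y + ALPHA) % 1.0


def birkhoff_average(T, y1, n, d, b):
    """(1/n) * sum of f_{d+1} along the orbit T(y1), T^2(y1), ..., T^n(y1)."""
    total, y = 0, y1
    for _ in range(n):
        y = T(y)
        total += f(y, d, b)
    return total / n


if __name__ == "__main__":
    for d in range(1, 10):
        avg = birkhoff_average(rotation, 0.1, 20_000, d, 10)
        print(f"d = {d}: average {avg:.4f}   log10(d + 1) = {math.log10(d + 1):.4f}")
```

The averages are computed in floating point, so they only approximate the limit; with these choices they should nonetheless settle near $\log_{10}(d+1)$ for each $d$.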

² The sequence $x_k = 2^k$ written using base 4 reads 1, 2, 10, 20, 100, 200, ...


References

[1] A. Berger, L.A. Bunimovich, T.P. Hill (2005), “One-dimensional dynamical systems and Benford’s law”, Transactions of the American Math. Soc., 357 (1), 197–219.

[2] I.P. Cornfeld, S.V. Fomin and Ya.G. Sinai (1982), Ergodic Theory, Springer-Verlag, New York.

[3] P. Diaconis (1977), The Annals of Probability, 5 (1), 72–81.

[4] G.H. Hardy, E.M. Wright (1979), An Introduction to the Theory of Numbers, fifth ed., Oxford Univ. Press, New York.

[5] T.P. Hill (1995), “The Significant-Digit Phenomenon”, The Amer. Mathematical Monthly, 102 (4), 322–327.

[6] T.P. Hill (1995), “A Statistical Derivation of the Significant-Digit Law”, Statistical Sc., 10 (4), 354–363.

[7] A.V. Kontorovich, S.J. Miller (2005), “Benford’s law, values of L-functions and the 3x+1 problem”, Acta Arithmetica, 120, 269–297.

[8] S.J. Miller, R. Takloo-Bighash (2006), An Invitation to Modern Number Theory, Princeton Univ. Press, Princeton and Oxford.

[9] G. Polya, G. Szegö (1972), Problems and Theorems in Analysis, vol. I, Springer-Verlag, Heidelberg.
