Department of Applied Mathematics, University of Venice
WORKING PAPER SERIES
Marco Corazza, Andrea Ellero and
Alberto Zorzi
What Sequences obey
Benford's Law ?
Working Paper n. 185/2008
November 2008
ISSN: 1828-6887
This Working Paper is published under the auspices of the Department of Applied
Mathematics of the Ca’ Foscari University of Venice. Opinions expressed herein are
those of the authors and not those of the Department. The Working Paper series is
designed to divulge preliminary or incomplete work, circulated to favour discussion
and comments. Citation of this paper should consider its provisional nature.
What Sequences obey Benford’s Law?
Marco Corazza Andrea Ellero
<corazza@unive.it> <ellero@unive.it>
Alberto Zorzi
<albzorzi@unive.it>
Dipartimento di Matematica Applicata
Università Ca’ Foscari di Venezia
Abstract. We propose a new necessary and sufficient condition to test whether a sequence
is Benford (base-b) or not, and we apply this characterization to some kinds of sequences,
(re)obtaining some well-known results, such as the fact that the sequence of powers of 2 is
Benford (base-10).
Keywords: Benford’s law, equidistributed sequences, ergodic endomorphisms
JEL Classification Numbers: C02, C83
MathSci Classification Numbers: 28D99, 60F99
Correspondence to:
Marco Corazza Dept. of Applied Mathematics, University of Venice
Dorsoduro 3825/e
30123 Venezia, Italy
Phone: [++39] (041)-234-6921
Fax: [++39] (041)-522-1756
E-mail: corazza@unive.it
1 The importance of being one
If we consider the most significant digit of the powers of two, $2^1, 2^2, 2^3, \dots, 2^n$, it
turns out that the frequencies are not the same for all the digits: for example, among the
first 1000 powers of two, the ones which start with digit 1 appear most often (30.1 %), the
powers which have 2 as first digit follow (17.6 %), then the ones with 3 (12.5 %), and so on,
with the frequencies decreasing down to 4.5 % for digit 9. In fact, it is possible to prove
(see Section 3) that the probability for a generic term of the sequence to display $d$ as the
most significant digit is
\[
\log_{10}\left(1 + \frac{1}{d}\right),
\]
that is, the sequence of powers of 2 obeys Benford’s Law [5].
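The percentages quoted above can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original paper: it tallies the leading decimal digit of $2^1, \dots, 2^{1000}$ with exact integer arithmetic and prints the observed frequency next to $\log_{10}(1 + 1/d)$. The function name `leading_digit_frequencies` is a hypothetical choice made for this illustration.

```python
from math import log10

def leading_digit_frequencies(n_terms=1000):
    """Relative frequency of each leading decimal digit among 2**1, ..., 2**n_terms."""
    counts = {d: 0 for d in range(1, 10)}
    for k in range(1, n_terms + 1):
        # Exact integer arithmetic: the first character of the decimal
        # string of 2**k is its most significant digit.
        counts[int(str(2 ** k)[0])] += 1
    return {d: counts[d] / n_terms for d in counts}

freqs = leading_digit_frequencies()
for d in range(1, 10):
    print(f"{d}: observed {freqs[d]:.3f}, Benford {log10(1 + 1 / d):.3f}")
```

Working with the decimal string avoids floating-point logarithms altogether, so the tally is exact for the finite sample considered.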
In general, a sequence of real numbers represented in base $b$ is said to be Benford (base-b)
if the probability of observing digit $d$ as the first digit of a term of the sequence is
\[
\log_{b}\left(1 + \frac{1}{d}\right),
\]
for each integer $d$ such that $1 \le d < b$ [8]. For an overview of Benford’s Law and a
discussion of its possible applications see e.g. [5].¹
The main result we propose in this paper is a new necessary and sufficient condition
to test whether a sequence is Benford (base-b) or not: we then apply this characterization
to some kinds of sequences (re)obtaining, for example, that the sequence of powers of 2 is
Benford (base-10) but not Benford (base-4).
We also show how the proposed characterization is related in a natural way to Birkhoff’s
ergodic theorem.
2 Benford (base-b) sequences: a necessary and sufficient condition
We first show how to define the most significant digit of a number by means of elementary
functions. Given a real number $x$, we use the floor and ceiling functions defined,
respectively, by $\lfloor x \rfloor = \max\{n \in \mathbb{Z} : n \le x\}$ and
$\lceil x \rceil = \min\{n \in \mathbb{Z} : n \ge x\}$. This way the fractional part of $x$ is
$x \bmod 1 = x - \lfloor x \rfloor$.
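Lemma 1 If $x$ is a positive real number, then its first digit in base $b$ is
$\lfloor b^{(\log_b x) \bmod 1} \rfloor$.
Proof. Since $\lfloor \log_b x \rfloor \le \log_b x < \lfloor \log_b x \rfloor + 1$ we have
$b^{\lfloor \log_b x \rfloor} \le x < b^{\lfloor \log_b x \rfloor + 1}$ and
\[
1 \le \frac{x}{b^{\lfloor \log_b x \rfloor}} < b .
\]
Therefore $\left\lfloor \frac{x}{b^{\lfloor \log_b x \rfloor}} \right\rfloor$ is the most
significant digit of $x$. To complete the proof it is sufficient to observe that
$x = b^{\log_b x} = b^{\lfloor \log_b x \rfloor} \, b^{(\log_b x) \bmod 1}$, since $\log_b x$ is the
sum of its integer and fractional parts; this way
$\frac{x}{b^{\lfloor \log_b x \rfloor}} = b^{(\log_b x) \bmod 1}$. ⋄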
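Lemma 1 translates directly into code. The sketch below is an illustration under stated assumptions rather than a robust implementation: the function name `first_digit` is hypothetical, and the floating-point logarithm may round incorrectly when $x$ lies extremely close to a digit boundary such as an exact power of $b$.

```python
from math import floor, log

def first_digit(x, b=10):
    """Most significant digit of x > 0 in base b, computed as in Lemma 1."""
    if x <= 0 or b < 2:
        raise ValueError("need x > 0 and an integer base b >= 2")
    fractional_log = log(x, b) % 1.0   # (log_b x) mod 1, a value in [0, 1)
    return floor(b ** fractional_log)

print(first_digit(42.5))      # 42.5 = 4.25 * 10**1
print(first_digit(0.0123))    # 0.0123 = 1.23 * 10**(-2)
print(first_digit(50, b=4))   # 50 is written 302 in base 4
```

Python's `%` operator returns a non-negative result for a positive modulus, so the fractional part is handled correctly also when $\log_b x$ is negative, that is, when $0 < x < 1$.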
¹ As observed in [8] (Remark 9.2.5), the definition of Benford (base-b) sequence we use here is less
restrictive than the definition of b-Benford sequence given for example in [1, 7].
The next result provides a possible way to count the number of terms of a sequence
whose first significant digit is not greater than $d$: it will be used later on to prove
a general criterion to test whether a sequence is Benford (base-b) or not.
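Lemma 2 If $x_1, x_2, \dots, x_n$ are positive real numbers and if $1 \le d < b$, then
\[
\#\{x_k : 1 \le k \le n, \ \text{first digit of } x_k \le d\}
  = \sum_{k=1}^{n} \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil .
\]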
Proof. According to Lemma 1, the first digit of $x_k$ is $\le d$ if and only if
$b^{(\log_b x_k) \bmod 1} < d + 1$, that is, if and only if $(\log_b x_k) \bmod 1 < \log_b(d+1)$.
To complete the proof we just need to prove that the function
\[
f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil
\]
can be rewritten as
\[
f_{d+1}(y) =
\begin{cases}
1 & \text{if } y \in [0, \log_b(d+1)) \\
0 & \text{if } y \in [\log_b(d+1), 1] .
\end{cases}
\]
In fact, the function $g(y) = \frac{1}{b^{y}} - \frac{1}{d+1}$ is strictly decreasing on $[0,1]$
since $g'(y) = -\frac{1}{b^{y}} \log b < 0$. Moreover $0 < g(0) = \frac{d}{d+1} < 1$,
$-1 < g(1) = \frac{1}{b} - \frac{1}{d+1} \le 0$ and $g(\log_b(d+1)) = 0$. ⋄
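The counting formula of Lemma 2 can be checked on a small sample. The following sketch is only illustrative: the names `indicator` and `count_first_digit_at_most` and the sample values are assumptions introduced here, and the sample deliberately avoids numbers on a digit boundary, where floating-point rounding could flip the ceiling.

```python
from math import ceil, log

def indicator(y, d, b=10):
    """f_{d+1}(y) = ceil(b**(-y) - 1/(d+1)): 1 if y < log_b(d+1), else 0, for y in [0, 1]."""
    return ceil(b ** (-y) - 1.0 / (d + 1))

def count_first_digit_at_most(xs, d, b=10):
    """Number of terms of xs whose first digit in base b is <= d, as in Lemma 2."""
    return sum(indicator(log(x, b) % 1.0, d, b) for x in xs)

# Hypothetical sample; the first decimal digits are 1, 2, 3, 4, 5, 8, 9.
sample = [1.5, 2.7, 3.3, 47.0, 512.5, 0.083, 9.9]
for d in (3, 5, 8):
    print(d, count_first_digit_at_most(sample, d))
```

The ceiling acts as a 0/1 switch because $g(y)$ stays strictly between $-1$ and $1$ on $[0,1]$, which is exactly what the proof establishes.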
Now it is possible to prove the following theorem, which characterizes a Benford (base-b)
sequence.
Theorem 1 The sequence $x_k$ of positive real numbers is Benford (base-b) if and only if
\[
\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n}
  \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil = \log_b(d+1)
\]
for each integer $d$ such that $1 \le d < b$.
Proof. Consider a generic term $x_k$ of the sequence. Using Lemma 2, the probability that the
first digit of $x_k$ is not greater than $d$ is given by
\[
\mathrm{Prob}(\text{first digit of } x_k \le d)
  = \lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n}
    \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil .
\]
Observe now that, since
\[
\mathrm{Prob}(\text{first digit of } x_k \le d)
  = \sum_{i=1}^{d} \mathrm{Prob}(\text{first digit of } x_k = i)
\]
and
\[
\mathrm{Prob}(\text{first digit of } x_k = i)
  = \mathrm{Prob}(\text{first digit of } x_k \le i) - \mathrm{Prob}(\text{first digit of } x_k \le i-1) ,
\]
the definition of Benford (base-b) sequence given in Section 1 allows us to claim that $x_k$
is Benford (base-b) if and only if
\[
\mathrm{Prob}(\text{first digit of } x_k \le d) = \log_b(d+1)
\]
for each integer $d$ such that $1 \le d < b$. This completes the proof. ⋄
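The equivalence invoked in the last step of the proof rests on a telescoping sum. As a worked restatement (it introduces no new assumption), the two directions can be written out as follows:

```latex
\sum_{i=1}^{d} \log_b\!\left(1 + \frac{1}{i}\right)
  = \sum_{i=1}^{d} \bigl( \log_b(i+1) - \log_b i \bigr)
  = \log_b(d+1) - \log_b 1
  = \log_b(d+1),
\qquad
\log_b(d+1) - \log_b d = \log_b\!\left(1 + \frac{1}{d}\right).
```

For instance, with $b = 10$ and $d = 3$ the left-hand sum is $\log_{10} 2 + \log_{10}\frac{3}{2} + \log_{10}\frac{4}{3} = \log_{10} 4 \approx 0.602$.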
The theorem above suggests a way to verify whether a sequence is Benford or not, but
computing such a rather complicated limit seems to be a difficult task. The following lemma
suggests, as a way to deal with the limit, computing a suitably defined integral: we will
show how to use this idea in the next section.
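Lemma 3 Let $d$ be an integer such that $1 \le d < b$ and let
$f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil$. If the sequence $y_k$
is contained in the interval $[0,1]$ and if
\[
\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(y_k) = \int_{0}^{1} f_{d+1}(y) \, dy
\]
then
\[
\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n}
  \left\lceil \frac{1}{b^{y_k}} - \frac{1}{d+1} \right\rceil = \log_b(d+1) .
\]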
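Proof. It is sufficient to observe that
\[
f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil =
\begin{cases}
1 & \text{if } y \in [0, \log_b(d+1)) \\
0 & \text{if } y \in [\log_b(d+1), 1] ,
\end{cases}
\]
thus $\int_{0}^{1} f_{d+1}(y) \, dy = \log_b(d+1)$. ⋄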
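The value of the integral in Lemma 3 can be confirmed numerically. The sketch below is a rough check, not a proof: it approximates the integral of the step function with a midpoint Riemann sum, and the names `f` and `midpoint_integral` as well as the number of cells are choices made for this illustration.

```python
from math import ceil, log

def f(y, d, b=10):
    """Step function f_{d+1}(y) = ceil(b**(-y) - 1/(d+1)) on [0, 1]."""
    return ceil(b ** (-y) - 1.0 / (d + 1))

def midpoint_integral(d, b=10, cells=100_000):
    """Midpoint Riemann sum of f_{d+1} over [0, 1] using `cells` equal subintervals."""
    h = 1.0 / cells
    return h * sum(f((j + 0.5) * h, d, b) for j in range(cells))

for d in (1, 4, 9):
    print(d, round(midpoint_integral(d), 4), round(log(d + 1, 10), 4))
```

Since the integrand has a single jump, the error of the midpoint sum is at most the width of one cell.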
3 Some Benford (base-b) sequences
As a possible application of Theorem 1, we give a new kind of proof of a well known
theorem, due to Diaconis [3]: a sequence of positive real numbers $x_k$ is Benford (base-b) if
the sequence of their logarithms $\log_b x_k$ is uniformly distributed modulo 1. We recall that
a sequence $y_k$ is uniformly distributed modulo 1, or equidistributed in $[0,1]$, if
\[
\lim_{n \to +\infty} \frac{\#\{y_k : 1 \le k \le n, \ a \le y_k \le b\}}{n} = b - a
\]
for each interval $[a, b] \subseteq [0,1]$ (here $a$ and $b$ denote the endpoints of the
interval, not the base). Another, equivalent way to define an equidistributed sequence is
the following (see for example [9, Problem 162]): the sequence $y_k$ is equidistributed
if and only if
\[
\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f(y_k) = \int_{0}^{1} f(y) \, dy \tag{1}
\]
for every function $f$ which is Riemann integrable on $[0,1]$. It is now an easy matter to prove
the following well known result, stating that the exponentials of equidistributed sequences
are Benford.
Theorem 2 ([3, 8]) A sequence $x_k$ of positive real numbers is Benford (base-b) if the
sequence $y_k = \log_b x_k$ is uniformly distributed modulo 1.
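Proof. For a fixed integer $d$ such that $1 \le d < b$, consider the function
$f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil$, which is Riemann
integrable on $[0,1]$, and the sequence $y_k$, taken modulo 1. Applying (1) we obtain
\[
\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(y_k) = \int_{0}^{1} f_{d+1}(y) \, dy .
\]
By Lemma 3 we have therefore
\[
\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n}
  \left\lceil \frac{1}{b^{y_k}} - \frac{1}{d+1} \right\rceil = \log_b(d+1) .
\]
Thus the sequence $x_k$ is Benford (base-b) due to Theorem 1. ⋄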
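Condition (1), on which the proof relies, can be explored numerically. The sketch below is an illustration only: it assumes the sample sequence $y_k = (k\sqrt{2}) \bmod 1$ (an arbitrary choice made here), and compares the empirical average with the integral for a smooth function and for the step function $f_{d+1}$ with $b = 10$, $d = 2$. The helper name `empirical_mean` is hypothetical.

```python
from math import ceil, log, sqrt

def empirical_mean(f, ys):
    """Left-hand side of (1) for a finite sample: the average of f over ys."""
    return sum(f(y) for y in ys) / len(ys)

n = 50_000
alpha = sqrt(2)   # assumed step; the float only approximates the irrational value
ys = [(k * alpha) % 1.0 for k in range(1, n + 1)]   # sample points in [0, 1)

# Smooth test function: the integral of y**2 over [0, 1] is 1/3.
mean_square = empirical_mean(lambda y: y * y, ys)
print(mean_square, 1 / 3)

# Step function f_{d+1} with b = 10 and d = 2: its integral is log_10(3).
b, d = 10, 2
mean_step = empirical_mean(lambda y: ceil(b ** (-y) - 1 / (d + 1)), ys)
print(mean_step, log(d + 1, b))
```

A finite sample can only suggest the limit in (1); it does not establish equidistribution.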
The previous theorem makes it easy to use known results on equidistributed sequences to
state that related geometric sequences are Benford (base-b).
For example, the sequence $y_k = (k\alpha) \bmod 1$ is equidistributed if $\alpha$ is an irrational
number, as independently proved by Bohl, Sierpinski and Weyl (see e.g. [8], Theorem
12.3.2); hence, the sequence $x_k = r^k$ is Benford (base-b) if $\log_b r$ is irrational ([8], Theorem
9.2.6). Incidentally, this property proves the claim made at the beginning of Section
1: the sequence $x_k = 2^k$ is Benford (base-10), since $\log_{10} 2$ is irrational. The same sequence
is clearly not Benford (base-4), instead, as one can easily observe.²
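The dependence on the base can be made concrete with exact integer arithmetic. In the sketch below, base 3 is an extra illustrative choice (not discussed in the paper) for which $\log_3 2$ is irrational, while base 4 is the counterexample mentioned above; the helper names are hypothetical.

```python
from math import log

def leading_digit(n, b):
    """Most significant base-b digit of a positive integer n (exact arithmetic)."""
    while n >= b:
        n //= b
    return n

def digit_frequencies(base, n_terms=1000):
    """Leading-digit frequencies of 2**1, ..., 2**n_terms written in the given base."""
    counts = {d: 0 for d in range(1, base)}
    for k in range(1, n_terms + 1):
        counts[leading_digit(2 ** k, base)] += 1
    return {d: c / n_terms for d, c in counts.items()}

for base in (3, 4):
    observed = digit_frequencies(base)
    for d in range(1, base):
        print(f"base {base}, digit {d}: observed {observed[d]:.3f}, "
              f"Benford {log(1 + 1 / d, base):.3f}")
```

In base 4 the leading digit of $2^k$ simply alternates between 2 and 1, so digit 3 never appears. The frequency of digit 1 happens to coincide with $\log_4 2 = 1/2$, but the frequencies of digits 2 and 3 do not match $\log_4(1 + 1/d)$, which is enough to rule out Benford (base-4).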
A more general context where Lemma 3 applies concerns ergodic theory. Consider an
ergodic endomorphism $T$ defined on a probability space with domain $[0,1]$, and the function
$f_{d+1}$ defined as in Lemma 3. Birkhoff’s ergodic theorem (see e.g. [2], Appendix 3) claims
that for almost every $y_1 \in [0,1]$,
\[
\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(T^k(y_1)) = \int_{0}^{1} f_{d+1}(y) \, dy
\]
while Lemma 3 tells us that
\[
\int_{0}^{1} f_{d+1}(y) \, dy = \log_b(d+1) .
\]
Thus, by Theorem 1, we obtain that a sequence $x_k$ of positive real numbers is Benford (base-b)
if $y_k = \log_b x_k$ is generated via an ergodic endomorphism. More precisely, $x_k$ is Benford
(base-b) as soon as there exists an ergodic endomorphism $T$ such that $T^k y_1 = y_k$ for all $k$ and
the equality of Birkhoff’s ergodic theorem holds for the initial term $y_1$ and for each integer
$d$ such that $1 \le d < b$.
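As a hedged illustration of the ergodic setting, the sketch below uses the doubling map $T(y) = 2y \bmod 1$, a standard example of an ergodic endomorphism of $[0,1]$ with respect to Lebesgue measure. This choice is an assumption made here and is not discussed in the paper. Because iterating $T$ in floating point quickly degenerates, the orbit is simulated on binary digits: the starting point is the number whose binary expansion is a stream of pseudo-random bits (a stand-in for a typical point), and each application of $T$ shifts that expansion by one place. All function names are hypothetical.

```python
import random
from math import ceil, log

def doubling_orbit(n, seed=0, precision=53):
    """n successive points of an orbit of T(y) = 2y mod 1, simulated on binary digits.

    Each point is evaluated from its leading `precision` bits only, so the
    values are truncations of the true orbit points.
    """
    rng = random.Random(seed)
    mask = (1 << precision) - 1
    window = rng.getrandbits(precision)
    orbit = []
    for _ in range(n):
        # One application of T: drop the leading bit, bring in the next one.
        window = ((window << 1) | rng.getrandbits(1)) & mask
        orbit.append(window / (1 << precision))
    return orbit

def birkhoff_average(ys, d, b=10):
    """(1/n) * sum of f_{d+1}(y_k), with f_{d+1}(y) = ceil(b**(-y) - 1/(d+1))."""
    return sum(ceil(b ** (-y) - 1 / (d + 1)) for y in ys) / len(ys)

ys = doubling_orbit(200_000)
for d in (1, 5, 9):
    print(d, round(birkhoff_average(ys, d), 3), round(log(d + 1, 10), 3))
```

Under these assumptions the sequence $x_k = 10^{y_k}$ behaves empirically like a Benford (base-10) sequence. Birkhoff’s theorem only guarantees this for almost every starting point, and rational starting points, for instance, have eventually periodic orbits.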
² The sequence $x_k = 2^k$ written using base 4 reads
1, 2, 10, 20, 100, 200, ...
References
[1] A. Berger, L.A. Bunimovich, T.P. Hill (2005), “One-dimensional dynamical systems
and Benford’s law”, Transactions of the American Math. Soc., 357 (1), 197–219.
[2] I.P. Cornfeld, S.V. Fomin and Ya.G. Sinai (1982), Ergodic Theory, Springer Verlag,
New York.
[3] P. Diaconis (1977), The Annals of Probability, 5 (1), 72–81.
[4] G.H. Hardy, E.M. Wright (1979), An Introduction to the Theory of Numbers, fifth ed.,
Oxford Univ. Press, New York.
[5] T.P. Hill (1995), “The Significant-Digit Phenomenon”, The Amer. Mathematical
Monthly, 102 (4), 322–327.
[6] T.P. Hill (1995), “A Statistical Derivation of the Significant-Digit Law”, Statistical Sc.,
10 (4), 354–363.
[7] A.V. Kontorovich, S.J. Miller (2005), “Benford’s law, values of L-functions and the
3x+1 problem”, Acta Arithmetica, 120, 269–297.
[8] S.J. Miller, R. Takloo-Bighash (2006), An Invitation to Modern Number Theory,
Princeton Univ. Press, Princeton and Oxford.
[9] G. Polya, G. Szegö (1972), Problems and Theorems in Analysis, vol. I, Springer-Verlag,
Heidelberg.