What Sequences obey Benford's Law ?

Abstract

We propose a new necessary and sufficient condition to test whether a sequence is Benford (base-b) or not and apply this characterization to some kinds of sequences (re)obtaining some well known results, as the fact that the sequence of powers of 2 is Benford (base-10).
Department of Applied Mathematics, University of Venice
WORKING PAPER SERIES
Marco Corazza, Andrea Ellero and
Alberto Zorzi
Working Paper n. 185/2008
November 2008
ISSN: 1828-6887
This Working Paper is published under the auspices of the Department of Applied
Mathematics of the Ca’ Foscari University of Venice. Opinions expressed herein are
those of the authors and not those of the Department. The Working Paper series is
designed to divulge preliminary or incomplete work, circulated to favour discussion
and comments. Citation of this paper should consider its provisional nature.
Marco Corazza Andrea Ellero
<corazza@unive.it> <ellero@unive.it>
Alberto Zorzi
<albzorzi@unive.it>
Dipartimento di Matematica Applicata
Università Ca’ Foscari di Venezia
Keywords: Benford’s law, equidistributed sequences, ergodic endomorphisms
JEL Classification Numbers: C02, C83
MathSci Classification Numbers: 28D99, 60F99
Correspondence to:
Marco Corazza Dept. of Applied Mathematics, University of Venice
Dorsoduro 3825/e
30123 Venezia, Italy
Phone: [++39] (041)-234-6921
Fax: [++39] (041)-522-1756
E-mail: corazza@unive.it
1 The importance of being one
If we consider the most significant digit of the powers of two, $2^1, 2^2, 2^3, \ldots, 2^n$, it turns out that the frequencies are not the same for all the digits: for example, among the first 1000 powers of two, those which start with digit 1 appear most often (30.1 %), those which have 2 as first digit follow (17.6 %), then those with 3 (12.5 %), and so on, the frequencies decreasing down to 4.5 % for digit 9. In fact, it is possible to prove (see Section 3) that the probability for a generic term of the sequence to display $d$ as the most significant digit is
$$\log_{10}\left(1 + \frac{1}{d}\right),$$
which means that the sequence of powers of 2 obeys Benford’s Law [5].
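The frequencies quoted above can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names `first_digit_frequencies` and `benford_probability` are illustrative choices, and the leading digit is read off the exact decimal expansion of each power.

```python
import math
from collections import Counter


def first_digit_frequencies(n_terms: int = 1000) -> dict:
    """Relative frequency of each leading decimal digit among 2**1 .. 2**n_terms."""
    # Python integers are exact, so str() gives the true decimal expansion.
    counts = Counter(int(str(2 ** k)[0]) for k in range(1, n_terms + 1))
    return {d: counts[d] / n_terms for d in range(1, 10)}


def benford_probability(d: int, base: int = 10) -> float:
    """Benford probability log_base(1 + 1/d) of observing leading digit d."""
    return math.log(1 + 1 / d, base)


if __name__ == "__main__":
    freqs = first_digit_frequencies(1000)
    for d in range(1, 10):
        print(f"digit {d}: observed {freqs[d]:.3f}, Benford {benford_probability(d):.3f}")
```

With 1000 terms the observed frequencies should already be close to the logarithmic values; the agreement is only approximate for any finite number of terms.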
In general, a sequence of real numbers represented in base $b$ is said to be Benford (base-$b$) if the probability of observing digit $d$ as the first digit of a term of the sequence is
$$\log_{b}\left(1 + \frac{1}{d}\right),$$
for each integer $d$ such that $1 \le d < b$ [8]. For an overview of Benford’s Law and a discussion of its possible applications see e.g. [5]. $^1$
The main result we propose in this paper is a new necessary and sufficient condition
to test whether a sequence is Benford (base-b) or not: we then apply this characterization
to some kinds of sequences (re)obtaining, for example, that the sequence of powers of 2 is
Benford (base-10) but not Benford (base-4).
We also show how the proposed characterization is related in a natural way to Birkhoff’s
ergodic theorem.
2 Benford (base-b) sequences: a necessary and sufficient condition
We first show how to define the most significant digit of a number by means of elementary functions. Given a real number $x$, we use the floor and ceiling functions defined, respectively, by $\lfloor x \rfloor = \max\{n \in \mathbb{Z} : n \le x\}$ and $\lceil x \rceil = \min\{n \in \mathbb{Z} : n \ge x\}$. This way the fractional part of $x$ is $x \bmod 1 = x - \lfloor x \rfloor$.
Lemma 1 If $x$ is a positive real number, then its first digit in base $b$ is $\left\lfloor b^{(\log_b x) \bmod 1} \right\rfloor$.
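Proof. Since $\lfloor \log_b x \rfloor \le \log_b x < \lfloor \log_b x \rfloor + 1$ we have $b^{\lfloor \log_b x \rfloor} \le x < b^{\lfloor \log_b x \rfloor + 1}$ and
$$1 \le \frac{x}{b^{\lfloor \log_b x \rfloor}} < b .$$
Therefore $\left\lfloor x / b^{\lfloor \log_b x \rfloor} \right\rfloor$ is the most significant digit of $x$. To complete the proof it is sufficient to observe that $x = b^{\log_b x} = b^{\lfloor \log_b x \rfloor}\, b^{(\log_b x) \bmod 1}$, i.e., each real number is the sum of its integer and fractional parts; this way
$$\frac{x}{b^{\lfloor \log_b x \rfloor}} = b^{(\log_b x) \bmod 1} .$$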
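The formula of Lemma 1 translates directly into code. The sketch below is illustrative only: the names `first_digit_via_log` and `first_digit_by_scaling` are not from the paper, and the second function is just a reference implementation used for comparison. In floating-point arithmetic the logarithmic formula may be off by one when $x$ is exactly (or extremely close to) a digit times a power of $b$, so the sample values are chosen away from such boundaries.

```python
import math


def first_digit_via_log(x: float, b: int = 10) -> int:
    """First digit of x > 0 in base b, as floor(b ** ((log_b x) mod 1))."""
    frac = math.log(x, b) % 1.0   # (log_b x) mod 1, the fractional part
    return math.floor(b ** frac)


def first_digit_by_scaling(x: float, b: int = 10) -> int:
    """Reference implementation: rescale x into [1, b) and truncate."""
    while x >= b:
        x /= b
    while x < 1:
        x *= b
    return int(x)


if __name__ == "__main__":
    for x, b in [(314.159, 10), (0.00271, 10), (98765, 10), (255, 16), (12345, 7)]:
        print(x, b, first_digit_via_log(x, b), first_digit_by_scaling(x, b))
```

For instance, 255 in base 16 is written FF, so both functions return 15.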
$^1$ As observed in [8] (Remark 9.2.5), the definition of Benford (base-$b$) sequence we use here is less restrictive than the definition of $b$-Benford sequence given for example in [1, 7].
The next result provides a way to count the terms of a sequence whose first significant digit is not greater than $d$: it will be used later on to prove a general criterion to test whether a sequence is Benford (base-$b$) or not.
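Lemma 2 If $x_1, x_2, \ldots, x_n$ are positive real numbers and if $1 \le d < b$, then
$$\#\{x_k : 1 \le k \le n,\ \text{first digit of } x_k \le d\} = \sum_{k=1}^{n} \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil .$$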
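Proof. According to Lemma 1, the first digit of $x_k$ is $\le d$ if and only if $b^{(\log_b x_k) \bmod 1} < d + 1$, that is, if and only if $(\log_b x_k) \bmod 1 < \log_b(d+1)$. To complete the proof we just need to prove that the function
$$f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil$$
can be rewritten as
$$f_{d+1}(y) = \begin{cases} 1 & \text{if } y \in [0, \log_b(d+1)) \\ 0 & \text{if } y \in [\log_b(d+1), 1] . \end{cases}$$
In fact, the function $g(y) = \frac{1}{b^{y}} - \frac{1}{d+1}$ is strictly decreasing on $[0,1]$ since $g'(y) = -\frac{1}{b^{y}} \log b < 0$. Moreover $0 < g(0) = \frac{d}{d+1} < 1$, $-1 < g(1) = \frac{1}{b} - \frac{1}{d+1} \le 0$ and $g(\log_b(d+1)) = 0$.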
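The counting identity of Lemma 2 can be tried out on a small data set. This is a sketch under stated assumptions: the helper names and the sample $x_k = k\pi$ are illustrative choices, not taken from the paper, and the sample is picked so that no term sits on a digit boundary where floating-point rounding could flip the ceiling.

```python
import math


def indicator_ceiling(y: float, d: int, b: int = 10) -> int:
    """f_{d+1}(y) = ceil(1 / b**y - 1 / (d + 1)), intended for y in [0, 1]."""
    return math.ceil(1 / b ** y - 1 / (d + 1))


def count_via_lemma2(xs, d: int, b: int = 10) -> int:
    """Right-hand side of Lemma 2: sum of f_{d+1}((log_b x_k) mod 1)."""
    return sum(indicator_ceiling(math.log(x, b) % 1.0, d, b) for x in xs)


def count_directly(xs, d: int, b: int = 10) -> int:
    """Left-hand side of Lemma 2: count terms whose first digit is <= d."""
    def first_digit(x: float) -> int:
        while x >= b:
            x /= b
        while x < 1:
            x *= b
        return int(x)
    return sum(1 for x in xs if first_digit(x) <= d)


if __name__ == "__main__":
    sample = [k * math.pi for k in range(1, 501)]   # hypothetical test data
    for d in range(1, 10):
        print(d, count_via_lemma2(sample, d), count_directly(sample, d))
```

Both columns of the printout should coincide for every $d$, which is exactly what the lemma asserts.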
Now it is possible to prove the following theorem, which characterizes a Benford (base-b)
sequence.
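Theorem 1 The sequence $x_k$ of positive real numbers is Benford (base-$b$) if and only if
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil = \log_b(d+1)$$
for each integer $d$ such that $1 \le d < b$.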
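Proof. Consider a generic term $x_k$ of the sequence. Using Lemma 2, we have that the probability that the first digit of $x_k$ is not greater than $d$ is given by
$$\mathrm{Prob}(\text{first digit of } x_k \le d) = \lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left\lceil \frac{1}{b^{(\log_b x_k) \bmod 1}} - \frac{1}{d+1} \right\rceil .$$
Observe now that, since
$$\mathrm{Prob}(\text{first digit of } x_k \le d) = \sum_{i=1}^{d} \mathrm{Prob}(\text{first digit of } x_k = i)$$
and
$$\mathrm{Prob}(\text{first digit of } x_k = i) = \mathrm{Prob}(\text{first digit of } x_k \le i) - \mathrm{Prob}(\text{first digit of } x_k \le i-1) ,$$
the definition of Benford (base-$b$) sequence given in Section 1 allows us to claim that $x_k$ is Benford (base-$b$) if and only if
$$\mathrm{Prob}(\text{first digit of } x_k \le d) = \log_b(d+1)$$
for each integer $d$ such that $1 \le d < b$; indeed, the sum $\sum_{i=1}^{d} \log_b\left(1 + \frac{1}{i}\right)$ telescopes to $\log_b(d+1)$. This completes the proof.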
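To get a feeling for the criterion of Theorem 1, one can evaluate the averaged sum for a finite number of terms and compare it with $\log_b(d+1)$. The sketch below does this for $x_k = 2^k$ in base 10; the function name `benford_criterion_average` and the truncation at 5000 terms are assumptions made for illustration, and a finite average can only suggest, not prove, the value of the limit.

```python
import math


def benford_criterion_average(xs, d: int, b: int = 10) -> float:
    """Finite-n version of the left-hand side in Theorem 1:
    (1/n) * sum_k ceil(1 / b**((log_b x_k) mod 1) - 1 / (d + 1))."""
    total = 0
    for x in xs:
        frac = math.log(x, b) % 1.0          # (log_b x_k) mod 1
        total += math.ceil(1 / b ** frac - 1 / (d + 1))
    return total / len(xs)


if __name__ == "__main__":
    powers_of_two = [2 ** k for k in range(1, 5001)]   # exact Python integers
    for d in range(1, 10):
        avg = benford_criterion_average(powers_of_two, d)
        print(f"d={d}: average {avg:.4f}, log10(d+1) {math.log10(d + 1):.4f}")
```

A handful of terms such as 2, 4 and 8 lie exactly on a digit boundary and may be misclassified by floating-point rounding; their effect on the average is of order $1/n$.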
The theorem above suggests a way to verify whether a sequence is Benford or not, but computing such a rather complicated limit seems to be a difficult task. The following Lemma suggests, as a way to deal with the limit, computing a suitably defined integral: we will show how to use this idea in the next Section.
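Lemma 3 Let $d$ be an integer such that $1 \le d < b$ and let $f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil$. If the sequence $y_k$ is contained in the interval $[0,1]$ and if
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(y_k) = \int_0^1 f_{d+1}(y)\,dy$$
then
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left\lceil \frac{1}{b^{y_k}} - \frac{1}{d+1} \right\rceil = \log_b(d+1) .$$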
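Proof. It is sufficient to observe that
$$f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil = \begin{cases} 1 & \text{if } y \in [0, \log_b(d+1)) \\ 0 & \text{if } y \in [\log_b(d+1), 1] , \end{cases}$$
thus $\int_0^1 f_{d+1}(y)\,dy = \log_b(d+1)$.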
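The value of the integral in Lemma 3 can also be confirmed numerically. The following sketch approximates $\int_0^1 f_{d+1}(y)\,dy$ with a midpoint Riemann sum; the function name `riemann_sum_indicator` and the number of cells are illustrative assumptions, and for a step function the error of such a sum is at most half a cell width.

```python
import math


def riemann_sum_indicator(d: int, b: int = 10, n_cells: int = 20_000) -> float:
    """Midpoint Riemann sum of f_{d+1}(y) = ceil(1 / b**y - 1 / (d + 1)) on [0, 1]."""
    h = 1.0 / n_cells
    hits = sum(
        math.ceil(1 / b ** ((i + 0.5) * h) - 1 / (d + 1))   # f_{d+1} at the cell midpoint
        for i in range(n_cells)
    )
    return hits * h


if __name__ == "__main__":
    for d in range(1, 10):
        print(f"d={d}: integral ~ {riemann_sum_indicator(d):.5f}, "
              f"log10(d+1) = {math.log10(d + 1):.5f}")
```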
3 Some Benford (base-b) sequences
As a possible application of Theorem 1, we give a new kind of proof of a well known theorem, due to Diaconis [3]: a sequence of positive real numbers $x_k$ is Benford (base-$b$) if the sequence of their logarithms $\log_b x_k$ is uniformly distributed modulo 1. We recall that a sequence $y_k$ is uniformly distributed modulo 1, or equidistributed in $[0,1]$, if
$$\lim_{n \to +\infty} \frac{\#\{y_k : 1 \le k \le n,\ a \le y_k \le b\}}{n} = b - a$$
for each interval $[a,b] \subseteq [0,1]$ (here $a$ and $b$ denote the endpoints of the interval, not the base). Another, equivalent way to define an equidistributed sequence is the following (see for example [9, Problem 162]): the sequence $y_k$ is equidistributed if and only if
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f(y_k) = \int_0^1 f(y)\,dy \qquad (1)$$
for every function $f$ which is Riemann integrable on $[0,1]$. It is now an easy matter to prove the following well known result, stating that the exponentials of equidistributed sequences are Benford.
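Both characterizations of equidistribution can be explored numerically on a concrete sequence. The sketch below assumes the sequence $y_k = (k\sqrt{2}) \bmod 1$, an illustrative choice that is equidistributed by Weyl's theorem, and the test function $f(y) = y^2$, whose integral over $[0,1]$ is $1/3$; the helper names are hypothetical.

```python
import math


def rotation_sequence(alpha: float, n: int) -> list:
    """First n terms of y_k = (k * alpha) mod 1, k = 1, ..., n."""
    return [(k * alpha) % 1.0 for k in range(1, n + 1)]


def interval_fraction(ys, lo: float, hi: float) -> float:
    """Fraction of terms falling in [lo, hi]; tends to hi - lo if ys is equidistributed."""
    return sum(1 for y in ys if lo <= y <= hi) / len(ys)


def empirical_mean(f, ys) -> float:
    """(1/n) * sum_k f(y_k), the left-hand side of condition (1) for finite n."""
    return sum(f(y) for y in ys) / len(ys)


if __name__ == "__main__":
    ys = rotation_sequence(math.sqrt(2), 20_000)
    print("fraction in [0.2, 0.5]:", interval_fraction(ys, 0.2, 0.5))   # close to 0.3
    print("mean of y^2:", empirical_mean(lambda y: y * y, ys))         # close to 1/3
```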
Theorem 2 ([3, 8]) A sequence $x_k$ of positive real numbers is Benford (base-$b$) if the sequence $y_k = \log_b x_k$ is uniformly distributed modulo 1.
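Proof. For a fixed integer $d$ consider the function $f_{d+1}(y) = \left\lceil \frac{1}{b^{y}} - \frac{1}{d+1} \right\rceil$, which is Riemann integrable on $[0,1]$, and the sequence of fractional parts $y_k \bmod 1$, which we still denote by $y_k$. Applying (1) we obtain
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(y_k) = \int_0^1 f_{d+1}(y)\,dy .$$
By Lemma 3 we have therefore
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} \left\lceil \frac{1}{b^{y_k}} - \frac{1}{d+1} \right\rceil = \log_b(d+1) .$$
Thus the sequence $x_k$ is Benford (base-$b$) due to Theorem 1.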
The previous theorem allows us to easily use known results on equidistributed sequences to state that related geometric sequences are Benford (base-$b$).
For example, the sequence $y_k = (k\alpha) \bmod 1$ is equidistributed if $\alpha$ is an irrational number, as independently proved by Bohl, Sierpinski and Weyl (see e.g. [8], Theorem 12.3.2); hence, the sequence $x_k = r^k$ is Benford (base-$b$) if $\log_b r$ is irrational ([8], Theorem 9.2.6). Incidentally, this property allows us to prove the claim made at the beginning of Section 1: the sequence $x_k = 2^k$ is Benford (base-10), since $\log_{10} 2$ is irrational. The same sequence is clearly not Benford (base-4), instead, as one can easily observe. $^2$
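The contrast between base 10 and base 4 is easy to observe with exact integer arithmetic. The sketch below is illustrative (the names `leading_digit` and `leading_digit_frequencies` are not from the paper): it extracts the leading digit in an arbitrary base by repeated integer division and tabulates the frequencies for the first 1000 powers of 2.

```python
import math
from collections import Counter


def leading_digit(n: int, b: int) -> int:
    """Most significant digit of the positive integer n written in base b."""
    while n >= b:
        n //= b          # drop the least significant base-b digit
    return n


def leading_digit_frequencies(xs, b: int) -> dict:
    """Relative frequency of each possible leading digit 1..b-1 among xs."""
    counts = Counter(leading_digit(x, b) for x in xs)
    return {d: counts[d] / len(xs) for d in range(1, b)}


if __name__ == "__main__":
    powers_of_two = [2 ** k for k in range(1, 1001)]
    observed = leading_digit_frequencies(powers_of_two, 4)
    for d in range(1, 4):
        print(f"base 4, digit {d}: observed {observed[d]:.3f}, "
              f"Benford {math.log(1 + 1 / d, 4):.3f}")
```

In base 4 the leading digit simply alternates between 2 and 1, so digit 3 never occurs and digit 2 occurs half of the time, whereas the Benford (base-4) probabilities of digits 2 and 3 are $\log_4(3/2) \approx 0.292$ and $\log_4(4/3) \approx 0.208$.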
A more general context where Lemma 3 applies concerns ergodic theory. Consider an ergodic endomorphism $T$ defined on a probability space with domain $[0,1]$, and the function $f_{d+1}$ defined as in Lemma 3. Birkhoff’s ergodic theorem (see e.g. [2], Appendix 3) claims that for almost every $y_1 \in [0,1]$,
$$\lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} f_{d+1}(T^k(y_1)) = \int_0^1 f_{d+1}(y)\,dy$$
while Lemma 3 tells us that
$$\int_0^1 f_{d+1}(y)\,dy = \log_b(d+1) .$$
Thus, by Theorem 1, we obtain that a sequence $x_k$ of positive real numbers is Benford (base-$b$) if $y_k = \log_b x_k$ is generated via an ergodic endomorphism. More precisely, $x_k$ is Benford (base-$b$) as soon as there exists an ergodic endomorphism $T$ such that $T^k y_1 = y_k$ $\forall k$ and the equality of Birkhoff’s ergodic theorem holds for the initial term $y_1$ and for each integer $d$ such that $1 \le d < b$.
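The Birkhoff averages above can be simulated for a concrete map. The sketch below assumes, purely for illustration, the rotation $T(y) = (y + \alpha) \bmod 1$ with $\alpha = \sqrt{2} - 1$, which is ergodic with respect to Lebesgue measure on $[0,1)$ because $\alpha$ is irrational; the names `birkhoff_average`, `make_indicator` and `rotation`, the starting point and the number of iterations are all hypothetical choices.

```python
import math

ALPHA = math.sqrt(2) - 1   # irrational rotation angle (illustrative choice)


def rotation(y: float) -> float:
    """T(y) = (y + ALPHA) mod 1, an ergodic endomorphism of [0, 1)."""
    return (y + ALPHA) % 1.0


def make_indicator(d: int, b: int = 10):
    """Return f_{d+1}(y) = ceil(1 / b**y - 1 / (d + 1)) as a function of y."""
    def f(y: float) -> int:
        return math.ceil(1 / b ** y - 1 / (d + 1))
    return f


def birkhoff_average(T, y1: float, f, n: int) -> float:
    """(1/n) * sum_{k=1}^{n} f(T^k(y1)), the finite-n Birkhoff average."""
    total, y = 0, y1
    for _ in range(n):
        y = T(y)           # after k passes, y equals T^k(y1)
        total += f(y)
    return total / n


if __name__ == "__main__":
    for d in range(1, 10):
        avg = birkhoff_average(rotation, 0.1, make_indicator(d), 20_000)
        print(f"d={d}: Birkhoff average {avg:.4f}, log10(d+1) {math.log10(d + 1):.4f}")
```

Maps with sensitive dependence on initial conditions, such as the doubling map, are also ergodic but are poorly suited to a naive floating-point simulation, which is why a rotation is used here.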
$^2$ The sequence $x_k = 2^k$ written using base 4 reads
1, 2, 10, 20, 100, 200, ...
References
[1] A. Berger, L.A. Bunimovich, T.P. Hill (2005), “One-dimensional dynamical systems and Benford’s law”, Transactions of the American Math. Soc., 357 (1), 197–219.
[2] I.P. Cornfeld, S.V. Fomin, Ya.G. Sinai (1982), Ergodic Theory, Springer-Verlag, New York.
[3] P. Diaconis (1977), The Annals of Probability, 5 (1), 72–81.
[4] G.H. Hardy, E.M. Wright (1979), An Introduction to the Theory of Numbers, fifth ed., Oxford Univ. Press, New York.
[5] T.P. Hill (1995), “The Significant-Digit Phenomenon”, The Amer. Mathematical Monthly, 102 (4), 322–327.
[6] T.P. Hill (1995), “A Statistical Derivation of the Significant-Digit Law”, Statistical Sc., 10 (4), 354–363.
[7] A.V. Kontorovich, S.J. Miller (2005), “Benford’s law, values of L-functions and the 3x+1 problem”, Acta Arithmetica, 120, 269–297.
[8] S.J. Miller, R. Takloo-Bighash (2006), An Invitation to Modern Number Theory, Princeton Univ. Press, Princeton and Oxford.
[9] G. Polya, G. Szegö (1972), Problems and Theorems in Analysis, vol. I, Springer-Verlag, Heidelberg.