ANALYSIS OF CASINO SHELF SHUFFLING MACHINES

PERSI DIACONIS, JASON FULMAN, AND SUSAN HOLMES

Abstract. Many casinos routinely use mechanical card shuﬄing machines. We were

asked to evaluate a new product, a shelf shuﬄer. This leads to new probability, new

combinatorics, and to some practical advice which was adopted by the manufacturer. The

interplay between theory, computing, and real-world application is developed.

1. Introduction

We were contacted by a manufacturer of casino equipment to evaluate a new design for

a casino card-shuﬄing machine. The machine, already built, was a sophisticated “shelf

shuﬄer” consisting of an opaque box containing ten shelves. A deck of cards is dropped

into the top of the box. An internal elevator moves the deck up and down within the

box. Cards are sequentially dealt from the bottom of the deck onto the shelves; shelves are

chosen uniformly at random at the command of a random number generator. Each card is

randomly placed above or below previous cards on the shelf with probability 1/2. At the

end, each shelf contains about 1/10 of the deck. The ten piles are now assembled into one

pile, in random order. The manufacturer wanted to know if one pass through the machine

would yield a well-shuﬄed deck.

Testing for randomness is a basic task of statistics. A standard approach is to design

some ad hoc tests such as: Where do the original top and bottom cards wind up? What

is the distribution of cards that started out together? What is the distribution, after one

shuﬄe, of the relative order of groups of consecutive cards? Such tests had been carried

out by the engineers who designed the machine, and seemed satisfactory.

We ﬁnd closed-form expressions for the probability of being at a given permutation after

the shuﬄe. This gives exact expressions for various global distances to uniformity, e.g.,

total variation. These suggest that the machine has ﬂaws. The engineers (and their bosses)

needed further convincing; using our theory, we were able to show that a knowledgeable

player could guess about 9 1/2 cards correctly in a single run through a 52-card deck. For

a well-shuﬄed deck, the optimal strategy gets about 4 1/2 cards correct. This data did

convince the company. The theory also suggested a useful remedy. Journalist accounts of

our shuﬄing adventures can be found in Klarreich (2003, 2002); Mackenzie (2002).

Section 2 gives background on casino shuﬄers, needed probability, and the literature of

shuﬄing. Section 3 gives an analysis of a single shuﬄe; we give a closed formula for the

chance that a deck of n cards passed through a machine with m shelves is in ﬁnal order

w. This is used to compute several classical distances to randomness. In particular it is

shown that, for n cards, the l(∞) distance is asymptotic to $e^{1/(12c^2)} - 1$ if the number of shelves $m = cn^{3/2}$ and n is large. The combinatorics of shelf shufflers turns out to have connections to the “peak algebra” of algebraic combinatorics. This allows nice formulae for

Date: July 18, 2011.

The ﬁrst author is supported in part by National Science Foundation grant DMS 0804324.

The second author is supported in part by National Science Foundation grant DMS 0802082 and National

Security Agency grant H98230-08-1-0133.

The third author is supported in part by National Institutes of Health grant R01 GM086884-02.


arXiv:1107.2961v1 [math.CO] 14 Jul 2011


the distribution of several classical test statistics: the cycle structure (e.g., the number of

ﬁxed points), the descent structure, and the length of the longest increasing subsequence.

Section 4 develops tools for analyzing repeated shelf shuﬄing. Section 5 develops our

“how many can be correctly guessed” tests. This section also contains our ﬁnal conclusions.

2. Background

This section gives background and a literature review. Section 2.1 treats shuﬄing ma-

chines; Section 2.2 gives probability background; Section 2.3 gives an overview of related

literature and results on the mathematics of shuﬄing cards.

2.1. Card shuﬄing machines. Casinos worldwide routinely employ mechanical card-

shuﬄing machines for games such as blackjack and poker. For example, for a single deck

game, two decks are used. While the dealer is using the ﬁrst deck in the usual way, the

shuﬄing machine mixes the second deck. When the ﬁrst deck is used up (or perhaps half-

used), the second deck is brought into play and the ﬁrst deck is inserted into the machine.

Two-, four-, and six-deck machines of various designs are also in active use.

The primary rationale seems to be that dealer shuﬄing takes time and use of a machine

results in approximately 20% more hands per hour. The machines may also limit dealer

cheating.

The machines in use are sophisticated, precision devices, rented to the casino (with service

contracts) for approximately $500 per month per machine. One company told us they had

about 8,000 such machines in active use; this amounts to millions of dollars per year. The

companies involved are substantial businesses, listed on the New York Stock Exchange.

One widely used machine simulates an ordinary riﬄe shuﬄe by pushing two halves of a

single deck together using mechanical pressure to make the halves interlace. The randomness

comes from slight physical diﬀerences in alignment and pressure. In contrast, the shelf

shuﬄers we analyze here use computer-generated pseudo-random numbers as a source of

their randomness.

The pressure shuﬄers require multiple passes (perhaps seven to ten) to adequately mix

52 cards. Our manufacturer was keen to have a single pass through suﬃce.

2.2. Probability background. Let $S_n$ denote the group of permutations of n objects. Let $U(\sigma) = 1/n!$ denote the uniform distribution on $S_n$. If P is a probability on $S_n$, the total variation, separation, and $l_\infty$ distances to uniformity are

\[ \|P - U\|_{TV} = \frac{1}{2} \sum_{w} |P(w) - U(w)| = \max_{A \subseteq S_n} |P(A) - U(A)| = \frac{1}{2} \max_{\|f\|_\infty \le 1} |P(f) - U(f)|, \tag{2.1} \]

\[ \operatorname{sep}(P) = \max_{w} \left( 1 - \frac{P(w)}{U(w)} \right), \qquad \|P - U\|_\infty = \max_{w} \left| 1 - \frac{P(w)}{U(w)} \right|. \]

Note that $\|P - U\|_{TV} \le \operatorname{sep}(P) \le \|P - U\|_\infty$. The first two distances are at most 1; the $\|\cdot\|_\infty$ norm can be as large as $n! - 1$.
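The three distances in (2.1) can be evaluated by brute force when n is small. The following is a minimal sketch, not taken from the paper; the function name `distances_to_uniform` and the dictionary representation of P are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def distances_to_uniform(P, n):
    """Return (TV, sep, l_inf) between a probability P on S_n and uniform U.

    P maps permutations (tuples of 1..n) to probabilities; permutations
    missing from P are treated as having probability zero.
    """
    U = Fraction(1, factorial(n))
    tv = Fraction(0)
    sep = Fraction(0)
    linf = Fraction(0)
    for w in permutations(range(1, n + 1)):
        p = Fraction(P.get(w, 0))
        tv += abs(p - U) / 2               # (1/2) sum_w |P(w) - U(w)|
        sep = max(sep, 1 - p / U)          # max_w (1 - P(w)/U(w))
        linf = max(linf, abs(1 - p / U))   # max_w |1 - P(w)/U(w)|
    return tv, sep, linf
```

For a point mass at the identity in $S_3$ this gives TV = 5/6, sep = 1 and $l_\infty$ = 5 = 3! − 1, which illustrates both the ordering of the three distances and the extreme value of the $l_\infty$ distance.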

If one of these distances is suitably small then many test statistics evaluate to approx-

imately the same thing under P and U . This gives an alternative to ad hoc tests. The

methods developed below allow exact evaluation of these and many further distances (e.g.,

chi-square or entropy).

Repeated shuﬄing is modeled by convolution:

\[ P * P(w) = \sum_{v} P(v) P(wv^{-1}), \qquad P^{*k}(w) = P * P^{*(k-1)}(w). \]

All of the shelf shufflers generate ergodic Markov chains (even if only one shelf is involved), and so $P^{*k}(w) \to U(w)$ as $k \to \infty$. One question of interest is the quantitative measurement of this convergence using one of the metrics above.
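As an illustration of the convolution formula, here is a small sketch for distributions stored as dictionaries on $S_n$. The helper names `compose`, `convolve` and `convolution_power` are hypothetical, and the convention that a product of permutations acts right factor first is an assumption of this sketch.

```python
from collections import defaultdict
from fractions import Fraction


def compose(s, t):
    """Return the product s t, i.e. the map i -> s(t(i)), on tuples of 1..n."""
    return tuple(s[t[i] - 1] for i in range(len(t)))


def convolve(P, Q):
    """Return the distribution of u v with u drawn from P and v from Q.

    Writing w = u v, so that u = w v^{-1}, this is
    sum_v Q(v) P(w v^{-1}), the convolution formula of Section 2.2.
    """
    out = defaultdict(Fraction)
    for u, pu in P.items():
        for v, qv in Q.items():
            out[compose(u, v)] += pu * qv
    return dict(out)


def convolution_power(P, k):
    """Return P^{*k} via the recursion P^{*k} = P * P^{*(k-1)}, for k >= 1."""
    result = P
    for _ in range(k - 1):
        result = convolve(P, result)
    return result
```

For an ergodic example on $S_3$ (equal mass on the identity and two adjacent transpositions), repeated convolution approaches the uniform value 1/6 at every permutation.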

2.3. Previous work on shuﬄing.

Early work. The careful analysis of repeated shuﬄes of a deck of cards has challenged

probabilists for over a century. The ﬁrst eﬀorts were made by Markov (1906) in his papers

on Markov chains. Later, Poincaré (1912) studied the problem. These great mathematicians

proved that in principle repeated shuﬄing would mix cards at an exponential rate but gave

no examples or quantitative methods to get useful numbers in practical problems.

Borel and Chéron (1955) studied riffle shuffling and concluded heuristically that about seven shuffles would be required to mix 52 cards. Émile Borel also reported joint work with Paul Lévy, one of the great probabilists of the twentieth century; they posed some problems

but were unable to make real progress.

Isolated but serious work on shuﬄing was reported in a 1955 Bell Laboratories report by

Edgar Gilbert. He used information theory to attack the problems and gave some tools for

riﬄe shuﬄing developed jointly with Claude Shannon.

They proposed what has come to be called the Gilbert–Shannon–Reeds model for riﬄe

shuﬄing; this presaged much later work. Thorp (1973) proposed a less realistic model and

showed how poor shuﬄing could be exploited in casino games. Thorp’s model is analyzed

in Morris (2009). Epstein (1977) reports practical studies of how casino dealers shuﬄe with

data gathered with a very precise microphone! The upshot of this work was a well-posed

mathematics problem and some heuristics; further early history appears in Chapter 4 of

Diaconis (1988).

The modern era. The modern era in quantitative analysis of shuﬄing begins with papers

of Diaconis and Shahshahani (1981) and Aldous (1983). They introduced rigorous meth-

ods, Fourier analysis on groups, and coupling. These gave sharp upper and lower bounds,

suitably close, for real problems. In particular, Aldous sketched out a proof that $\frac{3}{2} \log_2 n$ riffle shuffles mixed n cards. A more careful argument for riffle shuffling was presented by

Aldous and Diaconis (1986). This introduced “strong stationary times,” a powerful method

of proof which has seen wide application. It is applied here in Section 4.

A deﬁnitive analysis of riﬄe shuﬄing was ﬁnally carried out in Bayer and Diaconis (1992)

and Diaconis et al. (1995). They were able to derive simple closed-form expressions for all

quantities involved and do exact computations for n = 52 (or 32 or 104 or . . . ). This results

in the “seven shuﬄes theorem” explained below. A clear elementary account of these ideas

is in Mann (1994, 1995) reprinted in Grinstead and Snell (1997). See Ethier (2010) for an

informative textbook account.

The successful analysis of shuffling led to a host of developments in which the techniques were refined and extended. For example, it is natural to want not only the order of the cards, but

also the “up-down pattern” of one-way backs to be randomized. Highlights include work

of Bidigare et al. (1999) and Brown and Diaconis (1998) who gave a geometric interpre-

tation of shuﬄing which had many extensions to which the same analysis applied. Lalley

(1996, 1999) studied less random methods of riﬄe shuﬄing. Fulman (2000a,b, 2001) showed

that interspersing cuts doesn’t materially affect things and gave high level explanations for

miraculous accidents connecting shuﬄing and Lie theory. The work is active and ongoing.

Recent surveys are given by Diaconis (1996, 2003); Fulman (1998); Stark et al. (2002).

In recent work, Diaconis et al. (1995); Conger and Viswanath (2006) and Assaf et al.

(2011) have studied the number of shuﬄes required to have selected features randomized

(e.g., the original top card, or the values but not the suits). Here, fewer shuﬄes suﬃce.


Conger and Howald (2010) show that the way the cards are dealt out after shuffling affects

things. The mathematics of shuﬄing is closely connected to modern algebraic combinatorics

through quasi-symmetric functions Stanley (2001). The descent theory underlying shuﬄing

makes equivalent appearances in the basic task of carries when adding integers Diaconis

and Fulman (2009a,b); Diaconis and Fulman (2011).

3. Analysis of one pass through a shelf shuffler

This section gives a fairly complete analysis of a single pass through a shelf shuﬄer.

Section 3.1 gives several equivalent descriptions of the shuﬄe. In Section 3.2, a closed-form

formula for the chance of any permutation w is given. This in turn depends only on the

number of “valleys” in w. The number of permutations with j valleys is easily calculated

and so exact computations for any of the distances above are available. Section 3.3 uses the

exact formulae to get asymptotic rates of convergence for l(∞) and separation distances.

Section 3.4 gives the distribution of such permutations by cycle type. Section 3.5 gives the

distribution of the “shape” of such a permutation under the Robinson–Schensted–Knuth

map. Section 3.6 gives the distribution of the number of descents. We ﬁnd it surprising that

a real-world applied problem makes novel contact with elegant combinatorics. In Section 4,

iterations of a shelf shuﬄer are shown to be equivalent to shelf shuﬄing with more shelves.

Thus all of the formulae of this section apply.

3.1. Alternative descriptions. Consider two basic shelf shuﬄers: for the ﬁrst, a deck of

n cards is sequentially distributed on one of m shelves. (Here, n = 52, m = 10, are possible

choices.) Each time, the cards are taken from the bottom of the deck, a shelf is chosen at

random from one to m, and the bottom card is placed on top of any previous cards on the

shelf. At the end, the packets on the shelves are unloaded into a ﬁnal deck of n. This may

be done in order or at random; it turns out not to matter. Bayer and Diaconis (1992) called

this an inverse m-shuﬄe.

The second shuffling scheme, which is the main object of study, is also based on m shelves. At each

stage that a card is placed on a shelf, the choice of whether to put it on the top or the

bottom of the existing pile on that shelf is made at random (1/2 each side). This will be

called a shelf shuﬄe. There are several equivalent descriptions of shelf shuﬄes:

Description 1 (Shelf shuffles). A deck of cards is initially in order 1, 2, 3, . . . , n. Label the back of each card with a number chosen independently and uniformly at random between one and 2m (n random labels in all). Remove

all cards labeled 1 and place them on top, keeping them in the same relative order. Then

remove all cards labeled 2 and place them under the cards labeled 1, reversing their relative

order. This continues with the cards labeled 3, labeled 4, and so on, reversing the order in

each even labeled packet. If at any stage there are no cards with a given label, this empty

packet still counts in the alternating pattern.

For example, a twelve-card deck with 2m = 4,

Label  2  1  1  4  3  3  1  2  4  3  4  1

Card   1  2  3  4  5  6  7  8  9 10 11 12

is reordered as

2 3 7 12 8 1 5 6 10 11 9 4.
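Description 1 translates directly into a few lines of code. The sketch below is illustrative only; the names `reorder_by_labels` and `shelf_shuffle` are hypothetical, and Python's `random` module stands in for the machine's random number generator.

```python
import random


def reorder_by_labels(labels, num_labels):
    """Apply Description 1 to the deck 1..n given one label per card.

    Packets are taken in label order 1, 2, ..., num_labels (= 2m); the
    odd-labeled packets keep their relative order and the even-labeled
    packets are reversed. Returns the deck from top to bottom.
    """
    deck = []
    for label in range(1, num_labels + 1):
        packet = [card for card, lab in enumerate(labels, start=1) if lab == label]
        if label % 2 == 0:
            packet.reverse()
        deck.extend(packet)
    return deck


def shelf_shuffle(n, m, rng=random):
    """One pass of an m-shelf shuffler on n cards, following Description 1."""
    labels = [rng.randint(1, 2 * m) for _ in range(n)]
    return reorder_by_labels(labels, 2 * m)
```

Calling `reorder_by_labels([2, 1, 1, 4, 3, 3, 1, 2, 4, 3, 4, 1], 4)` reproduces the twelve-card example above.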

Description 2 (Inverse shelf shuﬄes). Cut a deck of n cards into 2m piles according to a

multinomial distribution; thus the number of cards cut oﬀ in pile i has the same description

as the number of balls in the ith box if n balls are dropped randomly into 2m boxes. Reverse

the order of the even-numbered packets. Finally, riﬄe shuﬄe the 2m packets together by

the Gilbert–Shannon–Reeds (GSR) distribution Bayer and Diaconis (1992) dropping each


card sequentially with probability proportional to packet size. This makes all possible

interleavings equally likely.

Figure 1. Two shelves in shelf shuffle.

Description 3 (Geometric description). Consider the function $f_m(x)$ from [0, 1] to [0, 1] which has “tents,” each of slope $\pm 2m$ centered at $\frac{1}{2m}, \frac{3}{2m}, \frac{5}{2m}, \ldots, \frac{2m-1}{2m}$. Figure 1 illustrates an example with m = 2. Place n labeled points uniformly at random into the unit interval. Label them, from left to right, $x_1, x_2, \ldots, x_n$. Applying $f_m$ gives $y_i = f_m(x_i)$. This gives the permutation

\[ \begin{pmatrix} 1 & 2 & \cdots & n \\ \pi_1 & \pi_2 & \cdots & \pi_n \end{pmatrix} \]

with $\pi_1$ the relative position from the bottom of $y_1$, . . . , $\pi_i$ the relative position from the bottom of $y_i$ among the other $y_j$. This permutation has the distribution of an inverse shelf shuffle. It is important to note that the natural distances to uniformity (total variation, separation, $l_\infty$) are the same for inverse shuffles and forward shuffles. In Section 4, this description is used to show that repeated shelf shuffling results in a shelf shuffle with more shelves.
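The geometric description can be simulated directly. This is a sketch under stated assumptions: the closed form used for the tent map (m tents on [0, 1], each rising from 0 to 1 and back) is one reading of the description, and the names `tent` and `inverse_shelf_shuffle_geometric` are hypothetical.

```python
import random


def tent(x, m):
    """Tent map f_m: slope +-2m, with peaks at the odd multiples of 1/(2m)."""
    t = (m * x) % 1.0
    return 1.0 - abs(2.0 * t - 1.0)


def inverse_shelf_shuffle_geometric(n, m, rng=random):
    """Sample pi_1..pi_n, where pi_i is the rank of y_i = f_m(x_i) from the bottom."""
    xs = sorted(rng.random() for _ in range(n))     # x_1 < x_2 < ... < x_n
    ys = [tent(x, m) for x in xs]
    order = sorted(range(n), key=lambda i: ys[i])   # indices sorted by y value
    pi = [0] * n
    for rank, i in enumerate(order, start=1):
        pi[i] = rank
    return pi
```

With m = 1 the map has a single tent, so the sampled ranks rise to n and then fall.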

3.2. Formula for the chance of a permutation produced by a shelf shuﬄer. To

describe the main result, we call i a valley of the permutation $w \in S_n$ if 1 < i < n and w(i − 1) > w(i) < w(i + 1). Thus w = 5136724 has two valleys. The number of valleys is classically used as a test of randomness for time series. See Warren and Seneta (1996) and their references. If v(n, k) denotes the number of permutations on n symbols with k valleys, then Warren and Seneta (1996) show that v(1, 0) = 1 and v(n, k) = (2k + 2)v(n − 1, k) + (n − 2k)v(n − 1, k − 1). So v(n, k) is easy to compute for numbers of practical interest. Asymptotics are in Warren and Seneta (1996), which also shows the close connections between valleys and descents.
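The valley recurrence is short enough to check by machine. The sketch below is illustrative; `valley_count` and `valleys` are hypothetical names, and v(n, k) is taken to be zero outside the range where the recurrence is stated.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def valley_count(n, k):
    """v(n, k): number of permutations of n symbols with exactly k valleys."""
    if n < 1 or k < 0:
        return 0
    if n == 1:
        return 1 if k == 0 else 0
    return ((2 * k + 2) * valley_count(n - 1, k)
            + (n - 2 * k) * valley_count(n - 1, k - 1))


def valleys(w):
    """Number of interior positions i with w(i-1) > w(i) < w(i+1)."""
    return sum(1 for i in range(1, len(w) - 1) if w[i - 1] > w[i] < w[i + 1])
```

For n = 5 the recurrence gives 16, 88 and 16 permutations with 0, 1 and 2 valleys, which sum to 5! = 120.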

Theorem 3.1. The chance that a shelf shuffler with m shelves and n cards outputs a permutation w is

\[ \frac{4^{v(w)+1}}{2(2m)^n} \sum_{a=0}^{m-1} \binom{n+m-a-1}{n} \binom{n-1-2v(w)}{a-v(w)} \]

where v(w) is the number of valleys of w. This can be seen to be the coefficient of $t^m$ in

\[ \frac{1}{2(2m)^n} \frac{(1+t)^{n+1}}{(1-t)^{n+1}} \left( \frac{4t}{(1+t)^2} \right)^{v(w)+1}. \]
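For small decks, the formula of Theorem 3.1 can be compared against exhaustive enumeration of the labelings in Description 1. The following is a sketch for that purpose only; `shelf_prob`, `valleys` and `brute_force_distribution` are hypothetical names, and binomial coefficients with a negative lower index are taken to be zero.

```python
from fractions import Fraction
from itertools import product
from math import comb


def shelf_prob(n, m, v):
    """Theorem 3.1: chance of a fixed permutation with v valleys (n cards, m shelves)."""
    if n - 1 - 2 * v < 0:
        return Fraction(0)
    total = sum(comb(n + m - a - 1, n) * comb(n - 1 - 2 * v, a - v)
                for a in range(v, m))          # terms with a < v vanish
    return Fraction(4 ** (v + 1) * total, 2 * (2 * m) ** n)


def valleys(w):
    """Number of interior positions i with w(i-1) > w(i) < w(i+1)."""
    return sum(1 for i in range(1, len(w) - 1) if w[i - 1] > w[i] < w[i + 1])


def brute_force_distribution(n, m):
    """Exact output law of Description 1, by enumerating all (2m)^n labelings."""
    weight = Fraction(1, (2 * m) ** n)
    dist = {}
    for labels in product(range(1, 2 * m + 1), repeat=n):
        deck = []
        for label in range(1, 2 * m + 1):
            packet = [c for c, lab in enumerate(labels, start=1) if lab == label]
            deck.extend(reversed(packet) if label % 2 == 0 else packet)
        dist[tuple(deck)] = dist.get(tuple(deck), Fraction(0)) + weight
    return dist
```

For n = 3 and m = 2, both computations give probability 3/16 to each of the four permutations without a valley and 1/8 to each of the two permutations with one valley.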


Example. Suppose that m = 1. Then the theorem yields the uniform distribution on the $2^{n-1}$ permutations with no valleys. Permutations with no valleys are also sometimes called

unimodal permutations. These arise in social choice theory through Coombs’ “unfolding”

hypothesis (Diaconis, 1988, Chap. 6).

Remark. By considering the cases m ≥ n and n ≥ m we see that, in the formula of Theorem

3.1, the range of summation can be taken up to n −1 instead of m −1. This will be useful

later.

Theorem 3.1 makes it easy to compute the distance to stationarity for any of the metrics

in Section 2.2. Indeed, the separation and l(∞) distance is attained at either permutations

with a maximum number of valleys (when n = 52, this maximum is 25) or for permutations

with 0 valleys. For the total variation distance, with $P_m(v)$ denoting the probability in Theorem 3.1,

\[ \|P_m - U\|_{TV} = \frac{1}{2} \sum_{a=0}^{\lfloor \frac{n-1}{2} \rfloor} v(n, a) \left| P_m(a) - \frac{1}{n!} \right|. \]

Table 1. Distances for various numbers of shelves m.

m                10    15    20    25     30    35    50    100   150   200   250   300
‖P_m − U‖_TV     1     .943  .720  .544   .391  .299  .159  .041  .018  .010  .007  .005
sep(P_m)         1     1     1     1      1     .996  .910  .431  .219  .130  .085  .060
‖P_m − U‖_∞      ∞     ∞     ∞     45118  3961  716   39    1.9   .615  .313  .192  .130

Table 1 gives these distances when n = 52 for various numbers of shelves m. Larger values

of m are of interest because of the convolution results explained in Section 4. These numbers

show that ten shelves are woefully insuﬃcient. Indeed, 50 shelves are hardly suﬃcient.
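The entries of Table 1 combine the valley counts v(n, a) with the probabilities of Theorem 3.1. The sketch below shows one way such values might be computed with exact rational arithmetic; it is not the authors' code, and the names `valley_counts`, `shelf_prob` and `shelf_distances` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def valley_counts(n):
    """Row [v(n, 0), ..., v(n, floor((n-1)/2))] from the Warren-Seneta recurrence."""
    row = [1]                                   # v(1, 0) = 1
    for size in range(2, n + 1):
        kmax = (size - 1) // 2
        prev = row + [0] * (kmax + 1 - len(row))
        row = [(2 * k + 2) * prev[k]
               + (size - 2 * k) * (prev[k - 1] if k >= 1 else 0)
               for k in range(kmax + 1)]
    return row


def shelf_prob(n, m, v):
    """Theorem 3.1: chance of a fixed permutation with v valleys."""
    total = sum(comb(n + m - a - 1, n) * comb(n - 1 - 2 * v, a - v)
                for a in range(v, m))
    return Fraction(4 ** (v + 1) * total, 2 * (2 * m) ** n)


def shelf_distances(n, m):
    """Return (TV, sep, l_inf) to uniform for one pass with m shelves."""
    U = Fraction(1, factorial(n))
    counts = valley_counts(n)
    probs = [shelf_prob(n, m, v) for v in range(len(counts))]
    tv = sum(c * abs(p - U) for c, p in zip(counts, probs)) / 2
    sep = max(1 - p / U for p in probs)
    linf = max(abs(1 - p / U) for p in probs)
    return float(tv), float(sep), float(linf)
```

A call such as `shelf_distances(52, 10)` evaluates the three distances for the ten-shelf machine; the separation is exactly 1 whenever m is at most 25, because permutations with 25 valleys then have probability zero.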

To prove Theorem 3.1, we will relate it to the following 2m-shuffle on the hyperoctahedral group $B_n$: cut the deck multinomially into 2m piles. Then flip over the odd numbered stacks,

and riﬄe the piles together, by dropping one card at a time from one of the stacks (at each

stage with probability proportional to stack size). When m = 1 this shuﬄe was studied in

Bayer and Diaconis (1992), and for larger m it was studied in Fulman (2001).

It will be helpful to have a description of the inverse of this 2m-shuﬄe. To each of the

numbers {1, . . . , n} is assigned independently and uniformly at random one of −1, 1, −2, 2,

. . . , −m, m. Then a signed permutation is formed by starting with numbers mapped to −1

(in decreasing order and with negative signs), continuing with the numbers mapped to 1

(in increasing order and with positive signs), then continuing to the numbers mapped to

−2 (in decreasing order and with negative signs), and so on. For example the assignment

{1, 3, 8} ↦ −1, {5} ↦ 1, {2, 7} ↦ 2, {6} ↦ −3, {4} ↦ 3

leads to the signed permutation

−8 −3 −1 5 2 7 −6 4.
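The inverse 2m-shuffle on signed permutations can be written out as follows. This is a minimal sketch; the function names and the dictionary encoding of the assignment are assumptions made for illustration.

```python
import random


def signed_permutation_from_assignment(assignment, m):
    """Build the signed permutation from a map card -> value in {-1, 1, ..., -m, m}.

    For each level 1..m, cards sent to -level come first (decreasing, with
    negative signs), followed by cards sent to +level (increasing, positive).
    """
    out = []
    for level in range(1, m + 1):
        negative = sorted((c for c, s in assignment.items() if s == -level), reverse=True)
        positive = sorted(c for c, s in assignment.items() if s == level)
        out.extend(-c for c in negative)
        out.extend(positive)
    return out


def random_inverse_2m_shuffle(n, m, rng=random):
    """Assign each of 1..n independently and uniformly to one of the 2m values."""
    values = [s * level for level in range(1, m + 1) for s in (-1, 1)]
    assignment = {card: rng.choice(values) for card in range(1, n + 1)}
    return signed_permutation_from_assignment(assignment, m)
```

Applied to the assignment in the example, with m = 3, the first function returns the signed permutation −8 −3 −1 5 2 7 −6 4.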

The proof of Theorem 3.1 depends on an interesting relation with shuﬄes for signed

permutations (hyperoctahedral group). This is given next followed by the proof of Theorem

3.1.

Theorem 3.2 gives a formula for the probability for w after a hyperoctahedral 2m-shuﬄe,

when one forgets signs. Here p(w) is the number of peaks of w, where i is said to be a peak

of w if 1 < i < n and w(i − 1) < w(i) > w(i + 1). Also Λ(w) denotes the peak set of w

and D(w) denotes the descent set of w (i.e., the set of points i such that w(i) > w(i + 1)).

Finally, let [n] = {1, . . . , n}.


Theorem 3.2. The chance of a permutation w obtained by performing a 2m shuffle on the hyperoctahedral group and then forgetting signs is

\[ \frac{4^{p(w^{-1})+1}}{2(2m)^n} \sum_{a=0}^{m-1} \binom{n+m-a-1}{n} \binom{n-1-2p(w^{-1})}{a-p(w^{-1})} \]

where $p(w^{-1})$ is the number of peaks of $w^{-1}$.

Proof. Let $P'(m)$ denote the set of nonzero integers of absolute value at most m, totally ordered so that

\[ -1 \prec 1 \prec -2 \prec 2 \prec \cdots \prec -m \prec m. \]

Then given a permutation $w = (w_1, \cdots, w_n)$, page 768 of Stembridge (1997) defines a quantity $\Delta(w)$. (Stembridge calls it $\Delta(w, \gamma)$, but throughout we always choose $\gamma$ to be the identity map on [n], and so suppress the symbol $\gamma$ whenever he uses it). By definition, $\Delta(w)$ enumerates the number of maps $f : [n] \mapsto P'(m)$ such that

• $f(w_1) \preceq \cdots \preceq f(w_n)$
• $f(w_i) = f(w_{i+1}) > 0 \Rightarrow i \notin D(w)$
• $f(w_i) = f(w_{i+1}) < 0 \Rightarrow i \in D(w)$

We claim that the number of maps $f : [n] \mapsto P'(m)$ with the above three properties is equal to $(2m)^n$ multiplied by the chance that a hyperoctahedral 2m-shuffle results in the permutation $w^{-1}$. This is most clearly explained by example:

w =  8  3  1  5  2  7  6  4
f = −1 −1 −1  1  2  2 −3  3

From the inverse description of the hyperoctahedral 2m-shuffle stated before the proof, the assignment yields w. This proves the claim.

Let $\Lambda(w)$ denote the set of peaks of w. From Proposition 3.5 of Stembridge (1997),

\[ \Delta(w) = 2^{p(w)+1} \sum_{E \subseteq [n-1] :\, \Lambda(w) \subseteq E \triangle (E+1)} L_E. \]

Here

\[ L_E = \sum_{\substack{1 \le i_1 \le \cdots \le i_n \le m \\ k \in E \Rightarrow i_k < i_{k+1}}} 1, \]

and $\triangle$ denotes symmetric difference, i.e., $A \triangle B = (A - B) \cup (B - A)$. Now a simple combinatorial argument shows that $L_E = \binom{n+m-|E|-1}{n}$. Indeed, $L_E$ is equal to the number of integral $i_1, \ldots, i_n$ with $1 \le i_1 \le \cdots \le i_n \le m - |E|$, which by a “stars and bars” argument is $\binom{n+m-|E|-1}{n}$. Thus

\[ \Delta(w) = 2^{p(w)+1} \sum_{E \subseteq [n-1] :\, \Lambda(w) \subseteq E \triangle (E+1)} \binom{n+m-|E|-1}{n}. \]

Now let us count the number of E of size a appearing in this sum. For each $j \in \Lambda(w)$, exactly one of j or j − 1 must belong to E, and the remaining n − 1 − 2p(w) elements of [n − 1] can be independently and arbitrarily included in E. Thus the number of sets E of size a appearing in the sum is $2^{p(w)} \binom{n-1-2p(w)}{a-p(w)}$. Hence

\[ \Delta(w) = \frac{4^{p(w)+1}}{2} \sum_{a=0}^{n-1} \binom{n+m-a-1}{n} \binom{n-1-2p(w)}{a-p(w)}, \]

which completes the proof.


Proof of Theorem 3.1. To deduce Theorem 3.1 from Theorem 3.2, it is not hard to see that a shelf shuffle with m shelves is equivalent to taking $w'$ to be the inverse of a permutation after a hyperoctahedral 2m-shuffle, then taking a permutation w defined by $w(i) = n - w'(i) + 1$. Thus the shelf shuffle formula is obtained from the hyperoctahedral 2m-shuffle formula by replacing peaks by valleys.

Remarks.

• The paper Fulman (2001) gives an explicit formula for the chance of a signed permutation after a 2m-shuffle on $B_n$ in terms of cyclic descents. Namely it shows this probability to be

\[ \frac{\binom{m+n-cd(w^{-1})}{n}}{(2m)^n} \]

where cd(w) is the number of cyclic descents of w, defined as follows: Ordering the integers 1 < 2 < 3 < ··· < ··· < −3 < −2 < −1,

– w has a cyclic descent at position i for 1 ≤ i ≤ n − 1 if w(i) > w(i + 1).
– w has a cyclic descent at position n if w(n) < 0.
– w has a cyclic descent at position 1 if w(1) > 0.

For example the permutation 3 1 −2 4 5 has two cyclic descents at position 1 and a cyclic descent at position 3, so cd(w) = 3.

This allows one to study aspects of shelf shufflers by lifting the problem to $B_n$, using cyclic descents (where calculations are often easier), then forgetting about signs. This idea was used in Fulman (2001) to study the cycle structure of unimodal permutations, and in Aguiar et al. (2004) to study peak algebras of types B and D.

• The appearance of peaks in the study of shelf shuﬄers is interesting, as peak algebras

have appeared in various parts of mathematics. Nyman (2003) proves that the

peak algebra is a subalgebra of the symmetric group algebra, and connections with

geometry of polytopes can be found in Aguiar et al. (2006) and Billera et al. (2003).

There are also close connections with the theory of P -partitions Petersen (2005,

2007); Stembridge (1997).
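The cyclic descent statistic in the first remark above can be computed mechanically. The sketch below follows the three rules exactly as stated there; the function names are hypothetical, and the second function takes the inverse signed permutation as its argument rather than computing the inverse itself.

```python
from math import comb


def cyclic_descents(w):
    """cd(w) for a signed permutation w, given as a list of nonzero integers.

    Values are compared in the order 1 < 2 < 3 < ... < -3 < -2 < -1,
    so every negative entry exceeds every positive entry.
    """
    def key(x):
        return (0, x) if x > 0 else (1, x)

    n = len(w)
    cd = sum(1 for i in range(n - 1) if key(w[i]) > key(w[i + 1]))
    if w[-1] < 0:        # cyclic descent at position n
        cd += 1
    if w[0] > 0:         # additional cyclic descent at position 1
        cd += 1
    return cd


def signed_shuffle_prob(w_inverse, m):
    """Binomial(m + n - cd(w^{-1}), n) / (2m)^n, with w^{-1} supplied by the caller."""
    n = len(w_inverse)
    return comb(m + n - cyclic_descents(w_inverse), n) / (2 * m) ** n
```

On the example 3 1 −2 4 5 the first function returns 3, and for n = 2 the probabilities sum to 1 over the eight signed permutations.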

The following corollary shows that for a shelf shuﬄer of n cards with m shelves, the

chance of a permutation w with v valleys is monotone decreasing in v. Thus, the identity

(or any other unimodal permutation) is most likely and an alternating permutation (down,

up, down, up, . . . ) is least likely. From Theorem 3.1, the chance of a ﬁxed permutation

with v valleys is

\[ P(v) = \frac{4^{v+1}}{2(2m)^n} \sum_{a=0}^{n-1} \binom{n+m-1-a}{n} \binom{n-1-2v}{a-v}. \tag{3.1} \]

Corollary 3.3. For P (v) deﬁned at (3.1), P (v) ≥ P (v + 1), 0 ≤ v ≤ (n − 1)/2.

Proof. Canceling common terms, and setting a − v = j (so a = j + v) in (3.1), we have

\[ \frac{2(2m)^n}{4^{v+1}} P(v) = \sum_{j=0}^{n-1-2v} f(j+v) \binom{n-1-2v}{j} = 2^{n-1-2v} E(f(S_{n-1-2v} + v)) \]

with $f(a) = \binom{n+m-1-a}{n}$ and $S_{n-1-2v}$ distributed as Binomial$(n - 1 - 2v, \frac{1}{2})$. The proposed inequality is equivalent to

\[ E(f(S_{n-1-2v} + v)) \ge E(f(S_{n-1-2v-2} + v + 1)). \tag{3.2} \]

To prove this, represent $S_{n-1-2v} = S_{n-1-2v-2} + Y_1 + Y_2$, with $Y_i$ independent taking values in {0, 1}, with probability 1/2. Then (3.2) is equivalent to

\[ \sum_{j} \left[ \frac{1}{4} f(j+v) + \frac{1}{2} f(j+v+1) + \frac{1}{4} f(j+v+2) - f(j+v+1) \right] P\{S_{n-1-2v-2} = j\} \ge 0. \tag{3.3} \]

Thus if $\frac{1}{2} f(j+v) + \frac{1}{2} f(j+v+2) \ge f(j+v+1)$, e.g., f(a) is convex, we are done. Writing out the expression $f(a) + f(a+2) \ge 2f(a+1)$ and canceling common terms, it must be shown that

\[ (m+n-1-a)(m+n-2-a) + (m-1-a)(m-2-a) \ge 2(m+n-2-a)(m-1-a) \tag{3.4} \]

for all 0 ≤ a ≤ n − 1. Subtracting the right side from the left, the coefficients of $a^2$ and a cancel, leaving n(n − 1) ≥ 0.
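The cancellation in the last step of the proof can be written out explicitly. As a worked check (the shorthand x = m − 1 − a is introduced here for convenience and is not in the original), the left side of (3.4) minus the right side is:

```latex
\begin{align*}
&(x+n)(x+n-1) + x(x-1) - 2(x+n-1)\,x \\
&\quad= \bigl[x^2 + (2n-1)x + n(n-1)\bigr] + \bigl[x^2 - x\bigr] - \bigl[2x^2 + 2(n-1)x\bigr] \\
&\quad= \bigl[(2n-1) - 1 - 2(n-1)\bigr]\,x + n(n-1) \\
&\quad= n(n-1).
\end{align*}
```

The difference does not depend on x, hence not on a or m, and it is nonnegative for every n ≥ 1.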

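3.3. Asymptotics for the $\|P - U\|_\infty$ and separation distances. Recall the distances

\[ \|P - U\|_\infty = \max_{w} \left| 1 - \frac{P(w)}{U(w)} \right| \quad \text{and} \quad \operatorname{sep}(P) = \max_{w} \left( 1 - \frac{P(w)}{U(w)} \right). \]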

Theorem 3.4. Consider the shelf shuffling measure $P_m$ with n cards and m shelves. Suppose that $m = cn^{3/2}$. Then, as n tends to infinity with 0 < c < ∞ fixed,

\[ \|P_m - U\|_\infty \sim e^{1/(12c^2)} - 1, \]

\[ \operatorname{sep}(P_m) \sim 1 - e^{-1/(24c^2)}. \]

Remark. We find it surprising that this many shelves are needed. For example, when n = 52, to make the distance less than 1/100, $m \doteq 1{,}085$ shelves are required for $\|P_m - U\|_\infty$ and $m \doteq 764$ are required for $\operatorname{sep}(P_m)$.
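The shelf counts quoted in the remark come from inverting the two asymptotic formulas of Theorem 3.4 for c and setting $m = cn^{3/2}$. A small sketch of that arithmetic follows; the function names are hypothetical, and treating the asymptotic expressions as exact at n = 52 is an assumption.

```python
import math


def shelves_for_linf(n, eps):
    """Solve exp(1/(12 c^2)) - 1 = eps for c and return m = c * n^(3/2)."""
    c = math.sqrt(1.0 / (12.0 * math.log1p(eps)))
    return c * n ** 1.5


def shelves_for_sep(n, eps):
    """Solve 1 - exp(-1/(24 c^2)) = eps for c and return m = c * n^(3/2)."""
    c = math.sqrt(1.0 / (-24.0 * math.log1p(-eps)))
    return c * n ** 1.5
```

With n = 52 and eps = 1/100, both results lie within one shelf of the values 1,085 and 764 stated in the remark.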

Proof. Using Corollary 3.3, the distance is achieved at the identity permutation or a permutation with $\lfloor (n-1)/2 \rfloor$ valleys. For the identity, consider $n! P_m(\mathrm{id})$. Using Theorem 3.1,

\[ n! P_m(\mathrm{id}) = \frac{2(n!)}{(2m)^n} \sum_{a=0}^{n-1} \binom{m+n-a-1}{n} \binom{n-1}{a}. \tag{3.5} \]

To bound this sum, observe that $\binom{n-1}{a} / 2^{n-1}$ is the binomial probability density. To keep the bookkeeping simple, assume throughout that n is odd. The argument for even n is similar.

For $a = \frac{n-1}{2} + j$, the local central limit theorem as in Feller (1968, Chap. VII.2), shows

\[ \frac{\binom{n-1}{\frac{n-1}{2}+j}}{2^{n-1}} \sim \frac{e^{-2j^2/n}}{\sqrt{\pi n/2}} \quad \text{for } j = o(n^{2/3}). \tag{3.6} \]

In the following, we show further that

\[ \frac{n!}{m^n} \binom{m + \frac{n-1}{2} - j}{n} \sim e^{-\frac{1}{24c^2} + \frac{j}{c\sqrt{n}}} \quad \text{uniformly for } j = o(n). \tag{3.7} \]

Combining (3.6), (3.7), gives a Riemann sum for the integral

\[ \frac{e^{-\frac{1}{24c^2}}}{\sqrt{\pi/2}} \int_{-\infty}^{\infty} e^{-2x^2 + x/c} \, dx = e^{1/(12c^2)}, \]

the claimed result. This part of the argument follows Feller (1968, Chap. VII.2) and we suppress further details. To complete the argument the tails of the sum in (3.5) must be bounded.

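We first prove (3.7). From the definitions

\[ \frac{n!}{m^n} \binom{m - j + \frac{n-1}{2}}{n} = \prod_{i=-\frac{n-1}{2}}^{\frac{n-1}{2}} \left( 1 - \frac{j}{m} + \frac{i}{m} \right) \]

using $\log(1 - x) = -x - \frac{x^2}{2} + O(x^3)$,

\begin{align*}
\sum_{i=-\frac{n-1}{2}}^{\frac{n-1}{2}} \log\left( 1 - \frac{j}{m} + \frac{i}{m} \right)
&= -\sum_{i} \left( -\frac{j}{m} + \frac{i}{m} \right) - \frac{1}{2} \sum_{i} \left( -\frac{j}{m} + \frac{i}{m} \right)^2 + nO\left( \left( \frac{n}{m} \right)^3 \right) \\
&= \frac{nj}{m} - \frac{1}{2} \left( \frac{nj^2}{m^2} + \frac{1}{12} \frac{n(n^2-1)}{m^2} \right) + O\left( \frac{1}{\sqrt{n}} \right) \\
&= \frac{j}{c\sqrt{n}} - \frac{j^2}{2c^2n^2} - \frac{1}{24c^2} + O\left( \frac{1}{\sqrt{n}} \right). \tag{3.8}
\end{align*}

The error term in (3.8) is uniform in j. For j = o(n), $j^2/n^2 = o(1)$ and (3.7) follows.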

To bound the tails of the sum, first observe that (3.8) implies that $\frac{n!}{m^n} \binom{m - j + \frac{n-1}{2}}{n} = e^{O(\sqrt{n})}$ for all j. From Bernstein’s inequality, if $X_i = \pm 1$ with probability 1/2, $P(|X_1 + \cdots + X_{n-1}| > a) \le 2e^{-a^2/(n-1)}$. Using this, the sum over $|j| \ge An^{3/4}$ is negligible for A sufficiently large. The Gaussian approximation to the binomial works for $j \ll n^{2/3}$. To bound the sum for |j| between $n^{2/3}$ and $n^{3/4}$, observe from (3.8) that in this range, $\frac{n!}{m^n} \binom{m - j + \frac{n-1}{2}}{n} = O\left( e^{n^{1/4}} \right)$. Then, Feller (1968, p. 195) shows

\[ \frac{\binom{n-1}{\frac{n-1}{2}+j}}{2^{n-1}} \sim \frac{1}{\sqrt{\pi n/2}} \, e^{-\frac{1}{2} (j)^2/(n/4) - f\left( j/\sqrt{n/4} \right)} \]

with

\[ f(x) = \sum_{a=3}^{\infty} \frac{\left( \frac{1}{2} \right)^{a-1} + \left( -\frac{1}{2} \right)^{a-1}}{a(a-1)} \left( \frac{1}{\sqrt{n/4}} \right)^{a-2} x^a = c_1 \frac{x^4}{n} + c_2 \frac{x^6}{n^2} + \ldots \]

for explicit constants $c_1, c_2, \ldots$. For $\theta_1 n^{2/3} \le |j| \le \theta_2 n^{3/4}$, the sum under study is dominated by $A \sum_{j \ge n^{2/3}} e^{-Bj^{1/6}}$ which tends to zero.

The separation distance is achieved at permutations with $\frac{n-1}{2}$ valleys (recall we are assuming that n is odd). From (3.1),

\[ 1 - n! P_m\left( \frac{n-1}{2} \right) = 1 - \frac{n!}{m^n} \binom{m + \frac{n-1}{2}}{n}. \]

The result now follows from (3.7) with j = 0.

Remark. A similar argument allows asymptotic evaluation of total variation. We have not

carried out the details.

3.4. Distribution of cycle type. The number of ﬁxed points and the number of cycles

are classic descriptive statistics of a permutation. More generally, the number of i-cycles for

1 ≤ i ≤ n has been intensively studied Shepp and Lloyd (1966); Diaconis et al. (1995). This

section investigates the distribution of cycle type of a permutation w produced from a shelf

shuﬄer with m shelves and n cards. Similar results for ordinary riﬄe shuﬄes appeared in

Diaconis et al. (1995), and closely related results in the type B case (not in the language

of shelf-shuﬄing) appear in Fulman (2001, 2002). Recall also that in the case of one shelf,

the shelf shuffler generates one of the $2^{n-1}$ unimodal permutations uniformly at random.

The cycle structure of unimodal permutations has been studied in several papers in the

literature: see Fulman (2001, 2002); Thibon (2001) for algebraic/combinatorial approaches

and Gannon (2001); Rogers (1981) for approaches using dynamical systems.


For what follows, we define

\[ f_{i,m} = \frac{1}{2i} \sum_{\substack{d \mid i \\ d \text{ odd}}} \mu(d) (2m)^{i/d} \]

where $\mu$ is the Möbius function of elementary number theory.
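The quantity $f_{i,m}$ is easy to evaluate numerically. The sketch below is illustrative only: the names `mobius`, `f_im` and `cycle_product` are hypothetical, and the last function evaluates a truncated product of the factors $((1 + q^i)/(1 - q^i))^{f_{i,m}}$ with q = u/2m, which by the theory should approximate 1/(1 − u) for 0 < u < 1 (this is the all-$x_i$-equal-to-one case of the cycle index).

```python
import math
from fractions import Fraction


def mobius(d):
    """Moebius function mu(d), computed by trial division."""
    result = 1
    p = 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0          # d has a squared prime factor
            result = -result
        p += 1
    if d > 1:
        result = -result          # one remaining prime factor
    return result


def f_im(i, m):
    """f_{i,m} = (1/(2i)) * sum over odd divisors d of i of mu(d) (2m)^(i/d)."""
    total = sum(mobius(d) * (2 * m) ** (i // d)
                for d in range(1, i + 1, 2) if i % d == 0)
    return Fraction(total, 2 * i)


def cycle_product(u, m, terms):
    """Truncated product over i <= terms of ((1 + q^i)/(1 - q^i))^f_{i,m}, q = u/(2m)."""
    q = u / (2 * m)
    log_sum = sum(float(f_im(i, m)) * (math.log1p(q ** i) - math.log1p(-(q ** i)))
                  for i in range(1, terms + 1))
    return math.exp(log_sum)
```

Logarithms are accumulated with `math.log1p` because the individual factors are extremely close to 1 while the exponents $f_{i,m}$ grow quickly.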

Theorem 3.5. Let $P_m(w)$ denote the probability that a shelf shuffler with m shelves produces a permutation w. Let $N_i(w)$ denote the number of i-cycles of a permutation w in $S_n$. Then

\[ 1 + \sum_{n \ge 1} u^n \sum_{w \in S_n} P_m(w) \prod_{i \ge 1} x_i^{N_i(w)} = \prod_{i \ge 1} \left( \frac{1 + x_i (u/2m)^i}{1 - x_i (u/2m)^i} \right)^{f_{i,m}}. \tag{3.9} \]

Proof. By the proof of Theorem 3.1, a permutation produced by a shelf shuﬄer with m

shelves is equivalent to forgetting signs after the inverse of a type B riﬄe shuﬄe with 2m

piles, then conjugating by the longest element n, n − 1, . . . , 1. Since a permutation and

its inverse have the same cycle type and conjugation leaves cycle type invariant, the result

follows from either (Fulman, 2001, Thm. 7) or (Fulman, 2002, Thm. 9), both of which

derived the generating function for cycle type after type B shuﬄes.

Theorem 3.5 leads to several corollaries. We say that a random variable X is Binomial(n, p) if $P(X = j) = \binom{n}{j} p^j (1-p)^{n-j}$, 0 ≤ j ≤ n, and that X is negative binomial with parameters (f, p) if $P(X = j) = \binom{f+j-1}{j} p^j (1-p)^f$, 0 ≤ j < ∞.

Corollary 3.6. Let $N_i(w)$ be the number of $i$-cycles of a permutation $w$.
(1) Fix $u$ such that $0 < u < 1$. Then choose a random number $N$ of cards so that $P(N = n) = (1-u)u^n$. Let $w$ be produced by a shelf shuffler with $m$ shelves and $N$ cards. Then any finite number of the random variables $\{N_i\}$ are independent, and $N_i$ is distributed as the convolution of a Binomial$\left(f_{i,m}, \frac{(u/2m)^i}{1+(u/2m)^i}\right)$ and a negative binomial with parameters $(f_{i,m}, (u/2m)^i)$.
(2) Let $w$ be produced by a shelf shuffler with $m$ shelves and $n$ cards. Then in the $n \to \infty$ limit, any finite number of the random variables $\{N_i\}$ are independent. The $N_i$ are distributed as the convolution of a Binomial$\left(f_{i,m}, \frac{1}{(2m)^i+1}\right)$ and a negative binomial with parameters $(f_{i,m}, (1/2m)^i)$.

Proof. Setting all $x_i = 1$ in equation (3.9) yields the equation
\[ (1-u)^{-1} = \prod_{i \ge 1} \left( \frac{1 + (u/2m)^i}{1 - (u/2m)^i} \right)^{f_{i,m}}. \tag{3.10} \]
Taking reciprocals of equation (3.10) and multiplying by equation (3.9) gives the equality
\[ (1-u) + \sum_{n \ge 1} (1-u) u^n \sum_{w \in S_n} P_m(w) \prod_{i \ge 1} x_i^{N_i(w)} = \prod_{i \ge 1} \left( \frac{1 + x_i (u/2m)^i}{1 + (u/2m)^i} \right)^{f_{i,m}} \cdot \left( \frac{1 - (u/2m)^i}{1 - x_i (u/2m)^i} \right)^{f_{i,m}}. \tag{3.11} \]
This proves part 1 of the corollary, the first factor on the right corresponding to the convolution of binomials, and the second factor to the convolution of negative binomials.

The second part follows from the claim that if a generating function $f(u)$ has a Taylor series which converges at $u = 1$, then the $n \to \infty$ limit of the coefficient of $u^n$ in $f(u)/(1-u)$ is $f(1)$. Indeed, write the Taylor expansion $f(u) = \sum_{n=0}^{\infty} a_n u^n$ and observe that the coefficient of $u^n$ in $f(u)/(1-u)$ is $\sum_{i=0}^{n} a_i$. Now apply the claim to equation (3.11) with all but finitely many $x_i$ equal to 1.

Remark. For example, when $i = 1$, $f_{i,m} = m$; the number of fixed points is distributed as a sum of a Binomial$\left(m, \frac{1}{2m+1}\right)$ and a negative binomial$\left(m, \frac{1}{2m}\right)$. Each of these converges to Poisson(1/2) as $m$ grows, and so the number of fixed points is approximately Poisson(1). A similar analysis holds for the other cycle counts. Corollary 3.6 could also be proved by the method of moments, along the lines of the arguments of Diaconis et al. (1995) for the case of ordinary riffle shuffles.
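To see how close the limiting fixed-point law is to Poisson(1) for a given number of shelves, one can convolve the two distributions numerically. The sketch below uses the Binomial and negative binomial conventions stated before Corollary 3.6; the function names and the truncation point `cutoff` are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb, exp, factorial


def binomial_pmf(n, p, j):
    """P(X = j) for X Binomial(n, p)."""
    return comb(n, j) * p ** j * (1 - p) ** (n - j)


def negbin_pmf(f, p, j):
    """P(X = j) = C(f+j-1, j) p^j (1-p)^f, the convention used in the text."""
    return comb(f + j - 1, j) * p ** j * (1 - p) ** f


def fixed_point_pmf(m, cutoff=40):
    """Limiting law of the number of fixed points for m shelves:
    Binomial(m, 1/(2m+1)) convolved with negative binomial (m, 1/(2m)),
    truncated to the values 0, ..., cutoff - 1."""
    pb = 1.0 / (2 * m + 1)
    pn = 1.0 / (2 * m)
    b = [binomial_pmf(m, pb, j) if j <= m else 0.0 for j in range(cutoff)]
    g = [negbin_pmf(m, pn, j) for j in range(cutoff)]
    return [sum(b[a] * g[j - a] for a in range(j + 1)) for j in range(cutoff)]


def tv_to_poisson1(m, cutoff=40):
    """Total variation distance to Poisson(1), ignoring mass beyond the cutoff."""
    pmf = fixed_point_pmf(m, cutoff)
    poisson = [exp(-1.0) / factorial(j) for j in range(cutoff)]
    return 0.5 * sum(abs(a - b) for a, b in zip(pmf, poisson))


def limiting_mean(m):
    """Exact mean of the limiting law: m/(2m+1) + m/(2m-1) = 4m^2/(4m^2-1)."""
    return Fraction(m, 2 * m + 1) + Fraction(m, 2 * m - 1)
```

The exact mean $4m^2/(4m^2-1)$ already shows the approach to 1 as $m$ grows; `tv_to_poisson1(m)` gives a numerical view of the same convergence.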

For the next result, recall that the limiting distribution of the large cycles of a uniformly chosen permutation in $S_n$ has been determined by Goncharov (Gontcharoff (1942); Goncharov (1944)), Shepp and Lloyd (1966), Vershik and Shmidt (1977, 1978), and others. For instance the average length of the longest cycle $L_1$ is approximately $.63n$ and $L_1/n$ has a known limiting distribution. The next result shows that even with a fixed number of shelves, the distribution of the large cycles approaches that of a uniform random permutation, as long as the number of cards is growing. We omit the proof, which goes exactly along the lines of the corresponding result for riffle shuffles in Diaconis et al. (1995).

Corollary 3.7. Fix $k$ and let $L_1(w), L_2(w), \dots, L_k(w)$ be the lengths of the $k$ longest cycles of $w \in S_n$ produced by a shelf shuffler with $m$ shelves. Then for $m$ fixed, or growing with $n$, as $n \to \infty$,
\[ \left| P_m\{L_1/n \le t_1, \dots, L_k/n \le t_k\} - P_\infty\{L_1/n \le t_1, \dots, L_k/n \le t_k\} \right| \to 0 \]
uniformly in $t_1, t_2, \dots, t_k$.

As a ﬁnal corollary, we note that Theorems 3.1 and 3.5 give the following generating

function for the joint distribution of permutations by valleys and cycle type. Note that

this gives the joint generating function for the distribution of permutations by peaks and

cycle type, since conjugating by the permutation n, n−1, . . . , 1 preserves the cycle type and

swaps valleys and peaks.

Corollary 3.8. Let $v(w)$ denote the number of valleys of a permutation $w$. Then
\[ \frac{t}{1-t} + \sum_{n \ge 1} u^n \sum_{w \in S_n} \frac{1}{2} \frac{(1+t)^{n+1}}{(1-t)^{n+1}} \left( \frac{4t}{(1+t)^2} \right)^{v(w)+1} \prod_{i \ge 1} x_i^{N_i(w)} = \sum_{m \ge 1} t^m \prod_{i \ge 1} \left( \frac{1 + x_i u^i}{1 - x_i u^i} \right)^{f_{i,m}}. \]
The same result holds with $v(w)$ replaced by $p(w)$, the number of peaks of $w$.

Remark. There is a large literature on the joint distribution of permutations by cycles

and descents Gessel and Reutenauer (1993); Diaconis et al. (1995); Reiner (1993); Fulman

(2000b); Blessenohl et al. (2005); Poirier (1998) and by cycles and cyclic descents Fulman

(2001, 2002, 2000a), but Corollary 3.8 seems to be the ﬁrst result on the joint distribution

by cycles and peaks.

3.5. Distribution of RSK shape. In this section we obtain the distribution of the Robinson–Schensted–Knuth (RSK) shape of a permutation $w$ produced from a shelf shuffler with $m$ shelves and $n$ cards. For background on the RSK algorithm, see Stanley (1999). The RSK bijection associates to a permutation $w \in S_n$ a pair of standard Young tableaux $(P(w), Q(w))$ of the same shape and size $n$. $Q(w)$ is called the recording tableau of $w$.

To state our main result, we use a symmetric function $S_\lambda$ studied in Stembridge (1997) (a special case of the extended Schur functions in Kerov and Vershik (1986)). One definition of the $S_\lambda$ is as the determinant
\[ S_\lambda(y) = \det\left( q_{\lambda_i - i + j} \right) \]
where $q_{-r} = 0$ for $r > 0$ and for $r \ge 0$, $q_r$ is defined by setting
\[ \sum_{n \ge 0} q_n t^n = \prod_{i \ge 1} \frac{1 + y_i t}{1 - y_i t}. \]
We also let $f_\lambda$ denote the number of standard Young tableaux of shape $\lambda$.

Theorem 3.9. The probability that a shelf shuffler with $m$ shelves and $n$ cards produces a permutation with recording tableau $T$ is equal to
\[ \frac{1}{2^n} S_\lambda\left( \frac{1}{m}, \dots, \frac{1}{m} \right) \]
for any $T$ of shape $\lambda$, where $S_\lambda$ has $m$ variables. Thus the probability that $w$ has RSK shape $\lambda$ is equal to
\[ \frac{f_\lambda}{2^n} S_\lambda\left( \frac{1}{m}, \dots, \frac{1}{m} \right). \]
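The formula of Theorem 3.9 can be evaluated exactly for small decks. The sketch below computes $q_r$ at $y_1 = \cdots = y_m = 1/m$ from the generating function above, builds $S_\lambda$ from the determinant definition, and obtains $f_\lambda$ from the hook length formula (a standard fact, not taken from this section). All function names are ours, and the Laplace-expansion determinant is chosen for brevity and is only suitable for very small $n$.

```python
from fractions import Fraction
from math import comb, factorial


def partitions(n, largest=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def q(r, m):
    """q_r at y_1 = ... = y_m = 1/m: coefficient of t^r in ((1+t/m)/(1-t/m))^m."""
    if r < 0:
        return Fraction(0)
    total = sum(comb(m, a) * comb(m + r - a - 1, r - a) for a in range(r + 1))
    return Fraction(total, m ** r)


def det(matrix):
    """Determinant by Laplace expansion along the first row."""
    if not matrix:
        return Fraction(1)
    total = Fraction(0)
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def S(lam, m):
    """S_lambda(1/m, ..., 1/m) = det(q_{lambda_i - i + j})."""
    size = len(lam)
    return det([[q(lam[i] - i + j, m) for j in range(size)] for i in range(size)])


def num_syt(lam):
    """Number of standard Young tableaux of shape lam (hook length formula)."""
    n = sum(lam)
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []
    hooks = 1
    for i, part in enumerate(lam):
        for j in range(part):
            hooks *= (part - j - 1) + (conj[j] - i - 1) + 1   # arm + leg + 1
    return factorial(n) // hooks


def shape_distribution(n, m):
    """Map each partition lam of n to f_lam * S_lam(1/m, ..., 1/m) / 2^n."""
    return {lam: Fraction(num_syt(lam), 2 ** n) * S(lam, m) for lam in partitions(n)}
```

For $n = 3$ cards and $m = 1$ shelf this gives probabilities $1/4$, $1/2$, $1/4$ for the shapes $(3)$, $(2,1)$, $(1,1,1)$, and the probabilities sum to 1 for any $n$ and $m$.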

Proof. By the proof of Theorem 3.1, a permutation produced by a shelf shuﬄer with m

shelves is equivalent to forgetting signs after the inverse of a type B 2m-shuﬄe, and then

conjugating by the permutation n, n − 1, . . . , 1. Since a permutation and its inverse have the

same RSK shape (Stanley, 1999, Sect. 7.13), and conjugation by n, n − 1, . . . , 1 leaves the

RSK shape unchanged (Stanley, 1999, Thm. A1.2.10), the result follows from Theorem 8

of Fulman (2002), which studied RSK shape after type B riﬄe shuﬄes.

3.6. Distribution of descents. A permutation w is said to have a descent at position i

(1 ≤ i ≤ n − 1) if w(i) > w(i + 1). We let d(w) denote the total number of descents of w.

For example the permutation 3 1 5 4 2 has d(w) = 3 and descent set {1, 3, 4}. The purpose of

this section is to derive a generating function for the number of descents in a permutation

w produced by a shelf shuﬄer with m shelves and n cards. More precisely, we prove the

following result.
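The definition of a descent translates directly into code. This minimal sketch (with function names of our choosing) represents a permutation in one-line notation as a Python list and reports descent positions with 1-based indexing, as in the text.

```python
def descent_set(w):
    """Positions i (1-indexed) with w(i) > w(i+1)."""
    return [i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]]


def d(w):
    """Total number of descents of w."""
    return len(descent_set(w))
```

Applied to the example above, `descent_set([3, 1, 5, 4, 2])` gives the positions 1, 3, 4.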

Theorem 3.10. Let $P_m(w)$ denote the probability that a shelf shuffler with $m$ shelves and $n$ cards produces a permutation $w$. Letting $[u^n]f(u)$ denote the coefficient of $u^n$ in a power series $f(u)$, one has that
\[ \sum_{w \in S_n} P_m(w)\, t^{d(w)+1} = \frac{(1-t)^{n+1}}{2^n} \sum_{k \ge 1} t^k\, [u^n] \frac{(1+u/m)^{km}}{(1-u/m)^{km}}. \tag{3.12} \]

The proof uses the result about RSK shape mentioned in Section 3.5, and symmetric

function theory; background on these topics can be found in the texts Stanley (1999) and

Macdonald (1995) respectively.

Proof. Let $w$ be a permutation produced by a shelf shuffler with $m$ shelves and $n$ cards. The RSK correspondence associates to $w$ a pair of standard Young tableaux $(P(w), Q(w))$ of the same shape. Moreover, there is a notion of descent set for standard Young tableaux, and by Lemma 7.23.1 of Stanley (1999), the descent set of $w$ is equal to the descent set of $Q(w)$. Let $f_\lambda(r)$ denote the number of standard Young tableaux of shape $\lambda$ with $r$ descents. Then Theorem 3.9 implies that
\[ P(d(w) = r) = \sum_{|\lambda| = n} \frac{f_\lambda(r)}{2^n} S_\lambda\left( \frac{1}{m}, \dots, \frac{1}{m} \right). \]
By equation 7.96 of Stanley (1999), one has that
\[ \sum_{r \ge 0} f_\lambda(r)\, t^{r+1} = (1-t)^{n+1} \sum_{k \ge 1} s_\lambda(1, \dots, 1)\, t^k \]
where in the $k$th summand, $s_\lambda(1, \dots, 1)$ denotes the Schur function with $k$ variables specialized to 1. Thus
\begin{align*}
\sum_{r \ge 0} P(d(w) = r) \cdot t^{r+1}
&= \sum_{r \ge 0} \sum_{|\lambda| = n} \frac{f_\lambda(r)}{2^n} S_\lambda\left( \frac{1}{m}, \dots, \frac{1}{m} \right) \cdot t^{r+1} \\
&= \frac{(1-t)^{n+1}}{2^n} \sum_{k \ge 1} t^k \sum_{|\lambda| = n} S_\lambda\left( \frac{1}{m}, \dots, \frac{1}{m} \right) s_\lambda(1, \dots, 1) \\
&= \frac{(1-t)^{n+1}}{2^n} \sum_{k \ge 1} t^k\, [u^n] \sum_{n \ge 0} \sum_{|\lambda| = n} S_\lambda\left( \frac{1}{m}, \dots, \frac{1}{m} \right) s_\lambda(1, \dots, 1) \cdot u^n.
\end{align*}
From Appendix A.4 of Stembridge (1997), if $\lambda$ ranges over all partitions of all natural numbers, then
\[ \sum_{\lambda} s_\lambda(x) S_\lambda(y) = \prod_{i,j \ge 1} \frac{1 + x_i y_j}{1 - x_i y_j}. \]
Setting $x_1 = \cdots = x_k = u$ and $y_1 = \cdots = y_m = \frac{1}{m}$ completes the proof of the theorem.

For what follows we let $A_n(t) = \sum_{w \in S_n} t^{d(w)+1}$ be the generating function of elements in $S_n$ by descents. This is known as the Eulerian polynomial and from page 245 of Comtet (1974), one has that
\[ A_n(t) = (1-t)^{n+1} \sum_{k \ge 1} t^k k^n. \tag{3.13} \]
This also follows by letting $m \to \infty$ in equation (3.12).

The following corollary derives the mean and variance of the number of descents of a

permutation produced by a shelf shuﬄer.

Corollary 3.11. Let $w$ be a permutation produced by a shelf shuffler with $m$ shelves and $n \ge 2$ cards.
(1) The expected value of $d(w)$ is $\frac{n-1}{2}$.
(2) The variance of $d(w)$ is $\frac{n+1}{12} + \frac{n-2}{6m^2}$.

Proof. The first step is to expand $[u^n] \frac{(1+u/m)^{km}}{(1-u/m)^{km}}$ as a series in $k$. One calculates that
\begin{align*}
[u^n] \frac{(1+u/m)^{km}}{(1-u/m)^{km}}
&= \frac{1}{m^n} \sum_{a \ge 0} \binom{km}{a} \binom{km+n-a-1}{n-a} \\
&= \frac{1}{m^n} \sum_{a \ge 0} \frac{(km) \cdots (km-a+1)}{a!} \cdot \frac{(km+n-a-1) \cdots (km)}{(n-a)!} \\
&= \frac{1}{n!} \left[ 2^n k^n + \frac{2^n\, n(n-1)(n-2)}{12 m^2}\, k^{n-2} + \cdots \right]
\end{align*}
where the $\cdots$ in the last equation denote terms of lower order in $k$. Thus Theorem 3.10 gives
\[ \sum_{w} P_m(w)\, t^{d(w)+1} = \frac{(1-t)^{n+1}}{n!} \sum_{k \ge 1} t^k k^n + \frac{n-2}{12 m^2} (1-t)^2\, \frac{(1-t)^{n-1}}{(n-2)!} \sum_{k \ge 1} t^k k^{n-2} + (1-t)^3 C(t) \]
where $C(t)$ is a polynomial in $t$. By equation (3.13), it follows that
\[ \sum_{w} P_m(w)\, t^{d(w)+1} = \frac{A_n(t)}{n!} + (1-t)^2\, \frac{n-2}{12 m^2}\, \frac{A_{n-2}(t)}{(n-2)!} + (1-t)^3 C(t). \]
Since the number of descents of a random permutation has mean $(n-1)/2$ and variance $(n+1)/12$ for $n \ge 2$, it follows that $\frac{A_n'(1)}{n!} = \frac{n+1}{2}$ and also that $\frac{A_n''(1)}{n!} = (3n^2 + n - 2)/12$. Thus
\[ \sum_{w} P_m(w)\, d(w) = \frac{n-1}{2} \]
and
\[ \sum_{w} P_m(w)\, d(w)[d(w)+1] = \frac{3n^2 + n - 2}{12} + \frac{n-2}{6m^2} \]
and the result follows.
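Equation (3.12) and Corollary 3.11 can be checked against each other with exact arithmetic. The sketch below reads the descent distribution off the right side of (3.12), using the binomial-sum expression for $[u^n]$ from the proof, and then computes the mean and variance. The function names are ours.

```python
from fractions import Fraction
from math import comb


def c(k, n, m):
    """[u^n] ((1 + u/m)/(1 - u/m))^(km), via the binomial sum in the proof."""
    km = k * m
    total = sum(comb(km, a) * comb(km + n - a - 1, n - a) for a in range(n + 1))
    return Fraction(total, m ** n)


def descent_distribution(n, m):
    """Return [P(d(w) = 0), ..., P(d(w) = n - 1)] from the right side of (3.12).

    The coefficient of t^j in (1 - t)^(n+1) * sum_{k>=1} t^k c_k equals
    sum_{i=0}^{j-1} (-1)^i C(n+1, i) c_{j-i}, and t^j corresponds to d(w) = j - 1.
    """
    probs = []
    for j in range(1, n + 1):
        coeff = sum((-1) ** i * comb(n + 1, i) * c(j - i, n, m) for i in range(j))
        probs.append(coeff / 2 ** n)
    return probs


def mean_and_variance(probs):
    """Mean and variance of the distribution given as a list indexed by value."""
    mean = sum(r * p for r, p in enumerate(probs))
    second = sum(r * r * p for r, p in enumerate(probs))
    return mean, second - mean ** 2
```

For example, with $n = 3$ cards and $m = 2$ shelves the distribution of $d(w)$ on $\{0, 1, 2\}$ is $(3/16, 5/8, 3/16)$, whose variance $3/8$ agrees with $\frac{n+1}{12} + \frac{n-2}{6m^2}$.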

Remarks.

• Part 1 of Corollary 3.11 can be proved without generating functions simply by noting

that by the way the shelf shuﬄer works, w and its reversal are equally likely to be

produced.

• Theorem 3.10 has an analog for ordinary riﬄe shuﬄes which is useful in the study

of carries in addition. See Diaconis and Fulman (2009a) for details.

4. Iterated shuffling

This section shows how to analyze repeated shuﬄes. Section 4.1 shows how to combine

shuﬄes. Section 4.2 gives a clean bound for the separation distance.

4.1. Combining shuffles. To describe what happens to various combinations of shuffles, we need the notion of a signed $m$-shuffle. This has the following geometric description: divide the unit interval into sub-intervals of length $\frac{1}{m}$; each sub-interval contains the graph of a straight line of slope $\pm m$. The left-to-right pattern of $\pm$ signs is indicated by a vector $x$ of length $m$. Thus if $m = 4$ and $x = ++++$, an $x$-shuffle is generated as shown on the left side of Figure 2. If $m = 4$ and $x = +--+$, the graph becomes that of the right side of Figure 2. Call this function $f_x$.

The shuffle proceeds as in the figure with $n$ points dropped at random into the unit interval, labeled left to right, $y_1, y_2, \dots, y_n$ and then permuted by $f_x$. In each case there is a simple forward description: the deck is cut into $m$ piles by a multinomial distribution and piles corresponding to negative coordinates are reversed. Finally, all packets are shuffled together by the GSR procedure. Call the associated measure on permutations $P_x$.

Remark. Thus, ordinary riffle shuffles are $++$ shuffles. The shelf shuffle with 10 shelves is an inverse $+-+-\cdots+-$ (length 20) shuffle in this notation.


[Figure 2: two panels showing the graph of $f_x$ on the unit square, with axis ticks at 0, 1/4, 1/2, 3/4, 1.]

Figure 2. Left: Four peaks; right: m = 4, x = + − − +.

The following theorem reduces repeated shuffles to a single shuffle. To state it, one piece of notation is needed. Let $x = (x_1, x_2, \dots, x_a)$ and $y = (y_1, y_2, \dots, y_b)$ be two sequences of $\pm$ signs. Define a sequence of length $ab$ as $x * y = y^{x_1}, y^{x_2}, \dots, y^{x_a}$ with $(y_1, \dots, y_b)^{1} = (y_1, \dots, y_b)$ and $(y_1, \dots, y_b)^{-1} = (-y_b, -y_{b-1}, \dots, -y_1)$. This is an associative product on strings; it is not commutative. Let $P_x$ be the measure induced on $S_n$ (forward shuffles).

Example.

(+ + +) ∗ (+ +) = + + + + + +

(+ −) ∗ (+ −) = + − + −

(+ −) ∗ (+ + − +) = + + − + − + − −
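The product on sign sequences is easy to compute mechanically. In this sketch, sign sequences are represented as Python strings of the ASCII characters `+` and `-`; that encoding and the names `power` and `star` are our own choices for illustration.

```python
def power(y, sign):
    """y^{+1} = y; y^{-1} reverses y and flips every sign."""
    if sign == "+":
        return y
    return "".join("-" if s == "+" else "+" for s in reversed(y))


def star(x, y):
    """x * y = y^{x_1}, y^{x_2}, ..., y^{x_a}, a string of length len(x) * len(y)."""
    return "".join(power(y, s) for s in x)
```

The three examples above correspond to `star("+++", "++")`, `star("+-", "+-")` and `star("+-", "++-+")`. Iterating `star` on `"+-"` yields alternating strings, because the inverse of an alternating string of even length that begins with `+` is the same string.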

Theorem 4.1. If $x$ and $y$ are $\pm 1$ sequences of length $a$ and $b$, respectively, then
\[ P_x * P_y = P_{x * y}. \]

Proof. This follows most easily from the geometric description underlying Figure 1 and

Figure 2. If a uniformly chosen point in [0, 1] is expressed base a, the “digits” are uniform

and independently distributed in {0, 1, . . . , a − 1}. Because of this, iterating the maps on

the same uniform points gives the convolution. The iterated maps have the claimed pattern

of slopes by a simple geometric argument.

Corollary 4.2. The convolution of $k$ $+-$ shuffles is a $+-+-\cdots+-$ ($2^k$ terms) shuffle. Further, the convolution of a shelf shuffler with $m_1$ and then $m_2$ shelves is the same as a shelf shuffler with $2 m_1 m_2$ shelves.

4.2. Bounds for separation distance. The following theorem gives a bound for separation (and so for total variation) for a general $P_x$ shuffle on $S_n$.

Theorem 4.3. For any $\pm 1$ sequence $x$ of length $a$, with $P_x$ the associated measure on $S_n$, and $\mathrm{sep}(P_x)$ from (2.1),
\[ \mathrm{sep}(P_x) \le 1 - \prod_{i=1}^{n-1} \left( 1 - \frac{i}{a} \right). \tag{4.1} \]

Proof. It is easiest to argue using shuffles as in Description 1. There, the backs of cards are labeled, independently and uniformly, with symbols $1, 2, \dots, a$. For the inverse shuffle, all cards labeled 1 are removed, keeping them in their same relative order, and placed on top followed by the cards labeled 2 (placed under the 1s) and so on, with the following proviso: if the $i$th coordinate of $x$ is $-1$, the cards labeled $i$ have their order reversed; so if they are 1, 5, 17 from top down, they are placed in order 17, 5, 1. All of this results in a single permutation drawn from $P_x$. Repeated shuffles are modeled by labeling each card with a vector of symbols. The $k$th shuffle is determined by the $k$th coordinate of this vector. The first time $k$ at which the first $k$ coordinates of those $n$ vectors are all distinct forms a strong stationary time. See Aldous and Diaconis (1986) or Fulman (1998) for further details. The usual bound for separation yields
\[ \mathrm{sep}(P_x) \le P\{\text{the } n \text{ labels are not all distinct}\} = 1 - P\{\text{all } n \text{ labels are distinct}\}. \]
The bound (4.1) now follows from the classical birthday problem.
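The inverse shuffle described in the proof can be sketched in a few lines. Here the deck is a list read from top to bottom, `x` is a sequence of `+1`/`-1` values, and labels run over `0, ..., a-1` rather than `1, ..., a`; these representation choices and the function name are assumptions made for this illustration.

```python
import random


def inverse_x_shuffle(deck, x, rng=random):
    """One inverse x-shuffle of `deck` (listed top to bottom).

    Each card receives an independent uniform label in 0..a-1, where a = len(x).
    Cards are collected label by label, keeping their relative order, except
    that packets whose coordinate of x is -1 are reversed.
    """
    a = len(x)
    labels = [rng.randrange(a) for _ in deck]
    result = []
    for i, sign in enumerate(x):
        packet = [card for card, label in zip(deck, labels) if label == i]
        if sign == -1:
            packet.reverse()
        result.extend(packet)
    return result
```

With `x = (1,)` the deck is returned unchanged, and with `x = (-1,)` it is reversed. When all $n$ labels happen to be distinct, the output order is determined by the labels alone, which is the event behind the birthday-problem bound.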

Remarks.

• For $a$ large with respect to $n$, the right side is well-approximated by $1 - e^{-n(n-1)/(2a)}$. This is small when $n^2 \ll a$.

• The theorem gives a clean upper bound on the distance to uniformity. For example, when $n = 52$, after 8 ordinary riffle shuffles (so $x = ++\cdots++$, length 256), the bound (4.1) is $\mathrm{sep}(P_x) \le 0.997$, in agreement with Table 1 of Assaf et al. (2011). For the actual shelf shuffle with $x = +-+-\cdots+-$ (length 20), the bound gives only $\mathrm{sep}(P_x) \le 1$, but $\mathrm{sep}(P_x * P_x) \le 0.969$ and $\mathrm{sep}(P_x * P_x * P_x) \le 0.153$.

• The bound in Theorem 4.3 is simple and general. However, it is not sharp for the original shelf shuffler. The results of Section 3.3 show that $m = cn^{3/2}$ shelves suffice to make $\mathrm{sep}(P_m)$ small when $c$ is large. Theorem 4.3 shows that $m = cn^2$ shelves suffice.
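The right side of (4.1) is a birthday-problem probability and is simple to evaluate. The sketch below computes it together with the exponential approximation from the first remark; the function names are ours. By Theorem 4.1, two and three passes of the length-20 shelf shuffle correspond to sequences of length $a = 400$ and $a = 8000$.

```python
from math import exp


def separation_bound(n, a):
    """Right side of (4.1): 1 - prod_{i=1}^{n-1} (1 - i/a)."""
    prod = 1.0
    for i in range(1, n):
        # For a < n some factor is <= 0; the labels cannot all be distinct.
        prod *= max(0.0, 1.0 - i / a)
    return 1.0 - prod


def exponential_approximation(n, a):
    """1 - exp(-n(n-1)/(2a)), accurate when a is large relative to n."""
    return 1.0 - exp(-n * (n - 1) / (2 * a))
```

Calling `separation_bound(52, a)` for `a` in `(20, 400, 8000)` corresponds to one, two and three passes through the 10-shelf machine in the second remark.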

5. Practical tests and conclusions

The engineers and executives who consulted us found it hard to understand the total

variation distance. They asked for more down-to-earth notions of discrepancy. This section

reports some ad hoc tests which convinced them that the machine had to be used diﬀerently.

Section 5.1 describes the number of cards guessed correctly. Section 5.2 brieﬂy describes

three other tests. Section 5.3 describes conclusions and recommendations.

5.1. Card guessing with feedback. Suppose, after a shuffle, cards are dealt face-up, one at a time, onto the table. Before each card is shown, a guess is made at the value of the card. Let $X_i$, $1 \le i \le n$, be one or zero as the $i$th guess is correct or not, and $T_n = X_1 + \cdots + X_n$ the total number of correct guesses. If the cards were perfectly mixed, the chance that $X_1 = 1$ is $1/n$, the chance that $X_2 = 1$ is $1/(n-1)$, . . . , that $X_i = 1$ is $1/(n-i+1)$. Further, the $X_i$ are independent. Thus elementary arguments give the following.

Proposition 5.1. Under the uniform distribution, the number of cards guessed correctly $T_n$ satisfies

• $E(T_n) = \frac{1}{n} + \frac{1}{n-1} + \cdots + 1 \sim \log n + \gamma + O\left(\frac{1}{n}\right)$ with $\gamma \doteq 0.577$ Euler's constant.

• $\mathrm{var}(T_n) = \frac{1}{n}\left(1 - \frac{1}{n}\right) + \frac{1}{n-1}\left(1 - \frac{1}{n-1}\right) + \cdots + \frac{1}{2}\left(1 - \frac{1}{2}\right) \sim \log n + \gamma - \frac{\pi^2}{6} + O\left(\frac{1}{n}\right)$.

• Normalized by its mean and variance, $T_n$ has an approximate normal distribution.

When $n = 52$, $T_n$ has mean approximately 4.5, standard deviation approximately $\sqrt{2.9}$, and the number of correct guesses is between 2.7 and 6.3, 70% of the time.
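The exact mean and variance in Proposition 5.1 are finite sums and can be evaluated directly. This short sketch (the function name is ours) uses exact fractions and converts to floating point only at the end.

```python
from fractions import Fraction


def guessing_moments(n):
    """Exact mean and variance of T_n under the uniform distribution.

    T_n is a sum of independent indicators with success probabilities
    1/n, 1/(n-1), ..., 1, so the mean is the sum of 1/k and the variance
    is the sum of (1/k)(1 - 1/k) over k = 1, ..., n.
    """
    mean = sum(Fraction(1, k) for k in range(1, n + 1))
    var = sum(Fraction(1, k) * (1 - Fraction(1, k)) for k in range(1, n + 1))
    return mean, var
```

For a 52-card deck, `mean, var = guessing_moments(52)` followed by `float(mean)` and `float(var)` gives the values rounded to 4.5 and 2.9 in the text.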

Based on the theory developed in Section 3 we constructed a guessing strategy (conjectured to be optimal) for use after a shelf shuffle.

Strategy

• To begin, guess card 1.


• If guess is correct, remove card 1 from the list of available cards. Then guess card

2, card 3, . . . .

• If guess is incorrect and card i is shown, remove card i from the list of available

cards and guess card i + 1, card i + 2, . . . .

• Continue until a descent is observed (order reversal with the value of the current

card smaller than the value of the previously seen card). Then change the guessing

strategy to guess the next-smallest available card.

• Continue until an ascent is observed, then guess the next-largest available card, and

so on.

Table 2. Mean and variance for n = 52 after a shelf shuffle with m shelves under the conjectured optimal strategy.

m          1     2     4     10    20    64
mean       39    27    17.6  9.3   6.2   4.7
variance   3.2   5.6   6.0   4.7   3.8   3.1

A Monte Carlo experiment was run to determine the distribution of $T_n$ for $n = 52$ with various values of $m$ (10,000 runs for each value). Table 2 shows the mean and variance for various numbers of shelves. Thus for the actual shuffler, $m = 10$ gives about 9.3 correct guesses versus 4.5 for a well-shuffled deck. A closely related study of optimal strategy for the GSR measure (without feedback) is carried out by Ciucu (1998).

5.2. Three other tests. For the shelf shuffler with $m$ shelves, an easy argument shows that the chance that the original top card is still on top is at least $1/(2m)$ instead of $1/n$. When $n = 52$ and $m = 10$, this is 1/20 versus 1/52. The chance that card 2 is on top is approximately $\frac{1}{2m}\left(1 - \frac{1}{2m}\right)$ while the chance that card 2 is second from the top is roughly $\frac{1}{(2m)^2}$. The same probabilities hold for the bottom cards. While not as striking as the guessing test of Section 5.1, this still suggests that the machine is “off.”

Our second test supposed that the deck was originally arranged with all the red cards on

top and all the black cards at the bottom. The test statistic is the number of changes of

color going through the shuﬄed deck. Under uniformity, simulations show this has mean

26 and standard deviation 3.6. With a 10-shelf machine, simulations showed 17 ± 1.83, a

noticeable deviation.
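The color-change test is straightforward to simulate. The sketch below is a simplified model of the machine based only on the description in the Introduction (cards dealt from the bottom, a uniformly chosen shelf, a fair coin for above or below, piles assembled in random order); it is not the authors' simulation code, and the function names, seed and trial count are assumptions.

```python
import random
from collections import deque


def shelf_shuffle(n, m, rng):
    """One pass of an m-shelf shuffler on cards 1..n (1 = original top card).

    Returns the shuffled deck as a list from top to bottom.
    """
    shelves = [deque() for _ in range(m)]
    for card in range(n, 0, -1):          # cards are dealt from the bottom of the deck
        shelf = shelves[rng.randrange(m)]
        if rng.random() < 0.5:
            shelf.appendleft(card)        # placed above the cards already on the shelf
        else:
            shelf.append(card)            # placed below them
    order = list(range(m))
    rng.shuffle(order)                    # piles are assembled in random order
    return [card for s in order for card in shelves[s]]


def uniform_shuffle(n, rng):
    """A uniformly random arrangement of cards 1..n."""
    deck = list(range(1, n + 1))
    rng.shuffle(deck)
    return deck


def color_changes(deck, n_red=26):
    """Number of adjacent pairs of different color; cards 1..n_red are red."""
    colors = [card <= n_red for card in deck]
    return sum(1 for a, b in zip(colors, colors[1:]) if a != b)


def mean_color_changes(make_deck, trials):
    """Average number of color changes over `trials` simulated decks."""
    return sum(color_changes(make_deck()) for _ in range(trials)) / trials
```

A typical use is `rng = random.Random(0)` followed by `mean_color_changes(lambda: shelf_shuffle(52, 10, rng), 2000)` and the same call with `uniform_shuffle(52, rng)`. In this model each shelf pile has its red cards at the two ends and its black cards in the middle, which limits the number of color changes and produces the deficit relative to the uniform mean of 26.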

The third test is based on the spacings between cards originally near the top of the deck. Let $w_j$ denote the position of the card originally at position $j$ from the top. Let $D_j = |w_j - w_{j+1}|$. Figure 3 shows a histogram of $D_j$ for $1 \le j \le 9$, from a simulation with $n = 52$ based on a 10-shelf shuffler. Figure 4 shows histograms for the same statistics for a well-shuffled deck; there are striking discrepancies.

5.3. Conclusions and recommendations. The study above shows that a single iteration

of a 10-shelf shuﬄer is not suﬃciently random. The president of the company responded

“We are not pleased with your conclusions, but we believe them and that’s what we hired

you for.”

We suggested a simple alternative: use the machine twice. This results in a shuﬄe

equivalent to a 200-shelf machine. Our mathematical analysis and further tests, not reported

here, show that this is adequately random. Indeed, Table 1 shows, for total variation, this

is equivalent to 8-to-9 ordinary riﬄe shuﬄes.


Figure 3. 9 spacings from a 10-shelf shuﬄe; j varies from top left to bottom right,

1 ≤ j ≤ 9.

References

Aguiar, M., Bergeron, N. and Nyman, K. (2004). The peak algebra and the descent

algebras of types B and D. Trans. Amer. Math. Soc., 356 2781–2824. URL http://dx.doi.org/10.1090/S0002-9947-04-03541-X.

Aguiar, M., Bergeron, N. and Sottile, F. (2006). Combinatorial Hopf algebras and

generalized Dehn-Sommerville relations. Compos. Math., 142 1–30. URL http://dx.doi.org/10.1112/S0010437X0500165X.

Aldous, D. (1983). Random walks on ﬁnite groups and rapidly mixing Markov chains.

In Seminar on probability, XVII, vol. 986 of Lecture Notes in Math. Springer, Berlin,

243–297.

Aldous, D. and Diaconis, P. (1986). Shuﬄing cards and stopping times. Amer. Math.

Monthly, 93 333–348.


Figure 4. 9 spacings from a uniform shuﬄe; j varies from top left to bottom right,

1 ≤ j ≤ 9.

Assaf, S., Diaconis, P. and Soundararajan, K. (2011). A rule of thumb for riﬄe

shuﬄing. To appear.

Bayer, D. and Diaconis, P. (1992). Trailing the dovetail shuﬄe to its lair. Ann. Appl.

Probab., 2 294–313.

Bidigare, P., Hanlon, P. and Rockmore, D. (1999). A combinatorial description of

the spectrum for the Tsetlin library and its generalization to hyperplane arrangements.

Duke Math. J., 99 135–174.

Billera, L. J., Hsiao, S. K. and van Willigenburg, S. (2003). Peak quasisymmetric

functions and Eulerian enumeration. Adv. Math., 176 248–276. URL http://dx.doi.org/10.1016/S0001-8708(02)00067-1.

Blessenohl, D., Hohlweg, C. and Schocker, M. (2005). A symmetry of the descent

algebra of a finite Coxeter group. Adv. Math., 193 416–437. URL http://dx.doi.org/10.1016/j.aim.2004.05.007.

Borel, E. and Chéron, A. (1955). Théorie mathématique du bridge à la portée de tous. Gauthier-Villars, Paris. 2ème éd.

Brown, K. S. and Diaconis, P. (1998). Random walks and hyperplane arrangements.

Ann. Probab., 26 1813–1854.

Ciucu, M. (1998). No-feedback card guessing for dovetail shuﬄes. Ann. Appl. Probab., 8

1251–1269. URL http://dx.doi.org/10.1214/aoap/1028903379.


Comtet, L. (1974). Advanced combinatorics. enlarged ed. D. Reidel Publishing Co.,

Dordrecht. The art of ﬁnite and inﬁnite expansions.

Conger, M. and Viswanath, D. (2006). Riﬄe shuﬄes of decks with repeated cards. Ann.

Probab., 34 804–819. URL http://dx.doi.org/10.1214/009117905000000675.

Conger, M. A. and Howald, J. (2010). A better way to deal the cards. Amer. Math.

Monthly, 117 686–700. URL http://dx.doi.org/10.4169/000298910X515758.

Diaconis, P. (1988). Group representations in probability and statistics. Institute of

Mathematical Statistics Lecture Notes—Monograph Series, 11, Institute of Mathematical

Statistics, Hayward, CA.

Diaconis, P. (1996). The cutoﬀ phenomenon in ﬁnite Markov chains. Proc. Nat. Acad.

Sci. U.S.A., 93 1659–1664.

Diaconis, P. (2003). Mathematical developments from the analysis of riﬄe shuﬄing. In

Groups, Combinatorics & Geometry (Durham, 2001). World Sci. Publ., River Edge, NJ,

73–97.

Diaconis, P. and Fulman, J. (2009a). Carries, shuﬄing, and an amazing matrix. Amer.

Math. Monthly, 116 788–803. URL http://dx.doi.org/10.4169/000298909X474864.

Diaconis, P. and Fulman, J. (2009b). Carries, shuﬄing, and symmetric functions. Adv.

in Appl. Math., 43 176–196. URL http://dx.doi.org/10.1016/j.aam.2009.02.002.

Diaconis, P. and Fulman, J. (2011). Foulkes characters, Eulerian idempotents, and an

amazing matrix. ArXiv e-prints. 1102.5159.

Diaconis, P., McGrath, M. and Pitman, J. (1995). Riﬄe shuﬄes, cycles, and descents.

Combinatorica, 15 11–29.

Diaconis, P. and Shahshahani, M. (1981). Generating a random permutation with

random transpositions. Z. Wahrsch. Verw. Gebiete, 57 159–179.

Epstein, R. A. (1977). The theory of gambling and statistical logic. Revised ed. Academic

Press [Harcourt Brace Jovanovich Publishers], New York.

Ethier, S. N. (2010). The Doctrine of Chances. Probability and its Applications (New

York), Springer-Verlag, Berlin. Probabilistic aspects of gambling, URL http://dx.doi.org/10.1007/978-3-540-78783-9.

Feller, W. (1968). An Introduction to Probability Theory and its Applications. Vol. I.

3rd ed. John Wiley & Sons Inc., New York.

Fulman, J. (1998). The combinatorics of biased riﬄe shuﬄes. Combinatorica, 18 173–184.

Fulman, J. (2000a). Affine shuffles, shuffles with cuts, the Whitehouse module, and patience sorting. J. Algebra, 231 614–639.

Fulman, J. (2000b). Semisimple orbits of Lie algebras and card-shuﬄing measures on

Coxeter groups. J. Algebra, 224 151–165. URL http://dx.doi.org/10.1006/jabr.1999.8157.

Fulman, J. (2001). Applications of the Brauer complex: Card shuﬄing, permutation

statistics, and dynamical systems. J. Algebra, 243 96–122.

Fulman, J. (2002). Applications of symmetric functions to cycle and increasing subse-

quence structure after shuﬄes. J. Algebraic Combin., 16 165–194.

Gannon, T. (2001). The cyclic structure of unimodal permutations. Discrete Math., 237

149–161. URL http://dx.doi.org/10.1016/S0012-365X(00)00368-X.

Gessel, I. M. and Reutenauer, C. (1993). Counting permutations with given cycle

structure and descent set. J. Combin. Theory Ser. A, 64 189–215. URL http://dx.doi.org/10.1016/0097-3165(93)90095-P.

Goncharov, V. (1944). Du domaine d’analyse combinatoire. Bull. Acad. Sci. URSS Ser.

Math (Izv. Akad. Nauk SSSR), 8 3–48. Amer. Math. Soc. Transl. (2) 19 (1962), 1–46.

Gontcharoff, W. (1942). Sur la distribution des cycles dans les permutations. C. R.

(Doklady) Acad. Sci. URSS (N.S.), 35 267–269.


Grinstead, C. M. and Snell, J. L. (1997). Introduction to Probability. 2nd ed. American Mathematical Society, Providence, RI. URL http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/pdf.html.

Kerov, S. V. and Vershik, A. M. (1986). The characters of the inﬁnite symmetric

group and probability properties of the Robinson–Schensted–Knuth algorithm. SIAM J.

Algebraic Discrete Methods, 7 116–124. URL http://dx.doi.org/10.1137/0607014.

Klarreich, E. (2002). Coming up trumps. New Scientist, 175 42–44.

Klarreich, E. (2003). Within every math problem, for this mathematician, lurks a card-

shuﬄing problem. SIAM News, 36. URL http://www.siam.org/pdf/news/295.pdf.

Lalley, S. P. (1996). Cycle structure of riﬄe shuﬄes. Ann. Probab., 24 49–73. URL

http://dx.doi.org/10.1214/aop/1042644707.

Lalley, S. P. (1999). Riﬄe shuﬄes and their associated dynamical systems. J. Theoret.

Probab., 12 903–932. URL http://dx.doi.org/10.1023/A:1021636902356.

Macdonald, I. G. (1995). Symmetric Functions and Hall Polynomials. 2nd ed. Oxford

Mathematical Monographs, The Clarendon Press Oxford University Press, New York.

With contributions by A. Zelevinsky, Oxford Science Publications.

Mackenzie, D. (2002). The mathematics of . . . shuffling. DISCOVER. URL http://discovermagazine.com/2002/oct/featmath.

Mann, B. (1994). How many times should you shuﬄe a deck of cards? UMAP J., 15

303–332.

Mann, B. (1995). How many times should you shuffle a deck of cards? In Topics in Contemporary Probability and its Applications. Probab. Stochastics Ser., CRC, Boca Raton, FL, 261–289.

Morris, B. (2009). Improved mixing time bounds for the Thorp shuﬄe and L-reversal

chain. Ann. Probab., 37 453–477. URL http://dx.doi.org/10.1214/08-AOP409.

Nyman, K. L. (2003). The peak algebra of the symmetric group. J. Algebraic Combin.,

17 309–322. URL http://dx.doi.org/10.1023/A:1025000905826.

Petersen, T. K. (2005). Cyclic descents and P -partitions. J. Algebraic Combin., 22

343–375. URL http://dx.doi.org/10.1007/s10801-005-4532-5.

Petersen, T. K. (2007). Enriched P -partitions and peak algebras. Adv. Math., 209

561–610. URL http://dx.doi.org/10.1016/j.aim.2006.05.016.

Poirier, S. (1998). Cycle type and descent set in wreath products. In Proceedings of the

7th Conference on Formal Power Series and Algebraic Combinatorics (Noisy-le-Grand,

1995), vol. 180. 315–343. URL http://dx.doi.org/10.1016/S0012-365X(97)00123-4.

Reiner, V. (1993). Signed permutation statistics and cycle type. European J. Combin.,

14 569–579. URL http://dx.doi.org/10.1006/eujc.1993.1059.

Rogers, T. D. (1981). Chaos in systems in population biology. In Progress in Theoretical

Biology, Vol. 6. Academic Press, New York, 91–146.

Shepp, L. A. and Lloyd, S. P. (1966). Ordered cycle lengths in a random permutation.

Trans. Amer. Math. Soc., 121 340–357.

Stanley, R. P. (1999). Enumerative Combinatorics. Vol. 2, vol. 62 of Cambridge Studies

in Advanced Mathematics. Cambridge University Press, Cambridge. With a foreword by

Gian-Carlo Rota and appendix 1 by Sergey Fomin.

Stanley, R. P. (2001). Generalized riﬄe shuﬄes and quasisymmetric functions. Ann.

Comb., 5 479–491. Dedicated to the memory of Gian-Carlo Rota (Tianjin, 1999).

Stark, D., Ganesh, A. and O’Connell, N. (2002). Information loss in riffle shuffling. Combin. Probab. Comput., 11 79–95. URL http://dx.doi.org/10.1017/S0963548301004990.

Stembridge, J. R. (1997). Enriched P -partitions. Trans. Amer. Math. Soc., 349 763–788.

URL http://dx.doi.org/10.1090/S0002-9947-97-01804-7.


Thibon, J.-Y. (2001). The cycle enumerator of unimodal permutations. Ann. Comb.,

5 493–500. Dedicated to the memory of Gian-Carlo Rota (Tianjin, 1999), URL http://dx.doi.org/10.1007/s00026-001-8024-6.

Thorp, E. O. (1973). Nonrandom shuffling with applications to the game of Faro. Journal of the American Statistical Association, 68 842–847. URL http://www.jstor.org/stable/2284510.

Vershik, A. M. and Shmidt, A. A. (1977). Limit measures arising in the asymptotic

theory of symmetric groups .1. Theor. Probab. Appl.-Engl. Tr., 22 70–85.

Vershik, A. M. and Shmidt, A. A. (1978). Limit measures arising in the asymptotic

theory of symmetric groups .2. Theor. Probab. Appl.-Engl. Tr., 23 36–49.

Warren, D. and Seneta, E. (1996). Peaks and Eulerian numbers in a random sequence.

J. Appl. Probab., 33 101–114.

Departments of Mathematics and Statistics, Stanford University

E-mail address: diaconis@math.stanford.edu

Department of Mathematics, University of Southern California

E-mail address: fulman@usc.edu

Department of Statistics, Stanford University

E-mail address: susan@stat.stanford.edu