
Average Cost of QuickXsort with Pivot Sampling

Sebastian Wild∗

August 15, 2018

Abstract: QuickXsort is a strategy to combine Quicksort with another sorting method X, so that the result has essentially the same comparison cost as X in isolation, but sorts in place even when X requires a linear-size buffer. We solve the recurrence for QuickXsort precisely up to the linear term, including the optimization to choose pivots from a sample of k elements. This allows us to immediately obtain overall average costs using only the average costs of sorting method X (as if run in isolation). We thereby extend and greatly simplify the analysis of QuickHeapsort and QuickMergesort with practically efficient pivot selection, and give the first tight upper bounds including the linear term for such methods.

1. Introduction

In QuickXsort [5], we use the recursive scheme of ordinary Quicksort, but instead of doing two recursive calls after partitioning, we first sort one of the segments by some other sorting method X. Only the second segment is recursively sorted by QuickXsort. The key insight is that X can use the second segment as a temporary buffer for elements. By that, QuickXsort is sorting in place (using O(1) words of extra space) even when X itself is not.

Not every method makes a suitable 'X'; it must use the buffer in a swap-like fashion: after X has sorted its segment, the elements originally stored in our buffer must still be intact, i.e., they must still be stored in the buffer, albeit in a different order. Two possible examples that use extra space in such a way are Mergesort (see Section 6 for details) and a comparison-efficient Heapsort variant [1] with an output buffer. With QuickXsort we can make those methods sort in place while retaining their comparison efficiency. (We lose stability, though.)

While other comparison-efficient in-place sorting methods are known (e.g. [18, 12, 9]), the ones based on QuickXsort and elementary methods X are particularly easy to implement,¹ since one can adapt existing implementations for X. In such an implementation, the tried and tested optimization to choose the pivot as the median of a small sample suggests itself to improve QuickXsort. In previous works [1, 5, 3, 6], the influence of QuickXsort on the performance of X was either studied by ad-hoc techniques that do not easily apply with general pivot sampling

∗David R. Cheriton School of Computer Science, University of Waterloo, Email: wild @ uwaterloo.ca. This work was supported by the Natural Sciences and Engineering Research Council of Canada and the Canada Research Chairs Programme.

¹See for example the code for QuickMergesort that was presented for discussion on Code Review Stack Exchange, https://codereview.stackexchange.com/q/149443, and the succinct C++ code in [6].

arXiv:1803.05948v2 [cs.DS] 17 May 2018


or it was studied for the case of very good pivots: exact medians or medians of a sample of √n elements. Both are typically detrimental to the average performance since they add significant overhead, whereas most of the benefit of sampling is realized already for samples of very small constant sizes like 3, 5, or 9. Indeed, in a very recent manuscript [6], Edelkamp and Weiß describe an optimized median-of-3 QuickMergesort implementation in C++ that outperformed the library Quicksort in std::sort.

The contribution of this paper is a general transfer theorem (Theorem 5.1) that expresses the costs of QuickXsort with median-of-k sampling (for any odd constant k) directly in terms of the costs of X (i.e., the costs that X needs to sort n elements in isolation). We thereby obtain the first analyses of QuickMergesort and QuickHeapsort with best possible constant-coefficient bounds on the linear term under realistic sampling schemes.

Since Mergesort only needs a buffer for one of the two runs, QuickMergesort should not simply give Mergesort the smaller of the two segments to sort, but rather the larger one for which the other segment still offers sufficient buffer space. (This will be the larger segment of the two if the smaller one contains at least a third of the elements; see Section 6 for details.) Our transfer theorem covers this refined version of QuickMergesort as well, which had not been analyzed before.²

The rest of the paper is structured as follows: In Section 2, we summarize previous work on QuickXsort with a focus on contributions to its analysis. Section 3 collects mathematical facts and notations used later. In Section 4 we define QuickXsort and formulate a recurrence for its cost. Its solution is stated in Section 5. Section 6 presents QuickMergesort as our stereotypical instantiation of QuickXsort. The proof of the transfer theorem spreads over Sections 7 and 8. In Section 9, we apply our result to QuickHeapsort and QuickMergesort and discuss some algorithmic implications.

2. Previous Work

The idea to combine Quicksort and a secondary sorting method was suggested by Cantone and Cincotti [2, 1]. They study Heapsort with an output buffer (external Heapsort),³ and combine it with Quicksort to QuickHeapsort. They analyze the average costs for external Heapsort in isolation and use a differencing trick for dealing with the QuickXsort recurrence; however, this technique is hard to generalize to median-of-k pivots.

Diekert and Weiß [3] suggest optimizations for QuickHeapsort (some of which need extra space again), and they give better upper bounds for QuickHeapsort with random pivots and median-of-3. Their results are still not tight since they upper bound the total cost of all Heapsort calls together (using ad hoc arguments on the form of the costs for one Heapsort round), without taking into account the actual subproblem sizes that Heapsort is used on. In particular, their bound on the overall contribution of the Heapsort calls does not depend on the sampling strategy.

Edelkamp and Weiß [5] explicitly describe QuickXsort as a general design pattern and, among others, consider using Mergesort as 'X'. They use the median of √n elements in each round throughout to guarantee good splits with high probability. They show by induction that when X uses at most n lg n + cn + o(n) comparisons on average for some constant c, the number of comparisons in QuickXsort is also bounded by n lg n + cn + o(n). By combining QuickMergesort with Ford and Johnson's MergeInsertion [8] for subproblems of logarithmic size, Edelkamp and Weiß obtained an in-place sorting method that uses on average a close to minimal number of comparisons of n lg n − 1.3999n + o(n).

²Edelkamp and Weiß do consider this version of QuickMergesort [5], but only analyze it for median-of-√n pivots. In this case, the behavior coincides with the simpler strategy to always sort the smaller segment by Mergesort, since the segments are of almost equal size with high probability.

³Not having to store the heap in a consecutive prefix of the array allows saving comparisons over classic in-place Heapsort: after a delete-max operation, we can fill the gap at the root of the heap by promoting the larger child and recursively moving the gap down the heap. (We then fill the gap with a −∞ sentinel value.) That way, each delete-max needs exactly ⌊lg n⌋ comparisons.

In a recent follow-up manuscript [6], Edelkamp and Weiß investigated the practical performance of QuickXsort and found that a tuned median-of-3 QuickMergesort variant indeed outperformed the C++ library Quicksort. They also derive an upper bound for the average costs of their algorithm using an inductive proof; their bound is not tight.

3. Preliminaries

A comprehensive list of used notation is given in Appendix A; we mention the most important here. We use Iverson's bracket [stmt] to mean 1 if stmt is true and 0 otherwise. P[E] denotes the probability of event E, and E[X] the expectation of random variable X. We write X ᴰ= Y to denote equality in distribution.

We heavily use the beta distribution: For α, β ∈ ℝ₊, X ᴰ= Beta(α, β) if X admits the density
$$f_X(z) = \frac{z^{\alpha-1}(1-z)^{\beta-1}}{\mathrm{B}(\alpha,\beta)}, \qquad\text{where}\quad \mathrm{B}(\alpha,\beta) = \int_0^1 z^{\alpha-1}(1-z)^{\beta-1}\,dz$$
is the beta function.

Moreover, we use the beta-binomial distribution, which is a conditional binomial distribution with the success probability being a beta-distributed random variable. If X ᴰ= BetaBin(n, α, β), then
$$\mathbb{P}[X = i] = \binom{n}{i}\,\mathrm{B}\bigl(\alpha + i,\; \beta + (n-i)\bigr)\big/\mathrm{B}(\alpha,\beta).$$
For a collection of its properties see [23, Section 2.4.7]; one property that we use here is a local limit law showing that the normalized beta-binomial distribution converges to the beta distribution. It is reproduced as Lemma C.1 in the appendix.

For solving recurrences, we build upon Roura's master theorems [20]. The relevant continuous master theorem is restated in the appendix (Theorem B.1).

4. QuickXsort

Let X be a sorting method that requires buffer space for storing at most ⌊αn⌋ elements (for α ∈ [0, 1]) to sort n elements. The buffer may only be accessed by swaps so that once X has finished its work, the buffer contains the same elements as before, but in arbitrary order. Indeed, we will assume that X does not compare any buffer contents; then QuickXsort preserves randomness: if the original input is a random permutation, so will be the segments after partitioning and so will be the buffer after X has terminated.⁴

We can then combine⁵ X with Quicksort as follows: We first randomly choose a pivot and partition the input around that pivot. This results in two contiguous segments containing the J1 elements that are smaller than the pivot and the J2 elements that are larger than the pivot, respectively. We exclude the space for the pivot, so J1 + J2 = n − 1; note that since the rank of the pivot is random, so are the segment sizes J1 and J2. We then sort one segment by X using the other segment as a buffer, and afterwards sort the buffer segment recursively by QuickXsort.

To guarantee a sufficiently large buffer for X when it sorts J_r (r = 1 or 2), we must make sure that J_{3−r} ≥ αJ_r. In case both segments could be sorted by X, we use the larger one. The motivation behind this is that we expect an advantage from reducing the subproblem size for the recursive call as much as possible.
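The selection rule above can be sketched in a few lines of Python (an illustration only; the function name and interface are our own, not from the paper):

```python
def segment_sorted_by_x(j1, j2, alpha):
    """Given segment sizes j1, j2 after partitioning and X's buffer
    factor alpha, return which segment (1 or 2) is sorted by X.
    X may sort a segment of size s only if the other segment offers
    at least alpha*s elements of buffer space."""
    can_x_sort_1 = j2 >= alpha * j1   # is segment 2 a big enough buffer?
    can_x_sort_2 = j1 >= alpha * j2
    if can_x_sort_1 and can_x_sort_2:
        # Both feasible: give X the larger segment, so the recursive
        # QuickXsort call gets the smaller subproblem.
        return 1 if j1 >= j2 else 2
    return 1 if can_x_sort_1 else 2

# With alpha = 1/2 (Mergesort), the larger segment goes to X whenever
# the smaller one holds at least a third of the elements.
```

Note that for α ≤ 1 the larger segment can always serve as a buffer for the smaller one, so at least one choice is always feasible.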

⁴We assume throughout this paper that the input contains pairwise distinct elements.

⁵Depending on details of X, further precautions might have to be taken, e.g., in QuickHeapsort [1]. We assume here that those have already been taken care of and solely focus on the analysis of QuickXsort.


We consider the practically relevant version of QuickXsort, where we use as pivot the median of a sample of k = 2t + 1 elements, where t ∈ ℕ₀ is constant w.r.t. n. We think of t as a design parameter of the algorithm that we have to choose. Setting t = 0 corresponds to selecting pivots uniformly at random.

4.1. Recurrence for Expected Costs

Let c(n) be the expected number of comparisons in QuickXsort on arrays of size n and x(n) be (an upper bound for) the expected number of comparisons in X. We will assume that x(n) fulfills
$$x(n) = an\lg n + bn \pm O(n^{1-\varepsilon}), \quad (n\to\infty),$$
for constants a, b and ε ∈ (0, 1].

For α < 1, we obtain two cases: When the split induced by the pivot is "uneven" – namely when min{J1, J2} < α·max{J1, J2}, i.e., max{J1, J2} > (n−1)/(1+α) – the smaller segment is not large enough to be used as buffer. Then we can only assign the large segment as a buffer and run X on the smaller segment. If however the split is about "even", i.e., both segments are ≤ (n−1)/(1+α), we can sort the larger of the two segments by X. These cases also show up in the recurrence of costs:

$$c(n) = b(n) \ge 0, \quad (n \le k)$$
$$\begin{aligned}
c(n) ={}& (n-k) + b(k)
+ \mathbb{E}\Bigl[\bigl[J_1, J_2 \le \tfrac{1}{1+\alpha}(n-1)\bigr]\,[J_1 > J_2]\,\bigl(x(J_1) + c(J_2)\bigr)\Bigr] \\
&+ \mathbb{E}\Bigl[\bigl[J_1, J_2 \le \tfrac{1}{1+\alpha}(n-1)\bigr]\,[J_1 \le J_2]\,\bigl(x(J_2) + c(J_1)\bigr)\Bigr] \\
&+ \mathbb{E}\Bigl[\bigl[J_2 > \tfrac{1}{1+\alpha}(n-1)\bigr]\,\bigl(x(J_1) + c(J_2)\bigr)\Bigr]
+ \mathbb{E}\Bigl[\bigl[J_1 > \tfrac{1}{1+\alpha}(n-1)\bigr]\,\bigl(x(J_2) + c(J_1)\bigr)\Bigr] \quad (n \ge 2) \\
={}& \sum_{r=1}^{2} \mathbb{E}\bigl[A_r(J_r)\,c(J_r)\bigr] + t(n) \tag{1}
\end{aligned}$$
where
$$\begin{aligned}
A_1(J) &= \bigl[J, J' \le \tfrac{1}{1+\alpha}(n-1)\bigr]\cdot[J \le J'] + \bigl[J > \tfrac{1}{1+\alpha}(n-1)\bigr], \quad\text{with } J' = (n-1) - J,\\
A_2(J) &= \bigl[J, J' \le \tfrac{1}{1+\alpha}(n-1)\bigr]\cdot[J < J'] + \bigl[J > \tfrac{1}{1+\alpha}(n-1)\bigr],\\
t(n) &= (n-1) + \mathbb{E}[A_2(J_2)\,x(J_1)] + \mathbb{E}[A_1(J_1)\,x(J_2)].
\end{aligned}$$

The expectation here is taken over the choice for the random pivot, i.e., over the segment sizes J1 resp. J2. Note that we use both J1 and J2 to express the conditions in a convenient form, but actually either one is fully determined by the other via J1 + J2 = n − 1. Note how A1 and A2 change roles in recursive calls and toll functions, since we always sort one segment recursively and the other segment by X.

The base cases b(n) are the costs to sort inputs that are too small to sample k elements. A practical choice is to switch to Insertionsort for these, which is also used for sorting the samples. Unlike for Quicksort itself, b(n) only influences the logarithmic term of the costs (for constant k). For our asymptotic transfer theorem, we only assume b(n) ≥ 0; the actual values are immaterial.


Distribution of Subproblem Sizes. If pivots are chosen as the median of a random sample of size k = 2t + 1, the subproblem sizes have the same distribution, J1 ᴰ= J2. Without pivot sampling, we have J1 ᴰ= U[0..n−1], a discrete uniform distribution. If we choose pivots as medians of a sample of k = 2t + 1 elements, the value for J1 consists of two summands: J1 = t + I1. The first summand, t, accounts for the part of the sample that is smaller than the pivot. Those t elements do not take part in the partitioning round (but they have to be included in the subproblem). I1 is the number of elements that turned out to be smaller than the pivot during partitioning.

This latter number I1 is random, and its distribution is I1 ᴰ= BetaBin(n − k, t + 1, t + 1), a so-called beta-binomial distribution. The connection to the beta distribution is best seen by assuming n independent and uniformly in (0, 1) distributed reals as input. They are almost surely pairwise distinct and their relative ranking is equivalent to a random permutation of [n], so this assumption is w.l.o.g. for our analysis. Then, the value P of the pivot in the first partitioning step has a Beta(t + 1, t + 1) distribution by definition. Conditional on that value P = p, I1 ᴰ= Bin(n − k, p) has a binomial distribution; the resulting mixture is the so-called beta-binomial distribution.

For t = 0, i.e., no sampling, we have t + BetaBin(n − k, t + 1, t + 1) = BetaBin(n − 1, 1, 1), so we recover the uniform case U[0..n − 1].
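The beta-binomial mixture is easy to evaluate numerically; the following sketch (helper names are our own) computes the pmf from the closed form with beta functions and confirms that t = 0 reproduces the discrete uniform distribution:

```python
from math import comb, lgamma, exp

def beta_fn(a, b):
    """Beta function B(a, b), via log-gamma for numerical stability."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

def betabin_pmf(n, a, b, i):
    """P[X = i] for X ~ BetaBin(n, a, b): a Bin(n, p) whose success
    probability p is itself Beta(a, b) distributed."""
    return comb(n, i) * beta_fn(a + i, b + (n - i)) / beta_fn(a, b)

n, t = 20, 0
k = 2 * t + 1
# With t = 0 (no sampling), I ~ BetaBin(n-1, 1, 1) is uniform on 0..n-1.
pmf = [betabin_pmf(n - k, t + 1, t + 1, i) for i in range(n - k + 1)]
assert abs(sum(pmf) - 1) < 1e-12
assert all(abs(p - 1 / n) < 1e-12 for p in pmf)
```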

5. The Transfer Theorem

We now state the main result of the paper: an asymptotic approximation for c(n).

Theorem 5.1 (Total Cost of QuickXsort): The expected number of comparisons needed to sort a random permutation with QuickXsort using median-of-k pivots, k = 2t + 1, and a sorting method X that needs a buffer of ⌊αn⌋ elements for some constant α ∈ [0, 1] to sort n elements and requires on average x(n) = an lg n + bn ± O(n^{1−ε}) comparisons to do so as n → ∞ for some ε ∈ (0, 1] is
$$c(n) = an\lg n + \Bigl(\frac{1}{H} - \frac{a}{\ln 2}\cdot\frac{H_{k+1} - H_{t+1}}{H} + b\Bigr)\,n \pm O(n^{1-\varepsilon} + \log n),$$
where H_{k+1} and H_{t+1} are harmonic numbers and
$$H = I_{0,\frac{\alpha}{1+\alpha}}(t+2,\,t+1) + I_{\frac12,\frac{1}{1+\alpha}}(t+2,\,t+1)$$
is the expected relative subproblem size that is sorted by X.

Here I_{x,y}(α, β) is the regularized incomplete beta function
$$I_{x,y}(\alpha,\beta) = \int_x^y \frac{z^{\alpha-1}(1-z)^{\beta-1}}{\mathrm{B}(\alpha,\beta)}\,dz, \qquad (\alpha,\beta \in \mathbb{R}_{+},\; 0 \le x \le y \le 1).$$

We prove Theorem 5.1 in Sections 7 and 8. To simplify the presentation, we will restrict ourselves to a stereotypical algorithm for X and its value α = 1/2; the given arguments, however, immediately extend to the general statement above.

6. QuickMergesort

A natural candidate for X is Mergesort: It is comparison-optimal up to the linear term (and quite close to optimal in the linear term), and needs a Θ(n)-element buffer for practical implementations of merging.⁶

To be usable in QuickXsort, we use a swap-based merge procedure as given in Algorithm 1. Note that it suffices to move the smaller of the two runs to a buffer; we use a symmetric version of Algorithm 1 when the second run is shorter. Using classical top-down or bottom-up Mergesort as described in any algorithms textbook (e.g. [22]), we thus get along with α = 1/2.

Merge(A[ℓ..r], m, B[b..e])
// Merges sorted runs A[ℓ..m−1] and A[m..r] in place into A[ℓ..r] using scratch space B[b..e].
// Assumes A[ℓ..m−1] and A[m..r] are sorted, n1 ≤ n2 and n1 ≤ e − b + 1.
1  n1 := m − ℓ; n2 := r − m + 1
2  for i = 0, . . . , n1 − 1 do Swap(A[ℓ + i], B[b + i]) end for
3  i1 := b; i2 := m; o := ℓ
4  while i1 < b + n1 and i2 ≤ r
5      if B[i1] ≤ A[i2] then Swap(A[o], B[i1]); o := o + 1; i1 := i1 + 1
6      else Swap(A[o], A[i2]); o := o + 1; i2 := i2 + 1 end if
7  while i1 < b + n1 do Swap(A[o], B[i1]); o := o + 1; i1 := i1 + 1 end while

Algorithm 1: A simple merging procedure that uses the buffer only by swaps. We move the first run A[ℓ..m−1] into the buffer B[b..b+n1−1] and then merge it with the second run A[m..r] (still stored in the original array) into the empty slot left by the first run. By the time this first half is filled, we either have consumed enough of the second run to have space to grow the merged result, or the merging was trivial, i.e., all elements in the first run were smaller.

6.1. Average Case of Mergesort

The average number of comparisons for Mergesort has the same – optimal – leading term n lg n as in the worst and best case; and this is true for both the top-down and bottom-up variants. The coefficient of the linear term of the asymptotic expansion, though, is not a constant, but a bounded periodic function with period lg n, and the functions differ for best, worst, and average case and for the variants of Mergesort [21, 7, 17, 10, 11].

In this paper, we will confine ourselves to an upper bound for the average case x(n) = an lg n + bn ± O(n^{1−ε}) with constant b valid for all n, so we will set b to the supremum of the periodic function. We leave open the interesting challenge of tracing the precise behavior of the fluctuations through the recurrence, where Mergesort is used on a logarithmic number of subproblems with random sizes.

We use the following upper bounds for top-down [11] and bottom-up [17] Mergesort:⁷
$$x_{\mathrm{td}}(n) = n\lg n - 1.24n + 2 \quad\text{and}\tag{2}$$
$$x_{\mathrm{bu}}(n) = n\lg n - 0.26n \pm O(1).\tag{3}$$

⁶Merging can be done in place using more advanced tricks (see, e.g., [15]), but those tend not to be competitive in terms of running time with other sorting methods. By changing the global structure, a pure in-place Mergesort variant [13] can be achieved using part of the input as a buffer (as in QuickMergesort), at the expense of occasionally having to merge runs of very different lengths.

⁷Edelkamp and Weiß [5] use x(n) = n lg n − 1.26n ± o(n); Knuth [14, 5.2.4–13] derived this formula for n a power of 2 (a general analysis is sketched, but no closed result for general n is given). Flajolet and Golin [7] and Hwang [11] continued the analysis in more detail; they find that the average number of comparisons is n lg n − (1.25 ± 0.01)n ± O(1), where the linear term oscillates in the given range.


7. Solving the Recurrence: Leading Term

We start with Equation (1). Since α = 1/2 for our Mergesort, we have α/(1+α) = 1/3 and 1/(1+α) = 2/3. (The following arguments are valid for general α, including the extreme case α = 1, but in an attempt to de-clutter the presentation, we stick to α = 1/2 here.) We rewrite A1(J1) and A2(J2) explicitly in terms of the relative subproblem size:
$$A_1(J_1) = \Bigl[\tfrac{J_1}{n-1} \in \bigl[\tfrac13, \tfrac12\bigr] \cup \bigl(\tfrac23, 1\bigr]\Bigr], \qquad
A_2(J_2) = \Bigl[\tfrac{J_2}{n-1} \in \bigl[\tfrac13, \tfrac12\bigr) \cup \bigl(\tfrac23, 1\bigr]\Bigr].$$

Graphically, if we view J1/(n−1) as a point in the unit interval, the following picture shows which subproblem is sorted recursively (the other subproblem is sorted by Mergesort):

[z ∈ (0, 1/3): A2 = 1;  z ∈ [1/3, 1/2]: A1 = 1;  z ∈ (1/2, 2/3]: A2 = 1;  z ∈ (2/3, 1]: A1 = 1]

Obviously, we have A1 + A2 = 1 for any choice of J1, which corresponds to having exactly one recursive call in QuickMergesort.

7.1. The Shape Function

The expectations E[A_r(J_r) c(J_r)] in Equation (1) are actually finite sums over the values 0, …, n − 1 that J := J1 can attain. Recall that J2 = n − 1 − J1 and A1(J1) + A2(J2) = 1 for any value of J. With J = J1 ᴰ= J2, we find
$$\sum_{r=1}^{2} \mathbb{E}\bigl[A_r(J_r)\,c(J_r)\bigr]
= \mathbb{E}\Bigl[\Bigl[\tfrac{J}{n-1} \in \bigl[\tfrac13,\tfrac12\bigr] \cup \bigl(\tfrac23,1\bigr]\Bigr]\cdot c(J)\Bigr]
+ \mathbb{E}\Bigl[\Bigl[\tfrac{J}{n-1} \in \bigl[\tfrac13,\tfrac12\bigr) \cup \bigl(\tfrac23,1\bigr]\Bigr]\cdot c(J)\Bigr]
= \sum_{j=0}^{n-1} w_{n,j}\cdot c(j),$$
where
$$w_{n,j} = \mathbb{P}[J=j]\cdot\Bigl[\tfrac{j}{n-1} \in \bigl[\tfrac13,\tfrac12\bigr] \cup \bigl(\tfrac23,1\bigr]\Bigr]
+ \mathbb{P}[J=j]\cdot\Bigl[\tfrac{j}{n-1} \in \bigl[\tfrac13,\tfrac12\bigr) \cup \bigl(\tfrac23,1\bigr]\Bigr]
= \begin{cases}
2\cdot\mathbb{P}[J=j] & \text{if } \tfrac{j}{n-1} \in \bigl[\tfrac13,\tfrac12\bigr) \cup \bigl(\tfrac23,1\bigr],\\[2pt]
1\cdot\mathbb{P}[J=j] & \text{if } \tfrac{j}{n-1} = \tfrac12,\\[2pt]
0 & \text{otherwise.}
\end{cases}$$

We thus have a recurrence of the form required by Roura's continuous master theorem (CMT) (see Theorem B.1 in Appendix B) with the weights w_{n,j} from above (Figure 1 shows an example of how these weights look).

It remains to determine P[J = j]. Recall that we choose the pivot as the median of k = 2t + 1 elements for a fixed constant t ∈ ℕ₀, and the subproblem size J fulfills J = t + I


Figure 1: The weights w_{n,j} for n = 101, t = 1; note the singular point at j = 50.

with I ᴰ= BetaBin(n − k, t + 1, t + 1). So we have for i ∈ [0, n − 1 − t] by definition
$$\mathbb{P}[I=i] = \binom{n-k}{i}\,\frac{\mathrm{B}\bigl(t+1+i,\; t+1+(n-k-i)\bigr)}{\mathrm{B}(t+1,\,t+1)}
= \binom{n-k}{i}\,\frac{(t+1)^{\overline{i}}\,(t+1)^{\overline{n-k-i}}}{(k+1)^{\overline{n-k}}},$$
where $x^{\overline{m}} = x(x+1)\cdots(x+m-1)$ denotes the rising factorial.

(For details, see [23, Section 2.4.7].) Now the local limit law for beta-binomials (Lemma C.1 in Appendix C) says that the normalized beta-binomial I/n converges to a beta variable "in density", and the convergence is uniform. With the beta density f_P(z) = z^t (1−z)^t / B(t+1, t+1), we thus find by Lemma C.1 that
$$\mathbb{P}[J=j] = \mathbb{P}[I=j-t] = \frac{1}{n}\,f_P(j/n) \pm O(n^{-2}), \quad (n\to\infty).$$

The shift by the small constant t from (j−t)/n to j/n only changes the function value by O(n^{−1}) since f_P is Lipschitz continuous on [0, 1]. (Details of that calculation are also given in [23], page 208.)

The first step towards applying the CMT is to identify a shape function w(z) that approximates the relative subproblem size probabilities, w(z) ≈ n·w_{n,⌊zn⌋} for large n. With the above observation, a natural choice is
$$w(z) = 2\,\bigl[\tfrac13 < z < \tfrac12 \,\vee\, z > \tfrac23\bigr]\,\frac{z^{t}(1-z)^{t}}{\mathrm{B}(t+1,\,t+1)}. \tag{4}$$
We show in Appendix D that this is indeed a suitable shape function, i.e., it fulfills Equation (11) from the CMT.
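A quick numerical check (our own sketch, using plain midpoint integration) confirms that the shape function of Equation (4) is a probability density on [0, 1] for every t, as a weight limit must be:

```python
from math import exp, lgamma

def w(z, t):
    """Shape function of Eq. (4): twice the Beta(t+1, t+1) density,
    restricted to the region (1/3, 1/2) ∪ (2/3, 1)."""
    if not (1/3 < z < 1/2 or z > 2/3):
        return 0.0
    log_beta = 2 * lgamma(t + 1) - lgamma(2 * t + 2)   # log B(t+1, t+1)
    return 2 * z**t * (1 - z)**t / exp(log_beta)

def integrate(f, a, b, steps=200_000):
    """Midpoint-rule integration of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

for t in (0, 1, 2, 5):
    assert abs(integrate(lambda z: w(z, t), 0.0, 1.0) - 1.0) < 1e-4
```

That the integral is 1 also follows from the symmetry of Beta(t+1, t+1) around 1/2: the doubled mass on (1/3, 1/2) ∪ (2/3, 1) mirrors the omitted mass on (0, 1/3) ∪ (1/2, 2/3).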

7.2. Computing the Toll Function

The next step in applying the CMT is a leading-term approximation of the toll function. We consider a general function x(n) = an lg n + bn ± O(n^{1−ε}), where the error term holds for any constant ε > 0 as n → ∞. We start with the simple observation that
$$J\lg J = J\bigl(\lg\tfrac{J}{n} + \lg n\bigr)
= \frac{J}{n}\,n\lg n + \frac{J}{n}\lg\Bigl(\frac{J}{n}\Bigr)\,n \tag{5}$$
$$= \frac{J}{n}\,n\lg n \pm O(n). \tag{6}$$

7. Solving the Recurrence: Leading Term 9

For the leading term of E[x(J)], we thus only have to compute the expectation of J/n, which is essentially a relative subproblem size. In t(n), we also have to deal with the conditionals A1(J) resp. A2(J), though. By approximating J/n with a beta distributed variable, the conditionals translate to bounds of an integral. Details are given in Lemma E.1 (see Appendix E). This yields
$$\begin{aligned}
t(n) &= n - 1 + \mathbb{E}[A_2(J_2)\,x(J_1)] + \mathbb{E}[A_1(J_1)\,x(J_2)]\\
&= a\,\mathbb{E}[A_2(J_2)\,J_1\lg J_1] + a\,\mathbb{E}[A_1(J_1)\,J_2\lg J_2] \pm O(n)\\
&\overset{\text{Lemma E.1–(a)}}{=} 2a\cdot\frac{t+1}{2t+2}\cdot\Bigl(I_{0,\frac13}(t+2,t+1) + I_{\frac12,\frac23}(t+2,t+1)\Bigr)\cdot n\lg n \pm O(n)\\
&= \underbrace{a\Bigl(I_{0,\frac13}(t+2,t+1) + I_{\frac12,\frac23}(t+2,t+1)\Bigr)}_{\bar a}\cdot n\lg n \pm O(n), \quad (n\to\infty). \tag{7}
\end{aligned}$$

Here we use the regularized incomplete beta function
$$I_{x,y}(\alpha,\beta) = \int_x^y \frac{z^{\alpha-1}(1-z)^{\beta-1}}{\mathrm{B}(\alpha,\beta)}\,dz, \qquad (\alpha,\beta\in\mathbb{R}_{+},\; 0\le x\le y\le 1)$$
for concise notation. (I_{x,y}(α, β) is the probability that a Beta(α, β) distributed random variable falls into (x, y) ⊂ [0, 1], and I_{0,x}(α, β) is its cumulative distribution function.)

7.3. Which Case of the CMT?

We are now ready to apply the CMT (Theorem B.1). As shown in Section 7.2, our toll function is Θ(n log n), so we have α = 1 and β = 1 in the notation of the CMT. We hence compute
$$\begin{aligned}
H &= 1 - \int_0^1 z\,w(z)\,dz\\
&= 1 - \int_0^1 2\,\bigl[\tfrac13 < z < \tfrac12 \vee z > \tfrac23\bigr]\,\frac{z^{t+1}(1-z)^{t}}{\mathrm{B}(t+1,\,t+1)}\,dz\\
&= 1 - 2\,\frac{t+1}{k+1}\int_0^1 \bigl[\tfrac13 < z < \tfrac12 \vee z > \tfrac23\bigr]\,\frac{z^{t+1}(1-z)^{t}}{\mathrm{B}(t+2,\,t+1)}\,dz\\
&= 1 - \Bigl(I_{\frac13,\frac12}(t+2,t+1) + I_{\frac23,1}(t+2,t+1)\Bigr)\\
&= I_{0,\frac13}(t+2,t+1) + I_{\frac12,\frac23}(t+2,t+1). \tag{8}
\end{aligned}$$

For any sampling parameters, we have H > 0, so by Case 1 of Theorem B.1 the overall costs satisfy
$$c(n) \sim \frac{t(n)}{H} \sim \frac{\bar a\, n\lg n}{H}, \quad (n\to\infty). \tag{9}$$

7.4. Cancellations

Combining Equations (7) and (9), we find
$$c(n) \sim a\,n\lg n, \quad (n\to\infty),$$
since I_{0,1/3} + I_{1/3,1/2} + I_{1/2,2/3} + I_{2/3,1} = 1. The leading term of the number of comparisons in QuickXsort is the same as in X itself, regardless of how the pivot elements are chosen! This is not as surprising as it might first seem. We are typically sorting a constant fraction of the input


by X and thus only do a logarithmic number of recursive calls on a geometrically decreasing

number of elements, so the linear contribution of Quicksort (partitioning and recursion cost) is

dominated by even the ﬁrst call of X, which has linearithmic cost. This remains true even if

we allow asymmetric sampling,

e.g.

, by choosing the pivot as the smallest (or any other order

statistic) of a random sample.

Edelkamp and Weiß [5] give the above result for the case of using the median of √n elements, where we effectively have exact medians from the perspective of the analysis. In this case, the informal reasoning given above is precise, and in fact, in this case the same form of cancellations also happens for the linear term [5, Thm. 1]. (See also the "exact ranks" result in Section 9.) We will show in the following that for practical schemes of pivot sampling, i.e., with fixed sample sizes, these cancellations happen only for the leading-term approximation. The pivot sampling scheme does affect the linear term significantly; and to measure the benefit of sampling, the analysis thus has to continue to the next term of the asymptotic expansion of c(n).

Relative Subproblem Sizes. The integral ∫₀¹ z·w(z) dz is precisely the expected relative subproblem size for the recursive call, whereas for t(n) we are interested in the subproblem that is sorted using X, whose relative size is given by ∫₀¹ (1−z)·w(z) dz = 1 − ∫₀¹ z·w(z) dz. We can thus write ā = aH.

Figure 2: ∫₀¹ z·w(z) dz, the relative recursive subproblem size, as a function of t.

The quantity ∫₀¹ z·w(z) dz, the average relative size of the recursive call, is of independent interest. While it is intuitively clear that for t → ∞, i.e., the case of exact medians as pivots, we must have a relative subproblem size of exactly 1/2, this convergence is not apparent from the behavior for finite t: the mass of the integral ∫₀¹ z·w(z) dz concentrates at z = 1/2, a point of discontinuity of w(z). It is also worthy of note that the expected subproblem size is initially larger than 1/2 (0.694 for t = 0), then decreases to ≈ 0.449124 around t = 20, and then starts to slowly increase again (see Figure 2).
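The non-monotone behavior can be reproduced numerically; the following sketch (midpoint integration, helper name our own) evaluates ∫₀¹ z·w(z) dz for the shape function of Equation (4):

```python
from math import exp, lgamma

def rel_recursive_size(t, steps=400_000):
    """Numerically evaluate ∫₀¹ z·w(z) dz for the shape function of
    Eq. (4): the expected relative size of the recursive subproblem."""
    log_beta = 2 * lgamma(t + 1) - lgamma(2 * t + 2)   # log B(t+1, t+1)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        if 1/3 < z < 1/2 or z > 2/3:                   # support of w
            total += z * 2 * z**t * (1 - z)**t / exp(log_beta) * h
    return total

assert abs(rel_recursive_size(0) - 25/36) < 1e-4   # 0.694… for t = 0
assert 0.44 < rel_recursive_size(20) < 0.46        # dip below 1/2 near t = 20
```

For t = 0 the integral is exactly 25/36 ≈ 0.694, matching the value quoted in the text.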

8. Solving the Recurrence: The Linear Term

Since c(n) ∼ a·n lg n for any choice of t, the leading term alone does not allow us to make distinctions to judge the effect of sampling schemes. To compute the next term in the asymptotic expansion of c(n), we consider the values c′(n) = c(n) − an lg n. c′(n) has essentially the same recursive


structure as c(n), only with a different toll function:
$$\begin{aligned}
c'(n) &= c(n) - an\lg n\\
&= \sum_{r=1}^{2}\mathbb{E}\bigl[A_r(J_r)\,c(J_r)\bigr] - an\lg n + t(n)\\
&= \sum_{r=1}^{2}\mathbb{E}\Bigl[A_r(J_r)\bigl(c(J_r) - aJ_r\lg J_r\bigr)\Bigr] + a\sum_{r=1}^{2}\mathbb{E}\bigl[A_r(J_r)\,J_r\lg J_r\bigr] - an\lg n\\
&\quad + (n-1) + \mathbb{E}\bigl[A_2(J_2)\cdot x(J_1)\bigr] + \mathbb{E}\bigl[A_1(J_1)\cdot x(J_2)\bigr]\\
&= \sum_{r=1}^{2}\mathbb{E}\bigl[A_r(J_r)\,c'(J_r)\bigr] + (n-1) - an\lg n\\
&\quad + a\,\mathbb{E}\Bigl[\bigl(A_1(J_1) + A_2(J_2)\bigr)J_1\lg J_1\Bigr] + b\,\mathbb{E}[A_2(J_2)\,J_1]\\
&\quad + a\,\mathbb{E}\Bigl[\bigl(A_2(J_2) + A_1(J_1)\bigr)J_2\lg J_2\Bigr] + b\,\mathbb{E}[A_1(J_1)\,J_2] \pm O(n^{1-\varepsilon})
\end{aligned}$$

Since J1 ᴰ= J2, we can simplify
$$\begin{aligned}
&\mathbb{E}\Bigl[\bigl(A_1(J_1) + A_2(J_2)\bigr)J_1\lg J_1\Bigr] + \mathbb{E}\Bigl[\bigl(A_2(J_2) + A_1(J_1)\bigr)J_2\lg J_2\Bigr]\\
&\qquad= \mathbb{E}\Bigl[\bigl(A_1(J_1) + A_2(J_2)\bigr)J_1\lg J_1\Bigr] + \mathbb{E}\Bigl[\bigl(A_2(J_1) + A_1(J_2)\bigr)J_1\lg J_1\Bigr]\\
&\qquad= \mathbb{E}\Bigl[J_1\lg J_1\cdot\Bigl(\bigl(A_1(J_1) + A_1(J_2)\bigr) + \bigl(A_2(J_1) + A_2(J_2)\bigr)\Bigr)\Bigr]\\
&\qquad= 2\,\mathbb{E}[J\lg J]\\
&\qquad\overset{(5)}{=} 2\,\mathbb{E}\bigl[\tfrac{J}{n}\bigr]\cdot n\lg n + \frac{2}{\ln 2}\,\mathbb{E}\bigl[\tfrac{J}{n}\ln\tfrac{J}{n}\bigr]\cdot n\\
&\qquad\overset{\text{Lemma E.1–(b)}}{=} n\lg n - \frac{1}{\ln 2}\bigl(H_{k+1} - H_{t+1}\bigr)n \pm O(n^{1-\varepsilon}).
\end{aligned}$$

Plugging this back into our equation for c′(n), we find
$$\begin{aligned}
c'(n) &= \sum_{r=1}^{2}\mathbb{E}\bigl[A_r(J_r)\,c'(J_r)\bigr] + (n-1) - an\lg n\\
&\quad + a\Bigl(n\lg n - \frac{1}{\ln 2}\bigl(H_{k+1}-H_{t+1}\bigr)n\Bigr)\\
&\quad + b\Bigl(I_{0,\frac13}(t+2,t+1) + I_{\frac12,\frac23}(t+2,t+1)\Bigr)\cdot n \pm O(n^{1-\varepsilon})\\
&= \sum_{r=1}^{2}\mathbb{E}\bigl[A_r(J_r)\,c'(J_r)\bigr] + t'(n),
\end{aligned}$$
where
$$t'(n) = b'n \pm O(n^{1-\varepsilon}), \qquad b' = 1 - \frac{a}{\ln 2}\bigl(H_{k+1}-H_{t+1}\bigr) + b\cdot H.$$

Apart from the smaller toll function t′(n), this recurrence has the very same shape as the original recurrence for c(n); in particular, we obtain the same shape function w(z) and the same H > 0, and we obtain
$$c'(n) \sim \frac{t'(n)}{H} \sim \frac{b'n}{H}.$$


8.1. Error Bound

Since our toll function is not given precisely, but only up to an error term O(n^{1−ε}) for a given fixed ε ∈ (0, 1], we also have to estimate the overall influence of this term. For that, we consider the recurrence for c(n) again, but replace t(n) (entirely) by C·n^{1−ε}. If ε > 0, then ∫₀¹ z^{1−ε} w(z) dz < ∫₀¹ w(z) dz = 1, so we still find H > 0 and apply Case 1 of the CMT. The overall contribution of the error term is then O(n^{1−ε}). For ε = 0, H = 0 and Case 2 applies, giving an overall error term of O(log n).

This completes the proof of Theorem 5.1.

9. Discussion

Since all our choices for X are leading-term optimal, so will QuickXsort be. We can thus fix a = 1 in Theorem 5.1; only b (and the allowable α) still depend on X. We then basically find that going from X to QuickXsort adds a "penalty" q in the linear term that depends only on the sample size (and α), but not on X. Table 1 shows that this penalty is ≈ n without sampling, but can be reduced drastically when choosing pivots from a sample of 3 or 5 elements. (Note that the overall costs for pivot sampling are O(log n) for constant t.)

            t = 0     t = 1     t = 2     t = 3     t = 10    t → ∞
α = 1       1.1146    0.5070    0.3210    0.2328    0.07705   0
α = 1/2     0.9120    0.4050    0.2526    0.1815    0.05956   0

Table 1: QuickXsort penalty. QuickXsort with x(n) = n lg n + bn yields c(n) = n lg n + (q + b)n, where q, the QuickXsort penalty, is given in the table.
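The entries of Table 1 can be reproduced directly from Theorem 5.1 (a sketch with our own helper names; the regularized incomplete beta function is evaluated by plain numerical integration):

```python
from math import exp, lgamma, log

def reg_inc_beta(x, y, a, b, steps=200_000):
    """I_{x,y}(a, b): mass a Beta(a, b) variable puts on (x, y),
    via midpoint integration of the density."""
    if y <= x:
        return 0.0
    log_B = lgamma(a) + lgamma(b) - lgamma(a + b)
    h = (y - x) / steps
    total = 0.0
    for i in range(steps):
        z = x + (i + 0.5) * h
        total += exp((a - 1) * log(z) + (b - 1) * log(1 - z) - log_B) * h
    return total

def penalty(t, alpha):
    """Linear-term penalty q of Theorem 5.1 for a = 1, b = 0."""
    k = 2 * t + 1
    H = (reg_inc_beta(0, alpha / (1 + alpha), t + 2, t + 1)
         + reg_inc_beta(1 / 2, 1 / (1 + alpha), t + 2, t + 1))
    harm = sum(1 / i for i in range(t + 2, k + 2))   # H_{k+1} - H_{t+1}
    return 1 / H - harm / (log(2) * H)

assert abs(penalty(0, 1) - 1.1146) < 5e-4     # no sampling, Heapsort-like X
assert abs(penalty(1, 1) - 0.5070) < 5e-4     # median-of-3
assert abs(penalty(0, 0.5) - 0.9120) < 5e-4   # no sampling, Mergesort
```

For α = 1 the second integral is empty (1/(1+α) = 1/2), so H reduces to I_{0,1/2}(t+2, t+1), e.g. H = 1/4 for t = 0.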

As we increase the sample size, we converge to the situation studied by Edelkamp and Weiß using median-of-√n, where no linear-term penalty is left [5]. Given that q is less than 0.08 already for a sample of 21 elements, these large-sample versions are mostly of theoretical interest. It is noteworthy that the improvement from no sampling to median-of-3 yields a reduction of q by more than 50%, which is much more than its effect on Quicksort itself (where it reduces the leading term of costs by 15% from 2n ln n to (12/7) n ln n).

We now apply our transfer theorem to the two most well-studied choices for X, Heapsort

and Mergesort, and compare the results to analyses and measured comparison counts from

previous work. The results conﬁrm that solving the QuickXsort recurrence exactly yields much

more accurate predictions for the overall number of comparisons than previous bounds that

circumvented this.

9.1. QuickHeapsort

The basic external Heapsort of Cantone and Cincotti [1] always traverses one path in the heap from root to bottom and does one comparison for each edge followed, i.e., ⌊lg n⌋ or ⌊lg n⌋ − 1 many per deleteMax. By counting how many leaves we have on each level, Diekert and Weiß found [3, Eq. 1]

    n(⌊lg n⌋ − 1) + 2n − 2^{⌊lg n⌋+1} ± O(log n)  ≤  n lg n − 0.913929n ± O(log n)

comparisons for the sort-down phase. (The constant of the linear term is 1 − 1/ln 2 − lg(2 ln 2), the supremum of the periodic function at the linear term.) Using the classical heap-construction method adds on average 1.8813726n comparisons [4], so here

    x(n) = n lg n + 0.967444n ± O(n^ε)

for any ε > 0.
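The constants above combine two known quantities; the following quick numeric sanity check (nothing new, it only re-evaluates the stated constants) confirms the arithmetic:

```python
import math

# Supremum of the periodic linear term of external Heapsort's sort-down
# phase: 1 - 1/ln 2 - lg(2 ln 2) ≈ -0.913929.
sortdown_const = 1 - 1 / math.log(2) - math.log2(2 * math.log(2))

# Average cost of classic (Floyd) heap construction: +1.8813726 n [4].
construction_const = 1.8813726

# Linear-term coefficient of x(n) for external Heapsort.
x_linear = sortdown_const + construction_const

print(round(sortdown_const, 6))  # -0.913929
print(round(x_linear, 6))        # 0.967444
```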

Both [1] and [3] report averaged comparison counts from running-time experiments. We compare them in Table 2 against the estimates from our result and previous analyses. While the approximation is not very accurate for n = 100 (for all analyses), for larger n, our estimate is correct up to the first three digits, whereas previous upper bounds have almost one order of magnitude bigger errors. Note that it is expected for our bound to still be on the conservative side, since we used the supremum of the periodic linear term for Heapsort.

    Instance                        observed          W          CC          DW
    Fig. 4 [1], n = 10², k = 1           806        +67        +158        +156
    Fig. 4 [1], n = 10², k = 3           714        +98          —         +168
    Fig. 4 [1], n = 10⁵, k = 1     1 869 769       −600     +90 795     +88 795
    Fig. 4 [1], n = 10⁵, k = 3     1 799 240     +9 165          —       +79 324
    Fig. 4 [1], n = 10⁶, k = 1    21 891 874   +121 748  +1 035 695  +1 015 695
    Fig. 4 [1], n = 10⁶, k = 3    21 355 988    +49 994          —      +751 581
    Tab. 2 [3], n = 10⁴, k = 1       152 573     +1 125     +10 264     +10 064
    Tab. 2 [3], n = 10⁴, k = 3       146 485     +1 136          —       +8 152
    Tab. 2 [3], n = 10⁶, k = 1    21 975 912    +37 710    +951 657    +931 657
    Tab. 2 [3], n = 10⁶, k = 3    21 327 478    +78 504          —      +780 091

Table 2: Comparison of estimates from this paper (W), Theorem 6 of [1] (CC) and Theorem 1 of [3] (DW); shown is the difference between the estimate and the observed average.

9.2. QuickMergesort

For QuickMergesort, Edelkamp and Weiß [5, Fig. 4] report measured average comparison counts for a median-of-3 version using top-down Mergesort: the linear term is shown to be between −0.8n and −0.9n. In a recent manuscript [6], they also analytically consider the simplified median-of-3 QuickMergesort, which always sorts the smaller segment by Mergesort (i.e., α = 1). It uses n lg n − 0.7330n + o(n) comparisons on average (using b = −1.24). They use this as a (conservative) upper bound for the original QuickMergesort.

Our transfer theorem shows that this bound is off by roughly 0.1n: median-of-3 QuickMergesort uses at most c(n) = n lg n − 0.8350n ± O(log n) comparisons on average. Going to median-of-5 reduces the linear term to −0.9874n, which is better than the worst case of top-down Mergesort for most n.
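To see how closely such predictions track actual counts, the following sketch implements a comparison-counting toy version of the simplified variant from [6] (α = 1: the smaller segment is always sorted by Mergesort, and we recurse on the larger one), whose average is n lg n − 0.7330n + o(n). The in-place buffer mechanics are omitted since they do not affect the comparison count; this is an illustration, not the code measured in [5].

```python
import math
import random

comparisons = 0

def less(x, y):
    """All element comparisons are routed through here and counted."""
    global comparisons
    comparisons += 1
    return x < y

def mergesort(xs):
    """Plain top-down mergesort, returning a sorted copy."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = mergesort(xs[:mid]), mergesort(xs[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if less(right[j], left[i]):
            out.append(right[j]); j += 1
        else:
            out.append(left[i]); i += 1
    return out + left[i:] + right[j:]

def median3(a, b, c):
    """Median of three values using 2-3 counted comparisons."""
    if less(b, a):
        a, b = b, a
    if less(c, b):
        b = a if less(c, a) else c
    return b

def quick_mergesort(xs):
    """Simplified median-of-3 QuickMergesort (alpha = 1)."""
    if len(xs) <= 3:
        return mergesort(xs)
    p = median3(*random.sample(xs, 3))
    small, large, seen_pivot = [], [], False
    for x in xs:
        if x == p and not seen_pivot:
            seen_pivot = True            # remove the pivot itself once
        elif less(x, p):
            small.append(x)
        else:
            large.append(x)
    if len(small) <= len(large):         # Mergesort the smaller segment,
        return mergesort(small) + [p] + quick_mergesort(large)
    return quick_mergesort(small) + [p] + mergesort(large)  # recurse on larger

random.seed(42)
n = 10_000
res = quick_mergesort([random.random() for _ in range(n)])
predicted = n * math.log2(n) - 0.7330 * n
print(comparisons, round(predicted))
```

With the seed fixed, the observed count typically lands within a fraction of a percent of the predicted average (the two non-pivot sample elements are re-compared during partitioning, which overcounts by O(log n) in total).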

Skewed Pivots for Mergesort?  For Mergesort with α = 1/2, the largest fraction of elements we can sort by Mergesort in one step is 2/3; this suggests that using a slightly skewed pivot might be beneficial, since it will increase the subproblem size for Mergesort and decrease the size for recursive calls. Indeed, Edelkamp and Weiß allude to this variation: “With about 15% the time gap, however, is not overly big, and may be bridged with additional efforts like skewed pivots and refined partitioning.” (The statement appears in the arXiv version of [5], arxiv.org/abs/1307.3033.) And the above-mentioned StackExchange post actually chooses pivots as the second tertile.

Our analysis above can be extended to skewed sampling schemes (omitted due to space constraints), but to illustrate this point it suffices to pay a short visit to “wishful-thinking land” and assume that we can get exact quantiles for free. We can show (e.g., with Roura’s discrete master theorem [20]) that if we always pick the exact ρ-quantile of the input, for ρ ∈ (0, 1), the overall costs are

    c_ρ(n) = n lg n + ((1 + h(ρ))/(1 − ρ) + b) n ± O(n^{1−ε})   if ρ ∈ (1/3, 1/2) ∪ (2/3, 1),
    c_ρ(n) = n lg n + ((1 + h(ρ))/ρ + b) n ± O(n^{1−ε})         otherwise,

for h(x) = x lg x + (1 − x) lg(1 − x). The coefficient of the linear term has a strict minimum at ρ = 1/2: even for α = 1/2, the best choice is to use the median of a sample. (The result is the same for fixed-size samples.) For QuickMergesort, skewed pivots turn out to be a pessimization, despite the fact that we sort a larger part by Mergesort. A possible explanation is that skewed pivots significantly decrease the amount of information we obtain from the comparisons during partitioning, but do not make partitioning any cheaper.
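The strict minimum at ρ = 1/2 is easy to verify numerically; this sketch evaluates the linear-term penalty from the formula above, i.e., (1 + h(ρ))/(1 − ρ) on (1/3, 1/2) ∪ (2/3, 1) and (1 + h(ρ))/ρ otherwise:

```python
import math

def h(x):
    """h(x) = x lg x + (1 - x) lg(1 - x); nonpositive, equals -1 at x = 1/2."""
    return x * math.log2(x) + (1 - x) * math.log2(1 - x)

def penalty(rho):
    """Linear-term penalty for exact rho-quantile pivots (b excluded)."""
    if 1/3 < rho < 1/2 or 2/3 < rho < 1:
        return (1 + h(rho)) / (1 - rho)
    return (1 + h(rho)) / rho

grid = [i / 1000 for i in range(1, 1000)]
best = min(grid, key=penalty)
print(best, penalty(best))  # 0.5 0.0
```

Since 1 + h(ρ) > 0 for every ρ ≠ 1/2, the penalty is strictly positive away from the median, matching the claim that skewed pivots are a pessimization here.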

9.3. Future Work

More promising than skewed pivot sampling is the use of several pivots. The resulting MultiwayQuickXsort would be able to sort all but one segment using X and recurse on only one subproblem. Here, determining the expected subproblem sizes becomes a challenge, in particular for α < 1; we leave this for future work.

We also confined ourselves to the expected number of comparisons here, but more details about the distribution of costs are possible to obtain. The variance follows a recurrence similar to the one studied in this paper, and a distributional recurrence for the costs can be given. The discontinuities in the subproblem sizes add a new facet to these analyses.

Finally, it is a typical phenomenon that constant-factor-optimal sorting methods exhibit periodic linear terms. QuickXsort inherits these fluctuations but smooths them through the random subproblem sizes. Explicitly accounting for these effects is another interesting challenge for future work.

Acknowledgements.  I would like to thank three anonymous referees for many helpful comments, references and suggestions that helped improve the presentation of this paper.


A. Notation

A.1. Generic Mathematics

N, N0, Z, R ........ natural numbers N = {1, 2, 3, . . .}, N0 = N ∪ {0}, integers Z = {. . . , −2, −1, 0, 1, 2, . . .}, real numbers R.
R>1, N≥3 etc. ...... restricted sets X_pred = {x ∈ X : x fulfills pred}.
0.3̲ ................ repeating decimal; 0.3̲ = 0.333 . . . = 1/3; numerals under the line form the repeated part of the decimal number.
ln(n), lg(n) ....... natural and binary logarithm; ln(n) = log_e(n), lg(n) = log_2(n).
X .................. to emphasize that X is a random variable, it is Capitalized.
[a, b) ............. real intervals; end points with round parentheses are excluded, those with square brackets are included.
[m..n], [n] ........ integer intervals, [m..n] = {m, m + 1, . . . , n}; [n] = [1..n].
[stmt], [x = y] .... Iverson bracket, [stmt] = 1 if stmt is true, [stmt] = 0 otherwise.
H_n ................ nth harmonic number; H_n = Σ_{i=1}^{n} 1/i.
x ± y .............. x with absolute error |y|; formally the interval x ± y = [x − |y|, x + |y|]; as with O-terms, we use one-way equalities z = x ± y instead of z ∈ x ± y.
B(α, β) ............ the beta function, B(α, β) = ∫_0^1 z^{α−1} (1 − z)^{β−1} dz.
I_{x,y}(α, β) ...... the regularized incomplete beta function; I_{x,y}(α, β) = ∫_x^y z^{α−1}(1−z)^{β−1} / B(α, β) dz for α, β ∈ R+, 0 ≤ x ≤ y ≤ 1.
a^{b̲}, a^{b̄} ....... factorial powers; “a to the b falling resp. rising.”

A.2. Stochastics-related Notation

P[E], P[X = x] ..... probability of an event E resp. probability for random variable X to attain value x.
E[X] ............... expected value of X; we write E[X | Y] for the conditional expectation of X given Y, and E_X[f(X)] to emphasize that expectation is taken w.r.t. random variable X.
X =_D Y ............ equality in distribution; X and Y have the same distribution.
U(a, b) ............ uniformly in (a, b) ⊂ R distributed random variable.
Beta(α, β) ......... Beta-distributed random variable with shape parameters α ∈ R>0 and β ∈ R>0.
Bin(n, p) .......... binomial-distributed random variable with n ∈ N0 trials and success probability p ∈ [0, 1].
BetaBin(n, α, β) ... beta-binomial-distributed random variable; n ∈ N0, α, β ∈ R>0.

A.3. Notation for the Algorithm

n .................. length of the input array, i.e., the input size.
k, t ............... sample size k ∈ N≥1, odd; k = 2t + 1, t ∈ N0.
x(n), a, b ......... average costs of X, x(n) = a n lg n + bn ± O(n^{1−ε}).
t(n), ā, b̄ ......... toll function t(n) = ā n lg n + b̄ n ± O(n^{1−ε}).
J1, J2 ............. (random) subproblem sizes; J1 + J2 = n − 1; J1 = t + I1.
I1, I2 ............. (random) segment sizes in partitioning; I1 =_D BetaBin(n − k, t + 1, t + 1); I2 = n − k − I1; J1 = t + I1.


B. The Continuous Master Theorem

We restate Roura’s CMT here for convenience.

Theorem B.1 (Roura’s Continuous Master Theorem (CMT)):  Let F_n be recursively defined by

    F_n = b_n                                    for 0 ≤ n < N;
    F_n = t_n + Σ_{j=0}^{n−1} w_{n,j} F_j        for n ≥ N,                    (10)

where t_n, the toll function, satisfies t_n ~ K n^α log^β(n) as n → ∞ for constants K ≠ 0, α ≥ 0 and β > −1. Assume there exists a function w : [0, 1] → R≥0, the shape function, with ∫_0^1 w(z) dz ≥ 1 and

    Σ_{j=0}^{n−1} | w_{n,j} − ∫_{j/n}^{(j+1)/n} w(z) dz | = O(n^{−d}),   (n → ∞),    (11)

for a constant d > 0. With H := 1 − ∫_0^1 z^α w(z) dz, we have the following cases:

1. If H > 0, then F_n ~ t_n / H.

2. If H = 0, then F_n ~ (t_n ln n) / H̃ with H̃ = −(β + 1) ∫_0^1 z^α ln(z) w(z) dz.

3. If H < 0, then F_n = O(n^c) for the unique c ∈ R with ∫_0^1 z^c w(z) dz = 1.

Theorem B.1 is the “reduced form” of the CMT, which appears as Theorem 1.3.2 in Roura’s doctoral thesis [19], and as Theorem 18 of [16]. The full version (Theorem 3.3 in [20]) allows us to handle sublogarithmic factors in the toll function, as well, which we do not need here.
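For intuition (this example is not needed for the proofs here), case 2 covers the classic Quicksort comparison recurrence F_n = (n − 1) + (2/n) Σ_{j<n} F_j: here w_{n,j} = 2/n, so w(z) = 2, t_n ~ n (α = 1, β = 0), H = 1 − ∫_0^1 2z dz = 0, and H̃ = −∫_0^1 2z ln(z) dz = 1/2, predicting F_n ~ 2n ln n. A short script checks the recurrence against the exact solution 2(n + 1)H_n − 4n:

```python
import math

N = 5000
F = [0.0] * (N + 1)
prefix = 0.0                          # running sum F_0 + ... + F_{n-1}
for n in range(1, N + 1):
    F[n] = (n - 1) + 2 * prefix / n   # Quicksort: w_{n,j} = 2/n, t_n = n - 1
    prefix += F[n]

harmonic = sum(1 / i for i in range(1, N + 1))
exact = 2 * (N + 1) * harmonic - 4 * N
print(F[N], exact)                    # agree up to floating-point error

# CMT case 2 gives only the leading term 2 N ln N; the ratio approaches 1
# slowly because of the Theta(n) lower-order term.
print(F[N] / (2 * N * math.log(N)))
```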

C. Local Limit Law for the Beta-Binomial Distribution

Since the binomial distribution is sharply concentrated, one can use Chernoff bounds on beta-binomial variables after conditioning on the beta-distributed success probability. That already implies that BetaBin(n, α, β)/n converges to Beta(α, β) (in a specific sense). We can obtain stronger error bounds, though, by directly comparing the PDFs. Doing that gives the following result; a detailed proof is given in [23], Lemma 2.38.

Lemma C.1 (Local Limit Law for Beta-Binomial, [23], Lemma 2.38):  Let (I^(n))_{n∈N≥1} be a family of random variables with beta-binomial distribution, I^(n) =_D BetaBin(n, α, β), where α, β ∈ {1} ∪ R≥2, and let f_B(z) be the density of the Beta(α, β) distribution. Then we have, uniformly in z ∈ (0, 1), that

    n · P[I = ⌊z(n + 1)⌋] = f_B(z) ± O(n^{−1}),   (n → ∞).

That is, I^(n)/n converges to Beta(α, β) in distribution, and the probability weights converge uniformly to the limiting density at rate O(n^{−1}).
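For integer parameters the beta-binomial weights have a closed form, so the convergence in Lemma C.1 can be observed directly; the sketch below compares n · P[I = ⌊z(n + 1)⌋] with the Beta(α, β) density (the choice α = 2, β = 3 is arbitrary):

```python
import math

def log_betabin_pmf(n, a, b, i):
    """log P[I = i] for I ~ BetaBin(n, a, b), via log-gamma functions."""
    lg = math.lgamma
    return (lg(n + 1) - lg(i + 1) - lg(n - i + 1)
            + lg(a + i) + lg(b + n - i) + lg(a + b)
            - lg(a) - lg(b) - lg(a + b + n))

def beta_density(z, a, b):
    """Density of Beta(a, b) at z."""
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return z ** (a - 1) * (1 - z) ** (b - 1) / math.exp(log_B)

def max_err(n, a=2, b=3):
    """max over a z-grid of |n * P[I = floor(z(n+1))] - f_B(z)|."""
    return max(abs(n * math.exp(log_betabin_pmf(n, a, b, int(z * (n + 1))))
                   - beta_density(z, a, b))
               for z in (i / 20 for i in range(1, 20)))

e1, e2 = max_err(1_000), max_err(10_000)
print(e1, e2)   # the error shrinks roughly like 1/n
```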


D. Smoothness of the Shape Function

In this appendix we show that w(z) as given in Equation (4) on page 8 fulfills Equation (11) on page 16, the approximation-rate criterion of the CMT. We consider the following ranges for ⌊zn⌋/(n−1) = j/(n−1) separately:

• ⌊zn⌋/(n−1) < 1/3 and 1/2 < ⌊zn⌋/(n−1) < 2/3.
  Here w_{n,⌊zn⌋} = 0 and so is w(z). So actual value and approximation are exactly the same.

• 1/3 < ⌊zn⌋/(n−1) < 1/2 and ⌊zn⌋/(n−1) > 2/3.
  Here w_{n,j} = 2 P[J = j] and w(z) = 2 f_P(z), i.e., twice the density f_P(z) = z^t (1−z)^t / B(t+1, t+1) of the beta distribution Beta(t+1, t+1). Since f_P is Lipschitz-continuous on the bounded interval [0, 1] (it is a polynomial), the uniform pointwise convergence from above is enough to bound the sum of |w_{n,j} − ∫_{j/n}^{(j+1)/n} w(z) dz| over all j in the range by O(n^{−1}).

• ⌊zn⌋/(n−1) ∈ {1/3, 1/2, 2/3}.
  At these boundary points, the difference between w_{n,⌊zn⌋} and w(z) does not vanish (in particular, 1/2 is a singular point for w_{n,⌊zn⌋}), but the absolute difference is bounded. Since this case only concerns 3 out of n summands, the overall contribution to the error is O(n^{−1}).

Together, we find that Equation (11) is fulfilled as claimed:

    Σ_{j=0}^{n−1} | w_{n,j} − ∫_{j/n}^{(j+1)/n} w(z) dz | = O(n^{−1})   (n → ∞).    (12)
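The O(n^{−1}) rate can also be observed numerically for a small sample size. The sketch below assumes the shape function is exactly the restriction described in the bullets above, with t = 1 (k = 3), so that J = 1 + BetaBin(n − 3, 2, 2) as in Appendix A.3 and w(z) = 12 z(1 − z) on (1/3, 1/2) ∪ (2/3, 1) and 0 elsewhere:

```python
def pmf_J(n, j):
    """P[J = j] for J = 1 + BetaBin(n - 3, 2, 2), in closed form."""
    m, i = n - 3, j - 1
    if not 0 <= i <= m:
        return 0.0
    return 6 * (i + 1) * (m - i + 1) / ((m + 1) * (m + 2) * (m + 3))

def W(z):
    """Antiderivative of w(z) = 12 z (1 - z) (twice the Beta(2,2) density)."""
    return 6 * z * z - 4 * z ** 3

def integral_w(lo, hi):
    """Integral of w over [lo, hi]; w vanishes outside the two ranges."""
    total = 0.0
    for a, b in [(1 / 3, 1 / 2), (2 / 3, 1.0)]:
        l, r = max(lo, a), min(hi, b)
        if l < r:
            total += W(r) - W(l)
    return total

def criterion_error(n):
    """Left-hand side of Equation (12) for this shape function."""
    err = 0.0
    for j in range(n):
        frac = j / (n - 1)
        active = 1 / 3 < frac < 1 / 2 or 2 / 3 < frac < 1
        wnj = 2 * pmf_J(n, j) if active else 0.0
        err += abs(wnj - integral_w(j / n, (j + 1) / n))
    return err

e_1000, e_4000 = criterion_error(1000), criterion_error(4000)
print(e_1000, e_4000)   # decreases roughly like 1/n
```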

E. Approximation by (Incomplete) Beta Integrals

Lemma E.1:  Let J =_D BetaBin(n − c1, α, β) + c2 be a random variable that differs by fixed constants c1 and c2 from a beta-binomial variable with parameters n ∈ N and α, β ∈ N≥1. Then the following holds:

(a) For fixed constants 0 ≤ x ≤ y ≤ 1 holds

    E[ [xn ≤ J ≤ yn] · J lg J ] = α/(α+β) · I_{x,y}(α+1, β) · n lg n ± O(n),   (n → ∞).

The result holds also when any or both of the inequalities in [xn ≤ J ≤ yn] are strict.

(b) E[ (J/n) ln(J/n) ] = α/(α+β) · (H_α − H_{α+β}) ± O(n^{−h}) for any h ∈ (0, 1).

Proof:  We start with part (a). By the local limit law for beta-binomials (Lemma C.1), it is plausible to expect a reasonably small error when we replace E[[xn ≤ J ≤ yn] · J lg J] by E[[x ≤ P ≤ y] · (Pn) lg(Pn)], where P =_D Beta(α, β) is beta distributed. We bound the error in the following.

We have E[[xn ≤ J ≤ yn] · J ln J] = E[[xn ≤ J ≤ yn] · J/n] · n ln n ± O(n) by Equation (5); it thus suffices to compute E[[xn ≤ J ≤ yn] · J/n]. We first replace J by I =_D BetaBin(n, α, β) and argue later that this results in a sufficiently small error. We expand

    E[[x ≤ I/n ≤ y] · I/n]
        = Σ_{i=⌈xn⌉}^{⌊yn⌋} (i/n) · P[I = i]
        = (1/n) Σ_{i=⌈xn⌉}^{⌊yn⌋} (i/n) · n P[I = i]
        = (by Lemma C.1)  (1/n) Σ_{i=⌈xn⌉}^{⌊yn⌋} (i/n) · (i/n)^{α−1} (1 − i/n)^{β−1} / B(α, β) ± O(n^{−1})
        = 1/B(α, β) · (1/n) Σ_{i=⌈xn⌉}^{⌊yn⌋} f(i/n) ± O(n^{−1}),

where f(z) = z^α (1 − z)^{β−1}.

Note that f(z) is Lipschitz-continuous on the bounded interval [x, y] since it is continuously differentiable (it is a polynomial). Integrals of Lipschitz functions are well-approximated by finite Riemann sums; see Lemma 2.12 (b) of [23] for a formal statement. We use that on the sum above:

    (1/n) Σ_{i=⌈xn⌉}^{⌊yn⌋} f(i/n) = ∫_x^y f(z) dz ± O(n^{−1}),   (n → ∞).

Inserting above and using B(α+1, β)/B(α, β) = α/(α+β) yields

    E[[x ≤ I/n ≤ y] · I/n] = ∫_x^y z^α (1−z)^{β−1} dz / B(α, β) ± O(n^{−1})
                           = α/(α+β) · I_{x,y}(α+1, β) ± O(n^{−1});    (13)

recall that

    I_{x,y}(α, β) = ∫_x^y z^{α−1}(1−z)^{β−1} / B(α, β) dz = P[x < P < y]

denotes the regularized incomplete beta function.

Changing from I back to J has no influence on the given approximation. To compensate for the difference in the number of trials (n − c1 instead of n), we use the above formulas with n − c1 instead of n; since we let n go to infinity anyway, this does not change the result. Moreover, replacing I by I + c2 changes the value of the argument z = I/n of f by O(n^{−1}); since f is smooth, namely Lipschitz-continuous, this also changes f(z) by at most O(n^{−1}). The result is thus not affected by more than the given error term:

    E[[x ≤ J/n ≤ y] · J/n] = E[[x ≤ I/n ≤ y] · I/n] ± O(n^{−1}).

We obtain the claim by multiplying with n lg n. Versions with strict inequalities in [xn ≤ J ≤ yn] only affect the bounds of the sums above by one, which again gives a negligible error of O(n^{−1}).

This concludes the proof of part (a).
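Part (a) is easy to check numerically for concrete integer parameters (taking c1 = c2 = 0, α = 2, β = 3, x = 0.25, y = 0.75, all chosen arbitrarily); for integer parameters, the regularized incomplete beta function reduces to a binomial tail sum:

```python
import math

def betabin_pmf(n, a, b, i):
    """P[I = i] for I ~ BetaBin(n, a, b), via log-gamma functions."""
    lg = math.lgamma
    return math.exp(lg(n + 1) - lg(i + 1) - lg(n - i + 1)
                    + lg(a + i) + lg(b + n - i) + lg(a + b)
                    - lg(a) - lg(b) - lg(a + b + n))

def beta_cdf(z, a, b):
    """I_z(a, b) for integer a, b, as a binomial tail sum."""
    m = a + b - 1
    return sum(math.comb(m, j) * z ** j * (1 - z) ** (m - j)
               for j in range(a, m + 1))

n, a, b, x, y = 2_000, 2, 3, 0.25, 0.75
lhs = sum((i / n) * betabin_pmf(n, a, b, i)
          for i in range(math.ceil(x * n), math.floor(y * n) + 1))
rhs = a / (a + b) * (beta_cdf(y, a + 1, b) - beta_cdf(x, a + 1, b))
print(lhs, rhs)   # differ by O(1/n)
```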


For part (b), we follow a similar route. The function we integrate is no longer Lipschitz-continuous, but a weaker form of smoothness is sufficient to bound the difference between the integral and its Riemann sums. Indeed, the above-cited Lemma 2.12 (b) of [23] is formulated for the weaker notion of Hölder-continuity: a function f : I → R defined on a bounded interval I is called Hölder-continuous with exponent h ∈ (0, 1] when

    ∃C ∀x, y ∈ I : |f(x) − f(y)| ≤ C |x − y|^h.

This generalizes Lipschitz-continuity (which corresponds to h = 1).

As above, we replace J by I =_D BetaBin(n, α, β), which affects the overall result by O(n^{−1}). We compute

    E[(I/n) ln(I/n)]
        = Σ_{i=0}^{n} (i/n) ln(i/n) · P[I = i]
        = (by Lemma C.1)  (1/n) Σ_{i=0}^{n} (i/n) ln(i/n) · (i/n)^{α−1} (1 − i/n)^{β−1} / B(α, β) ± O(n^{−1})
        = −1/B(α, β) · (1/n) Σ_{i=0}^{n} f(i/n) ± O(n^{−1}),

where now f(z) = ln(1/z) · z^α (1−z)^{β−1}. Since the derivative is ∞ for z = 0, f cannot be Lipschitz-continuous, but it is Hölder-continuous on [0, 1] for any exponent h ∈ (0, 1): z ↦ ln(1/z) · z is Hölder-continuous (see, e.g., [23], Prop. 2.13), products of Hölder-continuous functions remain such on bounded intervals, and the remaining factor of f is a polynomial in z, which is Lipschitz- and hence Hölder-continuous.

By Lemma 2.12 (b) of [23] we then have

    (1/n) Σ_{i=0}^{n} f(i/n) = ∫_0^1 f(z) dz ± O(n^{−h}).

Recall that we can choose h as close to 1 as we wish; this will only affect the constant hidden by the O(n^{−h}). It remains to actually compute the integral; fortunately, this “logarithmic beta integral” has a well-known closed form (see, e.g., [23], Eq. (2.30)):

    ∫_0^1 ln(z) · z^α (1−z)^{β−1} dz = B(α+1, β) (H_α − H_{α+β}).

Inserting above, we finally find

    E[(J/n) ln(J/n)] = E[(I/n) ln(I/n)] ± O(n^{−1})
                     = α/(α+β) · (H_α − H_{α+β}) ± O(n^{−h})

for any h ∈ (0, 1).
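Part (b) admits the same kind of numeric check; with α = β = 2 (and c1 = c2 = 0, chosen arbitrarily), the predicted limit is (1/2)(H_2 − H_4) = (1/2)(3/2 − 25/12) = −7/24:

```python
import math

def betabin_pmf(n, a, b, i):
    """P[I = i] for I ~ BetaBin(n, a, b), via log-gamma functions."""
    lg = math.lgamma
    return math.exp(lg(n + 1) - lg(i + 1) - lg(n - i + 1)
                    + lg(a + i) + lg(b + n - i) + lg(a + b)
                    - lg(a) - lg(b) - lg(a + b + n))

def harmonic(k):
    """k-th harmonic number H_k."""
    return sum(1 / i for i in range(1, k + 1))

n, a, b = 10_000, 2, 2
lhs = sum((i / n) * math.log(i / n) * betabin_pmf(n, a, b, i)
          for i in range(1, n + 1))            # the i = 0 term contributes 0
rhs = a / (a + b) * (harmonic(a) - harmonic(a + b))
print(lhs, rhs)   # close; the error is O(n^{-h}) for any h < 1
```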


References

[1] D. Cantone and G. Cincotti. QuickHeapsort, an efficient mix of classical sorting algorithms. Theoretical Computer Science, 285(1):25–42, August 2002. doi:10.1016/S0304-3975(01)00288-2.

[2] Domenico Cantone and Gianluca Cincotti. QuickHeapsort, an efficient mix of classical sorting algorithms. In Italian Conference on Algorithms and Complexity (CIAC), pages 150–162, 2000. doi:10.1007/3-540-46521-9_13.

[3] Volker Diekert and Armin Weiß. QuickHeapsort: Modifications and improved analysis. Theory of Computing Systems, 59(2):209–230, August 2016. doi:10.1007/s00224-015-9656-y.

[4] Ernst E. Doberkat. An average case analysis of Floyd’s algorithm to construct heaps. Information and Control, 61(2):114–131, May 1984. doi:10.1016/S0019-9958(84)80053-4.

[5] Stefan Edelkamp and Armin Weiß. QuickXsort: Efficient sorting with n log n − 1.399n + o(n) comparisons on average. In International Computer Science Symposium in Russia, pages 139–152. Springer, 2014. doi:10.1007/978-3-319-06686-8_11.

[6] Stefan Edelkamp and Armin Weiß. QuickMergesort: Practically efficient constant-factor optimal sorting, 2018. arXiv:1804.10062.

[7] Philippe Flajolet and Mordecai Golin. Mellin transforms and asymptotics. Acta Informatica, 31(7):673–696, July 1994. doi:10.1007/BF01177551.

[8] Lester R. Ford and Selmer M. Johnson. A tournament problem. The American Mathematical Monthly, 66(5):387, May 1959. doi:10.2307/2308750.

[9] Viliam Geffert and Jozef Gajdoš. In-place sorting. In SOFSEM 2011: Theory and Practice of Computer Science, pages 248–259. Springer, 2011. doi:10.1007/978-3-642-18381-2_21.

[10] Hsien-Kuei Hwang. Limit theorems for mergesort. Random Structures and Algorithms, 8(4):319–336, July 1996. doi:10.1002/(sici)1098-2418(199607)8:4<319::aid-rsa3>3.0.co;2-0.

[11] Hsien-Kuei Hwang. Asymptotic expansions of the mergesort recurrences. Acta Informatica, 35(11):911–919, November 1998. doi:10.1007/s002360050147.

[12] Jyrki Katajainen. The ultimate heapsort. In Computing: The 4th Australasian Theory Symposium, Australian Computer Science Communications, pages 87–96. Springer-Verlag Singapore Pte. Ltd., 1998. URL: http://www.diku.dk/~jyrki/Myris/Kat1998C.html.

[13] Jyrki Katajainen, Tomi Pasanen, and Jukka Teuhola. Practical in-place mergesort. Nordic Journal of Computing, 3(1):27–40, 1996. URL: http://www.diku.dk/~jyrki/Myris/KPT1996J.html.

[14] Donald E. Knuth. The Art of Computer Programming: Sorting and Searching. Addison Wesley, 2nd edition, 1998.

[15] Heikki Mannila and Esko Ukkonen. A simple linear-time algorithm for in situ merging. Information Processing Letters, 18(4):203–208, May 1984. doi:10.1016/0020-0190(84)90112-1.

[16] Conrado Martínez and Salvador Roura. Optimal sampling strategies in Quicksort and Quickselect. SIAM Journal on Computing, 31(3):683–705, 2001. doi:10.1137/S0097539700382108.

[17] Wolfgang Panny and Helmut Prodinger. Bottom-up mergesort—a detailed analysis. Algorithmica, 14(4):340–354, October 1995. doi:10.1007/BF01294131.

[18] Klaus Reinhardt. Sorting in-place with a worst case complexity of n log n − 1.3n + O(log n) comparisons and εn log n + O(1) transports. In International Symposium on Algorithms and Computation (ISAAC), pages 489–498, 1992. doi:10.1007/3-540-56279-6_101.

[19] Salvador Roura. Divide-and-Conquer Algorithms and Data Structures. Tesi doctoral (Ph.D. thesis), Universitat Politècnica de Catalunya, 1997.

[20] Salvador Roura. Improved master theorems for divide-and-conquer recurrences. Journal of the ACM, 48(2):170–205, 2001. doi:10.1145/375827.375837.

[21] Robert Sedgewick and Philippe Flajolet. An Introduction to the Analysis of Algorithms. Addison-Wesley-Longman, 2nd edition, 2013.

[22] Robert Sedgewick and Kevin Wayne. Algorithms. Addison-Wesley, 4th edition, 2011.

[23] Sebastian Wild. Dual-Pivot Quicksort and Beyond: Analysis of Multiway Partitioning and Its Practical Potential. Doktorarbeit (Ph.D. thesis), Technische Universität Kaiserslautern, 2016. ISBN 978-3-00-054669-3. URL: http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:386-kluedo-44682.