arXiv:1209.4214v2 [cs.DS] 6 Mar 2013

QuickHeapsort: Modifications and Improved Analysis

Volker Diekert Armin Weiß

Universität Stuttgart, FMI

Universitätsstraße 38

D-70569 Stuttgart, Germany

{diekert,weiss}@fmi.uni-stuttgart.de

March 7, 2013

Abstract

We present a new analysis for QuickHeapsort splitting it into the analysis

of the partition-phases and the analysis of the heap-phases. This enables

us to consider samples of non-constant size for the pivot selection and

leads to better theoretical bounds for the algorithm.

Furthermore, we introduce some modifications of QuickHeapsort, both
in-place and using n extra bits. We show that on every input the
expected number of comparisons is n lg n − 0.03n + o(n) (in-place) and
n lg n − 0.997n + o(n) (with n extra bits), respectively; here lg n
always denotes log₂ n. Both estimates improve the previously known best
results. (It is conjectured [19] that the in-place algorithm
Bottom-Up-Heapsort uses at most n lg n + 0.4n comparisons on average,
and for Weak-Heapsort, which uses n extra bits, the average number of
comparisons is at most n lg n − 0.42n [8].) Moreover, our non-in-place
variant can even compete with index-based Heapsort variants (e.g.
Rank-Heapsort [17]) and Relaxed-Weak-Heapsort (n lg n − 0.9n + o(n)
comparisons in the worst case), for which no O(n) bound on the number
of extra bits is known.

Keywords. In-place sorting - heapsort - quicksort - analysis of algorithms

1 Introduction

QuickHeapsort is a combination of Quicksort and Heapsort which was first
described by Cantone and Cincotti [2]. It is based on Katajainen's idea for
Ultimate Heapsort [12]. In contrast to Ultimate Heapsort it does not have any
O(n lg n) bound for the worst-case running time (lg n = log₂ n). Its advantage
is that it is very fast in the average case and hence not only of theoretical
interest. Both algorithms have in common that first the array is partitioned
into two parts. Then in one part a heap is constructed and the elements are
successively extracted. Finally the remaining elements are treated recursively.
The main advantage of this method is that for the sift-down only one comparison
per level is needed, whereas standard Heapsort needs two comparisons per level
(for a description of standard Heapsort see a standard textbook, e.g. [6]).
These two comparisons per level are a severe drawback and one of the reasons
why standard Heapsort cannot compete with Quicksort in practice (of course
there are also other reasons, such as cache behavior). Over time, many
solutions to this problem have appeared, such as Bottom-Up-Heapsort [19] or
MDR-Heapsort [15], [18], which both perform the sift-down by first going down
to some leaf and then searching upward for the correct position. Since one can
expect that the final position of an inserted element is near some leaf, this
is a good heuristic and it leads to provably good results. The difference
between QuickHeapsort and Ultimate Heapsort lies in the choice of the pivot
element for partitioning the array. While for Ultimate Heapsort the pivot is
chosen as the median of the whole array, for QuickHeapsort the pivot is
selected as the median of some smaller sample (e.g. as median of 3 elements).

In [2] the basic version with a fixed index as pivot is analyzed and, together
with the median-of-three version, implemented and compared with other Quicksort
and Heapsort variants. In [8] Edelkamp and Stiegeler compare these variants
with so-called Weak-Heapsort [7] and some modifications of it (e.g.
Relaxed-Weak-Heapsort). Weak-Heapsort beats basic QuickHeapsort with respect to
the number of comparisons; however, it needs O(n) bits of extra space (for
Relaxed-Weak-Heapsort this bound is only conjectured) and hence is not in-place.

We split the analysis of QuickHeapsort into three parts: the partitioning
phases, the heap construction and the heap extraction. This allows us to get
better bounds for the running time, especially when choosing the pivot as
median of a larger sample. It also simplifies the analysis. We introduce some
modifications of QuickHeapsort, too. The first one is in-place and needs
n lg n − 0.03n + o(n) comparisons on average, which is, to the best of our
knowledge, better than any other known in-place Heapsort variant. We also
examine a modification using O(n) bits of extra space, which applies the ideas
of MDR-Heapsort to QuickHeapsort. With this method we can bound the average
number of comparisons by n lg n − 0.997n + o(n). Actually, a complicated,
iterated in-place MergeInsertion uses only n lg n − 1.3n + O(lg n) comparisons
[16]. Unfortunately, for practical purposes this algorithm is not competitive.

Our contributions are as follows: 1. We give a simplified analysis which gives
better bounds than previously known. 2. Our approach yields the first precise
analysis of QuickHeapsort when the pivot element is taken from a larger sample.
3. We give a simple in-place modification of QuickHeapsort which saves 0.75n
comparisons. 4. We give a modification of QuickHeapsort using n extra bits
only, and we can bound the expected number of comparisons. This bound is
better than the previously known bounds for the worst case of Heapsort variants
using O(n lg n) extra bits, for which best and worst case are almost the same.
5. We have implemented QuickHeapsort, and our experiments confirm the
theoretical predictions.

The paper is organized as follows: Sect. 2 briefly describes the basic
QuickHeapsort algorithm together with our first improvement. In Sect. 3 we
analyze the expected running time of QuickHeapsort. Then we introduce some
improvements in Sect. 4 allowing O(n) additional bits. Finally, in Sect. 5, we
present our experimental results comparing the different versions of
QuickHeapsort with other Quicksort and Heapsort variants.

2 QuickHeapsort

A two-layer-min-heap is an array A[1..n] of n elements together with a
partition (G, R) of {1, ..., n} into green and red elements such that for all
g ∈ G, r ∈ R we have A[g] ≤ A[r]. Furthermore, the green elements g satisfy the
heap condition A[g] ≤ min{A[2g], A[2g+1]}, and if g is red, then 2g and 2g+1
are red, too. (The conditions are required to hold only if the indices
involved are in the range from 1 to n.) The green elements are called “green”
because they can be extracted out of the heap without caution, whereas the
“red” elements are blocked. Two-layer-max-heaps are defined analogously. We can
think of a two-layer-heap as a rooted binary tree such that each node is either
green or red. Green nodes satisfy the standard heap condition; children of red
nodes are red. Two-layer-heaps were defined in [12]. In [2] a different
language is used for the same concept (they describe the algorithm in terms of
External Heapsort). Now we are ready to describe the QuickHeapsort algorithm as
it has been proposed in [2]. Most of it can also be found in pseudocode in
App. D.

We intend to sort an array A[1..n]. First, we choose a pivot p. This is the
randomized part of the algorithm. Then, just as in Quicksort, we rearrange the
array according to p. That is, using n − 1 comparisons the partitioning
function returns an index k and rearranges the array A so that A[i] ≥ A[k] for
i < k, A[k] = p, and A[k] ≥ A[j] for k < j. After the partitioning a
two-layer-heap is built out of the elements of the smaller part of the array,
either the part left of the pivot or the part right of the pivot. We call this
smaller part the heap-area and the larger part the work-area. More precisely,
if k − 1 < n − k, then {1, ..., k − 1} is the heap-area and {k + 1, ..., n} is
the work-area. If k − 1 ≥ n − k, then {1, ..., k − 1} is the work-area and
{k + 1, ..., n} is the heap-area. Note that we know the final position of the
pivot element without any further comparison. Therefore, we count it neither to
the heap-area nor to the work-area. If the heap-area is the part of the array
left of the pivot, a two-layer-max-heap is built; otherwise a
two-layer-min-heap is built.

At the beginning the heap-area is an ordinary heap, hence it is a two-layer-

heap consisting of green elements, only. Now the heap extraction phase starts.

We assume that we are in the case of a max-heap. The other case is symmetric.

Let m denote the size of the heap-area. The m elements of the heap-area are

moved to the work-area. The extraction of one element works as follows: the

root of the heap is placed at the current position of the work-area (which at the

beginning is its last position). Then, starting from the root the resulting “hole”

is trickled down: always the larger child is moved up into the vacant position

and then this child is treated recursively. This stops as soon as a leaf is reached.

We call this the SpecialLeaf procedure (Alg. 4.2) according to [2]. Now, the

element which before was at the current position in the work-area is placed as

red element in this hole at the leaf in the heap-area. Finally the current position

in the work-area is moved by one and the next element can be extracted.

The procedure sorts correctly, because after the partitioning it is guaranteed
that all red elements are smaller than all green elements. Furthermore, there
is enough space in the work-area to place all green elements of the heap, since
the heap is always the smaller part of the array. After extracting all green
elements the pivot element is placed at its final position and the remaining
elements are sorted recursively.
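The description above can be turned into a short runnable sketch. The following Python code is only an illustration under several assumptions that are not taken from [2] or from App. D: it uses 0-based indices, a single uniformly random element as pivot, a simple one-pass partition, the standard bottom-up heap construction, and the hypothetical names `quickheapsort`, `_extract_all` and `_sift_down`. It implements the basic variant only and stores no color information; red elements are recognized implicitly because they never win against a green element with a different key.

```python
import operator
import random


def _sift_down(a, base, m, i, before):
    """Sift-down in the heap a[base:base+m]; before(x, y) is True if x
    has to be closer to the root than y."""
    while 2 * i + 1 < m:
        c = 2 * i + 1
        if c + 1 < m and before(a[base + c + 1], a[base + c]):
            c += 1
        if not before(a[base + c], a[base + i]):
            return
        a[base + i], a[base + c] = a[base + c], a[base + i]
        i = c


def _extract_all(a, base, m, dest, step, before):
    """Build a heap on a[base:base+m] and move its m elements to the
    work-area positions dest, dest + step, dest + 2*step, ..."""
    for i in range(m // 2 - 1, -1, -1):
        _sift_down(a, base, m, i, before)
    for _ in range(m):
        incoming = a[dest]          # work-area element, becomes red
        a[dest] = a[base]           # current root goes to the work-area
        hole = 0
        while 2 * hole + 1 < m:     # SpecialLeaf: one comparison per level
            c = 2 * hole + 1
            if c + 1 < m and before(a[base + c + 1], a[base + c]):
                c += 1
            a[base + hole] = a[base + c]
            hole = c
        a[base + hole] = incoming   # red element fills the hole at the leaf
        dest += step


def quickheapsort(a):
    """Sort the list a in place in ascending order (basic variant)."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        # Partition: a[lo:k] > pivot = a[k] >= a[k+1:hi+1], n - 1 comparisons.
        r = random.randint(lo, hi)
        a[r], a[hi] = a[hi], a[r]
        k = lo
        for i in range(lo, hi):
            if a[i] > a[hi]:
                a[i], a[k] = a[k], a[i]
                k += 1
        a[k], a[hi] = a[hi], a[k]
        if k - lo < hi - k:         # heap-area left of the pivot: max-heap
            m = k - lo
            _extract_all(a, lo, m, hi, -1, operator.gt)
            a[k], a[hi - m] = a[hi - m], a[k]   # pivot to its final position
            hi = hi - m - 1
        else:                       # heap-area right of the pivot: min-heap
            m = hi - k
            _extract_all(a, k + 1, m, lo, +1, operator.lt)
            a[k], a[lo + m] = a[lo + m], a[k]   # pivot to its final position
            lo = lo + m + 1
```

Calling `quickheapsort(data)` on a Python list sorts it in place. The recursion on the remaining elements is written as a loop, which is a design choice of this sketch and does not change the sequence of comparisons.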

Actually we can improve the procedure by a simple trick, thereby saving 3n/4
comparisons. Before the heap extraction phase starts in the heap-area with m
elements, we perform at most (m+2)/4 additional comparisons in order to arrange
all pairs of leaves which share a parent such that the left child is not
smaller than its right sibling. Now, in every call of SpecialLeaf, we can save
exactly one comparison, since we do not need to compare two leaves. For a
max-heap we only need to move up the left child and put the right one at the
place of the former left one. Summing up over all heaps during an execution of
standard QuickHeapsort, we invest (n+2t)/4 comparisons in order to save n
comparisons, where t is the number of recursive calls. The expected value of t
is in O(lg n). Hence, we can expect to save 3n/4 + O(lg n) comparisons. We call
this version the improved variant of QuickHeapsort.

3 Analysis of QuickHeapsort

This section contains the main contribution of the paper. We analyze the
number of comparisons. By n we denote the number of elements of an array to be
sorted. We use standard O-notation where O(g), o(g), and ω(g) denote classes
of functions. In our analysis we do not assume any random distribution of the
input, i.e. it is valid for every permutation of the input array.
Randomization is used, however, for pivot selection. With Pr[e] we denote the
probability of some event e. The expected value of a random variable T is
denoted by E[T].

The number of assignments is bounded by some small constant times the number
of comparisons. Let T(n) denote the number of comparisons during QuickHeapsort
on a fixed array of n elements. We are going to split the analysis of
QuickHeapsort into three parts:

1. Partitioning with an expected number of comparisons E[T_part(n)] (average case).

2. Heap construction with at most T_con(n) comparisons (worst case).

3. Heap extraction (sorting phase) with at most T_ext(n) comparisons (worst case).

We analyze the three parts separately and put them together at the end. The
partitioning is the only randomized part of our algorithm. The expected number
of comparisons depends on the selection method for the pivot. For the expected
number of comparisons by QuickHeapsort on the input array we obtain
E[T(n)] ≤ T_con(n) + T_ext(n) + E[T_part(n)].

Theorem 3.1 The expected number E[T(n)] of comparisons by basic resp. improved
QuickHeapsort with pivot as median of p randomly selected elements on a fixed
input array of size n is E[T(n)] ≤ n lg n + cn + o(n) with c as follows:

    p       c_basic   c_improved
    1       +2.72     +1.97
    3       +1.92     +1.17
    f(n)    +0.72     −0.03

Here, f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n, e.g., f(n) = √n, and we assume that
we choose the median of f(n) randomly selected elements in time O(f(n)).

As we see, the selection method for the pivot is very important. However, one
should notice that the bounds for fixed-size samples for pivot selection are
not tight. The proof of these results is postponed to Sect. 3.3. Note that it
is enough to prove the results without the improvement, since the difference is
always 0.75n.
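As a quick cross-check, the constants c in the table can be recomputed from the linear terms of the three bounds that this section derives (Prop. 3.2, Cor. 3.7, and Eq. (4), Eq. (5), Thm. 3.8). The sketch below is only an arithmetic illustration; the names `linear_terms` and `round_up` are hypothetical, and rounding upward to two decimals is an assumption that happens to reproduce the table entries.

```python
import math

T_CON = 1.625        # Prop. 3.2: heap construction, coefficient of n
T_EXT = -2.9139      # Cor. 3.7: heap extraction, coefficient of n besides n lg n
T_PART = {"1": 4.0, "3": 3.2, "f(n)": 2.0}   # Eq. (4), Eq. (5), Thm. 3.8
IMPROVEMENT = 0.75   # saving of the improved variant from Sect. 2


def round_up(c, digits=2):
    """Round an upper bound upward so that it stays an upper bound."""
    scale = 10 ** digits
    return math.ceil(c * scale) / scale


def linear_terms():
    """Map each sample size p to the pair (c_basic, c_improved)."""
    table = {}
    for p, part in T_PART.items():
        basic = T_CON + T_EXT + part
        table[p] = (round_up(basic), round_up(basic - IMPROVEMENT))
    return table
```

For example, the row p = 1 arises as 1.625 − 2.9139 + 4 = 2.7111, which is rounded up to 2.72.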

3.1 Heap Construction

The standard heap construction [9] needs at most 2m comparisons to construct
a heap of size m in the worst case and approximately 1.88m in the average case.
For the mathematical analysis better theoretical bounds can be used. The best
result we are aware of is due to Chen et al. [5]. According to this result we
have T_con(m) ≤ 1.625m + o(m). Earlier results are of similar magnitude: by
[4] it has been known that T_con(m) ≤ 1.632m + o(m), and by [10] it has been
known that T_con(m) ≤ 1.625m + o(m), but Gonnet and Munro used O(m) extra bits
to get this result, whereas the new result of Chen et al. is in-place (using
only O(lg m) extra bits).

During the execution of QuickHeapsort over n elements, every element is part
of a heap only once. Hence, the sizes of all heaps during the entire procedure
sum up to at most n. With the result of [5] the total number of comparisons
performed in the construction of all heaps satisfies:

Proposition 3.2 T_con(n) ≤ 1.625n + o(n).

3.2 Heap Extraction

For a real number r ∈ ℝ with r > 0 we define {r} by the following condition:

\[ r = 2^k + \{r\} \quad\text{with } k \in \mathbb{Z} \text{ and } 0 \le \{r\} < 2^k. \]

This means that 2^k is the largest power of 2 which is less than or equal to r
and {r} is the difference to that power, i.e. {r} = r − 2^⌊lg r⌋. In this
section we first analyze the extraction phase of one two-layer-heap of size m.
After that, we bound the number of comparisons T_ext(n) performed in the worst
case during all heap extraction phases of one execution of QuickHeapsort on an
array of size n. Thm. 3.3 is our central result about heap extraction.

Theorem 3.3 T_ext(n) ≤ n · (⌊lg n⌋ − 3) + 2{n} + O(lg² n).

The proof of Thm. 3.3 covers almost the rest of Section 3.2. In the following,
the height height(v) of an element v in a heap H is the maximal distance from
that node to a leaf below it. The height of H is the height of its root. The
level level(v) of v is defined to be its distance from the root. In this
section we want to count the comparisons during SpecialLeaf procedures only.
Recall that a SpecialLeaf procedure is a cyclic shift on a path from the root
down to some leaf, and the number of comparisons is exactly the length of this
path. Hence the upper bound is the height of the heap. But there is a better
analysis.

Let us consider a heap with m green elements which are all extracted by
SpecialLeaf procedures. The picture is as follows: First, we color the green
root red. Next, we perform a cyclic shift defined by the SpecialLeaf procedure.
In particular, the leaf is now red. Moreover, red positions remain red, but
there is exactly one position v which has changed its color from green to red.
This position v is on the path defined by the SpecialLeaf procedure. Hence,
the number of comparisons needed to color the position v red is bounded by
height(v) + level(v).

The total number of comparisons E(m) to extract all m elements of a heap H is
therefore bounded by

\[ E(m) \le \sum_{v \in H} \bigl(\mathrm{height}(v) + \mathrm{level}(v)\bigr). \]

We have height(H) − 1 ≤ height(v) + level(v) ≤ height(H) = ⌊lg m⌋ for all
v ∈ H. We now count the number of elements v where height(v) + level(v) =
⌊lg m⌋ and the number of elements v where height(v) + level(v) = ⌊lg m⌋ − 1.
Since there are exactly {m} + 1 nodes of level ⌊lg m⌋, there are at most
2{m} + 1 + lg m elements v with height(v) + level(v) = ⌊lg m⌋. All other
elements satisfy height(v) + level(v) = ⌊lg m⌋ − 1. We obtain

\[
\begin{aligned}
E(m) &\le 2\{m\}\cdot\lfloor\lg m\rfloor + (m - 2\{m\})(\lfloor\lg m\rfloor - 1) + O(\lg m) \\
     &= m\cdot(\lfloor\lg m\rfloor - 1) + 2\{m\} + O(\lg m).
\end{aligned}
\tag{1}
\]

Note that this is an estimate of the worst case; however, this analysis also
shows that the best case only differs by O(lg m)-terms from the worst case.
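The counting argument behind Eq. (1) can be checked numerically for small heaps. The following sketch assumes the usual implicit heap layout (node v has children 2v and 2v+1, indices 1..m); the function names are hypothetical and are not taken from the paper. It sums height(v) + level(v) over all nodes and compares the result with the main term m(⌊lg m⌋ − 1) + 2{m}.

```python
def floor_lg(m):
    """floor(lg m) for a positive integer m."""
    return m.bit_length() - 1


def excess(m):
    """{m} = m - 2**floor(lg m)."""
    return m - (1 << floor_lg(m))


def height(v, m):
    """Maximal distance from node v to a leaf below it in a heap of size m."""
    h = 0
    while (v << (h + 1)) <= m:      # leftmost descendant h + 1 levels down exists
        h += 1
    return h


def path_length_sum(m):
    """Sum of height(v) + level(v) over all nodes v = 1..m."""
    return sum(height(v, m) + floor_lg(v) for v in range(1, m + 1))


def main_term(m):
    """m * (floor(lg m) - 1) + 2 * {m}, the main term of Eq. (1)."""
    return m * (floor_lg(m) - 1) + 2 * excess(m)
```

For m = 8 the sum is 20 and the main term is 16, so the gap of 4 equals ⌊lg m⌋ + 1, in line with the O(lg m) term.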

Now, we want to estimate the number of comparisons in the worst case performed
during all heap extraction phases together. During QuickHeapsort over n
elements we create a sequence H_1, ..., H_t of heaps of green elements which
are extracted using the SpecialLeaf procedure. Let m_i = |H_i| be the size of
the i-th heap. The sequence satisfies 2m_i ≤ n − Σ_{j<i} m_j, because heaps are
constructed and extracted on the smaller part of the array.

Here comes a subtle observation: Assume that m_1 + m_2 ≤ n/2. If we replace
the first two heaps with one heap H′ of size |H′| = m_1 + m_2, then the
analysis using the sequence H′, H_3, ..., H_t cannot lead to a better bound.
Continuing this way, we may assume that we have t ∈ O(lg n) and therefore
Σ_{1≤i≤t} O(lg m_i) ⊆ O(lg² n). With Eq. (1) we obtain the bound

\[
T_{\mathrm{ext}}(n) \le \sum_{i=1}^{t} E(m_i)
 = \Bigl(\sum_{i=1}^{t} \bigl(m_i \cdot \lfloor\lg m_i\rfloor + 2\{m_i\}\bigr)\Bigr) - n + O(\lg^2 n).
\tag{2}
\]

Later we will replace the m_i by other positive real numbers. Therefore we
define the following notion. Let 1 ≤ ν ∈ ℝ. We say a sequence x_1, x_2, ..., x_t
with x_i ∈ ℝ_{>0} is valid w.r.t. ν if for all 1 ≤ i ≤ t we have
2x_i ≤ ν − Σ_{j<i} x_j.

As just mentioned, the initial sequence m_1, m_2, ..., m_t is valid w.r.t. n.
Let us define a continuous function F: ℝ_{>0} → ℝ by F(x) = x · ⌊lg x⌋ + 2{x}.
It is continuous since for x = 2^k, k ∈ ℤ, we have
F(x) = xk = lim_{ε→0} ((x − ε)(k − 1) + 2{x − ε}). It is piecewise
differentiable with right derivative ⌊lg x⌋ + 2. Therefore:

Lemma 3.4 Let x ≥ y > δ ≥ 0. Then we have the inequalities

\[ F(x) + F(y) \le F(x+\delta) + F(y-\delta) \quad\text{and}\quad F(x) + F(y) \le F(x+y). \]
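To get a feeling for F and for Lem. 3.4, the two inequalities can be tested on small integer arguments, where F can be evaluated exactly. This is only a sanity check on integers, not a proof, and the helper names are hypothetical.

```python
def floor_lg(x):
    """floor(lg x) for a positive integer x."""
    return x.bit_length() - 1


def excess(x):
    """{x} = x - 2**floor(lg x)."""
    return x - (1 << floor_lg(x))


def F(x):
    """F(x) = x * floor(lg x) + 2 * {x}, restricted to positive integers."""
    return x * floor_lg(x) + 2 * excess(x)


def lemma_3_4_holds(x, y, delta):
    """Check both inequalities of Lem. 3.4 for integers x >= y > delta >= 0."""
    assert x >= y > delta >= 0
    shift = F(x) + F(y) <= F(x + delta) + F(y - delta)
    merge = F(x) + F(y) <= F(x + y)
    return shift and merge
```

For instance, F(3) = 5 and F(1) = 0, while F(4) = 8, so merging the arguments 3 and 1 increases the value of F, as the second inequality states.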

Lemma 3.5 Let 1 ≤ ν ∈ ℝ. For all sequences x_1, x_2, ..., x_t with
x_i ∈ ℝ_{>0} which are valid w.r.t. ν, we have

\[ \sum_{i=1}^{t} F(x_i) \le \sum_{i=1}^{\lfloor\lg\nu\rfloor} F\Bigl(\frac{\nu}{2^i}\Bigr). \]

Proof. The result is true for ν ≤ 2, because then F(x_i) ≤ F(ν/2) ≤ F(1) = 0
for all i. Thus, we may assume ν ≥ 2. We perform induction on t. For t = 1
the statement is clear, since lg ν ≥ 1 and x_1 ≤ ν/2. Now let t > 1. By
Lem. 3.4, we have F(x_1) + F(x_2) ≤ F(x_1 + x_2). Now, if x_1 + x_2 ≤ ν/2,
then the sequence x_1 + x_2, x_3, ..., x_t is valid, too; and we are done by
induction. Hence, we may assume x_1 + x_2 > ν/2. If x_1 ≤ x_2, then

\[ 2x_1 = 2x_2 + 2(x_1 - x_2) \le \nu - x_1 + 2(x_1 - x_2) = \nu - x_2 + x_1 - x_2 \le \nu - x_2. \]

Thus, if x_1 ≤ x_2, then the sequence x_2, x_1, x_3, ..., x_t is valid, too.
Thus, it is enough to consider x_1 ≥ x_2 with x_1 + x_2 > ν/2.

We have ν/2 ≥ 1 and the sequence x′_2, x_3, ..., x_t with
x′_2 = x_1 + x_2 − ν/2 is valid w.r.t. ν/2, because

\[ x'_2 = x_1 + x_2 - \frac{\nu}{2} \le x_1 + \frac{\nu - x_1}{2} - \frac{\nu}{2} = \frac{x_1}{2} \le \frac{\nu}{4}. \]

Therefore, by induction on t and Lem. 3.4 we obtain the claim:

\[
\sum_{i=1}^{t} F(x_i)
 \le F\Bigl(\frac{\nu}{2}\Bigr) + F(x'_2) + \sum_{i=3}^{t} F(x_i)
 \le F\Bigl(\frac{\nu}{2}\Bigr) + \sum_{i=2}^{\lfloor\lg\nu\rfloor} F\Bigl(\frac{\nu}{2^i}\Bigr)
 \le \sum_{i=1}^{\lfloor\lg\nu\rfloor} F\Bigl(\frac{\nu}{2^i}\Bigr).
\]

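Lemma 3.6

\[ \sum_{i=1}^{\lfloor\lg n\rfloor} F\Bigl(\frac{n}{2^i}\Bigr) \le F(n) - 2n + O(\lg n). \]

Proof of Lem. 3.6.

\[
\begin{aligned}
\sum_{i=1}^{\lfloor\lg n\rfloor} F\Bigl(\frac{n}{2^i}\Bigr)
 &= n\lfloor\lg n\rfloor \cdot \sum_{i=1}^{\lfloor\lg n\rfloor} \frac{1}{2^i}
    - n \cdot \sum_{i=1}^{\lfloor\lg n\rfloor} \frac{i}{2^i}
    + 2\{n\} \cdot \sum_{i=1}^{\lfloor\lg n\rfloor} \frac{1}{2^i} \\
 &\le n\lfloor\lg n\rfloor \cdot \sum_{i\ge 1} \frac{1}{2^i}
    - n \cdot \sum_{i\ge 1} \frac{i}{2^i}
    + 2\{n\} \cdot \sum_{i\ge 1} \frac{1}{2^i}
    + \frac{n}{2^{\lfloor\lg n\rfloor}} \cdot \sum_{i>0} \frac{i + \lfloor\lg n\rfloor}{2^i} \\
 &= n\lfloor\lg n\rfloor - 2n + 2\{n\} + O(\lg n).
\end{aligned}
\]

Applying these lemmata to Eq. (2) yields the proof of Thm. 3.3.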
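Lem. 3.6 can also be checked numerically. The sketch below evaluates F on real arguments, using `math.frexp` to obtain ⌊lg x⌋ without rounding problems at powers of two; the function names are hypothetical. It computes the difference between the halving sum and F(n) − 2n.

```python
import math


def floor_lg(x):
    """floor(lg x) for x > 0; frexp gives x = mant * 2**exp, 0.5 <= mant < 1."""
    return math.frexp(x)[1] - 1


def excess(x):
    """{x} = x - 2**floor(lg x)."""
    return x - math.ldexp(1.0, floor_lg(x))


def F(x):
    """F(x) = x * floor(lg x) + 2 * {x} for real x > 0."""
    return x * floor_lg(x) + 2 * excess(x)


def halving_sum(n):
    """Sum of F(n / 2**i) for i = 1 .. floor(lg n)."""
    return sum(F(n / 2 ** i) for i in range(1, floor_lg(n) + 1))


def gap(n):
    """Difference between the halving sum and F(n) - 2n."""
    return halving_sum(n) - (F(n) - 2 * n)
```

For the integers tried with this sketch the gap is the constant 2 (for example, n = 8 gives 10 on the left and 24 − 16 = 8 on the right), which is well within the O(lg n) term allowed by the lemma.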

Corollary 3.7 We have T_ext(n) ≤ n lg n − 2.9139n + O(lg² n).

Proof. By [18, Thm. 1] we have F(n) − 2n ≤ n lg n − 1.9139n. Hence, Cor. 3.7
follows directly from Thm. 3.3.

3.3 Partitioning

In the following T_pivot(n) denotes the number of comparisons required to
choose the pivot element in the worst case; and, as before, E[T_part(n)]
denotes the expected number of comparisons performed during partitioning. We
have the following recurrence:

\[
E[T_{\mathrm{part}}(n)] \le n - 1 + T_{\mathrm{pivot}}(n)
 + \sum_{k=1}^{n} \Pr[\mathrm{pivot} = k] \cdot E[T_{\mathrm{part}}(\max\{k-1,\, n-k\})].
\tag{3}
\]

If we choose the pivot at random, then we obtain by standard methods:

\[
E[T_{\mathrm{part}}(n)] \le n - 1 + \frac{1}{n} \cdot \sum_{k=1}^{n} E[T_{\mathrm{part}}(\max\{k-1,\, n-k\})] \le 4n.
\tag{4}
\]

Similarly, if we choose the pivot with the median-of-three, then we obtain:

\[ E[T_{\mathrm{part}}(n)] \le 3.2n + O(\lg n). \tag{5} \]

The proof of the first part of Thm. 3.1 follows from the above equations,
Thm. 3.3, and Prop. 3.2. Using a growing number of elements (as n grows) as
sample for the pivot selection, we can do better. The second part of Thm. 3.1
follows from Thm. 3.3, Prop. 3.2, and Thm. 3.8.
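The bound 4n in Eq. (4) can be illustrated by solving the recurrence numerically. The sketch below takes the recurrence with equality, sets T_pivot(n) = 0 (a uniformly random pivot costs no comparisons) and uses E(0) = E(1) = 0 as base cases; these choices and the function name are assumptions made for this illustration.

```python
def expected_partition_comparisons(n_max):
    """Return a list E with E[n] solving recurrence (4) with equality
    for n = 0..n_max, where the pivot position is uniform on 1..n."""
    E = [0.0] * (n_max + 1)
    for n in range(2, n_max + 1):
        # After partitioning, the recursion continues on the larger part.
        total = sum(E[max(k - 1, n - k)] for k in range(1, n + 1))
        E[n] = n - 1 + total / n
    return E
```

For example, `expected_partition_comparisons(500)[500] / 500` stays below 4 and approaches it slowly as n grows, consistent with the linear term 4n used for p = 1 in Thm. 3.1.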

Theorem 3.8 Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n. When choosing the pivot as
median of f(n) randomly selected elements in time O(f(n)) (e.g. with the
algorithm of [1]), the expected number of comparisons used in all recursive
calls of partitioning is in 2n + o(n).

Thm. 3.8 is close to a well-known result in [14, Thm. 5] on Quickselect, see

Cor. 3.10. Formally speaking we cannot use it directly, because we deal with

QuickHeapsort, where after partitioning the recursive call is on the larger part.

Because of that, and for the sake of completeness, we give a proof. Moreover, our

proof is elementary and simpler than the one in [14]. The key step is Lem. 3.9.

Its proof is rather standard and also can be found in App. A.

Lemma 3.9 Let 0 < δ < 1/2. If we choose the pivot as median of 2c + 1 elements
such that 2c + 1 ≤ n/2, then we have Pr[pivot ≤ n/2 − δn] < (2c + 1)α^c where
α = 4(1/4 − δ²) < 1.
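To see how fast the bound of Lem. 3.9 decays with the sample size, one can simply evaluate its right-hand side. The helper names below are hypothetical, and the values of δ and c in the usage note are chosen only for illustration.

```python
def alpha(delta):
    """alpha = 4 * (1/4 - delta**2) from Lem. 3.9, for 0 < delta < 1/2."""
    return 4.0 * (0.25 - delta * delta)


def tail_bound(c, delta):
    """Right-hand side (2c + 1) * alpha**c of Lem. 3.9."""
    return (2 * c + 1) * alpha(delta) ** c
```

With δ = 0.1 we get α = 0.96. For c = 10 the bound is larger than 1 and therefore carries no information, whereas for c = 500 it is already below 10^-5. This illustrates why a growing sample size f(n) ∈ ω(1) is needed in the proof of Thm. 3.8.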

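Proof of Thm. 3.8. As an abbreviation, we let E(n) = E[T_part(n)] be the
expected number of comparisons performed during partitioning. We are going to
show that for all ε > 0 there is some D ∈ ℝ such that

\[ E(n) < (2 + \epsilon)n + D. \tag{6} \]

So, we fix some 1 ≥ ε > 0. We choose δ > 0 such that (2 + ε)δ < ε/4. Moreover,
for this proof let µ = (n+1)/2. Positions of possible pivots k with
µ − δn ≤ k ≤ µ + δn form a small fraction of all positions, and they are
located around the median. Nevertheless, applying Lem. 3.9 with
c = f(n) ∈ ω(1) ∩ o(n) yields for all n which are large enough:

\[ \Pr[\mathrm{pivot} < \mu - \delta n] \le (2f(n) + 1) \cdot \alpha^{f(n)} \le \frac{1}{48}\epsilon. \tag{7} \]

The analogous inequality holds for Pr[pivot > µ + δn]. Because
T_pivot(n) ∈ o(n), we have

\[ T_{\mathrm{pivot}}(n) \le \frac{1}{8}\epsilon n \tag{8} \]

for n large enough. Now, we choose n_0 such that Eq. (7) and Eq. (8) hold for
n ≥ n_0 and such that we have (2 + ε)δ + 2/n_0 < ε/4. We set D = E(n_0) + 1.
Hence for n < n_0 the desired result Eq. (6) holds. Now, let n ≥ n_0. From
Eq. (3) we obtain by symmetry:

\[
\begin{aligned}
E(n) \le{} & n - 1 + T_{\mathrm{pivot}}(n)
   + \sum_{k=\lceil\mu-\delta n\rceil}^{\lfloor\mu+\delta n\rfloor} \Pr[\mathrm{pivot} = k] \cdot E(k-1) \\
 & + 2 \sum_{k=\lfloor\mu+\delta n\rfloor+1}^{n} \Pr[\mathrm{pivot} = k] \cdot E(k-1).
\end{aligned}
\]

Since E is monotone, E(k) can be bounded by the highest value in the respective
interval:

\[
\begin{aligned}
E(n) \le{} & n + \frac{1}{8}\epsilon n
   + \Pr[\mu - \delta n \le \mathrm{pivot} \le \mu + \delta n] \cdot E(\lfloor\mu+\delta n\rfloor)
   + 2\Pr[\mathrm{pivot} > \mu + \delta n] \cdot E(n-1) \\
 \le{} & n + \frac{1}{8}\epsilon n
   + \Bigl(1 - \frac{1}{24}\epsilon\Bigr) \cdot E(\lfloor\mu+\delta n\rfloor)
   + 2 \cdot \frac{1}{48}\epsilon \cdot E(n-1).
\end{aligned}
\]

By induction we assume E(k) ≤ (2 + ε)k + D for k < n. Hence:

\[
\begin{aligned}
E(n) &\le n + \frac{1}{8}\epsilon n
   + \Bigl(1 - \frac{1}{24}\epsilon\Bigr) \cdot \bigl((2+\epsilon)\cdot(\mu + \delta n) + D\bigr)
   + \frac{1}{24}\epsilon \cdot \bigl((2+\epsilon)n + D\bigr) \\
 &\le n + (2+\epsilon)\cdot\Bigl(\frac{n+1}{2} + \delta n\Bigr)
   + \frac{1}{8}\epsilon n + \frac{1}{24}\epsilon(2+\epsilon)n + D \\
 &\le 2n + 1 + \frac{\epsilon}{2} + (2+\epsilon)\delta n + \frac{3}{4}\epsilon n + D
   < (2+\epsilon)n + D.
\end{aligned}
\]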

Corollary 3.10 ([14]) Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n. When implementing
Quickselect with the median of f(n) randomly selected elements as pivot, the
expected number of comparisons is 2n + o(n).

Proof. In QuickHeapsort the recursion is always on the larger part of the array.

Hence, the number of comparisons in partitioning for QuickHeapsort is an upper

bound on the number of comparisons in Quickselect.

In [14] it is also proved that choosing the pivot as median of O(√n) elements
is optimal for Quicksort as well as for Quickselect. This suggests that we
choose the same value in QuickHeapsort, which is backed by our experiments.

4 Modifications of QuickHeapsort Using Extra-space

In this section we want to describe some modifications of QuickHeapsort using
n bits of extra storage. We introduce two bit-arrays. In one of them (the
CompareArray), which actually uses two bits per element, we store the
comparisons already done (we need two bits because there are three possible
values to store: right, left, unknown). In the other one (the RedGreenArray)
we store which element is red and which is green.

Since the heaps have maximum size n/2, the RedGreenArray only requires n/2
bits. The CompareArray is only needed for the inner nodes of the heaps, i.e.
length n/4 is sufficient. In total this sums up to n extra bits.

For the heap construction we do not use the algorithms described in Sect. 3.1.
With the CompareArray we can do better by using the algorithm of McDiarmid and
Reed [15]. The heap construction works similarly to Bottom-Up-Heapsort, i.e.
the array is traversed backward, calling the Reheap procedure on i for all
inner positions i. The Reheap procedure takes the subheap with root i and
restores the heap condition if it is violated at the position i. First, the
Reheap procedure determines a special leaf using the SpecialLeaf procedure as
described in Sect. 2, but without moving the elements. Then, the final position
of the former root is determined going upward from the special leaf
(bottom-up-phase). In the end, the elements above this final position are moved
up towards the root by one position. That means that all but one of the
elements which are compared during the bottom-up-phase stay in their places.
Since in the SpecialLeaf procedure these elements have been compared with their
siblings, these comparisons can be stored in the CompareArray and can be used
later.

With another improvement concerning the construction of heaps with seven
elements as in [3], the benefits of this array can be exploited even more.

The RedGreenArray is used during the sorting phase, only. Its functionality

is straightforward: Every time a red element is inserted into the heap, the

corresponding bit is set to red. The SpecialLeaf procedure can stop as soon

as it reaches an element without green children. Whenever a red and a green

element have to be compared, the comparison can be skipped.

Theorem 4.1 Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n, e.g., f(n) = lg n, and let
E[T(n)] be the expected number of comparisons by QuickHeapsort using the
CompareArray with the improvement of [3] and the RedGreenArray on a fixed input
array of size n. Choosing the pivot as median of f(n) randomly selected
elements in time O(f(n)), we have

\[ E[T(n)] \le n \lg n - 0.997n + o(n). \]

Proof. We can analyze the savings by the two arrays separately, because the
CompareArray only affects comparisons between two green elements, while the
RedGreenArray only affects comparisons involving at least one red element.

First, we consider the heap construction using the CompareArray. With this
array we obtain the same worst-case bound as for the standard heap construction
method. However, the CompareArray has the advantage that at the end of the heap
construction many comparisons are stored in the array and can be reused for the
extraction phase. More precisely: for every comparison except the first one
made when going upward from the special leaf, one comparison is stored in the
CompareArray, since for every additional comparison one element on the path
defined by SpecialLeaf stays at its place. Because every pair of siblings has
to be compared at one point during the heap construction or extraction, all
these stored comparisons can be reused. Hence, we only have to count the
comparisons in the SpecialLeaf procedure during the construction plus n/2 for
the first comparison when going upward. Thus, we get an amortized bound for the
comparisons during construction of 3n/2.

In [3] the notion of Fine-Heaps is introduced. A Fine-Heap is a heap with the
additional CompareArray such that for every node the larger child is stored in
the array. Such a Fine-Heap of size m can be constructed using the above method
with 2m comparisons. In [3] Carlsson, Chen and Mattsson showed that a Fine-Heap
of size m actually can be constructed with only (23/12)m + O(lg² m)
comparisons. That means we have to invest (23/12)m + O(lg² m) comparisons for
the heap construction and at the end there are m/2 comparisons stored in the
array. All these comparisons stored in the array are used later. Summing up
over all heaps during an execution of QuickHeapsort, we can save another
(1/12)n comparisons in addition to the comparisons saved by the CompareArray
with the result of [3]. Hence, for the amortized cost of the heap construction
T^amort_con (i.e. the number of comparisons needed to build the heap minus the
number of comparisons stored in the CompareArray after the construction, which
all can be reused later) we have obtained:

Proposition 4.2 T^amort_con(n) ≤ (17/12)n + o(n).

This bound is slightly better than the average case for the heap construction
with the algorithm of [15], which is 1.52n.

Now, we want to count the number of comparisons we save using the
RedGreenArray. We distinguish the two cases that two red elements are compared
and that a red and a green element are compared. Every position in the heap has
to turn red at one point. At that time, all nodes below this position are
already red. Hence, for that element we save as many comparisons as the element
is above the bottom level. Summing over all levels of a heap of size m, the
saving results in ≈ (m/4) · 1 + (m/8) · 2 + ··· = m · Σ_{i≥1} i · 2^{−i−1} = m.
This estimate is exact up to O(lg m)-terms. Since the expected number of heaps
is O(lg n), we obtain for the overall saving the value
T_saveRR(n) = n + O(lg² n).
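The value of the series used in this estimate follows from the derivative of the geometric series; the short derivation below is a standard identity and is added here only for convenience:

\[
\sum_{i\ge 1} i\,x^{i} = \frac{x}{(1-x)^2} \quad (|x| < 1)
\qquad\Longrightarrow\qquad
\sum_{i\ge 1} i\,2^{-i-1} = \frac{1}{2}\cdot\frac{1/2}{(1 - 1/2)^2} = \frac{1}{2}\cdot 2 = 1 .
\]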

Another place where we save comparisons with the RedGreenArray is when a red
element is compared with a green element. For every inner node it occurs at
least once that we compare a red child with a green child, namely when the node
loses its last green child. Hence, we save at least as many comparisons as
there are inner nodes with two children, i.e. at least m/2 − 1. Since every
element, except the expected O(lg n) pivot elements, is part of a heap exactly
once, we save at least T_saveRG(n) ≥ n/2 + O(lg n) comparisons when comparing
green with red elements. In the average case the saving might be even slightly
higher, since comparisons can also be saved when a node does not lose its last
green child.

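Summing up all our savings and using the median of f(n) ∈ ω(1) ∩ o(n) elements
as pivot we obtain the proof of Thm. 4.1:

\[
\begin{aligned}
E[T(n)] &\le T^{\mathrm{amort}}_{\mathrm{con}}(n) + T_{\mathrm{ext}}(n) + E[T_{\mathrm{part}}(n)]
           - T_{\mathrm{saveRR}}(n) - T_{\mathrm{saveRG}}(n) \\
        &\le \frac{17}{12}n + n\cdot(\lfloor\lg n\rfloor - 3) + 2\{n\} + 2n - \frac{3n}{2} + o(n) \\
        &\le n\lg n - 0.997n + o(n).
\end{aligned}
\]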
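As a worked check of the last step, one can insert the estimate of Cor. 3.7 for the extraction term, i.e. n(⌊lg n⌋ − 3) + 2{n} ≤ n lg n − 2.9139n, and collect the linear terms:

\[
\frac{17}{12}n + \bigl(n\lg n - 2.9139n\bigr) + 2n - n - \frac{n}{2}
 = n\lg n + \bigl(1.4167 - 2.9139 + 0.5\bigr)n
 \approx n\lg n - 0.9972n ,
\]

which is at most n lg n − 0.997n, the constant stated in Thm. 4.1.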

5 Experimental Results and Conclusion

In Fig. 1 we present the number of comparisons of the different versions of
QuickHeapsort we considered in this paper, i.e. the basic version, the improved
variant of Sect. 2, and the version using bit-arrays (however, without the
modification by [3]) for different values of n. We compare them with Quicksort,
Ultimate Heapsort, Bottom-Up-Heapsort and MDR-Heapsort. All algorithms are
implemented with median of √n elements as pivot (for Quicksort we show
additionally the data with median of 3). For the heap construction we
implemented the normal algorithm due to Floyd [9] as well as the algorithm
using the extra bit-array (which is the same as in MDR-Heapsort).

[Figure 1 (plot not reproduced). Horizontal axis: n, from 10^3 to 10^6.
Vertical axis: (#comparisons − n lg n)/n, with ticks from −2 to 6. Curves:
Quicksort with Median of 3, Quicksort with Median of √n, Basic QuickHeapsort,
Improved QuickHeapsort, QuickHeapsort with bit-arrays, MDR-Heapsort,
Ultimate-Heapsort, Lower Bound.]

Figure 1: Average number of comparisons of QuickHeapsort implemented with
median of √n compared with other algorithms

More results with other pivot selection strategies are in Table 2 and Table 3
in App. B, confirming that a sample size of √n is optimal for pivot selection
with respect to the number of comparisons and also that the o(n)-terms in
Thm. 3.1 and Thm. 3.8 are not too big. In Table 1 in App. B we present actual
running times of the different algorithms for n = 1000000. All the numbers,
except the running times, are average values over 100 runs with random data. As
our theoretical estimates predict, QuickHeapsort with bit-arrays beats all
other variants including Relaxed-Weak-Heapsort (see Table 2, App. B) when
implemented with median of √n for pivot selection. It also performs
326728 ≈ 0.33 · 10^6 comparisons fewer than our theoretical prediction, which
is 10^6 · lg(10^6) − 0.9139 · 10^6 ≈ 19017569 comparisons.

In this paper we have shown that with known techniques QuickHeapsort can be
implemented with an expected number of comparisons less than
n lg n − 0.03n + o(n) and extra storage O(1). On the other hand, using n extra
bits we can improve this to n lg n − 0.997n + o(n), i.e. we showed that
QuickHeapsort can compete with the most advanced Heapsort variants. These
theoretical estimates were also confirmed by our experiments. We also
considered different pivot selection schemes. For any constant-size sample for
pivot selection, QuickHeapsort beats Quicksort for large n, since Quicksort has
an expected running time of ≈ Cn lg n with C > 1. However, when choosing the
pivot as median of √n elements (i.e. with the optimal strategy), our
experiments show that Quicksort needs fewer comparisons than QuickHeapsort.
With bit-arrays, however, QuickHeapsort is the winner again. In order to make
the last statement rigorous, better theoretical bounds for Quicksort with
sampling √n elements are needed. For future work it would also be of interest
to prove the optimality of √n elements for pivot selection in QuickHeapsort, to
estimate the lower-order terms of the average running time of QuickHeapsort,
and also to find an exact average-case analysis for the saving by the
bit-arrays.

Acknowledgements.

We thank Martin Dietzfelbinger, Stefan Edelkamp and Jyrki Katajainen for their helpful comments. We thank Simon Paridon for implementing the algorithms for our experiments.

References

[1] M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest, and R. E. Tarjan. Time bounds for selection. J. Comput. Syst. Sci., 7(4):448–461, 1973.

[2] D. Cantone and G. Cincotti. QuickHeapsort, an efficient mix of classical sorting algorithms. Theor. Comput. Sci., 285(1):25–42, 2002.

[3] S. Carlsson, J. Chen, and C. Mattsson. Heaps with Bits. In D.-Z. Du and X.-S. Zhang, editors, ISAAC, volume 834 of LNCS, pages 288–296. Springer, 1994.

[4] J. Chen. A Framework for Constructing Heap-like structures in-place. In K.-W. Ng et al., editors, ISAAC, volume 762 of LNCS, pages 118–127. Springer, 1993.

[5] J. Chen, S. Edelkamp, A. Elmasry, and J. Katajainen. In-place Heap Construction with Optimized Comparisons, Moves, and Cache Misses. In B. Rovan, V. Sassone, and P. Widmayer, editors, MFCS, volume 7464 of LNCS, pages 259–270. Springer, 2012.

[6] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 3rd edition, 2009.

[7] R. D. Dutton. Weak-heap sort. BIT, 33(3):372–381, 1993.

[8] S. Edelkamp and P. Stiegeler. Implementing HEAPSORT with n lg n − 0.9n and QUICKSORT with n lg n + 0.2n comparisons. ACM J. of Exp. Alg., 7:5, 2002.

[9] R. W. Floyd. Algorithm 245: Treesort. Commun. ACM, 7(12):701, 1964.

[10] G. H. Gonnet and J. I. Munro. Heaps on Heaps. SIAM J. Comput., 15(4):964–971, 1986.

[11] K. Kaligosi and P. Sanders. How Branch Mispredictions Affect Quicksort. In Y. Azar and T. Erlebach, editors, ESA, volume 4168 of LNCS, pages 780–791. Springer, 2006.

[12] J. Katajainen. The Ultimate Heapsort. In X. Lin, editor, CATS, volume 20 of Australian Computer Science Communications, pages 87–96. Springer-Verlag, 1998.

[13] D. E. Knuth. The art of computer programming. Vol. 3. Addison-Wesley, 1998.

[14] C. Martínez and S. Roura. Optimal Sampling Strategies in Quicksort and Quickselect. SIAM J. Comput., 31(3):683–705, 2001.

[15] C. McDiarmid and B. A. Reed. Building Heaps Fast. J. Alg., 10(3):352–365, 1989.

[16] K. Reinhardt. Sorting in-place with a worst case complexity of n lg n − 1.3n + O(lg n) comparisons and εn lg n + O(1) transports. In T. Ibaraki et al., editors, ISAAC, volume 650 of LNCS, pages 489–498. Springer, 1992.

[17] X.-D. Wang and Y.-J. Wu. An Improved HEAPSORT Algorithm with n lg n − 0.788928n Comparisons in the Worst Case. J. of Comput. Sci. and Techn., 22:898–903, 2007.

[18] I. Wegener. The Worst Case Complexity of McDiarmid and Reed's Variant of Bottom-Up-Heap Sort is Less Than n lg n + 1.1n. In C. Choffrut and M. Jantzen, editors, STACS, volume 480 of LNCS, pages 137–147. Springer, 1991.

[19] I. Wegener. BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT, beating, on an average, QUICKSORT (if n is not very small). Theor. Comput. Sci., 118(1):81–98, 1993.

APPENDIX

A Proofs

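Proof of Lem. 3.4. Since the right derivative is monotonically increasing we have:

\[ F(x+\delta) - F(x) = \int_{x}^{x+\delta} F'(t)\,dt \ge F'(x)\cdot\delta = (\lfloor \lg x \rfloor + 2)\,\delta \]

and

\[ F(y) - F(y-\delta) = \int_{y-\delta}^{y} F'(t)\,dt \le F'(y)\cdot\delta = (\lfloor \lg y \rfloor + 2)\,\delta. \]

This yields:

\[ F(y) - F(y-\delta) \le (\lfloor \lg y \rfloor + 2)\,\delta \le (\lfloor \lg x \rfloor + 2)\,\delta \le F(x+\delta) - F(x). \]

By adding $F(x) + F(y-\delta)$ on both sides we obtain the first claim of Lem. 3.4. Note that $\lim_{\varepsilon \to 0} F(\varepsilon) = 0$. Hence the second claim follows from the first by considering the limit $\delta \to y$.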

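Proof of Lem. 3.9. First note that the probability for choosing the k-th element as pivot satisfies

\[ \binom{n}{2c+1} \cdot \Pr[\,\text{pivot} = k\,] = \binom{k-1}{c}\binom{n-k}{c}. \]

We use the notation of falling factorial $x^{\underline{\ell}} = x \cdots (x-\ell+1)$. Thus, $\binom{x}{\ell} = \frac{x^{\underline{\ell}}}{\ell!}$.

\[
\begin{aligned}
\Pr[\,\text{pivot} = k\,] &= \frac{(2c+1)! \cdot (k-1)^{\underline{c}} \cdot (n-k)^{\underline{c}}}{(c!)^2 \cdot n^{\underline{2c+1}}} \\
&= \binom{2c}{c}(2c+1)\,\frac{1}{(n-2c)} \prod_{i=0}^{c-1} \frac{(k-1-i)(n-k-i)}{(n-2i-1)(n-2i)}.
\end{aligned}
\]

For $k \le c$ we have $\Pr[\,\text{pivot} = k\,] = 0$. So, let $c < k \le \frac{n}{2} - \delta n$ and let us consider an index i in the product with $0 \le i < c$.

\[
\begin{aligned}
\frac{(k-1-i)(n-k-i)}{(n-2i-1)(n-2i)} &\le \frac{(k-i)(n-k-i)}{(n-2i)(n-2i)} \\
&= \frac{\left(\left(\frac{n}{2}-i\right) - \left(\frac{n}{2}-k\right)\right) \cdot \left(\left(\frac{n}{2}-i\right) + \left(\frac{n}{2}-k\right)\right)}{(n-2i)^2} \\
&= \frac{\left(\frac{n}{2}-i\right)^2 - \left(\frac{n}{2}-k\right)^2}{(n-2i)^2} \\
&\le \frac{1}{4} - \frac{\left(\frac{n}{2} - \left(\frac{n}{2}-\delta n\right)\right)^2}{n^2} = \frac{1}{4} - \delta^2.
\end{aligned}
\]

We have $\binom{2c}{c} \le 4^c$. Since $2c+1 \le \frac{n}{2}$, we obtain:

\[ \Pr[\,\text{pivot} = k\,] \le 4^c (2c+1)\,\frac{1}{(n-2c)} \left(\frac{1}{4} - \delta^2\right)^c < (2c+1)\,\frac{2}{n}\,\alpha^c. \]

Now, we obtain the desired result.

\[ \Pr\left[\,\text{pivot} \le \frac{n}{2} - \delta n\,\right] < \sum_{k=0}^{\lfloor \frac{n}{2} - \delta n \rfloor} (2c+1)\,\frac{2}{n}\,\alpha^c \le (2c+1)\,\alpha^c \]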
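The two closed forms for Pr[pivot = k] used in this proof can be cross-checked numerically on small instances. The following Python sketch does this with exact rational arithmetic; the function names are made up for this illustration, and the check assumes 1 ≤ k ≤ n and n ≥ 2c + 1.

```python
from fractions import Fraction
from math import comb


def pivot_probability(n, c, k):
    """Pr[pivot = k] for the median of 2c+1 sampled elements, binomial form."""
    return Fraction(comb(k - 1, c) * comb(n - k, c), comb(n, 2 * c + 1))


def pivot_probability_product(n, c, k):
    """The same probability, written in the product form of the proof."""
    result = Fraction(comb(2 * c, c) * (2 * c + 1), n - 2 * c)
    for i in range(c):
        result *= Fraction((k - 1 - i) * (n - k - i),
                           (n - 2 * i - 1) * (n - 2 * i))
    return result


def per_pivot_bound(n, c, delta):
    """Right-hand side 4^c (2c+1) (1/(n-2c)) (1/4 - delta^2)^c from the proof."""
    return 4**c * (2 * c + 1) * Fraction(1, n - 2 * c) * (Fraction(1, 4) - delta**2)**c
```

For example, with n = 41, c = 3 and δ = 1/10 the two forms agree for every k, the probabilities sum to 1, and each k with c < k ≤ n/2 − δn stays below `per_pivot_bound(41, 3, Fraction(1, 10))`.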

B More Experimental Results

In Table 1 we present actual running times of the different algorithms for n = 1000000 with two different comparison functions (the numbers displayed here are averages over 10 runs with random data). One of them is the normal integer comparison; the other one first applies the logarithm four times to both operands before comparing them. As in [8], this simulates expensive comparisons.
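A comparison function of this kind could look as follows in Python. This is a hedged sketch only: the names `lg4` and `expensive_less` are made up here, base-2 logarithms are an assumption suggested by the lg notation, and the operands must be greater than 4 so that all four logarithms are defined.

```python
import math


def lg4(x):
    """Apply the base-2 logarithm four times (defined for x > 4)."""
    for _ in range(4):
        x = math.log2(x)
    return x


def expensive_less(x, y):
    """Compare x and y through lg4; lg4 is increasing, so the order is kept."""
    return lg4(x) < lg4(y)
```

Because of floating-point rounding, very large operands that lie close together may compare as equal under such a function, so a real benchmark would have to choose its key range accordingly.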

In Table 2 all algorithms are implemented with median of 3 and with median of √n elements as pivot. We compare them with Quicksort implemented with the same pivot selection strategies, Ultimate Heapsort, Bottom-Up-Heapsort and MDR-Heapsort. In Table 2 we also added the values for Relaxed-Weak-Heapsort which were presented in [8].

Table 1: Running times for QuickHeapsort and other algorithms tested on 10^6 elements, average over 10 runs

| Sorting algorithm | integer data, time [s] | lg^(4)-test-function, time [s] |
|---|---|---|
| Basic QuickHeapsort, median of 3 | 0.1154 | 4.21 |
| Basic QuickHeapsort, median of √n | 0.1171 | 4.109 |
| Improved QHS, median of 3 | 0.1073 | 4.049 |
| Improved QHS, median of √n | 0.1118 | 3.911 |
| QHS with bit-arrays, median of 3 | 0.1581 | 3.756 |
| QHS with bit-arrays, median of √n | 0.164 | 3.7 |
| Quicksort with median of 3 | 0.1181 | 3.946 |
| Quicksort with median of √n | 0.1316 | 3.648 |
| Ultimate Heapsort | 0.135 | 5.109 |
| Bottom-Up-Heapsort | 0.1677 | 4.132 |
| MDR-Heapsort | 0.2596 | 4.129 |

We also compare the different pivot selection strategies on the basic QuickHeapsort with no modifications. We test sample sizes of one, three, approximately $\lg n$, $\sqrt[4]{n}$, $\sqrt{n/\lg n}$, $\sqrt{n}$, and $n^{3/4}$ for the pivot selection.

In Table 3 the average number of comparisons and the standard deviations are listed. We ran the algorithms on arrays of length 10000 and one million.


Table 2: QuickHeapsort and other algorithms tested on 10^6 elements (the data for Relaxed-Weak-Heapsort is taken from [8]).

| Sorting algorithm | Average number of comparisons for n = 10^6 |
|---|---|
| Basic QuickHeapsort with median of 3 | 21327478 |
| Basic QuickHeapsort with median of √n | 20783631 |
| Improved QuickHeapsort, median of 3 | 20639046 |
| Improved QuickHeapsort, median of √n | 20135688 |
| QuickHeapsort with bit-arrays, median of 3 | 19207289 |
| QuickHeapsort with bit-arrays, median of √n | 18690841 (best result) |
| Quicksort with median of 3 | 21491310 |
| Quicksort with median of √n | 19548149 |
| Bottom-Up-Heapsort | 20294866 |
| MDR-Heapsort | 20001084 |
| Relaxed-Weak-Heapsort | 18951425 |
| Lower Bound: lg n! | 18488884 ≈ lg(10^6!) |

The displayed data is the average and the standard deviation, respectively, of 100 runs of QuickHeapsort with the respective pivot selection strategy.

These results are not very surprising: the larger the samples get, the smaller the standard deviation is. The average number of comparisons reaches its minimum with a sample size of approximately √n elements. One notices that the difference in the average number of comparisons is relatively small, especially between the different pivot selection strategies with non-constant sample sizes. This confirms experimentally that the o(n)-terms in Thm. 3.1 and Thm. 3.8 are not too big.

Table 3: Different strategies for pivot selection for basic QuickHeapsort tested on 10^4 and 10^6 elements. The standard deviation of our experiments is given in percent of the average number of comparisons.

| Sample size | Average number of comparisons (n = 10^4) | Standard deviation [%] (n = 10^4) | Average number of comparisons (n = 10^6) | Standard deviation [%] (n = 10^6) |
|---|---|---|---|---|
| 1 | 152573 | 4.281 | 21975912 | 3.452 |
| 3 | 146485 | 2.169 | 21327478 | 1.494 |
| ∼ $\lg n$ | 143669 | 0.954 | 20945889 | 0.525 |
| ∼ $\sqrt[4]{n}$ | 143620 | 0.857 | 20880430 | 0.352 |
| ∼ $\sqrt{n/\lg n}$ | 142634 | 0.413 | 20795986 | 0.315 |
| ∼ $\sqrt{n}$ | 142642 | 0.305 | 20783631 | 0.281 |
| ∼ $n^{3/4}$ | 147134 | 0.195 | 20914822 | 0.168 |


C Some Words about the Worst Case Running Time

Obviously the worst case running time depends on how the pivot element is chosen. If just one random element is used as pivot, we get the same quadratic worst case running time as for Quicksort. However, the probability of running into such a "bad case" is not higher in QuickHeapsort than in Quicksort, since any choice of pivot elements leading to a worst case scenario in QuickHeapsort also yields the worst case for Quicksort.

If we choose the pivot element as median of approximately 2 lg n elements, we get a worst case running time of $O\!\left(\frac{n^2}{\lg n}\right)$, i.e. for the worst case it makes almost no difference whether the pivot is selected as median of 2 lg n elements or just as one random element.

However, if we use approximately $\frac{n}{\lg n}$ elements as sample for the pivot selection, we can get a better bound on the worst case.

Let $f : \mathbb{N} \to \mathbb{N}_{\ge 1}$ be some monotonically growing function with $f \in o(n)$ (e.g. $f(n) = \lg n$). We can apply the ideas of the Median of Medians algorithm [1]: First we choose $\frac{n}{f(n)}$ random elements, then we group them into groups of five elements each. The median of each group can be determined with six comparisons [13, p. 215]. Now, the median of these medians can be computed using Quickselect. We assume that Quickselect is implemented with the same strategy for pivot selection. That means we get the same recurrence relations for the worst case complexity of the partitioning-phases in QuickHeapsort and for the worst case of Quickselect:

\[ T(n) = n + \frac{6n}{5f(n)} + T\!\left(\frac{n}{5f(n)}\right) + T\!\left(n - \frac{3n}{10f(n)}\right). \]

This yields $T(n) \le c \cdot n \cdot f(n)$ for some c large enough. Hence with this pivot selection strategy, we reach a worst case running time for QuickHeapsort of $n \lg n + O(n f(n))$ and, if $f(n) \in \omega(1)$, the average running time as stated in Sect. 3.
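One way to see the bound on T(n) is an induction step of the following shape. This is only a sketch: it ignores rounding and base cases, uses that f is monotone with f ≥ 1, and the threshold c ≥ 22 is just one choice that makes the inequalities below go through, not a constant taken from the analysis in this paper.

\[
\begin{aligned}
T(n) &\le n + \frac{6n}{5f(n)} + c\,\frac{n}{5f(n)}\,f\!\left(\frac{n}{5f(n)}\right) + c\left(n - \frac{3n}{10f(n)}\right) f\!\left(n - \frac{3n}{10f(n)}\right) \\
&\le \frac{11}{5}\,n + \frac{c}{5}\,n + c\,n\,f(n) - \frac{3c}{10}\,n \\
&= c\,n\,f(n) + \left(\frac{11}{5} - \frac{c}{10}\right) n \;\le\; c\,n\,f(n) \qquad \text{for } c \ge 22.
\end{aligned}
\]

The second line uses $f\!\left(\frac{n}{5f(n)}\right) \le f(n)$ and $f\!\left(n - \frac{3n}{10f(n)}\right) \le f(n)$, which hold because f is monotone.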

Driving this strategy to the end and choosing f(n) = 1 leads to Ultimate Heapsort (or rather a slight modification of it, and Quickselect turns into the Median of Medians algorithm). Then T(n) ∈ O(n), and we obtain n lg n + O(n) for the worst case of QuickHeapsort. However, our bound for the average case does not hold anymore.
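The group step of this strategy relies on finding the median of five elements with six comparisons. A well-known way to do this is sketched below in Python; it is an illustration and not necessarily the method of [13], and the names `median_of_five` and `comparisons_used` are made up here.

```python
from itertools import permutations


def median_of_five(values, less):
    """Return the median of five values using exactly six calls to less."""
    a, b, c, d, e = values
    if less(b, a):                 # 1: order the first pair, a <= b
        a, b = b, a
    if less(d, c):                 # 2: order the second pair, c <= d
        c, d = d, c
    if less(c, a):                 # 3: let (a, b) be the pair with the smaller minimum
        a, b, c, d = c, d, a, b
    # a is below b, c and d, so it cannot be the median: replace it by e.
    a = e
    if less(b, a):                 # 4: order the new pair, a <= b
        a, b = b, a
    if less(c, a):                 # 5: again let (a, b) have the smaller minimum
        a, b, c, d = c, d, a, b
    # a is the smallest of the four remaining values; the median is min(b, c).
    return b if less(b, c) else c  # 6


def comparisons_used(values):
    """Return (median, number of comparisons) for one input of five values."""
    count = 0

    def less(x, y):
        nonlocal count
        count += 1
        return x < y

    return median_of_five(values, less), count
```

Running `comparisons_used` over all 120 orderings produced by `permutations(range(5))` gives the median 2 with six comparisons each time.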

In order to obtain an n lg n + O(n)-bound for the worst case without losing our bound for the average case, we can apply a simple trick: Whenever after the partitioning it turns out that the pivot does not lie in the interval $\{\frac{n}{4}, \dots, \frac{3n}{4}\}$, we switch to Ultimate Heapsort. This immediately yields the worst case bound of n lg n + O(n). Moreover, the proof of Thm. 3.8 can easily be changed in order to deal with this modification: Let C · n be the worst case number of comparisons for pivot selection and partitioning in Ultimate Heapsort. We can change Eq. (7) to

\[ \Pr[\,\text{pivot} < \mu - \delta n\,] \le \frac{1}{8C}\,\epsilon. \]

Then, the rest of the proof is exactly the same. Hence, Thm. 3.8 and Thm. 3.1 are also valid when switching to Ultimate Heapsort in the case of a 'bad' choice of the pivot.


D Pseudocode of Basic QuickHeapsort

Algorithm 4.1

procedure QuickHeapsort(A[1..n])
begin
    if n > 1 then
        p := ChoosePivot;
        k := PartitionReverse(A[1..n], p);
        if k ≤ n/2 then
            TwoLayerMaxHeap(A[1..n], k − 1);    (* heap-area: {1..k−1} *)
            swap(A[k], A[n − k + 1]);
            QuickHeapsort(A[1..n − k]);         (* recursion *)
        else
            TwoLayerMinHeap(A[1..n], n − k);    (* heap-area: {k+1..n} *)
            swap(A[k], A[n − k + 1]);
            QuickHeapsort(A[(n − k + 2)..n]);   (* recursion *)
        endif
    endif
endprocedure

The ChoosePivot function returns an element p of the array chosen as pivot. The PartitionReverse function returns an index k and rearranges the array A so that p = A[k], A[i] ≥ A[k] for i < k and A[i] ≤ A[k] for i > k using n − 1 comparisons.

Algorithm 4.2

function SpecialLeaf(A[1..m]):
begin
    i := 1;
    while 2i ≤ m do    (* i.e. while i is not a leaf *)
        if 2i + 1 ≤ m and A[2i + 1] > A[2i] then
            A[i] := A[2i + 1];
            i := 2i + 1;
        else
            A[i] := A[2i];
            i := 2i;
        endif
    endwhile
    return i;
endfunction

Algorithm 4.3

procedure TwoLayerMaxHeap(A[1..n], m)
begin
    ConstructHeap(A[1..m]);
    for i := 1 to m do
        temp := A[n − i + 1];
        A[n − i + 1] := A[1];
        j := SpecialLeaf(A[1..m]);
        A[j] := temp;
    endfor
endprocedure

The procedure TwoLayerMinHeap is symmetric to TwoLayerMaxHeap, so we

do not present its pseudocode here.
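For readers who want to experiment, Algorithms 4.1 to 4.3 can be transcribed into Python roughly as follows. This is a sketch under several assumptions, not the implementation used for the experiments: the pivot is a single random element, the heap is built bottom-up by a plain sift-down, the recursion is turned into a loop over the range a[lo:hi], and the min-heap of the symmetric case is stored mirrored at the right end of the range. All helper names (`_sift_down`, `_two_layer_heap`, `pos`, `before`) are introduced here for illustration.

```python
import random


def _sift_down(a, pos, m, i, before):
    """Restore heap order below node i in a heap of m nodes (1-based nodes)."""
    while 2 * i <= m:
        c = 2 * i
        if c + 1 <= m and before(a[pos(c + 1)], a[pos(c)]):
            c += 1
        if not before(a[pos(c)], a[pos(i)]):
            break
        a[pos(i)], a[pos(c)] = a[pos(c)], a[pos(i)]
        i = c


def _special_leaf(a, pos, m, before):
    """Algorithm 4.2: pull the preferred child up along one path; return the leaf."""
    i = 1
    while 2 * i <= m:
        c = 2 * i
        if c + 1 <= m and before(a[pos(c + 1)], a[pos(c)]):
            c += 1
        a[pos(i)] = a[pos(c)]
        i = c
    return i


def _two_layer_heap(a, pos, m, targets, before):
    """Algorithm 4.3, generalised: move the m heap elements to targets in order."""
    for i in range(m // 2, 0, -1):        # bottom-up heap construction
        _sift_down(a, pos, m, i, before)
    for t in targets:
        temp = a[t]
        a[t] = a[pos(1)]                  # current root goes to its final place
        j = _special_leaf(a, pos, m, before)
        a[pos(j)] = temp                  # displaced element fills the freed leaf


def _partition_reverse(a, lo, hi):
    """Put larger elements before a random pivot of a[lo:hi]; return its index."""
    r = random.randrange(lo, hi)
    a[lo], a[r] = a[r], a[lo]
    pivot, p = a[lo], lo
    for i in range(lo + 1, hi):           # hi - lo - 1 comparisons
        if a[i] > pivot:
            p += 1
            a[p], a[i] = a[i], a[p]
    a[lo], a[p] = a[p], a[lo]
    return p


def quick_heapsort(a):
    """Sort the list a in place (ascending); iterative form of Algorithm 4.1."""
    lo, hi = 0, len(a)
    while hi - lo > 1:
        p = _partition_reverse(a, lo, hi)
        larger, smaller = p - lo, hi - 1 - p
        if larger < smaller:              # corresponds to k <= n/2
            _two_layer_heap(a, lambda i, b=lo: b + i - 1, larger,
                            range(hi - 1, hi - 1 - larger, -1),
                            lambda x, y: x > y)
            a[p], a[hi - larger - 1] = a[hi - larger - 1], a[p]
            hi -= larger + 1
        else:                             # min-heap stored mirrored at the right end
            _two_layer_heap(a, lambda i, e=hi: e - i, smaller,
                            range(lo, lo + smaller),
                            lambda x, y: x < y)
            a[p], a[lo + smaller] = a[lo + smaller], a[p]
            lo += smaller + 1
```

Calling `quick_heapsort(data)` on a list sorts it in place. The comparison counts of this sketch are not meant to reproduce the numbers in the tables, since pivot selection and heap construction are simplified.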
