arXiv:1209.4214v2 [cs.DS] 6 Mar 2013
QuickHeapsort: Modifications and Improved Analysis
Volker Diekert, Armin Weiß
Universität Stuttgart, FMI
Universitätsstraße 38
D-70569 Stuttgart, Germany
{diekert,weiss}@fmi.uni-stuttgart.de
March 7, 2013
Abstract
We present a new analysis for QuickHeapsort splitting it into the analysis
of the partition-phases and the analysis of the heap-phases. This enables
us to consider samples of non-constant size for the pivot selection and
leads to better theoretical bounds for the algorithm.
Furthermore we introduce some modifications of QuickHeapsort, both
in-place and using n extra bits. We show that on every input the expected
number of comparisons is n lg n − 0.03n + o(n) (in-place) respectively
n lg n − 0.997n + o(n) (throughout, lg n = log₂ n). Both estimates improve
the previously known best results. (It is conjectured [19] that the in-place
algorithm Bottom-Up-Heapsort uses at most n lg n + 0.4n comparisons on average,
and for Weak-Heapsort, which uses n extra bits, the average number of
comparisons is at most n lg n − 0.42n [8].) Moreover, our non-in-place variant
can even compete with index based Heapsort variants (e.g. Rank-Heapsort
[17]) and Relaxed-Weak-Heapsort (n lg n − 0.9n + o(n) comparisons in the
worst case) for which no O(n) bound on the number of extra bits is known.
Keywords. In-place sorting · Heapsort · Quicksort · analysis of algorithms
1 Introduction
QuickHeapsort is a combination of Quicksort and Heapsort which was first
described by Cantone and Cincotti [2]. It is based on Katajainen's idea for
Ultimate Heapsort [12]. In contrast to Ultimate Heapsort it does not have an
O(n lg n) bound for the worst case running time (lg n = log₂ n). Its advantage is
that it is very fast in the average case and hence not only of theoretical interest.
Both algorithms have in common that first the array is partitioned into two
parts. Then in one part a heap is constructed and the elements are successively
extracted. Finally the remaining elements are treated recursively. The main
advantage of this method is that the sift-down needs only one comparison per
level, whereas standard Heapsort needs two comparisons per level (for
a description of standard Heapsort see some standard textbook, e.g. [6]). This
is a severe drawback and one of the reasons why standard Heapsort cannot
compete with Quicksort in practice (of course there are also other reasons like
cache behavior). Over time a lot of solutions to this problem appeared, like
Bottom-Up-Heapsort [19] or MDR-Heapsort [15, 18], which both perform the
sift-down by first going down to some leaf and then searching upward for the
correct position. Since one can expect that the final position of an inserted
element is near some leaf, this is a good heuristic and it leads to provably
good results. The difference between QuickHeapsort and Ultimate Heapsort lies
in the choice of the pivot element for partitioning the array. While for Ultimate
Heapsort the pivot is chosen as median of the whole array, for QuickHeapsort
the pivot is selected as median of some smaller sample (e.g. as median of 3
elements).
In [2] the basic version with fixed index as pivot is analyzed and – together
with the median-of-three version – implemented and compared with other Quicksort
and Heapsort variants. In [8] Edelkamp and Stiegeler compare these variants
with so-called Weak-Heapsort [7] and some modifications of it (e.g. Relaxed-
Weak-Heapsort). Weak-Heapsort beats basic QuickHeapsort with respect to
the number of comparisons; however, it needs O(n) bits of extra space (for
Relaxed-Weak-Heapsort this bound is only conjectured), hence it is not in-place.
We split the analysis of QuickHeapsort into three parts: the partitioning
phases, the heap construction and the heap extraction. This allows us to get
better bounds for the running time, especially when choosing the pivot as median
of a larger sample. It also simplifies the analysis. We introduce some modifications
of QuickHeapsort, too. The first one is in-place and needs n lg n − 0.03n + o(n)
comparisons on average, which is to the best of our knowledge better than any
other known in-place Heapsort variant. We also examine a modification using
O(n) bits of extra space, which applies the ideas of MDR-Heapsort to QuickHeapsort.
With this method we can bound the average number of comparisons by
n lg n − 0.997n + o(n). Actually, a complicated, iterated in-place MergeInsertion
uses only n lg n − 1.3n + O(lg n) comparisons [16]. Unfortunately, for practical
purposes this algorithm is not competitive.
Our contributions are as follows: 1. We give a simplified analysis which gives
better bounds than previously known. 2. Our approach yields the first precise
analysis of QuickHeapsort when the pivot element is taken from a larger sample.
3. We give a simple in-place modification of QuickHeapsort which saves 0.75n
comparisons. 4. We give a modification of QuickHeapsort using n extra bits
only, and we can bound the expected number of comparisons. This bound is
better than the previously known bounds for the worst case of Heapsort variants
using O(n lg n) extra bits, for which best and worst case are almost the same.
5. We have implemented QuickHeapsort, and our experiments confirm the theoretical
predictions.
The paper is organized as follows: Sect. 2 briefly describes the basic QuickHeapsort
algorithm together with our first improvement. In Sect. 3 we analyze
the expected running time of QuickHeapsort. Then we introduce some improvements
in Sect. 4 allowing O(n) additional bits. Finally, in Sect. 5, we present
our experimental results comparing the different versions of QuickHeapsort with
other Quicksort and Heapsort variants.
2 QuickHeapsort
A two-layer-min-heap is an array A[1..n] of n elements together with a partition
(G, R) of {1,...,n} into green and red elements such that for all g ∈ G, r ∈ R we
have A[g] ≤ A[r]. Furthermore, the green elements g satisfy the heap condition
A[g] ≤ min{A[2g], A[2g+1]}, and if g is red, then 2g and 2g+1 are red, too. (The
conditions are required to hold only if the indices involved are in the range of 1
to n.) The green elements are called "green" because they can be extracted
out of the heap without caution, whereas the "red" elements are blocked. Two-
layer-max-heaps are defined analogously. We can think of a two-layer-heap as a
rooted binary tree such that each node is either green or red. Green nodes
satisfy the standard heap-condition, children of red nodes are red. Two-layer-
heaps were defined in [12]. In [2] a different language is used for the same
concept (they describe the algorithm in terms of External Heapsort). Now we are
ready to describe the QuickHeapsort algorithm as it has been proposed in [2].
Most of it can also be found in pseudocode in App. D.
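To make the definition concrete, the invariant can be stated in a few lines of Python (our own sketch with 0-based indexing and a set `red` of red positions; the paper itself works with 1-based arrays):

```python
def is_two_layer_max_heap(A, red):
    """Check the two-layer-max-heap invariant for A (0-based) with the
    set `red` of red positions: every green element is at least as large
    as every red one, green nodes satisfy the heap condition, and the
    children of red nodes are red."""
    n = len(A)
    greens = [A[i] for i in range(n) if i not in red]
    reds = [A[i] for i in red]
    if greens and reds and min(greens) < max(reds):
        return False                      # some red element beats a green one
    for i in range(n):
        for c in (2 * i + 1, 2 * i + 2):  # children of i (0-based)
            if c < n:
                if i in red and c not in red:
                    return False          # children of red nodes must be red
                if i not in red and c not in red and A[i] < A[c]:
                    return False          # heap condition on a green node
    return True
```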
We intend to sort an array A[1..n]. First, we choose a pivot p. This is the
randomized part of the algorithm. Then, just as in Quicksort, we rearrange the
array according to p. That means, using n − 1 comparisons, the partitioning
function returns an index k and rearranges the array A so that A[i] ≤ A[k] for
i < k, A[k] = p, and A[k] ≤ A[j] for k < j. After the partitioning, a two-layer-
heap is built out of the elements of the smaller part of the array, either the part
left of the pivot or right of the pivot. We call this smaller part the heap-area and
the larger part the work-area. More precisely, if k − 1 < n − k, then {1,...,k−1}
is the heap-area and {k+1,...,n} is the work-area. If k − 1 ≥ n − k, then
{1,...,k−1} is the work-area and {k+1,...,n} is the heap-area. Note that
we know the final position of the pivot element without any further comparison.
Therefore, we count it neither to the heap-area nor to the work-area. If the
heap-area is the part of the array left of the pivot, a two-layer-max-heap is built,
otherwise a two-layer-min-heap is built.
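As an illustration, the partitioning step and the choice of heap-area and work-area might look as follows in Python (our own 0-based sketch; the list-based `partition` is a simple stand-in, not the comparison-optimal in-place routine from the paper's appendix):

```python
def partition(A, p):
    """Rearrange A around the pivot value p (assumed to occur in A) and
    return the pivot's final index k.  A simple list-based stand-in for
    the in-place partitioning function (which uses n - 1 comparisons)."""
    A.remove(p)                             # take one occurrence of p out
    smaller = [x for x in A if x <= p]
    larger = [x for x in A if x > p]
    A[:] = smaller + [p] + larger
    return len(smaller)

def areas(n, k):
    """Given the pivot index k in an array of length n (0-based), return
    (heap_area, work_area): the smaller side becomes the heap-area, the
    larger side the work-area; the pivot belongs to neither."""
    left, right = range(0, k), range(k + 1, n)
    return (left, right) if len(left) < len(right) else (right, left)
```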
At the beginning the heap-area is an ordinary heap, hence it is a two-layer-
heap consisting of green elements only. Now the heap extraction phase starts.
We assume that we are in the case of a max-heap. The other case is symmetric.
Let m denote the size of the heap-area. The m elements of the heap-area are
moved to the work-area. The extraction of one element works as follows: the
root of the heap is placed at the current position of the work-area (which at the
beginning is its last position). Then, starting from the root, the resulting "hole"
is trickled down: the larger child is always moved up into the vacant position
and then this child is treated recursively. This stops as soon as a leaf is reached.
We call this the SpecialLeaf procedure (Alg. 4.2), according to [2]. Now, the
element which was previously at the current position in the work-area is placed
as a red element into this hole at the leaf in the heap-area. Finally, the current
position in the work-area is moved by one and the next element can be extracted.
The procedure sorts correctly, because after the partitioning it is guaranteed
that all red elements are smaller than all green elements. Furthermore, there is
enough space in the work-area to place all green elements of the heap, since the
heap is always the smaller part of the array. After extracting all green elements,
the pivot element is placed at its final position and the remaining elements are
sorted recursively.
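The extraction step can be sketched in Python as follows (our own simplified 0-based version; the heap lives in its own list and the work-area bookkeeping is left out, so `special_leaf` just returns the extracted root):

```python
def build_max_heap(A):
    """Standard Floyd heap construction by sift-down (0-based)."""
    n = len(A)
    for i in range(n // 2 - 1, -1, -1):
        j = i
        while 2 * j + 1 < n:
            l, r = 2 * j + 1, 2 * j + 2
            c = l if r >= n or A[l] >= A[r] else r
            if A[j] >= A[c]:
                break
            A[j], A[c] = A[c], A[j]
            j = c

def special_leaf(A, filler):
    """Extract the root of the max-heap A: trickle the resulting 'hole'
    down along the larger children to a leaf (one comparison per level)
    and place `filler` there.  `filler` plays the role of a red element
    and must be smaller than all green elements still in the heap."""
    m = len(A)
    top, i = A[0], 0
    while 2 * i + 1 < m:
        l, r = 2 * i + 1, 2 * i + 2
        c = l if r >= m or A[l] >= A[r] else r   # larger child moves up
        A[i] = A[c]
        i = c
    A[i] = filler
    return top
```

Repeating `special_leaf` m times on a heap of m green elements returns them in decreasing order, which is exactly what the sorting phase needs.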
Actually, we can improve the procedure, thereby saving 3n/4 comparisons by
a simple trick. Before the heap extraction phase starts in the heap-area with m
elements, we perform at most (m+2)/4 additional comparisons in order to arrange
all pairs of leaves which share a parent such that the left child is not smaller
than its right sibling. Now, in every call of SpecialLeaf, we can save exactly one
comparison, since we do not need to compare two leaves. For a max-heap we
only need to move up the left child and put the right one at the place of the
former left one. Summing up over all heaps during an execution of standard
QuickHeapsort, we invest (n+2t)/4 comparisons in order to save n comparisons,
where t is the number of recursive calls. The expected value of t is in O(lg n).
Hence, we can expect to save 3n/4 + O(lg n) comparisons. We call this version
the improved variant of QuickHeapsort.
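The leaf-arranging trick can be sketched like this (our own helper, 0-based: in a heap of size m the leaves are the positions ⌊m/2⌋,...,m−1):

```python
def arrange_leaf_pairs(A):
    """For every pair of sibling leaves swap them, if necessary, so that
    the left sibling is not smaller than the right one; return the number
    of comparisons invested (at most (m + 2) / 4).  Afterwards a
    SpecialLeaf call can always move up the left leaf without comparing
    the two leaves."""
    m = len(A)
    count = 0
    for p in range(m // 2):               # p runs over the inner nodes
        l, r = 2 * p + 1, 2 * p + 2
        if l >= m // 2 and r < m:         # both children exist and are leaves
            count += 1
            if A[l] < A[r]:
                A[l], A[r] = A[r], A[l]
    return count
```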
3 Analysis of QuickHeapsort
This section contains the main contribution of the paper. We analyze the number
of comparisons. By n we denote the number of elements of the array to be
sorted. We use standard O-notation where O(g), o(g), and ω(g) denote classes
of functions. In our analysis we do not assume any random distribution of the
input, i.e. it is valid for every permutation of the input array. Randomization
is used, however, for pivot selection. With Pr[e] we denote the probability of
some event e. The expected value of a random variable T is denoted by E[T].
The number of assignments is bounded by some small constant times the
number of comparisons. Let T(n) denote the number of comparisons during
QuickHeapsort on a fixed array of n elements. We are going to split the analysis
of QuickHeapsort into three parts:
1. Partitioning with an expected number of comparisons E[T_part(n)] (average case).
2. Heap construction with at most T_con(n) comparisons (worst case).
3. Heap extraction (sorting phase) with at most T_ext(n) comparisons (worst case).
We analyze the three parts separately and put them together at the end. The
partitioning is the only randomized part of our algorithm. The expected number
of comparisons depends on the selection method for the pivot. For the
expected number of comparisons by QuickHeapsort on the input array we
obtain E[T(n)] ≤ T_con(n) + T_ext(n) + E[T_part(n)].
Theorem 3.1 The expected number E[T(n)] of comparisons by basic resp. improved
QuickHeapsort with pivot as median of p randomly selected elements on
a fixed input array of size n is E[T(n)] ≤ n lg n + c·n + o(n) with c as follows:

p      c_basic   c_improved
1      +2.72     +1.97
3      +1.92     +1.17
f(n)   +0.72     −0.03

Here, f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n, e.g., f(n) = √n, and we assume that
we choose the median of f(n) randomly selected elements in time O(f(n)).
As we see, the selection method for the pivot is very important. However,
one should notice that the bounds for fixed-size samples for pivot selection are
not tight. The proofs of these results are postponed to Sect. 3.3. Note that it
is enough to prove the results without the improvement, since the difference is
always 0.75n.
3.1 Heap Construction
The standard heap construction [9] needs at most 2m comparisons to construct
a heap of size m in the worst case and approximately 1.88m in the average case.
For the mathematical analysis better theoretical bounds can be used. The best
result we are aware of is due to Chen et al. [5]. According to this result we
have T_con(m) ≤ 1.625m + o(m). Earlier results are of similar magnitude: by
[4] it has been known that T_con(m) ≤ 1.632m + o(m) and by [10] that
T_con(m) ≤ 1.625m + o(m), but Gonnet and Munro used O(m) extra bits
to get this result, whereas the new result of Chen et al. is in-place (using
only O(lg m) extra bits).
During the execution of QuickHeapsort over n elements, every element is part
of a heap only once. Hence, the sizes of all heaps during the entire procedure
sum up to n. With the result of [5] the total number of comparisons performed
in the construction of all heaps satisfies:

Proposition 3.2 T_con(n) ≤ 1.625n + o(n).
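For reference, the 2m bound for the standard construction is easy to observe with an instrumented Floyd construction (our own 0-based sketch; the paper's analysis relies on the sharper bounds cited above):

```python
def build_max_heap_counting(A):
    """Floyd's heap construction (0-based), returning the number of key
    comparisons, which is bounded by 2m for a heap of size m."""
    n, comps = len(A), 0
    for i in range(n // 2 - 1, -1, -1):
        j = i
        while 2 * j + 1 < n:
            l, r = 2 * j + 1, 2 * j + 2
            if r < n:
                comps += 1                     # compare the two children
                c = l if A[l] >= A[r] else r
            else:
                c = l
            comps += 1                         # compare parent with larger child
            if A[j] >= A[c]:
                break
            A[j], A[c] = A[c], A[j]
            j = c
    return comps
```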
3.2 Heap Extraction
For a real number r ∈ R with r > 0 we define {r} by the following condition:
r = 2^k + {r} with k ∈ Z and 0 ≤ {r} < 2^k.
This means that 2^k is the largest power of 2 which is less than or equal to r,
and {r} is the difference to that power, i.e. {r} = r − 2^⌊lg r⌋. In this section
we first analyze the extraction phase of one two-layer-heap of size m. After
that, we bound the number of comparisons T_ext(n) performed in the worst case
during all heap extraction phases of one execution of QuickHeapsort on an array
of size n. Thm. 3.3 is our central result about heap extraction.

Theorem 3.3 T_ext(n) ≤ n·(⌊lg n⌋ − 3) + 2{n} + O(lg² n).
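In code, the quantity {r} defined above is simply r minus the largest power of two not exceeding r (our own helper; `math.log2` with `floor` is adequate for moderate values):

```python
import math

def frac(r):
    """{r} = r - 2**floor(lg r): the distance of r > 0 to the largest
    power of two that does not exceed r."""
    k = math.floor(math.log2(r))
    return r - 2 ** k
```

For example, {12} = 4 since 12 = 8 + 4, and {r} = 0 exactly at the powers of two.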
The proof of Thm. 3.3 covers almost all of the rest of Sect. 3.2. In the following,
the height height(v) of an element v in a heap H is the maximal distance from
that node to a leaf below it. The height of H is the height of its root. The level
level(v) of v is its distance from the root. In this section we want to count
the comparisons during SpecialLeaf procedures only. Recall that a SpecialLeaf
procedure is a cyclic shift on a path from the root down to some leaf, and the
number of comparisons is exactly the length of this path. Hence the upper bound
is the height of the heap. But there is a better analysis.
Let us consider a heap with m green elements which are all extracted by
SpecialLeaf procedures. The picture is as follows: First, we color the green
root red. Next, we perform a cyclic shift defined by the SpecialLeaf procedure.
In particular, the leaf is now red. Moreover, red positions remain red, but
there is exactly one position v which has changed its color from green to red.
This position v is on the path defined by the SpecialLeaf procedure. Hence,
the number of comparisons needed to color the position v red is bounded by
height(v) + level(v).
The total number of comparisons E(m) to extract all m elements of a heap
H is therefore bounded by

E(m) ≤ Σ_{v∈H} (height(v) + level(v)).
We have height(H) − 1 ≤ height(v) + level(v) ≤ height(H) = ⌊lg m⌋ for all
v ∈ H. We now count the number of elements v where height(v) + level(v) =
⌊lg m⌋ and the number of elements v where height(v) + level(v) = ⌊lg m⌋ − 1.
Since there are exactly {m} + 1 nodes on level ⌊lg m⌋, there are at most 2{m} +
1 + ⌊lg m⌋ elements v with height(v) + level(v) = ⌊lg m⌋. All other elements satisfy
height(v) + level(v) = ⌊lg m⌋ − 1. We obtain

E(m) ≤ 2{m}·⌊lg m⌋ + (m − 2{m})·(⌊lg m⌋ − 1) + O(lg m)
     = m·(⌊lg m⌋ − 1) + 2{m} + O(lg m).    (1)
Note that this is an estimate of the worst case; however, this analysis also shows
that the best case only differs by O(lg m)-terms from the worst case.
Now, we want to estimate the number of comparisons in the worst case
performed during all heap extraction phases together. During QuickHeapsort
over n elements we create a sequence H_1,...,H_t of heaps of green elements
which are extracted using the SpecialLeaf procedure. Let m_i = |H_i| be the size
of the i-th heap. The sequence satisfies 2m_i ≤ n − Σ_{j<i} m_j, because heaps are
constructed and extracted on the smaller part of the array.
Here comes a subtle observation: Assume that m_1 + m_2 ≤ n/2. If we
replace the first two heaps with one heap H of size |H| = m_1 + m_2, then
the analysis using the sequence H, H_3,...,H_t cannot lead to a better bound.
Continuing this way, we may assume that we have t ∈ O(lg n) and therefore
Σ_{1≤i≤t} O(lg m_i) ⊆ O(lg² n). With Eq. (1) we obtain the bound
T_ext(n) ≤ Σ_{i=1}^{t} E(m_i) ≤ ( Σ_{i=1}^{t} ( m_i·⌊lg m_i⌋ + 2{m_i} ) ) − n + O(lg² n).    (2)
Later we will replace the m_i by other positive real numbers. Therefore we
define the following notion. Let 1 ≤ ν ∈ R. We say a sequence x_1, x_2,...,x_t
with x_i ∈ R_{>0} is valid w.r.t. ν if for all 1 ≤ i ≤ t we have 2x_i ≤ ν − Σ_{j<i} x_j.
As just mentioned, the initial sequence m_1, m_2,...,m_t is valid w.r.t. n. Let
us define a continuous function F: R_{>0} → R by F(x) = x·⌊lg x⌋ + 2{x}. It is
continuous since for x = 2^k, k ∈ Z, we have F(x) = x·k = lim_{ε→0} ((x−ε)·(k−1) +
2{x−ε}). It is piecewise differentiable with right derivative ⌊lg x⌋ + 2. Therefore:

Lemma 3.4 Let x ≥ y > δ ≥ 0. Then we have the inequalities:
F(x) + F(y) ≤ F(x+δ) + F(y−δ)   and   F(x) + F(y) ≤ F(x+y).
Lemma 3.5 Let 1 ≤ ν ∈ R. For all sequences x_1, x_2,...,x_t with x_i ∈ R_{>0}
which are valid w.r.t. ν, we have

Σ_{i=1}^{t} F(x_i) ≤ Σ_{i=1}^{⌊lg ν⌋} F(ν/2^i).
Proof. The result is true for ν ≤ 2, because then F(x_i) ≤ F(ν/2) ≤ F(1) = 0
for all i. Thus, we may assume ν ≥ 2. We perform induction on t. For t = 1
the statement is clear, since ⌊lg ν⌋ ≥ 1 and x_1 ≤ ν/2. Now let t > 1. By Lem. 3.4,
we have F(x_1) + F(x_2) ≤ F(x_1 + x_2). Now, if x_1 + x_2 ≤ ν/2, then the sequence
x_1 + x_2, x_3,...,x_t is valid, too; and we are done by induction. Hence, we may
assume x_1 + x_2 > ν/2. If x_1 ≤ x_2, then

2x_1 = 2x_2 + 2(x_1 − x_2) ≤ ν − x_1 + 2(x_1 − x_2) = ν − x_2 + (x_1 − x_2) ≤ ν − x_2.

Thus, if x_1 ≤ x_2, then the sequence x_2, x_1, x_3,...,x_t is valid, too. Thus, it is
enough to consider x_1 ≥ x_2 with x_1 + x_2 > ν/2.
We have ν/2 ≥ 1 and the sequence x′_2, x_3,...,x_t with x′_2 = x_1 + x_2 − ν/2 is valid
w.r.t. ν/2, because

x′_2 = x_1 + x_2 − ν/2 ≤ x_1 + (ν − x_1)/2 − ν/2 = x_1/2 ≤ ν/4.

Therefore, by induction on t and Lem. 3.4 we obtain the claim:

Σ_{i=1}^{t} F(x_i) ≤ F(ν/2) + F(x′_2) + Σ_{i=3}^{t} F(x_i) ≤ F(ν/2) + Σ_{i=2}^{⌊lg ν⌋} F(ν/2^i) ≤ Σ_{i=1}^{⌊lg ν⌋} F(ν/2^i).  ∎
Lemma 3.6 Σ_{i=1}^{⌊lg n⌋} F(n/2^i) ≤ F(n) − 2n + O(lg n).
Proof of Lem. 3.6. Since ⌊lg(n/2^i)⌋ = ⌊lg n⌋ − i and {n/2^i} = {n}/2^i for
1 ≤ i ≤ ⌊lg n⌋, we obtain

Σ_{i=1}^{⌊lg n⌋} F(n/2^i)
  = n⌊lg n⌋ · Σ_{i=1}^{⌊lg n⌋} 1/2^i − n · Σ_{i=1}^{⌊lg n⌋} i/2^i + 2{n} · Σ_{i=1}^{⌊lg n⌋} 1/2^i
  ≤ n⌊lg n⌋ · Σ_{i≥1} 1/2^i − n · Σ_{i≥1} i/2^i + 2{n} · Σ_{i≥1} 1/2^i + (n/2^{⌊lg n⌋}) · Σ_{i>0} (i + ⌊lg n⌋)/2^i
  = n⌊lg n⌋ − 2n + 2{n} + O(lg n).  ∎
Applying these lemmata to Eq. (2) yields the proof of Thm. 3.3.
Corollary 3.7 We have T_ext(n) ≤ n lg n − 2.9139n + O(lg² n).
Proof. By [18, Thm. 1] we have F(n) − 2n ≤ n lg n − 1.9139n. Hence, Cor. 3.7
follows directly from Thm. 3.3.  ∎
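The function F and the inequality of Lemma 3.6 are easy to check numerically (our own sketch; the concrete slack 2(lg n + 2) stands in for the O(lg n)-term and is consistent with the tail estimate in the proof):

```python
import math

def F(x):
    """F(x) = x * floor(lg x) + 2 * {x} for x > 0."""
    k = math.floor(math.log2(x))
    return x * k + 2 * (x - 2 ** k)

def lemma_3_6_holds(n):
    """Check sum_{i=1}^{floor(lg n)} F(n / 2**i) <= F(n) - 2n + 2*(lg n + 2)."""
    top = math.floor(math.log2(n))
    s = sum(F(n / 2 ** i) for i in range(1, top + 1))
    return s <= F(n) - 2 * n + 2 * (math.log2(n) + 2)
```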
3.3 Partitioning
In the following, T_pivot(n) denotes the number of comparisons required to choose
the pivot element in the worst case; and, as before, E[T_part(n)] denotes the
expected number of comparisons performed during partitioning. We have the
following recurrence:

E[T_part(n)] ≤ n − 1 + T_pivot(n) + Σ_{k=1}^{n} Pr[pivot = k] · E[T_part(max{k−1, n−k})].    (3)
If we choose the pivot at random, then we obtain by standard methods:

E[T_part(n)] ≤ n − 1 + (1/n) · Σ_{k=1}^{n} E[T_part(max{k−1, n−k})] ≤ 4n.    (4)

Similarly, if we choose the pivot with the median-of-three, then we obtain:

E[T_part(n)] ≤ 3.2n + O(lg n).    (5)
The proof of the first part of Thm. 3.1 follows from the above equations,
Thm. 3.3, and Prop. 3.2. Using a growing number of elements (as n grows) as
sample for the pivot selection, we can do better. The second part of Thm. 3.1
follows from Thm. 3.3, Prop. 3.2, and Thm. 3.8.

Theorem 3.8 Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n. When choosing the
pivot as median of f(n) randomly selected elements in time O(f(n)) (e.g. with
the algorithm of [1]), the expected number of comparisons used in all recursive
calls of partitioning is in 2n + o(n).
Thm. 3.8 is close to a well-known result in [14, Thm. 5] on Quickselect, see
Cor. 3.10. Formally speaking, we cannot use it directly, because we deal with
QuickHeapsort, where after partitioning the recursive call is on the larger part.
Because of that, and for the sake of completeness, we give a proof. Moreover, our
proof is elementary and simpler than the one in [14]. The key step is Lem. 3.9.
Its proof is rather standard and can also be found in App. A.
Lemma 3.9 Let 0 < δ < 1/2. If we choose the pivot as median of 2c+1 elements
such that 2c+1 ≤ n/2, then we have Pr[ pivot ≤ n/2 − δn ] < (2c+1) · α^c, where
α = 4(1/4 − δ²) < 1.
Proof of Thm. 3.8. As an abbreviation, we let E(n) = E[T_part(n)] be the
expected number of comparisons performed during partitioning. We are going
to show that for all ε > 0 there is some D ∈ R such that

E(n) < (2 + ε)n + D.    (6)

So, we fix some 1 ≥ ε > 0. We choose δ > 0 such that (2+ε)δ < ε/4. Moreover, for
this proof let µ = (n+1)/2. Positions of possible pivots k with µ − δn ≤ k ≤ µ + δn
form a small fraction of all positions, and they are located around the median.
Nevertheless, applying Lem. 3.9 with c = f(n) ∈ ω(1) ∩ o(n) yields, for all n
which are large enough:

Pr[ pivot < µ − δn ] ≤ (2f(n) + 1) · α^{f(n)} ≤ ε/48.    (7)

The analogous inequality holds for Pr[ pivot > µ + δn ]. Because T_pivot(n) ∈
o(n), we have

T_pivot(n) ≤ (ε/8)n    (8)

for n large enough. Now, we choose n_0 such that Eq. (7) and Eq. (8) hold for
n ≥ n_0 and such that we have (2+ε)δ + 2/n_0 < ε/4. We set D = E(n_0) + 1. Hence
for n < n_0 the desired result Eq. (6) holds. Now, let n ≥ n_0. From Eq. (3) we
obtain by symmetry:
E(n) ≤ n − 1 + T_pivot(n) + Σ_{k=µ−δn}^{µ+δn} Pr[pivot = k] · E(k−1)
       + 2 · Σ_{k=µ+δn+1}^{n} Pr[pivot = k] · E(k−1).

Since E is monotone, E(k) can be bounded by the highest value in the respective
interval:

E(n) ≤ n + (ε/8)n + Pr[ µ − δn ≤ pivot ≤ µ + δn ] · E(µ + δn)
       + 2 Pr[ pivot > µ + δn ] · E(n−1)
     ≤ n + (ε/8)n + (1 − ε/24) · E(µ + δn) + 2 · (ε/48) · E(n−1).
By induction we assume E(k) ≤ (2 + ε)k + D for k < n. Hence:

E(n) ≤ n + (ε/8)n + (1 − ε/24) · ((2 + ε)·(µ + δn) + D) + (ε/24) · ((2 + ε)n + D)
     ≤ n + (2 + ε) · ((n + 1)/2 + δn) + (ε/8)n + (ε/24)(2 + ε)n + D
     ≤ 2n + 1 + ε/2 + (2 + ε)δn + (3/4)εn + D < (2 + ε)n + D.  ∎
Corollary 3.10 ([14]) Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n. When
implementing Quickselect with the median of f(n) randomly selected elements as
pivot, the expected number of comparisons is 2n + o(n).
Proof. In QuickHeapsort the recursion is always on the larger part of the array.
Hence, the number of comparisons in partitioning for QuickHeapsort is an upper
bound on the number of comparisons in Quickselect.  ∎
In [14] it is also proved that choosing the pivot as median of Θ(√n) elements
is optimal for Quicksort as well as for Quickselect. This suggests that we choose
the same value in QuickHeapsort, which is backed by our experiments.
4 Modifications of QuickHeapsort Using Extra Space
In this section we want to describe some modifications of QuickHeapsort using
n bits of extra storage. We introduce two bit-arrays. In one of them (the
CompareArray) – which actually uses two bits per element – we store the comparisons
already done (we need two bits because there are three possible values – right,
left, unknown – we have to store). In the other one (the RedGreenArray) we
store which element is red and which is green.
Since the heaps have maximum size n/2, the RedGreenArray only requires
n/2 bits. The CompareArray is only needed for the inner nodes of the heaps,
i.e. length n/4 is sufficient. With two bits per entry this sums up to n extra
bits in total.
For the heap construction we do not use the algorithms described in Sect. 3.1.
With the CompareArray we can do better by using the algorithm of McDiarmid
and Reed [15]. The heap construction works similarly to Bottom-Up-Heapsort,
i.e. the array is traversed backward, calling for all inner positions i the Reheap
procedure on i. The Reheap procedure takes the subheap with root i and
restores the heap condition if it is violated at the position i. First, the Reheap
procedure determines a special leaf using the SpecialLeaf procedure as described
in Sect. 2, but without moving the elements. Then, the final position of the
former root is determined going upward from the special leaf (bottom-up-phase).
In the end, the elements above this final position are moved up towards the root
by one position. That means that all but one element which are compared during
the bottom-up-phase stay in their places. Since in the SpecialLeaf procedure
these elements have been compared with their siblings, these comparisons can
be stored in the CompareArray and can be used later.
With another improvement concerning the construction of heaps with seven
elements as in [3], the benefits of this array can be exploited even more.
The RedGreenArray is used during the sorting phase only. Its functionality
is straightforward: Every time a red element is inserted into the heap, the
corresponding bit is set to red. The SpecialLeaf procedure can stop as soon
as it reaches an element without green children. Whenever a red and a green
element have to be compared, the comparison can be skipped.
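A SpecialLeaf procedure that consults the RedGreenArray might be sketched as follows (our own simplified 0-based Python; `red` is a list of booleans standing in for the bit-array, and a comparison counter makes the saving visible):

```python
def special_leaf_rg(A, red, filler, stats):
    """Extract the root of the max-heap A, consulting the RedGreenArray
    `red` (a list of booleans): stop as soon as the current node has no
    green child, and when only one child is green take it without a
    comparison (green always beats red in a max-heap).  stats[0] counts
    the comparisons actually performed."""
    m = len(A)
    top, i = A[0], 0
    while True:
        children = [c for c in (2 * i + 1, 2 * i + 2) if c < m]
        green = [c for c in children if not red[c]]
        if not green:
            break                        # no green child: stop early
        if len(green) == 1:
            c = green[0]                 # red-vs-green comparison skipped
        else:
            stats[0] += 1
            c = green[0] if A[green[0]] >= A[green[1]] else green[1]
        A[i] = A[c]
        i = c
    A[i] = filler
    red[i] = True
    return top
```

Extracting all green elements this way still yields them in decreasing order, while the early stop and the skipped red-vs-green comparisons reduce the comparison count.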
Theorem 4.1 Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n, e.g., f(n) = lg n, and
let E[T(n)] be the expected number of comparisons by QuickHeapsort using the
CompareArray with the improvement of [3] and the RedGreenArray on a fixed
input array of size n. Choosing the pivot as median of f(n) randomly selected
elements in time O(f(n)), we have

E[T(n)] ≤ n lg n − 0.997n + o(n).
Proof. We can analyze the savings by the two arrays separately, because the
CompareArray only affects comparisons between two green elements, while the
RedGreenArray only affects comparisons involving at least one red element.
First, we consider the heap construction using the CompareArray. With this
array we obtain the same worst case bound as for the standard heap construction
method. However, the CompareArray has the advantage that at the end of the
heap construction many comparisons are stored in the array and can be reused
for the extraction phase. More precisely: For every comparison except the first
one made when going upward from the special leaf, one comparison is stored
in the CompareArray, since for every additional comparison one element on the
path defined by SpecialLeaf stays at its place. Because every pair of siblings
has to be compared at one point during the heap construction or extraction,
all these stored comparisons can be reused. Hence, we only have to count the
comparisons in the SpecialLeaf procedure during the construction plus n/2 for the
first comparison when going upward. Thus, we get an amortized bound of 3n/2
for the comparisons during construction.
In [3] the notion of Fine-Heaps is introduced. A Fine-Heap is a heap with
the additional CompareArray such that for every node the larger child is stored
in the array. Such a Fine-Heap of size m can be constructed using the above
method with 2m comparisons. In [3] Carlsson, Chen and Mattsson showed that
a Fine-Heap of size m actually can be constructed with only (23/12)m + O(lg² m)
comparisons. That means we have to invest (23/12)m + O(lg² m) comparisons
for the heap construction, and at the end there are m/2 comparisons stored in the
array. All these comparisons stored in the array are used later. Summing up over
all heaps during an execution of QuickHeapsort, with the result of [3] we can
save another (1/12)n comparisons in addition to the comparisons saved by the
CompareArray. Hence, for the amortized cost of the heap construction T^amort_con
(i.e. the number of comparisons needed to build the heap minus the number of
comparisons stored in the CompareArray after the construction, which all can
be reused later) we have obtained:

Proposition 4.2 T^amort_con(n) ≤ (17/12)n + o(n).

This bound is slightly better than the average case for the heap construction
with the algorithm of [15], which is 1.52n.
Now, we want to count the number of comparisons we save using the
RedGreenArray. We distinguish the two cases that two red elements are compared
and that a red and a green element are compared. Every position in the heap
has to turn red at some point. At that time, all nodes below this position are
already red. Hence, for that element we save as many comparisons as the
element is above the bottom level. Summing over all levels of a heap of size m, the
saving results in (m/4)·1 + (m/8)·2 + ··· = m · Σ_{i≥1} i·2^{−i−1} = m. This estimate is
exact up to O(lg m)-terms. Since the expected number of heaps is O(lg n), we
obtain for the overall saving the value T_saveRR(n) = n + O(lg² n).
Another place where we save comparisons with the RedGreenArray is when
a red element is compared with a green element. For every inner node it occurs
at least once – when the node loses its last green child – that we compare
a red child with a green child. Hence, we save at least as many comparisons
as there are inner nodes with two children, i.e. at least m/2 − 1. Since every
element – except the expected O(lg n) pivot elements – is part of a heap exactly
once, we save at least T_saveRG(n) ≥ n/2 + O(lg n) comparisons when comparing
green with red elements. In the average case the saving might be even slightly
higher, since comparisons can also be saved when a node does not lose its last
green child.
Summing up all our savings and using the median of f(n) ∈ ω(1) ∩ o(n) as
pivot, we obtain the proof of Thm. 4.1:

E[T(n)] ≤ T^amort_con(n) + T_ext(n) + E[T_part(n)] − T_saveRR(n) − T_saveRG(n)
        ≤ (17/12)n + n·(⌊lg n⌋ − 3) + 2{n} + 2n − 3n/2 + o(n)
        ≤ n lg n − 0.997n + o(n).  ∎
5 Experimental Results and Conclusion
In Fig. 1 we present the number of comparisons of the different versions of
QuickHeapsort we considered in this paper, i.e. the basic version, the improved
variant of Sect. 2, and the version using bit-arrays (however, without the
modification by [3]) for different values of n. We compare them with Quicksort,
Ultimate Heapsort, Bottom-Up-Heapsort and MDR-Heapsort. All algorithms
are implemented with median of √n elements as pivot (for Quicksort we
additionally show the data with median of 3). For the heap construction we
implemented the normal algorithm due to Floyd [9] as well as the algorithm using
the extra bit-array (which is the same as in MDR-Heapsort).
[Figure 1: plot of (#comparisons − n lg n)/n against n for 10³ ≤ n ≤ 10⁶, showing
Quicksort with median of 3, Quicksort with median of √n, basic QuickHeapsort,
improved QuickHeapsort, QuickHeapsort with bit-arrays, MDR-Heapsort,
Ultimate-Heapsort, and the lower bound.]
Figure 1: Average number of comparisons of QuickHeapsort implemented with
median of √n, compared with other algorithms
More results with other pivot selection strategies are given in Table 2 and Table 3 in
App. B, confirming that a sample size of √n is optimal for pivot selection with
respect to the number of comparisons and also that the o(n)-terms in Thm. 3.1 and
Thm. 3.8 are not too big. In Table 1 in App. B we present actual running times
of the different algorithms for n = 1000000. All the numbers, except the running
times, are average values over 100 runs with random data. As our theoretical
estimates predict, QuickHeapsort with bit-arrays beats all other variants, including
Relaxed-Weak-Heapsort (see Table 2, App. B), when implemented with median
of √n for pivot selection. It also performs 326728 ≈ 0.33 · 10⁶ comparisons less
than our theoretical prediction, which is 10⁶ · lg(10⁶) − 0.9139 · 10⁶ ≈ 19017569
comparisons.
In this paper we have shown that with known techniques QuickHeapsort can be implemented with an expected number of comparisons less than n lg n − 0.03n + o(n) and extra storage O(1). On the other hand, using n extra bits we can improve this to n lg n − 0.997n + o(n), i.e. we showed that QuickHeapsort can compete with the most advanced Heapsort variants. These theoretical estimates were also confirmed by our experiments. We also considered different pivot selection schemes. For any constant size sample for pivot selection, QuickHeapsort beats Quicksort for large n, since Quicksort has an expected running time of Cn lg n with C > 1. However, when choosing the pivot as median of √n elements (i.e. with the optimal strategy), our experiments show that Quicksort needs fewer comparisons than QuickHeapsort; using bit-arrays, QuickHeapsort is the winner again. In order to make the last statement rigorous, better theoretical bounds for Quicksort with sampling √n elements are needed. For future work it would also be of interest to prove the optimality of √n elements for pivot selection in QuickHeapsort, to estimate the lower order terms of the average running time of QuickHeapsort, and also to find an exact average case analysis for the savings achieved by the bit-arrays.
Acknowledgements.
We thank Martin Dietzfelbinger, Stefan Edelkamp and Jyrki Katajainen for their helpful comments and for providing implementations of algorithms for our experiments.
References
[1] M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest, and R. E. Tarjan. Time
bounds for selection. J. Comput. Syst. Sci., 7(4):448–461, 1973.
[2] D. Cantone and G. Cincotti. QuickHeapsort, an eﬃcient mix of classical
sorting algorithms. Theor. Comput. Sci., 285(1):25–42, 2002.
[3] S. Carlsson, J. Chen, and C. Mattsson. Heaps with Bits. In D.-Z. Du
and X.-S. Zhang, editors, ISAAC, volume 834 of LNCS, pages 288–296.
Springer, 1994.
[4] J. Chen. A Framework for Constructing Heap-like structures in-place. In
K.-W. Ng et al., editors, ISAAC, volume 762 of LNCS, pages 118–127.
Springer, 1993.
[5] J. Chen, S. Edelkamp, A. Elmasry, and J. Katajainen. In-place Heap
Construction with Optimized Comparisons, Moves, and Cache Misses. In
B. Rovan, V. Sassone, and P. Widmayer, editors, MFCS, volume 7464 of
LNCS, pages 259–270. Springer, 2012.
[6] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to
Algorithms. The MIT Press, 3 edition, 2009.
[7] R. D. Dutton. Weak-heap sort. BIT, 33(3):372–381, 1993.
[8] S. Edelkamp and P. Stiegeler. Implementing HEAPSORT with n lg n − 0.9n and QUICKSORT with n lg n + 0.2n comparisons. ACM J. of Exp. Alg., 7:5, 2002.
[9] R. W. Floyd. Algorithm 245: Treesort. Commun. ACM, 7(12):701, 1964.
[10] G. H. Gonnet and J. I. Munro. Heaps on Heaps. SIAM J. Comput.,
15(4):964–971, 1986.
[11] K. Kaligosi and P. Sanders. How Branch Mispredictions Aﬀect Quicksort.
In Y. Azar and T. Erlebach, editors, ESA, volume 4168 of LNCS, pages
780–791. Springer, 2006.
[12] J. Katajainen. The Ultimate Heapsort. In X. Lin, editor, CATS, volume 20 of Australian Computer Science Communications, pages 87–96. Springer-Verlag, 1998.
[13] D. E. Knuth. The art of computer programming. Vol. 3. Addison-Wesley,
1998.
[14] C. Mart´ınez and S. Roura. Optimal Sampling Strategies in Quicksort and
Quickselect. SIAM J. Comput., 31(3):683–705, 2001.
[15] C. McDiarmid and B. A. Reed. Building Heaps Fast. J. Alg., 10(3):352–365,
1989.
[16] K. Reinhardt. Sorting in-place with a worst case complexity of n lg n − 1.3n + O(lg n) comparisons and εn lg n + O(1) transports. In T. Ibaraki et al., editors, ISAAC, volume 650 of LNCS, pages 489–498. Springer, 1992.
[17] X.-D. Wang and Y.-J. Wu. An Improved HEAPSORT Algorithm with n lg n − 0.788928n Comparisons in the Worst Case. J. of Comput. Sci. and Techn., 22:898–903, 2007.
[18] I. Wegener. The Worst Case Complexity of McDiarmid and Reed's Variant of Bottom-Up-Heap Sort is Less Than n lg n + 1.1n. In C. Choffrut and M. Jantzen, editors, STACS, volume 480 of LNCS, pages 137–147. Springer, 1991.
[19] I. Wegener. BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT, beating, on an average, QUICKSORT (if n is not very small). Theor. Comp. Sci., 118(1):81–98, 1993.
APPENDIX
A Proofs
Proof of Lem. 3.4. Since the right derivative of F is monotonically increasing, we have

    F(x+δ) − F(x) = ∫_x^{x+δ} F′(t) dt ≥ F′(x)·δ = (lg x + 2)·δ

and

    F(y) − F(y−δ) = ∫_{y−δ}^{y} F′(t) dt ≤ F′(y)·δ = (lg y + 2)·δ.

Since y ≤ x, this yields

    F(y) − F(y−δ) ≤ (lg y + 2)·δ ≤ (lg x + 2)·δ ≤ F(x+δ) − F(x).

By adding F(x) + F(y−δ) on both sides we obtain the first claim of Lem. 3.4. Note that lim_{ε→0} F(ε) = 0. Hence the second claim follows from the first by considering the limit δ → y.
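The two claims are easy to check numerically. The sketch below is our own illustration, instantiating F as F(x) = x·lg x + 2x (any F with F(ε) → 0 and a monotonically increasing derivative behaves the same way):

```python
import math

def F(x):
    """Illustrative convex function with F(0+) = 0 and an increasing
    derivative (our stand-in for the F of Lem. 3.4)."""
    return x * math.log2(x) + 2 * x if x > 0 else 0.0

# First claim: for 0 < delta <= y <= x,
#   F(x) + F(y) <= F(x + delta) + F(y - delta).
for x in (1.0, 2.5, 10.0, 100.0):
    for y in (0.5, 1.0, 2.5):
        if y > x:
            continue
        for delta in (0.1, 0.5, y):
            assert F(x) + F(y) <= F(x + delta) + F(y - delta) + 1e-9

# Second claim (the limit delta -> y): F(x) + F(y) <= F(x + y).
for x in (1.0, 2.5, 10.0):
    for y in (0.5, 1.0, 2.5):
        assert F(x) + F(y) <= F(x + y) + 1e-9
print("Lemma 3.4 style inequalities hold on the sample grid")
```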
Proof of Lem. 3.9. First note that the probability of choosing the k-th smallest element as pivot satisfies

    C(n, 2c+1) · Pr[pivot = k] = C(k−1, c) · C(n−k, c).

We use the notation x^(c) = x·(x−1)···(x−c+1) for the falling factorial; thus C(x, c) = x^(c)/c!. Hence

    Pr[pivot = k] = (2c+1)! · (k−1)^(c) · (n−k)^(c) / ((c!)² · n^(2c+1))
                  = C(2c, c)·(2c+1) · (1/(n−2c)) · ∏_{i=0}^{c−1} [(k−1−i)(n−k−i)] / [(n−2i−1)(n−2i)].

For k ≤ c we have Pr[pivot = k] = 0. So let c < k ≤ n/2 − δn and let us consider an index i in the product with 0 ≤ i < c:

    (k−1−i)(n−k−i) / ((n−2i−1)(n−2i))
        ≤ (k−i)(n−k−i) / ((n−2i)(n−2i))
        = ((n/2 − i) − (n/2 − k)) · ((n/2 − i) + (n/2 − k)) / (n−2i)²
        = ((n/2 − i)² − (n/2 − k)²) / (n−2i)²
        ≤ 1/4 − (n/2 − (n/2 − δn))² / n² = 1/4 − δ².

We have C(2c, c) ≤ 4^c. Since 2c+1 ≤ n/2 (and hence 1/(n−2c) ≤ 2/n), we obtain

    Pr[pivot = k] ≤ 4^c · (2c+1) · (1/(n−2c)) · (1/4 − δ²)^c < (2c+1) · (2/n) · α^c,

since 4^c·(1/4 − δ²)^c = (1 − 4δ²)^c = α^c. Now we obtain the desired result:

    Pr[pivot ≤ n/2 − δn] < Σ_{k=0}^{⌊n/2 − δn⌋} (2c+1)·(2/n)·α^c ≤ (2c+1)·α^c.
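Because the pivot rank follows the hypergeometric-type distribution displayed above, the tail bound can be verified exactly for concrete parameters. The sketch below is our own illustration (the values n = 1000, c = 200, δ = 0.1 are arbitrary choices); it evaluates Pr[pivot ≤ n/2 − δn] with exact rational arithmetic and compares it against (2c+1)·α^c with α = 1 − 4δ²:

```python
from fractions import Fraction
from math import comb

def tail_prob(n, c, delta):
    """Exact Pr[pivot <= n/2 - delta*n] when the pivot is the median of a
    random sample of 2c+1 out of n distinct elements:
    Pr[pivot = k] = C(k-1, c) * C(n-k, c) / C(n, 2c+1)."""
    limit = int(n / 2 - delta * n)
    numerator = sum(comb(k - 1, c) * comb(n - k, c)
                    for k in range(c + 1, limit + 1))
    return Fraction(numerator, comb(n, 2 * c + 1))

n, c, delta = 1000, 200, 0.1        # illustrative parameters
alpha = 1 - 4 * delta ** 2
bound = (2 * c + 1) * alpha ** c    # the bound of Lem. 3.9
exact = float(tail_prob(n, c, delta))
print(f"exact tail = {exact:.3e}, bound = {bound:.3e}")
assert exact <= bound
```

The exact tail is several orders of magnitude below the bound here, as expected: the lemma is only tight enough to be useful once c is fairly large.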
B More Experimental Results

In Table 1 we present actual running times of the different algorithms for n = 1000000 with two different comparison functions (the numbers displayed here are averages over 10 runs with random data). One of them is the normal integer comparison; the other one first applies the logarithm four times to both operands before comparing them. As in [8], this simulates expensive comparisons.

In Table 2 all algorithms are implemented with median of 3 and with median of √n elements as pivot. We compare them with Quicksort implemented with the same pivot selection strategies, Ultimate Heapsort, Bottom-Up-Heapsort and MDR-Heapsort. In Table 2 we also added the values for Relaxed-Weak-Heapsort which were presented in [8].
Table 1: Running times for QuickHeapsort and other algorithms tested on 10^6 elements, average over 10 runs.

| Sorting algorithm                   | integer data, time [s] | lg(4)-test-function, time [s] |
|-------------------------------------|------------------------|-------------------------------|
| Basic QuickHeapsort, median of 3    | 0.1154                 | 4.21                          |
| Basic QuickHeapsort, median of √n   | 0.1171                 | 4.109                         |
| Improved QHS, median of 3           | 0.1073                 | 4.049                         |
| Improved QHS, median of √n          | 0.1118                 | 3.911                         |
| QHS with bit-arrays, median of 3    | 0.1581                 | 3.756                         |
| QHS with bit-arrays, median of √n   | 0.164                  | 3.7                           |
| Quicksort with median of 3          | 0.1181                 | 3.946                         |
| Quicksort with median of √n         | 0.1316                 | 3.648                         |
| Ultimate Heapsort                   | 0.135                  | 5.109                         |
| Bottom-Up-Heapsort                  | 0.1677                 | 4.132                         |
| MDR-Heapsort                        | 0.2596                 | 4.129                         |
We also compare the different pivot selection strategies on the basic QuickHeapsort with no modifications. We test sample sizes of one, three, and approximately lg n, n^(1/4), √(n/lg n), √n, and n^(3/4) for the pivot selection.

In Table 3 the average numbers of comparisons and the standard deviations are listed. We ran the algorithms on arrays of length 10000 and one million.
Table 2: QuickHeapsort and other algorithms tested on 10^6 elements (the data for Relaxed-Weak-Heapsort is taken from [8]).

| Sorting algorithm                           | Average number of comparisons for n = 10^6 |
|---------------------------------------------|--------------------------------------------|
| Basic QuickHeapsort with median of 3        | 21327478                                   |
| Basic QuickHeapsort with median of √n       | 20783631                                   |
| Improved QuickHeapsort, median of 3         | 20639046                                   |
| Improved QuickHeapsort, median of √n        | 20135688                                   |
| QuickHeapsort with bit-arrays, median of 3  | 19207289                                   |
| QuickHeapsort with bit-arrays, median of √n | 18690841 (best result)                     |
| Quicksort with median of 3                  | 21491310                                   |
| Quicksort with median of √n                 | 19548149                                   |
| Bottom-Up-Heapsort                          | 20294866                                   |
| MDR-Heapsort                                | 20001084                                   |
| Relaxed-Weak-Heapsort                       | 18951425                                   |
| Lower bound: lg(10^6!)                      | 18488884                                   |
The displayed data are the average respectively the standard deviation of 100 runs of QuickHeapsort with the respective pivot selection strategy.

These results are not very surprising: the larger the samples get, the smaller the standard deviation. The average number of comparisons reaches its minimum at a sample size of approximately √n elements. One notices that the difference in the average number of comparisons is relatively small, especially between the different pivot selection strategies with non-constant sample sizes. This confirms experimentally that the o(n)-terms in Thm. 3.1 and Thm. 3.8 are not too big.
Table 3: Different strategies for pivot selection for basic QuickHeapsort tested on 10^4 and 10^6 elements. The standard deviation of our experiments is given in percent of the average number of comparisons.

| Sample size | Avg. #comparisons (n = 10^4) | Std. dev. [%] | Avg. #comparisons (n = 10^6) | Std. dev. [%] |
|-------------|------------------------------|---------------|------------------------------|---------------|
| 1           | 152573                       | 4.281         | 21975912                     | 3.452         |
| 3           | 146485                       | 2.169         | 21327478                     | 1.494         |
| lg n        | 143669                       | 0.954         | 20945889                     | 0.525         |
| n^(1/4)     | 143620                       | 0.857         | 20880430                     | 0.352         |
| √(n/lg n)   | 142634                       | 0.413         | 20795986                     | 0.315         |
| √n          | 142642                       | 0.305         | 20783631                     | 0.281         |
| n^(3/4)     | 147134                       | 0.195         | 20914822                     | 0.168         |
C Some Words about the Worst Case Running Time

Obviously the worst case running time depends on how the pivot element is chosen. If just one random element is used as pivot, we get the same quadratic worst case running time as for Quicksort. However, the probability that QuickHeapsort runs into such a "bad case" is not higher than for Quicksort, since any choice of pivot elements leading to a worst case scenario in QuickHeapsort also yields the worst case for Quicksort.

If we choose the pivot element as median of approximately 2 lg n elements, we get a worst case running time of O(n²/lg n), i.e. for the worst case it makes almost no difference whether the pivot is selected as median of 2 lg n elements or just as one random element.
However, if we use approximately n/lg n elements as sample for the pivot selection, we can get a better bound on the worst case.

Let f: N → N_{≥1} be some monotonically growing function with f ∈ o(n) (e.g. f(n) = lg n). We can apply the ideas of the Median of Medians algorithm [1]: first we choose n/f(n) random elements, then we group them into groups of five elements each. The median of each group can be determined with six comparisons [13, p. 215]. Now the median of these medians can be computed using Quickselect. We assume that Quickselect is implemented with the same strategy for pivot selection. That means we get the same recurrence relation for the worst case complexity of the partitioning-phases in QuickHeapsort and for the worst case of Quickselect:

    T(n) = n + 6n/(5f(n)) + T(n/(5f(n))) + T(n − 3n/(10f(n))).

This yields T(n) ≤ c·n·f(n) for some c large enough. Hence with this pivot selection strategy we reach a worst case running time for QuickHeapsort of n lg n + O(n·f(n)) and, if f(n) ∈ ω(1), the average running time as stated in Sect. 3.
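The recurrence can be evaluated numerically to make the bound T(n) ≤ c·n·f(n) plausible. The sketch below is our own (the choices f(n) = lg n, integer rounding, and a crude quadratic base case are ours); it checks that T(n)/(n·lg n) stays below a small constant:

```python
import math
import sys
from functools import lru_cache

sys.setrecursionlimit(100000)

def f(n):
    """Sample-size parameter of the recurrence; f(n) = lg n is the
    example from the text (any monotone f with f = o(n) works)."""
    return max(1.0, math.log2(n))

@lru_cache(maxsize=None)
def T(n):
    """Worst-case cost of the partitioning phases following the
    median-of-medians style recurrence, with integer rounding and a
    crude quadratic base case (both choices are ours)."""
    if n <= 10:
        return n * n
    fn = f(n)
    group_medians = 6 * n / (5 * fn)        # 6 comparisons per group of 5
    select = T(int(n / (5 * fn)))           # Quickselect on the medians
    rest = T(n - max(1, int(3 * n / (10 * fn))))
    return n + group_medians + select + rest

# T(n) should be O(n * f(n)); the ratio stays below a small constant.
for n in [10**3, 10**4, 10**5]:
    ratio = T(n) / (n * f(n))
    print(f"n = {n:>7}: T(n)/(n lg n) = {ratio:.2f}")
    assert ratio < 20
```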
Driving this strategy to the end and choosing f(n) = 1 leads to Ultimate Heapsort (or rather a slight modification of it, and Quickselect turns into the Median of Medians algorithm). Then we have T(n) = n lg n + O(n) for the worst case of QuickHeapsort. However, our bound for the average case does not hold anymore.
In order to obtain an n lg n + O(n)-bound for the worst case without losing our bound for the average case, we can apply a simple trick: whenever after the partitioning it turns out that the pivot does not lie in the interval {n/4, ..., 3n/4}, we switch to Ultimate Heapsort. This immediately yields the worst case bound of n lg n + O(n). Moreover, the proof of Thm. 3.8 can easily be changed in order to deal with this modification: let C·n be the worst case number of comparisons for pivot selection and partitioning in Ultimate Heapsort. We can change Eq. (7) to

    Pr[pivot < µ − δn] ≤ ε/(8C).

Then the rest of the proof is exactly the same. Hence Thm. 3.8 and Thm. 3.1 are also valid when switching to Ultimate Heapsort in the case of a 'bad' choice of the pivot.
D Pseudocode of Basic QuickHeapsort

Algorithm 4.1
procedure QuickHeapsort(A[1..n])
begin
    if n > 1 then
        p := ChoosePivot();
        k := PartitionReverse(A[1..n], p);
        if k ≤ n/2 then
            TwoLayerMaxHeap(A[1..n], k − 1);    (heap-area: {1..k−1})
            swap(A[k], A[n−k+1]);
            QuickHeapsort(A[1..n−k]);           (recursion)
        else
            TwoLayerMinHeap(A[1..n], n − k);    (heap-area: {k+1..n})
            swap(A[k], A[n−k+1]);
            QuickHeapsort(A[(n−k+2)..n]);       (recursion)
        endif
    endif
endprocedure

The ChoosePivot function returns an element p of the array chosen as pivot. The PartitionReverse function returns an index k and rearranges the array A so that p = A[k], A[i] ≥ A[k] for i < k, and A[i] ≤ A[k] for i > k, using n − 1 comparisons.
Algorithm 4.2
function SpecialLeaf(A[1..m]):
begin
    i := 1;
    while 2i ≤ m do                 (i.e. while i is not a leaf)
        if 2i + 1 ≤ m and A[2i+1] > A[2i] then
            A[i] := A[2i+1];
            i := 2i + 1;
        else
            A[i] := A[2i];
            i := 2i;
        endif
    endwhile
    return i;
endfunction
Algorithm 4.3
procedure TwoLayerMaxHeap(A[1..n], m)
begin
    ConstructHeap(A[1..m]);
    for i := 1 to m do
        temp := A[n−i+1];
        A[n−i+1] := A[1];
        j := SpecialLeaf(A[1..m]);
        A[j] := temp;
    endfor
endprocedure

The procedure TwoLayerMinHeap is symmetric to TwoLayerMaxHeap, so we do not present its pseudocode here.
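For readers who want to experiment, the pseudocode above can be transcribed directly into Python. The sketch below is our own (0-based indices; ChoosePivot is a single random element rather than a median-of-√n sample; ConstructHeap is Floyd's sift-down; the function names mirror the pseudocode). It illustrates the control flow and the two-layer heap extraction, not a comparison-optimal implementation.

```python
import random
from operator import gt, lt

def partition_reverse(A, lo, hi, p):
    """Partition A[lo:hi] around the pivot A[p] in *reverse* (descending)
    order: afterwards A[lo:k] >= A[k] >= A[k+1:hi]. Returns the pivot
    index k, using hi - lo - 1 comparisons (Lomuto scheme)."""
    A[p], A[hi - 1] = A[hi - 1], A[p]
    pivot, k = A[hi - 1], lo
    for i in range(lo, hi - 1):
        if A[i] >= pivot:
            A[i], A[k] = A[k], A[i]
            k += 1
    A[k], A[hi - 1] = A[hi - 1], A[k]
    return k

def special_leaf(A, base, m, better):
    """Move the 'hole' at the root of the heap A[base:base+m] down to a
    leaf, promoting the better child (one comparison per level); returns
    the 1-based heap index of that leaf."""
    i = 1
    while 2 * i <= m:
        left = base + 2 * i - 1
        if 2 * i + 1 <= m and better(A[left + 1], A[left]):
            A[base + i - 1], i = A[left + 1], 2 * i + 1
        else:
            A[base + i - 1], i = A[left], 2 * i
    return i

def construct_heap(A, base, m, better):
    """Floyd's bottom-up heap construction on A[base:base+m]."""
    for i in range(m // 2, 0, -1):
        j = i
        while 2 * j <= m:
            c = 2 * j
            if c + 1 <= m and better(A[base + c], A[base + c - 1]):
                c += 1
            if better(A[base + c - 1], A[base + j - 1]):
                A[base + c - 1], A[base + j - 1] = A[base + j - 1], A[base + c - 1]
                j = c
            else:
                break

def two_layer_max_heap(A, lo, hi, m):
    """Max-heap on A[lo:lo+m]; the m maxima go to hi-1, hi-2, ...; the
    displaced (smaller) elements fill the freed leaves via special_leaf."""
    construct_heap(A, lo, m, gt)
    for i in range(1, m + 1):
        temp = A[hi - i]
        A[hi - i] = A[lo]                           # current maximum
        A[lo + special_leaf(A, lo, m, gt) - 1] = temp

def two_layer_min_heap(A, lo, hi, m):
    """Mirror image: min-heap on A[hi-m:hi], minima go to the front."""
    base = hi - m
    construct_heap(A, base, m, lt)
    for i in range(m):
        temp = A[lo + i]
        A[lo + i] = A[base]                         # current minimum
        A[base + special_leaf(A, base, m, lt) - 1] = temp

def quick_heapsort(A, lo=0, hi=None):
    if hi is None:
        hi = len(A)
    n = hi - lo
    if n <= 1:
        return A
    p = random.randrange(lo, hi)                    # ChoosePivot (simplified)
    k1 = partition_reverse(A, lo, hi, p) - lo + 1   # 1-based pivot rank
    if 2 * k1 <= n:                                 # heap-area {1..k1-1}
        two_layer_max_heap(A, lo, hi, k1 - 1)
        A[lo + k1 - 1], A[lo + n - k1] = A[lo + n - k1], A[lo + k1 - 1]
        quick_heapsort(A, lo, lo + n - k1)
    else:                                           # heap-area {k1+1..n}
        two_layer_min_heap(A, lo, hi, n - k1)
        A[lo + k1 - 1], A[lo + n - k1] = A[lo + n - k1], A[lo + k1 - 1]
        quick_heapsort(A, lo + n - k1 + 1, hi)
    return A

random.seed(0)
data = [random.randrange(100) for _ in range(25)]
assert quick_heapsort(data[:]) == sorted(data)
```

Note that special_leaf never compares the element it finally places at the leaf: correctness rests on the two-layer invariant that every element pulled from the output area is on the other side of the pivot than every element of the heap.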