QuickHeapsort: Modifications and Improved Analysis

Volker Diekert and Armin Weiß
Universität Stuttgart, FMI
Universitätsstraße 38
D-70569 Stuttgart, Germany
{diekert,weiss}@fmi.uni-stuttgart.de

March 7, 2013
Abstract
We present a new analysis for QuickHeapsort, splitting it into the analysis of the partition-phases and the analysis of the heap-phases. This enables us to consider samples of non-constant size for the pivot selection and leads to better theoretical bounds for the algorithm.

Furthermore, we introduce some modifications of QuickHeapsort, both in-place and using n extra bits. We show that on every input the expected number of comparisons is n lg n − 0.03n + o(n) (in-place) respectively n lg n − 0.997n + o(n) (throughout, lg n = log_2 n). Both estimates improve the previously known best results. (It is conjectured [19] that the in-place algorithm Bottom-Up-Heapsort uses at most n lg n + 0.4n comparisons on average, and for Weak-Heapsort, which uses n extra bits, the average number of comparisons is at most n lg n − 0.42n [8].) Moreover, our non-in-place variant can even compete with index-based Heapsort variants (e.g. Rank-Heapsort [17]) and Relaxed-Weak-Heapsort (n lg n − 0.9n + o(n) comparisons in the worst case), for which no O(n) bound on the number of extra bits is known.
Keywords. In-place sorting - heapsort - quicksort - analysis of algorithms
1 Introduction
QuickHeapsort is a combination of Quicksort and Heapsort which was first described by Cantone and Cincotti [2]. It is based on Katajainen's idea for Ultimate Heapsort [12]. In contrast to Ultimate Heapsort, it does not have an O(n lg n) bound for the worst case running time (lg n = log_2 n). Its advantage is that it is very fast in the average case and hence not only of theoretical interest. Both algorithms have in common that first the array is partitioned into two parts. Then in one part a heap is constructed and the elements are successively extracted. Finally, the remaining elements are treated recursively. The main advantage of this method is that the sift-down needs only one comparison per level, whereas standard Heapsort needs two comparisons per level (for a description of standard Heapsort see a standard textbook, e.g. [6]). The two comparisons per level are a severe drawback and one of the reasons why standard Heapsort cannot compete with Quicksort in practice (of course there are also other reasons, like cache behavior). Over time, many solutions to this problem have appeared, like Bottom-Up-Heapsort [19] or MDR-Heapsort [15, 18], which both perform the sift-down by first going down to some leaf and then searching upward for the correct position. Since one can expect that the final position of the element inserted during sift-down is near some leaf, this is a good heuristic and it leads to provably good results. The difference between QuickHeapsort and Ultimate Heapsort lies in the choice of the pivot element for partitioning the array. While for Ultimate Heapsort the pivot is chosen as median of the whole array, for QuickHeapsort the pivot is selected as median of some smaller sample (e.g. as median of 3 elements).
In [2] the basic version with a fixed index as pivot is analyzed and – together with the median-of-three version – implemented and compared with other Quicksort and Heapsort variants. In [8] Edelkamp and Stiegeler compare these variants with the so-called Weak-Heapsort [7] and some modifications of it (e.g. Relaxed-Weak-Heapsort). Weak-Heapsort beats basic QuickHeapsort with respect to the number of comparisons, but it needs O(n) bits of extra space (for Relaxed-Weak-Heapsort this bound is only conjectured) and hence is not in-place.
We split the analysis of QuickHeapsort into three parts: the partitioning phases, the heap construction, and the heap extraction. This allows us to get better bounds for the running time, especially when choosing the pivot as median of a larger sample. It also simplifies the analysis. We introduce some modifications of QuickHeapsort, too. The first one is in-place and needs n lg n − 0.03n + o(n) comparisons on average, which is, to the best of our knowledge, better than any other known in-place Heapsort variant. We also examine a modification using O(n) bits of extra space, which applies the ideas of MDR-Heapsort to QuickHeapsort. With this method we can bound the average number of comparisons by n lg n − 0.997n + o(n). Actually, a complicated, iterated in-place MergeInsertion uses only n lg n − 1.3n + O(lg n) comparisons [16]. Unfortunately, for practical purposes this algorithm is not competitive.
Our contributions are as follows: 1. We give a simplified analysis which yields better bounds than previously known. 2. Our approach yields the first precise analysis of QuickHeapsort when the pivot element is taken from a larger sample. 3. We give a simple in-place modification of QuickHeapsort which saves 0.75n comparisons. 4. We give a modification of QuickHeapsort using n extra bits only, and we can bound its expected number of comparisons. This bound is better than the previously known worst case bounds of Heapsort variants using O(n lg n) extra bits, for which best and worst case are almost the same. 5. We have implemented QuickHeapsort, and our experiments confirm the theoretical predictions.
The paper is organized as follows: Sect. 2 briefly describes the basic QuickHeapsort algorithm together with our first improvement. In Sect. 3 we analyze the expected running time of QuickHeapsort. Then we introduce some improvements in Sect. 4 allowing O(n) additional bits. Finally, in Sect. 5, we present our experimental results comparing the different versions of QuickHeapsort with other Quicksort and Heapsort variants.
2 QuickHeapsort
A two-layer-min-heap is an array A[1..n] of n elements together with a partition (G, R) of {1, ..., n} into green and red elements such that for all g ∈ G, r ∈ R we have A[g] ≤ A[r]. Furthermore, the green elements g satisfy the heap condition A[g] ≤ min{A[2g], A[2g+1]}, and if a position i is red, then 2i and 2i+1 are red, too. (The conditions are required to hold only if the indices involved are in the range of 1 to n.) The green elements are called "green" because they can be extracted from the heap without caution, whereas the "red" elements are blocked. Two-layer-max-heaps are defined analogously. We can think of a two-layer-heap as a rooted binary tree such that each node is either green or red. Green nodes satisfy the standard heap-condition, children of red nodes are red. Two-layer-heaps were defined in [12]. In [2] a different language is used for the same concept (they describe the algorithm in terms of External Heapsort). Now we are ready to describe the QuickHeapsort algorithm as it has been proposed in [2]. Most of it can also be found in pseudocode in App. D.
We intend to sort an array A[1..n]. First, we choose a pivot p. This is the randomized part of the algorithm. Then, just as in Quicksort, we rearrange the array according to p. That means, using n − 1 comparisons, the partitioning function returns an index k and rearranges the array A so that A[i] ≤ A[k] for i < k, A[k] = p, and A[k] ≤ A[j] for k < j. After the partitioning, a two-layer-heap is built out of the elements of the smaller part of the array, either the part left of the pivot or right of the pivot. We call this smaller part the heap-area and the larger part the work-area. More precisely, if k − 1 < n − k, then {1, ..., k−1} is the heap-area and {k+1, ..., n} is the work-area. If k − 1 ≥ n − k, then {1, ..., k−1} is the work-area and {k+1, ..., n} is the heap-area. Note that we know the final position of the pivot element without any further comparison. Therefore, we count it neither to the heap-area nor to the work-area. If the heap-area is the part of the array left of the pivot, a two-layer-max-heap is built, otherwise a two-layer-min-heap is built.
At the beginning the heap-area is an ordinary heap, hence it is a two-layer-heap consisting of green elements only. Now the heap extraction phase starts. We assume that we are in the case of a max-heap; the other case is symmetric. Let m denote the size of the heap-area. The m elements of the heap-area are moved to the work-area. The extraction of one element works as follows: the root of the heap is placed at the current position of the work-area (which at the beginning is its last position). Then, starting from the root, the resulting "hole" is trickled down: always the larger child is moved up into the vacant position and then this child is treated recursively. This stops as soon as a leaf is reached. We call this the SpecialLeaf procedure (Alg. 4.2) according to [2]. Now, the element which previously was at the current position in the work-area is placed as a red element into this hole at the leaf of the heap-area. Finally, the current position in the work-area is moved by one and the next element can be extracted.

The procedure sorts correctly, because after the partitioning it is guaranteed that all red elements are smaller than all green elements. Furthermore, there is enough space in the work-area to place all green elements of the heap, since the heap is always the smaller part of the array. After extracting all green elements, the pivot element is placed at its final position and the remaining elements are sorted recursively.
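
To make the extraction phase concrete, here is a minimal Python sketch of SpecialLeaf and the extraction loop of a two-layer-max-heap (our code, with 0-based indices; it follows Algorithms 4.2 and 4.3 in App. D, and all names are ours):

    def special_leaf(A, m):
        """Trickle the hole at the root of the max-heap A[0..m-1] down to a
        leaf, always moving the larger child up (one comparison per level)."""
        i = 0
        while 2 * i + 1 < m:                       # while i is not a leaf
            left, right = 2 * i + 1, 2 * i + 2
            if right < m and A[right] > A[left]:
                A[i] = A[right]
                i = right
            else:
                A[i] = A[left]
                i = left
        return i

    def extract_two_layer_max_heap(A, m, n):
        """Extract the m green elements of the max-heap A[0..m-1] into the
        work-area at the end of A[0..n-1] (assumed to have size >= m); red
        elements from the work-area fill the holes at the leaves."""
        for pos in range(n - 1, n - m - 1, -1):
            red = A[pos]                           # current work-area element
            A[pos] = A[0]                          # heap maximum goes out
            A[special_leaf(A, m)] = red            # red element fills the hole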
Actually, we can improve the procedure, thereby saving 3n/4 comparisons by a simple trick. Before the heap extraction phase starts on the heap-area with m elements, we perform at most (m + 2)/4 additional comparisons in order to arrange all pairs of leaves which share a parent such that the left child is not smaller than its right sibling. Now, in every call of SpecialLeaf, we can save exactly one comparison, since we do not need to compare two leaves. For a max-heap we only need to move up the left child and put the right one at the place of the former left one. Summing up over all heaps during an execution of standard QuickHeapsort, we invest (n + 2t)/4 comparisons in order to save n comparisons, where t is the number of recursive calls. The expected value of t is in O(lg n). Hence, we can expect to save 3n/4 − O(lg n) comparisons. We call this version the improved variant of QuickHeapsort.
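
The leaf-arranging step can be sketched as follows (our code, 0-based indices). One comparison per pair of sibling leaves establishes the invariant; SpecialLeaf then promotes the left leaf without a comparison and moves the right leaf into the left slot, so the red element always lands in the right slot.

    def arrange_leaf_pairs(A, m):
        """For a max-heap A[0..m-1], order every pair of sibling leaves so
        that the left sibling is not smaller than the right one; this costs
        about m/4 comparisons once per heap."""
        for p in range(m // 2):                    # p runs over inner nodes
            l, r = 2 * p + 1, 2 * p + 2
            if r < m and 2 * l + 1 >= m:           # both children exist and are leaves
                if A[l] < A[r]:
                    A[l], A[r] = A[r], A[l]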
3 Analysis of QuickHeapsort
This section contains the main contribution of the paper. We analyze the number of comparisons. By n we denote the number of elements of the array to be sorted. We use standard O-notation where O(g), o(g), and ω(g) denote classes of functions. In our analysis we do not assume any random distribution of the input, i.e. it is valid for every permutation of the input array. Randomization is used, however, for pivot selection. With Pr[e] we denote the probability of some event e. The expected value of a random variable T is denoted by E[T].

The number of assignments is bounded by some small constant times the number of comparisons. Let T(n) denote the number of comparisons during QuickHeapsort on a fixed array of n elements. We are going to split the analysis of QuickHeapsort into three parts:

1. Partitioning with an expected number of comparisons E[Tpart(n)] (average case).
2. Heap construction with at most Tcon(n) comparisons (worst case).
3. Heap extraction (sorting phase) with at most Text(n) comparisons (worst case).

We analyze the three parts separately and put them together at the end. The partitioning is the only randomized part of our algorithm. The expected number of comparisons depends on the selection method for the pivot. For the expected number of comparisons of QuickHeapsort on the input array we obtain E[T(n)] ≤ Tcon(n) + Text(n) + E[Tpart(n)].
Theorem 3.1 The expected number E[T(n)] of comparisons by basic resp. improved QuickHeapsort with pivot selected as median of p randomly chosen elements on a fixed input array of size n is E[T(n)] ≤ n lg n + c·n + o(n) with c as follows:

    p       c_basic    c_improved
    1       +2.72      +1.97
    3       +1.92      +1.17
    f(n)    +0.72      −0.03

Here, f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n, e.g., f(n) = √n, and we assume that we choose the median of f(n) randomly selected elements in time O(f(n)).
As we see, the selection method for the pivot is very important. However, one should notice that the bounds for fixed-size samples for pivot selection are not tight. The proofs of these results are postponed to Sect. 3.3. Note that it is enough to prove the results without the improvement, since the difference is always 0.75n.
3.1 Heap Construction
The standard heap construction [9] needs at most 2m comparisons to construct a heap of size m in the worst case and approximately 1.88m in the average case. For the mathematical analysis, better theoretical bounds can be used. The best result we are aware of is due to Chen et al. [5]. According to this result we have Tcon(m) ≤ 1.625m + o(m). Earlier results are of similar magnitude: by [4] it has been known that Tcon(m) ≤ 1.632m + o(m), and by [10] it has been known that Tcon(m) ≤ 1.625m + o(m); but Gonnet and Munro used O(m) extra bits to get this result, whereas the new result of Chen et al. is in-place (using only O(lg m) extra bits).

During an execution of QuickHeapsort over n elements, every element is part of a heap only once. Hence, the sizes of all heaps during the entire procedure sum up to n. With the result of [5], the total number of comparisons performed in the construction of all heaps satisfies:

Proposition 3.2 Tcon(n) ≤ 1.625n + o(n).
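
For reference, a plain Python sketch (our code) of the standard construction [9], which spends up to two comparisons per level in each sift-down; the refined algorithms of [4, 5, 10] improve on this bound but are considerably more involved:

    def construct_heap(A, m):
        """Floyd's bottom-up construction of a max-heap on A[0..m-1]."""
        for p in range(m // 2 - 1, -1, -1):
            x, i = A[p], p
            while 2 * i + 1 < m:
                c = 2 * i + 1
                if c + 1 < m and A[c + 1] > A[c]:  # comparison 1: larger child
                    c += 1
                if A[c] <= x:                      # comparison 2: heap condition holds
                    break
                A[i] = A[c]
                i = c
            A[i] = x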
3.2 Heap Extraction
For a real number r ∈ R with r > 0 we define {r} by the following condition:

    r = 2^k + {r}   with k ∈ Z and 0 ≤ {r} < 2^k.

This means that 2^k is the largest power of 2 which is less than or equal to r, and {r} is the difference to that power, i.e. {r} = r − 2^⌊lg r⌋. (For example, {6} = 6 − 4 = 2.) In this section we first analyze the extraction phase of one two-layer-heap of size m. After that, we bound the number of comparisons Text(n) performed in the worst case during all heap extraction phases of one execution of QuickHeapsort on an array of size n. Thm. 3.3 is our central result about heap extraction.

Theorem 3.3 Text(n) ≤ n·(⌊lg n⌋ − 3) + 2{n} + O(lg² n).

The proof of Thm. 3.3 covers almost all of the rest of Section 3.2. In the following, the height height(v) of an element v in a heap H is the maximal distance from that node to a leaf below it. The height of H is the height of its root. The level level(v) of v is its distance from the root. In this section we want to count the comparisons during SpecialLeaf procedures only. Recall that a SpecialLeaf procedure is a cyclic shift on a path from the root down to some leaf, and the number of comparisons is exactly the length of this path. Hence the upper bound is the height of the heap. But there is a better analysis.
Let us consider a heap with m green elements which are all extracted by SpecialLeaf procedures. The picture is as follows: First, we color the green root red. Next, we perform the cyclic shift defined by the SpecialLeaf procedure. In particular, the leaf is now red. Moreover, red positions remain red, but there is exactly one position v which has changed its color from green to red. This position v is on the path defined by the SpecialLeaf procedure. Hence, the number of comparisons needed to color the position v red is bounded by height(v) + level(v).

The total number of comparisons E(m) to extract all m elements of a heap H is therefore bounded by

    E(m) ≤ Σ_{v ∈ H} (height(v) + level(v)).

We have height(H) − 1 ≤ height(v) + level(v) ≤ height(H) = ⌊lg m⌋ for all v ∈ H. We now count the number of elements v where height(v) + level(v) = ⌊lg m⌋ and the number of elements v where height(v) + level(v) = ⌊lg m⌋ − 1. Since there are exactly {m} + 1 nodes on level ⌊lg m⌋, there are at most 2{m} + 1 + ⌊lg m⌋ elements v with height(v) + level(v) = ⌊lg m⌋. All other elements satisfy height(v) + level(v) = ⌊lg m⌋ − 1. We obtain

    E(m) ≤ 2·{m}·⌊lg m⌋ + (m − 2·{m})·(⌊lg m⌋ − 1) + O(lg m)
         = m·(⌊lg m⌋ − 1) + 2·{m} + O(lg m).                         (1)
Note that this is an estimate of the worst case; however, this analysis also shows that the best case differs only by O(lg m)-terms from the worst case.

Now we want to estimate the number of comparisons performed in the worst case during all heap extraction phases together. During QuickHeapsort over n elements we create a sequence H1, ..., Ht of heaps of green elements which are extracted using the SpecialLeaf procedure. Let mi = |Hi| be the size of the i-th heap. The sequence satisfies 2mi ≤ n − Σ_{j<i} mj, because heaps are constructed and extracted on the smaller part of the array.

Here comes a subtle observation: Assume that m1 + m2 ≤ n/2. If we replace the first two heaps with one heap H of size |H| = m1 + m2, then the analysis using the sequence H, H3, ..., Ht cannot lead to a better bound. Continuing this way, we may assume that we have t ∈ O(lg n) and therefore Σ_{1≤i≤t} O(lg mi) ⊆ O(lg² n). With Eq. (1) we obtain the bound

    Text(n) ≤ Σ_{i=1}^{t} E(mi) = (Σ_{i=1}^{t} (mi·⌊lg mi⌋ + 2{mi})) − n + O(lg² n).   (2)

Later we will replace the mi by other positive real numbers. Therefore we define the following notion. Let 1 ≤ ν ∈ R. We say a sequence x1, x2, ..., xt with xi ∈ R>0 is valid w.r.t. ν if for all 1 ≤ i ≤ t we have 2xi ≤ ν − Σ_{j<i} xj. As just mentioned, the initial sequence m1, m2, ..., mt is valid w.r.t. n.

Let us define a continuous function F: R>0 → R by F(x) = x·⌊lg x⌋ + 2{x}. It is continuous since for x = 2^k, k ∈ Z, we have F(x) = xk = lim_{ε→0} ((x − ε)(k − 1) + 2{x − ε}). It is piecewise differentiable with right derivative ⌊lg x⌋ + 2. Therefore:

Lemma 3.4 Let x ≥ y > δ ≥ 0. Then we have the inequalities:

    F(x) + F(y) ≤ F(x + δ) + F(y − δ)   and   F(x) + F(y) ≤ F(x + y).
Lemma 3.5 Let 1 ≤ ν ∈ R. For all sequences x1, x2, ..., xt with xi ∈ R>0 which are valid w.r.t. ν, we have

    Σ_{i=1}^{t} F(xi) ≤ Σ_{i=1}^{⌊lg ν⌋} F(ν/2^i).

Proof. The result is true for ν ≤ 2, because then F(xi) ≤ F(ν/2) ≤ F(1) = 0 for all i. Thus, we may assume ν ≥ 2. We perform induction on t. For t = 1 the statement is clear, since ⌊lg ν⌋ ≥ 1 and x1 ≤ ν/2. Now let t > 1. By Lem. 3.4, we have F(x1) + F(x2) ≤ F(x1 + x2). Now, if x1 + x2 ≤ ν/2, then the sequence x1 + x2, x3, ..., xt is valid, too; and we are done by induction. Hence, we may assume x1 + x2 > ν/2. If x1 ≤ x2, then

    2x1 = 2x2 + 2(x1 − x2) ≤ ν − x1 + 2(x1 − x2) = ν − x2 + (x1 − x2) ≤ ν − x2.

Thus, if x1 ≤ x2, then the sequence x2, x1, x3, ..., xt is valid, too. Thus, it is enough to consider x1 ≥ x2 with x1 + x2 > ν/2.

We have ν/2 ≥ 1, and the sequence x2', x3, ..., xt with x2' = x1 + x2 − ν/2 is valid w.r.t. ν/2, because

    x2' = x1 + x2 − ν/2 ≤ x1 + (ν − x1)/2 − ν/2 = x1/2 ≤ ν/4.

Therefore, by induction on t and Lem. 3.4 we obtain the claim:

    Σ_{i=1}^{t} F(xi) ≤ F(ν/2) + F(x2') + Σ_{i=3}^{t} F(xi)
                     ≤ F(ν/2) + Σ_{i=2}^{⌊lg ν⌋} F(ν/2^i)
                     ≤ Σ_{i=1}^{⌊lg ν⌋} F(ν/2^i).
Lemma 3.6 Σ_{i=1}^{⌊lg n⌋} F(n/2^i) ≤ F(n) − 2n + O(lg n).

Proof of Lem. 3.6. Since ⌊lg(n/2^i)⌋ = ⌊lg n⌋ − i and {n/2^i} = {n}/2^i, we have

    Σ_{i=1}^{⌊lg n⌋} F(n/2^i)
        = n⌊lg n⌋·Σ_{i=1}^{⌊lg n⌋} 1/2^i − n·Σ_{i=1}^{⌊lg n⌋} i/2^i + 2{n}·Σ_{i=1}^{⌊lg n⌋} 1/2^i
        ≤ n⌊lg n⌋·Σ_{i≥1} 1/2^i − n·Σ_{i≥1} i/2^i + 2{n}·Σ_{i≥1} 1/2^i + (n/2^⌊lg n⌋)·Σ_{i>0} (i + ⌊lg n⌋)/2^i
        = n⌊lg n⌋ − 2n + 2{n} + O(lg n).

Applying these lemmata to Eq. (2) yields the proof of Thm. 3.3.

Corollary 3.7 We have Text(n) ≤ n lg n − 2.9139n + O(lg² n).

Proof. By [18, Thm. 1] we have F(n) − 2n ≤ n lg n − 1.9139n. Hence, Cor. 3.7 follows directly from Thm. 3.3.
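
The quantities {r} and F, and the constant behind Cor. 3.7, can be checked numerically; a small Python sketch (our code, not part of the paper):

    import math

    def frac(r):
        """{r}: the difference between r and the largest power of two <= r."""
        return r - 2 ** math.floor(math.log2(r))

    def F(x):
        """F(x) = x * floor(lg x) + 2 * {x} from Sect. 3.2."""
        return x * math.floor(math.log2(x)) + 2 * frac(x)

    # [18, Thm. 1], as used in Cor. 3.7, gives F(n) - 2n <= n lg n - 1.9139n,
    # i.e. F(n) <= n lg n + 0.0861n; check this for a range of n:
    assert all(F(n) <= n * math.log2(n) + 0.0862 * n for n in range(2, 100000))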
3.3 Partitioning
In the following, Tpivot(n) denotes the number of comparisons required to choose the pivot element in the worst case; and, as before, E[Tpart(n)] denotes the expected number of comparisons performed during partitioning. We have the following recurrence:

    E[Tpart(n)] ≤ n − 1 + Tpivot(n) + Σ_{k=1}^{n} Pr[pivot = k]·E[Tpart(max{k − 1, n − k})].   (3)

If we choose the pivot at random, then we obtain by standard methods:

    E[Tpart(n)] ≤ n − 1 + (1/n)·Σ_{k=1}^{n} E[Tpart(max{k − 1, n − k})] ≤ 4n.   (4)

Similarly, if we choose the pivot with the median-of-three, then we obtain:

    E[Tpart(n)] ≤ 3.2n + O(lg n).   (5)
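
The bound of Eq. (4) can be sanity-checked by solving the recurrence with equality for small n (our code; random pivot, Tpivot = 0):

    N = 2000
    E = [0.0] * (N + 1)
    for n in range(2, N + 1):
        # one partitioning round plus the recursive call on the larger part
        E[n] = n - 1 + sum(E[max(k - 1, n - k)] for k in range(1, n + 1)) / n
    print(max(E[n] / n for n in range(2, N + 1)))   # stays below 4, approaching it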
The proof of the first part of Thm. 3.1 follows from the above equations, Thm. 3.3, and Prop. 3.2. Using a growing number of elements (as n grows) as sample for the pivot selection, we can do better. The second part of Thm. 3.1 follows from Thm. 3.3, Prop. 3.2, and Thm. 3.8.

Theorem 3.8 Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n. When choosing the pivot as median of f(n) randomly selected elements in time O(f(n)) (e.g. with the algorithm of [1]), the expected number of comparisons used in all recursive calls of partitioning is in 2n + o(n).

Thm. 3.8 is close to a well-known result in [14, Thm. 5] on Quickselect, see Cor. 3.10. Formally speaking, we cannot use it directly, because we deal with QuickHeapsort, where after partitioning the recursive call is on the larger part. Because of that, and for the sake of completeness, we give a proof. Moreover, our proof is elementary and simpler than the one in [14]. The key step is Lem. 3.9. Its proof is rather standard and can also be found in App. A.
Lemma 3.9 Let 0 < δ < 1/2. If we choose the pivot as median of 2c + 1 elements such that 2c + 1 ≤ n/2, then we have

    Pr[pivot ≤ n/2 − δn] < (2c + 1)·α^c,   where α = 4·(1/4 − δ²) < 1.
Proof of Thm. 3.8. As an abbreviation, we let E(n) = E[Tpart(n)] be the expected number of comparisons performed during partitioning. We are going to show that for all ε > 0 there is some D ∈ R such that

    E(n) < (2 + ε)n + D.   (6)

So, we fix some 1 ≥ ε > 0. We choose δ > 0 such that (2 + ε)δ < ε/4. Moreover, for this proof let µ = (n + 1)/2. Positions of possible pivots k with µ − δn ≤ k ≤ µ + δn form a small fraction of all positions, and they are located around the median. Nevertheless, applying Lem. 3.9 with c = f(n) ∈ ω(1) ∩ o(n) yields for all n which are large enough:

    Pr[pivot < µ − δn] ≤ (2f(n) + 1)·α^{f(n)} ≤ ε/48.   (7)

The analogous inequality holds for Pr[pivot > µ + δn]. Because Tpivot(n) ∈ o(n), we have

    Tpivot(n) ≤ εn/8   (8)

for n large enough. Now, we choose n0 such that Eq. (7) and Eq. (8) hold for n ≥ n0 and such that we have (2 + ε)δ + 2/n0 < ε/4. We set D = E(n0) + 1. Hence for n < n0 the desired result Eq. (6) holds. Now, let n ≥ n0. From Eq. (3) we obtain by symmetry:

    E(n) ≤ n − 1 + Tpivot(n) + Σ_{k=µ−δn}^{µ+δn} Pr[pivot = k]·E(k − 1)
                             + 2·Σ_{k=µ+δn+1}^{n} Pr[pivot = k]·E(k − 1).

Since E is monotone, E(k) can be bounded by the highest value in the respective interval:

    E(n) ≤ n + εn/8 + Pr[µ − δn ≤ pivot ≤ µ + δn]·E(µ + δn) + 2·Pr[pivot > µ + δn]·E(n − 1)
         ≤ n + εn/8 + (1 − ε/24)·E(µ + δn) + 2·(ε/48)·E(n − 1).

By induction we assume E(k) ≤ (2 + ε)k + D for k < n. Hence:

    E(n) ≤ n + εn/8 + (1 − ε/24)·((2 + ε)·(µ + δn) + D) + (ε/24)·((2 + ε)n + D)
         ≤ n + (2 + ε)·((n + 1)/2 + δn) + εn/8 + (ε/24)·(2 + ε)n + D
         ≤ 2n + 1 + ε/2 + (2 + ε)δn + (3/4)εn + D < (2 + ε)n + D.
Corollary 3.10 ([14]) Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n. When implementing Quickselect with the median of f(n) randomly selected elements as pivot, the expected number of comparisons is 2n + o(n).

Proof. In QuickHeapsort the recursion is always on the larger part of the array. Hence, the number of comparisons in partitioning for QuickHeapsort is an upper bound on the number of comparisons in Quickselect.

In [14] it is also proved that choosing the pivot as median of Θ(√n) elements is optimal for Quicksort as well as for Quickselect. This suggests that we choose the same value in QuickHeapsort, which is backed by our experiments.
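
A sketch of this pivot selection (our code): the sample median is computed here by sorting, which costs O(f(n) lg f(n)) comparisons instead of the O(f(n)) assumed in Thm. 3.8 via the algorithm of [1], but this does not change the o(n) overall overhead for f(n) = √n.

    import random

    def choose_pivot(A):
        """Median of about sqrt(len(A)) randomly chosen elements."""
        f = min(len(A), max(1, int(len(A) ** 0.5)) | 1)   # odd sample size
        sample = sorted(random.sample(A, f))
        return sample[len(sample) // 2]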
4 Modifications of QuickHeapsort Using Extra-space

In this section we describe some modifications of QuickHeapsort using n bits of extra storage. We introduce two bit-arrays. In one of them (the CompareArray) – which actually takes two bits per element – we store the comparisons already done (we need two bits because there are three possible values to store: right, left, unknown). In the other one (the RedGreenArray) we store which elements are red and which are green.

Since the heaps have maximum size n/2, the RedGreenArray only requires n/2 bits. The CompareArray is only needed for the inner nodes of the heaps, i.e. length n/4 is sufficient. In total this sums up to n extra bits.
For the heap construction we do not use the algorithms described in Sect. 3.1. With the CompareArray we can do better by using the algorithm of McDiarmid and Reed [15]. The heap construction works similarly to Bottom-Up-Heapsort, i.e. the array is traversed backward, calling the Reheap procedure on every inner position i. The Reheap procedure takes the subheap with root i and restores the heap condition if it is violated at the position i. First, the Reheap procedure determines a special leaf using the SpecialLeaf procedure as described in Sect. 2, but without moving the elements. Then, the final position of the former root is determined going upward from the special leaf (bottom-up-phase). In the end, the elements above this final position are moved up towards the root by one position. That means that all but one of the elements which are compared during the bottom-up-phase stay in their places. Since in the SpecialLeaf procedure these elements have been compared with their siblings, these comparisons can be stored in the CompareArray and can be used later.
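
A Python sketch of this Reheap strategy (our code, 0-based indices, without the CompareArray bookkeeping): the larger-child comparisons made on the way down are exactly the ones the CompareArray would record for later reuse.

    def reheap(A, root, m):
        """McDiarmid-Reed style sift-down in the max-heap A[0..m-1]: follow
        the special-leaf path without moving elements, find the final
        position of A[root] bottom-up, then shift the path up by one."""
        path, j = [root], root
        while 2 * j + 1 < m:                    # descend along larger children
            l, r = 2 * j + 1, 2 * j + 2
            j = r if r < m and A[r] > A[l] else l
            path.append(j)
        x, k = A[root], len(path) - 1
        while A[path[k]] < x:                   # bottom-up phase
            k -= 1
        for t in range(1, k + 1):               # shift path elements up by one
            A[path[t - 1]] = A[path[t]]
        A[path[k]] = x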
With another improvement concerning the construction of heaps with seven
elements as in [3] the benefits of this array can be exploited even more.
The RedGreenArray is used during the sorting phase only. Its functionality is straightforward: every time a red element is inserted into the heap, the corresponding bit is set to red. The SpecialLeaf procedure can stop as soon as it reaches an element without green children. Whenever a red and a green element have to be compared, the comparison can be skipped.
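
A sketch of the SpecialLeaf procedure guided by the RedGreenArray (our code, 0-based indices; red[i] == True means position i holds a red element):

    def special_leaf_rg(A, red, m):
        """Descend only while a green child exists; a red/green pair costs no
        comparison, and the descent stops at the first node without green
        children. Returns the position where the next red element goes."""
        i = 0
        while True:
            kids = [c for c in (2 * i + 1, 2 * i + 2) if c < m and not red[c]]
            if not kids:                     # no green child: stop here
                red[i] = True                # this position turns red now
                return i
            if len(kids) == 1:
                c = kids[0]                  # green beats red without a comparison
            else:
                c = kids[0] if A[kids[0]] >= A[kids[1]] else kids[1]
            A[i] = A[c]                      # move the larger green child up
            i = c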
Theorem 4.1 Let f ∈ ω(1) ∩ o(n) with 1 ≤ f(n) ≤ n, e.g., f(n) = lg n, and let E[T(n)] be the expected number of comparisons by QuickHeapsort using the CompareArray with the improvement of [3] and the RedGreenArray on a fixed input array of size n. Choosing the pivot as median of f(n) randomly selected elements in time O(f(n)), we have

    E[T(n)] ≤ n lg n − 0.997n + o(n).
Proof. We can analyze the savings by the two arrays separately, because the CompareArray only affects comparisons between two green elements, while the RedGreenArray only affects comparisons involving at least one red element.

First, we consider the heap construction using the CompareArray. With this array we obtain the same worst case bound as for the standard heap construction method. However, the CompareArray has the advantage that at the end of the heap construction many comparisons are stored in the array and can be reused for the extraction phase. More precisely: for every comparison except the first one made when going upward from the special leaf, one comparison is stored in the CompareArray, since for every additional comparison one element on the path defined by SpecialLeaf stays at its place. Because every pair of siblings has to be compared at some point during the heap construction or extraction, all these stored comparisons can be reused. Hence, we only have to count the comparisons in the SpecialLeaf procedures during the construction plus n/2 for the first comparison when going upward. Thus, we get an amortized bound of 3n/2 for the comparisons during construction.
In [3] the notion of Fine-Heaps is introduced. A Fine-Heap is a heap with the additional CompareArray such that for every node the larger child is stored in the array. Such a Fine-Heap of size m can be constructed with 2m comparisons using the above method. Carlsson, Chen and Mattsson [3] showed that a Fine-Heap of size m actually can be constructed with only (23/12)m + O(lg² m) comparisons. That means we have to invest (23/12)m + O(lg² m) comparisons for the heap construction, and at the end there are m/2 comparisons stored in the array. All these stored comparisons are used later. Summing up over all heaps during an execution of QuickHeapsort, the result of [3] saves another n/12 comparisons in addition to the comparisons saved by the CompareArray. Hence, for the amortized cost of the heap construction Tcon^amort (i.e. the number of comparisons needed to build the heap minus the number of comparisons stored in the CompareArray after the construction, all of which can be reused later) we have obtained:

Proposition 4.2 Tcon^amort(n) ≤ (17/12)n + o(n).

This bound is slightly better than the average case for the heap construction with the algorithm of [15], which is 1.52n.
Now we want to count the number of comparisons we save using the RedGreenArray. We distinguish the two cases that two red elements are compared and that a red and a green element are compared. Every position in the heap has to turn red at some point. At that time, all nodes below this position are already red. Hence, for that element we save as many comparisons as the element is above the bottom level. Summing over all levels of a heap of size m, the saving results in

    (m/4)·1 + (m/8)·2 + ··· = m·Σ_{i≥1} i·2^{−i−1} = m.

This estimate is exact up to O(lg m)-terms. Since the expected number of heaps is O(lg n), we obtain for the overall saving the value TsaveRR(n) = n + O(lg² n).

Another place where we save comparisons with the RedGreenArray is when a red element is compared with a green element. For every inner node this occurs at least once, namely when the node loses its last green child. Hence, we save at least as many comparisons as there are inner nodes with two children, i.e. at least m/2 − 1. Since every element – except the expected O(lg n) pivot elements – is part of a heap exactly once, we save at least TsaveRG(n) ≥ n/2 − O(lg n) comparisons when comparing green with red elements. In the average case the saving might be even slightly higher, since comparisons can also be saved when a node does not lose its last green child.
Summing up all our savings and using the median of f(n) ∈ ω(1) ∩ o(n) elements as pivot, we obtain the proof of Thm. 4.1:

    E[T(n)] ≤ Tcon^amort(n) + Text(n) + E[Tpart(n)] − TsaveRR(n) − TsaveRG(n)
            ≤ (17/12)n + n·(⌊lg n⌋ − 3) + 2{n} + 2n − (3/2)n + o(n)
            ≤ n lg n − 0.997n + o(n).
5 Experimental Results and Conclusion
In Fig. 1 we present the number of comparisons of the different versions of QuickHeapsort considered in this paper, i.e. the basic version, the improved variant of Sect. 2, and the version using bit-arrays (however, without the modification of [3]), for different values of n. We compare them with Quicksort, Ultimate Heapsort, Bottom-Up-Heapsort and MDR-Heapsort. All algorithms are implemented with median of √n elements as pivot (for Quicksort we additionally show the data with median of 3). For the heap construction we implemented the normal algorithm due to Floyd [9] as well as the algorithm using the extra bit-array (which is the same as in MDR-Heapsort).
[Figure: plot of (#comparisons − n lg n)/n against n for 10³ ≤ n ≤ 10⁶; curves for Quicksort with median of 3, Quicksort with median of √n, Basic QuickHeapsort, Improved QuickHeapsort, QuickHeapsort with bit-arrays, MDR-Heapsort, Ultimate-Heapsort, and the lower bound.]

Figure 1: Average number of comparisons of QuickHeapsort implemented with median of √n compared with other algorithms
More results with other pivot selection strategies are in Table 2 and Table 3 in App. B, confirming that a sample size of √n is optimal for pivot selection with respect to the number of comparisons, and also that the o(n)-terms in Thm. 3.1 and Thm. 3.8 are not too big. In Table 1 in App. B we present actual running times of the different algorithms for n = 1000000. All the numbers, except the running times, are average values over 100 runs with random data. As our theoretical estimates predict, QuickHeapsort with bit-arrays beats all other variants, including Relaxed-Weak-Heapsort (see Table 2, App. B), when implemented with median of √n for pivot selection. It also performs 326728 ≈ 0.33·10⁶ comparisons less than our theoretical prediction of 10⁶·lg(10⁶) − 0.9139·10⁶ ≈ 19017569 comparisons.
In this paper we have shown that with known techniques QuickHeapsort can be implemented with an expected number of comparisons less than n lg n − 0.03n + o(n) and extra storage O(1). On the other hand, using n extra bits we can improve this to n lg n − 0.997n + o(n), i.e. we showed that QuickHeapsort can compete with the most advanced Heapsort variants. These theoretical estimates were also confirmed by our experiments. We also considered different pivot selection schemes. For any constant-size sample for pivot selection, QuickHeapsort beats Quicksort for large n, since Quicksort has an expected running time of Cn lg n with C > 1. However, when choosing the pivot as median of √n elements (i.e. with the optimal strategy), our experiments show that Quicksort needs fewer comparisons than QuickHeapsort. However, using bit-arrays, QuickHeapsort is the winner again. In order to make the last statement rigorous, better theoretical bounds for Quicksort with sampling √n elements are needed. For future work it would also be of interest to prove the optimality of √n elements for pivot selection in QuickHeapsort, to estimate the lower order terms of the average running time of QuickHeapsort, and to find an exact average case analysis for the saving by the bit-arrays.
Acknowledgements. We thank Martin Dietzfelbinger, Stefan Edelkamp and Jyrki Katajainen for their helpful comments. We thank Simon Paridon for implementing the algorithms for our experiments.
References
[1] M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest, and R. E. Tarjan. Time
bounds for selection. J. Comput. Syst. Sci., 7(4):448–461, 1973.
[2] D. Cantone and G. Cincotti. QuickHeapsort, an efficient mix of classical
sorting algorithms. Theor. Comput. Sci., 285(1):25–42, 2002.
[3] S. Carlsson, J. Chen, and C. Mattsson. Heaps with Bits. In D.-Z. Du
and X.-S. Zhang, editors, ISAAC, volume 834 of LNCS, pages 288–296.
Springer, 1994.
[4] J. Chen. A Framework for Constructing Heap-like structures in-place. In
K.-W. Ng et al., editors, ISAAC, volume 762 of LNCS, pages 118–127.
Springer, 1993.
[5] J. Chen, S. Edelkamp, A. Elmasry, and J. Katajainen. In-place Heap
Construction with Optimized Comparisons, Moves, and Cache Misses. In
B. Rovan, V. Sassone, and P. Widmayer, editors, MFCS, volume 7464 of
LNCS, pages 259–270. Springer, 2012.
13
[6] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to
Algorithms. The MIT Press, 3 edition, 2009.
[7] R. D. Dutton. Weak-heap sort. BIT, 33(3):372–381, 1993.
[8] S. Edelkamp and P. Stiegeler. Implementing HEAPSORT with n lg n − 0.9n and QUICKSORT with n lg n + 0.2n comparisons. ACM J. of Exp. Alg., 7:5, 2002.
[9] R. W. Floyd. Algorithm 245: Treesort. Commun. ACM, 7(12):701, 1964.
[10] G. H. Gonnet and J. I. Munro. Heaps on Heaps. SIAM J. Comput.,
15(4):964–971, 1986.
[11] K. Kaligosi and P. Sanders. How Branch Mispredictions Affect Quicksort.
In Y. Azar and T. Erlebach, editors, ESA, volume 4168 of LNCS, pages
780–791. Springer, 2006.
[12] J. Katajainen. The Ultimate Heapsort. In X. Lin, editor, CATS, volume 20 of Australian Computer Science Communications, pages 87–96. Springer-Verlag, 1998.
[13] D. E. Knuth. The Art of Computer Programming. Vol. 3. Addison-Wesley, 1998.
[14] C. Martínez and S. Roura. Optimal Sampling Strategies in Quicksort and Quickselect. SIAM J. Comput., 31(3):683–705, 2001.
[15] C. McDiarmid and B. A. Reed. Building Heaps Fast. J. Alg., 10(3):352–365,
1989.
[16] K. Reinhardt. Sorting in-place with a worst case complexity of n lg n − 1.3n + O(lg n) comparisons and εn lg n + O(1) transports. In T. Ibaraki et al., editors, ISAAC, volume 650 of LNCS, pages 489–498. Springer, 1992.
[17] X.-D. Wang and Y.-J. Wu. An Improved HEAPSORT Algorithm with n lg n − 0.788928n Comparisons in the Worst Case. J. of Comput. Sci. and Techn., 22:898–903, 2007.
[18] I. Wegener. The Worst Case Complexity of McDiarmid and Reed's Variant of Bottom-Up-Heap Sort is Less Than n lg n + 1.1n. In C. Choffrut and M. Jantzen, editors, STACS, volume 480 of LNCS, pages 137–147. Springer, 1991.
[19] I. Wegener. BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT, beating, on an average, QUICKSORT (if n is not very small). Theor. Comp. Sci., 118(1):81–98, 1993.
APPENDIX
A Proofs
Proof of Lem. 3.4. Since the right derivative F′ is monotonically increasing, we have

    F(x + δ) − F(x) = ∫_x^{x+δ} F′(t) dt ≥ F′(x)·δ = (⌊lg x⌋ + 2)·δ

and

    F(y) − F(y − δ) = ∫_{y−δ}^{y} F′(t) dt ≤ F′(y)·δ = (⌊lg y⌋ + 2)·δ.

This yields:

    F(y) − F(y − δ) ≤ (⌊lg y⌋ + 2)·δ ≤ (⌊lg x⌋ + 2)·δ ≤ F(x + δ) − F(x).

By adding F(x) + F(y − δ) on both sides we obtain the first claim of Lem. 3.4. Note that lim_{ε→0} F(ε) = 0. Hence the second claim follows from the first by considering the limit δ → y.
Proof of Lem. 3.9. First note that the probability for choosing the k-th element as pivot satisfies

    C(n, 2c+1)·Pr[pivot = k] = C(k−1, c)·C(n−k, c),

where C(a, b) denotes the binomial coefficient. We use the notation of the falling factorial a^ℓ = a·(a−1)···(a−ℓ+1); thus C(a, ℓ) = a^ℓ/ℓ!. Hence

    Pr[pivot = k] = (2c+1)!·(k−1)^c·(n−k)^c / ((c!)²·n^{2c+1})
                  = C(2c, c)·(2c+1)·(1/(n−2c))·∏_{i=0}^{c−1} ((k−1−i)·(n−k−i)) / ((n−2i−1)·(n−2i)).

For k ≤ c we have Pr[pivot = k] = 0. So, let c < k ≤ n/2 − δn and let us consider an index i in the product with 0 ≤ i < c:

    ((k−1−i)·(n−k−i)) / ((n−2i−1)·(n−2i))
        ≤ ((k−i)·(n−k−i)) / ((n−2i)·(n−2i))
        = ((n/2 − i) − (n/2 − k))·((n/2 − i) + (n/2 − k)) / (n−2i)²
        = ((n/2 − i)² − (n/2 − k)²) / (n−2i)²
        ≤ 1/4 − (n/2 − (n/2 − δn))² / n² = 1/4 − δ².

We have C(2c, c) ≤ 4^c. Since 2c + 1 ≤ n/2, we obtain:

    Pr[pivot = k] ≤ 4^c·(2c+1)·(1/(n−2c))·(1/4 − δ²)^c < (2c+1)·(2/n)·α^c.

Now we obtain the desired result:

    Pr[pivot ≤ n/2 − δn] < Σ_{k=0}^{⌊n/2−δn⌋} (2c+1)·(2/n)·α^c ≤ (2c+1)·α^c.
B More Experimental Results
In Table 1 we present actual running times of the different algorithms for n = 1000000 with two different comparison functions (the numbers displayed here are averages over 10 runs with random data). One of them is the normal integer comparison; the other one first applies the logarithm four times to both operands before comparing them. As in [8], this simulates expensive comparisons.

In Table 2 all algorithms are implemented with median of 3 and with median of √n elements as pivot. We compare them with Quicksort implemented with the same pivot selection strategies, Ultimate Heapsort, Bottom-Up-Heapsort and MDR-Heapsort. In Table 2 we also added the values for Relaxed-Weak-Heapsort which were presented in [8].
Table 1: Running times for QuickHeapsort and other algorithms tested on 10⁶ elements, averages over 10 runs.

    Sorting algorithm                        integer data [s]   lg(4)-test-function [s]
    Basic QuickHeapsort, median of 3         0.1154             4.21
    Basic QuickHeapsort, median of √n        0.1171             4.109
    Improved QHS, median of 3                0.1073             4.049
    Improved QHS, median of √n               0.1118             3.911
    QHS with bit-arrays, median of 3         0.1581             3.756
    QHS with bit-arrays, median of √n        0.164              3.7
    Quicksort with median of 3               0.1181             3.946
    Quicksort with median of √n              0.1316             3.648
    Ultimate Heapsort                        0.135              5.109
    Bottom-Up-Heapsort                       0.1677             4.132
    MDR-Heapsort                             0.2596             4.129
We also compare the different pivot selection strategies on the basic QuickHeapsort with no modifications. We test sample sizes of one, three, and approximately lg n, n^{1/4}, √(n/lg n), √n, and n^{3/4} for the pivot selection.

In Table 3 the average numbers of comparisons and the standard deviations are listed. We ran the algorithms on arrays of length 10000 and one million.
Table 2: QuickHeapsort and other algorithms tested on 10⁶ elements (the data for Relaxed-Weak-Heapsort is taken from [8]).

    Sorting algorithm                              Average number of comparisons for n = 10⁶
    Basic QuickHeapsort with median of 3           21327478
    Basic QuickHeapsort with median of √n          20783631
    Improved QuickHeapsort, median of 3            20639046
    Improved QuickHeapsort, median of √n           20135688
    QuickHeapsort with bit-arrays, median of 3     19207289
    QuickHeapsort with bit-arrays, median of √n    18690841   (best result)
    Quicksort with median of 3                     21491310
    Quicksort with median of √n                    19548149
    Bottom-Up-Heapsort                             20294866
    MDR-Heapsort                                   20001084
    Relaxed-Weak-Heapsort                          18951425
    Lower bound: lg(10⁶!)                          18488884
The displayed data are the averages resp. standard deviations of 100 runs of QuickHeapsort with the respective pivot selection strategy.

These results are not very surprising: the larger the samples get, the smaller the standard deviation becomes. The average number of comparisons reaches its minimum with a sample size of approximately √n elements. One notices that the difference in the average number of comparisons is relatively small, especially between the different pivot selection strategies with non-constant sample sizes. This confirms experimentally that the o(n)-terms in Thm. 3.1 and Thm. 3.8 are not too big.
Table 3: Different strategies for pivot selection for basic QuickHeapsort tested on 10⁴ and 10⁶ elements. The standard deviation of our experiments is given in percent of the average number of comparisons.

                     n = 10⁴                          n = 10⁶
    Sample size      Avg. comparisons   Std. dev.    Avg. comparisons   Std. dev.
    1                152573             4.281 %      21975912           3.452 %
    3                146485             2.169 %      21327478           1.494 %
    lg n             143669             0.954 %      20945889           0.525 %
    n^{1/4}          143620             0.857 %      20880430           0.352 %
    √(n/lg n)        142634             0.413 %      20795986           0.315 %
    √n               142642             0.305 %      20783631           0.281 %
    n^{3/4}          147134             0.195 %      20914822           0.168 %
C Some Words about the Worst Case Running Time
Obviously the worst case running time depends on how the pivot element is chosen. If just one random element is used as pivot, we get the same quadratic worst case running time as for Quicksort. However, the probability that QuickHeapsort runs into such a "bad case" is not higher than for Quicksort, since any choice of pivot elements leading to a worst case scenario in QuickHeapsort also yields the worst case for Quicksort.

If we choose the pivot element as median of approximately 2 lg n elements, we get a worst case running time of O(n²/lg n), i.e. for the worst case it makes almost no difference whether the pivot is selected as median of 2 lg n elements or just as one random element.
However, if we use approximately n/lg n elements as sample for the pivot selection, we can get a better bound on the worst case.

Let f: N → N≥1 be some monotonically growing function with f ∈ o(n) (e.g. f(n) = lg n). We can apply the ideas of the Median of Medians algorithm [1]: first we choose n/f(n) random elements, then we group them into groups of five elements each. The median of each group can be determined with six comparisons [13, p. 215]. Now, the median of these medians can be computed using Quickselect. We assume that Quickselect is implemented with the same strategy for pivot selection. That means we get the same recurrence relation for the worst case complexity of the partitioning-phases in QuickHeapsort and for the worst case of Quickselect:

    T(n) = n + 6n/(5f(n)) + T(n/(5f(n))) + T(n − 3n/(10f(n))).

This yields T(n) ≤ c·n·f(n) for some c large enough. Hence, with this pivot selection strategy we reach a worst case running time for QuickHeapsort of n lg n + O(n·f(n)) and – if f ∈ ω(1) – the average running time as stated in Sect. 3.
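
A sketch of this sample-then-median-of-medians pivot selection (our code; for simplicity, group medians and the final median are obtained by sorting rather than by the six-comparison scheme of [13] and Quickselect):

    import random

    def pivot_mom(A, f):
        """Choose n/f(n) random elements, group them into fives, and return
        the median of the group medians."""
        n = len(A)
        k = min(n, max(5, n // f(n)))
        s = random.sample(A, k)
        medians = [sorted(s[i:i + 5])[len(s[i:i + 5]) // 2]
                   for i in range(0, len(s), 5)]
        return sorted(medians)[len(medians) // 2]

    # example with f(n) = lg n (rounded), as suggested in the text:
    # pivot = pivot_mom(data, lambda n: max(1, n.bit_length()))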
Driving this strategy to the end and choosing f(n) = 1 leads to Ultimate Heapsort (or rather a slight modification of it – and Quickselect turns into the Median of Medians algorithm). Then we have T(n) = n lg n + O(n) for the worst case of QuickHeapsort. However, our bound for the average case does not hold anymore.

In order to obtain an n lg n + O(n) bound for the worst case without losing our bound for the average case, we can apply a simple trick: whenever after the partitioning it turns out that the pivot does not lie in the interval {n/4, ..., 3n/4}, we switch to Ultimate Heapsort. This immediately yields the worst case bound of n lg n + O(n). Moreover, the proof of Thm. 3.8 can easily be changed in order to deal with this modification: let C·n be the worst case number of comparisons for pivot selection and partitioning in Ultimate Heapsort. We can change Eq. (7) to

    Pr[pivot < µ − δn] ≤ ε/(8C).

Then the rest of the proof is exactly the same. Hence, Thm. 3.8 and Thm. 3.1 are also valid when switching to Ultimate Heapsort in the case of a "bad" choice of the pivot.
D Pseudocode of Basic QuickHeapsort
Algorithm 4.1

    procedure QuickHeapsort(A[1..n])
    begin
        if n > 1 then
            p := ChoosePivot;
            k := PartitionReverse(A[1..n], p);
            if k ≤ n/2 then
                TwoLayerMaxHeap(A[1..n], k − 1);      (heap-area: {1..k−1})
                swap(A[k], A[n − k + 1]);
                QuickHeapsort(A[1..n − k]);           (recursion)
            else
                TwoLayerMinHeap(A[1..n], n − k);      (heap-area: {k+1..n})
                swap(A[k], A[n − k + 1]);
                QuickHeapsort(A[(n − k + 2)..n]);     (recursion)
            endif
        endif
    endprocedure

The ChoosePivot function returns an element p of the array chosen as pivot. The PartitionReverse function returns an index k and rearranges the array A so that p = A[k], A[i] ≥ A[k] for i < k, and A[i] ≤ A[k] for i > k, using n − 1 comparisons.
Algorithm 4.2

    function SpecialLeaf(A[1..m]):
    begin
        i := 1;
        while 2i ≤ m do                               (i.e. while i is not a leaf)
            if 2i + 1 ≤ m and A[2i + 1] > A[2i] then
                A[i] := A[2i + 1];
                i := 2i + 1;
            else
                A[i] := A[2i];
                i := 2i;
            endif
        endwhile
        return i;
    endfunction
Algorithm 4.3

    procedure TwoLayerMaxHeap(A[1..n], m)
    begin
        ConstructHeap(A[1..m]);
        for i := 1 to m do
            temp := A[n − i + 1];
            A[n − i + 1] := A[1];
            j := SpecialLeaf(A[1..m]);
            A[j] := temp;
        endfor
    endprocedure

The procedure TwoLayerMinHeap is symmetric to TwoLayerMaxHeap, so we do not present its pseudocode here.
20
... UltimateHeapsort is inferior to QuickHeapsort in terms of the average case number of comparisons, although, unlike QuickHeapsort, it allows an n lg n + O(n) bound for the worst case number of comparisons. Diekert and Weiß [4] analyzed QuickHeapsort more thoroughly and described some improvements requiring less than n lg n − 0.99n + o(n) comparisons on average (choosing the pivot as median of √ n elements). However, both the original analysis of Cantone and Cincotti and the improved analysis could not give tight bounds for the average case of median-of-k QuickMergesort. ...
... We consider both the case where k is a fixed constant and where k = k(n) is an increasing function of the (sub)problem size. Previous results in [4,35] for Quicksort suggest that sample sizes k(n) = Θ( √ n) are likely to be optimal asymptotically, but most of the relative savings for the expected case are already realized for k ≤ 10. It is quite natural to expect similar behavior in QuickXsort, and it will be one goal of this article to precisely quantify these statements. ...
... We use a symmetric variant (with a min-oriented heap) if the left segment shall be sorted by X. For detailed code for the above procedure, we refer to [3] or [4]. ...
Preprint
Full-text available
QuickXsort is a highly efficient in-place sequential sorting scheme that mixes Hoare's Quicksort algorithm with X, where X can be chosen from a wider range of other known sorting algorithms, like Heapsort, Insertionsort and Mergesort. Its major advantage is that QuickXsort can be in-place even if X is not. In this work we provide general transfer theorems expressing the number of comparisons of QuickXsort in terms of the number of comparisons of X. More specifically, if pivots are chosen as medians of (not too fast) growing size samples, the average number of comparisons of QuickXsort and X differ only by $o(n)$-terms. For median-of-$k$ pivot selection for some constant $k$, the difference is a linear term whose coefficient we compute precisely. For instance, median-of-three QuickMergesort uses at most $n \lg n - 0.8358n + O(\log n)$ comparisons. Furthermore, we examine the possibility of sorting base cases with some other algorithm using even less comparisons. By doing so the average-case number of comparisons can be reduced down to $n \lg n- 1.4106n + o(n)$ for a remaining gap of only $0.0321n$ comparisons to the known lower bound (while using only $O(\log n)$ additional space and $O(n \log n)$ time overall). Implementations of these sorting strategies show that the algorithms challenge well-established library implementations like Musser's Introsort.
... Based on QuickHeapsort [5,7], Edelkamp and Weiß [9] developed the concept of QuickXsort and applied it to X = WeakHeapsort [8] and X = Mergesort. The idea -going back to UltimateHeapsort [17] -is very simple: as in Quicksort the array is partitioned into the elements greater and less than some pivot element, respectively. ...
... Then, QuickXsort with a Median-of-√ n pivot selection also needs at most n log n + cn + o(n) comparisons on average [9]. Sample sizes of approximately √ n are likely to be optimal [7,22]. ...
... state-of-the-art library implementations in C++ and Java on basic data types is surprisingly high. For example, all Heapsort variants we are aware of fail this test, we checked refined implementations of Binary Heapsort [12,28], Bottom-Up Heapsort [26], MDR Heapsort [25], QuickHeapsort [7], and Weak-Heapsort [8]. Some of these algorithm even use extra space. ...
Preprint
Full-text available
We consider the fundamental problem of internally sorting a sequence of $n$ elements. In its best theoretical setting QuickMergesort, a combination Quicksort with Mergesort with a Median-of-$\sqrt{n}$ pivot selection, requires at most $n \log n - 1.3999n + o(n)$ element comparisons on the average. The questions addressed in this paper is how to make this algorithm practical. As refined pivot selection usually adds much overhead, we show that the Median-of-3 pivot selection of QuickMergesort leads to at most $n \log n - 0{.}75n + o(n)$ element comparisons on average, while running fast on elementary data. The experiments show that QuickMergesort outperforms state-of-the-art library implementations, including C++'s Introsort and Java's Dual-Pivot Quicksort. Further trade-offs between a low running time and a low number of comparisons are studied. Moreover, we describe a practically efficient version with $n \log n + O(n)$ comparisons in the worst case.
... The idea to combine Quicksort and a secondary sorting method was suggested by Contone and Cincotti [2,1]. They study Heapsort with an output buffer (external Heapsort), 3 and combine it with Quicksort to QuickHeapsort. They analyze the average costs for external Heapsort in isolation and use a differencing trick for dealing with the QuickXSort recurrence; however, this technique is hard to generalize to median-of-k pivots. ...
... Diekert and Weiß [3] suggest optimizations for QuickHeapsort (some of which need extra space again), and they give better upper bounds for QuickHeapsort with random pivots and median-of-3. Their results are still not tight since they upper bound the total cost of all Heapsort calls together (using ad hoc arguments on the form of the costs for one Heapsort round), without taking the actual subproblem sizes into account that Heapsort is used on. ...
... In this case the behavior coincides with the simpler strategy to always sort the smaller segment by Mergesort since the segments are of almost equal size with high probability. 3 Not having to store the heap in a consecutive prefix of the array allows to save comparisons over classic in-place Heapsort: After an delete-max operation, we can fill the gap at the root of the heap by promoting the largest child and recursively moving the gap down the heap. (We then fill the gap with a −∞ sentinel value). ...
Article
Full-text available
QuickXSort is a strategy to combine Quicksort with another sorting method X, so that the result has essentially the same comparison cost as X in isolation, but sorts in place even when X requires a linear-size buffer. We solve the recurrence for QuickXSort precisely up to the linear term including the optimization to choose pivots from a sample of k elements. This allows to immediately obtain overall average costs using only the average costs of sorting method X (as if run in isolation). We thereby extend and greatly simplify the analysis of QuickHeapsort and QuickMergesort with practically efficient pivot selection, and give the first tight upper bounds including the linear term for such methods.
... UltimateHeapsort is inferior to QuickHeapsort in terms of average case number of comparisons, although, unlike QuickHeapsort, it allows an n log n + O(n) bound for the worst case number of comparisons. Diekert and Weiß [3] analyzed QuickHeapsort more thoroughly and described some improvements requiring less than n log n − 0.99n + o(n) comparisons on average. Edelkamp and Stiegeler [5] applied the idea of QuickXsort to WeakHeapsort (which was first described by Dutton [4]) introducing QuickWeakHeapsort. ...
... For the rest of the paper, we assume that the pivot is selected as the median of approximately √ n randomly chosen elements. Sample sizes of approximately √ n are likely to be optimal as the results in [3,13] suggest. ...
... First, we compare the different algorithms we use as base cases, i. e., MergeInsertion, its improved variant, and Insertionsort. The results can be seen in Fig. 4. Depending on the size of the arrays the displayed numbers are averages over 10-10000 runs 3 . The data elements we sorted were randomly chosen 32-bit integers. ...
Conference Paper
Full-text available
In this paper we generalize the idea of QuickHeapsort leading to the notion of QuickXsort. Given some external sorting algorithm X, QuickXsort yields an internal sorting algorithm if X satisfies certain natural conditions. We show that up to o(n) terms the average number of comparisons incurred by QuickXsort is equal to the average number of comparisons of X. We also describe a new variant of WeakHeapsort. With QuickWeakHeapsort and QuickMergesort we present two examples for the QuickXsort construction. Both are efficient algorithms that perform approximately n logn − 1.26n + o(n) comparisons on average. Moreover, we show that this bound also holds for a slight modification which guarantees an \(n \log n + \mathcal{O}(n)\) bound for the worst case number of comparisons. Finally, we describe an implementation of MergeInsertion and analyze its average case behavior. Taking MergeInsertion as a base case for QuickMergesort, we establish an efficient internal sorting algorithm calling for at most n logn − 1.3999n + o(n) comparisons on average. QuickMergesort with constant size base cases shows the best performance on practical inputs and is competitive to STL-Introsort.
... Ulti-mateHeapsort is inferior to QuickHeapsort in terms of average case running time, although, unlike QuickHeapsort, it allows an n log n + O(n) bound for the worst case number of comparisons. Diekert and Weiß [2] analyzed QuickHeapsort more thoroughly and showed that it needs less than n log n − 0.99n + o(n) comparisons in the average case when implemented with approximately √ n elements as sample for pivot selection and some other improvements. Edelkamp and Stiegeler [4] applied the idea of QuickXsort to WeakHeapsort (which was first [3,5] O(n/w) O(n log n) 0.09 - Abbreviations: # in this paper, MI MergeInsertion, -not analyzed, * for n = 2 k , w: computer word width in bits; we assume log n ∈ O(n/w). ...
... For the rest of the paper, we assume that the pivot is selected as the median of approximately √ n randomly chosen elements. Sample sizes of approximately √ n are likely to be optimal as the results in [2,11] suggest. ...
... For the normal QuickMergesort we used base cases of size ≤ 9. We also implemented QuickMergesort with median of three for pivot selection, which turns out to be practically efficient, although it needs slightly more comparisons than QuickMergesort with median of √ n. However, since also the larger half of the partitioned array can be sorted with Mergesort, the difference to the median of √ n version is not as big as in QuickHeapsort [2]. As suggested by the theory, we see that our improved QuickMergesort implementation with growing size base cases MergeInsertion yields a result for the constant in the linear term that is in the range of [−1.41, −1.40] -close to the lower bound. ...
Preprint
Full-text available
In this paper we generalize the idea of QuickHeapsort leading to the notion of QuickXsort. Given some external sorting algorithm X, QuickXsort yields an internal sorting algorithm if X satisfies certain natural conditions. With QuickWeakHeapsort and QuickMergesort we present two examples for the QuickXsort-construction. Both are efficient algorithms that incur approximately n log n - 1.26n +o(n) comparisons on the average. A worst case of n log n + O(n) comparisons can be achieved without significantly affecting the average case. Furthermore, we describe an implementation of MergeInsertion for small n. Taking MergeInsertion as a base case for QuickMergesort, we establish a worst-case efficient sorting algorithm calling for n log n - 1.3999n + o(n) comparisons on average. QuickMergesort with constant size base cases shows the best performance on practical inputs: when sorting integers it is slower by only 15% to STL-Introsort.
... The sorting problem has been studied thoroughly, and many research papers have focused on designing fast and optimal algorithms [9][10][11][12][13][14][15][16]. Also, some studies have focused on implementing these algorithms to obtain an efficient sorting algorithm on different platforms [15][16][17]. ...
Article
Sorting an array of n elements represents one of the leading problems in different fields of computer science such as databases, graphs, computational geometry, and bioinformatics. A large number of sorting algorithms have been proposed based on different strategies. Recently, a sequential algorithm called the double hashing sort (DHS) algorithm has been shown to outperform the quicksort algorithm by 10-25%. In this paper, we study this technique from the standpoints of complexity analysis and practical performance. We propose a new complexity analysis for the DHS algorithm based on the relation between the size of the input and the domain of the input elements. Our results reveal that the previous complexity analysis was not accurate. We also show experimentally that the counting sort algorithm performs significantly better than the DHS algorithm. Our experimental studies are based on six benchmarks; the improvement was roughly 46% on average over all cases studied.
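As a reminder of why counting sort wins whenever the key domain is small, here is the textbook version for integer keys in [0, m) (a generic sketch, not the benchmark code of the cited study):

    #include <cstddef>
    #include <vector>

    // Counting sort for keys in [0, m): O(n + m) time, no comparisons at all.
    std::vector<int> counting_sort(const std::vector<int>& a, int m) {
        std::vector<std::size_t> count(m, 0);
        for (int x : a) ++count[x];               // histogram of key values
        std::vector<int> out;
        out.reserve(a.size());
        for (int v = 0; v < m; ++v)
            out.insert(out.end(), count[v], v);   // emit key v count[v] times
        return out;
    }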
... Other examples for QuickXsort are QuickHeapsort [5,9], QuickWeakHeapsort [10,11], and Ultimate Heapsort [21]. QuickXsort with median-of-√n pivot selection uses at most n log n + cn + o(n) comparisons on average to sort n elements, given that X also uses at most n log n + cn + o(n) comparisons on average [11]. ...
Preprint
The two most prominent solutions for the sorting problem are Quicksort and Mergesort. While Quicksort is very fast on average, Mergesort additionally gives worst-case guarantees, but needs extra space for a linear number of elements. Worst-case efficient in-place sorting, however, remains a challenge: the standard solution, Heapsort, suffers from a bad cache behavior and is also not overly fast for in-cache instances. In this work we present median-of-medians QuickMergesort (MoMQuickMergesort), a new variant of QuickMergesort, which combines Quicksort with Mergesort, allowing the latter to be implemented in place. Our new variant applies the median-of-medians algorithm for selecting pivots in order to circumvent the quadratic worst case. Indeed, we show that it uses at most n log n + 1.6n comparisons for n large enough. We experimentally confirm the theoretical estimates and show that the new algorithm outperforms Heapsort by far and is only around 10% slower than Introsort (the std::sort implementation of libstdc++), which has a rather poor guarantee for the worst case. We also simulate the worst case, which is only around 10% slower than the average case. In particular, the new algorithm is a natural candidate to replace Heapsort as a worst-case stopper in Introsort.
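The pivot-selection step the abstract refers to is the classical median-of-medians routine over groups of five. A compact, non-in-place sketch (our code, not the paper's in-place implementation):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Median of medians over groups of 5: linear time, and the returned
    // value has a constant fraction of elements on either side, which
    // rules out Quicksort's quadratic worst case.
    // Precondition: !a.empty().
    int median_of_medians(std::vector<int> a) {
        while (a.size() > 5) {
            std::vector<int> medians;
            for (std::size_t i = 0; i < a.size(); i += 5) {
                std::size_t j = std::min(a.size(), i + 5);
                std::sort(a.begin() + i, a.begin() + j);   // sort one group
                medians.push_back(a[i + (j - i) / 2]);     // keep its median
            }
            a.swap(medians);                               // recurse on medians
        }
        std::sort(a.begin(), a.end());
        return a[a.size() / 2];
    }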
... Recently, in [16], the idea of QuickHeapsort [2,5] was generalized to the notion of QuickXsort: Given some black-box sorting algorithm X, QuickXsort can be used to speed X up provided that X satisfies certain natural conditions. QuickWeakHeapsort and QuickMergesort were described as two examples of this construction. ...
Conference Paper
A weak heap is a variant of a binary heap where, for each node, the heap ordering is enforced only for one of its two children. In 1993, Dutton showed that this data structure yields a simple worst-case-efficient sorting algorithm. In this paper we review the refinements proposed to the basic data structure that improve the efficiency even further. Ultimately, minimum and insert operations are supported in O(1) worst-case time and extract-min operation in O(lg n) worst-case time involving at most lg n + O(1) element comparisons. In addition, we look at several applications of weak heaps. This encompasses the creation of a sorting index and the use of a weak heap as a tournament tree leading to a sorting algorithm that is close to optimal in terms of the number of element comparisons performed. By supporting insert operation in O(1) amortized time, the weak-heap data structure becomes a valuable tool in adaptive sorting leading to an algorithm that is constant-factor optimal with respect to several measures of disorder. Also, a weak heap can be used as an intermediate step in an efficient construction of binary heaps. For graph search and network optimization, a weak-heap variant, which allows some of the nodes to violate the weak-heap ordering, is known to be provably better than a Fibonacci heap.
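The weak-heap primitive behind these comparison counts is usually called join: it establishes the (min-)heap ordering between a node and its distinguished ancestor at the cost of exactly one element comparison. A minimal sketch, assuming the array-plus-reverse-bits representation used in the weak-heap literature:

    #include <utility>
    #include <vector>

    // join: merge node j and its distinguished ancestor i into one weak heap.
    // r holds the "reverse bits"; flipping r[j] exchanges the roles of the
    // two subtrees of j, so j's old subtree stays below the moved element.
    // Returns true iff the weak-heap ordering already held.
    bool join(std::vector<int>& a, std::vector<bool>& r, int i, int j) {
        if (a[j] < a[i]) {            // the single element comparison
            std::swap(a[i], a[j]);
            r[j] = !r[j];
            return false;
        }
        return true;
    }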
Conference Paper
We show how to build a binary heap in place in linear time by performing ~1.625n element comparisons, at most ~2.125n element moves, and ~n/B cache misses, where n is the size of the input array, B the capacity of the cache line, and ~f(n) approaches f(n) as n grows. The same bound for element comparisons was derived and conjectured to be optimal by Gonnet and Munro; however, their procedure requires Θ(n) pointers and does not have optimal cache behaviour. Our main idea is to mimic the Gonnet-Munro algorithm by converting a navigation pile into a binary heap. To construct a binary heap in place, we use this algorithm to build bottom heaps of size Θ(lg n) and adjust the heap order at the upper levels using Floyd's sift-down procedure. On another frontier, we compare different heap-construction alternatives in practice.
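For contrast, the standard bottom-up construction with Floyd's sift-down, which the paper reuses at the upper levels, looks as follows (textbook max-heap sketch):

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Floyd's sift-down: restore the max-heap order below position i.
    void sift_down(std::vector<int>& a, std::size_t i, std::size_t n) {
        while (2 * i + 1 < n) {
            std::size_t c = 2 * i + 1;                 // left child
            if (c + 1 < n && a[c + 1] > a[c]) ++c;     // pick larger child
            if (a[i] >= a[c]) return;                  // heap order holds
            std::swap(a[i], a[c]);
            i = c;
        }
    }

    // Bottom-up heap construction: linear time, at most ~2n comparisons.
    void build_heap(std::vector<int>& a) {
        for (std::size_t i = a.size() / 2; i-- > 0; )
            sift_down(a, i, a.size());
    }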
Conference Paper
First we present a new variant of Mergesort, which needs only 1.25n space, because it reuses space that becomes available within the current stage. It does not need more comparisons than classical Mergesort. The main result is an easy-to-implement method of iterating the procedure in place, starting by sorting 4/5 of the elements. Hereby we can keep the additional transport costs linear, and only very few comparisons are lost, so that n log n − 0.8n comparisons are needed. We show that we can improve the number of comparisons if we sort blocks of constant length with Merge-Insertion before starting the algorithm. Another improvement is to start the iteration with a better version, which needs only (1+ε)n space and again additional O(n) transports. The result is that we can improve this theoretically up to n log n − 1.3289n comparisons in the worst case. This is close to the theoretical lower bound of n log n − 1.443n. The total number of transports in all these versions can be reduced to εn log n + O(1) for any ε > 0.
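The space reuse rests on merging with element swaps instead of moves, so the buffer's contents survive as a permutation. A minimal sketch of such a merge (our code; the paper's 1.25n-space scheme organizes the stages more carefully):

    #include <algorithm>

    // Merge the sorted runs [l, m) and [m, r) into [l, r), using buf
    // (of size at least m - l) as swap space. Only swaps are performed,
    // so the elements initially stored in the buffer are preserved.
    void merge_with_buffer(int* l, int* m, int* r, int* buf) {
        int* e = std::swap_ranges(l, m, buf);  // park the left run in buf
        int* i = buf;                          // front of parked left run
        int* j = m;                            // front of right run
        int* o = l;                            // next output slot
        while (i != e && j != r)
            std::iter_swap(o++, *j < *i ? j++ : i++);
        while (i != e)
            std::iter_swap(o++, i++);          // flush rest of the left run
        // any remainder of the right run is already in place
    }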
Conference Paper
The speaker will describe his adventures during the past eight months when he has been intensively studying as many aspects of SAT solvers as possible, in the course of preparing about 100 pages of new material for Volume 4B of The Art of Computer Programming.
Article
With refinements to the WEAK-HEAPSORT algorithm we establish the general and practically relevant sequential sorting algorithm INDEX-WEAK-HEAPSORT with exactly n⌈log n⌉ − 2^⌈log n⌉ + 1 ≤ n log n − 0.9n comparisons and at most n log n + 0.1n transpositions on any given input. It comprises an integer array of size n and is best used to generate an index for the data set. With RELAXED-WEAK-HEAPSORT and GREEDY-WEAK-HEAPSORT we discuss modifications for a smaller set of pending element transpositions. If extra space to create an index is not available, we propose QUICK-WEAK-HEAPSORT, an efficient QUICKSORT variant with n log n + 0.2n + o(n) comparisons on average. Furthermore, we present data showing that WEAK-HEAPSORT, INDEX-WEAK-HEAPSORT and QUICK-WEAK-HEAPSORT compete with other performant QUICKSORT and HEAPSORT variants.
Conference Paper
In this paper, we show how to improve the complexity of heap operations and heapsort using extra bits. We first study the parallel complexity of implementing priority queue operations on a heap. While the insertion of a new element into a heap can be done as fast as parallel searching, we show how to delete the smallest element from a heap in constant time with a sublinear number of processors, and in sublogarithmic time with a sublogarithmic number of processors. The models of parallel computation used are the CREW PRAM and the CRCW PRAM. Our results improve those of previously known algorithms. Moreover, we study a variant, called the fine-heap, of the traditional heap structure. A fast algorithm for constructing this new data structure is designed, which is also used to develop an improved heapsort algorithm. Our variation of heapsort is faster than McDiarmid and Reed's variant of heapsort and requires less extra space.
Article
A new variant of HEAPSORT is presented in this paper. The algorithm is not an internal sorting algorithm in the strong sense, since extra storage for n integers is necessary. The basic idea is similar to that of the classical sorting algorithm HEAPSORT, but the new algorithm rebuilds the heap in another way, using only one comparison at each node: the sift-down walks down a path in the heap until a leaf is reached, relaxing the requirement that the element placed in the root move immediately to its destination. The new algorithm requires about n log n − 0.788928n comparisons in the worst case and n log n − n comparisons on average, which is only about 0.4n more than necessary. It beats on average even the clever variants of QUICKSORT, if n is not very small. The difference between the worst case and the best case indicates that there is still room for improvement of the new algorithm by constructing the heap more carefully.
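The one-comparison-per-level idea is the same as in Bottom-Up-Heapsort: descend along the larger child to a leaf paying one comparison per level, then climb back up to find where the root element actually belongs. A max-heap sketch of this generic technique (our reconstruction, not this paper's exact variant):

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Leaf search: one comparison per level along the "special path".
    static std::size_t leaf_search(const std::vector<int>& a, std::size_t n) {
        std::size_t j = 0;
        while (2 * j + 2 < n)                       // both children present
            j = (a[2 * j + 1] > a[2 * j + 2]) ? 2 * j + 1 : 2 * j + 2;
        if (2 * j + 1 < n) j = 2 * j + 1;           // lone left child
        return j;
    }

    // Sift down a[0] by walking to a leaf first, then climbing back up;
    // on average the climb adds only O(1) extra comparisons.
    void bottom_up_sift_down(std::vector<int>& a, std::size_t n) {
        int x = a[0];
        std::size_t j = leaf_search(a, n);
        while (a[j] < x) j = (j - 1) / 2;           // climb to x's position
        while (j > 0) {                             // rotate the path upward
            std::swap(x, a[j]);
            j = (j - 1) / 2;
        }
        a[0] = x;
    }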