
QuickHeapsort: Modifications and Improved Analysis


Abstract and Figures

QuickHeapsort is a combination of Quicksort and Heapsort. We show that the expected number of comparisons for QuickHeapsort is always better than for Quicksort if a usual median-of-constant strategy is used for choosing pivot elements. In order to obtain the result we present a new analysis for QuickHeapsort, splitting it into the analysis of the partition-phases and the analysis of the heap-phases. This enables us to consider samples of non-constant size for the pivot selection and leads to better theoretical bounds for the algorithm. Furthermore, we introduce some modifications of QuickHeapsort. We show that for every input the expected number of comparisons is at most $n\log_2 n - 0.03n + o(n)$ for the in-place variant. If we allow $n$ extra bits, then we can lower the bound to $n\log_2 n - 0.997n + o(n)$. Thus, spending $n$ extra bits we can save more than $0.96n$ comparisons if $n$ is large enough. Both estimates improve the previously known results. Moreover, our non-in-place variant uses essentially the same number of comparisons as index-based Heapsort variants and Relaxed-Weak-Heapsort, which use $n\log_2 n - 0.9n + o(n)$ comparisons in the worst case. However, index-based Heapsort variants and Relaxed-Weak-Heapsort require $\Theta(n\log n)$ extra bits whereas we need $n$ bits only. Our theoretical results are upper bounds and valid for every input. Our computer experiments show that the gap between our bounds and the actual values on random inputs is small. Moreover, the computer experiments establish QuickHeapsort as competitive with Quicksort in terms of running time.
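To make the split into partition-phases and heap-phases concrete, the following is a minimal Python sketch of the basic idea, under several assumptions that are not taken from the abstract: a single random pivot (rather than a median of a sample), a three-way partition so that placeholder elements never tie with heap elements, and hypothetical function names `quick_heapsort` and `_external_heapsort`. It is an illustration of the technique, not the authors' implementation, and it includes none of the modifications the paper analyzes.

```python
import random


def _external_heapsort(a, heap_lo, m, targets, before):
    """Sort the m elements in a[heap_lo:heap_lo+m] into the positions `targets`.

    before(x, y) is True if x must leave the heap before y. The elements
    currently stored at `targets` serve as placeholders: they are moved into
    the heap, where they lose every comparison against a real element.
    """
    def sift_down(i, x):
        # Standard sift-down of value x starting at heap index i.
        while 2 * i + 1 < m:
            c = 2 * i + 1
            if c + 1 < m and before(a[heap_lo + c + 1], a[heap_lo + c]):
                c += 1
            if not before(a[heap_lo + c], x):
                break
            a[heap_lo + i] = a[heap_lo + c]
            i = c
        a[heap_lo + i] = x

    for i in range(m // 2 - 1, -1, -1):      # heap construction
        sift_down(i, a[heap_lo + i])

    for t in targets:
        placeholder = a[t]
        a[t] = a[heap_lo]                    # extract the current root
        i = 0
        while 2 * i + 1 < m:                 # push the hole down to a leaf,
            c = 2 * i + 1                    # one comparison per level
            if c + 1 < m and before(a[heap_lo + c + 1], a[heap_lo + c]):
                c += 1
            a[heap_lo + i] = a[heap_lo + c]
            i = c
        a[heap_lo + i] = placeholder         # placeholder fills the leaf


def quick_heapsort(a):
    """Sort list a in place (sketch; random pivot is an assumption)."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        pivot = a[random.randint(lo, hi)]
        # Partition phase: a[lo:lt] < pivot, a[lt:gt+1] == pivot, a[gt+1:hi+1] > pivot
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1
        n_small, n_large = lt - lo, hi - gt
        # Heap phase: sort the smaller side, using the other side as buffer.
        if n_large <= n_small:
            m = n_large
            for k in range(m):               # move large elements to the front
                a[lo + k], a[gt + 1 + k] = a[gt + 1 + k], a[lo + k]
            _external_heapsort(a, lo, m, range(hi, gt, -1), lambda x, y: x > y)
            hi = lt - 1                      # continue with the small elements
        else:
            m = n_small
            for k in range(m):               # move small elements to the back
                a[lo + k], a[hi - m + 1 + k] = a[hi - m + 1 + k], a[lo + k]
            _external_heapsort(a, hi - m + 1, m, range(lo, lt), lambda x, y: x < y)
            lo = gt + 1                      # continue with the large elements
```

In this sketch the extraction loop spends one comparison per heap level, because the hole left by the root is pushed all the way down and then filled by a placeholder. Comparisons that involve placeholders are still paid for here; avoiding them is the kind of saving for which extra bits could be used.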
Theory Comput Syst (2016) 59:209–230
DOI 10.1007/s00224-015-9656-y
Volker Diekert¹ · Armin Weiß¹
Published online: 15 September 2015
Keywords: In-place sorting · Heapsort · Quicksort · Analysis of algorithms
Armin Weiß
armin.weiss@fmi.uni-stuttgart.de
Volker Diekert
diekert@fmi.uni-stuttgart.de
¹ FMI, Universität Stuttgart, Universitätsstr. 38, D-70569 Stuttgart, Germany
... UltimateHeapsort is inferior to QuickHeapsort in terms of the average case number of comparisons, although, unlike QuickHeapsort, it allows an n lg n + O(n) bound for the worst case number of comparisons. Diekert and Weiß [4] analyzed QuickHeapsort more thoroughly and described some improvements requiring less than n lg n − 0.99n + o(n) comparisons on average (choosing the pivot as median of √n elements). However, both the original analysis of Cantone and Cincotti and the improved analysis could not give tight bounds for the average case of median-of-k QuickHeapsort. ...
... 6 without proofs). In [52], the third author analyzed QuickMergesort with constant-size pivot sampling (see Sect. 4 ...
... 2. Under reasonable assumptions, sample sizes of Θ(√n) are optimal among all polynomial size sample sizes. 3. The probability that median-of-√n QuickXsort needs more than x_wc(n) + 6n comparisons decreases exponentially in ∜n (Proposition 4.5). Here, x_wc(n) is the worst-case cost of X. 4. We introduce median-of-medians fallback pivot selection (a trick similar to Introsort [39]) which guarantees n lg n + O(n) comparisons in the worst case while altering the average case only by o(n)-terms (Theorem 4.7). 5. Let k be a fixed constant and let X be a sorting method that needs a buffer of αn elements for some constant α ∈ [0, 1] to sort n elements and requires on average x(n) = n lg n + bn ± o(n) comparisons to do so. ...
Article
Full-text available
QuickXsort is a highly efficient in-place sequential sorting scheme that mixes Hoare’s Quicksort algorithm with X, where X can be chosen from a wider range of other known sorting algorithms, like Heapsort, Insertionsort and Mergesort. Its major advantage is that QuickXsort can be in-place even if X is not. In this work we provide general transfer theorems expressing the number of comparisons of QuickXsort in terms of the number of comparisons of X. More specifically, if pivots are chosen as medians of (not too fast) growing size samples, the average number of comparisons of QuickXsort and X differ only by $o(n)$-terms. For median-of-$k$ pivot selection for some constant $k$, the difference is a linear term whose coefficient we compute precisely. For instance, median-of-three QuickMergesort uses at most $n \lg n - 0.8358n + O(\log n)$ comparisons. Furthermore, we examine the possibility of sorting base cases with some other algorithm using even fewer comparisons. By doing so the average-case number of comparisons can be reduced down to $n \lg n - 1.4112n + o(n)$ for a remaining gap of only $0.0315n$ comparisons to the known lower bound (while using only $O(\log n)$ additional space and $O(n \log n)$ time overall). Implementations of these sorting strategies show that the algorithms challenge well-established library implementations like Musser’s Introsort.
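The phrase "medians of growing size samples" can be illustrated with a small Python sketch. It assumes a sample of roughly √n elements drawn uniformly without replacement, and it finds the median by sorting the sampled indices, which is a simplification: a comparison-efficient implementation would use a selection algorithm instead. The function name `median_of_sqrt_sample` is hypothetical and does not come from the cited work.

```python
import math
import random


def median_of_sqrt_sample(a, lo, hi, rng=random):
    """Return the index of the median of about sqrt(n) sampled elements of a[lo..hi]."""
    n = hi - lo + 1
    k = math.isqrt(n) | 1                       # odd sample size close to sqrt(n)
    sample = rng.sample(range(lo, hi + 1), k)   # k distinct positions
    sample.sort(key=lambda i: a[i])             # simplification: sort the sample
    return sample[k // 2]                       # position of the sample median
```

With a constant sample size the pivot rank fluctuates by a constant fraction of n; letting the sample grow with n concentrates the pivot near the true median, which is the intuition behind the o(n) difference stated above.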
... UltimateHeapsort is inferior to QuickHeapsort in terms of the average case number of comparisons, although, unlike QuickHeapsort, it allows an n lg n + O(n) bound for the worst case number of comparisons. Diekert and Weiß [4] analyzed QuickHeapsort more thoroughly and described some improvements requiring less than n lg n − 0.99n + o(n) comparisons on average (choosing the pivot as median of √n elements). However, both the original analysis of Cantone and Cincotti and the improved analysis could not give tight bounds for the average case of median-of-k QuickMergesort. ...
... We consider both the case where k is a fixed constant and where k = k(n) is an increasing function of the (sub)problem size. Previous results in [4,35] for Quicksort suggest that sample sizes k(n) = Θ(√n) are likely to be optimal asymptotically, but most of the relative savings for the expected case are already realized for k ≤ 10. It is quite natural to expect similar behavior in QuickXsort, and it will be one goal of this article to precisely quantify these statements. ...
... We use a symmetric variant (with a min-oriented heap) if the left segment shall be sorted by X. For detailed code for the above procedure, we refer to [3] or [4]. ...
Preprint
Full-text available
QuickXsort is a highly efficient in-place sequential sorting scheme that mixes Hoare's Quicksort algorithm with X, where X can be chosen from a wider range of other known sorting algorithms, like Heapsort, Insertionsort and Mergesort. Its major advantage is that QuickXsort can be in-place even if X is not. In this work we provide general transfer theorems expressing the number of comparisons of QuickXsort in terms of the number of comparisons of X. More specifically, if pivots are chosen as medians of (not too fast) growing size samples, the average number of comparisons of QuickXsort and X differ only by $o(n)$-terms. For median-of-$k$ pivot selection for some constant $k$, the difference is a linear term whose coefficient we compute precisely. For instance, median-of-three QuickMergesort uses at most $n \lg n - 0.8358n + O(\log n)$ comparisons. Furthermore, we examine the possibility of sorting base cases with some other algorithm using even fewer comparisons. By doing so the average-case number of comparisons can be reduced down to $n \lg n - 1.4106n + o(n)$ for a remaining gap of only $0.0321n$ comparisons to the known lower bound (while using only $O(\log n)$ additional space and $O(n \log n)$ time overall). Implementations of these sorting strategies show that the algorithms challenge well-established library implementations like Musser's Introsort.
Article
Among the comparison-based algorithms, insertionsort is recognized as one of the fastest methods for sorting relatively small data sets or data that is already nearly ordered. However, because its asymptotic running time is poor, it performs very badly in both the worst case and the average case on most large data collections. In this article we offer a new sorting algorithm based on ordered block insertions with worst case optimal time. At the cost of an additional memory space of nk words, for any constant k, our algorithm is able to easily transform it into an in-place algorithm with a time of o(n²) in the worst case. Empirically, our method outperforms insertionsort in all the different cases tested, even for small input collections. Furthermore, our experiments show that, for small or large datasets from either of the two main probability distributions (Uniform and Normal), our algorithms also outperform traditional methods like quicksort, mergesort or heapsort, and even the efficient hybrid algorithm introsort (the std::sort() method provided by the GNU C++ Standard Library).
Conference Paper
Full-text available
In computer science, sorting algorithms are most commonly used as a benchmark of system performance and efficiency. Every sorting algorithm, like every programming language, has certain advantages and disadvantages. To shed light on this fact, this paper examines the execution performance of selected sorting algorithms as used in several modern programming languages/compilers (C#, Java and Python). The sorting algorithms were tested on randomly generated data arrays of various sizes and structures. The results show interesting differences in sorting efficiency between individual implementations with respect to CPU load and execution time. In addition to the empirical approach, an analytical way of determining the running time of the selected algorithms is also discussed.
Article
Full-text available
Sorting an array of n elements represents one of the leading problems in different fields of computer science such as databases, graphs, computational geometry, and bioinformatics. A large number of sorting algorithms have been proposed based on different strategies. Recently, a sequential algorithm, called double hashing sort (DHS) algorithm, has been shown to exceed the quick sort algorithm in performance by 10–25%. In this paper, we study this technique from the standpoints of complexity analysis and the algorithm’s practical performance. We propose a new complexity analysis for the DHS algorithm based on the relation between the size of the input and the domain of the input elements. Our results reveal that the previous complexity analysis was not accurate. We also show experimentally that the counting sort algorithm performs significantly better than the DHS algorithm. Our experimental studies are based on six benchmarks; the percentage of improvement was roughly 46% on the average for all cases studied.
Conference Paper
Full-text available
We show how to build a binary heap in-place in linear time by performing ∼1.625n element comparisons, at most ∼2.125n element moves, and ∼n/B cache misses, where n is the size of the input array, B the capacity of the cache line, and ∼f(n) approaches f(n) as n grows. The same bound for element comparisons was derived and conjectured to be optimal by Gonnet and Munro; however, their procedure requires Θ(n) pointers and does not have optimal cache behaviour. Our main idea is to mimic the Gonnet-Munro algorithm by converting a navigation pile into a binary heap. To construct a binary heap in-place, we use this algorithm to build bottom heaps of size $\Theta(\lg n)$ and adjust the heap order at the upper levels using Floyd's sift-down procedure. On another frontier, we compare different heap-construction alternatives in practice.
Conference Paper
Full-text available
First we present a new variant of Merge-sort, which needs only 1.25n space, because it reuses space that becomes available within the current stage. It does not need more comparisons than classical Merge-sort. The main result is an easy to implement method of iterating the procedure in-place starting to sort 4/5 of the elements. Hereby we can keep the additional transport costs linear and only very few comparisons get lost, so that n log n − 0.8n comparisons are needed. We show that we can improve the number of comparisons if we sort blocks of constant length with Merge-Insertion, before starting the algorithm. Another improvement is to start the iteration with a better version, which needs only (1+ε)n space and again additional O(n) transports. The result is that we can improve this theoretically up to n log n − 1.3289n comparisons in the worst case. This is close to the theoretical lower bound of n log n − 1.443n. The total number of transports in all these versions can be reduced to εn log n + O(1) for any ε > 0.
Article
With refinements to the WEAK-HEAPSORT algorithm we establish the general and practically relevant sequential sorting algorithm INDEX-WEAK-HEAPSORT with exactly n⌈log n⌉ − 2^⌈log n⌉ + 1 ≤ n log n − 0.9n comparisons and at most n log n + 0.1n transpositions on any given input. It comprises an integer array of size n and is best used to generate an index for the data set. With RELAXED-WEAK-HEAPSORT and GREEDY-WEAK-HEAPSORT we discuss modifications for a smaller set of pending element transpositions. If extra space to create an index is not available, with QUICK-WEAK-HEAPSORT we propose an efficient QUICKSORT variant with n log n + 0.2n + o(n) comparisons on the average. Furthermore, we present data showing that WEAK-HEAPSORT, INDEX-WEAK-HEAPSORT and QUICK-WEAK-HEAPSORT compete with other performant QUICKSORT and HEAPSORT variants.
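The relation between the exact count and the linear term can be checked numerically. Reading the exact count as $n\lceil\log n\rceil - 2^{\lceil\log n\rceil} + 1$ and writing $\lceil\log n\rceil = \log n + \delta$ with $0 \le \delta < 1$, the count equals $n\log n + n(\delta - 2^{\delta}) + 1$, and $\delta - 2^{\delta} \le -0.9139$ for every such $\delta$. The sketch below (Python, with hypothetical helper names) evaluates both forms. The abstract does not spell out this derivation, so treat the constants in the comments as a worked check rather than as the authors' statement.

```python
import math


def index_weak_heapsort_comparisons(n):
    """Exact count n*ceil(log2 n) - 2**ceil(log2 n) + 1, for n >= 1."""
    k = (n - 1).bit_length()          # equals ceil(log2 n) for n >= 1
    return n * k - 2 ** k + 1


def linear_coefficient(n):
    """Coefficient c such that the exact count equals n*log2(n) + c*n."""
    return (index_weak_heapsort_comparisons(n) - n * math.log2(n)) / n


# Example: 8 elements need 8*3 - 8 + 1 = 17 comparisons.
count_for_8 = index_weak_heapsort_comparisons(8)
```

Because of the additive 1, the rounded form n log n − 0.9n is guaranteed by this argument only once 0.0139n ≥ 1, that is for n ≥ 72; for very small n (for example n = 3) the exact count is slightly larger than n log n − 0.9n.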
Conference Paper
In this paper, we show how to improve the complexity of heap operations and heapsort using extra bits. We first study parallel complexity of implementing priority queue operations on a heap. While the insertion of a new element into a heap can be done as fast as parallel searching, we show how to delete the smallest element from a heap in constant time with a sublinear number of processors, and in sublogarithmic time with a sublogarithmic number of processors. The models of parallel computation used are the CREW PRAM and the CRCW PRAM. Our results improve those of previously known algorithms. Moreover, we study a variant, fine-heap, of the traditional heap structure. A fast algorithm for constructing this new data structure is designed, which is also used to develop an improved heapsort algorithm. Our variation of heapsort is faster than McDiarmid and Reed's variant of heapsort and requires less extra space.
Article
A new variant of HEAPSORT is presented in this paper. The algorithm is not an internal sorting algorithm in the strong sense, since extra storage for n integers is necessary. The basic idea of the new algorithm is similar to the classical sorting algorithm HEAPSORT, but the algorithm rebuilds the heap in another way: it uses only one comparison at each node. The new shift procedure walks down a path in the heap until a leaf is reached. The requirement of placing the element in the root immediately at its destination is relaxed. The new algorithm requires about n log n − 0.788928n comparisons in the worst case and n log n − n comparisons on the average, which is only about 0.4n more than necessary. It beats on average even the clever variants of QUICKSORT, if n is not very small. The difference between the worst case and the best case indicates that there is still room for improvement of the new algorithm by constructing the heap more carefully.
Article
We present an algorithm to construct a heap which uses on average (α + o(1))n comparisons to build a heap on n elements, where α ≈ 1.52. Indeed on the overwhelming proportion of inputs our algorithm uses this many comparisons. This average complexity is better than that known for any other algorithm. We conjecture that it is optimal. Our method is a natural variant of the standard heap construction method due to Floyd.
Article
A variant of HEAPSORT, called BOTTOM-UP-HEAPSORT, is presented. It is based on a new reheap procedure. This sequential sorting algorithm is easy to implement and beats, on an average, QUICKSORT if n⩾400 and a clever version of QUICKSORT (where the split object is the median of 3 randomly chosen objects) if n⩾16000. The worst-case number of comparisons is bounded by 1.5n log n+O(n). Moreover, the new reheap procedure improves the delete procedure for the heap data structure for all n.
Article
We present an efficient and practical algorithm for the internal sorting problem. Our algorithm works in-place and, on the average, has a running-time of in the size n of the input. More specifically, the algorithm performs comparisons and element moves on the average. An experimental comparison of our proposed algorithm with the most efficient variants of Quicksort and Heapsort is carried out and its results are discussed.