Sorting In-Place with a Worst Case Complexity of
n log n − 1.3n + O(log n) Comparisons and
ε n log n + O(1) Transports

Klaus Reinhardt
Institut für Informatik, Universität Stuttgart
Breitwiesenstr. 22, D-7000 Stuttgart-80, Germany
e-mail: reinhard@informatik.uni-stuttgart.de
Abstract

First we present a new variant of Merge-sort which needs only 1.25n space, because it re-uses the space that becomes available within the current stage. It does not need more comparisons than classical Merge-sort.

The main result is an easy-to-implement method of iterating the procedure in-place, starting by sorting 4/5 of the elements. Hereby we can keep the additional transport costs linear, and only very few comparisons get lost, so that n log n − 0.8n comparisons are needed.

We show that we can improve the number of comparisons if we sort blocks of constant length with Merge-Insertion before starting the algorithm. Another improvement is to start the iteration with a better version which needs only (1+ε)n space and again additional O(n) transports. The result is that we can improve this theoretically up to n log n − 1.3289n comparisons in the worst case. This is close to the theoretical lower bound of n log n − 1.443n.

The total number of transports in all these versions can be reduced to ε n log n + O(1) for any ε > 0.
1 Introduction

Among the well-known sorting algorithms, there appears to be a kind of trade-off between space and time complexity: methods with a small number of comparisons, like Merge-sort, Merge-Insertion or Insertion-sort, either need O(n^2) transports or work with a data structure that needs 2n places (see [Kn72], [Me84]). Although the price of storage is decreasing, it is a desirable property for an algorithm to be in-place, that is, to use only the storage needed for the input (except for a constant amount).
In [HL88] Huang and Langston gave an upper bound of 3n log n for the number of comparisons of their in-place variant of Merge-sort. Heap-sort needs 2n log n comparisons, and the upper bound of 1.5n log n for the number of comparisons in Bottom-up-Heapsort [We90] is tight [Fl91]. Carlsson's variant of Heap-sort [Ca87] needs n log n + Θ(n log log n) comparisons. The first algorithm which is nearly in-place and performs n log n + O(n) comparisons is McDiarmid and Reed's variant of Bottom-up-Heapsort. Wegener showed in [We91] that it needs n log n + 1.1n comparisons, but the algorithm is not in-place, since it needs n additional bits. (This research has been partially supported by the EBRA working group No. 3166 ASMICS.)
For in-place sorting algorithms, we can find a trade-off between the number of comparisons and the number of transports: in [MR91] Munro and Raman describe an algorithm which sorts in-place with only O(n) transports but needs O(n^{1+ε}) comparisons.

How the time complexity is composed of transports and essential comparisons (comparisons of pointers are not counted) depends on the data structure of the elements. If an element is a block of fixed size and the key field is only a small part of it, then a transport can be more expensive than a comparison. On the other hand, if an element is a small set of pointers to a (possibly external) database, then a transport is much cheaper than a comparison. But in any case, an O(n log n)-time algorithm is asymptotically faster than the algorithm in [MR91].
In this paper we describe the first sorting algorithm which fulfills all of the following properties:

- It is general. (Any kind of elements of an ordered set can be sorted.)
- It is in-place. (The storage used besides the input is constant.)
- It has worst-case time complexity O(n log n).
- It performs n log n + O(n) (essential) comparisons in the worst case (i.e., the constant factor of the n log n term is 1).

The negative linear constant in the number of comparisons and the possibility of reducing the number of transports to ε n log n + O(1) for every fixed ε > 0 are further advantages of our algorithm.
Some parameters of the algorithm are left open in our description. One of them depends on the given ε; other parameters influence the linear constant in the number of comparisons. Regarding these parameters, we again find a trade-off between the linear term in the number of comparisons and a linear amount of transports. Although a linear amount of transports does not influence the asymptotic transport cost, it is surely important for a practical application. We will see that by choosing appropriate parameters we can bring the negative linear constant in the number of comparisons as close as we want to 1.328966, but for a practical application we should be satisfied with at most 1. The main contribution of the paper is theoretical, but we expect that a good choice of the parameters leads to an efficient algorithm for practical applications.
2 A Variant of Merge-sort in 1.25n Places

The usual way of merging is to move the elements from one array of length n to another one. We use only one array, but add a gap of length n/4 and move the elements from one side of the gap to the other. In each step, pairs of lists are merged by repeatedly moving the bigger head element to the tail of the new list. Hereby the space of former pairs of lists in the same stage can be used for the resulting lists. As long as the lists are short, there is enough space to do this (see Figure 1). The horizontal direction in a figure shows the indices of the array, and the vertical direction expresses the size of the contained elements.
Figure 1: Merging two short lists on the left side to one list on the right side
In the last step it can happen that, in the worst case (for the space), there are two lists of length n/2. At some point during this last step, the tail of the new list will hit the head of the second list. In this case, half of the first list (of length n/4) is already merged and the other half is not yet moved. This means that the gap begins at the point where the left end of the resulting list will be. We can then start to merge from the other side (see Figure 2). This run ends exactly when all elements of the first list are moved into the new list.
Figure 2: The last step, merging two lists of length n/2 using a gap of length n/4
Remark: In general it suffices to have a gap of half of the size of the shorter list (if the number of elements is not a power of 2, it may happen that the other list has up to twice this length), or of 1/(2+r) of it, if we do not mind the additional transports for shifting the second list r times. The longer list has to be between the shorter list and the gap.

Remark: This can easily be performed in a stable way by regarding the element in the left list as smaller whenever two elements are equal.
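To make the two-phase merge concrete, here is an illustrative Python sketch for the extreme case of Figure 2 (two runs of length m and a gap of only m/2 cells). The function name, the array layout and the use of plain moves instead of the swaps needed later for the in-place version are our own choices; the paper leaves the implementation abstract.

```python
def merge_with_gap(a, m):
    """Merge the sorted runs a[0:m] and a[m:2m] using only the gap
    a[2m:2m + m//2] as extra room (m even). Phase 1 places the largest
    elements at the right end until the write position meets the second
    run; phase 2 then merges the remainders smallest-first, starting at
    the final left end of the result. On return, a[m//2:] is sorted."""
    n = len(a)
    assert m % 2 == 0 and n == 2 * m + m // 2
    r1, r2, wp = m - 1, 2 * m - 1, n - 1
    # Phase 1: repeatedly move the bigger of the two current head
    # elements to the tail (right end) of the growing output run.
    while r1 >= 0 and r2 >= m and wp > r2:
        if a[r1] > a[r2]:
            a[wp] = a[r1]; r1 -= 1
        else:
            a[wp] = a[r2]; r2 -= 1
        wp -= 1
    if r1 < 0:
        # Run 1 exhausted early: slide the rest of run 2 up to the output.
        while r2 >= m:
            a[wp] = a[r2]; wp -= 1; r2 -= 1
        return
    if r2 < m:
        # Run 2 exhausted early: slide the rest of run 1 up to the output.
        while r1 >= 0:
            a[wp] = a[r1]; wp -= 1; r1 -= 1
        return
    # Collision (wp == r2): exactly half of run 1 is merged. The freed
    # cells now begin at the final left end n - 2m, so merge the
    # remainders from the other side, smallest-first.
    wp, i1, i2 = n - 2 * m, 0, m
    while i1 <= r1:
        if i2 <= r2 and a[i2] < a[i1]:
            a[wp] = a[i2]; i2 += 1
        else:
            a[wp] = a[i1]; i1 += 1
        wp += 1
    # Any elements of run 2 not yet taken are already in place.
```

In the in-place iteration of Section 3, every assignment would be replaced by a swap with an element of the gap, so that no unsorted element is lost.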
3 In-Place Sorting by Iteration

Using the procedure described in Section 2, we can also sort 0.8n of the input elements in-place by treating the remaining 0.2n elements like the gap. Whenever an element is moved by the procedure, it is swapped with an unsorted element from the gap, similar to the method in [HL88].
Figure 3: Reducing the number of unsorted elements to 1/3
In one iteration we sort 2/3 of the unsorted elements with the procedure and merge them into the sorted list, where we again use as the gap the 1/3 of the unsorted part of the array which still remains unsorted (see Figure 3). This means that we merge (2/15)n elements to (4/5)n elements, (2/45)n to (14/15)n, (2/135)n to (44/45)n, and so on.

Remark: In each iteration step we must either shift the long list once, which costs additional transports, or address the storage in a cyclic way throughout the entire algorithm.

Remark: This method mixes the unsorted elements completely, which makes it impossible to perform the algorithm in a stable way.
3.1 Asymmetric Merging

If we want to merge a long list of length l and a short list of length h, we can compare the first element of the short list with the 2^k-th element of the long list; then we can either move 2^k elements of the long list, or we need another k comparisons to insert and move the element of the short list. Hence, in order to merge the two lists, one needs in the worst case ⌊l/2^k⌋ + (k+1)h − 1 comparisons. This was first described in [HL72].
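A possible rendering of this asymmetric merging in Python (our own illustrative sketch; it merges into a separate output list for clarity and also returns the number of comparisons it spent):

```python
def asymmetric_merge(long_run, short_run, k):
    """Merge two sorted lists asymmetrically: one comparison against the
    last element of the next block of 2^k long-run elements decides
    whether the whole block can be moved; otherwise the short-run element
    is binary-inserted into that block with at most k more comparisons.
    Over both phases this needs at most floor(l/2^k) + (k+1)h - 1
    comparisons in the worst case."""
    out, i, comps = [], 0, 0
    block = 1 << k
    for s in short_run:
        # Skip whole blocks of the long run that lie below s.
        while i + block <= len(long_run):
            comps += 1
            if long_run[i + block - 1] < s:
                out.extend(long_run[i:i + block])
                i += block
            else:
                break
        # Binary-insert s among the (at most 2^k - 1) remaining
        # candidates of the current block.
        lo, hi = i, min(i + block - 1, len(long_run))
        while lo < hi:
            mid = (lo + hi) // 2
            comps += 1
            if long_run[mid] < s:
                lo = mid + 1
            else:
                hi = mid
        out.extend(long_run[i:lo])
        out.append(s)
        i = lo
    out.extend(long_run[i:])
    return out, comps
```

For l = 32, h = 3 and k = 3, for example, the bound ⌊l/2^k⌋ + (k+1)h − 1 evaluates to 15 comparisons.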
3.2 The Number of Additional Comparisons

Let us assume that we can sort n elements with n log n − dn + O(1) comparisons. Then we get the following: sorting (4/5)n elements needs (4/5)n(log((4/5)n) − d) + O(1) comparisons. Sorting (2/(5·3^k))n elements for all k needs

  Σ_{k=1}^{log n} (2/(5·3^k))n (log((2/(5·3^k))n) − d) + O(1)
  = (2/5)n [ Σ_{k=1}^{log n} (1/3^k)(log((2/5)n) − d) − log 3 · Σ_{k=1}^{log n} k/3^k ] + O(log n)
  = (2/5)n [ (1/2)(log((2/5)n) − d) − (3/4) log 3 ] + O(log n)
  = (1/5)n [ log n − d + log(2/5) − (3/2) log 3 ] + O(log n)

comparisons. Together this yields n(log n − d + (4/5) log(4/5) + (1/5) log(2/5) − (3/10) log 3) + O(log n) = n(log n − d − 0.9974) + O(log n) comparisons. Merging (2/15)n elements to (4/5)n elements, (2/(5·3^{2m}))n to (1 − 1/(5·3^{2m−1}))n with k = 3m+1, and (2/(5·3^{2m+1}))n to (1 − 1/(5·3^{2m}))n with k = 3m+3 needs

  (1/5)n + 3·(2/15)n
  + n Σ_{m=1}^{log n} [ (1 − 1/(5·3^{2m−1}))/2^{3m+1} + (3m+2)·2/(5·3^{2m}) + (1 − 1/(5·3^{2m}))/2^{3m+3} + (3m+4)·2/(5·3^{2m+1}) ] + O(log n)
  = n [ 3/5 + (5/8) Σ_{m=1}^{log n} 1/8^m − (13/40) Σ_{m=1}^{log n} 1/72^m + (8/5) Σ_{m=1}^{log n} m/9^m + (4/3) Σ_{m=1}^{log n} 1/9^m ] + O(log n)
  = n [ 3/5 + (5/8)·(1/7) − (13/40)·(1/71) + (8/5)·(9/64) + (4/3)·(1/8) ] + O(log n)
  = 1.076n + O(log n)

comparisons. Thus we need only n log n − d_ip·n + O(log n) = n log n − (d − 0.079)n + O(log n) comparisons for the in-place algorithm. For the algorithm described so far, d_ip = 0.8349; the following section will explain this and show how it can be improved.
4 Methods for Reducing the Number of Comparisons

The idea is that we can sort blocks of a chosen length b with c comparisons by Merge-Insertion. If the time cost for one block is O(b^2), then the time we need for all blocks is O(bn) = O(n), since b is fixed.

4.1 Good Applications of Merge-Insertion

Merge-Insertion needs Σ_{k=1}^{b} ⌈log(3k/4)⌉ comparisons to sort b elements [Kn72]. So it works well for b_i := (4^i − 1)/3, where it needs c_i := (2i·4^i + i)/3 − 4^i + 1 comparisons. We prove this by induction: clearly, b_1 = 1 element is sorted with c_1 = 0 comparisons. To sort b_{i+1} = 4b_i + 1 elements we need

  c_{i+1} = c_i + b_i ⌈log((3/2)b_i)⌉ + (2b_i + 1) ⌈log((3/4)b_{i+1})⌉
  = c_i + b_i ⌈log((4^i − 1)/2)⌉ + (2b_i + 1) ⌈log((4^{i+1} − 1)/4)⌉
  = (2i·4^i + i)/3 − 4^i + 1 + ((4^i − 1)/3)(2i − 1) + ((2·4^i + 1)/3)·2i
  = (8i·4^i + i − 4·4^i + 4)/3
  = (2(i+1)·4^{i+1} + (i+1))/3 − 4^{i+1} + 1.
Table 1 shows some instances of b_i and c_i.

  i    b_i     c_i      d_best,i   d_i        d_ip,i
  1    1       0        1          0.9139     0.8349
  2    5       7        1.1219     1.0358     0.9768
  3    21      66       1.2971     1.2110     1.1320
  4    85      429      1.3741     1.2880     1.2090
  5    341     2392     1.4019     1.3158     1.2368
  6    1365    12290    1.4118     1.3257     1.2467
  ...
                        1.415037   1.328966   1.250008

Table 1: The negative linear constant depending on the block size.
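The closed form for b_i and c_i can be checked directly against the comparison count of Merge-Insertion; a small Python sketch (the function names are ours, the formula for F(b) is the one from [Kn72] quoted above):

```python
import math

def merge_insertion_comparisons(b):
    """Worst-case comparisons of Merge-Insertion for b elements:
    F(b) = sum over k = 1..b of ceil(log2(3k/4))."""
    return sum(math.ceil(math.log2(3 * k / 4)) for k in range(1, b + 1))

def b_c(i):
    """The favourable block sizes b_i = (4^i - 1)/3 and the closed form
    c_i = (2i*4^i + i)/3 - 4^i + 1 for their comparison count."""
    b = (4 ** i - 1) // 3
    c = (2 * i * 4 ** i + i) // 3 - 4 ** i + 1
    return b, c
```

For i = 1, ..., 5 this reproduces the pairs (1, 0), (5, 7), (21, 66), (85, 429) and (341, 2392) from Table 1.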
4.2 The Complete Number of Comparisons

In order to merge two lists of lengths l and h, one needs l + h − 1 comparisons. In the best case, where n = b·2^k, we would need 2^k·c comparisons to sort the blocks first and then, in the i-th of the k merging steps, the comparisons to merge 2^{k−i} pairs of lists of length 2^{i−1}b. These are

  2^k·c + Σ_{i=1}^{k} 2^{k−i}(2^{i−1}b + 2^{i−1}b − 1) = 2^k·c + kn − 2^k + 1
  = n log n − n(log b + (1 − c)/b) + 1
  = n log n − n·d_best + 1

comparisons, with d_best := log b + (1 − c)/b. In the general case, the negative linear factor d is not as good as d_best. Let the number of elements be n = (2^k + m)b − j with j < b and m < 2^k. Then we need (2^k + m)c comparisons to sort the blocks, m(2b − 1) − j comparisons to merge m pairs of blocks together, and Σ_{i=1}^{k}(n − 2^{k−i}) = kn − 2^k + 1 comparisons to merge everything together in k steps. The total number of comparisons is

  c_b(n) := (2^k + m)c + m(2b − 1) − j + kn − 2^k + 1
  = n log n − n log(n/2^k) + (2^k + m)(c + 2b − 1) − 2b·2^k − j + 1
  = n log n − n( log(n/2^k) + (2b·2^k)/n − (c + 2b − 1)/b ) + ((c + 2b − 1)/b)·j − j + 1
  ≤ n log n − n( log(2b ln 2) + 1/ln 2 − (c + 2b − 1)/b ) + ((c + b − 1)/b)·j + 1
  = n log n − nd + ((c + b − 1)/b)·j + 1

with d := log(2b ln 2) + 1/ln 2 − (c + 2b − 1)/b. The inequality follows from the fact that the expression log x + 2b/x attains its minimum at x = 2b ln 2, since (log x + 2b/x)′ = 1/(x ln 2) − 2b/x² = 0; here we have set x := n/2^k. Hence we lose at most (2 − log(2 ln 2) − 1/ln 2)n = 0.086071n = (d_best − d)n comparisons in contrast to the case of an ideal value of n.
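Both constants are straightforward to evaluate numerically; the following Python helpers (names ours) reproduce the d_best,i and d_i columns of Table 1:

```python
import math

def d_best(b, c):
    """Negative linear constant for the ideal case n = b * 2^k:
    d_best = log2(b) + (1 - c)/b."""
    return math.log2(b) + (1 - c) / b

def d_general(b, c):
    """Negative linear constant for general n:
    d = log2(2b ln 2) + 1/ln 2 - (c + 2b - 1)/b."""
    ln2 = math.log(2)
    return math.log2(2 * b * ln2) + 1 / ln2 - (c + 2 * b - 1) / b
```

Note that d_best(b, c) − d_general(b, c) = 2 − log(2 ln 2) − 1/ln 2 ≈ 0.086071, independently of b and c.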
Table 1 shows the influence of the block size b_i, and of the number c_i of comparisons needed to sort such a block by Merge-Insertion, on the negative linear constant d_i for the comparisons of the algorithm in Section 2, and on d_ip,i for the algorithm in Section 3. It shows that d_ip can be brought as close as we want to 1.250008. As one can see from Table 1, most of the possible improvement is already reached with relatively small blocks.

Another idea is that we can reduce the additional comparisons of Section 3 to a very small amount if we start with the algorithm of Section 4.3. This allows us to bring the linear constant for the in-place algorithm as close as we want to d_i, and in combination with the block size we can bring it as close as we want to 1.328966.
4.3 A Variant of Merge-sort in (1+ε)n Places

We can change the algorithm of Section 2 in the following way: for a given ε > 0, we choose appropriate r's for the last ⌈log(1/ε)⌉ − 2 steps, according to the first remark in that section. Because the number of these last steps and the r's are constants determined by ε, the additional transports are O(n) (of course, the constant becomes large for small ε).
5 The Reduction of the Number of Transports

The algorithms in Section 2 and Section 4.3 perform n log n + O(n) transports. We can improve this to ε n log n + O(1) for every constant ε > 0 if we combine ⌈1/ε + 1⌉ steps into one by merging 2^{⌈1/ε + 1⌉} (constantly many) lists in each step, as long as the lists are short enough. Hereby we keep the number of comparisons exactly the same, using for example the following technique: we use a binary tree of that (constant) size which contains at every leaf node a pointer to the head of a list ('nil' if the list is empty, and additionally a pointer to its tail), and at every other node that pointer of a child node which points to the bigger element. After moving one element, we need ⌈1/ε + 1⌉ comparisons to repair the tree, as shown in this example:
[Figure: two snapshots of the comparison tree over four lists; after the head element 8 is moved to the tail of the output list, the comparisons on the path from its leaf to the root are repeated to repair the tree.]
Note that if we wanted to reduce the number of transports to o(n log n), the size of the tree would have to grow with n; but then the algorithm would no longer be in-place.

Remark: This method can also be used to reduce the total number of I/Os (for transports and comparisons) to a slow storage to ε n log n + O(1), if we can keep the 2^{⌈1/ε + 1⌉} elements in a faster storage.
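The tree technique can be illustrated by the following Python sketch of a winner tree over K = 2^t lists (our own rendering; it extracts minima instead of building the run from the bigger elements as the paper does, but the repair mechanism is the same: log K comparisons along one leaf-to-root path per output element):

```python
def tournament_merge(runs):
    """Merge K sorted runs with a winner tree: tree[1] is the root,
    tree[K + i] the leaf of run i, and every inner node stores the index
    of the run whose current head element wins the comparison. After one
    element is output, only the comparisons on the path from the winner's
    leaf to the root are repeated (log2 K of them)."""
    K = 1
    while K < len(runs):
        K *= 2                        # pad the number of runs to a power of two
    runs = [list(r) for r in runs] + [[] for _ in range(K - len(runs))]
    pos = [0] * K                     # next unread position in each run
    INF = float("inf")                # sentinel head for exhausted runs

    def head(i):
        return runs[i][pos[i]] if pos[i] < len(runs[i]) else INF

    tree = [0] * (2 * K)
    for i in range(K):
        tree[K + i] = i
    for node in range(K - 1, 0, -1):  # play the initial matches bottom-up
        a, b = tree[2 * node], tree[2 * node + 1]
        tree[node] = a if head(a) <= head(b) else b

    out = []
    for _ in range(sum(len(r) for r in runs)):
        winner = tree[1]
        out.append(head(winner))
        pos[winner] += 1
        node = (K + winner) // 2      # repair: replay matches on one path
        while node >= 1:
            a, b = tree[2 * node], tree[2 * node + 1]
            tree[node] = a if head(a) <= head(b) else b
            node //= 2
    return out
```

With K fixed (determined by ε), the tree has constant size, so the algorithm stays in-place while each element is transported only once per combined step.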
5.1 Transport Costs in the In-Place Procedure

Choosing ε at half the intended value compensates for the factor 2 incurred by swapping instead of moving. But there are still the additional transports in the iteration during the asymmetric merging of Section 3: since log_3 n iterations have to be performed, we would get O(n log n) additional transports if we performed only iterations of the described kind, which need O(n) transports each time. But we can reduce these additional transports to O(n) if we perform a second kind of iteration after every i iterations, for any chosen i. Each time, it halves the number of elements which have to be moved in later iterations. This second kind of iteration works as follows (see Figure 4):

One Quick-sort partition step is performed on the unsorted elements, where the middle element of the sorted list is chosen as the pivot. Then one of the two resulting lists is sorted in the usual way (the other unsorted list is used as the gap) and merged with the corresponding half of the sorted list. These elements will never have to be moved again.

Remark: It is easy to see that we can reduce the number of additional comparisons to εn if we choose i big enough. Although a formal estimate is quite hard, we conjecture that only a few more comparisons are needed if we mainly apply iterations of the second kind, even in the worst case where all unsorted elements are on one side. Clearly this iteration is good on average.
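Schematically, one iteration of the second kind looks as follows (a high-level Python sketch, not the in-place implementation; sorted() and heapq.merge stand in for the gap-based sorting of Section 2 and for the merging step):

```python
import heapq

def second_kind_iteration(sorted_part, unsorted):
    """One iteration of the second kind: partition the unsorted elements
    around the middle element of the sorted list (one Quicksort step),
    sort the upper part (in the real algorithm the lower part serves as
    the gap), and merge it with the upper half of the sorted list. The
    returned suffix is final and never has to be moved again."""
    h = len(sorted_part) // 2
    pivot = sorted_part[h]
    low = [x for x in unsorted if x <= pivot]
    high = [x for x in unsorted if x > pivot]
    finished = list(heapq.merge(sorted_part[h:], sorted(high)))
    # Remaining subproblem: sorted_part[:h] together with `low`.
    return sorted_part[:h], low, finished
```

Every element of the returned suffix is at least as large as everything in the remaining subproblem, which is why those elements reach their final positions.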
Figure 4: Moving half of the elements to their final position.
6 Possible Further Improvements

The algorithm in Section 2 works optimally in the case n = b·2^m and shows its worst behavior (losing 0.086n comparisons) in the case n = 2 ln 2 · b·2^m ≈ 1.386·b·2^m. But we could avoid this for the start of the iteration if we were not fixed to sorting 0.8n of the elements in the first step, but sorted exactly b·2^m elements for (2/5)n < b·2^m ≤ (4/5)n. A simpler method would be to always sort the biggest possible b·2^m in each iteration step. In both cases the k's for the asymmetric merging in the further iterations have to be found dynamically, which makes a proper formal estimate difficult.

For practical applications it is worth noting that the average number of comparisons for asymmetric merging is better than in the worst case: for each element of the smaller list, which is inserted among 2^k − 1 elements, on average some elements of the longer list can also be moved to the new list, so that the average number of comparisons is less than ⌊l/2^k⌋ + (k+1)h − 1. So we conjecture that an optimized in-place algorithm can achieve n log n − 1.4n + O(log n) comparisons on average and n log n − 1.3n + O(log n) comparisons in the worst case, avoiding large constants in the linear amount of transports.

If there exist a constant c > 0 and an in-place algorithm which sorts cn of the input with O(n log n) comparisons and O(n) transports, then the iteration method could be used to solve the open problem in [MR91].
6.0.1 Open Problems

- Does there exist an in-place sorting algorithm with O(n log n) comparisons and O(n) transports [MR91]?
- Does there exist an in-place sorting algorithm which needs only n log n + O(n) comparisons and only o(n log n) transports?
- Does there exist a stable in-place sorting algorithm which needs only n log n + O(n) comparisons and only O(n log n) transports?
References

[Ca87] S. Carlsson: A variant of HEAPSORT with almost optimal number of comparisons. Information Processing Letters 24:247-250, 1987.

[Fl91] R. Fleischer: A tight lower bound for the worst case of bottom-up-heapsort. Technical Report MPI-I-91-104, Max-Planck-Institut für Informatik, D-W-6600 Saarbrücken, Germany, April 1991.

[HL88] B.-C. Huang and M. A. Langston: Practical in-place merging. CACM, 31:348-352, 1988.

[HL72] F. K. Hwang and S. Lin: A simple algorithm for merging two disjoint linearly ordered sets. SIAM J. Computing 1:31-39, 1972.

[Kn72] D. E. Knuth: The Art of Computer Programming, Volume 3: Sorting and Searching. Addison-Wesley, 1973.

[Me84] K. Mehlhorn: Data Structures and Algorithms, Vol. 1: Sorting and Searching. Springer-Verlag, Berlin/Heidelberg, 1984.

[MR91] J. I. Munro and V. Raman: Fast stable in-place sorting with O(n) data moves. Proceedings of the FST&TCS, LNCS 560:266-277, 1991.

[We90] I. Wegener: BOTTOM-UP-HEAPSORT, a new variant of HEAPSORT beating on average QUICKSORT (if n is not very small). Proceedings of the MFCS90, LNCS 452:516-522, 1990.

[We91] I. Wegener: The worst case complexity of McDiarmid and Reed's variant of BOTTOM-UP-HEAPSORT is less than n log n + 1.1n. Proceedings of the STACS91, LNCS 480:137-147, 1991.
... Reinhardt [42] used this trick (among others) to design an internal Mergesort variant that needs n lg n−1.329n±O(log n) comparisons in the worst case. Unfortunately, implementations of this InPlaceMergesort algorithm have not been documented. ...
... Indeed, the copying in Algorithm C.1 can be avoided by sorting the first subproblem recursively with "ping-pong" sort into the desired 4 Merging can be done in place using more advanced tricks (see, e.g., [19,34]), but those tend not to be competitive in terms of running time with other sorting methods. By changing the global structure, a "pure" internal Mergesort variant [29,42] can be achieved using part of the input as a buffer (as in QuickMergesort) at the expense of occasionally having to merge runs of very different lengths. ...
... Reinhardt's merge A third, less obvious alternative was proposed by Reinhardt [42], which allows to use an even smaller α for merges where input and buffer area form a contiguous region; see Fig. 4. Assume we are given an array A with positions A [1, . . . , t] being empty or containing dummy elements (to simplify the description, we assume the first case), A[t + 1, . . . ...
Article
Full-text available
QuickXsort is a highly efficient in-place sequential sorting scheme that mixes Hoare’s Quicksort algorithm with X, where X can be chosen from a wider range of other known sorting algorithms, like Heapsort, Insertionsort and Mergesort. Its major advantage is that QuickXsort can be in-place even if X is not. In this work we provide general transfer theorems expressing the number of comparisons of QuickXsort in terms of the number of comparisons of X. More specifically, if pivots are chosen as medians of (not too fast) growing size samples, the average number of comparisons of QuickXsort and X differ only by o(n)-terms. For median-of-k pivot selection for some constant k, the difference is a linear term whose coefficient we compute precisely. For instance, median-of-three QuickMergesort uses at most nlgn-0.8358n+O(logn) comparisons. Furthermore, we examine the possibility of sorting base cases with some other algorithm using even less comparisons. By doing so the average-case number of comparisons can be reduced down to nlgn-1.4112n+o(n) for a remaining gap of only 0.0315n comparisons to the known lower bound (while using only O(logn) additional space and O(nlogn) time overall). Implementations of these sorting strategies show that the algorithms challenge well-established library implementations like Musser’s Introsort.
... This observation allows to apply the median-of-medians algorithm to smaller samples leading to both a better average-and worst-case performance. Our algorithm is based on a merging procedure introduced by Reinhardt [32], which requires less temporary space than the usual merging. A further improvement, which we call undersampling (taking less elements for pivot selection into account), allows to reduce the worst-case number of comparisons down to n log n + 1.59n + O(n 0.8 ). ...
... In [32], Reinhardt describes how to merge two subsequent sequences in an array using ...
... As already mentioned in Section 3.2, Reinhardt's merging procedure [32] works also with less than one fifth of the whole array as temporary space if we do not require to merge sequences of equal length. Thus, we can allow the pivot to be even further off the median -with the cost of making the Mergesort part more expensive due to imbalanced merging. ...
... This observation allows to apply the median-of-medians algorithm to smaller samples leading to both a better average-and worst-case performance. Our algorithm is based on a merging procedure introduced by Reinhardt [32], which requires less temporary space than the usual merging. A further improvement, which we call undersampling (taking less elements for pivot selection into account), allows to reduce the worst-case number of comparisons down to n log n + 1.59n + O(n 0.8 ). ...
... In [32], Reinhardt describes how to merge two subsequent sequences in an array using additional space for only half the number of elements in one of the two sequences. The additional space should be located in front or after the two sequences. ...
... As already mentioned in Section 3.2, Reinhardt's merging procedure [32] works also with less than one fifth of the whole array as temporary space if we do not require to merge sequences of equal length. Thus, we can allow the pivot to be even further off the median -with the cost of making the Mergesort part more expensive due to imbalanced merging. ...
Preprint
Full-text available
The two most prominent solutions for the sorting problem are Quicksort and Mergesort. While Quicksort is very fast on average, Mergesort additionally gives worst-case guarantees, but needs extra space for a linear number of elements. Worst-case efficient in-place sorting, however, remains a challenge: the standard solution, Heapsort, suffers from a bad cache behavior and is also not overly fast for in-cache instances. In this work we present median-of-medians QuickMergesort (MoMQuickMergesort), a new variant of QuickMergesort, which combines Quicksort with Mergesort allowing the latter to be implemented in place. Our new variant applies the median-of-medians algorithm for selecting pivots in order to circumvent the quadratic worst case. Indeed, we show that it uses at most $n \log n + 1.6n$ comparisons for $n$ large enough. We experimentally confirm the theoretical estimates and show that the new algorithm outperforms Heapsort by far and is only around 10% slower than Introsort (std::sort implementation of stdlibc++), which has a rather poor guarantee for the worst case. We also simulate the worst case, which is only around 10% slower than the average case. In particular, the new algorithm is a natural candidate to replace Heapsort as a worst-case stopper in Introsort.
... However, MergeInsertion and Insertionsort can be used to sort small subarrays such that the quadratic running time for these subarrays is small in comparison to the overall running time. Reinhardt [15] used this technique to design an internal Mergesort variant that needs in the worst case n log n − 1.329n + O(log n) comparisons. Unfortunately, implementations of this InPlaceMergesort algorithm have not been documented. ...
... We emphasize that the average of our best implementation has a proven gap of at most 0.05n + o(n) comparisons to the lower bound. The value n log n − 1.4n for n = 2 k matches one side of Reinhardt's conjecture that an optimized in-place algorithm can have n log n − 1.4n + O(log n) comparisons in the average [15]. Moreover, our experimental results validate the theoretical considerations and indicate that the factor −1.43 can be beaten. ...
Conference Paper
Full-text available
In this paper we generalize the idea of QuickHeapsort leading to the notion of QuickXsort. Given some external sorting algorithm X, QuickXsort yields an internal sorting algorithm if X satisfies certain natural conditions. We show that up to o(n) terms the average number of comparisons incurred by QuickXsort is equal to the average number of comparisons of X. We also describe a new variant of WeakHeapsort. With QuickWeakHeapsort and QuickMergesort we present two examples for the QuickXsort construction. Both are efficient algorithms that perform approximately n logn − 1.26n + o(n) comparisons on average. Moreover, we show that this bound also holds for a slight modification which guarantees an $$n \log n + \mathcal{O}(n)$$ bound for the worst case number of comparisons. Finally, we describe an implementation of MergeInsertion and analyze its average case behavior. Taking MergeInsertion as a base case for QuickMergesort, we establish an efficient internal sorting algorithm calling for at most n logn − 1.3999n + o(n) comparisons on average. QuickMergesort with constant size base cases shows the best performance on practical inputs and is competitive to STL-Introsort.
... Reinhardt [12] used this technique to design an internal Mergesort variant that needs in the worst case n log n − 1.329n + O(log n) comparisons. Unfortunately, implementations of this InPlaceMergesort algorithm have not been documented. ...
... As another example for QuickXsort we consider QuickMergesort. For the Mergesort part we use standard (top-down) Mergesort which can be implemented using m extra spaces to merge two arrays of length m (there are other methods like in [12] which require less space -but for our purposes this is good enough). The procedure is depicted in Fig. 2. We sort the larger half of the partitioned array with Mergesort as long as we have one third of the whole array as temporary memory left, otherwise we sort the smaller part with Mergesort. ...
Preprint
Full-text available
In this paper we generalize the idea of QuickHeapsort leading to the notion of QuickXsort. Given some external sorting algorithm X, QuickXsort yields an internal sorting algorithm if X satisfies certain natural conditions. With QuickWeakHeapsort and QuickMergesort we present two examples for the QuickXsort-construction. Both are efficient algorithms that incur approximately n log n - 1.26n +o(n) comparisons on the average. A worst case of n log n + O(n) comparisons can be achieved without significantly affecting the average case. Furthermore, we describe an implementation of MergeInsertion for small n. Taking MergeInsertion as a base case for QuickMergesort, we establish a worst-case efficient sorting algorithm calling for n log n - 1.3999n + o(n) comparisons on average. QuickMergesort with constant size base cases shows the best performance on practical inputs: when sorting integers it is slower by only 15% to STL-Introsort.
... While other comparison-efficient in-place sorting methods are known (e.g. [15,9,8]), the ones based on QuickXSort and elementary methods X are particularly easy to implement 1 since one can adapt existing implementations for X. In such an implementation, the tried and tested optimization to choose the pivot as the median of a small sample suggests itself to improve QuickXSort. ...
Article
Full-text available
QuickXSort is a strategy to combine Quicksort with another sorting method X, so that the result has essentially the same comparison cost as X in isolation, but sorts in place even when X requires a linear-size buffer. We solve the recurrence for QuickXSort precisely up to the linear term including the optimization to choose pivots from a sample of k elements. This allows to immediately obtain overall average costs using only the average costs of sorting method X (as if run in isolation). We thereby extend and greatly simplify the analysis of QuickHeapsort and QuickMergesort with practically efficient pivot selection, and give the first tight upper bounds including the linear term for such methods.
... In Algorithm 2, filterCandidate; getPutativeTargetScore ; Rank; rocAnalysis; auprAnalysis; append- ToWeightList and getWeightList procedures require O(|V bi,L[i] | 2 + |E bi,L[i] |); O(|V L[i] |); O(|V L[i] |log(|V L[i] |)); O(|V L[i] | 2 ) [14]; O(|V L[i] | 2 ); O(1) and O(1) time, respectively , where |V bi,L[i] | and |E bi,L[i] | are the number of nodes and edges of the bipartite graph of Li (the i th network in L) and |V L[i] | is the number of nodes in the Li. In getTopologicalRank procedure, feature extraction takes G(Xall); generation of the tfb scores takes O(|ν|) [5] where ν is the number of support vectors (in the worst case, |ν| = |V L[i] |); and ranking using heapsort takes O(|V L[i] |log(|V L[i] |)) time [35]. Hence, the time complexity of getTopologicalRank is G(Xall) + O(|V L[i] |log(|V L[i] |)). ...
Conference Paper
Full-text available
Target prioritization ranks molecules in biological networks according to a score that seeks to identify molecules that fulfill particular roles (e.g., drug targets). We study this problem in the context of partial information (e.g., unknown targets) and present TAPESTRY, a network-based approach that prioritizes candidate targets in a given signaling network with unknown targets by utilizing knowledge (target characteristics) gained from curated targets in another set of signaling networks. We consider both topological and dynamic features and use a weighted sum approach to examine the relative influence of these two classes of features on the prioritization results. TAPESTRY exploits a knowledge base of characterization models and predictive topological features of a set of signaling networks (candidate networks) with curated targets. Then, given a signaling network G with unknown targets, TAPESTRY identifies a candidate network most similar to G and selects its characterization model as prioritization model for computing a topological feature-based rank of each candidate node in G. Next, a dynamic feature-based rank is computed for these nodes by leveraging the time-series curves of ODEs associated with the edges in G. Finally, these two ranks are integrated and used for prioritizing candidate targets. We experimentally study the performance of TAPESTRY using signaling networks from BioModels with real-world curated outcomes. Our results demonstrate its effectiveness and superiority in comparison to state-of-the-art approaches.
Article
Full-text available
QuickHeapsort is a combination of Quicksort and Heapsort. We show that the expected number of comparisons for QuickHeapsort is always better than for Quicksort if a usual median-of-constant strategy is used for choosing pivot elements. In order to obtain the result we present a new analysis for QuickHeapsort, splitting it into the analysis of the partition phases and the analysis of the heap phases. This enables us to consider samples of non-constant size for the pivot selection and leads to better theoretical bounds for the algorithm. Furthermore, we introduce some modifications of QuickHeapsort. We show that for every input the expected number of comparisons is at most n log₂ n − 0.03n + o(n) for the in-place variant. If we allow n extra bits, then we can lower the bound to n log₂ n − 0.997n + o(n). Thus, spending n extra bits we can save more than 0.96n comparisons if n is large enough. Both estimates improve the previously known results. Moreover, our non-in-place variant essentially uses the same number of comparisons as index-based Heapsort variants and Relaxed-Weak-Heapsort, which use n log₂ n − 0.9n + o(n) comparisons in the worst case.
However, index-based Heapsort variants and Relaxed-Weak-Heapsort require Θ(n log n) extra bits, whereas we need only n bits. Our theoretical results are upper bounds and valid for every input. Our computer experiments show that the gap between our bounds and the actual values on random inputs is small. Moreover, the computer experiments establish QuickHeapsort as competitive with Quicksort in terms of running time.
Article
Full-text available
MergeInsertion, also known as the Ford-Johnson algorithm, is a sorting algorithm which, up to today, for many input sizes achieves the best known upper bound on the number of comparisons. Indeed, it gets extremely close to the information-theoretic lower bound. While the worst-case behavior is well understood, only little is known about the average case. This work takes a closer look at the average case behavior. In particular, we establish an upper bound of n log n − 1.4005n + o(n) comparisons. We also give an exact description of the probability distribution of the length of the chain a given element is inserted into and use it to approximate the average number of comparisons numerically. Moreover, we compute the exact average number of comparisons for n up to 148. Furthermore, we experimentally explore the impact of different decision trees for binary insertion. To conclude, we conduct experiments showing that a slightly different insertion order leads to a better average case and we compare the algorithm to Manacher's combination of merging and MergeInsertion as well as to the recent combined algorithm with (1,2)-Insertionsort by Iwama and Teruyama.
Conference Paper
MergeInsertion, also known as the Ford-Johnson algorithm, is a sorting algorithm which, up to today, for many input sizes achieves the best known upper bound on the number of comparisons. Indeed, it gets extremely close to the information-theoretic lower bound. While the worst-case behavior is well understood, only little is known about the average case. This work takes a closer look at the average case behavior. In particular, we establish an upper bound of n log n − 1.4005n + o(n) comparisons. We also give an exact description of the probability distribution of the length of the chain a given element is inserted into and use it to approximate the average number of comparisons numerically. Moreover, we compute the exact average number of comparisons for n up to 148. Furthermore, we experimentally explore the impact of different decision trees for binary insertion. To conclude, we conduct experiments showing that a slightly different insertion order leads to a better average case and we compare the algorithm to the recent combination with (1,2)-Insertionsort by Iwama and Teruyama.
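The pairing-and-binary-insertion scheme described in the two abstracts above can be sketched as follows. This simplified version omits the Jacobsthal-number insertion order that makes the real Ford-Johnson algorithm comparison-optimal, so it sorts correctly but does not achieve the stated comparison bound.

```python
from bisect import insort

def merge_insertion_sort(a):
    """Simplified Ford-Johnson (MergeInsertion) sketch: form pairs,
    recursively sort the larger elements, then binary-insert the smaller
    ones and a possible leftover.  The comparison-optimal Jacobsthal
    insertion order of the real algorithm is omitted for brevity."""
    if len(a) <= 1:
        return list(a)
    # pair up elements; each pair is ordered with one comparison
    pairs = [(min(x, y), max(x, y)) for x, y in zip(a[::2], a[1::2])]
    # recursively sort the larger element of each pair
    result = merge_insertion_sort([hi for _, hi in pairs])
    for lo, _ in pairs:
        insort(result, lo)        # binary search + insert
    if len(a) % 2:
        insort(result, a[-1])     # odd leftover element
    return result
```

Each insertion costs only about log n comparisons via binary search, which is where the algorithm's advantage over straight merging comes from.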
Article
Full-text available
We present a novel, yet straightforward linear-time algorithm for merging two sorted lists in a fixed amount of additional space. Constant of proportionality estimates and empirical testing reveal that this procedure is reasonably competitive with merge routines free to squander unbounded additional memory, making it particularly attractive whenever space is a critical resource.
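As one illustration of merging two sorted runs with constant extra space — not the paper's own linear-time routine, but the rotation-based SymMerge scheme (as used, for example, in Go's standard library) — the idea can be sketched as:

```python
def _reverse(a, i, j):
    # reverse a[i:j] in place
    j -= 1
    while i < j:
        a[i], a[j] = a[j], a[i]
        i, j = i + 1, j - 1

def _rotate(a, lo, mid, hi):
    # rotate a[lo:hi] so that a[mid:hi] comes before a[lo:mid],
    # using three in-place reversals (O(1) extra space)
    _reverse(a, lo, mid)
    _reverse(a, mid, hi)
    _reverse(a, lo, hi)

def symmerge(a, lo, mid, hi):
    """Merge the sorted runs a[lo:mid] and a[mid:hi] in place.  Uses O(1)
    extra space but O(n log n) time -- a different trade-off than the
    linear-time fixed-space method of the paper above."""
    if lo >= mid or mid >= hi:
        return
    h = (lo + hi) // 2
    n = h + mid
    if mid > h:
        start, r = n - hi, h
    else:
        start, r = lo, mid
    p = n - 1
    while start < r:               # binary search for the symmetric split
        c = (start + r) // 2
        if not a[p - c] < a[c]:
            start = c + 1
        else:
            r = c
    end = n - start
    if start < mid < end:
        _rotate(a, start, mid, end)
    if lo < start < h:
        symmerge(a, lo, start, h)
    if h < end < hi:
        symmerge(a, h, end, hi)
```

The symmetric binary search finds equal-length prefixes/suffixes to exchange by rotation, then the two remaining sub-problems are merged recursively.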
Conference Paper
Full-text available
Bottom-Up-Heapsort is a variant of Heapsort. Its worst-case complexity for the number of comparisons is known to be bounded from above by 3/2 n log n + O(n), where n is the number of elements to be sorted. There is also an example of a heap which needs 5/4 n log n − O(n log log n) comparisons. We show in this paper that the upper bound is asymptotically tight, i.e. we prove for large n the existence of heaps which need at least c_n(3/2 n log n − O(n log log n)) comparisons, where c_n = 1 − 1/log² n converges to 1. This result also proves the old conjecture that the best case for classical Heapsort needs only asymptotically n log n + O(n log log n) comparisons.
Conference Paper
Full-text available
Bottom-Up-Heapsort is a variant of Heapsort. Its worst-case complexity for the number of comparisons is known to be bounded from above by 3/2 n log n + O(n), where n is the number of elements to be sorted. There is also an example of a heap which needs 5/4 n log n − O(n log log n) comparisons. We show in this paper that the upper bound is asymptotically tight, i.e., we prove for large n the existence of heaps which need at least 3/2 n log n − O(n log log n) comparisons. This result also proves the old conjecture that the best case for classical Heapsort needs only asymptotically n log n + O(n log log n) comparisons.
Conference Paper
Until recently, it was not known whether it was possible to stably sort (i.e. keeping equal elements in their initial order) an array of n elements using only O(n) data movements and O(1) extra space. In [10], an algorithm was given to perform this task with O(n²) comparisons in the worst case. Here, we develop a new algorithm for the problem that performs only O(n^{1+ε}) comparisons (where 0 < ε < 1 is any fixed constant) in the worst case. This bound on the number of comparisons matches (asymptotically) the best known bound for the same problem with the stability constraint dropped.
Article
A variant of HEAPSORT, called BOTTOM-UP-HEAPSORT, is presented. It is based on a new reheap procedure. This sequential sorting algorithm is easy to implement and beats, on average, QUICKSORT if n ⩾ 400 and a clever version of QUICKSORT (where the split object is the median of 3 randomly chosen objects) if n ⩾ 16000. The worst-case number of comparisons is bounded by 1.5n log n + O(n). Moreover, the new reheap procedure improves the delete procedure for the heap data structure for all n.
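The bottom-up reheap idea behind this variant can be sketched as follows — a common textbook formulation, not necessarily identical to the paper's exact procedure: descend to a leaf along the path of larger children with one comparison per level, then climb back up to find where the old root belongs.

```python
def sift_down(a, root, end):
    """Bottom-up reheap: one comparison per level on the way down to a
    leaf, then a (usually short) climb back up for the old root value."""
    val = a[root]
    j = root
    while 2 * j + 2 < end:            # both children exist
        j = 2 * j + 1 if a[2 * j + 1] > a[2 * j + 2] else 2 * j + 2
    if 2 * j + 1 < end:               # only a left child remains
        j = 2 * j + 1
    while a[j] < val:                 # climb until val fits
        j = (j - 1) // 2
    while j > root:                   # rotate val down the special path
        a[j], val = val, a[j]
        j = (j - 1) // 2
    a[root] = val

def bottom_up_heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # build the max-heap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):       # repeatedly extract the maximum
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
```

The saving comes from the descent phase: classical reheap spends two comparisons per level (child vs. child, then child vs. inserted element), while the bottom-up descent spends only one and pays for the few extra levels on the climb back up.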
Article
An algorithm, which asymptotically halves the number of comparisons made by the common Heapsort, is presented and analysed in the worst case. The number of comparisons is shown to be (n+1)(log(n+1)+log log(n+1)+1.82)+O(log n) in the worst case to sort n elements, without using any extra space. Quicksort, which usually is referred to as the fastest in-place sorting method, uses 1.38n log n − O(n) in the average case (see Gonnet (1984)).
Conference Paper
BOTTOM-UP-HEAPSORT is a variant of HEAPSORT which beats on average even the clever variants of QUICKSORT, if n is not very small. Up to now, the worst-case complexity of BOTTOM-UP-HEAPSORT could only be estimated by 1.5n log n. McDiarmid and Reed (1989) have presented a variant of BOTTOM-UP-HEAPSORT which needs extra storage for n bits. The worst-case number of comparisons of this (almost internal) sorting algorithm is estimated by n log n + 1.1n. It is discussed how many comparisons can be saved on average.
Article
In this paper we present a new algorithm for merging two linearly ordered sets which requires substantially fewer comparisons than the commonly used tape merge or binary insertion algorithms. Bounds on the difference between the number of comparisons required by this algorithm and the information theory lower bounds are derived. Results from a computer implementation of the new algorithm are given and compared with a similar implementation of the tape merge algorithm.
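The regime this merging algorithm targets — lists of very different lengths — can be illustrated with plain binary insertion. This is a hedged stand-in, not the Hwang-Lin algorithm itself (which probes the longer list at spacing 2^t rather than bisecting the whole remainder), but it shows why binary search beats tape merge when one list is much shorter.

```python
from bisect import bisect_left

def binary_merge(short, long):
    """Merge two sorted lists by binary-inserting each element of the
    shorter one into the longer one.  With m = len(short) << n = len(long)
    this needs about m*log(n) comparisons, far fewer than the m + n - 1
    of tape merge.  Illustrative only -- not the actual probing scheme
    of the paper."""
    out = list(long)
    for x in short:
        out.insert(bisect_left(out, x), x)
    return out
```

For lists of comparable length, tape merge remains better; the paper's contribution is an algorithm that interpolates between the two regimes.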