Figure 3 - uploaded by Sebastian Wild
Predicted growth function for the number of executed Bytecodes on almost sorted arrays for JRE7 (gray dashed line and squares) and JRE7 (1,3) (blue solid line and circles), normalized by n ln n. The logarithmic horizontal axis depicts the input size. The model was trained on sizes up to 10^6 and also fits the larger data.
Source publication
Recent results on Java 7's dual pivot Quicksort have revealed its highly asymmetric nature. These insights suggest that asymmetric pivot choices are preferable to symmetric ones for this Quicksort variant. From a theoretical point of view, this should allow us to improve on the current implementation in Oracle's Java 7 runtime library. In this pape...
Citations
... And for the other sums in Equation (3): [15] and [16]). This is the same value as the expected number of comparisons when one pivot is chosen in the classical Quicksort [17]. ...
Sorting an array of objects such as integers, bytes, or floats is considered one of the most important problems in computer science. Quicksort is an effective and widely studied algorithm that sorts an array of n distinct elements using a single pivot. Recently, a modified version of classical Quicksort due to Vladimir Yaroslavskiy was chosen as the standard sorting algorithm for Oracle's Java 7 runtime library. The purpose of this paper is to present the different behavior of classical Quicksort and dual-pivot Quicksort in terms of complexity. In particular, we discuss the convergence of the dual-pivot Quicksort process using the contraction method. Moreover, we show that the distribution of the number of comparisons made by the dual-pivot process converges to a unique fixed point.
... This however leads to very unbalanced distributions of sizes for the recursive calls, such that a trade-off between partitioning costs and balance of subproblem sizes has to be found. We have demonstrated experimentally that there is potential to tune dual-pivot Quicksort using skewed pivots (Wild et al., 2013c), but only considered a small part of the parameter space. It will be the purpose of this paper to identify the optimal way to sample pivots by means of a precise analysis of the resulting overall costs, and to validate (and extend) the empirical findings that way. ...
... the code to count key comparisons, swaps and scanned elements. For counting the number of executed Java Bytecode instructions, we used our tool MaLiJAn, which can automatically generate code to count the number of Bytecodes (Wild et al., 2013c). All reported counts are averages of runs on 1000 random permutations of the same size. ...
... is the optimal sampling parameter for most k (and all values for k in with t = (0, 1, 2), despite using the same sample size k = 5. Whether this also results in a performance gain in practice, however, depends on details of the runtime environment (Wild et al., 2013c). (One should also note that the savings are only 2% and 4%, respectively.) Since these two cost measures (Bytecodes and scanned elements) are arguably the ones with the highest impact on the running time, it is very good news from the practitioner's point of view that the optimal choice for one of them is also reasonably good for the other; such a choice should yield a close-to-optimal running time (as far as sampling is involved). ...
The new dual-pivot Quicksort by Vladimir Yaroslavskiy - used in Oracle's Java
runtime library since version 7 - features intriguing asymmetries in its
behavior. They were shown to cause a basic variant of this algorithm to use
fewer comparisons than classic single-pivot Quicksort implementations. In this
paper, we extend the analysis to the case where the two pivots are chosen as
fixed order statistics of a random sample. Surprisingly, dual-pivot Quicksort
then needs more comparisons than a corresponding version of classic Quicksort,
so it is clear that counting comparisons is not sufficient to explain the
running time advantages observed for Yaroslavskiy's algorithm in practice.
Consequently, we take a more holistic approach in this paper and also give the
precise leading term of the average number of swaps, the number of executed
Java Bytecode instructions and the number of I/Os in the external-memory model
and determine the optimal order statistics for each of these cost measures. It
turns out that - unlike for classic Quicksort, where it is optimal to choose
the pivot as median of the sample - the asymmetries in Yaroslavskiy's algorithm
render pivots with a systematic skew more efficient than the symmetric choice.
Moreover, we finally have a convincing explanation for the success of
Yaroslavskiy's algorithm in practice: Compared with corresponding versions of
classic single-pivot Quicksort, dual-pivot Quicksort needs significantly fewer
I/Os, both with and without pivot sampling.
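The pivot-sampling idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `SampledDualPivotSketch`, the parameter names `t1`, `t2`, `t3`, `w`, and the choice to use the first k elements of each subarray as the sample are assumptions made for this example. The sampling vector t = (t1, t2, t3) says how many sample elements lie below, between, and above the two pivots, so the sample size is k = t1 + t2 + t3 + 2; subarrays of at most `w` elements are handed to Insertionsort.

```java
import java.util.Arrays;
import java.util.Random;

/** Sketch only: dual-pivot Quicksort with pivots taken as order statistics of a sample. */
final class SampledDualPivotSketch {
    private final int t1, t2, k, w;

    /** t = (t1, t2, t3): number of sample elements below, between and above the pivots. */
    SampledDualPivotSketch(int t1, int t2, int t3, int w) {
        this.t1 = t1;
        this.t2 = t2;
        this.k = t1 + t2 + t3 + 2;   // sample size
        this.w = Math.max(w, k);     // Insertionsort threshold, never below k
    }

    void sort(int[] a) { sort(a, 0, a.length - 1); }

    private void sort(int[] a, int lo, int hi) {
        if (hi - lo + 1 <= w) { insertionSort(a, lo, hi); return; }
        // Assumption of this sketch: the sample is the first k elements of the subarray.
        insertionSort(a, lo, lo + k - 1);
        swap(a, lo + t1 + t2 + 1, hi);   // larger pivot q moves to the right end
        swap(a, lo + t1, lo);            // smaller pivot p moves to the left end
        int p = a[lo], q = a[hi];
        int l = lo + 1, g = hi - 1, i = l;
        // Yaroslavskiy-style partitioning loop: [< p | p..q | unseen | >= q]
        while (i <= g) {
            if (a[i] < p) { swap(a, i, l); l++; }
            else if (a[i] >= q) {
                while (a[g] > q && i < g) g--;
                swap(a, i, g); g--;
                if (a[i] < p) { swap(a, i, l); l++; }
            }
            i++;
        }
        l--; g++;
        swap(a, lo, l); swap(a, hi, g);  // put both pivots into their final positions
        sort(a, lo, l - 1); sort(a, l + 1, g - 1); sort(a, g + 1, hi);
    }

    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int x = a[i], j = i - 1;
            while (j >= lo && a[j] > x) { a[j + 1] = a[j]; j--; }
            a[j + 1] = x;
        }
    }

    private static void swap(int[] a, int i, int j) { int tmp = a[i]; a[i] = a[j]; a[j] = tmp; }

    public static void main(String[] args) {
        int[] a = new Random(7).ints(1_000, 0, 10_000).toArray();
        new SampledDualPivotSketch(0, 1, 2, 20).sort(a);   // skewed choice t = (0, 1, 2)
        System.out.println(Arrays.toString(Arrays.copyOf(a, 10)));
    }
}
```

With t = (1, 1, 1) the sketch picks the tertiles of a five-element sample; t = (0, 1, 2) is one of the skewed choices discussed in the citing snippets. Sorting the sample in place is simpler than what a tuned implementation would do, so the sketch should not be used to compare cost measures.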
... However, the classic partitioning methods treat elements smaller and larger than the pivot in symmetric ways - unlike Yaroslavskiy's partitioning algorithm: Depending on how elements relate to the two pivots, one of five different execution paths is taken in the partitioning loop, and these have highly different costs (Wild et al., 2013c)! How often each of these five paths is taken thus depends on the ranks of the two pivots, which we can push in a certain direction by selecting other order statistics of a sample than the tertiles. ...
... It is interesting to note in this context that the implementation in Oracle's Java 7 runtime library - which uses t = (1, 1, 1) - executes asymptotically more Bytecodes (on random permutations) than Y^w_t with t = (0, 1, 2), despite using the same sample size k = 5. Whether this also results in a performance gain in practice, however, depends on details of the runtime environment (Wild et al., 2013c). ...
... In this paper, we gave the precise leading term asymptotic of the average costs of Quicksort with Yaroslavskiy's dual-pivot partitioning method and selection of pivots as arbitrary order statistics of a constant-size sample. Our results confirm earlier empirical findings (Yaroslavskiy, 2010; Wild et al., 2013c) that the inherent asymmetries of the partitioning algorithm call for a systematic skew in selecting the pivots - the tuning of which requires a quantitative understanding of the delicate trade-off between partitioning costs and the distribution of subproblem sizes for recursive calls. Moreover, we have demonstrated that this tuning process is very sensitive to the choice of suitable cost measures, which firmly suggests detailed analyses in the style of Knuth, instead of focusing on the number of comparisons and swaps only. ...
The new dual-pivot Quicksort by Vladimir Yaroslavskiy - used in Oracle's Java
runtime library since version 7 - features intriguing asymmetries in its
behavior. They were shown to cause a basic variant of this algorithm to use
fewer comparisons than classic single-pivot Quicksort implementations. In this
paper, we extend the analysis to the case where the two pivots are chosen as
fixed order statistics of a random sample and give the precise leading term of
the average number of comparisons, swaps and executed Java Bytecode
instructions. It turns out that - unlike for classic Quicksort, where it is
optimal to choose the pivot as median of the sample - the asymmetries in
Yaroslavskiy's algorithm render pivots with a systematic skew more efficient
than the symmetric choice. Moreover, the optimal skew heavily depends on the
employed cost measure; most strikingly, abstract costs like the number of swaps
and comparisons yield a very different result than counting Java Bytecode
instructions, which can be assumed most closely related to actual running time.
... Future research may focus on this scenario, trying to identify an optimal choice for the pivots. Related results are known for classic Quickselect [26, 28] and Yaroslavskiy's algorithm in Quicksorting [48]. Furthermore, it would be interesting to extend our analysis to the number of bit comparisons instead of atomic key comparisons. ...
There is excitement within the algorithms community about a new partitioning
method introduced by Yaroslavskiy. This algorithm renders Quicksort slightly
faster than the case when it runs under classic partitioning methods. We show
that this improved performance in Quicksort is not sustained in Quickselect, a
variant of Quicksort for finding order statistics. We investigate the number of
comparisons made by Quickselect to find a key with a randomly selected rank
under Yaroslavskiy's algorithm. This grand averaging is a smoothing operator
over all individual distributions for specific fixed order statistics. We give
the exact grand average. The grand distribution of the number of comparisons
(when suitably scaled) is given as the fixed-point solution of a distributional
equation of a contraction in the Zolotarev metric space. Our investigation
shows that Quickselect under older partitioning methods slightly outperforms
Quickselect under Yaroslavskiy's algorithm, for an order statistic of a random
rank. Similar results are obtained for extremal order statistics, where again
we find the exact average, and the distribution for the number of comparisons
(when suitably scaled). Both limiting distributions are of perpetuities (a sum
of products of independent mixed continuous random variables).
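To make the object of study concrete, here is a minimal sketch of Quickselect built on Yaroslavskiy-style partitioning. It assumes the first and last elements of the current range serve as the two pivots and that only one of the three segments is searched further; the class and method names are illustrative and not taken from the cited work.

```java
import java.util.Arrays;
import java.util.Random;

/** Sketch only: Quickselect using Yaroslavskiy-style dual-pivot partitioning. */
final class DualPivotQuickselectSketch {

    /** Returns the element of 0-based rank {@code rank}; reorders {@code a} in place. */
    static int select(int[] a, int rank) {
        if (rank < 0 || rank >= a.length) throw new IllegalArgumentException("rank out of range");
        int lo = 0, hi = a.length - 1;
        while (lo < hi) {
            if (a[lo] > a[hi]) swap(a, lo, hi);
            int p = a[lo], q = a[hi];            // pivots with p <= q
            int l = lo + 1, g = hi - 1, i = l;
            while (i <= g) {
                if (a[i] < p) { swap(a, i, l); l++; }
                else if (a[i] >= q) {
                    while (a[g] > q && i < g) g--;
                    swap(a, i, g); g--;
                    if (a[i] < p) { swap(a, i, l); l++; }
                }
                i++;
            }
            l--; g++;
            swap(a, lo, l); swap(a, hi, g);      // p now sits at index l, q at index g
            // Continue in the single segment that contains the sought rank.
            if (rank < l) hi = l - 1;
            else if (rank == l) return a[l];
            else if (rank < g) { lo = l + 1; hi = g - 1; }
            else if (rank == g) return a[g];
            else lo = g + 1;
        }
        return a[lo];
    }

    private static void swap(int[] a, int i, int j) { int tmp = a[i]; a[i] = a[j]; a[j] = tmp; }

    public static void main(String[] args) {
        Random rnd = new Random(3);
        int[] a = rnd.ints(1_000, 0, 100_000).toArray();
        int rank = rnd.nextInt(a.length);        // a randomly selected rank
        int[] sorted = a.clone();
        Arrays.sort(sorted);
        System.out.println(select(a, rank) == sorted[rank]);
    }
}
```

The `main` method picks the rank uniformly at random, which mirrors the "randomly selected rank" setting of the abstract; counting comparisons would require instrumenting the partitioning loop.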
... The number of executed Bytecode instructions has been shown to resemble actual running time [Camesi et al., 2006], even though just-in-time compilation can have a tremendous influence [Wild et al., 2013] and some aspects of modern processor architectures are neglected. ...
... The actual Java 7 runtime library implementation uses M = 46, which seems far from optimal at first sight. Note however that the implementation uses the more elaborate pivot selection scheme tertiles of five [Wild et al., 2013], which implies additional constant overhead per partitioning step. ...
... Then, for each block the number of Bytecode instructions was counted; the result is given in Table 5. We have automated this process as part of our tool MaLiJAn (Maximum Likelihood Java Analyzer), which provides a means of automating empirical studies of algorithms based on their control flow graphs [Laube and Nebel, 2010; Wild et al., 2013]. ...
In 2009, Oracle replaced the long-serving sorting algorithm in its Java 7 runtime library by a new dual-pivot Quicksort variant due to Vladimir Yaroslavskiy. The decision was based on the strikingly good performance of Yaroslavskiy's implementation in running time experiments. At that time, no precise investigations of the algorithm were available to explain its superior performance—on the contrary: previous theoretical studies of other dual-pivot Quicksort variants even discouraged the use of two pivots. In 2012, two of the authors gave an average case analysis of a simplified version of Yaroslavskiy's algorithm, proving that savings in the number of comparisons are possible. However, Yaroslavskiy's algorithm needs more swaps, which renders the analysis inconclusive.
To force the issue, we herein extend our analysis to the fully detailed style of Knuth: we determine the exact number of executed Java Bytecode instructions. Surprisingly, Yaroslavskiy's algorithm needs slightly more Bytecode instructions than a simple implementation of classic Quicksort, contradicting observed running times. As in Oracle's library implementation, we incorporate the use of Insertionsort on small subproblems and show that it indeed speeds up Yaroslavskiy's Quicksort in terms of Bytecodes; but even with optimal Insertionsort thresholds, the new Quicksort variant needs slightly more Bytecode instructions on average.
Finally, we show that the (suitably normalized) costs of Yaroslavskiy's algorithm converge to a random variable whose distribution is characterized by a fixed-point equation. From that, we compute variances of costs and show that for large n, costs are concentrated around their mean.
... As noted by Wild et al. [18], considering only key comparisons and swap operations does not suffice for evaluating the practicability of sorting algorithms. In Section 9, we will present preliminary experimental results that indicate the following: When sorting integers, the "optimal" method of Section 5 is slower than Yaroslavskiy's algorithm. ...
... We choose the second-largest and fourth-largest as pivots. (This is the pivot choice that is used in Yaroslavskiy's algorithm in the JRE7 implementation, see [18] for further discussion.) The probability that p and q, p < q, are chosen as pivots is exactly $(s \cdot m \cdot \ell) / \binom{n}{5}$. ...
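The probability in the snippet above can be checked numerically. The sketch below reads it as s · m · ℓ divided by the binomial coefficient C(n, 5), and assumes that s, m and ℓ count the elements smaller than p, between p and q, and larger than q in an input consisting of the keys 1..n, so that s = p - 1, m = q - p - 1 and ℓ = n - q. The snippet itself does not define these symbols, so this reading, like the class and method names, is an assumption of the example. Under it, the probabilities over all pairs p < q must sum to 1.

```java
/** Sketch only: probability that ranks p < q are the 2nd and 4th of a random 5-sample. */
final class PivotProbabilitySketch {

    /** Binomial coefficient C(n, 5), exact in long arithmetic for moderate n. */
    static long binomial5(long n) {
        if (n < 5) return 0;
        // Each intermediate product is itself a binomial coefficient, so divisions are exact.
        return n * (n - 1) / 2 * (n - 2) / 3 * (n - 3) / 4 * (n - 4) / 5;
    }

    /** Number of 5-element samples of {1..n} whose 2nd and 4th smallest are p and q. */
    static long favorable(int n, int p, int q) {
        long s = p - 1;        // candidates for the element below p
        long m = q - p - 1;    // candidates for the element between p and q
        long l = n - q;        // candidates for the element above q
        return s * m * l;
    }

    static double probability(int n, int p, int q) {
        return (double) favorable(n, p, q) / binomial5(n);
    }

    /** Sums the favorable counts over all pairs p < q; should equal C(n, 5). */
    static long totalFavorable(int n) {
        long total = 0;
        for (int p = 1; p <= n; p++)
            for (int q = p + 1; q <= n; q++)
                total += favorable(n, p, q);
        return total;
    }

    public static void main(String[] args) {
        int n = 50;
        System.out.println(totalFavorable(n) == binomial5(n));
        System.out.println(probability(n, 17, 34));
    }
}
```

The identity holds because every five-element sample has exactly one second-smallest and one fourth-smallest element, so the favorable counts partition all C(n, 5) samples.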
... Applying (12), we get $E(C^{Y}_{n}) = 1.704\, n \ln n + o(n \ln n)$ key comparisons. (Note that Wild et al. [18] calculated this leading coefficient as well.) This is slightly better than "clever quicksort", which uses the median of a sample of three elements as a single pivot element and achieves 1.714n ln n + O(n) key comparisons on average [7]. ...
Dual pivot quicksort refers to variants of classical quicksort where in the
partitioning step two pivots are used to split the input into three segments.
This can be done in different ways, giving rise to different algorithms.
Recently, a dual pivot algorithm due to Yaroslavskiy received much attention,
because it replaced the well-engineered quicksort algorithm in Oracle's Java 7
runtime library. Nebel and Wild (ESA 2012) analyzed this algorithm and showed
that on average it uses 1.9n ln n + O(n) comparisons to sort an input of size
n, beating standard quicksort, which uses 2n ln n + O(n) comparisons. We
introduce a model that captures all dual pivot algorithms, give a unified
analysis, and identify new dual pivot algorithms that minimize the average
number of key comparisons among all possible algorithms up to lower order or
linear terms. This minimum is 1.8n ln n + O(n). For the case that the pivots
are chosen from a small sample, we include a comparison of dual pivot quicksort
and classical quicksort. We also present results about minimizing the average
number of swaps.
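The comparison counts quoted in the abstract above can be explored empirically with a small experiment. The sketch below counts key comparisons on random permutations for a single-pivot Quicksort (last element as pivot) and for a Yaroslavskiy-style dual-pivot Quicksort (first and last elements as pivots), then normalizes by n ln n. The class name, the method `normalized`, and the simple single-pivot partitioning scheme are assumptions of this example, not code from the cited papers.

```java
import java.util.Random;

/** Sketch only: counts key comparisons of single- and dual-pivot Quicksort. */
final class ComparisonCountSketch {
    private static long cmp;   // number of key comparisons in the current run

    private static boolean less(int x, int y) { cmp++; return x < y; }

    static void classic(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[hi], l = lo;
        for (int i = lo; i < hi; i++)
            if (less(a[i], pivot)) { swap(a, i, l); l++; }
        swap(a, l, hi);
        classic(a, lo, l - 1);
        classic(a, l + 1, hi);
    }

    static void dual(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        if (less(a[hi], a[lo])) swap(a, lo, hi);
        int p = a[lo], q = a[hi];
        int l = lo + 1, g = hi - 1, i = l;
        while (i <= g) {
            if (less(a[i], p)) { swap(a, i, l); l++; }
            else if (!less(a[i], q)) {               // a[i] >= q
                while (less(q, a[g]) && i < g) g--;  // a[g] > q
                swap(a, i, g); g--;
                if (less(a[i], p)) { swap(a, i, l); l++; }
            }
            i++;
        }
        l--; g++;
        swap(a, lo, l); swap(a, hi, g);
        dual(a, lo, l - 1); dual(a, l + 1, g - 1); dual(a, g + 1, hi);
    }

    /** Average comparisons over {@code runs} random permutations, divided by n ln n. */
    static double normalized(int n, int runs, long seed, boolean dualPivot) {
        Random rnd = new Random(seed);
        long total = 0;
        for (int r = 0; r < runs; r++) {
            int[] a = new int[n];
            for (int i = 0; i < n; i++) a[i] = i;
            for (int i = n - 1; i > 0; i--) swap(a, i, rnd.nextInt(i + 1));  // Fisher-Yates
            cmp = 0;
            if (dualPivot) dual(a, 0, n - 1); else classic(a, 0, n - 1);
            total += cmp;
        }
        return (double) total / runs / (n * Math.log(n));
    }

    private static void swap(int[] a, int i, int j) { int tmp = a[i]; a[i] = a[j]; a[j] = tmp; }

    public static void main(String[] args) {
        System.out.println("classic: " + normalized(100_000, 10, 42L, false));
        System.out.println("dual:    " + normalized(100_000, 10, 42L, true));
    }
}
```

At practical input sizes the O(n) terms still contribute noticeably, so the printed ratios need not match the leading coefficients 2 and 1.9, and individual runs fluctuate.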
We present original average-case results on the performance of the Ford-Fulkerson maxflow algorithm on grid graphs (sparse) and random geometric graphs (dense). The analysis technique combines experiments with probability generating functions, stochastic context-free grammars and an application of the maximum likelihood principle, enabling us to make statements about the performance where a purely theoretical approach has little chance of success. The method lends itself to automation, allowing us to study more variations of the Ford-Fulkerson maxflow algorithm with different graph search strategies and several elementary operations. A simple depth-first search enhanced with random iterators provides the best performance on grid graphs. For random geometric graphs, a simple priority-first search with a maximum-capacity heuristic provides the best performance. Notable is the observation that randomization improves the performance even when the inputs are created from a random process.