Content uploaded by Ching-Kuang Shene

Author content

All content in this area was uploaded by Ching-Kuang Shene on Mar 06, 2015

Content may be subject to copyright.

This paper presents a complexity analysis of two STL in-place rotation algorithms. If an array of n elements is rotated to the right by Δ positions, the first STL version, which uses forward iterators, uses n - gcd(n, Δ) swaps, while the second version, which uses random access iterators, uses only n + gcd(n, Δ) array element movements. This paper also proves the optimality of the second version. A performance comparison is included.
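
The two counts quoted in the abstract can be checked empirically with a small sketch. The Python below is not the STL source: `rotate_by_swaps` and `rotate_by_juggling` are hypothetical names, the first mimics a forward-only, swap-based rotation and the second a gcd-based (juggling) rotation. Both rotate to the left by `mid`, which is equivalent to rotating right by Δ = n - mid and leaves the gcd unchanged, since gcd(n, n - Δ) = gcd(n, Δ).

```python
from math import gcd


def rotate_by_swaps(a, mid):
    """Left-rotate list `a` in place so that a[mid] becomes a[0].

    Only forward cursor steps and swaps are used (forward-iterator style).
    Assumes 0 <= mid <= len(a). Returns the number of swaps performed.
    """
    n = len(a)
    if mid == 0 or mid == n:
        return 0
    first, middle, nxt = 0, mid, mid
    swaps = 0
    while first != nxt:
        a[first], a[nxt] = a[nxt], a[first]
        swaps += 1
        first += 1
        nxt += 1
        if nxt == n:
            nxt = middle      # right cursor ran off the end: restart at the split
        elif first == middle:
            middle = nxt      # left block consumed: the split point moves right
    return swaps


def rotate_by_juggling(a, mid):
    """Left-rotate list `a` in place by `mid` using gcd(n, mid) cycles.

    Assumes 0 <= mid <= len(a). Returns the number of element moves, where a
    move is an assignment into the list or a copy of an element to a temporary.
    """
    n = len(a)
    if mid == 0 or mid == n:
        return 0
    moves = 0
    for start in range(gcd(n, mid)):
        saved = a[start]      # copy the cycle's first element out (1 move)
        moves += 1
        i = start
        while True:
            j = (i + mid) % n
            if j == start:
                break
            a[i] = a[j]       # pull the next element of the cycle into place
            moves += 1
            i = j
        a[i] = saved          # close the cycle (1 move)
        moves += 1
    return moves
```

For example, with n = 6 and `mid` = 2 the swap-based sketch performs 4 swaps (6 - 2) and the juggling sketch performs 8 moves (6 + 2).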

... To achieve these, different block rotation algorithms [9] are available. The two most prominent are juggling rotation (most efficiently implemented with the greatest common divisor (gcd) algorithm [10]) and three-way-reversal rotation [11,12,13]. ...
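
For comparison with the gcd-based approach, three-way-reversal rotation can be sketched as follows. This is the standard textbook formulation, not code from any of the cited papers; `reverse_range` and `rotate_by_reversal` are hypothetical names, and the sketch counts swaps rather than individual assignments.

```python
def reverse_range(a, lo, hi):
    """Reverse a[lo:hi] in place and return the number of swaps performed."""
    swaps = 0
    hi -= 1
    while lo < hi:
        a[lo], a[hi] = a[hi], a[lo]
        lo += 1
        hi -= 1
        swaps += 1
    return swaps


def rotate_by_reversal(a, mid):
    """Left-rotate list `a` in place so that a[mid] becomes a[0].

    Assumes 0 <= mid <= len(a). Reverses the left block, then the right
    block, then the whole list, and returns the total number of swaps.
    """
    n = len(a)
    swaps = reverse_range(a, 0, mid)    # reverse the first `mid` elements
    swaps += reverse_range(a, mid, n)   # reverse the remaining n - mid elements
    swaps += reverse_range(a, 0, n)     # reverse the whole list
    return swaps
```

In this sketch the swap count is ⌊mid/2⌋ + ⌊(n - mid)/2⌋ + ⌊n/2⌋, so nearly every element is touched twice, whereas the juggling approach moves each element only once plus a small per-cycle overhead.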

... Already, [13] has formulated a precise and concise way of determining the complexity of three-way-reversal rotation, by introducing a function that returns or from modular arithmetic. References [14,15] instead favoured the gcd-based rotation, or "vector exchange", by showing that the problem solution can be reduced [12] to only ( ) moves [10], instead of the usual three-way reversal, which uses as many as (⌊( ) ⌋ ⌊ ⌋ ⌊ ⌋) assignments [14], where is the sequence size and is the cycle gap. Reference [16] opined that if each element of the sequence is a "large record", the delay from the numerous index and variable moves (which makes gcd-based rotation complicated) cannot "offset any speed up gained" in the element moves. ...

... In sequence rotation based on gcd calculation, the outer loop assigns elements twice for each cycle. The number of cycles is the gcd of the sequence size and the cycle gap (or offset [10]). Let the size of the sequence be n and the offset be Δ; then there are gcd(n, Δ) cycles. ...
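
The cycle structure described above can be made concrete with a short sketch that lists the index cycles a gcd-based rotation would walk. The function name `juggling_cycles` is hypothetical and the code assumes 0 < offset < n; it only enumerates indices and does not move any data.

```python
from math import gcd


def juggling_cycles(n, offset):
    """Return the index cycles visited by a gcd-based rotation.

    Assumes 0 < offset < n. There is one cycle per starting index in
    range(gcd(n, offset)), and each cycle steps by `offset` modulo n.
    """
    cycles = []
    for start in range(gcd(n, offset)):
        cycle = [start]
        i = (start + offset) % n
        while i != start:
            cycle.append(i)
            i = (i + offset) % n
        cycles.append(cycle)
    return cycles
```

For a sequence of size 12 and offset 8, `juggling_cycles(12, 8)` returns `[[0, 8, 4], [1, 9, 5], [2, 10, 6], [3, 11, 7]]`: four cycles (gcd(12, 8) = 4) of three indices each. Each cycle costs two outer-loop assignments (save to a temporary, then restore) on top of the assignments made inside the cycle.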

Systems analysts, designers, and programmers normally choose among candidate algorithms when developing applications. This choice is even more crucial when the application is life-critical or memory- and/or time-bound, as in embedded real-time systems. One such choice is between two leading high-performance sequence rotation algorithms, juggling and three-way reversal, which are often required for in-place sorting. The problem is identifying the better-performing one and determining whether changing the relative weights of elements versus indexes, in the form of altered data types, would shift the performance index. This paper therefore investigates the performance differences of these algorithms across data types on adequately long sequences, using both theoretical and statistical analysis. The objective is to measure the efficiency of the algorithms as the element-versus-index weights vary. Using Java to implement the algorithms, a pool of 35,500 execution-time records was generated for each data type. From this pool, 355 records were randomly sampled, drawn from each successive group of 100. Two-way analysis of variance (ANOVA) was carried out using the R statistical package. The results indicated that, theoretically, juggling rotation might perform better in terms of time utilization, but statistically the hypothesis test strongly suggested the contrary. The implication is that users are better informed about which algorithm to adopt in practice. Given the conflicting theoretical and statistical results, it becomes necessary to investigate further the low experimental performance of juggling rotation.

... Sequence rotation or circular shifting is also increasingly being used in internal buffer management in text editors [3][4][5], co-processor design [6], image encryption [7], permutation in the Data Encryption Standard and Advanced Encryption Standard [8], and task scheduling in real-time systems [9]. Hence, sequence rotation algorithms have become so ubiquitous that they form part of programming language standard libraries such as those of Java and the C++ "Standard Template Library" (STL) [10,11]. In short, any in-place rearrangement of sequence items would likely employ sequence rotation. ...

... Each element move (or assignment) is accomplished in a circular fashion at a specified length, or number of skipped positions on the sequence, known as the cycle gap or cycle length (or offset). A move is "either assigning a value into an array[/sequence] element or copying an array [/sequence] element to elsewhere" [10]. The next element to move is reached by jumping over one or more preceding or succeeding elements (for left or right rotation, respectively [12]), much as a juggler does, hence the term 'Juggling Rotation.' ...

... The complexity is the execution cost measured in terms of storage, time, and/or "whatever units are relevant" [16]. Sometimes, the performance is measured in terms of optimality, focusing on the "best configuration" (such as the number of element assignments and comparisons [10]) to achieve some goals [17,18]. By this measure, the juggling rotation algorithm is optimal [5,19]. ...

In a previous experimental study of the three-way-reversal and juggling sequence rotation algorithms, using 20,000,000 elements of type LONG in Java, the average execution times were 49.66761 ms and 246.4394 ms, respectively. These results revealed appreciably low performance in the juggling algorithm despite its proven optimality. However, the juggling algorithm was also efficient for some offset ranges. Because of this pattern, the current study focuses on investigating the source of the inefficiency in the average performance. Samples were extracted from the previous experimental data, presented differently, and analyzed both graphically and in tabular form. Cases from the data in which the greatest common divisor equals the offset were used. As in the previous study, the rotation was implemented in Java to simulate the ordering of tasks for safety and efficiency in the context of real-time task scheduling. The outcome of the investigation shows that juggling rotation competes favorably with three-way-reversal rotation (and is even better in a few cases) for certain offsets, but performs poorly for the rest. This study identifies the poorest performance at offsets in the neighborhood of the square root of the sequence size. From this outcome, the study strongly advises application developers (especially for real-time systems) to be mindful of where and how they use juggling rotation.

... Thus, Line 11 of the algorithm can be realised by first cyclically shifting (a i ) t ≤i<max( ,c) left by t − t positions, then passing its last t entries to the recursive call, and finally cyclically shifting the vector right by t − t positions. Using in-place algorithms (see Gries and Mills, 1981; Shene, 1997; Furia, 2014) the cyclic shifts can be performed with only O(1) additional field elements stored in auxiliary space, and, since the vector has length max( , c) − t = 2^k, O(2^k) movements of its entries, where a movement involves either assigning a value into an entry of the vector or copying one of its entries elsewhere. It follows that O(max( , c)) movements are performed overall, since the difference t − t for the initial call of the algorithm and each of its subsequent recursive calls is always zero if the initial parameters satisfy max( , c) = 2 log 2 max( ,c) , nonzero at most once if the parameters satisfy c ≤ , and nonzero at most twice otherwise. ...

We describe new fast algorithms for evaluation and interpolation on the “novel” polynomial basis over finite fields of characteristic two introduced by Lin et al. (2014). Fast algorithms are also described for converting between their basis and the monomial basis, as well as for converting to and from the Newton basis associated with the evaluation points of the evaluation and interpolation algorithms. Combining algorithms yields a new truncated additive fast Fourier transform (FFT) and inverse truncated additive FFT which improve upon some previous algorithms when the field possesses an appropriate tower of subfields.

The problem of interchanging two segments of an array is considered. Using the known methods as a starting point, two new adaptations are developed that achieve higher memory locality. It is confirmed, both analytically and experimentally, that on a computer with a hierarchical memory the adaptations are superior to the earlier methods.
