Merge Sort Enhanced In Place Sorting Algorithm

Vignesh R and Tribikram Pradhan
Department of Information and Communication Technology (ICT),
Manipal Institute of Technology, Manipal University,
Manipal 576 014, Karnataka, India
knowvigkri@gmail.com, tribikram.pradhan@manipal.edu
Abstract—This paper introduces a new sorting algorithm which sorts the elements of an array In Place. The algorithm has O(n) best case Time Complexity and O(n log n) average and worst case Time Complexity. We achieve our goal using Recursive Partitioning combined with In Place merging to sort a given array. A comparison is made between this particular idea and other popular implementations. We finally draw a conclusion and observe the cases where this algorithm outperforms other sorting algorithms. We also look at its shortcomings and list the scope for future improvements that could be made.

Keywords—Time Complexity, In Place, Recursive Partitioning
I. INTRODUCTION

In mathematics and computer science, the process of arranging similar elements in a definite order is known as sorting. Sorting is not a new term in computing: it finds its significance in various day-to-day applications and forms the backbone of computational problem solving. From complex search engine algorithms to stock markets, sorting has an impeccable presence in the modern era of information technology. Efficient sorting also leads to the optimization of many other complex problems.
Algorithms related to sorting have always attracted a great deal of attention from Computer Scientists and Mathematicians. Due to the simplicity of the problem and the need to solve it more systematically, more and more sorting algorithms are being devised to suit the purpose.
There are many factors on which the performance of a sorting algorithm depends, varying from code complexity to effective memory usage. No single algorithm covers all aspects of efficiency at once. Hence, we use different algorithms under different constraints.
When we look at developing a new algorithm, it is important for us to understand how long the algorithm might take to run. It is known that the time for any algorithm to execute depends on the size of the input data. In order to analyze the efficiency of an algorithm, we try to find a relationship between its running time and the amount of data given.

Another factor to take into consideration is the space used up by the code with respect to the input. Algorithms that need only a constant amount of extra space are called In Place. They are generally preferred over algorithms that take extra memory space for their execution.
In this paper, we introduce a new algorithm which uses the concept of divide and conquer to sort the array recursively using a bottom up approach. Instead of using an external array to merge the two sorted sub arrays, we use multiple pivots to keep track of the minimum elements of both sub arrays and sort them In Place.
The rest of the paper is organized as follows. Section II discusses the various references used in making this paper. Section III describes the basic working idea behind this algorithm. Section IV contains the pseudo code required for the implementation of this algorithm. In Section V, we do a Case Study of the merging process over an array. In Section VI, we derive the time and space complexities of our code. In Section VII, we do an experimental analysis of this algorithm on arrays of varying sizes. In Section VIII, we draw an asymptotic conclusion based on Sections VI and VII. We finally list the scope for future improvements and conclude the paper in Section IX.
II. LITERATURE SURVEY

You Yang, Ping Yu and Yan Gan [2], in the year 2011, made a comparison between the five major types of sorting algorithms. They came to the conclusion that Insertion or Selection Sort performs well for a small range of elements. It was also noted that Bubble or Insertion Sort should be preferred for an ordered set of elements. Finally, for large random inputs, Quick or Merge Sort outperforms the other sorting algorithms.
Jyrki Katajainen, Tomi Pasanen and Jukka Teuhola [4], in the year 1996, explained the uses and performance analysis of an In Place Merge Sort algorithm. Initially, a straightforward variant was applied with n log2(n) + O(n) comparisons and 3n log2(n) + O(n) moves. Later, a more advanced variant was introduced which required at most n log2(n) + O(n) comparisons and n log2(n) moves, for any fixed array of size n.
Antonios Symvonis [6], in 1995, showed the stable merging of two arrays of sizes m and n, where m < n, with O(m + n) assignments, O(m log(n/m + 1)) comparisons and a constant amount of additional space. He also mentioned the possibility of an In Place merge without the use of an internal buffer.
Wang Xiang [7], in the year 2011, presented a brief analysis of the performance of the Quick Sort algorithm. The paper discusses the Time Complexity of Quick Sort and compares an improved Bubble Sort with Quick Sort by analysing the first order derivative of a function found to correlate Quick Sort with other sorting algorithms.
Shrinu Kushagra, Alejandro Lopez-Ortiz and J. Ian Munro [8], in 2013, presented a new approach which used multiple pivots to sort elements. They performed an experimental study and provided an analysis of the cache behavior of these algorithms. They proposed a 3-pivot mechanism for sorting and improved performance by 7-8%.
Nadir Hossain, Md. Golam Rabiul Alma, Md. Amiruzzaman and S. M. Moinul Quadir [9], in the year 2004, came up with an algorithm which was more efficient than the traditional Merge Sort algorithm. This technique uses the divide and conquer method to divide the data until two elements are present in each group, instead of a single element as in the standard Merge Sort. This reduces the number of recursive calls and subdivisions of the problem, hence increasing the overall efficiency of the algorithm.
Guigang Zheng, Shaohua Teng, Wei Zhang and Xiufen Fu [13], in 2009, presented an enhanced method of indexing and its corresponding parallel algorithm. Their experiment demonstrated that the execution time of the indexing-based sorting algorithm was less than that of other sorting algorithms. On the basis of an index table and parallel computing, every two sub-merging sequences of the Merge Sort algorithm were sorted on a single processor computer. This saved waiting and disposal time and hence gave better efficiency than the original Merge Sort algorithm.
Bing-Chao Huang and Michael A. Langston [14], in 1988, proposed a practical linear-time approach for merging two sorted arrays using a fixed amount of additional space.
Rohit Yadav, Kratika Varshney and Nitin Verma [20], in the year 2013, discussed the run time complexities of the recursive and non-recursive approaches to the Merge Sort algorithm using a simple unit cost model. New implementations for the two-way and four-way bottom-up Merge Sort were given, the worst case complexities of which were shown to be bounded by 5.5n log2(n) + O(n) and 3.25n log2(n) + O(n), respectively.
III. METHODOLOGY

In this particular section, we lay emphasis on the idea behind the working of this algorithm. The proposed algorithm solves our problem in two steps, the strategies behind which are stated below.
3.1 DIVIDE AND CONQUER

We use the Divide and Conquer strategy to split the given array down to individual elements. Starting from individual elements, we sort the array using a Bottom Up Approach, keeping track of the minimum and maximum values of the sub arrays at all times. The technique used for splitting the array is similar to that of a standard Merge Sort: we recursively partition the array from start to mid, and from mid to last, after which we call the sort function to sort that particular sub array.
3.2 PIVOT BASED MERGING

[Fig. 1. A recursive algorithm to split and sort array elements]
This is the part where this algorithm differs from a standard Merge Sort. Instead of merging the two sorted sub arrays in a different array, we use multiple pivots to sort them In Place and save the extra space consumed. Our function prototype to sort the array looks like this:

Procedure sort (int *ar, int i, int j)
    *ar = pointer to the array
    i = starting point of the first sub array
    j = ending point of the second sub array

We use four pivots, 'a', 'x', 'y' and 'b', in the code to accomplish our task. 'a' and 'b' initially mark the starting points of the two sorted sub arrays respectively. As a result, 'a' is initialized to i, and 'b' is obtained by dividing the sum of i and j by two and incrementing the result. 'x' is the point below which our final array is sorted and is initialized to 'i'. 'y' is an intermediate pivot which marks the bound for pivot 'a' and is initialized to 'b'. All in all, our function is targeted at sorting the main array 'ar' from position 'i' to 'j', given that the elements from i to b-1 and from b to j are already sorted.
The variable 'a' is used to keep track of the minimum value in the first sub array that has not yet been accessed (for most of the time, barring a few passes). Similarly, 'b' is used to keep track of the minimum value in the second sub array that has not yet been accessed (again, for most of the time, barring a few passes). As mentioned earlier, 'x' is the point before which our final array is sorted, so at any point, the array from i to x-1 is sorted. Finally, we have another variable called 'ctr', which is initialized to, and normally kept equal to, 'b' as long as the second sub array (from b to j) is sorted. If not, we keep incrementing 'ctr' and swapping the element at 'ctr' with its next element until the element at 'ctr' is placed in its correct position and the second sub array becomes sorted once again. We then make 'ctr' equal to 'b' again.
Our logic revolves around comparing the current minimum values in the two sorted sub arrays (the values at 'a' and 'b'), and swapping the smaller of the two with the value at 'x'. We then increment 'x' and reposition 'a' or 'b' accordingly.
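To make this comparison step concrete before the full pseudo code, the fragment below is a deliberately simplified sketch of a single merge step in C, not the authors' complete procedure: the helper name merge_step and its naive advancement of 'a' and 'b' are our own illustrative assumptions, and the extra bookkeeping with 'y' and 'ctr' that makes the real algorithm correct is intentionally omitted (see Section IV).

/* Simplified single-step sketch: compare the current minima of the two
 * runs (ar[a], ar[b]) and move the smaller one to position x. The naive
 * pointer advancement below loses track of displaced elements; the full
 * algorithm in Section IV uses the extra pivots y and ctr for exactly
 * that bookkeeping. */
static void merge_step(int ar[], int *x, int *a, int *b)
{
    int t;
    if (ar[*a] <= ar[*b]) {
        t = ar[*x]; ar[*x] = ar[*a]; ar[*a] = t;   /* take from run 1 */
        (*a)++;
    } else {
        t = ar[*x]; ar[*x] = ar[*b]; ar[*b] = t;   /* take from run 2 */
        (*b)++;
    }
    (*x)++;                                        /* sorted prefix grows */
}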
IV. PSEUDO CODE

Given below is the working pseudo code for the idea proposed. We have two main functions to achieve our purpose: one to split the array, and the other to sort that particular sub array In Place.
Algorithm 1 SPLITTING ALGORITHM
1: Procedure split (int *ar, int i, int j):
2: if j = i + 1 or j = i then
3:     if ar[i] > ar[j] then
4:         swap(ar[j], ar[i])
5:     end if
6:     return
7: else
8:     mid ← (i + j)/2
9:     split(ar, i, mid)
10:    split(ar, mid + 1, j)
11:    if ar[mid + 1] < ar[mid] then
12:        sort(ar, i, j)
13:    end if
14: end if
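For reference, a compilable C transcription of Algorithm 1 might look as follows; the helper swap_ints is ours, and sort() is assumed to be an implementation of Algorithm 2 given next.

void sort(int *ar, int i, int j);        /* the in-place merge of Algorithm 2 */

static void swap_ints(int *p, int *q) { int t = *p; *p = *q; *q = t; }

/* Recursively partition ar[i..j] and merge the two sorted halves in
 * place, but only when they overlap (ar[mid + 1] < ar[mid]); an already
 * ordered pair of halves skips the merge entirely, which is the source
 * of the O(n) best case discussed in Section VI. */
void split(int *ar, int i, int j)
{
    if (j == i + 1 || j == i) {          /* base case: one or two elements */
        if (ar[i] > ar[j])
            swap_ints(&ar[i], &ar[j]);
        return;
    }
    int mid = (i + j) / 2;
    split(ar, i, mid);                   /* sort the left half  */
    split(ar, mid + 1, j);               /* sort the right half */
    if (ar[mid + 1] < ar[mid])           /* merge only if needed */
        sort(ar, i, j);
}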
Algorithm 2 SORTING ALGORITHM
1: Procedure sort (int *ar, int i, int j):
2: x ← i, a ← x, b ← (i + j)/2 + 1, y ← b, ctr ← b
3: while x < b do
4:     if ctr < j and ar[ctr] > ar[ctr + 1] then
5:         swap(ar[ctr], ar[ctr + 1])
6:         ctr ← ctr + 1
7:     end if
8:     if ctr >= j or ar[ctr] <= ar[ctr + 1] then
9:         ctr ← b
10:    end if
11:    if b > j and a > x and b = a + 1 and a > y and ctr = b then
12:        b ← a, ctr ← b, a ← y
13:    else if b > j and a > x and ctr = b then
14:        b ← y, ctr ← b, a ← x
15:    else if b > j and ctr = b then
16:        break
17:    end if
18:    if a = x and x = y and ctr = b then
19:        y ← b
20:    else if x = y then
21:        y ← a
22:    end if
23:    if a > y and b > a + 1 and ar[b] < ar[a] and ctr = b then
24:        swap(ar[a], ar[b])
25:        swap(ar[a], ar[x])
26:        x ← x + 1, a ← a + 1
27:        if ar[ctr] > ar[ctr + 1] then
28:            swap(ar[ctr], ar[ctr + 1])
29:            ctr ← ctr + 1
30:        end if
31:    else if a = x and b = y and ar[b] < ar[a] then
32:        swap(ar[x], ar[b])
33:        a ← b, b ← b + 1, x ← x + 1
34:        if ctr = b - 1 then
35:            ctr ← ctr + 1
36:        end if
37:    else if a = x and b = y and ar[b] >= ar[a] then
38:        x ← x + 1, a ← a + 1
39:    else if b = a + 1 and ar[b] < ar[a] then
40:        swap(ar[b], ar[x])
41:        swap(ar[a], ar[b])
42:        b ← b + 1, x ← x + 1, a ← a + 1
43:        if ctr = b - 1 then
44:            ctr ← ctr + 1
45:        end if
46:    else if b = a + 1 and ar[b] >= ar[a] then
47:        swap(ar[x], ar[a])
48:        a ← y, x ← x + 1
49:    else if a = y and x < y and ctr != b + 1 and ar[b] < ar[a] then
50:        swap(ar[x], ar[b])
51:        b ← b + 1, x ← x + 1
52:        if ctr = b - 1 then
53:            ctr ← ctr + 1
54:        end if
55:    else if b > a + 1 and ar[b] >= ar[a] then
56:        swap(ar[x], ar[a])
57:        x ← x + 1, a ← a + 1
58:    end if
59: end while
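A minimal driver for the two procedures might look like this; it assumes the split() transcription shown after Algorithm 1 is linked with a C implementation of Algorithm 2, and the sample values are arbitrary.

#include <stdio.h>

void split(int *ar, int i, int j);   /* sketch shown after Algorithm 1 */

int main(void)
{
    int ar[] = { 5, -1, 9, 3, 3, -7, 2, 0 };   /* arbitrary sample input */
    int n = (int)(sizeof ar / sizeof ar[0]);

    split(ar, 0, n - 1);                       /* sorts ar in place */

    for (int k = 0; k < n; k++)
        printf("%d ", ar[k]);                  /* expected: -7 -1 0 2 3 3 5 9 */
    printf("\n");
    return 0;
}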
V. CASE STUDY

In this Case Study, we take a look at the merging process of the two sorted sub arrays. Let us consider an array of 18 elements for the sake of this example. The two sorted sub arrays run from 'i' to 'b-1' and from 'b' to 'j'.

INITIAL ARRAY: [shown as a figure in the original; not reproduced here]

PASS 1:
This is the first pass inside the 'while' loop of our 'sort' procedure. As mentioned, we compare the current minimum values of the two sub arrays (the value at 'a' (-4) and the value at 'b' (-3)). The value at 'a' is less than that at 'b'. Since 'a' is equal to 'x', we do not need to swap the values at 'a' and 'x'. Instead, we increment 'a' and 'x'. The first element (-4) is now in its correct position, 'a' holds the minimum value of the first sub array that has not yet been accessed (-1), and the array before 'x' is sorted.

if a = x and b = y and ar[b] >= ar[a] then
    x ← x + 1, a ← a + 1
end if
PASS 2:
In this pass, the value at 'b' is less than that at 'a', so we swap the value at 'b' with the value at 'x', and 'x' is incremented accordingly. The second element (-3) is now in its correct sorted position. We reassign 'a' to 'b' and increment 'b'. 'b' now contains the current minimum value of the second sub array (-2) and 'a' keeps track of its previously pointed value (-1).

if a = x and b = y and ar[b] < ar[a] then
    swap(ar[x], ar[b])
    a ← b, b ← b + 1, x ← x + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if
PASS 3:
Our motto behind each pass is to assign 'a' and 'b' such that they contain the current minimum values of the two sub arrays (this condition is true for all but a few passes, discussed in Pass 7).

if b = a + 1 and ar[b] < ar[a] then
    swap(ar[b], ar[x])
    swap(ar[a], ar[b])
    b ← b + 1, x ← x + 1, a ← a + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if
PASS 4:
if b = a + 1 and ar[b] < ar[a] then
    swap(ar[b], ar[x])
    swap(ar[a], ar[b])
    b ← b + 1, x ← x + 1, a ← a + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if

PASS 5:
if b = a + 1 and ar[b] >= ar[a] then
    swap(ar[x], ar[a])
    a ← y, x ← x + 1
end if

PASS 6:
if b > a + 1 and ar[b] >= ar[a] then
    swap(ar[x], ar[a])
    x ← x + 1, a ← a + 1
end if
A FEW NOTES:
1. It is noticeable up to this point that our aim has been to keep the elements from 'a' to 'b - 1' and the elements from 'y' to 'a - 1' sorted (for a >= y).
2. Another thing worth observing is that the elements from 'a' to 'b - 1' are less than the elements from 'y' to 'a - 1' (provided a >= y). This means that the first sub array can be accessed in sorted order from 'a' to 'b - 1' and then from 'y' to 'a - 1'.
PASS 7:
Till now, we had assumed the value of the variable 'ctr' to be equal to 'b'. This was only because ar[ctr] was less than or equal to ar[ctr + 1], i.e. the array starting from 'b' was sorted. However, to preserve the two conditions stated in the notes above, we make a swap that costs us the order of the two sub arrays. We solve this dilemma by swapping ar[ctr] with ar[ctr + 1] and incrementing 'ctr'. We keep doing this until ar[ctr] becomes less than or equal to ar[ctr + 1]. After this, 'ctr' is once again made equal to 'b' (PASS 8).

if a > y and b > a + 1 and ar[b] < ar[a] and ctr = b then
    swap(ar[a], ar[b])
    swap(ar[a], ar[x])
    x ← x + 1, a ← a + 1
    if ar[ctr] > ar[ctr + 1] then
        swap(ar[ctr], ar[ctr + 1])
        ctr ← ctr + 1
    end if
end if
PASS 8:
if ctr >= j or ar[ctr] <= ar[ctr + 1] then
    ctr ← b
end if

PASS 9:
if b = a + 1 and ar[b] < ar[a] then
    swap(ar[b], ar[x])
    swap(ar[a], ar[b])
    b ← b + 1, x ← x + 1, a ← a + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if

PASS 10:
if b = a + 1 and ar[b] < ar[a] then
    swap(ar[b], ar[x])
    swap(ar[a], ar[b])
    b ← b + 1, x ← x + 1, a ← a + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if

PASS 11:
if x = y then
    y ← a
end if
PASS 12:
if b = a + 1 and ar[b] >= ar[a] then
    swap(ar[x], ar[a])
    a ← y, x ← x + 1
end if

PASS 13:
if b = a + 1 and ar[b] >= ar[a] then
    swap(ar[x], ar[a])
    a ← y, x ← x + 1
end if

PASS 14:
if b = a + 1 and ar[b] < ar[a] then
    swap(ar[b], ar[x])
    swap(ar[a], ar[b])
    b ← b + 1, x ← x + 1, a ← a + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if

PASS 15:
if b = a + 1 and ar[b] < ar[a] then
    swap(ar[b], ar[x])
    swap(ar[a], ar[b])
    b ← b + 1, x ← x + 1, a ← a + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if

PASS 16:
if x = y then
    y ← a
end if

PASS 17:
if b = a + 1 and ar[b] < ar[a] then
    swap(ar[b], ar[x])
    swap(ar[a], ar[b])
    b ← b + 1, x ← x + 1, a ← a + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if

PASS 18:
if b = a + 1 and ar[b] < ar[a] then
    swap(ar[b], ar[x])
    swap(ar[a], ar[b])
    b ← b + 1, x ← x + 1, a ← a + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if
PASS 19:
Since the value of 'b' is greater than 'j', it has gone out of bounds. Hence, we re-initialize the values of our pivots accordingly.

if b > j and a > x and b = a + 1 and a > y and ctr = b then
    b ← a, ctr ← b, a ← y
end if

PASS 20:
if a = x and x = y and ctr = b then
    y ← b
end if
PASS 21:
if a = x and b = y and ar[b] < ar[a] then
    swap(ar[x], ar[b])
    a ← b, b ← b + 1, x ← x + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if

PASS 22:
In the previous pass, 'b' and 'ctr' have again gone out of bounds. This condition is similar to that of Pass 19, but with a different side condition.

if b > j and a > x and ctr = b and a = y then
    b ← y, ctr ← b, a ← x
end if
PASS 23:
if a = x and b = y and ar[b] < ar[a] then
    swap(ar[x], ar[b])
    a ← b, b ← b + 1, x ← x + 1
    if ctr = b - 1 then
        ctr ← ctr + 1
    end if
end if

It took us about 23 passes to do an In Place merge of 18 elements, although in code it would take fewer iterations, since multiple conditions can be evaluated at the same time. This more or less covers our sorting logic.
VI. COMPLEXITY ANALYSIS

In this section, we analyze the time and space complexity of this algorithm's best and worst case scenarios.

6.1 TIME COMPLEXITY

6.1.1 WORST CASE
We saw in previous sections that our code structure was similar to the following:

Procedure split (int *ar, int i, int j):
    if j = i + 1 or j = i then
        if ar[i] > ar[j] then
            swap(ar[j], ar[i])
        end if
        return
    else
        mid ← (i + j)/2
        split(ar, i, mid)
        split(ar, mid + 1, j)
        if ar[mid + 1] < ar[mid] then
            sort(ar, i, j)
        end if
    end if

Procedure sort (int *ar, int i, int j):
    while x < b do
        // Sorting logic (multiple if-else statements)
    end while
Our algorithm starts its execution in the 'split' procedure. Let C1 be the constant time taken to execute the 'if' branch of this procedure, and C2 the constant time taken to execute the 'else' branch. Inside the 'else' branch, we recursively call the same function twice for inputs of size n/2, and then call the 'sort' procedure.

Our 'sort' procedure comprises a 'while' loop that contains the logic for merging the two sorted sub arrays into a single sorted array; its body runs a number of times linear in the size of the sub array, with a constant overhead C3 per call. Our overall equation for the time complexity becomes:
We use a Recurrence Relation to find the time complexity of the code. For a large input size n, the equation for the Time Complexity T(n) can be simplified as:

T(n) = 2T(n/2) + n + C2 + C3
T(n) = 2[2T(n/4) + n/2 + C2 + C3] + n + C2 + C3
T(n) = 4T(n/4) + 2n + 3C2 + 3C3
T(n) = 4[2T(n/8) + n/4 + C2 + C3] + 2n + 3C2 + 3C3
T(n) = 8T(n/8) + 3n + 7C2 + 7C3

After k expansions, the equation for T(n) becomes:

T(n) = 2^k T(n/2^k) + kn + (2^k - 1)C2 + (2^k - 1)C3

The base condition for the recursion in our algorithm occurs when n is equal to 2, i.e. T(2). This implies:

n/2^k = 2
k = log2(n) - 1

This also means that the recursion runs up to log2(n) - 1 times before reaching its base condition. Substituting this value of k in the above equation, we get:

T(n) = 2^(log2(n) - 1) T(n/2^(log2(n) - 1)) + (log2(n) - 1)n + (2^(log2(n) - 1) - 1)C2 + (2^(log2(n) - 1) - 1)C3

We know that n/2^(log2(n) - 1) = 2. Substituting this and the value T(2) = C1, we get:

T(n) = (n/2)C1 + n log2(n) - n + (n/2 - 1)C2 + (n/2 - 1)C3
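Collecting terms, every contribution other than n log2(n) is linear in n, which gives the claimed worst-case bound; written out in LaTeX:

\begin{align*}
T(n) &= \tfrac{n}{2}C_1 + n\log_2 n - n
        + \left(\tfrac{n}{2}-1\right)C_2
        + \left(\tfrac{n}{2}-1\right)C_3 \\
     &= n\log_2 n + O(n) \\
     &= O(n\log n).
\end{align*}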
6.1.2 BEST CASE

The efficiency of this sorting algorithm is directly proportional to the orderliness of the given array. As a result, the best case of this algorithm occurs when the array is already sorted (or almost sorted). Let us consider an already sorted array as an example to find the time taken by this algorithm in its best case. If the array is sorted, the starting element of the second sub array is greater than or equal to the ending element of the first sub array, i.e. ar[mid + 1] >= ar[mid]. Hence, the program never enters the 'sort' procedure, which means that the time spent merging the two sorted sub arrays is constant.
Hence, the time taken to split the array is the only factor affecting the total Time Complexity. As a result, our overall equation becomes:

T(n) = 2T(n/2) + C2
T(n) = 2[2T(n/4) + C2] + C2
T(n) = 4T(n/4) + 3C2
T(n) = 4[2T(n/8) + C2] + 3C2
T(n) = 8T(n/8) + 7C2

After k expansions, the equation for T(n) becomes:

T(n) = 2^k T(n/2^k) + (2^k - 1)C2

As before, the base condition for the recursion occurs when n is equal to 2, i.e. T(2). This implies:

n/2^k = 2
k = log2(n) - 1

Again, this means that the recursion runs up to log2(n) - 1 times before reaching its base condition. Substituting this value of k in the above equation, we get:

T(n) = 2^(log2(n) - 1) T(n/2^(log2(n) - 1)) + (2^(log2(n) - 1) - 1)C2
T(n) = (n/2)C1 + (n/2 - 1)C2

Both terms are linear in n, so the best case Time Complexity is O(n).
6.2 SPACE COMPLEXITY

This is an In Place sorting algorithm and takes a constant amount of extra memory to sort an array. This property is quite important, since it results in an almost nonexistent computational space footprint in memory. In some cases, this is even considered more important than an algorithm's Time Complexity.
6.3 STABILITY

Instability is a major drawback of this sorting algorithm. Due to this, equal elements are not evaluated as distinct and lose their relative order. This issue can be resolved by increasing the number of pivots and treating equal elements as distinct, but the implementation becomes far too complicated and is beyond the scope of this paper. For an already sorted array, however, stability is maintained, since no swaps are made.
VII. EXPERIMENTAL ANALYSIS

We evaluated the performance of the code on array inputs of up to 32,000 elements by recording the time taken to sort the elements.
7.1 WORST CASE

The worst case scenario of this algorithm occurs when each element in the input array is distinct and there is no order in the array whatsoever. We noticed that for large numbers of input elements, this algorithm performs slower than the standard Merge Sort and Quick Sort. However, for arrays of up to 1000 elements, this algorithm is faster than both Merge and Quick Sort even in its worst case.
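The timing harness below is a hypothetical reconstruction of such an experiment, not the authors' actual test bench: the doubling sizes from 1000 to 32,000 mirror the tables in this section, while the fixed seed and the use of rand() and clock() are our own assumptions.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void split(int *ar, int i, int j);   /* the algorithm under test */

int main(void)
{
    srand(42);                                    /* fixed seed: repeatable runs */
    for (int n = 1000; n <= 32000; n *= 2) {
        int *ar = malloc((size_t)n * sizeof *ar);
        if (ar == NULL)
            return 1;
        for (int k = 0; k < n; k++)
            ar[k] = rand();                       /* unordered pseudo-random input */

        clock_t t0 = clock();
        split(ar, 0, n - 1);                      /* sort in place */
        clock_t t1 = clock();

        printf("n = %5d: %8.3f ms\n",
               n, 1000.0 * (double)(t1 - t0) / CLOCKS_PER_SEC);
        free(ar);
    }
    return 0;
}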
7.2 AVERAGE CASE

We consider the average case to be an array with partial order in it.
7.3 BEST CASE

The Best Case scenario happens when the array is completely sorted or consists of equal elements. Since we know that the Time Complexity for this condition is O(n), we only compare this algorithm with those having similar Time Complexities (Bubble and Insertion Sort). Also, since the time differences between them were very small and not meaningfully comparable, we compared the three algorithms with respect to the number of iterations taken to sort the array. We noticed that this algorithm performs better than Bubble Sort but is slightly slower than Insertion Sort.
NUMBER OF ITERATIONS

ELEMENTS | INSERTION | BUBBLE | HYBRID
1000     | 998       | n/a    | 1023
2000     | 1998      | 3996   | 2047
4000     | 3998      | n/a    | 4095
8000     | 7998      | 15996  | 8191
16000    | 15998     | 31996  | 16383
32000    | 31998     | 63996  | 32767
VIII. ASYMPTOTIC ANALYSIS

In this section, based on the experimental analysis and the previously stated proofs, we draw out an asymptotic analysis of our algorithm.
ANALYSIS FACTORS | BEST CASE | AVERAGE CASE | WORST CASE
TIME             | O(n)      | O(n log n)   | O(n log n)
SPACE            | O(1)      | O(1)         | O(1)
STABILITY        | YES       | NO           | NO
IX. CONCLUSION

This idea, like most standard algorithms, has room for improvement. During our implementation phase, we noticed that the code slows down for very large values of n. The instability of this algorithm is also a cause for concern.

Future improvements can be made to enhance the performance on larger input arrays. Since we have the minimum and maximum values of the sub arrays at any time, instead of always starting from the beginning, we can combine the current logic with an end-first search to reduce the number of iterations. Regarding its stability, as mentioned earlier, this algorithm can be made stable by increasing the number of pivots, but this would lead to other complications. Any improvement, however trivial, would be highly appreciated.
REFERENCES

[1] D. E. Knuth. "Sorting and Searching", The Art of Computer Programming, Volume 3, second edition.
[2] You Yang, Ping Yu and Yan Gan. "Experimental Study on the Five Sort Algorithms", International Conference on Mechanic Automation and Control Engineering (MACE), 2011.
[3] W. A. Martin. "Sorting", ACM Computing Surveys, 3(4):147-174, 1971.
[4] Jyrki Katajainen, Tomi Pasanen and Jukka Teuhola. "Practical in-place mergesort", Nordic Journal of Computing, Volume 3, Issue 1, 1996.
[5] R. Cole. "Parallel Merge Sort", Proc. 27th IEEE Symp. FOCS, pp. 511-516, 1988.
[6] Antonios Symvonis. "Optimal stable merging", The Computer Journal, 38(8):681-690, 1995.
[7] Wang Xiang. "Analysis of the Time Complexity of Quick Sort Algorithm", Information Management, Innovation Management and Industrial Engineering (ICIII), International Conference, 2011.
[8] Shrinu Kushagra, Alejandro Lopez-Ortiz, J. Ian Munro and Aurick Qiao. "Multi-Pivot Quicksort: Theory and Experiments", Proceedings of the 16th Meeting on Algorithm Engineering and Experiments (ALENEX), pp. 47-60, 2014.
[9] Nadir Hossain, Md. Golam Rabiul Alma, Md. Amiruzzaman and S. M. Moinul Quadir. "An Efficient Merge Sort Technique that Reduces both Times and Comparisons", Information and Communication Technologies: From Theory to Applications, International Conference, 2004.
[10] L. T. Pardo. "Stable sorting and merging with optimal space and time bounds", SIAM Journal on Computing, 6(2):351-372, 1977.
[11] Alfred Aho, John Hopcroft and Jeffrey Ullman. "The Design and Analysis of Computer Algorithms", 1974.
[12] E. Horowitz and S. Sahni. "Fundamentals of Data Structures", Computer Science Press, Rockville, 1976.
[13] Guigang Zheng, Shaohua Teng, Wei Zhang and Xiufen Fu. "A cooperative sort algorithm based on indexing", Computer Supported Cooperative Work in Design, 13th International Conference, 2009.
[14] Bing-Chao Huang and Michael A. Langston. "Practical in-place merging", Communications of the ACM, Volume 31, Issue 3, 1988.
[15] A. Symvonis. "Optimal stable merging", The Computer Journal, 38:681-690, 1995.
[16] F. K. Hwang and S. Lin. "A simple algorithm for merging two disjoint linearly ordered sets", SIAM Journal on Computing, 1972.
[17] E. C. Horvath. "Stable sorting in asymptotically optimal time and extra space", Journal of the ACM, 177-199, 1978.
[18] S. Dvorak and B. Durian. "Stable linear time sublinear space merging", The Computer Journal, 30:372-375, 1987.
[19] J. Chen. "Optimizing stable in-place merging", Theoretical Computer Science, 302(1/3):191-210, 2003.
[20] Rohit Yadav, Kratika Varshney and Nitin Verma. "Analysis of Recursive and Non-Recursive Merge Sort Algorithm", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 11, November 2013.