All content in this area was uploaded by Ahmet Arif Aydin on Sep 29, 2014
This study concerns sorting and parallelism, both of which matter in numerous scientific
fields. Sorting is one of the most studied problems in computer science. The goal of
these studies is to decrease sorting time, which in turn saves power and money.
Parallelism is another vital field, appealing to scientists who need their operations
to run fast. The main goal of the project is to implement alternative sequential
versions of the Radix Sort algorithm. We also implement parallel versions of those
sequential versions that are suitable for parallelization.
In the first phase of the project, sequential Least Significant Digit (LSD) Radix Sort
versions are implemented. After that, parallel LSD Radix Sort versions are
implemented with different synchronization constructs (critical section and lock
routine) and different scheduling methods (static, dynamic, and guided) to analyze
their impact on the running time.
In the second phase of the project, sequential, parallel, and hybrid versions of
Most Significant Digit (MSD) Radix Sort are implemented. First, the sequential
MSD1 Radix Sort version is implemented, an alternative to the traditional
recursive MSD Radix Sort algorithm. After that, we develop a parallel version of
MSD1 Radix Sort. Finally, hybrid versions of MSD Radix Sort that utilize the
Quicksort algorithm are implemented.
ABSTRACT
The Radix Sort algorithm sorts integers and strings based on their individual digits or bits.
The idea behind it is that once the numbers have been sorted on every digit position, all
numbers are sorted. The sorting process continues until the digits in each position
have been processed. Radix Sort has been chosen because it is one of the fastest
sorting algorithms. Theoretically, Radix Sort's runtime is O(n lg n) [1]; in practice,
however, the runtime of the algorithm is O(nd), where n is the length of the data
structure and d is the digit count of the maximum number in the unsorted sequence.
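The digit-by-digit idea described above can be illustrated with a minimal Python sketch of a textbook LSD radix sort. This is not the poster's Trad. LSD, LSD1, or LSD2 code; the function name `lsd_radix_sort`, the `base` parameter, and the restriction to non-negative integers are assumptions made for illustration.

```python
def lsd_radix_sort(values, base=10):
    """Sort non-negative integers, least significant digit first (illustrative sketch)."""
    result = list(values)
    if not result:
        return result
    largest = max(result)
    place = 1  # 1, base, base**2, ... selects the digit handled in this pass
    while largest // place > 0:
        # One bucket per possible digit value; appending keeps each pass stable.
        buckets = [[] for _ in range(base)]
        for v in result:
            buckets[(v // place) % base].append(v)
        # Concatenate the buckets in digit order to form the input of the next pass.
        result = [v for bucket in buckets for v in bucket]
        place *= base
    return result


if __name__ == "__main__":
    print(lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

The loop runs once per digit of the largest value, and each pass touches all n elements, which is where the O(nd) behaviour comes from.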
PERFORMANCE BENCHMARKING OF SEQUENTIAL, PARALLEL AND HYBRID RADIX SORT ALGORITHMS
AND ANALYZING IMPACT OF SUB VECTORS, CREATED ON EACH LEVEL, ON HYBRID MSD RADIX SORT’S RUNTIME
Ahmet Arif Aydin, Dr. Gita Alaghband
RADIX SORT ALGORITHM
SEQUENTIAL LSD RADIX SORT
Traditional LSD Radix Sort
PARALLEL LSD RADIX SORT
SCHEDULING STRATEGIES
The aim of scheduling strategies is to distribute distinct portions of the given
input among threads. The scheduling strategies applied in the project
are static, dynamic, and guided. When the scheduling strategy is dynamic, there is no
fixed number of iterations per thread, so one thread can process more than the others.
Whenever a thread completes its chunk, it requests another one. Therefore, the
given data can be distributed among the threads in any order.
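To make the difference between the three strategies concrete, the following Python sketch computes the chunk sizes each one would hand out. It models the commonly described semantics of static, dynamic, and guided loop schedules (as found, for example, in OpenMP); the poster does not state its runtime or chunk sizes, so the function names and the `chunk` and `min_chunk` parameters are hypothetical.

```python
def static_chunks(n_iters, n_threads):
    """Static: iterations are split up front into near-equal blocks, one per thread."""
    base, extra = divmod(n_iters, n_threads)
    return [base + (1 if t < extra else 0) for t in range(n_threads)]


def dynamic_chunks(n_iters, chunk=1):
    """Dynamic: fixed-size chunks are handed to whichever thread asks next."""
    sizes = []
    remaining = n_iters
    while remaining > 0:
        size = min(chunk, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


def guided_chunks(n_iters, n_threads, min_chunk=1):
    """Guided: each chunk is about remaining / n_threads, shrinking toward min_chunk."""
    sizes = []
    remaining = n_iters
    while remaining > 0:
        size = max(min_chunk, -(-remaining // n_threads))  # ceiling division
        size = min(size, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


if __name__ == "__main__":
    print(static_chunks(10, 4))
    print(dynamic_chunks(10, chunk=4))
    print(guided_chunks(16, 4))
```

With static scheduling the assignment is fixed before the loop starts, whereas the dynamic and guided lists only describe the order in which chunks are requested; which thread receives which chunk depends on timing.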
SEQ. & PAR. LSD RADIX SORT RESULTS
4D-9999        0.5M    1M      2M      4M
Trad. LSD      0.068   0.14    0.28    0.58
LSD1           0.067   0.14    0.28    0.58
LSD2           0.05    0.12    0.25    0.51
P1-C-D (1T)    0.15    0.32    0.66    1.33
P2-L-D (1T)    0.15    0.32    0.66    1.33
P1-C-D (2T)    0.60    1.19    2.4     4.83
P2-L-D (2T)    0.69    1.46    2.9     5.97
P1-C-D (4T)    0.98    1.97    3.9     7.86
P2-L-D (4T)    1.26    2.67    6.61    10.69
P1-C-D (6T)    1.87    3.66    7.68    15.47
P2-L-D (6T)    2.58    4.5     11.08   18.67
P1-C-D (8T)    2.55    5.1     10.06   21.14
P2-L-D (8T)    3.13    6.68    17.8    23.49
P1-C-D (10T)   2.97    6.17    11.97   26.76
P2-L-D (10T)   4.18    8.73    24.18   26.82
P1-C-D (12T)   3.53    7.19    14.24   30
P2-L-D (12T)   4.96    10.4    28.14   41.3
4D-9999        0.5M    1M      2M      4M
Trad. LSD      0.06    0.14    0.28    0.58
LSD1           0.06    0.14    0.28    0.58
LSD2           0.05    0.12    0.25    0.51
P1-C-S (1T)    0.106   0.22    0.45    0.92
P2-L-S (1T)    0.108   0.22    0.46    0.94
P1-C-S (2T)    0.51    1.04    2.15    4.19
P2-L-S (2T)    0.54    1.07    2.02    4.4
P1-C-S (4T)    1.06    2.16    4.26    8.5
P2-L-S (4T)    1.29    2.92    6.39    11.6
P1-C-S (6T)    1.63    3.29    6.62    13.38
P2-L-S (6T)    2.17    4.65    11.34   19.68
P1-C-S (8T)    2.08    4.22    8.41    17.22
P2-L-S (8T)    2.88    6.48    15      26.1
P1-C-S (10T)   2.55    5.17    10.3    20.86
P2-L-S (10T)   3.61    8.25    17.63   33.5
P1-C-S (12T)   3.01    6.11    12.29   25
P2-L-S (12T)   4.17    9.32    20.21   37
[Figure: runtime (seconds); series: 0.5M, 1M, 2M, 4M]
4D-9999        0.5M    1M      2M      4M
Trad. LSD      0.06    0.14    0.28    0.58
LSD1           0.06    0.14    0.28    0.58
LSD2           0.05    0.12    0.25    0.51
P1-C-G (1T)    0.10    0.22    0.46    0.93
P2-L-G (1T)    0.10    0.22    0.46    0.93
P1-C-G (2T)    0.53    1.05    2.2     4.24
P2-L-G (2T)    0.52    1.1     2.33    5.08
P1-C-G (4T)    1.12    2.27    4.59    9.04
P2-L-G (4T)    1.45    3.21    6.58    12.91
P1-C-G (6T)    1.76    3.62    7.27    14.5
P2-L-G (6T)    2.66    5.53    12.6    22.6
P1-C-G (8T)    2.28    4.6     9.43    18.57
P2-L-G (8T)    3.47    8       17.46   32.43
P1-C-G (10T)   2.85    5.68    11.43   23
P2-L-G (10T)   4.27    9.89    20.53   40
P1-C-G (12T)   3.56    6.68    13.6    27
P2-L-G (12T)   4.99    11.5    23.82   46
10D-9999999999   0.5M    1M      2M      4M
P1-C-D (1T)      0.4     0.87    1.67    3.37
P1-C-S (1T)      0.27    0.57    1.17    2.35
P1-C-G (1T)      0.27    0.58    1.18    2.39
P1-C-D (2T)      1.52    3.07    6.17    12.47
P1-C-S (2T)      1.35    2.73    5.55    11.36
P1-C-G (2T)      1.37    2.76    5.66    12.22
P1-C-D (4T)      2.47    5.05    9.87    19.98
P1-C-S (4T)      2.69    5.3     10.85   21.81
P1-C-G (4T)      2.83    5.7     11.5    22.96
P1-C-D (6T)      4.85    9       12.78   41.4
P1-C-S (6T)      4.18    8.53    17.2    33.6
P1-C-G (6T)      4.67    9.3     19.09   36
P1-C-D (8T)      6.26    12.6    26      51.8
P1-C-S (8T)      5.33    10.7    21      43
P1-C-G (8T)      5.95    12      23      47
P1-C-D (10T)     7.86    14.5    31      66
P1-C-S (10T)     6.57    13      26      53
P1-C-G (10T)     7.33    14      28      58
P1-C-D (12T)     9.42    17      36      74
P1-C-S (12T)     7.77    15      31      65
P1-C-G (12T)     8.58    17      33      71
[Figure: runtime (seconds); series: 0.5M, 1M, 2M, 4M]
[Figure: runtime (seconds); series: 0.5M, 1M, 2M, 4M]
SEQUENTIAL & PARALLEL MSD RADIX SORT
MSD Radix Sort starts from the most significant digit, and each subsequent step works
on the next most significant digit until the least significant digit of the numbers has
been examined. This property makes it possible to create individual buckets at each level.
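As an illustration of how buckets arise at each level, here is a minimal Python sketch of the traditional recursive form of MSD radix sort. It is not the poster's non-recursive MSD1 version; the names `msd_radix_sort` and `_sort_by_digit` and the restriction to non-negative integers are assumptions made for the example.

```python
def msd_radix_sort(values, base=10):
    """Sort non-negative integers, most significant digit first (illustrative sketch)."""
    if len(values) < 2:
        return list(values)
    # Count the digits of the largest value to find the starting digit position.
    largest = max(values)
    digits = 1
    while largest >= base ** digits:
        digits += 1
    return _sort_by_digit(list(values), base ** (digits - 1), base)


def _sort_by_digit(values, place, base):
    """Distribute values by the digit at `place`, then recurse into each bucket."""
    if len(values) < 2 or place == 0:
        return values
    buckets = [[] for _ in range(base)]
    for v in values:
        buckets[(v // place) % base].append(v)
    out = []
    for bucket in buckets:
        # Each bucket is independent of the others from this level on.
        out.extend(_sort_by_digit(bucket, place // base, base))
    return out


if __name__ == "__main__":
    print(msd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Because the buckets created at one level never exchange elements afterwards, they can be processed independently, which is the property a parallel or hybrid variant can exploit.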
Recursive MSD Radix Sort
LSD2 Radix Sort
Non-Recursive MSD1 Radix Sort
SEQ & PARALLEL MSD RADIX SORT RESULTS
Two parallel LSD Radix Sort versions are implemented: P1 utilizes a critical section and P2
utilizes a lock routine. LSD Radix Sort starts the sorting process from the least
significant digit and moves toward the most significant digit.
Parallel MSD1 Radix Sort
[Figure: runtime (seconds); series: 0.5M, 1M, 2M, 4M]
[Figure: speedup versus number range (1 digit to 10 digits, 1D to 10D); series: 0.5M, 1M, 2M, 4M]
6D-99999    0.5M    1M      2M      4M
Seq.MSD1    0.19    0.36    0.71    1.34
P (1T)      0.37    0.71    1.65    3.6
P (2T)      0.25    0.53    1.07    2.13
P (4T)      0.19    0.38    0.75    1.45
P (6T)      0.15    0.3     0.56    1.1
P (8T)      0.14    0.28    0.55    1.09
P (10T)     0.15    0.29    0.72    1.45
P (12T)     0.12    0.24    0.68    1.46
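Speedup is conventionally the sequential runtime divided by the parallel runtime. The short Python sketch below applies that definition to the 4M column of the 6D table above. Whether the poster's speedup chart uses Seq.MSD1 or P (1T) as its baseline is not stated, so taking Seq.MSD1 as the baseline is an assumption, and the variable names are illustrative.

```python
# Runtimes copied from the 4M column of the 6D table.
seq_msd1_4m = 1.34
parallel_4m = {1: 3.6, 2: 2.13, 4: 1.45, 6: 1.1, 8: 1.09, 10: 1.45, 12: 1.46}


def speedup(t_sequential, t_parallel):
    """Classic speedup: how many times faster the parallel run is."""
    return t_sequential / t_parallel


# Speedup per thread count, rounded to two decimals for readability.
speedups = {
    threads: round(speedup(seq_msd1_4m, runtime), 2)
    for threads, runtime in parallel_4m.items()
}
best_threads = max(speedups, key=speedups.get)

if __name__ == "__main__":
    for threads, value in speedups.items():
        print(f"{threads:>2} threads: speedup {value}")
    print("best thread count:", best_threads)
```

Under this baseline, values below 1 mean the parallel run is slower than Seq.MSD1, which is the case for the 1, 2, and 4 thread rows of this column.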
[Figure: runtime (seconds) for Seq.MSD1 and P (1T) through P (12T); series: 0.5M, 1M, 2M, 4M]
7D-999999   0.5M    1M      2M      4M
Seq.MSD1    0.28    0.52    1.03    1.94
P (1T)      0.47    0.9     2.02    4.16
P (2T)      0.31    0.58    1.3     2.5
P (4T)      0.22    0.44    0.9     1.74
P (6T)      0.17    0.34    0.69    1.34
P (8T)      0.16    0.33    0.66    1.3
P (10T)     0.15    0.32    0.8     1.5
P (12T)     0.14    0.28    0.8     1.6
8D-9999999  0.5M    1M      2M      4M
Seq.MSD1    0.35    0.68    1.36    2.59
P (1T)      0.54    1.05    2.32    5.19
P (2T)      0.33    0.69    1.5     2.9
P (4T)      0.24    0.48    1.03    2.07
P (6T)      0.2     0.4     0.78    1.63
P (8T)      0.18    0.39    0.76    1.49
P (10T)     0.18    0.32    0.88    1.87
P (12T)     0.14    0.29    0.84    1.9
9D-99999999  0.5M    1M      2M      4M
Seq.MSD1     0.39    0.76    1.53    3.03
P (1T)       0.59    1.18    2.3     4.62
P (2T)       0.34    0.72    1.63    3.27
P (4T)       0.27    0.53    1.13    2.31
P (6T)       0.21    0.44    0.83    1.76
P (8T)       0.19    0.37    0.76    1.6
P (10T)      0.18    0.38    0.81    1.75
P (12T)      0.16    0.33    0.71    1.33
[Figure: runtime (seconds) for Seq.MSD1 and P (1T) through P (12T); series: 0.5M, 1M, 2M, 4M]
[Figure: runtime (seconds) for Seq.MSD1 and P (1T) through P (12T); series: 0.5M, 1M, 2M, 4M]
[Figure: runtime (seconds) for Seq.MSD1 and P (1T) through P (12T); series: 0.5M, 1M, 2M, 4M]
SEQUENTIAL & PARALLEL HYBRID MSD RADIX SORT
Hybrid1 MSD Radix Sort Hybrid2 MSD Radix Sort
The data structure of the Hybrid MSD Radix Sort versions is identical to that of the
MSD1 Radix Sort version. The digit length (d) of the maximum number defines the number
of steps. The MSD1 Radix Sort version sorts numbers in d steps.
The hybrid versions run at most d-1 steps: 1 step, 2 steps, and so on up to d-1 steps.
Each step count is run as a separate configuration. At the end, the hybrid versions
determine which buckets contain numbers and then apply Quicksort to each sub vector.
Each parallel version uses the same data structure as its sequential version, but the
final Quicksort of the sub vectors is performed in parallel.
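The hybrid scheme described above can be sketched in Python as a fixed number of MSD distribution steps followed by Quicksort on every non-empty sub vector. This is a sequential illustration under stated assumptions, not the poster's Hybrid1 or Hybrid2 code: the names `hybrid_msd_sort` and `quicksort`, the list-of-lists representation of sub vectors, and the simple out-of-place Quicksort are all hypothetical.

```python
def quicksort(values):
    """Simple out-of-place Quicksort, used here only for illustration."""
    if len(values) < 2:
        return list(values)
    pivot = values[len(values) // 2]
    less = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    greater = [v for v in values if v > pivot]
    return quicksort(less) + equal + quicksort(greater)


def hybrid_msd_sort(values, steps, base=10):
    """Run `steps` MSD distribution passes, then Quicksort each sub vector."""
    if not values:
        return []
    largest = max(values)
    digits = 1
    while largest >= base ** digits:
        digits += 1
    place = base ** (digits - 1)
    sub_vectors = [list(values)]
    for _ in range(min(steps, digits)):
        next_level = []
        for vec in sub_vectors:
            buckets = [[] for _ in range(base)]
            for v in vec:
                buckets[(v // place) % base].append(v)
            # Keep only buckets that actually contain numbers, in digit order.
            next_level.extend(b for b in buckets if b)
        sub_vectors = next_level
        place //= base
    # Sub vectors are already ordered relative to each other by their leading digits.
    out = []
    for vec in sub_vectors:
        out.extend(quicksort(vec))
    return out


if __name__ == "__main__":
    print(hybrid_msd_sort([9041, 17, 2503, 2590, 8, 9040, 661], steps=2))
```

Because the sub vectors do not overlap in value range, the final Quicksort calls are independent of one another, which is what allows that last stage to be distributed across threads in a parallel version.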
CONCLUSION
SEQ & PARALLEL HYBRID MSD RADIX SORT RESULTS
4D         STEP   0.5M    1M      2M      4M
Hybrid1    1      0.066   0.13    0.27    0.55
Hybrid2    1      0.061   0.12    0.25    0.52
P1(1T)     1      0.065   0.13    0.27    0.55
P2(1T)     1      0.061   0.12    0.25    0.51
P1(2T)     1      0.044   0.09    0.18    0.38
P2(2T)     1      0.041   0.08    0.16    0.34
P1(4T)     1      0.035   0.07    0.15    0.29
P2(4T)     1      0.031   0.06    0.13    0.26
P1(6T)     1      0.032   0.06    0.13    0.26
P2(6T)     1      0.028   0.05    0.11    0.23
P1(8T)     1      0.032   0.064   0.13    0.27
P2(8T)     1      0.028   0.05    0.11    0.22
P1(10T)    1      0.028   0.05    0.11    0.23
P2(10T)    1      0.024   0.04    0.09    0.19
P1(12T)    1      0.028   0.05    0.11    0.23
P2(12T)    1      0.024   0.04    0.098   0.19
4D         STEP   0.5M    1M      2M      4M
Hybrid1    2      0.077   0.15    0.32    0.64
Hybrid2    2      0.07    0.14    0.28    0.59
P1(1T)     2      0.074   0.15    0.31    0.64
P2(1T)     2      0.073   0.14    0.3     0.61
P1(2T)     2      0.059   0.12    0.25    0.5
P2(2T)     2      0.057   0.11    0.23    0.48
P1(4T)     2      0.049   0.1     0.21    0.43
P2(4T)     2      0.05    0.1     0.2     0.41
P1(6T)     2      0.048   0.1     0.2     0.41
P2(6T)     2      0.048   0.09    0.19    0.38
P1(8T)     2      0.047   0.096   0.2     0.4
P2(8T)     2      0.047   0.09    0.19    0.39
P1(10T)    2      0.047   0.09    0.19    0.39
P2(10T)    2      0.045   0.08    0.18    0.35
P1(12T)    2      0.047   0.09    0.19    0.39
P2(12T)    2      0.045   0.08    0.17    0.36
4D         STEP   0.5M    1M      2M      4M
Hybrid1    3      0.087   0.18    0.36    0.74
Hybrid2    3      0.078   0.16    0.32    0.66
P1(1T)     3      0.085   0.17    0.36    0.73
P2(1T)     3      0.081   0.16    0.33    0.58
P1(2T)     3      0.076   0.15    0.31    0.63
P2(2T)     3      0.071   0.14    0.29    0.59
P1(4T)     3      0.068   0.14    0.28    0.56
P2(4T)     3      0.064   0.12    0.26    0.53
P1(6T)     3      0.068   0.14    0.28    0.55
P2(6T)     3      0.063   0.12    0.25    0.52
P1(8T)     3      0.068   0.13    0.27    0.55
P2(8T)     3      0.062   0.12    0.25    0.51
P1(10T)    3      0.067   0.13    0.27    0.54
P2(10T)    3      0.062   0.12    0.25    0.51
P1(12T)    3      0.069   0.13    0.27    0.54
P2(12T)    3      0.062   0.12    0.24    0.51
[Figure: runtime (seconds) for Hybrid1, Hybrid2, and P1/P2 with 1T through 12T; series: 0.5M, 1M, 2M, 4M]
[Figure: runtime (seconds) for Hybrid1, Hybrid2, and P1/P2 with 1T through 12T; series: 0.5M, 1M, 2M, 4M]
[Figure: runtime (seconds) for Hybrid1, Hybrid2, and P1/P2 with 1T through 12T; series: 0.5M, 1M, 2M, 4M]
First, sequential LSD Radix Sort versions were implemented; then parallel LSD Radix Sort versions were implemented with a critical section and a lock, and their impact on the results was measured. Moreover, recursive MSD Radix Sort was implemented, and we then developed
an alternative non-recursive sequential MSD1 Radix Sort, which always performs better than the traditional recursive version. Parallel MSD1 Radix Sort was also implemented and its results were measured. Finally, two sequential and two parallel Hybrid MSD Radix Sort versions
were implemented, and the impact of creating sub vectors on the runtime was measured in both the sequential and parallel Hybrid MSD Radix Sort versions.
[Figure: runtime (seconds); series: 0.5M, 1M, 2M, 4M]