Toward the parallelization of GSL
In this paper, we present our joint efforts to design and develop parallel implementations of the GNU Scientific Library for
a wide variety of parallel platforms. The proposed multilevel software architecture provides several interfaces: a sequential
interface that hides the parallel nature of the library from sequential users, a parallel interface for parallel programmers,
and a web-services-based interface that provides remote access to the routines of the library. The physical level of the architecture
includes platforms ranging from distributed and shared-memory multiprocessors to hybrid systems and heterogeneous clusters.
Several well-known operations arising in discrete mathematics and sparse linear algebra are used to illustrate the challenges,
benefits, and performance of different parallelization approaches.
Available from: Jorge González-Domínguez
- "Among them, PBLAS, a subset of BLAS, and ScaLAPACK, a subset of LAPACK, are the most popular ones. Based on them, Aliaga et al. made an effort to parallelize the open source numerical library GSL. Moreover, many vendors provide their own parallel numerical libraries, such as Intel MKL, IBM PESSL and HP MLIB."
ABSTRACT: The popularity of Partitioned Global Address Space (PGAS) languages has increased in recent years thanks to their high programmability and performance through an efficient exploitation of data locality, especially on hierarchical architectures such as multicore clusters. This paper describes UPCBLAS, a parallel numerical library for dense matrix computations using the PGAS Unified Parallel C language. The routines developed in UPCBLAS are built on top of sequential Basic Linear Algebra Subprograms (BLAS) functions and exploit the particularities of the PGAS paradigm, taking into account data locality in order to achieve good performance. Furthermore, the routines implement other optimization techniques, several of which automatically take into account the hardware characteristics of the underlying systems on which they are executed. The library has been experimentally evaluated on a multicore supercomputer and compared with a message-passing-based parallel numerical library, demonstrating good scalability and efficiency. Copyright © 2012 John Wiley & Sons, Ltd.
Concurrency and Computation: Practice and Experience 09/2012; 24(14):1645-1667. DOI:10.1002/cpe.1914 · 1.00 Impact Factor
Available from: Patricio Bulić
ABSTRACT: In this paper we present an approximate algorithm for detecting and filtering data dependencies with a sufficiently large
distance between memory references. A sequence of the same operations (typically enclosed in a ‘for’ loop) can be replaced
with a single SIMD operation if the distance between memory references is greater than or equal to the number of data processed
in the SIMD register. Some loops that could not be vectorized on traditional vector processors can still be parallelized
for short SIMD execution. A number of approximate data-dependence tests have been proposed in the literature,
but all of them assume a data dependence even when there is actually no dependence that would restrict parallelization
related to the short SIMD execution model. By examining the properties of linear subscript expressions of possibly conflicting
data references, our algorithm gives the green light to the parallelization process if some sufficient conditions regarding
the dependence distance are met. Our method is based on the Banerjee test and checks the minimum and maximum distances between
memory references within the iteration space rather than searching for the existence of an integer solution to the dependence
equation. The proposed method extends the accuracy and applicability of the classical Banerjee test.
Keywords: Data dependency; Multimedia extensions; SIMD instructions; Vectorizing compilers
The Journal of Supercomputing 05/2011; 56(2):226-244. DOI:10.1007/s11227-009-0364-8 · 0.86 Impact Factor
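The dependence-distance condition described in the abstract above can be illustrated with a small Python sketch. It is not the authors' algorithm, only a Banerjee-style bounds check written under stated assumptions: the loop body is a single statement of the form a[a1*i + b1] = f(a[a2*i + b2]), the loop runs over lo <= i <= hi inclusive, and a SIMD register holds vl elements. The function name simd_safe and its parameters are hypothetical. The check asks whether a value written in iteration i could be read in a later iteration j with 0 < j - i < vl. If the real-valued range of the address difference over that region excludes zero, no such flow dependence exists and vl consecutive iterations can be executed together.

```python
from itertools import product


def simd_safe(a1, b1, a2, b2, lo, hi, vl):
    """Sufficient (conservative) test for short-SIMD execution.

    Assumed loop, for lo <= i <= hi:
        a[a1*i + b1] = f(a[a2*i + b2])
    Returns True if no write in iteration i can reach a read in
    iteration j with 0 < j - i < vl. False means "not proven safe".
    Only flow (write-then-read) dependences are considered here.
    """
    if vl <= 1 or hi <= lo:
        return True  # nothing to pack together, or at most one iteration

    # h(i, t) = write address in iteration i minus read address in
    # iteration i + t. It is linear in i and t, so its extremes over
    # the box [lo, hi] x [1, vl - 1] are attained at the corners.
    def h(i, t):
        return (a1 * i + b1) - (a2 * (i + t) + b2)

    corners = [h(i, t) for i, t in product((lo, hi), (1, vl - 1))]
    # If zero lies outside [min, max], the dependence equation
    # h(i, t) = 0 has no solution with distance smaller than vl.
    return min(corners) > 0 or max(corners) < 0


print(simd_safe(1, 4, 1, 0, 0, 99, 4))   # a[i+4] = f(a[i])   -> True
print(simd_safe(1, 2, 1, 0, 0, 99, 4))   # a[i+2] = f(a[i])   -> False
print(simd_safe(2, 4, 2, 0, 0, 99, 4))   # a[2i+4] = f(a[2i]) -> False
print(simd_safe(1, -1, 1, 0, 1, 99, 4))  # a[i-1] = f(a[i])   -> True
```

The third case shows why the comparison is made in iterations, not raw addresses: the two references are four elements apart, but with a stride of two the write is read only two iterations later, which is inside a four-wide SIMD group.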