Conference Paper

Parallaxis-III: a structured data-parallel programming language

Abstract

Parallaxis is a machine-independent language for data-parallel programming, based on sequential Modula-2. Programming in Parallaxis is done on a level of abstraction with virtual processors and virtual connections, which may be defined by the application programmer. This paper describes Parallaxis-III, the current version of the language definition, together with a number of parallel sample algorithms.
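The virtual-processor-plus-virtual-connection abstraction described above can be mimicked in any sequential language. The following Python sketch (all names hypothetical; this is not Parallaxis syntax) models a user-defined set of virtual PEs connected in a ring, with a single lockstep data-exchange step:

```python
# Minimal sequential simulation of the Parallaxis idea (hypothetical names):
# a user-defined set of virtual processors plus a user-defined connection
# (here a ring), with a lockstep step that moves data along the connection.

def make_ring(n):
    """Connection function: virtual PE i sends to its right neighbour."""
    return {i: (i + 1) % n for i in range(n)}

def propagate(values, connection):
    """One SIMD step: every PE simultaneously sends its value along its link."""
    out = [None] * len(values)
    for src, dst in connection.items():
        out[dst] = values[src]
    return out

ring = make_ring(4)
print(propagate([10, 20, 30, 40], ring))  # [40, 10, 20, 30]
```

Because the connection is a user-supplied function, the same `propagate` step works unchanged for grids, tori, or trees — which is the point of the abstraction.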

... One goal of the course was for students to experience two different models of parallelism: SIMD and MIMD computing. For SIMD computing, the author installed Parallaxis-III [14] on each lab workstation. Parallaxis is a machine-independent framework for defining virtual SIMD architectures, specifying parallel algorithms (in Modula-2), and running a specified algorithm on a defined architecture. ...
Article
Much has changed about parallel and distributed computing (PDC) since the author began teaching the topic in the late 1990s. This paper reviews some of the key changes to the field and describes their impacts on his work as a PDC educator. Such changes include: the availability of free implementations of the message passing interface (MPI) for distributed-memory multiprocessors; the development of the Beowulf cluster; the advent of multicore architectures; the development of free multithreading languages and libraries such as OpenMP; the availability of (relatively) inexpensive manycore accelerator devices (e.g., GPUs); the availability of free software platforms like CUDA, OpenACC, OpenCL, and OpenMP for using accelerators; the development of inexpensive single board computers (SBCs) like the Raspberry Pi, and other changes. The paper details the evolution of PDC education at the author's institution in response to these changes, including curriculum changes, seven different Beowulf cluster designs, and the development of pedagogical tools and techniques specifically for PDC education. The paper also surveys many of the hardware and software infrastructure options available to PDC educators, provides a strategy for choosing among them, and provides practical advice for PDC pedagogy. Through these discussions, the reader may see how much PDC education has changed over the past two decades, identify some areas of PDC that have remained stable during this same time period, and so gain new insight into how to efficiently invest one's time as a PDC educator.
... Several parallel languages have supported mechanisms for storing and manipulating index sets. Parallaxis-III and C* are two such examples, both designed to express a SIMD style of computation [2,15]. Both languages support dense multidimensional index spaces that are used to declare parallel arrays. ...
Conference Paper
Most array languages, including Fortran 90, Matlab, and APL, provide support for referencing arrays by extending the traditional array subscripting construct found in scalar languages. We present an alternative to subscripting that exploits the concept of regions---an index set representation that can be named, manipulated with high-level operators, and syntactically separated from array references. This paper develops the concept of region-based programming and describes its benefits in the context of an idealized array language called RL. We show that regions simplify programming, reduce the likelihood of errors, and enable code reuse. Furthermore, we describe how regions accentuate the locality of array expressions and how this locality is important when targeting parallel computers. We also show how the concepts of region-based programming have been used in ZPL, a fully-implemented practical parallel programming language in use by scientists and engineers. In addition, we contrast region-based programming with the array reference constructs of other array languages.
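The core idea of the abstract — an index set that is named, manipulated with operators, and kept syntactically separate from array references — can be illustrated with a small Python sketch (hypothetical names; this is not RL or ZPL syntax):

```python
# Hypothetical sketch of region-based array access in the spirit of RL/ZPL:
# a "region" is a named index set, built and manipulated separately from the
# arrays it is later applied to.

def region(lo, hi):
    """Dense 1-D region [lo, hi] as an explicit, reusable index set."""
    return range(lo, hi + 1)

def apply_over(r, f):
    """Evaluate f elementwise over the region -- no subscripts at the call site."""
    return [f(i) for i in r]

a = [0, 1, 4, 9, 16, 25]
interior = region(1, 4)          # the index set, named once
# b[i] = a[i-1] + a[i+1] over the interior, expressed via the region:
b = apply_over(interior, lambda i: a[i - 1] + a[i + 1])
print(b)  # [4, 10, 20, 34]
```

Naming the region once and reusing it is what reduces subscripting errors and makes the locality of an array expression visible to a compiler.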
... Each peripheral router is connected to 4 input/output ports and the central router is connected to m+1 input/output ports. This elementary network can be described in Parallaxis [19] as follows:
  Configuration Poly [2...n], [2...m]
  Periphery connections:
    Vertical:   North: Poly[i] → Poly[i−1];  South: Poly[i] → Poly[i+1]
    Horizontal: East:  Poly[j] → Poly[j+1];  West:  Poly[j] → Poly[j−1]
  Central router (many-to-one connection): Poly[i, j] → Poly[0][0]
The GEXspidergon graph is constructed by iterating this algorithm in two dimensions, as illustrated in Figure 2. Routers are categorized into four groups according to their degree: ...
Article
Full-text available
The study of Networks on Chips (NoCs) is a research field that primarily addresses the global communication in Systems-on-Chip (SoCs). The selected topology and the routing algorithm play a prime role in the performance of NoC architectures. In order to handle the design complexity and meet the tight time-to-market constraints, it is important to automate most of these NoC design phases. The extension of the UML language called UML profile for MARTE (Modeling and Analysis of Real-Time and Embedded systems) specifies some concepts for model-based design and analysis of real time and embedded systems. This paper presents a MARTE based methodology for modeling concepts of NoC based architectures. It aims at improving the effectiveness of the MARTE standard by clarifying some notations and extending some definitions in the standard, in order to be able to model complex architectures like NoCs.
... The described parallel algorithm was implemented using the Parallaxis parallel programming language [9]. A suite of tests was performed to verify the correctness of the parallel implementation. ...
Article
This paper presents an efficient method for implementing the Gaussian elimination technique for an n×m (m ≥ n) matrix, using a 2D SIMD array of n×m processors. The described algorithm consists of 2n−1 = O(n) iterations, which provides an optimal speed-up over the serial version. A particularity of the algorithm is that it only requires broadcasts on the rows of the processor matrix and not on its columns. The paper also presents several extensions and applications of the Gaussian elimination algorithm.
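The row-broadcast pattern the abstract describes can be sketched sequentially. The following is not the paper's exact algorithm, only a minimal illustration of elimination phrased as SIMD-style steps: in each iteration the pivot row is broadcast along the rows, and every element updates in lockstep using one multiplier per row:

```python
# Sketch (not the paper's exact algorithm) of elimination on an augmented
# matrix, written as SIMD-style steps: in iteration k the pivot row is
# broadcast and every (i, j) element updates simultaneously.

def eliminate(A):
    n = len(A)                                   # number of rows (pivots)
    for k in range(n):
        pivot_row = A[k][:]                      # broadcast along rows
        mult = [A[i][k] / pivot_row[k] for i in range(n)]  # one value per row
        for i in range(n):
            if i != k:
                # conceptually, all (i, j) PEs apply the same instruction
                A[i] = [A[i][j] - mult[i] * pivot_row[j]
                        for j in range(len(A[i]))]
    return A

# Solve 2x + y = 5, x + 3y = 10 via the augmented matrix:
A = eliminate([[2.0, 1.0, 5.0], [1.0, 3.0, 10.0]])
print(A[0][2] / A[0][0], A[1][2] / A[1][1])  # 1.0 3.0
```

Note that only `pivot_row` (a row broadcast) and `mult` (one scalar per row) are communicated — consistent with the abstract's claim that no column broadcasts are needed.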
Article
Since 1990, the Computer Science Department at Rochester Institute of Technology has offered a concentration in parallel computing. This concentration is available both to undergraduates and to students studying for the master's degree. This paper documents our experiences with the selection of hardware and software to support our parallel computing program. We describe our concentration, and we report on the networking established between Rochester Institute of Technology and other colleges and universities, designed to provide support for educators who are attempting to introduce parallel computing into their curricula. Finally, we look at what we might do differently if we were starting today.
Conference Paper
Full-text available
This paper presents an overview of low-level parallel image processing algorithms and their implementation for active vision systems. The authors have demonstrated novel low-level image processing algorithms for point operators, local operators, dithering, smoothing, edge detection, morphological operators, image segmentation and image compression. The algorithms have been prepared and described as pseudocode, and have been simulated using the Parallel Computing Toolbox™ (PCT) of MATLAB. The PCT provides parallel constructs in the MATLAB language, such as parallel for-loops, distributed arrays and message passing, and enables rapid prototyping of parallel code through an interactive parallel MATLAB session.
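A point operator — the simplest of the operation classes listed above — applies the same function to every pixel independently, which is why it parallelizes trivially. A minimal Python sketch (the threshold value is an arbitrary example, not taken from the paper):

```python
# A point operator applies one function to each pixel independently, so every
# pixel could be handled by its own PE in lockstep. Sketch with an arbitrary
# threshold as the example operation.

def point_op(image, f):
    """Apply f to every pixel of a 2-D image (list of rows)."""
    return [[f(p) for p in row] for row in image]

img = [[10, 200], [130, 5]]
print(point_op(img, lambda p: 255 if p >= 128 else 0))  # [[0, 255], [255, 0]]
```

Local operators (smoothing, edge detection) differ only in that each output pixel also reads a fixed neighbourhood, which still keeps the per-pixel work independent.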
Conference Paper
Network on Chip (NoC) is a research field that primarily addresses global communication in Systems on Chip (SoCs). The selected topology of the component interconnects plays a prime role in the performance of a NoC architecture. For NoC design, high-level synthesis approaches are used, so that the behavioral description of the system is refined into an accurate register-transfer-level (RTL) design for SoC implementation. In the recent MARTE (Modeling and Analysis of Real-Time and Embedded Systems) profile, a notion of multidimensional multiplicity has been proposed to model repetitive structures and topologies. This paper presents a new methodology for modeling NoCs based on Model Driven Architecture and MARTE; it aims to show the effectiveness of the MARTE standard in modeling irregular, or globally irregular but locally regular, architectures. The work proceeds from a high level of abstraction down to generated VHDL code.
Conference Paper
Full-text available
Data-parallel languages support a single instruction flow; the parallelism is expressed at the instruction level. In practice, data-parallel languages have chosen arrays to support the parallelism. This regular data structure allows a natural development of regular parallel algorithms. The implementation of irregular algorithms requires a programming effort to project the irregular data structures onto regular structures. In this article we present the different techniques used to manage irregularity in data-parallel languages. Each of them is illustrated with standard or experimental data-parallel language constructs.
1 Irregularity and data-parallelism
First observe that the data-parallel and task-parallel programming models are derived directly from the SIMD and MIMD execution models. The first trace of data parallelism is seen in early supercomputers such as the Cray 1 or the Cyber 205, which provided a pipelined parallel execution model. The access to contiguous or r...
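The "programming effort to project irregular data structures onto regular structures" that the abstract mentions is commonly done with a compressed, offset-based layout. A minimal Python sketch (hypothetical names; one common scheme, not the paper's specific technique):

```python
# A common projection of an irregular structure (here a graph whose nodes
# have varying degree) onto the flat, regular arrays that data-parallel
# languages handle well: an offsets array plus one flat neighbour array.

def to_regular(adjacency):
    """Flatten a ragged adjacency list into (offsets, flat) arrays."""
    offsets, flat = [0], []
    for neighbours in adjacency:
        flat.extend(neighbours)
        offsets.append(len(flat))
    return offsets, flat

def neighbours_of(offsets, flat, v):
    """Recover node v's neighbours from the regular representation."""
    return flat[offsets[v]:offsets[v + 1]]

offsets, flat = to_regular([[1, 2], [0], [0, 1, 3], [2]])
print(neighbours_of(offsets, flat, 2))  # [0, 1, 3]
```

Both arrays are dense and regular, so they can be distributed and traversed with the array constructs the languages already provide.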
Conference Paper
Full-text available
We describe the experience of three undergraduate computer science programs offering courses on parallel computing. In particular, we offer three different solutions to the problem of equipping a lab and discuss how those solutions may impact the content of the course.
Conference Paper
Full-text available
Most data-parallel languages use arrays to support parallelism. This regular data structure allows a natural development of regular parallel algorithms. The implementation of irregular algorithms requires a programming effort to project the irregular data structures onto regular structures. We first propose in this paper a classification of existing data-parallel languages. We briefly describe their irregular and dynamic aspects, and derive the different levels where irregularity and dynamicity may be introduced. We then propose a new irregular and dynamic data-parallel programming model, called Idole. Finally, we discuss its integration into the C++ language and present an overview of the Idole extension of C++.
1 Irregularity and Data-Parallelism
The evolution of data-parallel languages mimics closely the evolution of sequential languages. Keeping in mind efficiency and simplicity, compilers have supported, in a first step, only regular data structures: arrays in sequential l...
Conference Paper
We give linear systolic array architectures for self-organizing linear lists using two hybrid schemes of the move-to-front and transpose heuristics, attempting to incorporate the best of both methodologies. The arrays accept input every clock cycle and have a number of processors equal to the length of the list, n. This design is then implemented to build high-speed lossless data compression hardware for data communication and storage that achieves a high compression ratio for both small and large files.
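The two heuristics being hybridized can be sketched sequentially; the systolic arrays implement these list updates in hardware, one processor per list position. A minimal Python illustration (the paper's hybrid combination rules are not reproduced here):

```python
# Sequential sketches of the two self-organizing-list heuristics that the
# systolic arrays implement in hardware. Each access returns the position
# found (usable as a code in list-based compression) and reorders the list.

def access_mtf(lst, x):
    """Move-to-front: the accessed item jumps to the head of the list."""
    i = lst.index(x)
    lst.insert(0, lst.pop(i))
    return i

def access_transpose(lst, x):
    """Transpose: the accessed item swaps one place toward the head."""
    i = lst.index(x)
    if i > 0:
        lst[i - 1], lst[i] = lst[i], lst[i - 1]
    return i

lst = ['a', 'b', 'c']
print(access_mtf(lst, 'c'), lst)  # 2 ['c', 'a', 'b']
```

Move-to-front adapts quickly but overreacts to one-off accesses; transpose is conservative and stable — hybrids aim for the best of both, as the abstract states.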
Conference Paper
In this paper we discuss the formal specification of parallel SIMD execution. We outline a vector model to describe SIMD execution, which forms the basis of a semantic definition for a simple SIMD language. The model is based upon the notion of atomic parallel SIMD instructions operating on vectors of size Π, where Π is the number of PEs on the machine. The vector model for parallel SIMD execution is independent of any specific computing architecture and can define parallel SIMD execution on a real SIMD machine (with a limited number of PEs) or a SIMD simulation. The model enables the formal specification of SIMD languages by providing an underlying mathematical framework for the SIMD paradigm.
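The core notion — an atomic instruction that applies one scalar operation to every component of a length-Π vector at once — can be sketched directly in Python (hypothetical names; the masking convention shown is a common one, not necessarily the paper's):

```python
# Sketch of the vector model: one atomic SIMD instruction applies the same
# scalar operation to every component of a length-Pi vector simultaneously,
# optionally under an activity mask (inactive PEs keep their old value --
# an assumed convention, not taken from the paper).

PI = 8  # number of PEs on the (simulated) machine

def simd(op, mask=None):
    """Lift a scalar operation to an atomic instruction on Pi-vectors."""
    def instr(*vectors):
        return [op(*vals) if (mask is None or mask[i]) else vectors[0][i]
                for i, vals in enumerate(zip(*vectors))]
    return instr

vadd = simd(lambda a, b: a + b)
print(vadd(list(range(PI)), [10] * PI))  # [10, 11, 12, 13, 14, 15, 16, 17]
```

Because `simd` is defined independently of any machine, the same instruction semantics covers a real SIMD machine and a simulation, differing only in the value of Π.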
Conference Paper
Parallaxis is a machine-independent language for data-parallel programming. Sequential Modula-2 is used as the base language. Programming in Parallaxis is done on a level of abstraction with virtual processors and virtual connections, which may be defined by the application programmer. The paper describes a compiler and debugger for Parallaxis-III, which allow the execution of massively parallel programs on sequential workstations, especially for teaching purposes.
Article
Full-text available
Parallaxis-III is an architecture-independent data parallel programming language based on Modula-2. It has been designed for teaching data parallel concepts and is in use at a large number of institutions. Compilers exist for data parallel systems, as well as for a sequential simulation system. A data parallel graphics debugger allows efficient source-level analysis of parallel programs.
Chapter
This paper presents the result of a study in which we examined about 50 massively parallel programming languages in order to detect typical approaches towards supporting parallelism. Based on a classification into nine classes, semantic properties affecting the development of parallel programs are compared. From a consideration of the general function of programming languages in software engineering, we derive basic requirements on parallel languages.
Conference Paper
Parallaxis is a programming language for massively parallel single instruction-multiple data (SIMD) systems, based on Modula-2. There are only a small number of additional constructs to handle parallel data (vectors) and data exchange among processors or between the front-end and back-end. Parallaxis helps to solve parallel problems in a natural way and does not require special skills. The major language constructs are described and a number of sample programs are given together with their simulated processor-element (PE) load and efficiency values. Parallaxis is available as a simulation system which is chiefly used in universities for instructional purposes. However, a compiler for the massively parallel MasPar computer system has been finished, and a compiler for the Connection Machine is being developed.
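The "data exchange between front-end and back-end" mentioned above is canonically a reduction: the PE vector on the back-end is combined into one scalar for the sequential front-end. A minimal Python sketch of a log-depth reduction tree (hypothetical names; an illustration of the concept, not Parallaxis code):

```python
# Sketch of a back-end -> front-end transfer as a reduction: the vector of
# per-PE values is combined pairwise, halving its length each step, as a
# log-depth hardware reduction tree would.

def reduce_vector(vec, op):
    """Combine all PE values with op; each while-iteration is one tree level."""
    while len(vec) > 1:
        half = (len(vec) + 1) // 2
        vec = [op(vec[i], vec[i + half]) if i + half < len(vec) else vec[i]
               for i in range(half)]
    return vec[0]

print(reduce_vector([1, 2, 3, 4, 5], lambda a, b: a + b))  # 15
```

Any associative operation (sum, max, logical and) fits the same pattern, which is why SIMD languages typically expose reductions as built-in constructs.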