Abstract
Parallaxis is a machine-independent language for data-parallel
programming, based on sequential Modula-2. Programming in Parallaxis is
done on a level of abstraction with virtual processors and virtual
connections, which may be defined by the application programmer. This
paper describes Parallaxis-III, the current version of the language
definition, together with a number of parallel sample algorithms.
... One goal of the course was for students to experience two different models of parallelism: SIMD and MIMD computing. For SIMD computing, the author installed Parallaxis-III [14] on each lab workstation. Parallaxis is a machine-independent framework for defining virtual SIMD architectures, specifying parallel algorithms (in Modula-2), and running a specified algorithm on a defined architecture. ...
Much has changed about parallel and distributed computing (PDC) since the author began teaching the topic in the late 1990s. This paper reviews some of the key changes to the field and describes their impacts on his work as a PDC educator. Such changes include: the availability of free implementations of the message passing interface (MPI) for distributed-memory multiprocessors; the development of the Beowulf cluster; the advent of multicore architectures; the development of free multithreading languages and libraries such as OpenMP; the availability of (relatively) inexpensive manycore accelerator devices (e.g., GPUs); the availability of free software platforms like CUDA, OpenACC, OpenCL, and OpenMP for using accelerators; the development of inexpensive single board computers (SBCs) like the Raspberry Pi, and other changes. The paper details the evolution of PDC education at the author's institution in response to these changes, including curriculum changes, seven different Beowulf cluster designs, and the development of pedagogical tools and techniques specifically for PDC education. The paper also surveys many of the hardware and software infrastructure options available to PDC educators, provides a strategy for choosing among them, and provides practical advice for PDC pedagogy. Through these discussions, the reader may see how much PDC education has changed over the past two decades, identify some areas of PDC that have remained stable during this same time period, and so gain new insight into how to efficiently invest one's time as a PDC educator.
... Several parallel languages have supported mechanisms for storing and manipulating index sets. Parallaxis-III and C* are two such examples, both designed to express a SIMD style of computation [2,15]. Both languages support dense multidimensional index spaces that are used to declare parallel arrays. ...
Most array languages, including Fortran 90, Matlab, and APL, provide support for referencing arrays by extending the traditional array subscripting construct found in scalar languages. We present an alternative to subscripting that exploits the concept of regions---an index set representation that can be named, manipulated with high-level operators, and syntactically separated from array references. This paper develops the concept of region-based programming and describes its benefits in the context of an idealized array language called RL. We show that regions simplify programming, reduce the likelihood of errors, and enable code reuse. Furthermore, we describe how regions accentuate the locality of array expressions and how this locality is important when targeting parallel computers. We also show how the concepts of region-based programming have been used in ZPL, a fully-implemented practical parallel programming language in use by scientists and engineers. In addition, we contrast region-based programming with the array reference constructs of other array languages.
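The region idea described above can be sketched in a few lines. The following Python sketch is illustrative only (the names `Region` and `assign` are hypothetical, not RL or ZPL syntax): a region is a named index set that is applied to a whole-array statement, so the array references themselves carry no subscript arithmetic.

```python
# Sketch of region-based array programming in the spirit of RL/ZPL.
# A Region is a named index set; applying it to a whole-array statement
# replaces explicit subscripting. All names here are illustrative.

class Region:
    def __init__(self, lo, hi):          # half-open 1-D range [lo, hi)
        self.lo, self.hi = lo, hi

    def indices(self):
        return range(self.lo, self.hi)

def assign(region, dst, expr):
    """dst[i] = expr(i) for every i in the region (a region-scoped statement)."""
    for i in region.indices():
        dst[i] = expr(i)

n = 8
a = [float(i) for i in range(n)]
b = [0.0] * n

interior = Region(1, n - 1)              # the named region of interior points
# Three-point average over the interior; the region, not the reference,
# expresses the index set:
assign(interior, b, lambda i: (a[i - 1] + a[i] + a[i + 1]) / 3.0)
print(b)
```

Because the index set is named and reusable, the same `interior` region could scope many statements, which is the code-reuse benefit the abstract mentions.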
... Each peripheral router is connected to 4 input/output ports and the central router is connected to m+1 input/output ports. This elementary network can be described in Parallaxis [19] as follows:

  Configuration Poly [2...n], [2...m]
  Periphery connections:
    Vertical:    North: Poly[i] → Poly[i − 1]
                 South: Poly[i] → Poly[i + 1]
    Horizontal:  East:  Poly[j] → Poly[j + 1]
                 West:  Poly[j] → Poly[j − 1]
  Central router (many-to-one): Poly[i, j] → Poly[0, 0]

The GEXspidergon graph is constructed by iterating this algorithm in two dimensions, as illustrated in Figure 2. Routers are categorized into four groups according to their degree: ...
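The connection pattern of the elementary network can be reconstructed as a small graph-building sketch. This is not the paper's code; `build_elementary_network` is a hypothetical name, and the central router is represented as node `(0, 0)`.

```python
# Illustrative reconstruction of the elementary network: an n x m grid of
# peripheral routers with north/south/east/west links, plus a central
# router connected many-to-one to every grid node.

def build_elementary_network(n, m):
    center = (0, 0)
    links = {center: set()}
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            node = (i, j)
            links.setdefault(node, set())
            # vertical (north/south) and horizontal (east/west) neighbours
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 1 <= ni <= n and 1 <= nj <= m:
                    links[node].add((ni, nj))
            # many-to-one connection to the central router
            links[node].add(center)
            links[center].add(node)
    return links

net = build_elementary_network(3, 3)
# Peripheral degrees fall into distinct groups (corner, edge, interior),
# matching the degree-based classification the excerpt mentions:
print(sorted({len(v) for k, v in net.items() if k != (0, 0)}))
```

For a 3 x 3 grid this yields peripheral degrees 3, 4 and 5, with the central router as a fourth degree class.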
The study of Networks on Chips (NoCs) is a research field that primarily addresses the global communication in Systems-on-Chip (SoCs). The selected topology and the routing algorithm play a prime role in the performance of NoC architectures. In order to handle the design complexity and meet the tight time-to-market constraints, it is important to automate most of these NoC design phases. The extension of the UML language called UML profile for MARTE (Modeling and Analysis of Real-Time and Embedded systems) specifies some concepts for model-based design and analysis of real time and embedded systems. This paper presents a MARTE based methodology for modeling concepts of NoC based architectures. It aims at improving the effectiveness of the MARTE standard by clarifying some notations and extending some definitions in the standard, in order to be able to model complex architectures like NoCs.
... The described parallel algorithm was implemented using the Parallaxis parallel programming language [9]. A suite of tests was performed to verify the correctness of the parallel implementation. ...
This paper presents an efficient method for implementing the Gaussian elimination technique for an n×m (m ≥ n) matrix, using a 2D SIMD array of n×m processors. The described algorithm consists of 2n − 1 = O(n) iterations, which provides an optimal speedup over the serial version. A particularity of the algorithm is that it only requires broadcasts on the rows of the processor matrix and not on its columns. The paper also presents several extensions and applications of the Gaussian elimination algorithm.
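The data-parallel scheme behind this abstract can be sketched sequentially: in each iteration the pivot row is broadcast, and every PE (i, j) updates its own element independently. The sketch below is a minimal Gauss-Jordan variant without pivoting, not the paper's algorithm; the two inner loops stand in for the n×m PE grid operating in lockstep.

```python
# Sequential sketch of the data-parallel idea: broadcast the pivot row,
# then let every (simulated) PE update its element in parallel.
# No partial pivoting; assumes nonzero pivots (illustration only).

def gaussian_eliminate(a):
    """Reduce an n x m (m >= n) matrix to reduced row-echelon form in place."""
    n, m = len(a), len(a[0])
    for k in range(n):
        pivot = a[k][k]               # value broadcast along row k
        for j in range(k, m):
            a[k][j] /= pivot          # pivot row normalised "in parallel"
        for i in range(n):            # every other row updates concurrently
            if i != k:
                f = a[i][k]
                for j in range(k, m):
                    a[i][j] -= f * a[k][j]
    return a

# Solve a small 2x3 augmented system: x + y = 3, 2x + y = 4
sol = gaussian_eliminate([[1.0, 1.0, 3.0], [2.0, 1.0, 4.0]])
print(sol)   # last column holds the solution (x = 1, y = 2)
```

Note how only the pivot row is read by the other rows in each iteration, which corresponds to the row-only broadcasts the abstract highlights.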
Since 1990, the Computer Science Department at Rochester Institute of Technology has offered a concentration in parallel computing. This concentration is available both to undergraduates and to students studying for the master's degree. This paper documents our experiences with the selection of hardware and software to support our parallel computing program. We describe our concentration, and we report on the networking established between Rochester Institute of Technology and other colleges and universities, designed to provide support for educators who are attempting to introduce parallel computing into their curricula. Finally, we look at what we might do differently if we were starting today.
This paper presents an overview of low-level parallel image processing algorithms and their implementation for active vision systems. The authors have demonstrated novel low-level image processing algorithms for point operators, local operators, dithering, smoothing, edge detection, morphological operators, image segmentation and image compression. The algorithms have been prepared and described as pseudocode. These algorithms have been simulated using the Parallel Computing Toolbox™ (PCT) of MATLAB. The PCT provides parallel constructs in the MATLAB language, such as parallel for loops, distributed arrays and message passing, and enables rapid prototyping of parallel code through an interactive parallel MATLAB session.
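Two of the algorithm classes listed above can be sketched minimally. The Python sketch below is illustrative only (a 1-D signal stands in for an image, and the function names are hypothetical): a point operator depends on one pixel, so it is trivially data-parallel, while a local operator reads a small neighbourhood.

```python
# Minimal analogues of two algorithm classes the paper surveys:
# a point operator (per-pixel, trivially parallel) and a local operator
# (3-point smoothing over a 1-D signal for brevity).

def invert(img, maxval=255):
    # point operator: each output pixel depends only on its own input pixel
    return [maxval - p for p in img]

def smooth(img):
    # local operator: each interior pixel averages its 3-neighbourhood
    out = img[:]
    for i in range(1, len(img) - 1):
        out[i] = (img[i - 1] + img[i] + img[i + 1]) // 3
    return out

sig = [0, 0, 90, 0, 0]
print(invert(sig))   # [255, 255, 165, 255, 255]
print(smooth(sig))   # [0, 30, 30, 30, 0]
```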
Network on Chip (NoC) is a research field that primarily addresses global communication in Systems on Chip (SoC). The selected topology of the component interconnects plays a prime role in the performance of a NoC architecture. For NoC design, high-level synthesis approaches are used, so the behavioural description of the system is refined into an accurate register-transfer-level (RTL) design for SoC implementation. In the recent MARTE (Modeling and Analysis of Real-Time and Embedded Systems) profile, a notion of multidimensional multiplicity has been proposed to model repetitive structures and topologies. This paper presents a new methodology for modeling NoCs based on Model-Driven Architecture and MARTE; it aims to prove the effectiveness of the MARTE standard in modeling irregular, or globally irregular but locally regular, architectures. We start from a high level of abstraction and proceed to a low level through generated VHDL code.
Data-parallel languages support a single instruction flow; the parallelism is expressed at the instruction level. In practice, data-parallel languages have chosen arrays to support parallelism. This regular data structure allows a natural development of regular parallel algorithms. The implementation of irregular algorithms requires a programming effort to project the irregular data structures onto regular structures. In this article we present the different techniques used to manage irregularity in data-parallel languages. Each of them is illustrated with standard or experimental data-parallel language constructs.

1 Irregularity and data-parallelism
First observe that the data-parallel and task-parallel programming models are derived directly from the SIMD and MIMD execution models. The first trace of data parallelism is seen in the first supercomputers, such as the Cray 1 or the Cyber 205, that provided a pipelined parallel execution model. The access to contiguous or r...
We describe the experience of three undergraduate computer science programs offering courses on parallel computing. In particular, we offer three different solutions to the problem of equipping a lab and discuss how those solutions may impact the content of the course.
Most data-parallel languages use arrays to support parallelism. This regular data structure allows a natural development of regular parallel algorithms. The implementation of irregular algorithms requires a programming effort to project the irregular data structures onto regular structures. We first propose in this paper a classification of existing data-parallel languages. We briefly describe their irregular and dynamic aspects, and derive different levels where irregularity and dynamicity may be introduced. We then propose a new irregular and dynamic data-parallel programming model, called Idole. Finally we discuss its integration in the C++ language, and present an overview of the Idole extension of C++.

1 Irregularity and Data-Parallelism
The evolution of data-parallel languages mimics closely the evolution of sequential languages. Keeping in mind efficiency and simplicity, compilers have supported, in a first step, only regular data structures: arrays in sequential l...
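The projection of an irregular structure onto regular arrays that both of these abstracts describe can be illustrated with one common technique, sketched here in Python (the function names are hypothetical): a ragged list of rows is flattened into one dense data vector plus a segment-start vector, the idea behind CSR-style representations, so a data-parallel sweep over a regular array becomes possible.

```python
# Projecting an irregular (ragged) structure onto regular arrays:
# a dense data vector plus a segment-start vector.

def flatten(ragged):
    data, starts = [], [0]
    for row in ragged:
        data.extend(row)
        starts.append(len(data))
    return data, starts

def row_sums(data, starts):
    # Each "virtual processor" could own one segment; here we iterate.
    return [sum(data[starts[r]:starts[r + 1]]) for r in range(len(starts) - 1)]

ragged = [[1, 2], [3], [], [4, 5, 6]]
data, starts = flatten(ragged)
print(data)                     # [1, 2, 3, 4, 5, 6]
print(starts)                   # [0, 2, 3, 3, 6]
print(row_sums(data, starts))   # [3, 3, 0, 15]
```

The programming effort the abstracts mention lies exactly in maintaining `starts` by hand; irregular data-parallel models aim to hide this bookkeeping.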
We give linear systolic array architectures for self-organizing linear lists using two hybrid schemes of the move-to-front and transpose heuristics, attempting to incorporate the best of both methodologies. The arrays accept input every clock cycle and have a number of processors equal to the length of the list, n. This design is then implemented to build high-speed lossless data compression hardware for data communication and storage, with a high compression ratio for both small and large files.
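The two heuristics the hybrids combine can be stated in a few lines. The sketch below is a plain sequential rendering in Python, not the paper's systolic design: move-to-front promotes an accessed key straight to the head of the list, while transpose swaps it one position forward.

```python
# The two self-organising list heuristics, sketched sequentially.
# (The paper implements hybrids of these on a systolic array.)

def move_to_front(lst, key):
    i = lst.index(key)           # access cost grows with position
    lst.insert(0, lst.pop(i))    # promote straight to the front
    return i

def transpose(lst, key):
    i = lst.index(key)
    if i > 0:                    # swap one position toward the front
        lst[i - 1], lst[i] = lst[i], lst[i - 1]
    return i

a = ['a', 'b', 'c', 'd']
move_to_front(a, 'c')   # a becomes ['c', 'a', 'b', 'd']
b = ['a', 'b', 'c', 'd']
transpose(b, 'c')       # b becomes ['a', 'c', 'b', 'd']
print(a, b)
```

Emitting the access positions returned by such a list is the basis of move-to-front coding, which is why self-organizing lists map naturally onto lossless compression hardware.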
In this paper we discuss the formal specification of parallel SIMD
execution. We outline a vector model to describe SIMD execution which
forms the basis of a semantic definition for a simple SIMD language
definition. The model is based upon the notion of atomic parallel SIMD
instructions operating on vectors of size Π where Π is the number
of PEs on the machine. The vector model for parallel SIMD execution is
independent of any specific computing architecture and can define
parallel SIMD execution on a real SIMD machine (with a limited number of
PEs) or a SIMD simulation. The model enables the formal specification of
SIMD languages by providing an underlying mathematical framework for the
SIMD paradigm.
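The vector model described above can be rendered as a toy interpreter. The following Python sketch is an illustration under assumptions, not the paper's formal semantics: one atomic SIMD instruction applies a single scalar operation across a vector of size Π (the PE count) under an activity mask, and inactive PEs keep their old values.

```python
# Toy rendering of the vector model: an atomic SIMD instruction operates
# on vectors of size PI (standing in for the symbol Π, the PE count).

PI = 8   # number of PEs

def simd_op(op, xs, ys, mask):
    """One atomic SIMD step: PE p computes op(xs[p], ys[p]) if mask[p] holds;
    inactive PEs keep their previous value."""
    assert len(xs) == len(ys) == len(mask) == PI
    return [op(x, y) if m else x for x, y, m in zip(xs, ys, mask)]

xs = list(range(PI))                      # one value per PE
ys = [10] * PI
mask = [p % 2 == 0 for p in range(PI)]    # only even-numbered PEs active
print(simd_op(lambda a, b: a + b, xs, ys, mask))
```

Because the model is parameterised only by Π, the same definition covers a real machine with Π physical PEs and a sequential simulation, as the abstract notes.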
Parallaxis is a machine-independent language for data-parallel
programming. Sequential Modula-2 is used as the base language.
Programming in Parallaxis is done on a level of abstraction with virtual
processors and virtual connections, which may be defined by the
application programmer. The paper describes a compiler and debugger for
Parallaxis-III, which allow the execution of massively parallel programs
on sequential workstations, especially for teaching purposes.
Parallaxis-III is an architecture-independent data parallel
programming language based on Modula-2. It has been designed for
teaching data parallel concepts and is in use at a large number of
institutions. Compilers exist for data parallel systems, as well as for
a sequential simulation system. A data parallel graphics debugger allows
efficient source level analysis for parallel programs.
This paper presents the result of a study in which we examined about 50 massively parallel programming languages in order to detect typical approaches towards supporting parallelism. Based on a classification into nine classes, semantic properties affecting the development of parallel programs are compared. From a consideration of the general function of programming languages in software engineering, we derive basic requirements on parallel languages.
Parallaxis is a programming language for massively parallel single
instruction-multiple data (SIMD) systems, based on Modula-2. There are
only a small number of additional constructs to handle parallel data
(vectors) and data exchange among processors or between the front-end
and back-end. Parallaxis helps to solve parallel problems in a natural
way and does not require special skills. The major language constructs
are described and a number of sample programs are given together with
their simulated processor element (PE) load and efficiency values.
Parallaxis is available as a simulation system which is chiefly used in
universities for instructional purposes. However, a compiler for the
massively parallel MasPar computer system has been finished, and a
compiler for the Connection Machine is being developed.
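The kind of construct this abstract refers to, a vector held one element per PE plus a data-exchange step along a virtual connection, can be sketched in Python (this is an illustration, not Parallaxis syntax; `propagate_right` is a hypothetical name):

```python
# Sketch of a SIMD data-exchange step: every PE passes its value to its
# right neighbour along a virtual connection, so each PE receives the
# value of its left neighbour; the boundary PE receives a fill value.

def propagate_right(vec, fill=0):
    return [fill] + vec[:-1]

vec = [1, 2, 3, 4]          # one element per virtual PE
print(propagate_right(vec)) # [0, 1, 2, 3]
```

A handful of such vector and exchange primitives on top of the sequential base language is what keeps the set of additional constructs small, as the abstract emphasises.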