Tool Description: Array programming in Pascal

Paul Cockshott, Ciaran Mcreesh, Susanne Oehler
University of Glasgow, School of Computing Science
Youssef Gdura
University of Tripoli
Abstract
A review of previous array Pascals leads on to a description of the
Glasgow Pascal compiler. The compiler is an ISO-Pascal superset
with semantic extensions to translate data parallel statements to
run on multiple SIMD cores. An appendix is given which includes
demonstrations of the tool.
Keywords Pascal, SIMD, Vector Processor, GPU
1. Previous array Pascals
Pascal[17, 20] was one of the first imperative programming languages
to be provided with array extensions. The first Array Pascal compiler,
Actus[24, 25], was roughly contemporary with the comparable
Distributed Array Processor Fortran[15, 25].
Turner’s Vector Pascal[28], another array extension of the language,
was strongly influenced by APL[18]. It was similar in its array
features to ZPL[2, 23, 27], Fortran 90[9] or Single Assignment
C[12, 26]. These were all developed to address the challenge of the
supercomputers that were coming into use at the time. Later Vector
Pascal implementations were developed at Saarland University[10, 21]
and the University of Glasgow[4, 5]. Pascal-XSC[13], an extension for
scientific data processing, provided extensions for vectors, matrices
and interval arithmetic but was not a general array language.
In Actus the syntax of array declarations indicated which
dimensions of the array were to be evaluated in parallel.
    var a:array[1:100,1..50] of integer;
Here the : rather than the .. is used to indicate that the dimension is
to be evaluated in parallel. Actus provided both parallel assignments
using index sets
    a[range]:=40:56;
and parallel compound statements using a dedicated construct.
The implicit assumption behind this design decision appears to
have been that there would be distributed processors each with their
own memory banks, so that the compiler would spread the array
over the banks using the i:j index form as a clue. This idea has
not been used in subsequent Vector Pascal dialects, which have been
designed for machines with a unified memory.
PLDI 2015. Copyright © 2015 ACM 978-1-4503-3584-3/15/06...$15.00.
http://dx.doi.org/10.1145/
2. Glasgow Vector Pascal
In what follows ‘Vector Pascal’ will refer to the Glasgow Vector
Pascal compiler. The implementation initially targeted modern
SIMD chips[7, 8, 11, 19], for which it used vectorisation techniques
similar to those in the contemporary Intel C compiler[1]. With the
advent of multi-core machines and GPUs, subsequent Vector Pascal
releases have supported automatic multi-core parallelism as well as
SIMD parallelism.
2.1 Parallelism
Vector Pascal uses implicit parallelism, obviating the need for
a dedicated parallel statement. Conventional loops will, in fact, be
vectorised if there are no data dependencies, but the spirit of the
language is to use APL-style array expressions. Thus one can write:
to operate on all corresponding elements of the three arrays. This is
semantically equivalent to:
The index vector is implicitly declared with sufficient elements
to index the array on the left of the assignment, with a scope
covering the right of the assignment statement. Index vectors are
usually elided provided that corresponding positions in arrays are
intended. The index vector can also be used explicitly to perform
things like circular shifts:
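The equivalence between a whole-array expression and an explicit loop over an index can be sketched in Python. This is only a model of the semantics described above, not Vector Pascal syntax; the names a_array, a_loop, b, c and shifted are hypothetical.

```python
# Illustrative model only: Python lists stand in for Pascal arrays.
b = [1, 2, 3, 4, 5]
c = [10, 20, 30, 40, 50]
n = len(b)

# Whole-array form: operate on all corresponding elements at once.
a_array = [x + y for x, y in zip(b, c)]

# Equivalent explicit form: an index runs over every position of the
# array on the left of the assignment.
a_loop = [0] * n
for i in range(n):
    a_loop[i] = b[i] + c[i]

# Using the index explicitly allows positions that do not correspond,
# for example a circular shift left by one place.
shifted = [b[(i + 1) % n] for i in range(n)]

print(a_array == a_loop)  # True
print(shifted)            # [2, 3, 4, 5, 1]
```

The modulo on the index is what makes the shift circular: the last position wraps round to read the first element.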
Let us assume that we want to compile the program to
execute on a 6-core Xeon using the AVX instruction set and 32-bit
addressing. We use the command
The compiler then transforms the code into:
The statement has been broken down into two forms of parallelism:
an outer loop that runs on different cores, each doing every 6th row,
and an inner loop that operates on 8 words at a time. The loops are
then placed in a nested procedure. The threads on the different cores
have access to the variables of the enclosing scope by virtue of
Pascal being a block structured language, but have local copies of
the loop variables. Access to the enclosing scope by the other cores
is ensured by sending a static link, held in a register, when posting
the job.
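The two levels of parallelism can be modelled roughly as follows. This Python sketch is an assumption-laden analogue, not the code the compiler emits: threads stand in for cores, an 8-element slice stands in for one AVX register of 32-bit words, and the names CORES, WIDTH, parallel_add and worker are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

CORES = 6   # worker count, matching the 6-core example in the text
WIDTH = 8   # elements per inner step, standing in for one AVX register

def parallel_add(b, c):
    """Compute a = b + c for two equally shaped 2-D lists."""
    a = [[0] * len(row) for row in b]

    def worker(start):
        # Nested function: it reaches a, b and c through the enclosing
        # scope, while start, i and j are local to each job.
        for i in range(start, len(a), CORES):        # every CORES-th row
            for j in range(0, len(a[i]), WIDTH):     # WIDTH words per step
                a[i][j:j + WIDTH] = [
                    x + y
                    for x, y in zip(b[i][j:j + WIDTH], c[i][j:j + WIDTH])
                ]

    with ThreadPoolExecutor(max_workers=CORES) as pool:
        # Each job starts on a different row, so no two jobs write the
        # same row of a.
        list(pool.map(worker, range(CORES)))
    return a
```

Because the row sets handled by different jobs are disjoint, the jobs need no locking; only the shared read access to the enclosing variables has to be arranged.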
Vector Pascal does not support parallel statements, but does
allow parallel expressions:
2.2 Map and reduce
Any dyadic operator ◦ can be used as a reduction operator using the
function form \◦, so \* computes the product of a vector, \+ its sum,
etc.
Function applications map over arrays. The following example
uses map and reduce.
returns the scalar added to the product of the elements of . It
is mapped over as follows
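As a hedged illustration of the same idea in Python (not Vector Pascal syntax), reduction by a dyadic operator corresponds to a fold, and mapping a function over an array corresponds to applying it to each element. The function f and the names reduce_with, v and xs are hypothetical.

```python
from functools import reduce
import operator

def reduce_with(op, vector):
    # Fold a dyadic operator over a non-empty vector.
    return reduce(op, vector)

def f(x, v):
    # Hypothetical example function: the scalar x added to the
    # product of the elements of v.
    return x + reduce_with(operator.mul, v)

v = [2, 3, 4]
xs = [1, 10, 100]

total = reduce_with(operator.add, v)    # sum of v: 9
product = reduce_with(operator.mul, v)  # product of v: 24
mapped = [f(x, v) for x in xs]          # f applied to each element of xs

print(total, product, mapped)  # 9 24 [25, 34, 124]
```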
If you have a matrix, transposing the matrix amounts to swapping
the order of the row and column indices. Applying the transpose
operator to the right hand side of an assignment therefore swaps its
indices: if the operand is a matrix, element (i, j) of the result is
element (j, i) of the operand for all i, j. The operator can also be
applied when the operand is a vector.
Generalisation of transpose is provided by an operator which
permutes the indices of an array, applied over all i, j, k for a
three-dimensional operand.
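Transposition as an index swap, and its generalisation to an index permutation, can be sketched in Python. The argument convention of permute below (the same as an axes tuple in many array libraries) is an assumption made for this sketch and is not claimed to match the Vector Pascal operator; both function names are hypothetical.

```python
from itertools import product

def transpose(m):
    # result[i][j] = m[j][i] for all i, j
    rows, cols = len(m), len(m[0])
    return [[m[j][i] for j in range(rows)] for i in range(cols)]

def permute(p, axes):
    # Three-dimensional case: dimension d of the result is dimension
    # axes[d] of p, so the indices are rearranged rather than the data.
    shape = (len(p), len(p[0]), len(p[0][0]))
    new_shape = tuple(shape[ax] for ax in axes)
    out = [[[None] * new_shape[2] for _ in range(new_shape[1])]
           for _ in range(new_shape[0])]
    for r in product(*(range(n) for n in new_shape)):
        s = [0, 0, 0]
        for d, ax in enumerate(axes):
            s[ax] = r[d]          # source index along axis ax
        out[r[0]][r[1]][r[2]] = p[s[0]][s[1]][s[2]]
    return out

m = [[1, 2, 3], [4, 5, 6]]
print(transpose(m))  # [[1, 4], [2, 5], [3, 6]]
```

With axes (2, 0, 1) the result satisfies out[k][i][j] = p[i][j][k] for all i, j, k.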
The . operator between arrays performs the scalar product. Thus
u.v is equivalent to the sum of products $\sum_\iota u_\iota \times v_\iota$.
To the extent that + and × are overloaded, so is the scalar product.
Thus when u and v are vectors of sets this evaluates as pairwise set
intersection reduced by set union.
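The way an overloaded scalar product generalises can be modelled in Python by passing the two operators explicitly. This is a sketch under the assumption that the product is the pairwise "times" reduced by "plus"; the function name inner and its parameters are hypothetical.

```python
from functools import reduce
import operator

def inner(u, v, times, plus):
    # Pairwise `times`, then reduction of the results with `plus`.
    return reduce(plus, (times(x, y) for x, y in zip(u, v)))

# Numbers: ordinary multiplication and addition.
print(inner([1, 2, 3], [4, 5, 6], operator.mul, operator.add))  # 32

# Sets: intersection plays the role of times, union the role of plus.
u = [{1, 2}, {3, 4}]
v = [{2, 5}, {4, 6}]
print(inner(u, v, operator.and_, operator.or_))  # {2, 4}
```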
Table 1. Compliance with ISO standard tests.
Compiler                  Failed   % Success
Free Pascal 2.6.2         34       80
Turbo Pascal 7            26       87
Vector Pascal Pentium     0        100
Vector Pascal Xeon Phi    4        97.6
2.3 Types
The Pascal standard[17] supports sets over cardinal types. Vector
Pascal extends this to any ordered type.
For cardinal element types sets are implemented as bitmaps and set
expressions vectorised using SIMD. Many Pascal implementations
have a maximum set size of $2^{8}$ elements. Vector Pascal supports
sets of up to $2^{31}$ elements. For non-cardinal element types the sets
are implemented as balanced trees and no vectorisation is used.
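A minimal Python sketch of the bitmap representation, assuming that element e is stored as bit e of an integer. It shows why set expressions reduce to bitwise operations that suit SIMD hardware; the helper names are hypothetical and this is not the compiler's actual data layout.

```python
def to_bitmap(elems):
    # Set bit e for every element e of the set.
    bits = 0
    for e in elems:
        bits |= 1 << e
    return bits

def from_bitmap(bits):
    # Recover the elements whose bits are set.
    return {i for i in range(bits.bit_length()) if (bits >> i) & 1}

a = to_bitmap({1, 3, 5})
b = to_bitmap({3, 4, 5})

union = a | b          # set union is a bitwise OR
intersection = a & b   # set intersection is a bitwise AND
has_four = bool((b >> 4) & 1)  # membership is a single bit test

print(from_bitmap(union))         # {1, 3, 4, 5}
print(from_bitmap(intersection))  # {3, 5}
```

A wide bitmap is just a sequence of machine words, so the OR and AND above can be applied to many words per instruction.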
Dynamic sized arrays are supported as described by the Extended
Standard[14].
is now a pointer to a vector of 10 reals.
Sub-array expressions of the form return a dynamic
array with bounds
Literate programming[22] is supported by the compiler. A
flag on the compiler command line causes a LaTeX documentation
file to be created. Specially delimited comments are treated as
inline LaTeX, and a second comment form is rendered as marginal notes.
Pascal code is reformatted to typographically distinguish reserved
words and variable names. Formulae are rendered with appropriate
maths notation.
Source code can be in UTF8 Unicode, and variable names can
be in Roman, Greek, Cyrillic or CJK characters. Chinese equivalent
reserved words are supported.
3. Implementation
The compiler is written in Java and is released from SourceForge
under the GPL. It uses an external toolchain for linking and targets a
range of contemporary and recent instruction sets: Pentium,
Opteron[19], SSE, SSE2, AVX, Playstation 2 (MIPS), Playstation 3
(Cell)[11], Nvidia and the Intel Knights Ferry[6, 16]. The relevant
assemblers must, of course, be installed. In addition, unsupported
architectures can be targeted by an option which translates Vector
Pascal to C and uses a C compiler to generate the binary. For the Cell
and Nvidia implementations, the compiler generates code for an
abstract SIMD machine that is implemented either in C on the vector
processors or in CUDA on the GPU.
Performance achieved on Intel AVX and SSE architectures is
comparable to the use of C with vector intrinsics and Threading
Building Blocks[3]. On GPUs, however, it does not perform as well as
CUDA, although Vector Pascal source code tends to be more compact
than C or CUDA for the same task.
Compliance with the ISO language standard is above that of
some other leading Pascal compilers; see Table 1. The ISO-Pascal
conformance test suite comprises 218 programmes designed to test
each feature of the language standard. From the ISO test set a
subset (footnote 1) that tests obsolete file i/o features was
excluded, as all three compilers follow the Turbo Pascal syntax for
file operations. We ran the test suite using the host Vector Pascal
compiler and in cross compiler mode for the Xeon Phi. A programme was
counted as a pass if it compiled and printed the correct result. A
fail was recorded if compilation did not succeed or the programme, on
execution, failed to deliver the correct result.
4. Future work
We have a number of ongoing student projects both to extend the
Pascal system, and to add new front ends to it.
1. We are extending parallel reduction operations in Pascal to allow
   arbitrary dyadic functions, as opposed to operators, to be used
   for reduction.
2. We are building a front end for the Haggis language, used for
   teaching in Scottish schools, that uses the code generator
   subsystems used in the Pascal compiler.
3. We have a prototype Vector C front end for the compiler. This
   supports similar parallelisation mechanisms to Vector Pascal
   using a Matlab style array syntax. For example:
when compiled and executed produces as output:
here stands for Glasgow C Compiler. This prototype is not
yet fully conformant with the C standard.
References
[1] Aart J. C. Bik, Milind Girkar, Paul M. Grey, and Xinmin Tian. Automatic intra-register vectorization for the Intel architecture. Int. J. Parallel Program., 30(2):65–98, 2002.
[2] Bradford L Chamberlain, Sung-Eun Choi, C Lewis, Calvin Lin, Lawrence Snyder, and W Derrick Weathersby. ZPL: A machine independent programming language for parallel computers. Software Engineering, IEEE Transactions on, 26(3):197–211, 2000.
[3] P Cockshott, Y Gdura, and Paul Keir. Array languages and the n-body problem. Concurrency and Computation: Practice and Experience, 26(4):935–951, 2014.
[4] Paul Cockshott. Vector Pascal reference manual. SIGPLAN Not., 37(6):59–81, 2002.
[5] Paul Cockshott and Greg Michaelson. Orthogonal parallel processing in Vector Pascal. Computer Languages, Systems & Structures, 32(1):2–41, 2006.
Footnote 1: Tests 1, 3, 5, 19, 54, 67..76, 78, 90..92, 111..115, 118, 121, 131, 141, 160, 197, 198, 202, 203, 212, 213.
[6] William Paul Cockshott, Susanne Oehler, and Tian Xu. Developing a compiler for the XeonPhi (TR-2014-341). University of Glasgow, 2014.
[7] W.P. Cockshott and A. Koliousis. The SCC and the SICSA multi-core challenge. In 4th MARC Symposium, December 2011.
[8] Peter Cooper. Porting the Vector Pascal Compiler to the Playstation 2. Master’s thesis, University of Glasgow Dept of Computing Science, http://www.dcs.gla.ac.uk/~wpc/reports/compilers/compilerindex/PS2.pdf, 2005.
[9] AK Ewing, H Richardson, AD Simpson, and R Kulkarni. Writing Data Parallel Programs with High Performance Fortran. Edinburgh Parallel Computing Centre, 1998.
[10] A. Formella, A. Obe, WJ Paul, T. Rauber, and D. Schmidt. The SPARK 2.0 system: a special purpose vector processor with a VectorPASCAL compiler. In System Sciences, 1992. Proceedings of the Twenty-Fifth Hawaii International Conference on, volume 1, pages 547–558. IEEE, 1992.
[11] Youssef Omran Gdura. A new parallelisation technique for heterogeneous CPUs. PhD thesis, University of Glasgow, 2012.
[12] C. Grelck and S.-B. Scholz. SAC: From High-level Programming with Arrays to Efficient Parallel Execution. Parallel Processing Letters, 13(3):401–412, 2003.
[13] R Hammer, M Neaga, and D Ratz. Pascal-XSC. New Concepts for Scientific Computation and Numerical Data Processing, pages 15–44, 1992.
[14] Tony Hetherington. An introduction to the Extended Pascal language. ACM SIGPLAN Notices, 28(11):42–51, 1993.
[15] DAP ICL. Fortran language reference manual. ICL Technical Publication TP6918, 1979.
[16] Intel Corporation. Intel Xeon Phi Product Family: Product Brief, April 2014.
[17] ISO. Pascal ISO 7185, 1990.
[18] K. Iverson. A programming language. Wiley, New York, 1966.
[19] Iain Jackson. Opteron Support for Vector Pascal. Final year thesis, Dept Computing Science, University of Glasgow, 2004.
[20] Kathleen Jensen, Niklaus Wirth, Andrew B Mickel, and James F Miner. Pascal: user manual and report, volume 3. Springer-Verlag New York, 1975.
[21] Christoph W Kessler, Wolfgang J Paul, and Thomas Rauber. Scheduling vector straight line code on vector processors. In Code Generation: Concepts, Tools, Techniques, pages 73–91. Springer, 1992.
[22] Donald Ervin Knuth. Literate programming. The Computer Journal, 27(2):97–111, 1984.
[23] Calvin Lin and Lawrence Snyder. ZPL: An array sublanguage. In Languages and Compilers for Parallel Computing, pages 96–114. Springer, 1994.
[24] R. H. Perrott. A Language for Array and Vector Processors. ACM Trans. Program. Lang. Syst., 1(2):177–195, October 1979.
[25] R. H. Perrott and A. Zarea-Aliabadi. Supercomputer languages. ACM Comput. Surv., 18(1):5–22, 1986.
[26] S.-B. Scholz. Single Assignment C: Efficient Support for High-Level Array Operations in a Functional Setting. Journal of Functional Programming, 13(6):1005–1059, 2003.
[27] L Snyder. A Programmer’s Guide to ZPL. MIT Press, Cambridge, 1999.
[28] T Turner. Vector Pascal: a Computer Programming Language for the Array Processor. PhD thesis, Iowa State University, USA, 1987.
Appendix Demo
Description
Here is a scaled up version of the programme described earlier
It performs 2*800*1024*100 = 163,840,000 (about 164 million)
arithmetic operations. We can compile it for the default Pentium code
model and produce a LaTeX listing file thus:
Running it on an AMD A6 we get
We can now compile it for the AVX instruction-set
This vectorises the code so it runs much faster
It can be further accelerated by multicore compilation. Note that it
is not worth using more than 2 cores on this model of CPU, as there
are only 2 vector floating point units shared between the 4 cores,
and that on programmes as small as this gains from parallelism are
not guaranteed. We get the following code for the inner loop:
Now let us look at the listings,
Or we can run LaTeX on the file and get a pretty-printed
version looking like this:
[Pretty-printed listing: section 4.1 bar, which refers to Section 4.2, and section 4.2 foo.]
Next let us compare the performance of Vector Pascal with C
when blurring a 1024x1024 pixel colour image. The same separable
convolution algorithm is used in both cases. The addition of the C
file on the compiler command line instructs it to link the Pascal and
C in a single binary.
Pascal outperforms C in this example because it uses saturated
SIMD arithmetic on pixels.
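For readers unfamiliar with the technique, a separable convolution with saturating arithmetic can be sketched in Python as below. This is not the benchmark code: the kernel, the 8-bit pixel range, the edge handling by clamping the index, and all function names are assumptions made for illustration.

```python
def saturate(x):
    # Clamp to the 8-bit pixel range instead of wrapping around.
    return max(0, min(255, x))

def blur_1d(line, kernel):
    # Convolve one row or column with an odd-length kernel; indices
    # that fall outside the line are clamped to its ends.
    half = len(kernel) // 2
    n = len(line)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(n - 1, max(0, i + k - half))
            acc += w * line[j]
        out.append(saturate(round(acc)))
    return out

def blur(image, kernel):
    # Separable blur: a horizontal pass over the rows followed by a
    # vertical pass over the columns of the intermediate result.
    rows = [blur_1d(r, kernel) for r in image]
    cols = [blur_1d(list(c), kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

image = [[100] * 5 for _ in range(5)]
print(blur(image, [0.25, 0.5, 0.25])[2])  # [100, 100, 100, 100, 100]
```

Two one-dimensional passes cost a number of multiplications per pixel proportional to twice the kernel length, rather than to its square as a full two-dimensional kernel would. Saturation keeps an overflowing sum at the brightest value, where modular arithmetic would wrap it to a dark one.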
Finally as a bit of fun, matrix product of numbers and strings to
print a Roman number:
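A rough Python analogue of the idea, under the assumption that multiplying a number by a string repeats the string and that addition of strings concatenates them, so that an inner product of a vector of counts with a vector of symbols spells the numeral. The greedy decomposition in counts and all names here are hypothetical additions for this sketch.

```python
SYMBOLS = ["M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"]
VALUES = [1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1]

def counts(n):
    # Greedy decomposition: how many times each symbol is needed.
    out = []
    for v in VALUES:
        q, n = divmod(n, v)
        out.append(q)
    return out

def roman(n):
    # Inner product of numbers and strings: "times" repeats a symbol,
    # "plus" concatenates the pieces.
    return "".join(c * s for c, s in zip(counts(n), SYMBOLS))

print(roman(1987))  # MCMLXXXVII
```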
Acknowledgments
Thanks to the many Glasgow University students whose term
projects contributed to the compiler and to CloPeMa, a collaborative
project funded by the EU FP7-ICT, 288553.