Conference Paper

Compact native code generation for dynamic languages on micro-core architectures

Authors: Maurice Jamieson, Nick Brown

... vPython can either be run standalone on the device or as a Domain-Specific Language (DSL) within Python running on the host, offloading kernels for execution to the device. More information on the parallel programming, offloading and dynamic code loading capabilities of the language can be found in [22] and [21]. ...
... Although this is an issue for PicoRV32 binaries, it impacts both Olympus and native C kernels. Therefore, a more detailed discussion of possible mitigations for this issue will not be provided, except to highlight the benefits of the Olympus dynamic code loading mechanism discussed in [21]. ...
... Further work includes exploring automatic memory management for data and code, optimisation of the Olympus abstract machine, automatic dynamic function selection for the dynamic loading support discussed in [21], additional data types (byte arrays) to minimise the memory footprint of data and additional device support (GPUs and FPGAs) using OpenCL C and Xilinx HLS C. ...
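The excerpts above describe vPython kernels being offloaded from host Python for execution on a micro-core device. The following minimal sketch illustrates that host/device split in plain Python; the decorator name, device label and marshalling behaviour are assumptions for illustration only, not the actual vPython/Vipera interface.

```python
# Hypothetical sketch only: 'offload', the device name and the marshalling
# behaviour are illustrative stand-ins, NOT the real vPython/Vipera API.
import numpy as np

def offload(device):
    """Pretend decorator: in a vPython-style DSL the kernel body would be
    compiled to compact native code and shipped to the micro-core device."""
    def wrap(kernel):
        def run(*args):
            # A real implementation would marshal arguments by reference,
            # dynamically load the compiled kernel and launch it on the device;
            # here we simply execute on the host so the sketch stays runnable.
            return kernel(*args)
        return run
    return wrap

@offload(device="epiphany")   # illustrative device name
def vecsum(a, b):
    # Kernel written in the Python subset: a simple loop over flat arrays.
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out

a = np.arange(8, dtype=np.float32)
b = np.ones(8, dtype=np.float32)
print(vecsum(a, b))
```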
Preprint
Vipera provides a compiler and runtime framework for implementing dynamic Domain-Specific Languages on micro-core architectures. The performance and code size of the generated code are critical on these architectures. In this paper we present the results of our investigations into the efficiency of Vipera in terms of code performance and size.
Chapter
Vipera provides a compiler and runtime framework for implementing dynamic Domain-Specific Languages on micro-core architectures. The performance and code size of the generated code are critical on these architectures. In this paper we present the results of our investigations into the efficiency of Vipera in terms of code performance and size.
Keywords: Domain-specific languages, Python, native code generation, RISC-V, micro-core architectures
Conference Paper
Full-text available
In the presence of ever-changing computer architectures, high-quality optimising compiler backends are moving targets that require specialist knowledge and sophisticated algorithms. In this paper, we explore a new backend for the Glasgow Haskell Compiler (GHC) that leverages the Low Level Virtual Machine (LLVM), a new breed of compiler written explicitly for use by other compiler writers, not high-level programmers, that promises to enable outsourcing of low-level and architecture-dependent aspects of code generation. We discuss the conceptual challenges and our backend design. We also provide an extensive quantitative evaluation of the performance of the backend and of the code it produces.
Article
Full-text available
The Spineless Tagless G-machine is an abstract machine designed to support non-strict higher-order functional languages. This presentation of the machine falls into three parts. Firstly, we give a general discussion of the design issues involved in implementing non-strict functional languages. Next, we present the STG language, an austere but recognizably-functional language, which as well as a denotational meaning has a well-defined operational semantics. The STG language is the ‘abstract machine code’ for the Spineless Tagless G-machine. Lastly, we discuss the mapping of the STG language onto stock hardware. The success of an abstract machine model depends largely on how efficient this mapping can be made, though this topic is often relegated to a short section. Instead, we give a detailed discussion of the design issues and the choices we have made. Our principal target is the C language, treating the C compiler as a portable assembler.
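The abstract above highlights generating C and treating the C compiler as a portable assembler, which is also the strategy behind compact native code generation for dynamic languages. The toy code generator below, written in Python, lowers a tiny expression tree to C source to make that idea concrete; it is an illustrative sketch, not the STG-to-C mapping itself.

```python
# Illustrative only: a toy code generator that lowers a tiny expression tree
# to C source, in the spirit of using the C compiler as a portable assembler.
# Expressions are nested tuples:
#   ("lit", n), ("var", name), ("add", x, y), ("mul", x, y)

def emit(expr):
    tag = expr[0]
    if tag == "lit":
        return str(expr[1])
    if tag == "var":
        return expr[1]
    op = {"add": "+", "mul": "*"}[tag]
    return f"({emit(expr[1])} {op} {emit(expr[2])})"

def compile_to_c(name, params, body):
    # Produce a self-contained C function that a C compiler turns into native code.
    args = ", ".join(f"int {p}" for p in params)
    return f"int {name}({args}) {{ return {emit(body)}; }}\n"

# a * x + y
src = compile_to_c("axpy", ["a", "x", "y"],
                   ("add", ("mul", ("var", "a"), ("var", "x")), ("var", "y")))
print(src)   # int axpy(int a, int x, int y) { return ((a * x) + y); }
```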
Article
Full-text available
We provide an overview of the key architectural features of recent microprocessor designs and describe the programming model and abstractions provided by OpenCL, a new parallel programming standard targeting these architectures.
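To show what the OpenCL programming model looks like from the document's host language, here is a minimal vector-add example using the pyopencl binding. pyopencl is an assumption of this sketch (the abstract describes OpenCL itself, not this binding); the kernel is ordinary OpenCL C compiled for the device at run time.

```python
# Minimal OpenCL host + kernel example via pyopencl (pyopencl is an assumption
# of this sketch; the abstract above describes the OpenCL model in general).
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

a = np.arange(1024, dtype=np.float32)
b = np.arange(1024, dtype=np.float32)
mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

# The kernel is written in OpenCL C and built for the target device at run time.
program = cl.Program(ctx, """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *out) {
    int i = get_global_id(0);
    out[i] = a[i] + b[i];
}
""").build()

program.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)
out = np.empty_like(a)
cl.enqueue_copy(queue, out, out_buf)
print(out[:4])
```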
Article
Micro-core architectures combine many low memory, low power computing cores together in a single package. These are attractive for use as accelerators but due to limited on-chip memory and multiple levels of memory hierarchy, the way in which programmers offload kernels needs to be carefully considered. In this paper we use Python as a vehicle for exploring the semantics and abstractions of higher level programming languages to support the offloading of computational kernels to these devices. By moving to a pass by reference model, along with leveraging memory kinds, we demonstrate the ability to easily and efficiently take advantage of multiple levels in the memory hierarchy, even ones that are not directly accessible to the micro-cores. Using a machine learning benchmark, we perform experiments on both Epiphany-III and MicroBlaze based micro-cores, demonstrating the ability to compute with data sets of arbitrarily large size. To provide context of our results, we explore the performance and power efficiency of these technologies, demonstrating that whilst these two micro-core technologies are competitive within their own embedded class of hardware, there is still a way to go to reach HPC class GPUs.
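The abstract above describes a pass-by-reference model with memory kinds, so that kernels can work on data sets far larger than the micro-cores' on-chip memory. The sketch below models that idea in plain Python; the class and kind names are hypothetical stand-ins introduced for illustration, not the framework's actual API.

```python
# Hypothetical sketch: 'RemoteArray', the memory-kind labels and 'offload_dot'
# are illustrative names, not the actual API of the paper's Python framework.
import numpy as np

class RemoteArray:
    """Models pass-by-reference: the host keeps the data, and a device kernel
    receives a handle and streams only chunks that fit in on-chip memory."""
    def __init__(self, data, kind):
        self.data = data    # backing store, e.g. host DRAM or shared memory
        self.kind = kind    # memory kind label, e.g. "host", "shared", "core-local"

    def chunks(self, size):
        for start in range(0, len(self.data), size):
            yield start, self.data[start:start + size]

def offload_dot(x: RemoteArray, y: RemoteArray, chunk=256):
    # The device only ever holds 'chunk' elements of each operand at a time,
    # so the data set can be arbitrarily larger than on-chip memory.
    total = 0.0
    for (_, xa), (_, ya) in zip(x.chunks(chunk), y.chunks(chunk)):
        total += float(np.dot(xa, ya))
    return total

x = RemoteArray(np.random.rand(10_000).astype(np.float32), kind="host")
y = RemoteArray(np.random.rand(10_000).astype(np.float32), kind="host")
print(offload_dot(x, y))
```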
Conference Paper
Static compilation, a.k.a. ahead-of-time (AOT) compilation, is an alternative approach to JIT compilation that can combine good speed with a lightweight memory footprint, and that can accommodate the read-only memory constraints imposed by some devices and operating systems. Unfortunately, the highly dynamic nature of JavaScript makes it hard to compile statically, and all existing AOT compilers have either given up on good performance or on full language support. We have designed and implemented an AOT compiler that aims at satisfying both. It supports full unrestricted ECMAScript 5.1 plus many ECMAScript 2017 features and the majority of benchmarks are within 50
Book
With the SPARC (Scalable Processor ARChitecture) architecture and system software as the underlying foundation, Sun Microsystems is delivering a new model of computing, easy workgroup computing, to enhance the way people work, automating processes across groups, departments, and teams locally and globally. Sun and a large and growing number of companies in the computer industry have embarked on a new approach to meet the needs of computer users and system developers in the 1990s. Originated by Sun, the approach targets users who need a range of compatible computer systems with a variety of application software and want the option to buy those systems from a choice of vendors. The approach also meets the needs of system developers to be part of a broad, growing market of compatible systems and software: developers who need to design products quickly and cost-effectively. The SPARC approach ensures that computer systems can be easy to use for all classes of users and members of the workgroup: end users, system administrators, and software developers. For the end user, the SPARC technologies facilitate system set-up and the daily use of various applications. For the system administrator supporting the computer installation, setting up and monitoring the network are easier. For the software developer, there are advanced development tools and support. Furthermore, the features of the SPARC hardware and software technologies ensure that SPARC systems and applications play an important role in the years to come.
Conference Paper
Dynamic, interpreted languages, like Python, are attractive for domain experts and scientists experimenting with new ideas. However, the performance of the interpreter is often a barrier when scaling to larger data sets. This paper presents a just-in-time compiler for Python that focuses on scientific and array-oriented computing. Starting with the simple syntax of Python, Numba compiles a subset of the language into efficient machine code that is comparable in performance to a traditional compiled language. In addition, we share our experience in building a JIT compiler using LLVM [1].
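The abstract describes Numba's decorator-driven JIT compilation; a small example of that usage pattern follows (Numba's public `@njit` decorator with an otherwise ordinary Python loop; the function and data here are illustrative).

```python
# Numba usage in the style the abstract describes: a decorated Python function
# is JIT-compiled to machine code via LLVM on its first call.
import numpy as np
from numba import njit

@njit
def mean_square(a):
    total = 0.0
    for i in range(a.shape[0]):   # explicit loop, compiled to native code
        total += a[i] * a[i]
    return total / a.shape[0]

data = np.random.rand(1_000_000)
print(mean_square(data))   # first call triggers compilation, later calls run natively
```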
Conference Paper
An intermediate language for the machine independent compilation of ALGOL 68 is described. It makes very few assumptions on the target machine but provides a strong descriptive mechanism for abstract machine objects by which they can easily be mapped on target machine objects.
Book
This work is a textbook for an undergraduate course in compiler construction.
Article
We present an extensive, annotated bibliography of the abstract machines designed for each of the main programming paradigms (imperative, object oriented, functional, logic and concurrent). We conclude that whilst a large number of efficient abstract machines have been designed for particular language implementations, relatively little work has been done to design abstract machines in a systematic fashion.
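Since abstract machines recur throughout this listing (for example the Olympus abstract machine mentioned in the excerpts above), a toy example may help make the term concrete. The stack machine below is a generic illustration in Python, not any of the machines surveyed in the bibliography.

```python
# A toy stack-based abstract machine: programs are lists of (opcode, operand)
# pairs executed by a small dispatch loop. Purely illustrative.
def run(program):
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

# (2 + 3) * 4
print(run([("push", 2), ("push", 3), ("add", None),
           ("push", 4), ("mul", None)]))   # -> 20
```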
Maurice Jamieson and Nick Brown. Poster 99: Eithne: A Framework for Benchmarking Micro-Core Accelerators. https://sc19.supercomputing.org/proceedings/tech_poster/tech_poster_pages/rpost186.html
George Seif. Here's how you can get some free speed on your Python code with Numba. https://towardsdatascience.com/heres-how-you-can-get-some-free-speed-on-your-python-code-with-numba-89fdc8249ef3
Clifford Wolf. PicoRV32: A Size-Optimized RISC-V CPU. https://github.com/cliffordwolf/picorv32
Olivier Brunet. 2020. ELF binary compilation of a python script - part 1: Cython. https://obrunet.github.io/pythonic%20ideas/compilation_cython/
George Seif. Use Cython to get more than 30X speedup on your Python code. https://towardsdatascience.com/use-cython-to-get-more-than-30x-speedup-on-your-python-code-f6cb337919b6
Robin Hunter. The Essence of Compilers.