Conference Paper

Pallene: a statically typed companion language for Lua


Abstract

The simplicity and flexibility of dynamic languages make them popular for prototyping and scripting, but the lack of compile-time type information makes it very challenging to generate efficient executable code. Inspired by ideas from scripting, just-in-time compilers, and optional type systems, we are developing Pallene, a statically typed companion language to the Lua scripting language. Pallene is designed to be amenable to standard ahead-of-time compilation techniques, to interoperate seamlessly with Lua (even sharing its runtime), and to be familiar to Lua programmers. In this paper, we compare the performance of the Pallene compiler against LuaJIT, a just-in-time compiler for Lua, and against C extension modules. The results suggest that Pallene can achieve similar levels of performance.


... MicroPython performs the compilation of bytecode to native code on the device [8] similar to JIT except that the bytecode is not profiled as is common for JIT compilers, rather the bytecode is just lowered to native code. An alternative approach was taken for Vipera, similar to that employed by the Pallene / Titan compiler [17] for Lua [18]. Here, the source language compiler, running on the host, emits C source code that is then compiled to generate native binary executables. ...
Preprint
Vipera provides a compiler and runtime framework for implementing dynamic Domain-Specific Languages on micro-core architectures. The performance and code size of the generated code is critical on these architectures. In this paper we present the results of our investigations into the efficiency of Vipera in terms of code performance and size.
... Pallene. Consider now the grammar for Pallene [11], a statically typed programming language derived from Lua. It is interesting to see the granularity we can achieve with the cut mechanism proposed here. ...
Preprint
Parsing Expression Grammars (PEGs) are a recognition-based formalism for describing the syntactical and lexical elements of a language. The main difference between Context-Free Grammars (CFGs) and PEGs lies in the interpretation of the choice operator: while the CFGs' unordered choice e | e' is interpreted as the union of the languages recognized by e and e', the PEGs' prioritized choice e/e' discards e' if e succeeds. This subtle but important difference changes the language recognized and yields more efficient parsing algorithms. This paper proposes a rewriting logic semantics for PEGs. We start with a rewrite theory giving meaning to the usual constructs in PEGs. Later, we show that cuts, a mechanism for controlling backtracks in PEGs, also find a natural representation in our framework. We generalize this mechanism, allowing for both local and global cuts with a precise, unified and formal semantics. Hence, our work strives at better understanding and controlling backtracks in parsers for PEGs. The semantics we propose is executable and, besides being a parser with modest efficiency, it can be used as a playground to test different optimization ideas. More importantly, it is a mathematical tool that can be used for different analyses.
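The difference between prioritized and unordered choice described in this abstract can be illustrated with a minimal sketch. The combinator names (`lit`, `seq`, `choice`, `peg_accepts`, `cfg_accepts`) and the toy grammar are assumptions made for illustration only; they are not taken from the paper, and the sketch does not model cuts or the rewriting logic semantics.

```python
# Illustrative mini-combinators (hypothetical names, not from the paper).
# A parser takes (text, position) and returns the new position, or None on failure.

def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, i):
        return i + len(s) if text.startswith(s, i) else None
    return parse

def seq(*parsers):
    """Match each parser in order; fail as soon as one fails."""
    def parse(text, i):
        for p in parsers:
            i = p(text, i)
            if i is None:
                return None
        return i
    return parse

def choice(*parsers):
    """PEG prioritized choice e / e': commit to the first alternative that
    succeeds locally; later alternatives are never retried afterwards."""
    def parse(text, i):
        for p in parsers:
            j = p(text, i)
            if j is not None:
                return j
        return None
    return parse

def peg_accepts(parser, text):
    """Accept only if the parser consumes the whole input."""
    return parser(text, 0) == len(text)

# PEG:  S <- ("a" / "ab") "c"
grammar = seq(choice(lit("a"), lit("ab")), lit("c"))

def cfg_accepts(text):
    """CFG reading of S -> ("a" | "ab") "c": the union of both alternatives."""
    return text in {"ac", "abc"}

if __name__ == "__main__":
    for word in ("ac", "abc"):
        print(word, "PEG:", peg_accepts(grammar, word), "CFG:", cfg_accepts(word))
```

Under the PEG reading, the input "abc" is rejected: "a" succeeds first, so "ab" is discarded and the following "c" fails against "b". Under the CFG reading the same input is accepted.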
... Gualandi and Ierusalimschy [GI18] identified a benchmark, based on Conway's Game of Life, as being GC-heavy. Its simplicity makes for a good introductory setting to analyze and dissect the application of subheaps in full detail. ...
Article
Automated memory management avoids the tedium and danger of manual techniques. However, as no programmer input is required, no widely available interface exists to permit principled control over sometimes unacceptable performance costs. This dissertation explores the idea that performance-oriented languages should give programmers greater control over where and when the garbage collector (GC) expends effort. We describe an interface and implementation to expose heap partitioning and collection decisions without compromising type safety. We show that our interface allows the programmer to encode a form of reference counting using Hayes' notion of key objects. Preliminary experimental data suggests that our proposed mechanism can avoid high overheads suffered by tracing collectors in some scenarios, especially with tight heaps. However, for other applications, the costs of applying subheaps---in human effort and runtime overheads---remain daunting.
Chapter
Vipera provides a compiler and runtime framework for implementing dynamic Domain-Specific Languages on micro-core architectures. The performance and code size of the generated code is critical on these architectures. In this paper we present the results of our investigations into the efficiency of Vipera in terms of code performance and size. Keywords: Domain-specific languages, Python, native code generation, RISC-V, micro-core architectures
Article
The literature presents many strategies for enforcing the integrity of types when typed code interacts with untyped code. This paper presents a uniform evaluation framework that characterizes the differences among some major existing semantics for typed–untyped interaction. Type system designers can use this framework to analyze the guarantees of their own dynamic semantics.
Article
The simplicity and flexibility of dynamic languages make them popular for prototyping and scripting, but the lack of compile-time type information makes it challenging to generate efficient executable code. Inspired by ideas from scripting, just-in-time compilers, and optional type systems, we have developed Pallene, a typed companion language to the Lua scripting language, intended for writing lower-level libraries and extension modules. It plays the role of a system language in the scripting paradigm, but one that has been explicitly designed to interoperate with Lua. Pallene is designed to be efficient, to interoperate seamlessly with Lua (even sharing its runtime), and to be familiar to Lua programmers. It should also be simple, easy to understand, and easy to implement, so it can be as portable and maintainable as Lua itself. In this paper, we discuss the rationale for Pallene's design and present a description of its syntax, type system, and semantics. We also compare the performance of Pallene extension modules against pure Lua (both interpreted and JIT-compiled), against C extension modules (operating with Lua data structures), and also against programs fully written in C, which provide a familiar baseline. The results corroborate our hypothesis that Pallene has the potential to be a low-level counterpart to Lua. The performance of Lua extension modules written in Pallene can be better than that of equivalent modules written in C and it is competitive with the performance from a JIT compiler, despite the vastly simpler implementation. This is a revised and extended version of an SBLP 2018 paper with a similar title [1].
Conference Paper
Programmers have come to embrace dynamically-typed languages for prototyping and delivering large and complex systems. When it comes to maintaining and evolving these systems, the lack of explicit static typing becomes a bottleneck. In response, researchers have explored the idea of gradually-typed programming languages which allow the incremental addition of type annotations to software written in one of these untyped languages. Some of these new, hybrid languages insert run-time checks at the boundary between typed and untyped code to establish type soundness for the overall system. With sound gradual typing, programmers can rely on the language implementation to provide meaningful error messages when type invariants are violated. While most research on sound gradual typing remains theoretical, the few emerging implementations suffer from performance overheads due to these checks. None of the publications on this topic comes with a comprehensive performance evaluation. Worse, a few report disastrous numbers. In response, this paper proposes a method for evaluating the performance of gradually-typed programming languages. The method hinges on exploring the space of partial conversions from untyped to typed. For each benchmark, the performance of the different versions is reported in a synthetic metric that associates runtime overhead to conversion effort. The paper reports on the results of applying the method to Typed Racket, a mature implementation of sound gradual typing, using a suite of real-world programs of various sizes and complexities. Based on these results the paper concludes that, given the current state of implementation technologies, sound gradual typing faces significant challenges. Conversely, it raises the question of how implementations could reduce the overheads associated with soundness and how tools could be used to steer programmers clear from pathological cases.
Article
Virtual Machines (VMs) with Just-In-Time (JIT) compilers are traditionally thought to execute programs in two phases: first the warmup phase determines which parts of a program would most benefit from dynamic compilation; after compilation has occurred the program is said to be at peak performance. When measuring the performance of JIT compiling VMs, data collected during the warmup phase is generally discarded, placing the focus on peak performance. In this paper we run a number of small, deterministic benchmarks on a variety of well known VMs. In our experiment, less than one quarter of the benchmark/VM pairs conform to the traditional notion of warmup, and none of the VMs we tested consistently warms up in the traditional notion. This raises a number of questions about VM benchmarking, which are of interest to both VM authors and end users.
Article
Combining static and dynamic typing within the same language offers clear benefits to programmers. It provides dynamic typing in situations that require rapid prototyping, heterogeneous data structures, and reflection, while supporting static typing when safety, modularity, and efficiency are primary concerns. Siek and Taha (2006) introduced an approach to combining static and dynamic typing in a fine-grained manner through the notion of type consistency in the static semantics and run-time casts in the dynamic semantics. However, many open questions remain regarding the semantics of gradually typed languages. In this paper we present Reticulated Python, a system for experimenting with gradual-typed dialects of Python. The dialects are syntactically identical to Python 3 but give static and dynamic semantics to the type annotations already present in Python 3. Reticulated Python consists of a typechecker and a source-to-source translator from Reticulated Python to Python 3. Using Reticulated Python, we evaluate a gradual type system and three approaches to the dynamic semantics of mutable objects: the traditional semantics based on Siek and Taha (2007) and Herman et al. (2007) and two new designs. We evaluate these designs in the context of several third-party Python programs.
Article
The HipHop Virtual Machine (HHVM) is a JIT compiler and runtime for PHP. While PHP values are dynamically typed, real programs often have latent types that are useful for optimization once discovered. Some types can be proven through static analysis, but limitations in the ahead-of-time approach leave some types to be discovered at run time. And even though many values have latent types, PHP programs can also contain polymorphic variables and expressions, which must be handled without catastrophic slowdown. HHVM discovers latent types by structuring its JIT around the concept of a tracelet. A tracelet is approximately a basic block specialized for a particular set of run-time types for its input values. Tracelets allow HHVM to exactly and efficiently learn the types observed by the program, while using a simple compiler. This paper shows that this approach enables HHVM to achieve high levels of performance, without sacrificing compatibility or interactivity.
Conference Paper
Building high-performance virtual machines is a complex and expensive undertaking; many popular languages still have low-performance implementations. We describe a new approach to virtual machine (VM) construction that amortizes much of the effort in initial construction by allowing new languages to be implemented with modest additional effort. The approach relies on abstract syntax tree (AST) interpretation where a node can rewrite itself to a more specialized or more general node, together with an optimizing compiler that exploits the structure of the interpreter. The compiler uses speculative assumptions and deoptimization in order to produce efficient machine code. Our initial experience suggests that high performance is attainable while preserving a modular and layered architecture, and that new high-performance language implementations can be obtained by writing little more than a stylized interpreter.
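The idea of an AST node that rewrites itself to a more specialized or more general node can be sketched as follows. This is only a toy interpreter written under stated assumptions: all class names (`Node`, `Root`, `Var`, `Const`, `AddUninitialized`, `AddInt`, `AddGeneric`) are hypothetical, and the sketch covers node rewriting only, not the optimizing compiler, speculation-based machine code, or deoptimization that the abstract describes.

```python
# Toy self-specializing AST interpreter (hypothetical names throughout).

class Node:
    parent = None  # set by adopt(); the root keeps None

    def adopt(self, *children):
        """Record this node as the parent of each child."""
        for child in children:
            child.parent = self

    def replace(self, new):
        """Swap this node for `new` in whichever parent field holds it."""
        for name, value in list(vars(self.parent).items()):
            if value is self:
                setattr(self.parent, name, new)
        new.parent = self.parent
        return new

class Root(Node):
    def __init__(self, body):
        self.body = body
        self.adopt(body)

    def execute(self):
        return self.body.execute()

class Const(Node):
    def __init__(self, value):
        self.value = value

    def execute(self):
        return self.value

class Var(Node):
    def __init__(self, env, name):
        self.env, self.name = env, name

    def execute(self):
        return self.env[self.name]

class Add(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.adopt(left, right)

class AddGeneric(Add):
    """Most general form: works for any operands supporting +."""
    def execute(self):
        return self.left.execute() + self.right.execute()

class AddInt(Add):
    """Specialized form: assumes both operands are ints."""
    def execute(self):
        l, r = self.left.execute(), self.right.execute()
        if type(l) is not int or type(r) is not int:
            # The assumption no longer holds: rewrite to the general node.
            self.replace(AddGeneric(self.left, self.right))
        return l + r

class AddUninitialized(Add):
    """Initial form: specializes itself based on the first observed operands."""
    def execute(self):
        l, r = self.left.execute(), self.right.execute()
        both_int = type(l) is int and type(r) is int
        self.replace((AddInt if both_int else AddGeneric)(self.left, self.right))
        return l + r

if __name__ == "__main__":
    env = {"x": 1}
    root = Root(AddUninitialized(Var(env, "x"), Const(2)))
    print(root.execute(), type(root.body).__name__)
    env["x"] = 1.5
    print(root.execute(), type(root.body).__name__)
```

In this sketch the tree starts with an uninitialized addition node, becomes `AddInt` after seeing two integers, and falls back to `AddGeneric` once a float is observed.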
Article
Cython is a Python language extension that allows explicit type declarations and is compiled directly to C. As such, it addresses Python's large overhead for numerical loops and the difficulty of efficiently using existing C and Fortran code, which Cython can interact with natively.
Conference Paper
We report on the birth and evolution of Lua and discuss how it moved from a simple configuration language to a versatile, widely used language that supports extensible semantics, anonymous functions, full lexical scoping, proper tail calls, and coroutines.
Article
Scripting languages are an important element in the current landscape of programming languages. A key feature of a scripting language is its ability to integrate with a system language. This integration takes two main forms: extending and embedding. In the first form, you extend the scripting language with libraries and functions written in the system language and write your main program in the scripting language. In the second form, you embed the scripting language in a host program (written in the system language) so that the host can run scripts and call functions defined in the scripts; the main program is the host program. In this setting, the system language is usually called the host language.
Conference Paper
Programmers often migrate from a dynamically typed to a statically typed language when their simple scripts evolve into complex programs. Optional type systems are one way of having both static and dynamic typing in the same language, while keeping its dynamically typed semantics. This makes evolving a program from dynamic to static typing a matter of describing the implied types that it is using and adding annotations to make those types explicit. Designing an optional type system for an existing dynamically typed language is challenging, as its types should feel natural to programmers that are already familiar with this language. In this work, we give a formal description of Typed Lua, an optional type system for Lua, with a focus on two of its novel type system features: incremental evolution of imperative record and object types that is both lightweight and type-safe, and projection types, a combination of flow typing, functions that return multiple values, and multiple assignment. While our type system is tailored to the features and idioms of Lua, its features can be adapted to other imperative scripting languages.
Conference Paper
We attempt to apply the technique of Tracing JIT Compilers in the context of the PyPy project, i.e., to programs that are interpreters for some dynamic languages, including Python. Tracing JIT compilers can greatly speed up programs that spend most of their time in loops in which they take similar code paths. However, applying an unmodified tracing JIT to a program that is itself a bytecode interpreter results in very limited or no speedup. In this paper we show how to guide tracing JIT compilers to greatly improve the speed of bytecode interpreters. One crucial point is to unroll the bytecode dispatch loop, based on two kinds of hints provided by the implementer of the bytecode interpreter. We evaluate our technique by applying it to two PyPy interpreters: one is a small example, and the other one is the full Python interpreter.
Conference Paper
The Smalltalk-80 programming language includes dynamic storage allocation, full upward funargs, and universally polymorphic procedures; the Smalltalk-80 programming system features interactive execution with incremental compilation, and implementation portability. These features of modern programming systems are among the most difficult to implement efficiently, even individually. A new implementation of the Smalltalk-80 system, hosted on a small microprocessor-based computer, achieves high performance while retaining complete (object code) compatibility with existing implementations. This paper discusses the most significant optimization techniques developed over the course of the project, many of which are applicable to other languages. The key idea is to represent certain runtime state (both code and data) in more than one form, and to convert between forms when needed.
Article
Implementing a concurrent programming language such as Java by means of a translator to an existing language is attractive, as it provides portability over all platforms supported by the host language and reduces development time, since many low-level tasks can be delegated to the host compiler. The C and C++ programming languages are popular choices for many language implementations due to the availability of efficient compilers on a wide range of platforms. For garbage-collected languages, however, they are not a perfect match as no support is provided for accurately discovering pointers to heap-allocated data on thread stacks. We evaluate several previously published techniques, and propose a new mechanism, lazy pointer stacks, for performing accurate garbage collection in such uncooperative environments. We implemented the new technique in the Ovm Java virtual machine with our own Java-to-C/C++ compiler using GCC as a back-end compiler. Our extensive experimental results confirm that lazy pointer stacks outperform existing approaches: we provide a speed-up of 4.5% over Henderson's accurate collector with a 17% increase in code size. Accurate collection is essential in the context of real-time systems; we thus validate our approach with the implementation of a real-time concurrent garbage collection algorithm.
Conference Paper
As scripts grow into full-fledged applications, programmers should want to port portions of their programs from scripting languages to languages with sound and rich type systems. This form of interlanguage migration ensures type-safety and provides minimal guarantees for reuse in other applications, too. In this paper, we present a framework for expressing this form of interlanguage migration. Given a program that consists of modules in the untyped lambda calculus, we prove that rewriting one of them in a simply typed lambda calculus produces an equivalent program and adds the expected amount of type safety, i.e., code in typed modules can't go wrong. To ensure these guarantees, the migration process infers constraints from the statically typed module and imposes them on the dynamically typed modules in the form of behavioral contracts.
Conference Paper
We present a just-in-time compiler for a Java VM that is small enough to fit on resource-constrained devices, yet is surprisingly effective. Our system dynamically identifies traces of frequently executed bytecode instructions (which may span several basic blocks across several methods) and compiles them via Static Single Assignment (SSA) construction. Our novel use of SSA form in this context allows instructions to be hoisted across trace side-exits without necessitating expensive compensation code in off-trace paths. The overall memory consumption (code and data) of our system is only 150 kBytes, yet benchmarks show a speedup that in some cases rivals heavy-weight just-in-time compilers.
Article
A fundamental change is occurring in the way people write computer programs, away from system programming languages such as C or C++ to scripting languages such as Perl or Tcl. Although many people are participating in the change, few realize that the change is occurring and even fewer know why it is happening. This article explains why scripting languages will handle many of the programming tasks in the next century better than system programming languages. System programming languages were designed for building data structures and algorithms from scratch, starting from the most primitive computer elements. Scripting languages are designed for gluing. They assume the existence of a set of powerful components and are intended primarily for connecting components
LuaJIT, a Just-In-Time Compiler for Lua
  • Mike Pall
Refined Criteria for Gradual Typing
  • Jeremy G Siek
  • Michael M Vitousek
  • Matteo Cimini
  • John Tang Boyland
V8 Optimization Killers
  • Petka Antonov
The Rocky Road to MCode. Talk at Lua Moscow conference 2017
  • Javier Guerra
Lua in Grim Fandango. Grim Fandango Network
  • Bret Mogilefsky
LuaJIT Hacking: Getting next() out of the NYI list
  • Javier Guerra
LOOM - A LuaJIT performance visualizer
  • Javier Guerra
Source code repository for the Titan programming language
  • André Murbach Maidl
  • Fábio Mascarenhas
  • Gabriel Ligneul
  • Hisham Muhammad
  • Hugo Musso Gualandi
Typing Dynamic Languages - a Review
  • Hugo Musso Gualandi
Google ScholarDigital Library
  • Keith Adams
  • Jason Evans
  • Bertrand Maher
  • Guilherme Ottoni
  • Andrew Paroski
  • Brett Simmers
  • Edwin Smith
  • Owen Yamauchi
  • Paul Graham
Compiling Dynamic Language Implementations
  • Michael Hudson
  • Samuele Pedroni