About
42
Publications
10,889
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
1,210
Citations
Publications
Publications (42)
Many useful tasks in data science and machine learning applications can be written as simple variations of matrix multiplication. However, users have difficulty performing such tasks as existing matrix/vector libraries support only a limited class of computations hand-tuned for each unique hardware platform. Users can alternatively write the task a...
Columnar databases are an established way to speed up online analytical processing (OLAP) queries. Nowadays, data processing (e.g., storage, visualization, and analytics) is often performed at the programming language level, hence it is desirable to also adopt columnar data structures for common language runtimes. While there are frameworks, librar...
Compiler optimization decisions are often based on hand-crafted heuristics centered around a few established benchmark suites. Alternatively, they can be learned from feature and performance data produced during compilation. However, data-driven compiler optimizations based on machine learning models require large sets of quality data for training...
Compilers provide many architecture-agnostic, high-level optimizations trading off peak performance for code size. High-level optimizations typically cannot precisely reason about their impact, as they are applied before the final shape of the generated machine code can be determined. However, they still need a way to estimate their transformation’...
To cope with today's large scale of data, parallel dataflow engines such as Hadoop, and more recently Spark and Flink, have been proposed. They offer scalability and performance, but require data scientists to develop analysis pipelines in unfamiliar programming languages and abstractions. To overcome this hurdle, dataflow engines have introduced s...
Java programs can contain non-counted loops, that is, loops for which the iteration count can neither be determined at compile time nor at run time. State-of-the-art compilers do not aggressively optimize them, since unrolling non-counted loops often involves duplicating also a loop's exit condition, which thus only improves run-time performance if...
Compilers perform a variety of advanced optimizations to improve the quality of the generated machine code. However, optimizations that depend on the data flow of a program are often limited by control-flow merges. Code duplication can solve this problem by hoisting, i.e. duplicating, instructions from merge blocks to their predecessors. However, f...
Compilers perform a variety of advanced optimizations to improve the quality of the generated machine code. However, optimizations that depend on the data flow of a program are often limited by control-flow merges. Code duplication can solve this problem by hoisting, i.e. duplicating, instructions from merge blocks to their predecessors. However, f...
Most high-performance dynamic language virtual machines duplicate language semantics in the interpreter, compiler, and runtime system. This violates the principle to not repeat yourself. In contrast, we define languages solely by writing an interpreter. The interpreter performs specializations, e.g., augments the interpreted program with type infor...
Most high-performance dynamic language virtual machines duplicate language semantics in the interpreter, compiler, and runtime system. This violates the principle to not repeat yourself. In contrast, we define languages solely by writing an interpreter. The interpreter performs specializations, e.g., augments the interpreted program with type infor...
The R language, from the point of view of language design and implementation, is a unique combination of various programming language concepts. It has functional characteristics like lazy evaluation of arguments, but also allows expressions to have arbitrary side effects. Many runtime data structures, for example variable scopes and functions, are...
The R language, from the point of view of language design and implementation, is a unique combination of various programming language concepts. It has functional characteristics like lazy evaluation of arguments, but also allows expressions to have arbitrary side effects. Many runtime data structures, for example variable scopes and functions, are...
We present an approach to cross-compile Java bytecodes to Java-Script, building on existing Java optimizing compiler technology. Static analysis determines which Java classes and methods are reachable. These are then translated to JavaScript using a re-configured Java just-in-time compiler with a new back end that generates JavaScript instead of ma...
We present an approach to cross-compile Java bytecodes to Java-Script, building on existing Java optimizing compiler technology. Static analysis determines which Java classes and methods are reachable. These are then translated to JavaScript using a re-configured Java just-in-time compiler with a new back end that generates JavaScript instead of ma...
When building a compiler for a high-level language, certain intrinsic features of the language must be expressed in terms of the resulting low-level operations. Complex features are often expressed by explicitly weaving together bits of low-level IR, a process that is tedious, error prone, difficult to read, difficult to reason about, and machine d...
Originally developed with a single language in mind, the JVM is now targeted by numerous programming languages—its automatic memory management, just-in-time compilation, and adaptive optimizations—making it an attractive execution platform. However, the garbage collector, the just-in-time compiler, and other optimizations and heuristics were design...
A method for a compiler includes receiving, by the compiler and from an interpreter, a representation of a code section having a control path that changes the representation. The representation has profiling data, and the profiling data has a threshold. The method further includes performing, by the compiler and based on the threshold, a partial ev...
This paper presents TruffleC, a C interpreter that allows the dynamic execution of C code on top of a Java Virtual Machine (JVM). Rather than producing a static build of a C application, TruffleC is a self-optimizing abstract syntax tree (AST) interpreter combined with a just-in-time (JIT) compiler. Our self-optimizing interpreter specializes the A...
Escape Analysis allows a compiler to determine whether an object is accessible outside the allocating method or thread. This information is used to perform optimizations such as Scalar Replacement, Stack Allocation and Lock Elision, allowing modern dynamic compilers to remove some of the abstractions introduced by advanced programming models.
The a...
Building high-performance virtual machines is a complex and expensive undertaking; many popular languages still have low-performance implementations. We describe a new approach to virtual machine (VM) construction that amortizes much of the effort in initial construction by allowing new languages to be implemented with modest additional effort. The...
We present a compiler intermediate representation (IR) that allows dynamic speculative optimizations for high-level languages. The IR is graph-based and contains nodes fixed to control flow as well as floating nodes. Side-effecting nodes include a framestate that maps values back to the original program. Guard nodes dynamically check assumptions an...
We present an efficient and dynamic approach for calling native functions from within Java. Traditionally, programmers use the Java Native Interface (JNI) to call such functions. This paper introduces a new mechanism which we tailored specifically towards calling native functions from Java. We call it the Graal Native Function Interface (GNFI). It...
Java Virtual Machines are optimized for performing well on traditional Java benchmarks, which consist almost exclusively of code generated by the Java source compiler (javac). Code generated by compilers for other languages has not received nearly as much attention, which results in performance problems for those languages.
One important specimen o...
Dynamic code evolution is a technique to update a program while it is running. In an object-oriented language such as Java, this can be seen as replacing a set of classes by new versions. We modified an existing high-performance virtual machine to allow arbitrary changes to the definition of loaded classes. Besides adding and deleting fields and me...
We present an intermediate representation (IR) for a Java just in time (JIT) compiler written in Java. It is a graph-based IR that models both control-flow and data-flow dependencies between nodes. We show the framework in which we developed our IR. Much care has been taken to allow the programmer to focus on compiler optimization rather than IR bo...
syntax tree (AST) interpreter is a simple and natural way to implement a programming language. However, it is also considered the slowest approach because of the high overhead of virtual method dispatch. Language implementers therefore define bytecodes to speed up interpretation, at the cost of introducing inflexible and hard to maintain bytecode f...
An abstract syntax tree (AST) interpreter is a simple and natural way to implement a programming language. However, it is also considered the slowest approach because of the high overhead of virtual method dispatch. Language implementers therefore define bytecodes to speed up interpretation, at the cost of introducing inflexible and hard to maintai...
Modern virtual machines for Java use a dynamic compiler to optimize the program at run time. The compilation time therefore impacts the performance of the application in two ways: First, the compilation and the program's execution compete for CPU resources. Second, the sooner the compilation of a method finishes, the sooner the method will execute...
Coroutines are non-preemptive lightweight processes. Their advantage over threads is that they do not have to be synchronized because they pass control to each other explicitly and deterministically. Coroutines are therefore an elegant and efficient implementation construct for numerous algorithmic problems. Many mainstream languages and runtime en...
Dynamic code evolution is a technique to update a program while it is running. In an object-oriented language such as Java, this can be seen as replacing a set of classes by new versions. We modified an existing high-performance virtual machine to allow arbitrary changes to the definition of loaded classes. Besides adding and deleting fields and me...
Continuations, or 'the rest of the computation', are a con- cept that is most often used in the context of functional and dynamic programming languages. Implementations of such languages that work on top of the Java virtual ma- chine (JVM) have traditionally been complicated by the lack of continuations because they must be simulated. We propose an...