About
88
Publications
11,740
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
1,443
Citations
Additional affiliations
January 2004 - present
Publications
Publications (88)
The notion of polyglot software development refers to the fact that most software projects nowadays rely on multiple languages to deal with widely different concerns, from core business concerns to user interface, security, and deployment concerns among many others. Many different wordings around this notion have been proposed in the literature, wi...
One of the strengths of the Jetbrains MPS projectional language workbench is that it supports mixing different kinds of notations (graphical, tabular, textual, etc.). Many existing languages, however, are fully textual and are defined using grammar technology. To allow such languages to be used from within MPS, language engineers have to manually r...
Clear consistency guarantees on data are paramount for the design and implementation of distributed systems. When implementing distributed applications, developers require approaches to verify the data consistency guarantees of an implementation choice. Crooks et al. define a state-based and client-centric model of database isolation. This paper fo...
Context: Computational notebooks are a contemporary style of literate programming, in which users can communicate and transfer knowledge by interleaving executable code, output, and prose in a single rich document. A Domain-Specific Language (DSL) is an artificial software language tailored for a particular application domain. Usually, DSL users ar...
Many metaprogramming tasks, such as refactorings, automated bug fixing, or large-scale software renovation, require high-fidelity source code transformations -- transformations which preserve comments and layout as much as possible. Abstract syntax trees (ASTs) typically abstract from such details, and hence would require pretty printing, destroyin...
Relational model finding is a successful technique which has been used in a wide range of problems during the last decade. This success is partly due to the fact that many problems contain relational structures which can be explored using relational model finders. Although these model finders allow for the exploration of such structures they often...
In high-throughput, distributed systems, such as large-scale banking infrastructure, synchronization between actors becomes a bottle-neck in high-contention scenarios. This results in delays for users, and reduces opportunities for scaling such systems. This paper proposes Static Local Coordination Avoidance, which analyzes application invariants a...
Concurrent objects with asynchronous messaging are an increasingly popular way to structure highly available, high performance, large-scale software systems. To ensure data-consistency and support synchronization between objects such systems often use an atomic commitment protocol such as Two-Phase commit (2PC). In highly available, high-throughput...
The need for levels of availability and scalability beyond those supported by relational databases has led to the emergence of a new generation of purpose-specific databases grouped under the term NoSQL. In general, NoSQL databases are designed with horizontal scalability as a primary concern and deliver increased availability and fault tolerance a...
Context: Meta programming consists for a large part of matching, analyzing, and transforming syntax trees. Many meta programming systems process abstract syntax trees, but this requires intimate knowledge of the structure of the data type describing the abstract syntax. As a result, meta programming is error-prone, and meta programs are not resilie...
Live modeling enables modelers to incrementally update models as they are running and get immediate feedback about the impact of their changes. Changes introduced in a model may trigger inconsistencies between the model and its run-time state (e.g., deleting the current state in a statemachine); effectively requiring to migrate the run-time state t...
Interactive notebooks allow people to communicate and collaborate through a single rich document that might include live code, multimedia, computed results, and documentation, which is persisted as a whole for reproducibility. Notebooks are currently being used extensively in domains such as data science, data journalism, and machine learning. Howe...
Domain-Specific Languages (DSLs) manifest themselves in remarkably diverse shapes, ranging from internal DSLs embedded as a mere fluent API within a programming language, to external DSLs with dedicated syntax and tool support. Although different shapes have different pros and cons, combining them for a single language is problematic: language desi...
Effect handling is a way to structure and scope side-effects which is gaining popularity as an alternative to monads in purely functional programming languages. Languages with support for effect handling allow the programmer to define idioms for state, exception handling, asynchrony, backtracking, etc. from within the language. Functional programmi...
Interactive notebooks, such as provided by the Jupyter platform [2], are gaining traction in scientific computing, data science, and machine learning. Developing a Jupyter kernel machinery for a new language, however, requires considerable effort. In this extended abstract, we present Bacatá, a language-parametric bridge between Jupyter and the Ras...
Live programming is a style of development characterized by incremental change and immediate feedback. Instead of long edit-compile cycles, developers modify a running program by changing its source code, receiving immediate feedback as it instantly adapts in response. In this paper, we propose an approach to bridge the gap between running programs...
Large organizations like banks suffer from the ever growing complexity of their systems. Evolving the software becomes harder and harder since a single change can affect a much larger part of the system than predicted upfront. A large contributing factor to this problem is that the actual domain knowledge is often implicit, incomplete, or out of da...
Mainstream programming languages like Java have limited support for language extensibility. Without mechanisms for syntactic abstraction, new programming styles can only be embedded in the form of libraries, limiting expressiveness. In this paper, we present Recaf, a lightweight tool for creating Java dialects; effectively extending Java with new l...
Many model-driven development (MDD) tools employ specialized frameworks and modeling languages, and assume that the semantics of models is provided by some form of code generation. As a result, programming against models is cumbersome and does not integrate well with ordinary programming languages and IDEs. In this paper we present MD4J, a modeling...
Modular interpreters are a crucial first step towards component-based language development: instead of writing language interpreters from scratch, they can be assembled from reusable, semantic building blocks. Unfortunately, traditional language interpreters can be hard to extend because different language constructs may require different interpret...
Mainstream model transformation tools operate on graph structured models which are described by class-based meta-models. In the traditional grammarware space, transformation tools consume and produce tree structured terms, which are described by some kind of algebraic datatype or grammar. In this paper we explore a functional style of model transfo...
Spreadsheet systems are live programming environments. Both the data and the code are right in front you, and if you edit either of them, the effects are immediately visible. Unfortunately, spreadsheets lack mechanisms for abstraction, such as classes, function definitions etc. Programming languages excel at abstraction, but most mainstream languag...
All software evolves, and programming languages and programming language tools are no exception. And just like in ordinary software construction, modular implementations can help ease the process of changing a language implementation and its dependent tools. However, the syntactic and semantic dependencies between language features make this a chal...
Modular interpreters have the potential to achieve component-based language development: instead of writing language interpreters from scratch, they can be assembled from reusable, semantic building blocks. Unfortunately, traditional language interpreters are hard to extend because different language constructs may require different interpreter sig...
Modular interpreters have the potential to achieve component-based language development: instead of writing language interpreters from scratch, they can be assembled from reusable, semantic building blocks. Unfortunately, traditional language interpreters are hard to extend because different language constructs may require different interpreter sig...
Traversing complex Abstract Syntax Trees (ASTs) typically requires large amounts of tedious boilerplate code. For many operations most of the code simply walks the structure, and only a small portion of the code implements the functionality that motivated the traversal in the first place. This paper presents a type-safe Java framework called Shy th...
The goal of the DSLDI workshop is to bring together researchers and
practitioners interested in sharing ideas on how DSLs should be designed,
implemented, supported by tools, and applied in realistic application contexts.
We are both interested in discovering how already known domains such as graph
processing or machine learning can be best support...
Language workbenches are environments for simplifying the creation and use of computer languages. The annual Language Workbench Challenge (LWC) was launched in 2011 to allow the many academic and industrial researchers in this area an opportunity to quantitatively and qualitatively compare their approaches. We first describe all four LWCs to date,...
In textual modeling, models are created through an intermediate parsing step which maps textual representations to abstract model structures. Therefore, the identify of elements is not stable across different versions of the same model. Existing model differencing algorithms, therefore, cannot be applied directly because they need to identify model...
Object Algebras are a recently introduced design pattern to make the implementation of recursive data types more extensible. In this short paper we report our experience in using Object Algebras in building a realistic domain specific language (DSL) for questionnaires, called QL. This experience has led to a simple, yet powerful set of tools for th...
Program transformations play an important role in domain-specific languages and model-driven development. Tracing the execution of such transformations has well-known benefits for debugging, visualization and error reporting. In this paper, we introduce string origins, a lightweight, generic and portable technique to establish a tracing relation be...
Program transformations in terms of abstract syntax trees compromise referential integrity by introducing variable capture. Variable capture occurs when in the generated program a variable declaration accidentally shadows the intended target of a variable reference. Existing transformation systems either do not guarantee the avoidance of variable c...
Program transformations in terms of abstract syntax trees compromise referential integrity by introducing variable capture. Variable capture occurs when in the generated program a variable declaration accidentally shadows the intended target of a variable reference. Existing transformation systems either do not guarantee the avoidance of variable c...
An Object Grammar is a variation on traditional BNF grammars, where the notation is extended to support declarative bidirectional mappings between text and object graphs. The two directions for interpreting Object Grammars are parsing and formatting. Parsing transforms text into an object graph by recognizing syntactic features and creating the cor...
Rascal is a meta-programming language for processing source code in the broad sense (models, documents, formats, languages, etc.). In this short note we discuss the implementation of the "TTC'14 Movie Database Case" in Rascal. In particular we will highlight the challenges and benefits of using a functional programming language for transforming gra...
Software projects consist of different kinds of artifacts: build files, configuration files, markup files, source code in different software languages, and so on. At the same time, however, most integrated development environments (IDEs) are focused on a single (programming) language. Even if a programming environment supports multiple languages (e...
Questionnaires are an important medium for collecting information in diverse areas of society (scientific surveys, tax filing, auditing guidance etc.). We are interested in a domain-specific language (DSL) to automatically generate questionnaire software from declarative specifications. This note describes an important aspect of the semantics of su...
Language workbenches are tools that provide high-level mechanisms for the implementation of (domain-specific) languages. Language workbenches are an active area of research that also receives many contributions from industry. To compare and discuss existing language workbenches, the annual Language Workbench Challenge was launched in 2011. Each yea...
Digital forensics software often has to be changed to cope with new variants and versions of file formats. Developers reverse engineer the actual files, and then change the source code of the analysis tools. This process is error-prone and time consuming because the relation between the newly encountered data and how the source code must be changed...
Domain-specific languages (dsls) can significantly increase productivity and quality in software construction. However, even dsl programs need to evolve to accomodate changing requirements and circumstances. How can we know if the design of a dsl supports the relevant evolution scenarios on its programs? We present an experimental approach to evalu...
Object algebras are a new programming technique that enables a simple solution to basic extensibility and modularity issues in programming languages. While object algebras excel at defining modular features, the composition mechanisms for object algebras (and features) are still cumbersome and limited in expressiveness. In this paper we leverage tw...
Domain-specific languages (DSLs) require IDE support, just like ordinary programming languages. This paper introduces semantic deltas as a foundation for building live DSL environments to bridge the “gulf of evaluation” between DSL code and the running application. Semantic deltas are distinguished from textual or structural deltas in two ways. Fir...
Object Grammars define mappings between text and object graphs. Parsing recognizes syntactic features and creates the corresponding object structure. In the reverse direction, formatting recognizes object graph features and generates an appropriate textual presentation. The key to Object Grammars is the expressive power of the mapping, which decoup...
Managed Data is a two-level approach to data abstraction in which programmers first define data description and manipulation mechanisms, and then use these mechanisms to define specific kinds of data. Managed Data allows programmers to take control of many important aspects of data, including persistence, access/change control, reactivity, logging,...
Object Grammars define mappings between text and object graphs. Parsing recognizes syntactic features and creates the corresponding object struc-ture. In the reverse direction, formatting recognizes object graph features and generates an appropriate textual presentation. The key to Object Grammars is the expressive power of the mapping, which decou...
File carvers are forensic software tools used to recover data from storage devices in order to find evidence. Every legal case requires different trade-offs between precision and runtime performance. The resulting required changes to the software tools are performed manually and under the strictest deadlines.
In this paper we present a model-driven...
Algebraic specification has a long tradition in bridging the gap between specification and programming by making specifications executable. Building on extensive experience in designing, implementing and using specification formalisms that are based on algebraic specification and term rewriting (namely Asf and Asf+Sdf), we are now focusing on using...
We compare the Visitor pattern with the Interpreter pattern, investigating a single case in point for the Java language. We have produced and compared two versions of an interpreter for a programming language. The first version makes use of the Visitor pattern. The second version was obtained by using an automated refactoring to transform uses of t...
Digital forensics investigations often consist of analyzing large quantities of data. The software tools used for analyzing such data are constantly evolving to cope with a multiplicity of versions and variants of data formats. This process of customization is time consuming and error prone. To improve this situation we present DERRIC, a domain-spe...
Ambiguity detection tools try to statically track down ambiguities in context-free grammars. Current ambiguity detection tools, however, either are too slow for large realistic cases, or produce incomprehensible ambiguity reports. AmbiDexter is the ambiguity tool to have your cake and eat it too.
API migration refers to adapting an application such that its dependence on a given API (the source API) is eliminated in favor of depending on an alternative API (the target API) with the source and target APIs serving the same domain. One may attempt to automate API migration by code transformation or wrapping of some sort. API migration is relat...
Model-driven software development (MDSD) has been on the rise over the past few years and is becoming more and more mature.
However, evaluation in real-life industrial context is still scarce.
In this paper, we present a case-study evaluating the applicability of a state-of-the-art MDSD tool, modJ, a suite of domain specific languages (DSLs) for d...
Ambiguity detection tools try to statically track down ambiguities in context-free grammars. Current ambiguity detection tools, however, either are too slow for large realistic cases, or produce incomprehensible ambiguity reports. AMBIDEXTER is the ambiguity tool to have your cake and eat it too.
Does the use of DSL tools improve the maintainability of language implementations compared to implementations from scratch? We present empirical results on aspects of maintainability of six implementations of the same DSL using different languages (Java, JavaScript, C#) and DSL tools (ANTLR, OMeta, Microsoft "M"). Our evaluation indicates that the...
Many automated software engineering tools require tight integration of techniques for source code analysis and manipulation. State-of-the-art tools exist for both, but the domains have remained notoriously separate because different computational paradigms fit each domain best. This impedance mismatch hampers the development of new solutions becaus...
Rascal is a new language for meta-programming and is intended to solve problems in the domain of source code analysis and
transformation. In this article we give a high-level overview of the language and illustrate its use by many examples. Rascal
is a work in progress both regarding implementation and documentation. More information is available a...
How would a language look like that is specially designed for solving meta-programming problems in the software composition
domain? We present requirements for and design of Rascal, a new language for solving meta-programming problems that fit the
Extract-Analyze-SYnthesize (EASY) paradigm.
Within the programming domain of XML processing, we set up a benchmark for API migration. The benchmark is a suite of XML processing scenarios that are implemented in terms of different XML APIs. We suggest that a relatively general technique for API migration should be capable of pro- viding source-to-source translations between the different impl...
Failing integration builds are show stoppers. Development activity is stalled because developers have to wait with integrating new changes until the problem is fixed and a successful build has been run. We show how backtracking can be used to mitigate the impact of build failures in the context of component-based software development. This way, eve...
The development of a domain specific language (DSL) can be a difficult and costly undertaking. Language workbenches aim to provide integrated development support to ease this process. The Meta-Environment is a language workbench providing parsing, analysis, transformation,
syntax highlighting and formatting support for the development of programmin...
Bridging problem domain and solution in product line engi- neering is a time-consuming and error-prone process. Since both domains are structured dierently (features vs. artifacts), there is no natural way to map one to the other. Using an explicit and formal mapping creates opportunities for consistency checking and automation. This way both the c...
The meta-environment is a flexible framework for language development, source code analysis and source code transformation. We highlight new features and demonstrate how the system supports key functionalities for software evolution: fact extraction, software analysis, visualization, and software transformation
Integration hell is a prime example of software evolution gone out of control. The Sisyphus continuous integration system is designed to prevent this situation in the context of component-based software configuration management. We show how incremental and backtracking techniques are applied to strike a balance between maximal feedback and being up...
We show how under certain assumptions, the release and de- livery of software updates can be automated in the context of component- based systems. These updates allow features or fixes to be delivered to users more quickly. Furthermore, user feedback is more accurate, thus enabling quicker response to defects encountered in the field. Based on a fo...
We explore the connection between term rewriting systems (TRS) and aspect-oriented programming (AOP). Term rewriting is a paradigm that is used in fields such as program transformation and theorem proving. AOP is a method for decomposing software, complementary to the usual separation into programs, classes, functions, etc. An aspect represents cod...
This paper presents techniques to reason about the composition of configurable components and to automatically derive consistent compositions. The reasoning is achieved by describing components in a formal component description language, that allows the description of component variability, dependencies and configu-ration actions. It also enables t...
In component-based product populations, variability has to be described at the component level to be able to benet from a product family approach. As a consequence, composition of components becomes very complex. We describe how this complexity can be managed au- tomatically. The concepts and techniques presented are the rst step toward automated m...
Voor leveranciers van softwareproducten wordt het steeds moeilijker om de softwareconfiguraties bij hun klanten te beheren en te controleren. De software draait op uiteenlopende hardware en software platforms en is vaak voor een specif ieke klantinstallatie geparametriseerd of geoptimaliseerd. Op dit moment worden deze configuraties als gedetaillee...
Syntax Tree as input. The algorithm traverses the tree in a Modular SOS fashion: the traversal is directed by combinators, until a subtree can be collapsed. The tree thus decreases in depth in a bottom up fashion, while traversing it top down for each step. The main di#erence between acr and evalan is that acr employs term rewriting only to reduce...
Binary component-based software updates that are lightweight, efficient, safe and generic still remain a challenge. Most existing
deployment systems that achieve this goal have to control the complete software environment of the user which is a barrier
to adoption for both software consumers and producers. Binary change set composition is a techniq...