Sven Apel's research while affiliated with Link Campus University and other places

Publications (347)

Article
One of the key differences between traditional software engineering and machine learning (ML) is the lack of specifications for ML models. Traditionally, specifications provide a cornerstone for compositional reasoning and for the divide-and-conquer strategy of how we build large and complex systems from components, but these are hard to come by fo...
Article
Full-text available
Mailing lists are a major communication channel for supporting developer coordination in open-source software projects. In a recent study, researchers explored temporal relationships (e.g., synchronization) between developer activities on source code and on the mailing list, relying on simple heuristics of developer collaboration (e.g., co-editing...
Article
Many open-source software projects depend on a few core developers, who take over both the bulk of coordination and programming tasks. They are supported by peripheral developers, who contribute either via discussions or programming tasks, often for a limited time. It is unclear what role these peripheral developers play in the programming and comm...
Preprint
Full-text available
Model transformations play a fundamental role in model-driven software development. They can be used to solve or support central tasks, such as creating models, handling model co-evolution, and model merging. In the past, various (semi-)automatic approaches have been proposed to derive model transformations from meta-models or from examples. These...
Preprint
Full-text available
Determining whether a configurable software system has a performance bug or was misconfigured is often challenging. While there are numerous debugging techniques that can support developers in this task, there is limited empirical evidence of how useful the techniques are to address the actual needs that developers have when debugging the perfor...
Article
Reducing energy consumption of IT systems is fundamentally important for saving cost and reducing CO2 emissions. We explain the limits of pure artificial intelligence/machine learning (ML) methods when focusing on the source code and outline a conceptual framework for combining software engineering methods and ML to build white-box energy models.
Article
Software development is at the intersection of the social realm, involving people who develop the software, and the technical realm, involving artifacts (code, docs, etc.) that are being produced. It has been shown that a socio-technical perspective provides rich information about the state of a software project. In particular, we are interested...
Preprint
Full-text available
Detecting and understanding reasons for defects and inadvertent behavior in software is challenging due to their increasing complexity. In configurable software systems, the combinatorics that arises from the multitude of features a user might select from adds a further layer of complexity. We introduce the notion of feature causality, which is bas...
Article
In collaborative software development, merge conflicts arise when developers integrate concurrent code changes. Practitioners seek to minimize the number of merge conflicts because resolving them is a difficult, time-consuming, and often error-prone task. Despite a substantial number of studies investigating merge conflicts, the challenges in merg...
Article
Full-text available
The C preprocessor is widely used in practice. Conditional compilation with #ifdef annotations allows developers to flexibly introduce variability in their programs. Developers can use disciplined annotations, entirely enclosing full statements with preprocessor directives, or undisciplined ones, enclosing only parts of the statements. Despite some...
Article
Lifted (family-based) static analysis based on abstract interpretation is capable of analyzing all variants of a program family (or any other configurable software system), simultaneously, in a single run without generating any of the variants explicitly. The elements of the underlying lifted domain are tuples, which maintain one property per syste...
Preprint
Full-text available
Software comes with many configuration options, satisfying varying needs from users. Exploring those options for non-functional requirements can be tedious, time-consuming, and even error-prone (if done manually). Worse, many software systems can be tuned to multiple objectives (e.g., faster response time, fewer memory requirements, decreased netwo...
Chapter
This work presents a novel approach for synthesizing numerical program sketches using lifted (family-based) static program analysis. In particular, our approach leverages a lifted static analysis based on abstract interpretation, which is used for analyzing program families with numerical features. It takes as input the common code base, which enco...
Article
Full-text available
This paper describes a large-scale empirical study investigating the relevance of socio-technical congruence over key basic software quality metrics, namely, bugs and churn. That is, we explore whether alignment or misalignment of social communication structures and technical dependencies in large software projects influences software quality. To t...
Preprint
Full-text available
We report on a large-scale empirical study investigating the relevance of socio-technical congruence over key basic software quality metrics, namely, bugs and churn. In particular, we explore whether alignment or misalignment of social communication structures and technical dependencies in large software projects influences software quality. To thi...
Preprint
The lack of specifications is a key difference between traditional software engineering and machine learning. We discuss how it drastically impacts how we think about divide-and-conquer approaches to system design, and how it impacts reuse, testing and debugging activities. Traditionally, specifications provide a cornerstone for compositional reaso...
Article
A number of product-line analysis approaches lift analyses such as type checking, model checking, and theorem proving from the level of single programs to the level of product lines. These approaches share concepts and mechanisms that suggest an unexplored potential for reuse of key analysis steps and properties, implementation, and verification ef...
Article
Full-text available
In scientific computing, researchers often use feature-rich software frameworks to simulate physical, chemical, and biological processes. Commonly, researchers follow a clone-and-own approach: Copying the code of an existing, similar simulation and adapting it to the new simulation scenario. In this process, a user has to select suitable artifacts...
Chapter
Full-text available
Lifted (family-based) static analysis by abstract interpretation is capable of analyzing all variants of a program family simultaneously, in a single run without generating any of the variants explicitly. The elements of the underlying lifted analysis domain are tuples, which maintain one property per variant. Still, explicit property enumeration...
Preprint
Many modern software systems are highly configurable, allowing the user to tune them for performance and more. Current performance modeling approaches aim at finding performance-optimal configurations by building performance models in a black-box manner. While these models provide accurate estimates, they cannot pinpoint causes of observed performa...
Preprint
Full-text available
Performance-influence models can help stakeholders understand how and where configuration options and their interactions influence the performance of a system. With this understanding, stakeholders can debug performance behavior and make deliberate configuration decisions. Current black-box techniques to build such models combine various sampling a...
Article
Large-scale software development today relies heavily on version control systems facilitating distributed development of software projects. For the purpose of merging diverging versions of the code base, version control systems employ line-based merge algorithms, which are applicable to all text files. Structured merge algorithms have been proposed...
Article
The human factor is prevalent in empirical software engineering research. However, human studies often do not use the full potential of analysis methods by combining analysis of individual tasks and participants with an analysis that aggregates results over tasks and/or participants. This may hide interesting insights of tasks and participants and...
Preprint
Full-text available
Lifted (family-based) static analysis by abstract interpretation is capable of analyzing all variants of a program family simultaneously, in a single run without generating any of the variants explicitly. The elements of the underlying lifted analysis domain are tuples, which maintain one property per variant. Still, explicit property enumeration i...
Article
Full-text available
Stakeholders of configurable systems are often interested in knowing how configuration options influence the performance of a system to facilitate, for example, the debugging and optimization processes of these systems. Several black-box approaches can be used to obtain this information, but they either sample a large number of configurations to ma...
Article
Full-text available
In large-scale open-source software projects, where developers are often distributed across the entire planet, coordination among developers is crucial. To estimate whether a state of socio-technical congruence is achieved, which is associated with software quality and project success, we assess the alignment of collaboration and communication in s...
Article
Artificial intelligence has gained considerable momentum in software engineering. One success story is the use of machine learning for performance prediction and optimization of configurable software systems, for which it is often intractable to determine which configuration is optimal. Despite notable success stories, there are major challenges th...
Article
Full-text available
Conducting technology-oriented experiments (i.e., experiments in which treatments are applied to objects by a computer-based tool) without proper tool support is often a time-consuming and highly error-prone task. Although many techniques have been proposed to help conduct controlled experiments, none of them simultaneously addresses (1) the exe...
Article
Full-text available
Version control systems assist developers in managing concurrent changes to a common code base by tracking all code contributions over time. A notorious problem is that, when integrating code contributions, merge conflicts may occur and resolving them is a time-consuming and error-prone task. There is a popular belief that communication and collabo...
Article
Grown software systems often contain code that is not necessary anymore. Such unnecessary code wastes resources during development and maintenance, for example, when preparing code for migration or certification. Running a profiler may reveal code that is not used in production, but it is often time-consuming to obtain representative data in this w...
Article
Full-text available
While polyhedral optimization appeared in mainstream compilers during the past decade, its profitability in scenarios outside its classic domain of linear-algebra programs has remained in question. Recent implementations, such as the LLVM plugin Polly, produce promising speedups, but the restriction to affine loop programs with control flow known a...
Preprint
Many software systems offer configuration options to tailor their functionality and non-functional properties (e.g., performance). Often, users are interested in the (performance-)optimal configuration, but struggle to find it, due to missing information on influences of individual configuration options and their interactions. In the past, various...
Conference Paper
Context: Verification techniques such as model checking are being applied to ensure that software systems achieve desired quality levels and fulfill their functional and non-functional specification. However, applying these techniques to software product lines is a twofold challenge, given the exponential blowup of the number of products and the st...
Article
Full-text available
Detecting feature interactions is imperative for accurately predicting performance of highly-configurable systems. State-of-the-art performance prediction techniques rely on supervised machine learning for detecting feature interactions, which, in turn, relies on time-consuming performance measurements to obtain training data. By providing informat...
Article
Full-text available
Maintenance consumes 40% to 80% of software development costs. So, it is essential to write source code that is easy to understand to reduce maintenance costs. Improving code understanding is important because developers often mistake the meaning of code and misjudge the program behavior, which can lead to errors. There are patterns in so...
Chapter
This chapter is devoted to the performance analysis of configurable and evolving software. Both configurability and evolution imply a high degree of software variation, that is, a large space of software variants and versions, which challenges state-of-the-art analysis techniques for software. We give an overview of strategies to cope with software v...
Article
Full-text available
Modeling the performance of a highly configurable software system requires capturing the influences of its configuration options and their interactions on the system’s performance. Performance-influence models quantify these influences, thereby explaining the performance behavior of a configurable system as a whole. To be useful in practice, a per...
Article
Full-text available
Several variability representations have been proposed over the years. Software maintenance in the presence of variability is known to be hard. One of the reasons is that maintenance tasks require a large amount of cognitive effort for program comprehension. In fact, the different ways of representing variability in source code might influence the...
Preprint
Full-text available
In many domains, software systems cannot be deployed until authorities judge them fit for use in an intended operating environment. Certification standards and processes have been devised and deployed to regulate operations of software systems and prevent their failures. However, practitioners are often unsatisfied with the efficiency and value pro...
Preprint
Full-text available
In configurable software systems, stakeholders are often interested in knowing how configuration options influence the performance of a system to facilitate, for example, the debugging and optimization processes of these systems. There are several black-box approaches to obtain this information, but they usually require a large number of samples to...
Article
Iterative program optimization is known to be able to adapt more easily to particular programs and target hardware than model-based approaches. One approach is to generate random program transformations and evaluate their profitability by applying them and benchmarking the transformed program on the target hardware. This procedure’s large computatio...