
Norbert Siegmund, Dr.-Ing.
Otto-von-Guericke University Magdeburg | OvGU · Faculty of Computer Science
Publications: 119
Reads: 19,057
Citations: 4,068
Additional affiliations
August 2011 - March 2012
January 2008 - December 2013
Publications (119)
Context
Chatbots based on large language models are becoming an important tool in modern software development, yet little is known about how programming beginners interact with this new technology to write code and acquire new knowledge. Thus, we are missing key ingredients to develop guidelines on how to adopt chatbots for becoming productive at p...
Background: Large language models (LLMs) have become a paramount interest of researchers and practitioners alike, yet a comprehensive overview of key considerations for those developing LLM-based systems is lacking. This study addresses this gap by collecting and mapping the topics practitioners discuss online, offering practical insights...
Context
Programmer’s block, akin to writer’s block, is a phenomenon where capable programmers struggle to create code. Despite anecdotal evidence, no scientific studies have explored the relationship between programmer’s block and writer’s block.
Objective
The primary objective of this study is to examine the presence of blocks during programming an...
Retrieval-augmented generation (RAG) is an umbrella of different components, design decisions, and domain-specific adaptations to enhance the capabilities of large language models and counter their limitations regarding hallucination and outdated and missing knowledge. Since it is unclear which design decisions lead to a satisfactory performance, d...
As a software system evolves, its performance can improve or degrade over time. Performance evolution is especially delicate in configurable software systems, where performance degradation may manifest only for specific configurations, making it especially hard to spot and fix. Problem. Prior work concentrated mainly on performance-bug detection an...
Context
Many software systems can be tuned for multiple objectives (e.g., faster runtime, less required memory, less network traffic or energy consumption). Such systems can suffer from “disagreement” where different models have different (or even opposite) insights and tactics on how to optimize a system. For configuration problems, we show...
Software projects are complex technical and organizational systems involving large numbers of artifacts and developers. To understand and tame software complexity, a wide variety of program analysis techniques have been developed for bug detection, program comprehension, verification, and more. At the same time, repository mining techniques aim at...
Understanding the influence of configuration options on the performance of a software system is key for finding optimal system configurations, system understanding, and performance debugging. In the literature, a number of performance-influence modeling approaches have been proposed, which model a configuration option’s influence and a configuratio...
Determining whether a configurable software system has a performance bug or it was misconfigured is often challenging. While there are numerous debugging techniques that can support developers in this task, there is limited empirical evidence of how useful the techniques are to address the actual needs that developers have when debugging the perfor...
Software comes with many configuration options, satisfying varying needs from users. Exploring those options for non-functional requirements can be tedious, time-consuming, and even error-prone (if done manually). Worse, many software systems can be tuned to multiple objectives (e.g., faster response time, fewer memory requirements, decreased netwo...
In scientific computing, researchers often use feature-rich software frameworks to simulate physical, chemical, and biological processes. Commonly, researchers follow a clone-and-own approach: Copying the code of an existing, similar simulation and adapting it to the new simulation scenario. In this process, a user has to select suitable artifacts...
Many modern software systems are highly configurable, allowing the user to tune them for performance and more. Current performance modeling approaches aim at finding performance-optimal configurations by building performance models in a black-box manner. While these models provide accurate estimates, they cannot pinpoint causes of observed performa...
Performance-influence models can help stakeholders understand how and where configuration options and their interactions influence the performance of a system. With this understanding, stakeholders can debug performance behavior and make deliberate configuration decisions. Current black-box techniques to build such models combine various sampling a...
The human factor is prevalent in empirical software engineering research. However, human studies often do not use the full potential of analysis methods by combining analysis of individual tasks and participants with an analysis that aggregates results over tasks and/or participants. This may hide interesting insights of tasks and participants and...
Stakeholders of configurable systems are often interested in knowing how configuration options influence the performance of a system to facilitate, for example, the debugging and optimization processes of these systems. Several black-box approaches can be used to obtain this information, but they either sample a large number of configurations to ma...
Artificial intelligence has gained considerable momentum in software engineering. One success story is the use of machine learning for performance prediction and optimization of configurable software systems, for which it is often intractable to determine which configuration is optimal. Despite notable success stories, there are major challenges th...
Many software systems offer configuration options to tailor their functionality and non-functional properties (e.g., performance). Often, users are interested in the (performance-)optimal configuration, but struggle to find it, due to missing information on influences of individual configuration options and their interactions. In the past, various...
Detecting feature interactions is imperative for accurately predicting performance of highly-configurable systems. State-of-the-art performance prediction techniques rely on supervised machine learning for detecting feature interactions, which, in turn, relies on time-consuming performance measurements to obtain training data. By providing informat...
Modeling the performance of a highly configurable software system requires capturing the influences of its configuration options and their interactions on the system’s performance. Performance-influence models quantify these influences, thereby explaining the performance behavior of a configurable system as a whole. To be useful in practice, a per...
In configurable software systems, stakeholders are often interested in knowing how configuration options influence the performance of a system to facilitate, for example, the debugging and optimization processes of these systems. There are several black-box approaches to obtain this information, but they usually require a large number of samples to...
Configuring a software system to optimize non-functional properties is a hard task. There are dozens to thousands of configuration options that can affect performance, energy consumption, and other attributes of the resulting program. Even worse, options may interact, such that their combined presence (or absence) has an influence on a non-function...
Iterative program optimization is known to be able to adapt more easily to particular programs and target hardware than model-based approaches. An approach is to generate random program transformations and evaluate their profitability by applying them and benchmarking the transformed program on the target hardware. This procedure’s large computatio...
Most software systems provide options that allow users to tailor the system in terms of functionality and qualities. The increased flexibility raises challenges for understanding the configuration space and the effects of options and their interactions on performance and other non-functional properties. To identify how options and interactions affe...
Despite the huge spread and economic importance of configurable software systems, there is unsatisfactory support in utilizing the full potential of these systems with respect to finding performance-optimal configurations. Prior work on predicting the performance of software configurations suffered from either (a) requiring far too many sample co...
Many software systems today are configurable, offering customization of functionality by feature selection. Understanding how performance varies in terms of feature selection is key for selecting appropriate configurations that meet a set of given requirements. Due to a huge configuration space and the possibly high cost of performance measurement,...
Finding good configurations for a software system is often challenging since the number of configuration options can be large. Software engineers often make poor choices about configuration or, even worse, they usually use a sub-optimal configuration in production, which leads to inadequate performance. To assist engineers in finding the (near) opt...
Multigrid methods are among the most efficient algorithms for solving discretized partial differential equations. Typically, a multigrid system offers various configuration options to tune performance for different applications and hardware platforms. However, knowing the best performing configuration in advance is difficult, because measuring all...
Modern software systems provide many configuration options which significantly influence their non-functional properties. To understand and predict the effect of configuration options, several sampling and learning strategies have been proposed, albeit often with significant cost to cover the highly dimensional configuration space. Recently, transf...
The polyhedron model is a powerful model to systematically identify and apply loop transformations that improve data locality (e.g., via tiling) and enable parallelization. In the polyhedron model, a loop transformation is, essentially, represented as an affine function. Well-established algorithms for the discovery of promising transformations are...
Finding the optimally performing configuration of a software system for a given setting is often challenging. Recent approaches address this challenge by learning performance models based on a sample set of configurations. However, building an accurate performance model can be very expensive (and is often infeasible in practice). The central insigh...
Variability models are often enriched with attributes, such as performance, that encode the influence of features on the respective attribute. In spite of their importance, there are only few attributed variability models available that have attribute values obtained from empirical, real-world observations and that cover interactions between featur...
Software Product Lines (SPLs) are highly configurable systems. This raises the challenge to find optimal performing configurations for an anticipated workload. As SPL configuration spaces are huge, it is infeasible to benchmark all configurations to find an optimal one. Prior work focused on building performance models to predict and optimize SPL c...
Modern software systems are built to be used in dynamic environments using configuration capabilities to adapt to changes and external uncertainties. In a self-adaptation context, we are often interested in reasoning about the performance of the systems under different configurations. Usually, we learn a black-box model based on real measurements t...
Geometric multigrid solvers are among the most efficient methods for solving partial differential equations. To optimize performance, developers have to select an appropriate combination of algorithms for the hardware and problem at hand. Since a manual configuration of a multigrid solver is tedious and does not scale for a large number of differen...
Almost every complex software system today is configurable. While configurability has many benefits, it challenges performance prediction, optimization, and debugging. Often, the influences of individual configuration options on performance are unknown. Worse, configuration options may interact, giving rise to a configuration space of possibly expo...
The vast amount of processing power and memory bandwidth provided by modern graphics cards makes them an interesting platform for data-intensive applications. Unsurprisingly, the database research community identified GPUs as effective co-processors for data processing several years ago. In the past years, there were many approaches to make use of G...
A key idea of feature orientation is to decompose a software product line along the features it provides. Feature decomposition is orthogonal to object-oriented decomposition—it crosscuts the underlying package and class structure. It has been argued often that feature decomposition improves system structure by reducing coupling and by increasing c...
A standard technique for numerically solving elliptic partial differential equations on structured grids is to discretize them, and, then, to apply an efficient geometric multi-grid solver. Unfortunately, finding the optimal choice of multi-grid components and parameter settings is challenging and existing auto-tuning techniques fail to explain per...
Despite the wide use of software product lines, their implementation and evolution is a challenging task. When implementing a feature, a developer has to know which code fragments of other (already implemented) features are accessible in each program variant in which the feature is included. Especially for composition-based implementation technique...
For a decade, the database community has been exploring graphics processing units and other co-processors to accelerate query processing. While the developed algorithms often outperform their CPU counterparts, it is not beneficial to keep processing devices idle while overutilizing others. Therefore, an approach is needed that efficiently distribu...
Most contemporary programs are customizable. They provide many features that give rise to millions of program variants. Determining which feature selection yields an optimal performance is challenging, because of the exponential number of variants. Predicting the performance of a variant based on previous measurements proved successful, but induces...
The vast amount of processing power and memory bandwidth provided by modern graphics cards makes them an interesting platform for data-intensive applications. Unsurprisingly, the database research community identified GPUs as effective co-processors for data processing several years ago. In the past years, there were many approaches to make use...
Configurable software systems allow stakeholders to derive program variants by selecting features. Understanding the correlation between feature selections and performance is important for stakeholders to be able to derive a program variant that meets their requirements. A major challenge in practice is to accurately predict performance based on a...
The feature-interaction problem has been keeping researchers and practitioners in suspense for years. Although there has been substantial progress in developing approaches for modeling, detecting, managing, and resolving feature interactions, we lack sufficient knowledge on the kind of feature interactions that occur in real-world systems. In this...
For a decade, the database community has been researching opportunities to exploit graphics processing units to accelerate query processing. While the developed GPU algorithms often outperform their CPU counterparts, it is not beneficial to keep processing devices idle while overutilizing others. Therefore, an approach is needed that effectively distribut...
Software product-line engineering enables efficient development of tailor-made software by means of reusable artifacts. As practitioners increasingly develop software systems as product lines, there is a growing potential to reuse product lines in other product lines, which we refer to as a multi product line. We identify challenges when developing m...
Software product-line engineering aims at developing families of related products, which share common assets, to provide customers with tailor-made products. Customers are often interested not only in particular functionalities (i.e., features), but also in non-functional quality attributes such as performance, reliability, and footprint. Measuring...
Feature models specify valid combinations of features in software product lines. With dependent feature models (DFMs), we apply separation of concerns to feature models for two main benefits. First, we can modularize feature models into parts relevant to groups of stakeholders. Second, we are able to model dependencies between different software pr...
By adapting to more domains, database management systems (DBMSs) continuously increase their functionality. This leads to DBMSs that often include unnecessary functionality, which decreases performance. A result of this trend is that new specialized systems arise that focus only on a certain application scenario but often reimplement already existin...
To adapt to heterogeneous hardware, software of embedded systems provides customization capacities. Typically, this customization is achieved using conditional compilation with preprocessors. However, preprocessor usage can lead to obfuscated source code that can be difficult to comprehend, which in turn causes increased maintenance costs, bugs, and...
Heterogeneity of embedded systems leads to the development of variable software, such as software product lines. From such a family of programs, stakeholders select the specific variant that satisfies their functional requirements. However, different functionality exposes different non-functional properties of these variants. Especially in the embe...
Today, we are surrounded by computers. However, only a minor part of them are the workstations we might think of. The major part, about 98%, are embedded systems [15], for example PDAs, mobile phones, sensors, or credit cards. For embedded systems, resource constraints regarding memory capacity and performance are characteristic. Furthermore, the hardware...
Customizable programs and program families provide user-selectable features to allow users to tailor a program to an application scenario. Knowing in advance which feature selection yields the best performance is difficult because a direct measurement of all possible feature combinations is infeasible. Our work aims at predicting program performance...
Software product lines (SPLs) and adaptive systems aim at variability to cope with changing requirements. Variability can be described in terms of features, which are central for development and configuration of SPLs. In traditional SPLs, features are bound statically before runtime. By contrast, adaptive systems support feature binding at runtime...