
Stephan Diehl
- Prof. Dr.
- Trier University
About
221
Publications
36,744
Reads
6,313
Citations
Publications (221)
In this paper, the term formula code refers to fragments of source code that implement a mathematical formula. We present empirical studies that analyze the diversity and frequency of formula code in open-source-software projects. In an exploratory study, we investigated what kinds of formulas are implemented in real-world Java projects and derived...
Stack Overflow (SO) is the largest Q&A website for software developers, providing a huge amount of copyable code snippets. Using those snippets raises maintenance and legal issues. SO's license (CC BY-SA 3.0) requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license. While there...
A core goal of Continuous Integration (CI) is to make small incremental changes to software projects, which are integrated frequently into a mainline repository or branch. This paper presents an empirical study that investigates if developers adjust their commit activity towards the above-mentioned goal after projects start using CI. We analyzed th...
Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of code snippets and free-form text on a wide variety of topics. Like other software artifacts, questions and answers on SO evolve over time, for example when bugs in code snippets are fixed, code is updated to work with a more rece...
Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of copyable code snippets. Like other software artifacts, code on SO evolves over time, for example when bugs are fixed or APIs are updated to the most recent version. To be able to analyze how code and the surrounding text on SO ev...
Software development includes diverse tasks such as implementing new features, analyzing requirements, and fixing bugs. Being an expert in those tasks requires a certain set of skills, knowledge, and experience. Several studies investigated individual aspects of software development expertise, but what is missing is a comprehensive theory. We prese...
A core goal of Continuous Integration (CI) is to make small incremental changes to software projects. Those changes should then be integrated frequently into a mainline repository or branch. This paper presents an empirical study investigating if developers adjust their commit activity towards this goal after projects start using CI. To this end, w...
Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of copyable code snippets. Using those snippets raises maintenance and legal issues. SO's license (CC BY-SA 3.0) requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatibl...
Crowdsourcing offers great potential to overcome the limitations of controlled lab studies. To guide future designs of crowdsourcing-based studies for visualization, we review visualization research that has attempted to leverage crowdsourcing for empirical evaluations of visualizations. We discuss six core aspects for successful employment of crow...
Sketching is an important activity for understanding, designing, and communicating different aspects of software systems such as their requirements or architecture. Often, sketches start on paper or whiteboards, are revised, and may evolve into a digital version. Users may then print a revised sketch, change it on paper, and digitize it again. Exis...
Background: Reaching out to professional software developers is a crucial part of empirical software engineering research. One important method to investigate the state of practice is survey research. As drawing a random sample of professional software developers for a survey is rarely possible, researchers rely on various sampling strategies. Obje...
Stack Overflow (SO) is the largest Q&A website for developers, providing a huge amount of copyable code snippets. Using these snippets raises various maintenance and legal issues. The SO license requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license. While there is a heated d...
Background: Performance bugs can lead to severe issues regarding computation efficiency, power consumption, and user experience. Locating these bugs is a difficult task because developers have to judge for every costly operation whether runtime is consumed necessarily or unnecessarily. Objective: We wanted to investigate how developers, when locati...
Recent studies have shown that sketches and diagrams play an important role in the daily work of software developers. If these visual artifacts are archived, they are often detached from the source code they document, because there is no adequate tool support to assist developers in capturing, archiving, and retrieving sketches related to certain s...
Sketches and diagrams play an important role in the daily work of software developers. In this paper, we investigate the use of sketches and diagrams in software engineering practice. To this end, we used both quantitative and qualitative methods. We present the results of an exploratory study in three companies and an online survey with 394 partic...
Dynamic graph visualization focuses on the challenge of representing the evolution of relationships between entities in readable, scalable and effective diagrams. This work surveys the growing number of approaches in this discipline. We derive a hierarchical taxonomy of techniques by systematically categorizing and tagging publications. While stati...
Requirements engineering produces specifications of the needs or conditions to meet for a software product. These specifications may be vague and ungrounded, i.e. the relation of the requirements to the observations they are derived from may be unclear or not documented. Furthermore, stakeholders may be influenced by solutions of existing software...
Humans are very efficient in processing and remembering visual information. That is why metaphors and visual representations are important in education. Because of their high visual expressiveness, presentation tools like Microsoft PowerPoint are very popular for teaching in classrooms. However, representing source code with such tools is tedious a...
One of software developers' most important activities is exploring the broader context of a certain programming task, which strongly requires navigating source code and working out a mental model of the collected information. Without tool support, creating and maintaining this mental model leads to significant cognitive load because developers have...
In a selective retrospective of the history of software visualization we discuss examples of applying visualization techniques to analyze the past and present state of software. Based on this retrospective, we make various suggestions for future research. In particular, we argue that the prediction of future aspects of a software system is an impor...
Keywords or tags summarize documents on an abstract level and can also be used for describing code fragments. They might be leveraged for retrieving features of a software system, understanding program functionality, or providing additional context. While automatic approaches at best are only able to retrieve information that is already contained i...
Recent studies have shown that sketches and diagrams play an important role in the daily work of software developers. If these visual artifacts are archived, they are often detached from the source code they document, because there is no adequate tool support to assist developers in capturing, archiving, and retrieving sketches related to certain...
In software engineering the comparison of graph-based models is a well-known problem. Although different comparison metrics have been proposed, there are situations in which automatic or pre-configured approaches do not provide reasonable results. Especially when models contain semantic similarities or differences, additional human knowledge is oft...
The comparison of hierarchies is a data analysis task for which a number of visualization approaches already exist. Generally, this can be regarded as a special form of graph comparison. These techniques typically handle two or more compared hierarchies all in the same way. In many practical applications, however, there are reasons why one of the hi...
Visual comparison of hierarchies such as directory structures is often considered a passive analysis task. Thus, insights gained from the visualization need to be recorded and applied afterwards. In contrast, in this paper we propose and explore an active visual analytics approach focusing on the manipulation of directory structures in the context...
Dynamic graph visualization focuses on the challenge of representing the evolution of relationships between entities in readable, scalable, and effective diagrams. This work surveys the growing number of approaches in this discipline. We derive a hierarchical taxonomy of techniques by systematically categorizing and tagging publications. While stat...
Nowadays file browsers represent a common means to organize hierarchically structured data provided by a file system. However, when it comes to comparing and merging different directory structures, file browsers often do not explicitly support users to accomplish such a task. In this paper, we present a special-purpose approach focusing on the mani...
Touch gestures are not only often very intuitive, but their direct manipulation characteristics also help to reduce the cognitive load. Since software development poses complex cognitive demands, our goal is to exploit the advantages of direct manipulation to support professional software engineering processes. In this paper, we demonstrate how tou...
During early phases of a software development process co-located group work is an important technique that involves all stakeholders to derive requirements and a design of a future software system. However, such group work is usually applied without any computer-assistance and often faces the problem that information is not well preserved for subse...
Multivariate networks, or graphs, are an essential element of various activities in the software engineering domain, such as program comprehension for software maintenance and evolution. In this chapter, we present the specific context in which multivariate graphs occur in software engineering, highlight their importance in domain-specific tasks, a...
In recent years, research on model merging has often focused on algorithmic problems. Although a lot of tasks may be done automatically by a tool, there still exist situations when users want to or sometimes have to intervene. Thus, it is also crucial to provide well-designed user interfaces that allow users to adapt or modify results of an automat...
The evolution of a software project is a rich data source for analyzing and improving the software development process. Recently, several research groups have tried to cluster source code artifacts based on information about how the code of a software system evolves. The results of these evolutionary approaches seem promising, but a direct comparis...
Software systems are often modeled and visualized as graphs in order to understand their higher-level structure: code entities are connected by dependencies or couplings. However, when only considering one type of code coupling such as method calls, the understanding gained stays limited to this specific aspect. Encoding multiple types of code coup...
Numeric variables are one of the most frequently used data types. During the execution of a program, their values might change often. Tracing these changes can be necessary for understanding specific behavior of the program or for locating bugs. However, using a breakpoint debugger requires tedious stepping, and logging changes implies analyzing la...
Mapping a dynamic graph dataset to an inappropriate visualization leads to a degradation of visualization performance at some task. To tap the full potential of existing dynamic graph visualization techniques, we propose a methodology for matching application requirements with dynamic graph visualization profiles. We aim to support experts ch...
Presenting source code to others is a typical task not only for computer science teachers but also for practitioners and researchers in software engineering. Usually, classical presentation tools or source code editors are used for such presentations. However, while the former are too inflexible for the presenter to...
Finding and fixing performance bottlenecks requires sound knowledge of the program that is to be optimized. In this paper, we propose an approach for presenting performance-related information to software engineers by visually augmenting source code shown in an editor. Small diagrams at each method declaration and method call visualize the propagat...
Object-orientation is one of the essential parts of every software engineering course. However, according to literature, it often lacks the following: First, modeling on a conceptual level independent from a particular programming language is often neglected. Moreover, the actual process of designing or implementing a piece of software seems to be...
In requirements engineering, CRC modeling and use case analysis are established techniques and are often performed as a group work activity. In particular, role play is used to involve different stakeholders in the use case analysis. To support this kind of co-located collaboration we developed CREW-Space, which allows several users to simultaneo...
Rapid Serial Visual Presentation is an effective approach for browsing and searching large amounts of data. By presenting subsequent images at high frequency, we utilize the perceptual abilities of the human visual system to rapidly process certain visual features. While this concept is successfully used in video and image browsing, we demonstrate...
In the context of software reuse, software redesign, or outsourcing, it can be useful to extract a component of a software system so that this subsystem can be developed further independently and efficiently. The focus of this work is the design and implementation of a user interface that [...] a sem...
In source code files, fields and methods are arranged in linear order. Modern programming languages such as Java do not constrain this order; developers are free to choose any sequence. In this paper we examine the largely unexplored strategies developers apply for ordering fields and methods: First, we use visualization to explore different orderin...
We present a novel dynamic graph visualization technique based on node-link diagrams. The graphs are drawn side-by-side from left to right as a sequence of narrow stripes that are placed perpendicular to the horizontal time line. The hierarchically organized vertices of the graphs are arranged on vertical, parallel lines that bound the stripes; dire...
Software systems are modularized to make their inherent complexity manageable. While there exists a set of well-known principles that may guide software engineers to design the modules of a software system, we do not know which principles are followed in practice. In a study based on 16 open source projects, we look at different kinds of coupling c...
Dependencies and coupling relationships between code entities can be manifold. They form a graph structure with several different types of edges. Visualizing these graphs presents two challenges: the often large size of the graphs and the readable representation of the different edge types. In this paper we present a new node-link graph visualizati...
So far, research on model merging has mostly focused on algorithmic problems. But there are various situations when software engineers have to compare and merge different models manually or at least make important decisions. In this paper, we provide insights into the process of how users compare and merge visual models. To this end, we observed p...
In this paper, we present CREWW, a tool for co-located, collaborative CRC modeling and use case analysis. In CRC sessions role play is used to involve all stakeholders when determining whether the current software model completely and consistently captures the modeled use case. In this activity it quickly becomes difficult to keep track of which cl...
Identifying refactorings in software archives has been an active research topic in the last decade, mainly because it is a prerequisite for various software evolution analyses (e.g., error detection, capturing intent of change, capturing and replaying changes, and relating refactorings and software metrics). Many of these techniques rely on similar...
Why computer scientists should come out from "behind the scenes" more often and work with the media to draw public attention to their fundamental innovations.
Radial visualizations play an important role in the information visualization community. But the decision to choose a radial coordinate system is based on intuition rather than on scientific foundations. The empirical approach presented in this paper aims at uncovering strengths and weaknesses of radial visualizations by comparing them to equivalen...
Reverse engineering methods produce different descriptions of software architectures. In this article we address the task of exploring and comparing these descriptions. We present a novel visualization technique to compare architectures consisting of a decomposition of the software system and the dependencies among the code entities. This technique...
Code clone detection is an enabling technology for plenty of applications, each having different requirements for a clone detector. In this paper we present a generic pipeline model of the code clone detection process. Based on this model we developed the JCCD code clone detection API for implementing custom clone detectors. By combining and paramet...
In a software project, outsourcing the development of a particular functionality, reusing a part in another software, or handing over a part of the code to a new team member requires the extraction of an independent subset of the software: a component. This paper describes and analyzes the process of extracting such a component. We introduce an auto...
Code clone detection is an enabling technology for plenty of applications, each having different requirements for a code clone detector. In the tool demonstration we present JCCD, a code clone detection API, which is based on a pipeline model. By combining and parameterizing predefined API components as well as by adding new components, the pipelin...
Graphs are a mathematical method to model relations between objects. The most common metaphor to visualize graphs is the node-link technique, which typically suffers from visual clutter caused by many edge crossings. Much research has been done on the development of sophisticated algorithms aimed at enhancing the layout with respect to edge crossin...
In business as well as science a clear and professional presentation of quantitative information is often required and helps to efficiently communicate new insights. The predominant approach is to integrate charts into slide shows created with standard presentation programs. In this paper, we introduce the chart flight metaphor for visualizing spat...
Dynamic recompilation tries to produce more efficient code by exploiting runtime information. Virtual machines like the Jikes RVM use recompilation heuristics to decide how to recompile the program, i.e. what parts are recompiled at what level of optimization. In this paper we present our post-mortem amortization analysis based on improved call sta...
The experiment investigated whether controlling presentation speed as well as labels, which display the names of the currently presented nodes in interactive dynamic graphs, affects comprehension performance. Dynamic graphs are animated graphical representations of nodes and edges representing mathematical structures used to model relations between...
Most research on the readability of graph visualization focuses on node-link diagrams of static graphs. But in many applications graphs are not static, but change over time, or graphs are too dense to be drawn as node-link diagrams. In this paper we look at dynamic graph visualizations: We translate the general goal of graph visualization--to conve...
Many applications feature large hierarchic dynamic graphs that change over time. Often, these changes are more important than the graphs themselves. In our approach, areas of interest in dynamic graphs are detected based on user preferences. The user is guided from one area of interest to another in such a way that reduced contextual information is...
Software evolution studies have traditionally focused on individual products. In this study we scale up the idea of software evolution by considering software compilations composed of a large quantity of independently developed products, engineered to ...
Compound digraphs are a widely used model in computer science. In many application domains these models evolve over time. Only a few approaches to visualizing such dynamic compound digraphs exist, and most use animation to show the dynamics. In this paper we present a new visualization tool called TimeArcTrees that visualizes weighted, dynamic compoun...
In this paper we present the results of a user study comparing the readability of force-directed, orthogonal, and hierarchical graph layouts. To this end we identified prototypical tasks which are solved using visual representations of graphs. Based on the correctness of answers and the related response time we evaluated for each task which layout...
Many recently developed information visualization techniques are radial variants of originally Cartesian visualizations. Almost none of these radial variants have been evaluated with respect to their benefits over their original visualizations. In this work we compare a radial and a Cartesian variant of a visualization tool for sequences of transac...
In many applications transactions between the elements of an information hierarchy occur over time. For example, the product offers of a department store can be organized into product groups and subgroups to form an information hierarchy. A market basket consisting of the products bought by a customer forms a transaction. Market baskets of one or mor...
Software development is heavily dependent on the participants of the process and their roles within the process. Each developer has specific skills and interests and hence contributes to the project in a different way. While some programmers work on separate modules, other developers integrate these modules towards the final product. To identi...
While there is a considerable amount of research on analyzing the change information stored in software repositories, only a few researchers have looked at software changes contained in email archives in the form of patches. In this paper we look at the email archives of two open source projects and answer questions like the following: How many emails con...
The evolution of dependencies in information hierarchies can be modeled by sequences of compound digraphs with edge weights. In this paper we present a novel approach to visualize such sequences of graphs. It uses a radial tree layout to draw the hierarchy, and circle sectors to represent the temporal change of edges in the digraphs. We have develope...
In this paper we present a generic algorithm for drawing sequences of graphs. This algorithm works for different layout algorithms and related metrics and adjustment strategies. It differs from previous work on dynamic graph drawing in that it considers all graphs in the sequence (offline) instead of just the previous ones (online) when computing t...
Software visualization encompasses the development and evaluation of methods for graphically representing different aspects of software, including its structure, its execution, and its evolution. Software visualization combines techniques from areas like software engineering, programming languages, data mining, computer graphics, information visual...
Software has been and is still mostly refactored without tool support. Moreover, as we found in our case studies, programmers tend not to document these changes as refactorings or, even worse, label changes as refactorings although they are not. In this paper we present a technique to detect changes that are likely to be refactorings and rank them...
In the past 40 years, software engineering has emerged as an important sub-field of computer science. The quality and productivity of software have been improved and the cost and risk of software development have been decreased due to the contributions made ...
Software repositories such as source control systems, defect tracking systems, or archived communications between project personnel are used to help manage the progress of software projects. Software practitioners and researchers are beginning to recognize the potential benefit of mining this information to support the maintenance of software systems,...
In this paper we combine the results of our refactoring reconstruction technique with bug, mail and release information to perform process and bug analyses of the ARGOUML CVS archive.
Refactorings are program transformations which should preserve the program behavior. Consequently, we expect that during phases when there are mostly refactorings in the change history of a system, only few new bugs are introduced. For our case study we analyzed the version histories of several open source systems and reconstructed the refactorings...
This paper deals with the visual representation of a particular kind of structured data: trees where each node is associated with an object (leaf node) of a taxonomy. We introduce a new visualization technique that we call Trees In A Treemap. In this visualization edges can either be drawn as straight or orthogonal edges. We compare our technique w...
Like any other science, software engineering is based on historical experience: what worked in the past and what did not? From the development history of a program, as recorded in software archives, such experience can be derived and put to use, for example to suggest related program locations (wei...
We perform knowledge discovery in software archives in order to detect refactorings on the level of classes and methods. Our REFVIS prototype finds these refactorings in CVS repositories and relates them to transactions. Additionally, REFVIS relates movements of methods to the class inheritance hierarchy of the analyzed project. REFVIS creates visu...