About
325
Publications
64,781
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
12,777
Citations
Introduction
All publications are available on my website
Current institution
Additional affiliations
September 2004 - present
Education
October 1999 - May 2003
Publications
Publications (325)
Over the past decade, microservices have gained significant popularity, impacting how applications are designed and deployed. Maintaining a comprehensive high-level view of microservices applications is essential, especially for software evolution tasks, enabling developers to understand, maintain, and optimize the complex interactions across vario...
A crucial activity in software maintenance and evolution is the comprehension of the changes performed by developers, when they submit a pull request and/or perform a commit on the repository. Typically, code changes are represented in the form of code diffs, textual representations highlighting the differences between two file versions, depicting...
Managing data-intensive software ecosystems has long been considered an expensive and error-prone process. This is mainly due to the often implicit consistency relationships between applications and their database(s). In addition, as new technologies emerged for specialized purposes (e.g., key-value stores, document stores, graph databases), the co...
Fuzz testing, also known as fuzzing, is a software testing technique aimed at identifying software vulnerabilities. In recent decades, fuzzing has gained increasing popularity in the research community. However, existing studies led by fuzzing experts mainly focus on improving the coverage and performance of fuzzing techniques. That is, there is st...
Classical software documentation, as it was conceived and intended decades ago, is not the only reality anymore. Official documentation from authoritative and official sources is being replaced by real-time collaborative platforms and ecosystems that have seen a surge, influenced by changes in society, technology, and best practices. These modern t...
Modern instant messaging applications (e.g., Gitter, Slack, Discord) provide users with real-time communication means. Developers use them for collaborative development, to ask for code reviews, and to have software-related discussions. In short, a (potential) treasure trove for program comprehension. However, as with any high-throughput "chat appl...
In object-oriented programming, classes are the primary abstraction mechanism used by and exposed to developers. Understanding classes is key for the development and evolution of object-oriented applications. The fundamental problem faced by developers is that while classes are intrinsically structured entities, in IDEs they are represented as a bl...
Context
Over the past decades, researchers proposed numerous approaches to visualize source code. A popular one is CodeCity, an interactive 3D software visualization representing software system as cities: buildings represent classes (or files) and districts represent packages (or folders). Building dimensions represent values of software metrics,...
Opinion mining, sometimes referred to as sentiment analysis, has gained increasing attention in software engineering (SE) studies. SE researchers have applied opinion mining techniques in various contexts, such as identifying developers’ emotions expressed in code comments and extracting users’ critics toward mobile apps. Given the large amount of...
Refactoring aims at improving the maintainability of source code without modifying its external behavior. Previous works proposed approaches to recommend refactoring solutions to software developers. The generation of the recommended solutions is guided by metrics acting as proxy for maintainability (e.g., number of code smells removed by the recom...
Refactoring operations are behavior-preserving changes aimed at improving source code quality. While refactoring is largely considered a good practice, refactoring proposals in pull requests are often rejected after the code review. Understanding the reasons behind the rejection of refactoring contributions can shed light on how such contributions...
The city metaphor for visualizing software systems in 3D has been widely explored and has led to many diverse implementations and approaches. Common among all approaches is a focus on the software artifacts, while the aspects pertaining to the data and information (stored both in databases and files) used by a system are seldom taken into account....
Most changes during software maintenance and evolution are not atomic changes, but rather the result of several related changes affecting different parts of the code. It may happen that developers omit needed changes, thus leaving a task partially unfinished, introducing technical debt or injecting bugs. We present a study investigating “ quick rem...
Code completion is one of the killer features of Integrated Development Environments (IDEs), and researchers have proposed different methods to improve its accuracy. While these techniques are valuable to speed up code writing, they are limited to recommendations related to the next few tokens a developer is likely to type given the current context...
The SZZ algorithm for identifying bug-inducing changes has been widely used to evaluate defect prediction techniques and to empirically investigate when, how, and by whom bugs are introduced. Over the years, researchers have proposed several heuristics to improve the SZZ accuracy, providing various implementations of SZZ. However, fairly evaluating...
Android fragmentation is a well-known issue referring to the adoption of different versions in the multitude of devices supporting such an operating system. Each Android version features a set of APIs provided to developers. These APIs are subject to changes and may cause compatibility issues. To support app developers, approaches have been propose...
Modern software is developed under considerable time pressure, which implies that developers more often than not have to resort to compromises when it comes to code that is well written and code that just does the job. This has led over the past decades to the concept of "technical debt", a short-term hack that potentially generates long-term maint...
Software systems are continuously modified to implement new features, to fix bugs, and to improve quality attributes. Most of these activities are not atomic changes, but rather the result of several related changes affecting different parts of the code. For this reason, it may happen that developers omit some of the needed changes and, as a conseq...
Knowledge transfer is one of the main goals of modern code review, as shown by several studies that surveyed and interviewed developers. While knowledge transfer is a clear expectation of the code review process, there are no analytical studies using data mined from software repositories to assess the effectiveness of code review in "training" deve...
Software visualization is a program comprehension technique used in the context of software maintenance, reverse engineering, and software evolution analysis. In the last decade, researchers have been exploring 3D representations for visualizing programs. Among these representations, one of the most popular is the city metaphor, which represents a...
Background: Researchers have been exploring 3D representations for visualizing software. Among these representations, one of the most popular is the city metaphor, which represents a target object-oriented system as a virtual city. Recently, this metaphor has been also implemented in interactive software visualization tools that use virtual reality...
Developers do not always have the knowledge needed to understand source code and must refer to different resources (e.g., teammates, documentation, the web). This non-trivial process, called program comprehension, is very time-consuming. While many approaches support the comprehension of a given code at hand, they are mostly focused on defining ext...
Sentiment analysis has been applied to various software engineering (SE) tasks, such as evaluating app reviews or analyzing developers' emotions in commit messages. Studies indicate that sentiment analysis tools provide unreliable results when used out-of-the-box, since they are not designed to process SE datasets. The silver bullet for a successfu...
Software development video tutorials have seen a steep increase in popularity in recent years. Their main advantage is that they thoroughly illustrate how certain technologies, programming languages, etc. are to be used. However, they come with a caveat: there is currently little support for searching and browsing their content. This makes it diffi...
Code annotations are the core of the main APIs and frameworks for enterprise development, and are widely used on several applications. However, despite these APIs and frameworks made advanced uses of annotations, the language API for annotation reading is far from their needs. In particular, annotation reading is still a relatively complex task, th...
Software repositories typically store data composed of structured and unstructured parts. Researchers mine this data to empirically validate research ideas and to support practitioners' activities. Structured data (e.g., source code) has a formal syntax and is straightforward to analyze; unstructured data (e.g., documentation) is a mix of natural l...
Software development, like any prolonged and intellectually demanding activity, can negatively affect the motivation of developers. This is especially true in specific areas of software engineering, such as requirements engineering, test-driven development, bug reporting and fixing, where the creative aspects of programming fall short. The develope...
If measured by the number of published papers, defect prediction has become an important research field over the past decade, with many researchers continuously proposing novel approaches to predict defects in software systems. However, most of these approaches have had a noticeable lack of impact on industrial practice. This lack of impact is beca...
To ensure quality of software systems, developers use bug reports to track defects. It is in the interest of users and developers that bug reports provide the necessary information to ease the fixing process. Past research found that users do not provide the information that developers deem ideally useful to fix a bug. This raises an interesting qu...
When knowledgeable colleagues are not available, developers resort to offline and online resources, e.g., tutorials, mailing lists, and Q&A websites. These, however, need to be found, read, and understood, which takes its toll in terms of time and mental energy. A more immediate and accessible resource are video tutorials found on the web, which in...
Nowadays developers heavily rely on sources of informal documentation, including Q&A forums, slides, or video tutorials, the latter being particularly useful to provide introductory notions for a piece of technology. The current practice is that developers have to browse sources individually, which in the case of video tutorials is cumbersome, as t...
While coding, developers construct and maintain mental models of software systems to support the task at hand. Although source code is the main product of software development, the process involves navigating and inspecting entities beyond the ones that are edited by the end of a task. Developers use various user interfaces (UI) offered by the Inte...
Java is a safe language. Its runtime environment provides strong safety guarantees that any Java application can rely on. Or so we think. We show that the runtime actually does not provide these guarantees-for a large fraction of today's Java code. Unbeknownst to many application developers, the Java runtime includes a "backdoor" that allows expert...
Developers often require knowledge beyond the one they possess, which boils down to asking co-workers for help or consulting additional sources of information, such as Application Programming Interfaces (API) documentation, forums, and Q&A websites. However, it requires time and energy to formulate one’s problem, peruse and process the results. We...
We present ViDI (Visual Design Inspector), a novel code review tool which focuses on quality concerns and design inspection as its cornerstones. It leverages visualization techniques to represent the reviewed software and augments the visualization with the results of quality analysis tools. To effectively understand the contribution of a reviewer...
Summarization is hailed as a promising approach to reduce the amount of information that must be taken in by the person who wants to understand development artifacts, such as pieces of code, bug reports, emails, etc. However, existing approaches treat artifacts as pure textual entities, disregarding the heterogeneous and partially structured nature...