About
53 Publications
9,970 Reads
1,600 Citations
Publications (53)
Code completion aims at speeding up code writing by recommending to developers the next tokens they are likely to type. Deep Learning (DL) models pushed the boundaries of code completion by redefining what these coding assistants can do: we moved from predicting a few code tokens to automatically generating entire functions. One important factor impa...
Developers who interrupt their involvement in a project can gradually forget critical information about the code, such as its purpose, structure, the impact of external dependencies, and the approach to implementation. Forgetting the implementation details can have detrimental effects on software maintenance, comprehension, knowledge sharing, and d...
[Context] Coupling is a metric widely discussed by software engineers developing complex software systems, often referred to as a crucial factor and symptom of a poor or good design. Nevertheless, measuring the logical coupling among microservices and analyzing the interactions between services is non-trivial because it demands runtime inform...
Identifiers, such as method and variable names, form a large portion of source code. Therefore, low-quality identifiers can substantially hinder code comprehension. To support developers in using meaningful identifiers, several (semi-)automatic techniques have been proposed, mostly being data-driven (e.g., statistical language models, deep learning...
Transformers have gained popularity in the software engineering (SE) literature. These deep learning models are usually pre-trained through a self-supervised objective, meant to provide the model with basic knowledge about a language of interest (e.g., Java). A classic pre-training objective is the masked language model (MLM), in which a percentage...
The automatic generation of source code is one of the long-lasting dreams in software engineering research. Several techniques have been proposed to speed up the writing of new code. For example, code completion techniques can recommend to developers the next few tokens they are likely to type, while retrieval-based approaches can suggest code snip...
Software engineering research has always been concerned with the improvement of code completion approaches, which suggest the next tokens a developer will likely type while coding. The release of GitHub Copilot constitutes a big step forward, also because of its unprecedented ability to automatically generate even entire functions from their natur...
Coupling is one of the most frequently mentioned metrics in software systems. However, measuring logical coupling between microservices requires either runtime information or service-log files to analyze the calls between services. This work presents our emerging results, in which we propose a metric to statically calcula...
Fine-grained just-in-time defect prediction aims at identifying likely defective files within new commits pushed by developers onto a shared repository. Most of the techniques proposed in the literature are based on supervised learning, where machine learning algorithms are fed with historical data. One of the limitations of these techniques is concern...
Deep Learning (DL) models have been widely used to support code completion. These models, once properly trained, can take as input an incomplete code component (e.g., an incomplete function) and predict the missing tokens to finalize it. GitHub Copilot is an example of a code recommender built by training a DL model on millions of open source reposit...
Different from what happens for most types of software systems, testing video games has largely remained a manual activity performed by human testers. This is mostly due to the continuous and intelligent user interaction video games require. Recently, reinforcement learning (RL) has been exploited to partially automate functional testing. RL enable...
Code review is a practice widely adopted in open source and industrial projects. Given the non-negligible cost of such a process, researchers started investigating the possibility of automating specific code review tasks. We recently proposed Deep Learning (DL) models targeting the automation of two tasks: the first model takes as input a code subm...
Logging is a practice widely adopted in several phases of the software lifecycle. For example, during software development log statements allow engineers to verify and debug the system by exposing fine-grained information of the running software. While the benefits of logging are undisputed, taking proper decisions about where to inject log stateme...
Code completion aims at speeding up code writing by predicting the next code token(s) the developer is likely to write. Works in this field focused on improving the accuracy of the generated predictions, with substantial leaps forward made possible by deep learning (DL) models. However, code completion techniques are mostly evaluated in the scenari...
Code comments play a prominent role in program comprehension activities. However, source code is not always documented, and code and comments do not always co-evolve. To deal with these issues, researchers have proposed techniques to automatically generate comments documenting a given code at hand. The most recent works in the area applied deep learnin...
Software logs are of great value in both industrial and open-source projects. Mobile analytics logging enables developers to collect logs remotely from their apps running on end user devices at the cost of recording and transmitting logs across the Internet to a centralised infrastructure. This paper makes a first step in characterising logging pra...
Code completion is one of the main features of modern Integrated Development Environments (IDEs). Its objective is to speed up code writing by predicting the next code token(s) the developer is likely to write. Research in this area has substantially bolstered the predictive performance of these techniques. However, the support to developers is sti...
The SZZ algorithm for identifying bug-inducing changes has been widely used to evaluate defect prediction techniques and to empirically investigate when, how, and by whom bugs are introduced. Over the years, researchers have proposed several heuristics to improve the SZZ accuracy, providing various implementations of SZZ. However, fairly evaluating...
Code reviews are popular in both industrial and open source projects. The benefits of code reviews are widely recognized and include better code quality and lower likelihood of introducing bugs. However, since code review is a manual activity it comes at the cost of spending developers' time on reviewing their teammates' code. Our goal is to make t...
Code smells are symptoms of poor design quality. Since code review is a process that also aims at improving code quality, we investigate whether and how code review influences the severity of code smells. In this study, we analyze more than 21,000 code reviews belonging to seven Java open-source projects; we find that active and participated code r...
Bug prediction is aimed at identifying software artifacts that are more likely to be defective in the future. Most approaches defined so far target the prediction of bugs at class/file level. Nevertheless, past research has provided evidence that this granularity is too coarse-grained for its use in practice. As a consequence, researchers have star...
Healthcare mobile apps are becoming a reality for users interested in keeping their daily activities under control. In recent years, several researchers have investigated the effect of healthcare mobile apps on the life of their users as well as the positive/negative impact they have on the quality of life. Nonetheless, it still remains unclear h...
Code comments are a key software component containing information about the underlying implementation. Several studies have shown that code comments enhance the readability of the code. Nevertheless, not all the comments have the same goal and target audience. In this paper, we investigate how 14 diverse Java open and closed source software project...
Defect prediction models focus on identifying defect-prone code elements, for example to allow practitioners to allocate testing resources on specific subsystems and to provide assistance during code reviews. While the research community has been highly active in proposing metrics and methods to predict defects over long-term periods (i.e., at release...
Contemporary code review is a widespread practice used by software engineers to maintain high software quality and share project knowledge. However, conducting proper code review takes time and developers often have limited time for review. In this paper, we aim at investigating the information that reviewers need to conduct a proper code review, t...
Recent research has provided evidence that, in the industrial context, developing video games diverges from developing software systems in other domains, such as office suites and system utilities. In this paper, we consider video game development in the open source system (OSS) context. Specifically, we investigate how developers contribute to vi...
Obtaining a good dataset to conduct empirical studies on the engineering of Android apps is an open challenge. To start tackling this challenge, we present AndroidTimeMachine, the first, self-contained, publicly available dataset weaving spread-out data sources about real-world, open-source Android apps. Encoded as a graph-based database, AndroidTi...
To gain a deeper empirical understanding of how developers work on Android apps, we investigate self-reported activities of Android developers and to what extent these activities can be classified with machine learning techniques. To this aim, we first create a taxonomy of self-reported activities coming from the manual analysis of 5,000 commit m...
Developers adopt code comments for different reasons, such as documenting source code or changing program flow. Due to the variety of use scenarios, code comments may impact readability and maintainability. In this study, we investigate how developers of 5 open-source mobile applications use code comments to document their projects. Additionally, we e...
Bug prediction is aimed at supporting developers in the identification of code artifacts more likely to be defective. Researchers have proposed prediction models to identify bug prone methods and provided promising evidence that it is possible to operate at this level of granularity. Particularly, models based on a mixture of product and process me...
Past research provided evidence that developers making code changes sometimes omit to update the related documentation, thus creating inconsistencies that may contribute to faults and crashes. In dynamically typed languages, such as Python, an inconsistency in the documentation may lead to a mismatch in type declarations only visible at runtime. Wi...
Code comments are a key software component containing information about the underlying implementation. Several studies have shown that code comments enhance the readability of the code. Nevertheless, not all the comments have the same goal and target audience. In this paper, we investigate how six diverse Java OSS projects use code comments, with t...