Diego Elias Costa

Université du Québec à Montréal | UQAM

PhD

About

45
Publications
40,383
Reads
229
Citations
Introduction
I am a postdoctoral researcher at the DAS Lab, led by Prof. Emad Shihab, which is part of the Department of Computer Science and Software Engineering at Concordia University. My research interests cover a wide range of software engineering and performance engineering topics, including mining software repositories, software ecosystems, dependency management, bots in software engineering, and performance testing.
Additional affiliations
September 2019 - July 2021
Concordia University
Position
  • PostDoc Position
January 2015 - August 2019
Universität Heidelberg
Position
  • PhD Student
November 2012 - December 2014
Universidade Federal de Uberlândia (UFU)
Position
  • Master's Student

Publications (45)
Preprint
Full-text available
Modern software systems are often built by leveraging code written by others in the form of libraries and packages to accelerate their development. While there are many benefits to using third-party packages, software projects often become dependent on a large number of software packages. Consequently, developers are faced with the difficult challe...
Preprint
Full-text available
The Open Source Software movement has been growing exponentially for a number of years with no signs of slowing. Driving this growth is the widespread availability of libraries and frameworks that provide many functionalities. Developers are saving time and money incorporating this functionality into their applications resulting in faster more feat...
Article
Full-text available
Pull-based development has enabled numerous volunteers to contribute to open-source projects with fewer barriers. Nevertheless, a considerable amount of pull requests (PRs) with valid contributions are abandoned by their contributors, wasting the effort and time put in by both the contributors and maintainers. To better understand the underlying d...
Article
Full-text available
Context. While in serverless computing, application resource management and operational concerns are generally delegated to the cloud provider, ensuring that serverless applications meet their performance requirements is still a responsibility of the developers. Performance testing is a commonly used performance assessment practice; however, it trad...
Article
Full-text available
Nowadays, wearables-based Human Activity Recognition (HAR) systems represent a modern, robust, and lightweight solution to monitor athlete performance. However, user data variability is a problem that may hinder the performance of HAR systems, especially the cross-subject HAR models. Such a problem may have a lesser effect on the subject-specific m...
Article
Full-text available
Due to their increasing complexity, today's software systems are frequently built by leveraging reusable code in the form of libraries and packages. Software ecosystems (e.g., npm) are the primary enablers of this code reuse, providing developers with a platform to share their own and use others' code. These ecosystems evolve rapidly: developers ad...
Preprint
Pull-based development has enabled numerous volunteers to contribute to open-source projects with fewer barriers. Nevertheless, a considerable amount of pull requests (PRs) with valid contributions are abandoned by their contributors, wasting the effort and time put in by both their contributors and maintainers. To gain a more comprehensive underst...
Article
Full-text available
Inertial sensors are widely used in the field of human activity recognition (HAR), since this source of information is the most informative time series among non-visual datasets. HAR researchers are actively exploring other approaches and different sources of signals to improve the performance of HAR systems. In this study, we investigate the impac...
Chapter
Java 8 marked a shift in the Java development landscape by introducing functional-like concepts in its stream library. Java developers can now rely on stream pipelines to simplify data processing, reduce verbosity, easily enable parallel processing and increase the expressiveness of their code. While streams have seemingly positive effects in Java...
Article
Full-text available
Dependency management in modern software development poses many challenges for developers who wish to stay up to date with the latest features and fixes whilst ensuring backwards compatibility. Project maintainers have opted for varied, and sometimes conflicting, approaches for maintaining their dependencies. Opting for unsuitable approaches can in...
Conference Paper
Full-text available
Java 8 marked a shift in the Java development landscape by introducing functional-like concepts in its stream library. Java developers can now rely on stream pipelines to simplify data processing, reduce verbosity, easily enable parallel processing and increase the expressiveness of their code. While streams have seemingly positive effects on Java...
Preprint
Full-text available
Context. While in serverless computing, application resource management and operational concerns are generally delegated to the cloud provider, ensuring that serverless applications meet their performance requirements is still a responsibility of the developers. Performance testing is a commonly used performance assessment practice; however, it tra...
Preprint
Due to their increasing complexity, today's software systems are frequently built by leveraging reusable code in the form of libraries and packages. Software ecosystems (e.g., npm) are the primary enablers of this code reuse, providing developers with a platform to share their own and use others' code. These ecosystems evolve rapidly: developers add...
Article
Full-text available
Chatbots are envisioned to dramatically change the future of Software Engineering, allowing practitioners to chat and inquire about their software projects and interact with different services using natural language. At the heart of every chatbot is a Natural Language Understanding (NLU) component that enables the chatbot to understand natural lang...
Conference Paper
Full-text available
Continuous Integration (CI) is the process of automatically compiling, building, and testing code changes in the hope of catching bugs as they are introduced into the code base. With bug fixing being a core and increasingly costly task in software development, the community has adopted CI to mitigate this issue and improve the quality of their soft...
Conference Paper
Full-text available
Vulnerable dependencies are a major problem in modern software development. As software projects depend on multiple external dependencies, developers struggle to constantly track and check for corresponding security vulnerabilities that affect their project dependencies. To help mitigate this issue, Dependabot has been created, a bot that issues pu...
Article
Full-text available
A decade after its first release, the Go programming language has become a major programming language in the development landscape. While praised for its clean syntax and C-like performance, Go also contains a strong static type-system that prevents arbitrary type casting and arbitrary memory access, making the language type-safe by design. However...
Article
Full-text available
Nowadays, Human Activity Recognition (HAR) systems, which use wearables and smart systems, are a part of our daily life. Despite the abundance of literature in the area, little is known about the impact of muscle fatigue on these systems’ performance. In this work, we use the biceps concentration curls exercise as an example of a HAR activity to ob...
Conference Paper
Full-text available
Software ecosystems play an important role in modern software development, providing an open platform of reusable packages that speed up and facilitate development tasks. However, this level of code reusability supported by software ecosystems also makes the discovery of security vulnerabilities much more difficult, as software systems depend on an...
Preprint
Full-text available
Chatbots are envisioned to dramatically change the future of Software Engineering, allowing practitioners to chat and inquire about their software projects and interact with different services using natural language. At the heart of every chatbot is a Natural Language Understanding (NLU) component that enables the chatbot to understand natural lang...
Preprint
Dependency management in modern software development poses many challenges for developers who wish to stay up to date with the latest features and fixes whilst ensuring backwards compatibility. Project maintainers have opted for varied, and sometimes conflicting, approaches for maintaining their dependencies. Opting for unsuitable approaches can in...
Preprint
Software vulnerabilities have a large negative impact on the software systems that we depend on daily. Reports on software vulnerabilities always paint a grim picture, with some reports showing that 83% of organizations depend on vulnerable software. However, our experience leads us to believe that, in the grand scheme of things, these software vul...
Preprint
Full-text available
A decade after its first release, the Go programming language has become a major programming language in the development landscape. While praised for its clean syntax and C-like performance, Go also contains a strong static type-system that prevents arbitrary type casting and arbitrary memory access, making the language type-safe by design. However...
Conference Paper
Full-text available
Chatbots are becoming increasingly popular due to their benefits in saving costs, time, and effort. This is due to the fact that they allow users to communicate and control different services easily through natural language. Chatbot development requires special expertise (e.g., machine learning and conversation design) that differs from the developm...
Article
Full-text available
Despite huge software engineering efforts and programming language support, resource and memory leaks are still a troublesome issue, even in memory-managed languages such as Java. Understanding the properties of leak-inducing defects, how the leaks manifest, and how they are repaired is an essential prerequisite for designing better approaches for...
Conference Paper
Full-text available
Domain Specific Languages (DSLs) have proven useful in the domain of data science, as witnessed by the popularity of SQL. However, implementing and maintaining a DSL incurs a significant effort, which limits their utility in the context of fast-changing data science frameworks and libraries. We propose an approach and a Python-based library/tool NLDSL w...
Conference Paper
Full-text available
Monitoring software performance evolution is a daunting and challenging task. This paper proposes a lightweight visualization technique that contrasts source code variation with the memory consumption and execution time of a particular benchmark. The visualization fully integrates with the commit graph as common in many software repository managers...
Thesis
Full-text available
Software systems are an integral part of modern society. As we continue to harness software automation in all aspects of our daily lives, the runtime performance of these systems becomes increasingly important. When everything seems just a click away, performance issues that compromise the responsiveness of a system can lead to severe financial and...
Article
Full-text available
Microbenchmarking frameworks, such as Java's Microbenchmark Harness (JMH), allow developers to write fine-grained performance test suites at the method or statement level. However, due to the complexities of the Java Virtual Machine, developers often struggle with writing expressive JMH benchmarks which accurately represent the performance of such...
Preprint
Full-text available
Data analysis is at the core of scientific studies, a prominent task that researchers and practitioners typically undertake by programming their own set of automated scripts. While there is no shortage of tools and languages available for designing data analysis pipelines, users spend substantial effort in learning the specifics of such languages/t...
Preprint
Full-text available
Despite huge software engineering efforts and programming language support, resource and memory leaks are still a troublesome issue, even in memory-managed languages such as Java. Understanding the properties of leak-inducing defects, how the leaks manifest, and how they are repaired is an essential prerequisite for designing better approaches for...
Conference Paper
Full-text available
Networks play an increasingly important role in modelling real-world systems due to their utility in representing complex connections. For predictive analyses, the engineering of node features in such networks is of fundamental importance to machine learning applications, where the lack of external information often introduces the need for features...
Conference Paper
Full-text available
Despite many software engineering efforts and programming language support, resource and memory leaks remain a troublesome issue in managed languages such as Java. Understanding the properties of leak-related issues, such as their type distribution, how they are found, and which defects induce them is an essential prerequisite for designing better...
Conference Paper
Full-text available
Networks play an increasingly important role in modelling real-world systems due to their utility in representing complex connections. For predictive analyses, the engineering of node features in such networks is of fundamental importance to machine learning applications, where the lack of external information often introduces the need for features...
Presentation
Full-text available
CGO Slides of CollectionSwitch: A Framework for Efficient and Dynamic Collection Selection
Conference Paper
Full-text available
Selecting collection data structures for a given application is a crucial aspect of the software development. Inefficient usage of collections has been credited as a major cause of performance bloat in applications written in Java, C++ and C#. Furthermore, a single implementation might not be optimal throughout the entire program execution. This de...
Conference Paper
Full-text available
Selecting collection data structures for a given application is a crucial aspect of the software development. Inefficient usage of collections has been credited as a major cause of performance bloat in applications written in Java, C++ and C#. Furthermore, a single implementation might not be optimal throughout the entire program execution. This de...
Conference Paper
Full-text available
Collection data structures have a major impact on the performance of applications, especially in languages such as Java, C#, or C++. This requires a developer to select an appropriate collection from a large set of possibilities, including different abstractions (e.g. list, map, set, queue), and multiple implementations. In Java, the default implem...
Conference Paper
Configuration options are widely used for customizing the behavior and initial settings of software applications, server processes, and operating systems. Their distinctive property is that each option is processed, defined, and described in different parts of a software project, namely in code, in configuration file, and in documentation. This cre...
Conference Paper
Full-text available
Dynamic memory allocation is one of the most ubiquitous operations in computer programs. In order to design effective memory allocation algorithms, it is a major requirement to understand the most frequent memory allocation patterns present in modern applications. In this paper, we present an experimental characterization study of dynamic memory al...
Conference Paper
Full-text available
Dynamic memory allocation is one of the most ubiquitous operations in computer programs. In order to design effective memory allocation algorithms, it is a major requirement to understand the most frequent memory allocation patterns present in modern applications. In this paper, we present an experimental characterization study of dynamic memory al...
Data
Full-text available
Software systems running continuously for a long time often confront software aging, which is the phenomenon of progressive degradation of execution environment caused by latent software faults. Removal of such faults in software development process is a crucial issue for system reliability. A known major obstacle is typically the large latency to...
Conference Paper
Full-text available
Software systems running continuously for a long time often confront software aging, which is the phenomenon of progressive degradation of execution environment caused by latent software faults. Removal of such faults in software development process is a crucial issue for system reliability. A known major obstacle is typically the large latency to...
Conference Paper
Full-text available
In this paper, we present an experimental study to compare six user-level memory allocators. In addition, we compare the experimental results with the asymptotic analyses of the evaluated algorithms. The experimental results show that parallelism negatively affects the investigated allocators. The theoretical analysis of the execution time demonstr...

Project (1)
Project
This project investigates and proposes a set of methods for autotuning of programs through a workload-sensitive selection of data structures. More specifically, we are focused on providing an automated framework that combines the application workload and machine learning techniques to select the proper collection for time and memory improvement of an application.