Nicolas Harrand's research while affiliated with SCS and other places

Publications (26)

Article
Full-text available
Software bloat is code that is packaged in an application but is actually not necessary to run the application. The presence of software bloat is an issue for security, for performance, and for maintenance. In this paper, we introduce a novel technique for debloating, which we call coverage-based debloating. We implement the technique for one singl...
Article
Full-text available
Previous work has shown that early resolution of issues detected by static code analyzers can prevent major costs later on. However, developers often ignore such issues for two main reasons. First, many issues should be interpreted to determine if they correspond to actual flaws in the program. Second, static analyzers often do not present the issu...
Preprint
Full-text available
Despite its obvious benefits, the increased adoption of package managers to automate the reuse of libraries has opened the door to a new class of hazards: supply chain attacks. By injecting malicious code in one library, an attacker may compromise all instances of all applications that depend on the library. To mitigate the impact of supply chain a...
Article
Hyrum’s law states a common observation in the software industry: “With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody”. Meanwhile, recent research results seem to contradict this observation when they state that “for most APIs, the...
Article
Full-text available
Build automation tools and package managers have a profound influence on software development. They facilitate the reuse of third-party libraries, support a clear separation between the application’s code and its external dependencies, and automate several software development tasks. However, the wide adoption of these tools introduces new challeng...
Preprint
Full-text available
JSON is a popular file and data format that is precisely specified by the IETF in RFC 8259. Yet, this specification implicitly and explicitly leaves room for many design choices when it comes to parsing and generating JSON. This yields the opportunity of diverse behavior among independent implementations of JSON libraries. A thorough analysis of th...
Preprint
Full-text available
Previous work has shown that early resolution of issues detected by static code analyzers can prevent major cost later on. However, developers often ignore such issues for two main reasons. First, many issues should be interpreted to determine if they correspond to actual flaws in the program. Second, static analyzers often do not present the issue...
Preprint
Software bloat is code that is packaged in an application but is actually not used and not necessary to run the application. The presence of bloat is an issue for software security, for performance, and for maintenance. In recent years, several works have proposed techniques to detect and remove software bloat. In this paper, we introduce a novel t...
Preprint
Full-text available
During compilation from Java source code to bytecode, some information is irreversibly lost. In other words, compilation and decompilation of Java code is not symmetric. Consequently, decompilation, which aims at producing source code from bytecode, relies on strategies to reconstruct the information that has been lost. Different Java decompilers u...
Article
Full-text available
During compilation from Java source code to bytecode, some information is irreversibly lost. In other words, compilation and decompilation of Java code is not symmetric. Consequently, decompilation, which aims at producing source code from bytecode, relies on strategies to reconstruct the information that has been lost. Different Java decompilers u...
Preprint
Build automation tools and package managers have a profound influence on software development. They facilitate the reuse of third-party libraries, support a clear separation between the application's code and its external dependencies, and automate several software development tasks. However, the wide adoption of these tools introduces new challeng...
Article
Full-text available
Neutral program variants are alternative implementations of a program, yet equivalent with respect to the test suite. Techniques such as approximate computing or genetic improvement share the intuition that potential for enhancements lies in these acceptable behavioral differences (e.g., enhanced performance or reliability). Yet, the automatic synt...
Conference Paper
Full-text available
During compilation from Java source code to bytecode, some information is irreversibly lost. In other words, compilation and decompilation of Java code is not symmetric. Consequently, the decompilation process, which aims at producing source code from bytecode, must establish some strategies to reconstruct the information that has been lost. Moder...
Preprint
This paper addresses the following question: does a small, essential, core set of API members emerge from the actual usage of the API by client applications? To investigate this question, we study the 99 most popular libraries available in Maven Central and the 865,560 client programs that declare dependencies towards them, summing up to 2.3M depe...
Preprint
Full-text available
During compilation from Java source code to bytecode, some information is irreversibly lost. In other words, compilation and decompilation of Java code is not symmetric. Consequently, the decompilation process, which aims at producing source code from bytecode, must establish some strategies to reconstruct the information that has been lost. Modern...
Preprint
Full-text available
Maven artifacts are immutable: an artifact that is uploaded on Maven Central cannot be removed nor modified. The only way for developers to upgrade their library is to release a new version. Consequently, Maven Central accumulates all the versions of all the libraries that are published there, and applications that declare a dependency towards a li...
Conference Paper
Full-text available
Maven artifacts are immutable: an artifact that is uploaded on Maven Central cannot be removed nor modified. The only way for developers to upgrade their library is to release a new version. Consequently, Maven Central accumulates all the versions of all the libraries that are published there, and applications that declare a dependency towards a li...
Preprint
The Maven Central Repository provides an extraordinary source of data to understand complex architecture and evolution phenomena among Java applications. As of September 6, 2018, this repository includes 2.8M artifacts (compiled piece of code implemented in a JVM-based language), each of which is characterized with metadata such as exact version, d...
Preprint
Neutral program variants are functionally similar to an original program, yet implement slightly different behaviors. Techniques such as approximate computing or genetic improvement share the intuition that potential for enhancements lies in these acceptable behavioral differences (e.g., enhanced performance or reliability). Yet, the automatic synt...
Conference Paper
Full-text available
Automated diversity is a promising means of increasing the security of software systems. However, current automated diversity techniques operate at the bottom of the software stack (operating system and compiler), yielding a limited amount of diversity. We present a novel Model-Driven Engineering approach to the diversification of communicating syst...
Conference Paper
Full-text available
DevOps emphasizes a high degree of automation at all phases of the software development lifecycle. Meanwhile, Genetic Improvement (GI) focuses on the automatic improvement of software artifacts. In this paper, we discuss why we believe that DevOps offers an excellent technical context for easing the adoption of GI techniques by software developers....
Conference Paper
Code{strata} is an interdisciplinary collaboration between art studies researchers (Rennes 2) and computer scientists (INRIA, KTH). It is a sound installation: a computer system unit made of concrete that sits on a wooden desk. The purpose of this project is to question the opacity and simplicity of high-level interfaces used in daily gestures. It...
Article
The Internet of Things (IoT) is a challenging combination of distribution and heterogeneity. A number of software engineering solutions address those challenges in isolation, but few solutions tackle them in combination, which poses a set of concrete challenges. The ThingML (Internet of Things Modeling Language) approach attempts to address those c...
Conference Paper
One of the selling points of Model-Driven Software Engineering (MDSE) is the increase in productivity offered by automatically generating code from models. However, the practical adoption of code generation remains relatively slow and limited to niche applications. Tooling issues are often pointed out but more fundamentally, experience shows that:...

Citations

... Invalid input data may also be caused by disagreements among data sources on the format specification, which lead to different implementations of the specification. For example, JSON libraries implement slightly different definitions of the JSON format [3], [4], different database systems support slightly different SQL formats [5], and various C compilers provide slightly different interpretations of the C language. These problems lead to invalid inputs that cannot be processed by their conforming programs or consumed by end-users (e.g., developers). ...
... Science gateways are domain-based, integrated components that help overcome those problems by providing a configurable interface and by utilizing cutting-edge technologies to save users from low-level technological issues and provide them with an easily customizable graphical interface for their scientific research [11]. Most of them decouple frontend and backend layers via API-based interfaces [12], allowing the gateway communities to focus their effort on designing community-specific Graphical User Interfaces (GUI) [3]. However, developing backend and frontend solutions can be challenging for non-IT experts [6]. ...
... In fact, quite the opposite is true. Pieces of evidence show that the libraries that are unused today in a software system (86% of them) are unlikely to be ever used in the future [49]. Then, the unused functionalities in a software system can threaten its security, slow down its performance, affect its reliability, or increase its maintenance [40]. ...
... Jozef et al. [5] repeated the experiment and confirmed a similar result. Moreover, Nicolas et al. [6,7] showed that there are at least three points, including if statements, at which decompiled source code differs from the original code. ...
... The entire Maven software ecosystem was released as a dataset with higher-level metrics, such as changes and dependencies (Raemaekers et al 2013). The Maven Dependency Graph by Benelallam et al (2019) provides a snapshot of the Maven Central as a graph, modeling all its dependencies. Fine-GRAPE is a dataset of fine-grained API usage across the Maven software ecosystem (Sawant and Bacchelli 2017). ...
... An AST of a given function includes not only all syntactic elements but also the names of the identifiers (e.g., function and variable names) and compile-time constants (e.g., numeric constants and string literals). Android malware are available in Java bytecode, which can be decompiled into ASTs of functions found in a given malware [17]. The decompiled ASTs lack the names of local variables. ...
... While introduced as a speed-up mechanism which does not alter the course of evolution, incremental evaluation shows the progressive concealing of, even large, effects by long chains of computation. The inescapable loss of information (Shannon and Weaver, 1964) in nested expressions is the underlying and unifying cause for several widespread features in computing, such as: in testing (Voas, 1992) (Androutsopoulos et al., 2014), failed disruption propagation (Petke et al., 2021), equivalent mutants (Langdon et al., 2010), neutral networks (Harrand et al., 2019), mutational robustness (Schulte et al., 2014), coincidental correctness (Abou Assi et al., 2019), correctness attraction (Danglot et al., 2018), robustness (Langdon and Petke, 2015) and AntiFragile software (Monperrus, 2017), which currently occupy different research silos (Petke et al., 2021). ...
... as well as the dependencies between them. We retrieve this data file from previous work [27], [28]. We use Gephi to produce a graph from this data, and to manipulate its layout, as illustrated in Figure 4. Finally, we export the resulting graph in PDF, PNG, and SVG formats, before exiting the application. ...
... Artavc et al. [5] use deployment models on multiple cloud environments, which is a promising way to support the smooth transition of software from testing to production environments. Our previous approaches use model-based engineering to engineer the diversity of software deployment within the entire application [41,50,52]. Looking at approaches targeted at particular application domains, Bucchiarone et al. [17] use multi-level modelling to automate the deployment of gaming systems. ...
... In this article, we report on a study that aims at letting laypersons make sense of software presence and activity in their everyday lives, through sound: the invisible complexity of the software activities involved in the idle mode of a personal computer and its shutdown. In previous studies, researchers have used different sonification strategies, e.g., for real-time monitoring of web server activities (Barra et al., 2002), of server logfiles (Hauer and Vogt, 2015), for communicating the network's traffic and monitoring its security (Vickers et al., 2017b; Debashi and Vickers, 2018; Axon et al., 2019), and for monitoring the software trace collected when performing a copy and paste command in a text editor (Thomas et al., 2018). Different aesthetics have been used in these sonification studies, such as MIDI synthesizers, granular synthesis, and samples of a human voice whispering the sentence "copy and paste" manipulated using the Max/MSP's SPAT Operator. ...