[Show abstract][Hide abstract]ABSTRACT: Manual crash reproduction is a labor-intensive and time-consuming task. Therefore, several solutions have been proposed in literature for automatic crash reproduction, including generating unit tests via symbolic execution and mutation analysis. However, various limitations adversely affect the capabilities of the existing solutions in covering a wider range of crashes because generating helpful tests that trigger specific execution paths is particularly challenging. In this paper, we propose a new solution for automatic crash reproduction based on evolutionary unit test generation techniques. The proposed solution exploits crash data from collected stack traces to guide search-based algorithms toward the generation of unit test cases that can reproduce the original crashes. Results from our preliminary study on real crashes from Apache Commons libraries show that our solution can successfully reproduce crashes which are not reproducible by two other state-of-art techniques.
[Show abstract][Hide abstract]ABSTRACT: We report on an industrial case study on developing the embedded software for a smart meter using the C programming language and domain-specific extensions of C such as components, physical units, state machines, registers and interrupts. We find that the extensions help significantly with managing the complexity of the software. They improve testability mainly by supporting hardware-independent testing, as illustrated by low integration efforts. The extensions also do not incur significant overhead regarding memory consumption and performance. Our case study relies on mbeddr, an extensible version of C. mbeddr, in turn, builds on the MPS language workbench which supports modular extension of languages and IDEs.
No preview · Article · Oct 2015 · ACM SIGPLAN Notices
[Show abstract][Hide abstract]ABSTRACT: Evolving a large scale, highly variable system is a challenging task. For such a system, evolution operations often require to update consistently both their implementation and its feature model. In this context, the evolution of the feature model closely follows the evolution of the system. The purpose of this work is to show that fine-grained feature changes can be used to guide the evolution of the highly variable system. In this paper, we present an approach to obtain fine-grained feature model changes with its supporting tool “FMDiff”. Our approach is tailored for Kconfig-based variability models and proposes a feature change classification detailing changes in features, their attributes and attribute values. We apply our approach to the Linux kernel feature model, extracting feature changes occurring in sixteen official releases. In contrast to previous studies, we found that feature modifications are responsible for most of the changes. Then, by taking advantage of the multi-platform aspect of the Linux kernel, we observe the effects of a feature change across the different architecture-specific feature models of the kernel. We found that between 10 and 50 % of feature changes impact all the architecture-specific feature models, offering a new perspective on studies of the evolution of the Linux feature model and development practices of its developers.
Preview · Article · May 2015 · Software and Systems Modeling
[Show abstract][Hide abstract]ABSTRACT: This paper reports on a study mining the exception stack traces included in 159,048 issues reported on Android projects hosted in GitHub (482 projects) and Google Code (157 projects). The goal of this study is to investigate whether stack trace information can reveal bug hazards related to exception handling code that may lead to a decrease in application robustness. Overall 6,005 exception stack traces were extracted, and subjected to source code and bytecode analysis. The outcomes of this study include the identification of the following bug hazards: (i) unexpected cross-type exception wrappings (for instance, trying to handle an instance of OutOfMemoryError "hidden" in a checked exception) which can make the exception-related code more complex and negatively impact the application robustness; (ii) undocumented runtime exceptions thrown by both the Android platform and third party libraries; and (iii) undoc- umented checked exceptions thrown by the Android Platform. Such undocumented exceptions make difficult, and most of the times infeasible for the client code to protect against unforeseen situations that may happen while calling third-party code. This study provides further insights on such bug hazards and the robustness threats they impose to Android apps as well as to other systems based on the Java exception model.
[Show abstract][Hide abstract]ABSTRACT: In the pull-based development model, the integrator has the crucial role of managing and integrating contributions. This work focuses on the role of the integrator and investigates working habits and challenges alike. We set up an exploratory qualitative study involving a large-scale survey of 749 integrators, to which we add quantitative data from the integrator's project. Our results provide insights into the factors they consider in their decision making process to accept or reject a contribution. Our key findings are that integrators struggle to maintain the quality of their projects and have difficulties with prioritizing contributions that are to be merged. Our insights have implications for practitioners who wish to use or improve their pull-based development process, as well as for researchers striving to understand the theoretical implications of the pull-based model in software development.
[Show abstract][Hide abstract]ABSTRACT: Teamwork in software engineering is time-consuming and problematic. In this paper, we explore how to better sup-port developers' collaboration in teamwork, focusing on the software implementation phase happening in the integrated development environment (IDE). Conducting a qualitative in-vestigation, we learn that developers' teamwork needs mostly regard coordination, rather than concurrent work on the same (sub)task, and that developers successfully deal with scenar-ios considered problematic in literature, but they have prob-lems dealing with breaking changes made by peers on the same project. We derive implications and recommendations. Based on one of the latter, we analyze the current IDE support for receiving code changes, finding that historical information is neither visible nor easily accessible. Consequently, we de-vise and qualitatively evaluate BELLEVUE, the design of an IDE extension to make received changes always visible and code history accessible in the editor.
[Show abstract][Hide abstract]ABSTRACT: Known security vulnerabilities can be introduced in software systems as a result of being dependent upon third-party components. These documented software weaknesses are 'hiding in plain sight' and represent low hanging fruit for attackers. In this paper we present the Vulnerability Alert Service (VAS), a tool-based process to track known vulnerabilities in software systems throughout their life cycle. We studied its usefulness in the context of external software product quality monitoring provided by the Software Improvement Group, a software advisory company based in Amsterdam, the Netherlands. Besides empirically assessing the usefulness of the VAS, we have also leveraged it to gain insight and report on the prevalence of third-party components with known security vulnerabilities in proprietary applications.
[Show abstract][Hide abstract]ABSTRACT: In this paper we review five years of research in the field of automated crawling and testing of web applications. We describe the open source Crawljax tool, and the various extensions that have been proposed in order to address such issues as cross-browser compatibility testing, web application regression testing, and style sheet usage analysis.Based on that we identify the main challenges and future directions of crawl-based testing of web applications. In particular, we explore ways to reduce the exponential growth of the state space, as well as ways to involve the human tester in the loop, thus reconciling manual exploratory testing and automated test input generation. Finally, we sketch the future of crawl-based testing in the light of upcoming developments, such as the pervasive use of touch devices and mobile computing, and the increasing importance of cyber-security.
No preview · Article · Jan 2015 · Science of Computer Programming
[Show abstract][Hide abstract]ABSTRACT: A medium-sized west-European telecom company experienced a worsening trend in performance, indicating that the organization did not learn from history, in combination with much time and energy spent on preparation and review of project proposals. In order to create more transparency in the supplier proposal process a pilot was started on Functional Size Measurement pricing (FSM-pricing). In this paper, we evaluate the implementation of FSM-pricing in the software engineering domain of the company, as an instrument useful in the context of software man- agement and supplier proposal pricing. We found that a statistical, evidence-based pricing approach for software engineering, as a single instrument (without a connection with expert judgment), can be used in the subject companies to create cost transparency and performance management of software project portfolios.
[Show abstract][Hide abstract]ABSTRACT: Applying encapsulation techniques lead to software systems in which the majority of changes are localized, which reduces maintenance and testing effort. In the evaluation of implemented software architectures, metrics can be used to provide an indication of the degree of encapsulation within a system and to serve as a basis for an informed discussion about how well-suited the system is for expected changes. Current literature shows that over 40 different architecture-level metrics are available to quantify the encapsulation, but empirical validation of these metrics against changes in a system is not available. In this paper we investigate twelve existing architecture metrics for their ability to quantify the encapsulation of an implemented architecture. We correlate the values of the metrics against the ratio of local change over time using the history of ten open-source systems. In the design of our experiment we ensure that the values of the existing metrics are representative for the time period which is analyzed. Our study shows that one of the suitable architecture metrics can be considered a valid indicator for the degree of encapsulation of systems. We discuss the implications of our findings both for the research into architecture-level metrics and for software architecture evaluations in industry.
[Show abstract][Hide abstract]ABSTRACT: For users of software libraries or public programming interfaces (APIs), backward compatibility is a desirable trait. Without compatibility, library users will face increased risk and cost when upgrading their dependencies. In this study, we investigate semantic versioning, a versioning scheme which provides strict rules on major versus minor and patch releases. We analyze seven years of library release history in Maven Central, and contrast version identifiers with actual incompatibilities. We find that around one third of all releases introduce at least one breaking change, and that this figure is the same for minor and major releases, indicating that version numbers do not provide developers with information in stability of interfaces. Additionally, we find that the adherence to semantic versioning principles has only marginally increased over time. We also investigate the use of deprecation tags and find out that methods get deleted without applying deprecated tags, and methods with deprecated tags are never deleted. We conclude the paper by arguing that the adherence to semantic versioning principles should increase because it provides users of an interface with a way to determine the amount of rework that is expected when upgrading to a new version.
[Show abstract][Hide abstract]ABSTRACT: In the past two decades both the industry and the research community have proposed hundreds of metrics to track software projects, evaluate quality or estimate effort. Unfortunately, it is not always clear which metric works best in a particular context. Even worse, for some metrics there is little evidence whether the metric measures the attribute it was designed to measure. In this paper we propose a catalog format for software metrics as a first step towards a consolidated overview of available software metrics. This format is designed to provide an overview of the status of a metric in a glance, while providing enough information to make an informed decision about the use of the metric. We envision this format to be implemented in a (semantic) wiki to ensure that relationships between metrics can be followed with ease.
[Show abstract][Hide abstract]ABSTRACT: What can we learn from historic data that is collected in three software companies that on a daily basis had to cope with highly complex project portfolios? In this paper we analyze a large dataset, containing 352 finalized software engineering projects, with the goal to discover what factors affect software project performance, and what actions can be taken to increase project performance when building a software project portfolio. The software projects were classified in four quadrants of a Cost/Duration matrix: analysis was performed on factors that were strongly related to two of those quadrants, Good Practices and Bad Practices. A ranking was performed on the factors based on statistical significance. The paper results in an inventory of 'what factors should be embraced when building a project portfolio?' (Success Factors), and 'what factors should be avoided when doing so?' (Failure Factors). The major contribution of this paper is that it analyzes characteristics of best performers and worst performers in the dataset of software projects, resulting in 7 Success Factors (a.o. steady heartbeat, a fixed, experienced team, agile (Scrum), and release-based), and 9 Failure Factors (a.o. once-only project, dependencies with other systems, technology driven, and rules-and regulations driven).
[Show abstract][Hide abstract]ABSTRACT: The advent of distributed version control systems has led to the development of a new paradigm for distributed software development; instead of pushing changes to a central repository, developers pull them from other repositories and merge them locally. Various code hosting sites, notably Github, have tapped on the opportunity to facilitate pull-based development by offering workflow support tools, such as code reviewing systems and integrated issue trackers. In this work, we explore how pull-based software development works, first on the GHTorrent corpus and then on a carefully selected sample of 291 projects. We find that the pull request model offers fast turnaround, increased opportunities for community engagement and decreased time to incorporate contributions. We show that a relatively small number of factors affect both the decision to merge a pull request and the time to process it. We also examine the reasons for pull request rejection and find that technical ones are only a small minority.
[Show abstract][Hide abstract]ABSTRACT: Spreadsheets are used extensively in business processes around the world and just like software, spreadsheets are changed throughout their lifetime causing understandability and maintainability issues. This paper adapts known code smells to spreadsheet formulas. To that end we present a list of metrics by which we can detect smelly formulas; a visualization technique to highlight these formulas in spreadsheets and a method to automatically suggest refactorings to resolve smells. We implemented the metrics, visualization and refactoring suggestions techniques in a prototype tool and evaluated our approach in three studies. Firstly, we analyze the EUSES spreadsheet corpus, to study the occurrence of the formula smells. Secondly, we analyze ten real life spreadsheets, and interview the spreadsheet owners about the identified smells. Finally, we generate refactoring suggestions for those ten spreadsheets and study the implications. The results of these evaluations indicate that formula smells are common, that they can reveal real errors and weaknesses in spreadsheet formulas and that in simple cases they can be refactored.
[Show abstract][Hide abstract]ABSTRACT: The Linux kernel feature model has been studied as an example of large scale evolving feature model and yet details of its evolution are not known. We present here a classification of feature changes occurring on the Linux kernel feature model, as well as a tool, FMDiff, designed to automatically extract those changes. With this tool, we obtained the history of more than twenty architecture specific feature models, over ten releases and compared the recovered information with Kconfig file changes. We establish that FMDiff provides a comprehensive view of feature changes and show that the collected data contains promising information regarding the Linux feature model evolution.
[Show abstract][Hide abstract]ABSTRACT: Support for generic programming was added to the Java language in 2004, representing perhaps the most significant change to one of the most widely used programming languages today. Researchers and language designers anticipated this addition would relieve ...
Full-text · Article · Dec 2013 · Empirical Software Engineering