Arie van Deursen

Delft University of Technology, Delft, South Holland, Netherlands

Publications (239) · 50.9 Total impact

  • Nicolas Dintzner · Arie van Deursen · Martin Pinzger
    ABSTRACT: Evolving a large scale, highly variable system is a challenging task. For such a system, evolution operations often require to update consistently both their implementation and its feature model. In this context, the evolution of the feature model closely follows the evolution of the system. The purpose of this work is to show that fine-grained feature changes can be used to guide the evolution of the highly variable system. In this paper, we present an approach to obtain fine-grained feature model changes with its supporting tool “FMDiff”. Our approach is tailored for Kconfig-based variability models and proposes a feature change classification detailing changes in features, their attributes and attribute values. We apply our approach to the Linux kernel feature model, extracting feature changes occurring in sixteen official releases. In contrast to previous studies, we found that feature modifications are responsible for most of the changes. Then, by taking advantage of the multi-platform aspect of the Linux kernel, we observe the effects of a feature change across the different architecture-specific feature models of the kernel. We found that between 10 and 50 % of feature changes impact all the architecture-specific feature models, offering a new perspective on studies of the evolution of the Linux feature model and development practices of its developers.
    Software and Systems Modeling 05/2015; DOI:10.1007/s10270-015-0472-2 · 1.41 Impact Factor
  • ABSTRACT: This paper reports on a study mining the exception stack traces included in 159,048 issues reported on Android projects hosted in GitHub (482 projects) and Google Code (157 projects). The goal of this study is to investigate whether stack trace information can reveal bug hazards related to exception handling code that may lead to a decrease in application robustness. Overall 6,005 exception stack traces were extracted, and subjected to source code and bytecode analysis. The outcomes of this study include the identification of the following bug hazards: (i) unexpected cross-type exception wrappings (for instance, trying to handle an instance of OutOfMemoryError "hidden" in a checked exception) which can make the exception-related code more complex and negatively impact the application robustness; (ii) undocumented runtime exceptions thrown by both the Android platform and third party libraries; and (iii) undocumented checked exceptions thrown by the Android Platform. Such undocumented exceptions make difficult, and most of the times infeasible for the client code to protect against unforeseen situations that may happen while calling third-party code. This study provides further insights on such bug hazards and the robustness threats they impose to Android apps as well as to other systems based on the Java exception model.
    Proceedings of the 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories (MSR), Florence, Italy; 05/2015
  • ABSTRACT: In the pull-based development model, the integrator has the crucial role of managing and integrating contributions. This work focuses on the role of the integrator and investigates working habits and challenges alike. We set up an exploratory qualitative study involving a large-scale survey of 749 integrators, to which we add quantitative data from the integrator's project. Our results provide insights into the factors they consider in their decision making process to accept or reject a contribution. Our key findings are that integrators struggle to maintain the quality of their projects and have difficulties with prioritizing contributions that are to be merged. Our insights have implications for practitioners who wish to use or improve their pull-based development process, as well as for researchers striving to understand the theoretical implications of the pull-based model in software development.
    International Conference Software Engineering; 05/2015
  • Anja Guzzi · Alberto Bacchelli · Yann Riche · Arie van Deursen
    ABSTRACT: Teamwork in software engineering is time-consuming and problematic. In this paper, we explore how to better support developers' collaboration in teamwork, focusing on the software implementation phase happening in the integrated development environment (IDE). Conducting a qualitative investigation, we learn that developers' teamwork needs mostly regard coordination, rather than concurrent work on the same (sub)task, and that developers successfully deal with scenarios considered problematic in literature, but they have problems dealing with breaking changes made by peers on the same project. We derive implications and recommendations. Based on one of the latter, we analyze the current IDE support for receiving code changes, finding that historical information is neither visible nor easily accessible. Consequently, we devise and qualitatively evaluate BELLEVUE, the design of an IDE extension to make received changes always visible and code history accessible in the editor.
    CSCW 2015: 18th ACM conference on Computer-Supported Cooperative Work and Social Computing, Vancouver, BC, Canada; 03/2015
  • M Cadariu · E Bouwers · J Visser · A Deursen
    Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on; 03/2015
  • Arie van Deursen · Ali Mesbah · Alex Nederlof
    ABSTRACT: In this paper we review five years of research in the field of automated crawling and testing of web applications. We describe the open source Crawljax tool, and the various extensions that have been proposed in order to address such issues as cross-browser compatibility testing, web application regression testing, and style sheet usage analysis. Based on that we identify the main challenges and future directions of crawl-based testing of web applications. In particular, we explore ways to reduce the exponential growth of the state space, as well as ways to involve the human tester in the loop, thus reconciling manual exploratory testing and automated test input generation. Finally, we sketch the future of crawl-based testing in the light of upcoming developments, such as the pervasive use of touch devices and mobile computing, and the increasing importance of cyber-security.
    Science of Computer Programming 01/2015; 97. DOI:10.1016/j.scico.2014.09.005 · 0.72 Impact Factor
  • Hennie Huijgens · Georgios Gousios · Arie Deursen
    ABSTRACT: A medium-sized west-European telecom company experienced a worsening trend in performance, indicating that the organization did not learn from history, in combination with much time and energy spent on preparation and review of project proposals. In order to create more transparency in the supplier proposal process a pilot was started on Functional Size Measurement pricing (FSM-pricing). In this paper, we evaluate the implementation of FSM-pricing in the software engineering domain of the company, as an instrument useful in the context of software management and supplier proposal pricing. We found that a statistical, evidence-based pricing approach for software engineering, as a single instrument (without a connection with expert judgment), can be used in the subject companies to create cost transparency and performance management of software project portfolios.
    Proceedings of the 9th International Symposium on Empirical Software Engineering and Measurement, Beijing, China; 01/2015
  • Eric Bouwers · Arie Deursen · Joost Visser
    ABSTRACT: Applying encapsulation techniques lead to software systems in which the majority of changes are localized, which reduces maintenance and testing effort. In the evaluation of implemented software architectures, metrics can be used to provide an indication of the degree of encapsulation within a system and to serve as a basis for an informed discussion about how well-suited the system is for expected changes. Current literature shows that over 40 different architecture-level metrics are available to quantify the encapsulation, but empirical validation of these metrics against changes in a system is not available. In this paper we investigate twelve existing architecture metrics for their ability to quantify the encapsulation of an implemented architecture. We correlate the values of the metrics against the ratio of local change over time using the history of ten open-source systems. In the design of our experiment we ensure that the values of the existing metrics are representative for the time period which is analyzed. Our study shows that one of the suitable architecture metrics can be considered a valid indicator for the degree of encapsulation of systems. We discuss the implications of our findings both for the research into architecture-level metrics and for software architecture evaluations in industry.
    30th IEEE International Conference on Software Maintenance and Evolution, Victoria, BC, Canada, September 29 - October 3, 2014; 12/2014
  • Michael W. Godfrey · Arie van Deursen
    Empirical Software Engineering 10/2014; 19(5):1259-1260. DOI:10.1007/s10664-014-9329-5 · 2.16 Impact Factor
  • Eric Bouwers · Arie van Deursen · Joost Visser
    ABSTRACT: In the past two decades both the industry and the research community have proposed hundreds of metrics to track software projects, evaluate quality or estimate effort. Unfortunately, it is not always clear which metric works best in a particular context. Even worse, for some metrics there is little evidence whether the metric measures the attribute it was designed to measure. In this paper we propose a catalog format for software metrics as a first step towards a consolidated overview of available software metrics. This format is designed to provide an overview of the status of a metric in a glance, while providing enough information to make an informed decision about the use of the metric. We envision this format to be implemented in a (semantic) wiki to ensure that relationships between metrics can be followed with ease.
  • Hennie Huijgens · Rini van Solingen · Arie van Deursen
    ABSTRACT: What can we learn from historic data that is collected in three software companies that on a daily basis had to cope with highly complex project portfolios? In this paper we analyze a large dataset, containing 352 finalized software engineering projects, with the goal to discover what factors affect software project performance, and what actions can be taken to increase project performance when building a software project portfolio. The software projects were classified in four quadrants of a Cost/Duration matrix: analysis was performed on factors that were strongly related to two of those quadrants, Good Practices and Bad Practices. A ranking was performed on the factors based on statistical significance. The paper results in an inventory of 'what factors should be embraced when building a project portfolio?' (Success Factors), and 'what factors should be avoided when doing so?' (Failure Factors). The major contribution of this paper is that it analyzes characteristics of best performers and worst performers in the dataset of software projects, resulting in 7 Success Factors (a.o. steady heartbeat, a fixed, experienced team, agile (Scrum), and release-based), and 9 Failure Factors (a.o. once-only project, dependencies with other systems, technology driven, and rules-and regulations driven).
  • Georgios Gousios · Martin Pinzger · Arie van Deursen
    ABSTRACT: The advent of distributed version control systems has led to the development of a new paradigm for distributed software development; instead of pushing changes to a central repository, developers pull them from other repositories and merge them locally. Various code hosting sites, notably Github, have tapped on the opportunity to facilitate pull-based development by offering workflow support tools, such as code reviewing systems and integrated issue trackers. In this work, we explore how pull-based software development works, first on the GHTorrent corpus and then on a carefully selected sample of 291 projects. We find that the pull request model offers fast turnaround, increased opportunities for community engagement and decreased time to incorporate contributions. We show that a relatively small number of factors affect both the decision to merge a pull request and the time to process it. We also examine the reasons for pull request rejection and find that technical ones are only a small minority.
    International Conference Software Engineering, Hyderabad; 05/2014
  • Alex Nederlof · Ali Mesbah · Arie van Deursen
    ABSTRACT: Today’s web applications increasingly rely on client-side code execution. HTML is not just created on the server, but manipulated extensively within the browser through JavaScript code. In this paper, we seek to understand the software engineering implications of this. We look at deviations from many known best practices in such areas of performance, accessibility, and correct structuring of HTML documents. Furthermore, we assess to what extent such deviations manifest themselves through client-side code manipulation only. To answer these questions, we conducted a large scale experiment, involving automated client-enabled crawling of over 4000 web applications, resulting in over 100,000,000 pages analyzed, and close to 1,000,000 unique client-side user interface states. Our findings show that the majority of sites contain a substantial number of problems, making sites unnecessarily slow, inaccessible for the visually impaired, and with layout that is unpredictable due to errors in the dynamically modified DOM trees.
  • Felienne Hermans · Martin Pinzger · Arie van Deursen
    ABSTRACT: Spreadsheets are used extensively in business processes around the world and just like software, spreadsheets are changed throughout their lifetime causing understandability and maintainability issues. This paper adapts known code smells to spreadsheet formulas. To that end we present a list of metrics by which we can detect smelly formulas; a visualization technique to highlight these formulas in spreadsheets and a method to automatically suggest refactorings to resolve smells. We implemented the metrics, visualization and refactoring suggestions techniques in a prototype tool and evaluated our approach in three studies. Firstly, we analyze the EUSES spreadsheet corpus, to study the occurrence of the formula smells. Secondly, we analyze ten real life spreadsheets, and interview the spreadsheet owners about the identified smells. Finally, we generate refactoring suggestions for those ten spreadsheets and study the implications. The results of these evaluations indicate that formula smells are common, that they can reveal real errors and weaknesses in spreadsheet formulas and that in simple cases they can be refactored.
    Empirical Software Engineering 04/2014; 20(2). DOI:10.1007/s10664-013-9296-2 · 2.16 Impact Factor
  • Nicolas Dintzner · Arie Van Deursen · Martin Pinzger
    ABSTRACT: The Linux kernel feature model has been studied as an example of large scale evolving feature model and yet details of its evolution are not known. We present here a classification of feature changes occurring on the Linux kernel feature model, as well as a tool, FMDiff, designed to automatically extract those changes. With this tool, we obtained the history of more than twenty architecture specific feature models, over ten releases and compared the recovered information with Kconfig file changes. We establish that FMDiff provides a comprehensive view of feature changes and show that the collected data contains promising information regarding the Linux feature model evolution.
    Proceedings of the Eighth International Workshop on Variability Modelling of Software-Intensive Systems; 01/2014
  • Steven Raemaekers · Arie Deursen · Joost Visser
    Source Code Analysis and Manipulation (SCAM), 2014 IEEE 14th International Working Conference on; 01/2014
  • Tao Xie · Thomas Zimmermann · Arie van Deursen
    ABSTRACT: Support for generic programming was added to the Java language in 2004, representing perhaps the most significant change to one of the most widely used programming languages today. Researchers and language designers anticipated this addition would relieve ...
    Empirical Software Engineering 12/2013; 18(6). DOI:10.1007/s10664-013-9273-9 · 2.16 Impact Factor
  • Khalid Adam Nasr · Hans-Gerhard Gross · Arie van Deursen
    ABSTRACT: In this paper, we present two descriptive case studies covering the re-engineering and further evolution of adopting service-oriented architecture (SOA) in the industry. The first case was carried out for a company in the transport sector with an application portfolio of over 700 systems. The second case study was conducted for an organization in the public sector. The goal of both case studies is to identify the possible benefits and drawbacks of realizing SOA in large organizations in order to obtain a better perspective on the real, rather than the assumed, benefits of SOA in practice. We describe how the two cases were developed and carried out, and discuss the experiences gained and lessons learned from adopting SOA in the two organizations. Based on these findings, we propose several directions for further research. Copyright © 2011 John Wiley & Sons, Ltd.
    Journal of Software Maintenance and Evolution Research and Practice 06/2013; 25(6). DOI:10.1002/smr.540 · 1.32 Impact Factor
  • Kevin Dullemond · Ben van Gameren · M.-A. Storey · Arie van Deursen
    ABSTRACT: Distributed teams face the challenge of staying connected. How do team members stay connected when they no longer see each other on a daily basis? What should be done when there is no coffee corner to share your latest exploits? In this paper we evaluate a microblogging system which makes this possible in a distributed setting. The system, WeHomer, enables the sharing of information and corresponding emotions in a fully distributed organization. We analyzed the content of over a year of usage data by 19 team members in a structured fashion, performed 5 semi-structured interviews and report our findings in this paper. We draw conclusions about the topics shared, the impact on software teams and the impact of distribution and team composition. Main findings include an increase in team-connectedness and easier access to information that is traditionally harder to consistently acquire.
    Mining Software Repositories (MSR), 2013 10th IEEE Working Conference on; 05/2013
  • F. Hermans · B. Sedee · M. Pinzger · A. van Deursen
    ABSTRACT: Spreadsheets are widely used in industry: it is estimated that end-user programmers outnumber programmers by a factor 5. However, spreadsheets are error-prone, numerous companies have lost money because of spreadsheet errors. One of the causes for spreadsheet problems is the prevalence of copy-pasting. In this paper, we study this cloning in spreadsheets. Based on existing text-based clone detection algorithms, we have developed an algorithm to detect data clones in spreadsheets: formulas whose values are copied as plain text in a different location. To evaluate the usefulness of the proposed approach, we conducted two evaluations. A quantitative evaluation in which we analyzed the EUSES corpus and a qualitative evaluation consisting of two case studies. The results of the evaluation clearly indicate that 1) data clones are common, 2) data clones pose threats to spreadsheet quality and 3) our approach supports users in finding and resolving data clones.
    Software Engineering (ICSE), 2013 35th International Conference on; 01/2013

Publication Stats

5k Citations
50.90 Total Impact Points


  • 2003–2015
    • Delft University of Technology
      • Faculty of Electrical Engineering, Mathematics and Computer Sciences (EEMCS)
      Delft, South Holland, Netherlands
  • 1997–2008
    • Technische Universiteit Eindhoven
      • Department of Electrical Engineering
      Eindhoven, North Brabant, Netherlands
    • University of Amsterdam
      Amsterdam, North Holland, Netherlands
  • 2005
    • Durham University
      Durham, England, United Kingdom
  • 2000–2005
    • Centrum Wiskunde & Informatica
      Amsterdam, North Holland, Netherlands
  • 2004
    • College of Western Idaho
      Nampa, Idaho, United States