About
183 Publications
93,393 Reads
6,690 Citations
Publications (183)
As the world becomes more technology-driven, organizations are identifying and assessing technological opportunities and threats in order to stay relevant. These routines are what we refer to as technology forecasting. However, there is still a need for a method that fits the pace of technological advancements. To provide better guidance, we invest...
Machine learning (ML) is extensively used in production-ready applications, calling for mature engineering techniques to ensure robust development, deployment and maintenance. Given the potential negative impact ML can have on people, society or the environment, engineering techniques that can ensure robustness against technical...
Context: Abundant literature is available on metrics that can support Agile teams in improving their way of working and performance. However, little knowledge is available on how to operationalise metrics successfully.
Objective: Our objective is to understand the challenges and benefits of selecting measures and introducing these via dashboards in...
Survey shows
• 20-30% do not adopt AutoML at all
• Another 50-60% do not completely adopt AutoML
Interviews indicate
• Improved performance and time savings
• High computational costs hold back adoption
• Desire to understand the system and whether it is correct
• Concerns about data misuse and overfitting
More info: https://se-ml.github.io
Many organizations adopt DevOps practices and tools in order to break down silos within the organization, improve software quality and delivery, and increase customer satisfaction. However, the impact of the individual practices on the performance of the organization is not well known. In this paper, we collect evidence on the effects of DevOps pra...
Artificial intelligence (AI) is a powerful tool to accomplish a great many tasks. This exciting branch of technology is being adopted increasingly across varying sectors, including the insurance domain. With that power arise several complications, one of which is a lack of transparency and explainability of an algorithm for experts and non-experts...
Machine learning (ML) has become essential to a vast range of applications, while ML experts are in short supply. To alleviate this problem, AutoML aims to make ML easier and more efficient to use. Even so, it is not clear to which extent AutoML techniques are actually adopted in an engineering context, nor what facilitates or inhibits adoption. To...
While many organizations embark on agile transformations, they can lack insight into the actual impact of these transformations across organizational layers.
In this paper, we collect new and study existing evidence on the impact of agile transformations on organizational performance across teams, programs and portfolios. We conducted an internatio...
Specific developmental and operational characteristics of machine learning (ML) components, as well as their inherent uncertainty, demand that robust engineering principles be used to ensure their quality. We aim to determine how software systems can be (re-)architected to enable robust integration of ML components. Towards this goal, we conducted a m...
While many defences against adversarial examples have been proposed, finding robust machine learning models is still an open problem. The most compelling defence to date is adversarial training and consists of complementing the training data set with adversarial examples. Yet adversarial training severely impacts training time and depends on findin...
Following the recent surge in adoption of machine learning (ML), the negative impact that improper use of ML can have on users and society is now also widely recognised. To address this issue, policy makers and other stakeholders, such as the European Commission or NIST, have proposed high-level guidelines aiming to promote trustworthy ML (i.e., la...
Sensitivity to adversarial noise hinders deployment of machine learning algorithms in security-critical applications. Although many adversarial defenses have been proposed, robustness to adversarial noise remains an open problem. The most compelling defense, adversarial training, requires a substantial increase in processing time and it has been sh...
Background. The increasing reliance on applications with machine learning (ML) components calls for mature engineering techniques that ensure these are built in a robust and future-proof manner.
Aim. We aim to empirically determine the state of the art in how teams develop, deploy and maintain software with ML components.
Method. We mined both ac...
The adoption of machine learning (ML) components in software systems raises new engineering challenges. In particular, the inherent uncertainty regarding functional suitability and the operation environment makes architecture evaluation and trade-off analysis difficult. We propose a software architecture evaluation method called Modeling Uncertaint...
Mining and storage of data from software repositories is typically done on a per-project basis, where each project uses a unique combination of data schema, extraction tools, and (intermediate) storage infrastructure. We introduce GraphRepo, a tool that enables a unified approach to extract data from Git repositories, store it, and share it across...
Deep neural networks are at the forefront of machine learning research. However, despite achieving impressive performance on complex tasks, they can be very sensitive: Small perturbations of inputs can be sufficient to induce incorrect behavior. Such perturbations, called adversarial examples, are intentionally designed to test the network's sensit...
The increasing reliance on applications with machine learning (ML) components calls for mature engineering techniques that ensure these are built in a robust and future-proof manner. We aim to empirically determine the state of the art in how teams develop, deploy and maintain software with ML components. We mined both academic and grey literature...
This paper analyses security aspects of the ETSI ITS standard for co-operative transport systems, where cars communicate with each other (V2V) and with the roadside (V2I) to improve traffic safety and make more efficient use of the road system. We focus on the initial information exchange between vehicles and the roadside infrastructure responsibl...
Binary relational algebra provides semantic foundations for major areas of computing, such as database design, state-based modeling and functional programming. Remarkably, static checking support in these areas fails to exploit the full semantic content of relations. In particular, properties such as the simplicity or injectivity of relations are n...
Traditionally, software quality is thought to depend on sound software engineering and development methodologies such as structured programming and agile development. However, high quality software depends just as much on high quality collaboration within the team. Since the success rate of software development projects is low (Wateridge, 1995; The...
The maintainability of software is an important cost factor for organizations across all industries, as maintenance makes up approximately 40% to 70% of the total development costs of a software system. Organizations are often stuck in the situation where software maintenance costs dominate IT budgets, leaving no room for enhancement and innovation...
In this paper we present a novel software analytics infrastructure supporting a combination of three requirements to serve software practitioners in utilising data-driven decision making: (1) Real-time insight: streaming software analytics unify static historical and current event-stream data, enabling immediate, near real-time insight int...
Systems that depend on third-party libraries may have to be updated when updates to these libraries become available in order to benefit from new functionality, security patches, bug fixes, or API improvements. However, often such changes come with changes to the existing interfaces of these libraries, possibly causing rework on the client system....
Have you ever felt frustrated working with someone else’s code? Difficult-to-maintain source code is a big problem in software development today, leading to costly delays and defects. Be part of the solution. With this practical book, you’ll learn 10 easy-to-follow guidelines for delivering Java software that’s easy to maintain and adapt. These gui...
Applying agile methods in large and complex organizations requires effective status reporting to external stakeholders in order to facilitate communication, inspire confidence, and allow appropriate project steering. We defined a practical model for measuring and reporting the quality, progress, and predictions of agile development towards stakeho...
Examining the relationship between IT capability and firm performance has received ample attention from researchers. However, a sound understanding of how to define IT capability remains underrepresented in existing literature. We extend and modify an existing definition analysis method derived from the Social Sciences and divide each definition in...
While energy is directly consumed by hardware, it is the software that provides the instructions to do so. Energy profilers provide a means to measure the energy consumption of software, enabling the user to take measures in making software more sustainable. Although each energy profiler has access to roughly the same data, the reported measurement...
In this paper we present a classification model for predicting cost slippage using data mining techniques. The model uses the initial planning of an ICT project in terms of budget and schedule and then predicts the category of cost slippage of the project. Three categories are distinguished where low slippage is considered normal, medium slippage r...
Known security vulnerabilities can be introduced in software systems as a result of being dependent upon third-party components. These documented software weaknesses are 'hiding in plain sight' and represent low hanging fruit for attackers. In this paper we present the Vulnerability Alert Service (VAS), a tool-based process to track known vulnerabi...
You can't control what you can't measure. And you can't decide if you are wandering around in the dark. Risk management in practice requires shedding light on the internals of the software product in order to make informed decisions. Thus, in practice, risk management has to be based on information about artifacts (documentation, code, and executab...
Applying encapsulation techniques leads to software systems in which the majority of changes are localized, which reduces maintenance and testing effort. In the evaluation of implemented software architectures, metrics can be used to provide an indication of the degree of encapsulation within a system and to serve as a basis for an informed discussi...
For users of software libraries or public programming interfaces (APIs), backward compatibility is a desirable trait. Without compatibility, library users will face increased risk and cost when upgrading their dependencies. In this study, we investigate semantic versioning, a versioning scheme which provides strict rules on major versus minor and p...
Though many warn that Agile at large scale is problematic or at least more challenging than in smaller projects, Agile software development seems to be becoming the norm, also for large and complex projects. Based on literature, we constructed a conceptual model of social factors that may influence the success of software development projects...
Automated testing is a basic principle of agile development. Its benefits include early defect detection, defect cause localization and removal of the fear of applying changes to the code. Therefore, maintaining high quality test code is essential. This study introduces a model that assesses test code quality by combining source code metrics that reflect t...
Evaluating the energy efficiency of software applications is currently an ad hoc affair, since no practical and widely applicable model exists for this purpose. The need for such an evaluation model is pressing given the sharp increase in energy demand generated by the ICT industry. In particular, we need to get in control of our software applicat...
In the past two decades both the industry and the research community have proposed hundreds of metrics to track software projects, evaluate quality or estimate effort. Unfortunately, it is not always clear which metric works best in a particular context. Even worse, for some metrics there is little evidence whether the metric measures the attribute...
Although spreadsheets can be seen as a flexible programming environment, they lack some of the concepts of regular programming languages, such as structured data types. This can lead the user to edit the spreadsheet in a wrong way and perhaps cause corrupt or redundant data.
We devised a method for extraction of a relational model from a spreadshee...
The application software design has a major impact on the energy efficiency of a computing system. But research on the subject is still in its infancy. What is the energy efficiency of software? How can it be measured? What are guidelines for the development of energy efficient software? In this paper, we set out to find an answer to these question...
This paper introduces a model for rating software security based on the ISO 25010 standard for software product quality. To rate software security, the authors define eleven system properties, which reflect how a typical software product addresses the confidentiality, integrity, non-repudiation, accountability and authenticity. The paper presents t...
A wide range of software metrics targeting various abstraction levels and quality attributes have been proposed by the research community. For many of these metrics the evaluation consists of verifying the mathematical properties of the metric, investigating the behavior of the metric for a number of open-source systems or comparing the value of th...
Using software metrics to keep track of the progress and quality of products and processes is a common practice in industry. Additionally, designing, validating and improving metrics is an important research area. Although using software metrics can help in reaching goals, the effects of using metrics incorrectly can be devastating. In this tutoria...
Great strides have been made to increase the energy efficiency of hardware, data center facilities, and network infrastructure. These Green IT initiatives aim to reduce energy-loss in the supply chain from energy grid to computing devices. However, the demand for computation comes from software applications that perform business services. Therefore...
Hardware dissipates energy because software tells it to. But attributing hardware energy usage to particular software functions is complicated due to distribution, resource sharing, and layering of software. To enable research on energy usage attribution, we have created the Software Energy Footprint Lab. We explain the experimental setup offered b...
We present the Maven Dependency Dataset (MDD), containing metrics, changes and dependencies of 148,253 jar files. Metrics and changes have been calculated at the level of individual methods, classes and packages of multiple library versions. A complete call graph is also presented which includes call, inheritance, containment and historical relatio...
Best practices in software development state that code that is likely to change should be encapsulated to localize possible modifications. In this paper, we investigate the application and effects of this design principle. We investigate the relationship between the stability, encapsulation and popularity of libraries on a dataset of 148,253 Java l...
Context: Research indicates that software quality, to a large extent, depends on cooperation within software teams [1]. Since software development is a creative process that involves human interaction in the context of a team, it is important to understand the teamwork factors that influence performance. Objective: We present a study design in whic...
According to a survey conducted by Forrester Research in 2008, at least 44% of North American, European, and Asian-Pacific enterprises have adopted SOA (Service-oriented Architecture), and at least 63% would adopt it by the end of 2008. A more recent survey by Forrester also shows that SOA adoption remains strong in 2010. Nevertheless, there are ma...
In order to evaluate large, heterogeneous information systems (i.e., comprising modules developed in diverse programming languages) a method to detect dependencies among these modules is needed. Although there is a variety of methods that can detect dependencies within a single programming language, the available cross-language detection methods us...
This tutorial volume includes revised and extended lecture notes of six long tutorials, five short tutorials, and one peer-reviewed participant contribution held at the 4th International Summer School on Generative and Transformational Techniques in Software Engineering, GTTSE 2011. The school presents the state of the art in software language eng...
Data schema transformations occur in the context of software evolution, refactoring, and cross-paradigm data mappings. When constraints exist on the initial schema, these need to be transformed into constraints on the target schema. Moreover, when high-level data types are refined to lower level structures, additional target schema constraints mus...
We present a pragmatic method for management of risks that arise due to spreadsheet use in large organizations. We combine peer-review, tool-assisted evaluation and other pre-existing approaches into a single organization-wide approach that reduces spreadsheet risk without overly restricting spreadsheet use. The method was developed in the course o...
The goal of this study is to investigate the use of UML and its impact on the change proneness of the implementation code. We look at whether the use of UML class diagrams, as opposed to using no modeling notation, influences code change proneness. Furthermore, using five design metrics we measure the quality of UML class diagrams and explore its c...
The technical debt metaphor is gaining significant traction in the software development community as a way to understand and communicate issues of intrinsic quality, value, and cost. This is a report on a third workshop on managing technical debt, which took place as part of the 34th International Conference on Software Engineering (ICSE 2012). The...
Collecting product metrics during development or maintenance of a software system is an increasingly common practice that provides insight and control over the evolution of a product's quality. An important challenge remains in interpreting the vast amount of data as it is being collected and in transforming it into actionable information. We prese...
Backward compatibility is a major concern for any library developer. In this paper, we evaluate how stable a set of frequently used third-party libraries is in terms of method removals, implementation change, the ratio of change in old methods to change in new ones and the percentage of new methods in each snapshot. We provide a motivating example...
We performed an empirical study of the relation between technical quality of software products and the issue resolution performance of their maintainers. In particular, we tested the hypothesis that ratings for source code maintainability, as employed by the Software Improvement Group (SIG) quality model, are correlated with ratings for issue resol...
Assessment of the economic value of software systems is useful in contexts such as capitalization on the balance sheet and due diligence prior to acquisition. Current accounting practice in determining software value is based on the cost spent in software development. This approach fails to account for the efficiency with which software has been pr...
Software maintenance tasks are mainly related to fixing defects and implementing new features. Higher efficiency in performing such tasks is therefore going to reduce the costs of maintenance. A previous study involving open source systems has shown that higher software maintainability corresponds to faster speed in fixing defects [1]. In this pape...
The way features are implemented in source code has a significant influence on multiple quality aspects of a software system. Hence, it is important to regularly evaluate the quality of feature confinement. Unfortunately, existing approaches to such measurement rely on expert judgement for tracing links between features and source code which hinder...
The article discusses whether software metrics are helpful tools or whether they are a waste of time. For the past 11 years, the Software Improvement Group has advised hundreds of organizations concerning software development and risk management on the basis of software metrics. Metric in a bubble; treating the metric; One-track metric; and metrics...
Four common pitfalls in using software metrics for project management
Software metrics have been proposed as instruments, not only to guide individual developers in their coding tasks, but also to obtain high-level quality indicators for entire software systems. Such system-level indicators are intended to enable meaningful comparisons among systems or to serve as triggers for a deeper analysis. Common methods for agg...
In this paper we introduce the concept of a “dependency profile”, a system level metric aimed at quantifying the level of encapsulation and independence within a system. We verify that these profiles are suitable to be used in an evaluation context by inspecting the dependency profiles for a repository of almost 100 systems. Furthermore we outline...
Various programming languages allow the construction of structure-shy programs. Such programs are defined generically for many different datatypes and only specify specific behavior for a few relevant subtypes. Typical examples are XML query languages that allow selection of subdocuments without exhaustively specifying intermediate element tags. Ot...
We provide an overview of the approach developed by the Software Improvement Group for code analysis and quality consulting focused on software maintainability. The approach uses a standardized measurement model based on the ISO/IEC 9126 definition of maintainability and source code metrics. Procedural standardization in evaluation projects further...
The decomposition of a software system into components is a major decision in any software architecture, having a strong influence on many of its quality aspects. A system's analyzability, in particular, is influenced by its decomposition into components. But into how many components should a system be decomposed to achieve optimal analyzability?...