Jan Bosch

Chalmers University of Technology

Professor of SW Engineering

About

557
Publications
247,903
Reads
16,968
Citations
Additional affiliations
March 2011 - present
Chalmers University of Technology
Position
  • Professor (Full)
March 2011 - present
University of Gothenburg
Position
  • Professor of Software Engineering
April 2007 - March 2011
Intuit
Position
  • VP Engineering Process

Publications (557)
Preprint
Full-text available
Randomised field experiments, such as A/B testing, have long been the gold standard for evaluating software changes. In the automotive domain, running randomised field experiments is not always desired, possible, or even ethical. In the face of such limitations, we develop a framework BOAT (Bayesian causal modelling for ObservAtional Testing), uti...
Preprint
Deep learning (DL) based software systems are difficult to develop and maintain in industrial settings due to several challenges. Data management is one of the most prominent challenges which complicates DL in industrial deployments. DL models are data-hungry and require high-quality data. Therefore, the volume, variety, velocity, and quality of da...
Preprint
Continuous deployment has become a widely used practice in web-based software applications. Deploying a new software version to production is a seamless automated process executed thousands of times per day. Continuous deployment reduces the time between a code commit and that commit becoming active in production. While continuous deployment promises man...
Article
Deep learning (DL) based software systems are difficult to develop and maintain in industrial settings due to several challenges. Data management is one of the most prominent challenges which complicates DL in industrial deployments. DL models are data-hungry and require high-quality data. Therefore, the volume, variety, velocity, and quality of da...
Preprint
Full-text available
Fast and reliable wireless communication has become a critical demand in human life. When natural disasters strike, providing ubiquitous connectivity becomes challenging by using traditional wireless networks. In this context, unmanned aerial vehicle (UAV) based aerial networks offer a promising alternative for fast, flexible, and reliable wireless...
Article
Artificial intelligence (AI) and the use of machine learning (ML) and deep learning (DL) technologies are becoming increasingly popular in companies. These technologies enable companies to leverage big quantities of data to improve system performance and accelerate business development. However, despite the appeal of ML/DL, there is a lack of syste...
Preprint
Full-text available
Fast and reliable connectivity is essential to enhancing situational awareness and operational efficiency for public safety mission-critical (MC) users. In emergency or disaster circumstances, where existing cellular network coverage and capacity may not be available to meet MC communication demands, deployable-network-based solutions such as cells...
Chapter
Federated Learning, as a distributed learning technique, has emerged with the improvement of the performance of IoT and edge devices. The emergence of this learning method alters the situation in which data must be centrally uploaded to the cloud for processing and maximizes the utilization of edge devices’ computing and storage capabilities. The l...
Article
Full-text available
In recent years, the application of artificial intelligence (AI) has become an integral part of a wide range of areas, including software engineering. By analyzing various data sources generated in software engineering, it can provide valuable insights into customer behavior, product performance, bugs and errors, and many more. In practice, however...
Article
Full-text available
Context: When developing software, it is vitally important to keep the level of technical debt down since, based on several studies, it has been well established that technical debt can lower the development productivity, decrease the developers' morale and compromise the overall quality of the software, among others. However, even if researchers...
Article
Full-text available
Continuous experimentation (CE) refers to a set of practices used by software companies to rapidly assess the usage, value, and performance of deployed software using data collected from customers and systems in the field using an experimental methodology. However, despite its increasing popularity in developing web‐facing applications, CE has not...
Preprint
Full-text available
Randomised field experiments, such as A/B testing, have long been the gold standard for evaluating the value that new software brings to customers. However, running randomised field experiments is not always desired, possible or even ethical in the development of automotive embedded software. In the face of such restrictions, we propose the use of...
Preprint
Full-text available
Data collected from in-service products play an important role in enabling software-intensive embedded systems suppliers to embrace data-driven practices. Data can be used in many different ways such as to continuously learn and improve the product, enhance post-deployment services, reduce operational cost or create a better user experience. While...
Conference Paper
Full-text available
Continuous Deployment (CD) advocates for quick and frequent deployments of software to production. The goal is to bring new functionality as early as possible to users while learning from their usage. CD has emerged from web-based applications where it has been gaining traction over the past years. While CD is appealing for many software developmen...
Preprint
Full-text available
Continuous Deployment is the practice of deploying software more frequently to customers and learning from their usage. The aim is to introduce new functionality and features in an additive way to customers as soon as possible. While Continuous Deployment is becoming popular among web and cloud-based software development organizations, the adoption of co...
Chapter
Data pipelines play an important role throughout the data management process whether these are used for data analytics or machine learning. Data-driven organizations can make use of data pipelines for producing good quality data applications. Moreover, data pipelines ensure end-to-end velocity by automating the processes involved in extracting, tra...
Preprint
Full-text available
A/B testing is gaining attention in the automotive sector as a promising tool to measure causal effects from software changes. Different from the web-facing businesses, where A/B testing has been well-established, the automotive domain often suffers from limited eligible users to participate in online experiments. To address this shortcoming, we pr...
Preprint
Full-text available
A/B experimentation is a known technique for data-driven product development and has demonstrated its value in web-facing businesses. With the digitalisation of the automotive industry, the focus in the industry is shifting towards software. For automotive embedded software to continuously improve, A/B experimentation is considered an important tec...
Article
Frequentist statistical methods, such as hypothesis testing, are standard practices in studies that provide benchmark comparisons. Unfortunately, these methods have often been misused, e.g., without testing for their statistical test assumptions or without controlling for family-wise errors in multiple group comparisons, among several other problem...
Preprint
Full-text available
Benchmark suites, i.e. a collection of benchmark functions, are widely used in the comparison of black-box optimization algorithms. Over the years, research has identified many desired qualities for benchmark suites, such as diverse topology, different difficulties, scalability, representativeness of real-world problems among others. However, while...
Preprint
Full-text available
With the development of and increasing interest in the ML/DL fields, companies are eager to utilize these methods to improve their service quality and user experience. Federated Learning has been introduced as an efficient model training approach to distribute and speed up time-consuming model training and preserve user data privacy. However, common...
Preprint
Full-text available
Background: Data errors are a common challenge in machine learning (ML) projects and generally cause significant performance degradation in ML-enabled software systems. To ensure early detection of erroneous data and avoid training ML models using bad data, research and industrial practice suggest incorporating a data validation process and tool in...
Article
Full-text available
We need to build software rapidly and with high quality. These goals seem to be contradictory, but actually, implementing automation in build and deployment procedures as well as quality analysis can improve both the development pace and the resulting quality at the same time. Rapid Continuous Software Engineering describes novel software enginee...
Article
Full-text available
With digitalization and with technologies such as software, data, and artificial intelligence, companies in the embedded systems domain are experiencing a rapid transformation of their conventional businesses. While the physical products and associated product sales provide the core revenue, these are increasingly being complemented with service of...
Chapter
Full-text available
With the increasing attention on Machine Learning applications, more and more companies are involved in implementing AI components into their software products in order to improve the service quality. With the rapid growth of distributed edge devices, Federated Learning has been introduced as a distributed learning technique, which enables model tr...
Chapter
A significant amount of research effort is put into studying machine learning (ML) and deep learning (DL) technologies. Real-world ML applications help companies to improve products and automate tasks such as classification, image recognition and automation. However, a traditional “fixed” approach where the system is frozen before deployment leads...
Chapter
Full-text available
Companies across domains are rapidly engaged in shifting computational power and intelligence from centralized cloud to fully decentralized edges to maximize value delivery, strengthen security and reduce latency. However, most companies have only recently started pursuing this opportunity and are therefore at the early stage of the cloud-to-edge t...
Preprint
Full-text available
When developing software, it is vitally important to keep the level of technical debt down since it is well established from several studies that technical debt can, e.g., lower the development productivity, decrease the developers' morale, and compromise the overall quality of the software. However, even if researchers and practitioners working in...
Chapter
Full-text available
Artificial intelligence (AI) and machine learning (ML) are increasingly broadly adopted in industry. However, based on well over a dozen case studies, we have learned that deploying industry-strength, production quality ML models in systems proves to be challenging. Companies experience challenges related to data quality, design methods and process...
Article
Software ecosystems are considered the natural evolution of software product lines. A software ecosystem provides a (software) product within a particular business and organizational context that supports the exchange of activities and services within a domain. However, the increasing degree of autonomy demanded by software ecosystems is elevating...
Chapter
Context: Pivoting has been a common strategic tactic of startups, which shift their course of action to adapt to environmental changes affecting the company. Among many factors influencing the decision to pivot or persevere, technical characteristics of the product and its evolution are possible triggering factors. We have learned that technical debt is an inh...
Preprint
Heterogeneous computing is one of the most important computational solutions to meet rapidly increasing demands on system performance. It typically allows the main flow of applications to be executed on a CPU while the most computationally intensive tasks are assigned to one or more accelerators, such as GPUs and FPGAs. We have observed that the re...
Article
Context: Exploratory testing plays an important role in the continuous integration and delivery pipelines of large-scale software systems, but a holistic and structured approach is needed to realize efficient and effective exploratory testing. Objective: This paper seeks to address the need for a structured and reliable approach by providing a tangi...
Conference Paper
Full-text available
Nowadays, machine learning (ML) is an integral component in a wide range of areas, including software analytics (SA) and business intelligence (BI). As a result, the interest in custom ML-based software analytics and business intelligence solutions is rising. In practice, however, such solutions often get stuck in a prototypical stage because setti...
Chapter
Development of machine learning (ML) enabled applications in real-world settings is challenging and requires the consideration of sound software engineering (SE) principles and practices. A large body of knowledge exists on the use of modern approaches to developing traditional software components, but not ML components. Using exploratory case stud...
Chapter
Full-text available
Labeling is a cornerstone of supervised machine learning. However, in industrial applications, data is often not labeled, which complicates using this data for machine learning. Although there are well-established labeling techniques such as crowdsourcing, active learning, and semi-supervised learning, these still do not provide accurate and reliab...
Chapter
Data pipelines involve a complex chain of interconnected activities that starts with a data source and ends in a data sink. Data pipelines are important for data-driven organizations since a data pipeline can process data in multiple formats from distributed data sources with minimal human intervention, accelerate data life cycle activities, and en...
Conference Paper
Process Debt, like Technical Debt, can be the source of short-term benefits but often is harmful in the long term for a software organization. Nonetheless, information about Process Debt is scarce in current literature. We conducted an exploratory study of Process Debt in four international organizations by interviewing 16 practitioners. The findin...
Preprint
Full-text available
Frequentist statistical methods, such as hypothesis testing, are standard practice in papers that provide benchmark comparisons. Unfortunately, these frequentist tools have often been misused, without testing for the statistical test assumptions, without control for family-wise errors in multiple group comparisons, among several other problems. Bay...
Article
Full-text available
Context: Nowadays, the hype around artificial intelligence is at its absolute peak. Large amounts of data are collected every second of the day and a variety of tools exists to enable easy analysis of data. In practice, however, making meaningful use of it is way more challenging. For instance, affected stakeholders often struggle to specify their i...
Conference Paper
System logs perform a critical function in software-intensive systems as logs record the state of the system and significant events in the system at important points in time. Unfortunately, log entries are typically created in an ad-hoc, unstructured and uncoordinated fashion, limiting their usefulness for analytics and machine learning. In a DevOp...
Preprint
Full-text available
The collection of high-quality data provides a key competitive advantage to companies in their decision-making process, in understanding customer behavior, and in enabling the usage and deployment of new technologies based on machine learning. However, the process from collecting the data, to cleaning and processing it to be used by data scientists and a...
Conference Paper
When developing software, it is vitally important to keep the level of technical debt down since it is well established from several studies that technical debt can, e.g., lower the development productivity, decrease the developers' morale, and compromise the overall quality of the software. However, even if researchers and practitioners working in...
Article
Background: Developing and maintaining large scale machine learning (ML) based software systems in an industrial setting is challenging. There are no well-established development guidelines, but the literature contains reports on how companies develop and maintain deployed ML-based software systems. Objective: This study aims to survey the literatu...