Barbara Kitchenham
Keele University · School of Computing and Mathematics
Publications: 306
Reads: 418,232
Citations: 48,907
Publications (306)
Context
A few years ago, rapid reviews (RR) were introduced in software engineering (SE) to address the problem that standard systematic reviews take too long and too much effort to be of value to practitioners. Prior to our study, few practice-driven RRs had been reported, and none involved collaboration with practitioners lacking SE research expe...
Context
Software engineering (SE) experiments often have small sample sizes. This can result in data sets with non-normal characteristics, which poses problems as standard parametric meta-analysis, using the standardized mean difference (StdMD) effect size, assumes normally distributed sample data. Small sample sizes and non-normal data set charact...
Context: Several tertiary studies have criticized the reporting of software engineering secondary studies.
Objective: Our objective is to identify guidelines for reporting software engineering (SE) secondary studies which would address problems observed in the reporting of software engineering systematic reviews (SRs).
Method: We review the criti...
Context
Evidence-based practice (EBP) has allowed several disciplines to become more mature by emphasizing the use of evidence from well-designed and well-conducted research in decision-making. Its application in SE, evidence-based software engineering (EBSE), can help to bridge the gap between academia and industry by bringing together academic rig...
Context: Recent papers have proposed the use of grey literature (GL) and multivocal reviews. These papers have raised issues about the practices used for systematic reviews (SRs) in software engineering (SE) and suggested that there should be changes to the current SR guidelines.
Objective: To investigate whether current SR guidelines need to be ch...
Context: Although influential in academia, evidence-based software engineering (EBSE) has had little impact on industry practice. We found that other disciplines have identified lack of training as a significant barrier to Evidence-Based Practice.
Objective: To build and assess an EBSE training proposal suitable for students with more than 3 years of c...
Context: In empirical software engineering, crossover designs are popular for experiments comparing software engineering techniques that must be undertaken by human participants. However, their value depends on the correlation (r) between the outcome measures on the same participants. Software engineering theory emphasizes the importance of individ...
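A standard identity gives one way to see why the correlation matters. This is a sketch under stated assumptions, not taken from the paper: X_A and X_B denote one participant's outcomes under the two techniques, and both are assumed to have the same variance σ² (the symbols are illustrative).

```latex
\[
\operatorname{Var}(X_A - X_B)
  = \sigma_A^2 + \sigma_B^2 - 2 r \sigma_A \sigma_B
  = 2\sigma^2 (1 - r)
  \quad \text{when } \sigma_A = \sigma_B = \sigma .
\]
```

Under these assumptions, the variance of the within-participant difference shrinks as r approaches 1, and at r = 0 it equals the variance of a difference between two independent observations.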
Sharing research data from public funding is an important topic, especially now, during times of global emergencies like the COVID-19 pandemic, when we need policies that enable rapid sharing of research data. Our aim is to discuss and review the revised Draft of the OECD Recommendation Concerning Access to Research Data from Public Funding. The Re...
There are inconsistencies between the formulas for the variance of standardized mean difference (SMD) in the Cochrane Handbook for Systematic Reviews and the variance reported in other sources. Instead of the variance appropriate for the SMD of a crossover experiment, the Cochrane Handbook uses the variance appropriate for a pre-test post-test expe...
Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously r...
Context
Previous studies have raised concerns about the analysis and meta-analysis of crossover experiments and we were aware of several families of experiments that used crossover designs and meta-analysis.
Objective
To identify families of experiments that used meta-analysis, to investigate their methods for effect size construction and aggregat...
Background: Examples of questionable statistical practice, when published in high quality software engineering (SE) journals, may lead to novice researchers adopting incorrect statistical practices.
Objective: Our goal is to highlight issues contributing to poor statistical practice in human-centric SE experiments.
Method: We reviewed the statistical...
Vegas et al. IEEE Trans Softw Eng 42(2):120-135 (2016) raised concerns about the use of AB/BA crossover designs in empirical software engineering studies. This paper addresses issues related to calculating standardized effect sizes and their variances that were not addressed by Vegas et al.'s paper. In a repeated measures design such as an AB/B...
We addressed the issues related to repeated measures experimental design such as an AB/BA crossover design that have been neither discussed nor addressed in the software engineering literature.
Firstly, there are potentially two different standardized mean difference effect sizes that can be calculated, depending on whether the mean difference is s...
Context: Previously, the authors had developed and evaluated a framework to evaluate systematic review (SR) lifecycle tools. Goal: The goal of this study was to use the experiences of researchers in other domains to further evaluate and refine the evaluation framework. Method: The authors investigated the opinions of researchers with experience of...
Context: There have been many changes in statistical theory in the past 30 years, including increased evidence that non-robust methods may fail to detect important results. The statistical advice available to software engineering researchers needs to be updated to address these issues.
Objective: This paper aims both to explain the new results in t...
Researchers have identified problems with the validity of software engineering research findings. In particular, it is often impossible to reproduce data analyses, due to lack of raw data, or sufficient summary statistics, or undefined analysis procedures. The aim of this paper is to raise awareness of the problems caused by unreproducible research...
This keynote discusses the need for more robust statistical methods. For visualizing data I suggest using Kernel density plots rather than box plots. For parametric analysis, I propose more robust measures of central location such as trimmed means, which can support reliable tests of the differences between the central location of two or more sampl...
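As an illustration of the trimmed mean mentioned above, here is a minimal Python sketch. The function name `trimmed_mean`, the 20% trimming proportion, and the sample data are all assumptions made for this example; none of them come from the keynote.

```python
from statistics import mean


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of sorted values.

    The default of 0.2 (20% from each tail) is an assumption for this sketch.
    """
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(proportion * len(ordered))  # observations cut from each tail
    return mean(ordered[k:len(ordered) - k])


# Hypothetical sample with one extreme value (95).
sample = [12, 14, 15, 15, 16, 17, 18, 19, 21, 95]

print(mean(sample))          # ordinary mean, pulled upward by the outlier
print(trimmed_mean(sample))  # 20% trimmed mean, computed from the middle six values
```

With this hypothetical sample, the ordinary mean is dominated by the single extreme value, while the trimmed mean stays close to the bulk of the data.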
Background: A number of software tools are being developed to support systematic reviewers within the software engineering domain. However, at present, we are not sure which aspects of the review process can most usefully be supported by such tools or what characteristics of the tools are most important to reviewers. Aim: The aim of the study is to...
Context: Cloud computing is a new computing technology that provides services to consumers and businesses. Due to the increasing use of these services, the quality of service (QoS) of cloud computing has become an important and essential issue since there are many open challenges which need to be addressed related to trust in cloud services. Many r...
Background: The labour intensive and error prone nature of the systematic review process has led to the development and use of a range of tools to provide automated support.
Aim: The aim of this research is to evaluate a set of candidate tools that provide support for the overall systematic review process.
Method: A feature analysis is performed to co...
Context
Many researchers adopting systematic reviews (SRs) have also published papers discussing problems with the SR methodology and suggestions for improving it. Since guidelines for SRs in software engineering (SE) were last updated in 2007, we believe it is time to investigate whether the guidelines need to be amended in the light of recent res...
Context: For the outcomes of systematic literature reviews to be of use for practitioners, we need to develop models for addressing the needs of Knowledge Translation (KT). Aim: To identify some of the key issues that need to be addressed by a KT process for software engineering (SE) and possible routes for achieving these. Method: We have examined...
Context: Due to the lack of suitably skilled participants, software engineering experiments often lack the statistical power needed to detect the levels of effect that may be encountered. Aim: To investigate whether this can be remedied by running an experiment across multiple sites, organised as a single study rather than as a set of replications....
Context: Several textbooks and papers published between 2000 and 2002 have attempted to introduce experimental design and statistical methods to software engineers undertaking empirical studies. Objective: This paper investigates whether there has been an increase in the quality of human-centric experimental and quasi-experimental journal papers o...
Software process modeling and simulation have become effective tools for support of software process management and improvement over the past two decades. They have recently been integrated into the Trustworthy Process Management Framework (TPMF) as the infrastructural components to facilitate the delivery of trustworthy software products....
Many research studies report an economy of scale in software development, i.e., an increase in productivity with increasing project size. Several software practitioners seem, on the other hand, to believe in a diseconomy of scale, i.e., a decrease in productivity with increasing project size. In this paper we argue that violations of essential regr...
Empirical studies differ in what they report as the underlying relation between project size and percent cost overrun. As a consequence, the studies also differ in their project management recommendations. We show that studies with a project size measure based on the actual cost systematically report an increase in percent cost overrun with increas...
In 2004 Kitchenham et al. first proposed the idea of evidence-based software engineering (EBSE). EBSE requires a systematic and unbiased method of aggregating empirical studies and has encouraged software engineering researchers to undertake systematic literature reviews (SLRs) of Software Engineering topics and research questions. As software engin...
Context: During systematic literature reviews it is necessary to assess the quality of empirical papers. Current guidelines suggest that two researchers should independently apply a quality checklist and any disagreements must be resolved. However, there is little empirical evidence concerning the effectiveness of these guidelines. Aims: This paper inve...
Context: There has been an increase in research into global software development (GSD) and in the number of systematic literature reviews (SLRs) addressing this topic. Objective: The aim of this research is to catalogue GSD SLRs in order to identify the topics covered, the active researchers, the publication vehicles, and to assess the quality of t...
Context: We have been undertaking a series of case studies to investigate the value of mapping (scoping) studies in software engineering. Our previous studies have assessed these using the subjective opinions of researchers. Objective: In order to provide a more objective assessment of value, for this study, we used the results of a systematic mapp...
We report the results of a literature survey that reviewed bidding processes and portfolio management processes in a variety of different industries including the make-to-order engineering industry, the pharmaceutical industry, the finance industry, insurance industry, the oil industry, shipbuilding, and construction. As a starting point we reviewe...
This report discusses the issues involved in evaluating a software bidding model. We found it difficult to assess the appropriateness of any model evaluation activities without a baseline or standard against which to assess them. This paper describes our attempt to construct such a baseline. We reviewed evaluation criteria used to assess cost model...
Context: We are strong advocates of evidence-based software engineering (EBSE) in general and systematic literature reviews (SLRs) in particular. We believe it is essential that the SLR methodology is used constructively to support software engineering research. Objective: This study aims to assess the value of mapping studies which are a form of S...
The Unified Modeling Language (UML) was created on the basis of expert opinion and has now become accepted as the ‘standard’ object-oriented modelling notation. Our objectives were to determine how widely the notations of the UML, and their usefulness, have been studied empirically, and to identify which aspects of it have been studied in most deta...
Previous work has demonstrated that the use of structured abstracts can lead to greater completeness and clarity of information, making it easier for researchers to extract information about a study. In academic year 2007/08, Durham University’s Computer Science Department revised the format of the project report that final year students were requi...
Background: One of the anticipated benefits of systematic literature reviews (SLRs) is that they can be conducted in an auditable way to produce repeatable results. Aim: This study aims to identify under what conditions SLRs are likely to be stable, with respect to the primary studies selected, when used in software engineering. The conditions we i...
Systematic literature reviews (SLRs) are a major tool for supporting evidence-based software engineering. Adapting the procedures involved in such a review to meet the needs of software engineering and its literature remains an ongoing process. As part of this process of refinement, we undertook two case studies which aimed 1) to compare the use of...
Context: The authors wanted to assess whether the quality of published human-centric software engineering experiments was improving. This required a reliable means of assessing the quality of such experiments. Aims: The aims of the study were to confirm the usability of a quality evaluation checklist, determine how many reviewers were needed per pa...
BACKGROUND – the systematic review is becoming a more commonly employed research instrument in empirical software engineering. Before undue reliance is placed on the outcomes of such reviews it would seem useful to consider the robustness of the approach in this particular research context. OBJECTIVE – the aim of this study is to assess the reliabi...
Context: Software has been developed since the 1960s but the success rate of software development projects is still low. During the development of software, the probability of success is affected by various practices or aspects. To date, it is not clear which of these aspects are more important in influencing project outcome.
Objective: In this res...
Context: In a previous study, we reported on a systematic literature review (SLR), based on a manual search of 13 journals and conferences undertaken in the period 1st January 2004 to 30th June 2007. Objective: The aim of this on-going research is to provide an annotated catalogue of SLRs available to software engineering researchers and practitioners....
Software Process Simulation Modeling (SPSM) research has increased in the past two decades, especially since the first ProSim Workshop held in 1998. Our research aims to systematically assess how SPSM has evolved during the past 10 years, in particular whether the purposes for SPSM, the simulation paradigms, tools, research topics, and...
In this paper, we argue that metrics validation approaches used in software engineering are problematic. In particular, theoretical validation is not rigorous enough to detect invalid metrics and empirical validation has no mechanism for making any final decisions about the validity of metrics. In addition, we argue that cohesion and information-th...
We identify three challenges related to the provenance of the material we use in teaching software engineering. We suggest that these challenges can be addressed by using evidence-based software engineering (EBSE) and its primary tool of systematic literature reviews (SLRs). This paper aims to assess the educational and scientific value of undergra...
Context: The technology acceptance model (TAM) was proposed in 1989 as a means of predicting technology usage. However, it is usually validated by using a measure of behavioural intention to use (BI) rather than actual usage. Objective: This review examines the evidence that the TAM predicts actual usage using both subjective and objective measures of a...
Background: Many papers are published on the topic of software metrics but it is difficult to assess the current status of metrics research. Aim: This paper aims to identify trends in influential software metrics papers and assess the possibility of using secondary studies to integrate research results. Method: Search facilities in the SCOPUS tool were use...
Background: We are strong advocates of evidence-based software engineering (EBSE) in general and systematic literature reviews (SLRs) in particular. We believe it is essential that the SLR methodology is being used constructively to support software engineering research. Aim: This study aims to assess the value of mapping studies which are a form o...
This study aims to compare the use of targeted manual searches with broad automated searches, and to assess the importance of grey literature and breadth of search on the outcomes of SLRs. We used a participant-observer multi-case embedded case study. Our two cases were a tertiary study of systematic literature reviews published between January 200...
The authors are seeking the best ways to employ evidence-based practices in software engineering research and practice so that the outcomes can inform practice and policy-making. The objective of this study is to investigate how other academic disciplines use evidence-based practices in order to help assess the guidelines that the authors have deve...
The method of principal components is applied to data of the external measurements of the Humpback whale (Megaptera novaeangliae). The first two components are found to summarize the main features of the data and are used to compare differences between the sexes and of the time and locality of capture. Suggestions are made as to how a reduction in...
Without careful methodological guidance, case studies in software engineering are difficult to plan, design and execute. While there are a number of broad guidelines for case study research, there are none that specifically address the needs of a software engineer undertaking multiple case studies in an industrial setting. Through a synthesis of ex...
Background: Many cost estimation papers are based on finding a "new" estimation method, trying out the method on one or two past datasets and "proving" that the new method is better than linear regression. Aim: This paper aims to explain why this approach to model comparison is often invalid and to suggest that the PROMISE repository may be making...
Background: In 2004 the concept of evidence-based software engineering (EBSE) was introduced at the ICSE04 conference. Aims: This study assesses the impact of systematic literature reviews (SLRs) which are the recommended EBSE method for aggregating evidence. Method: We used the standard systematic literature review method employing a manual search of 10 j...
Software process simulation modeling (SPSM) research has increased in the past two decades. However, most of these models are quantitative and require detailed understanding and accurate measurement. Continuing our previous studies in qualitative modeling of software processes, this paper aims to investigate the structure equivalenc...
For other domains that have adopted the evidence-based paradigm, the impact has included research outcomes having greater influence in terms of informing and influencing practitioners and policy-makers. We examine how evidence-based practices are being adapted for use in software engineering and discuss how decision-making in our own discipline can...
Background: One aspect of undertaking a systematic literature review is to perform a quality evaluation of primary studies. Most quality checklists adopted from medicine, psychology and social studies assume that the experimental unit in an experiment is a human being. However, in empirical studies in software engineering the experimental unit may...
Background: A recent set of guidelines for software engineering systematic literature reviews (SLRs) includes a list of quality criteria obtained from the literature. The guidelines suggest that the list can be used to construct a tailored set of questions to evaluate the quality of primary studies. Aim: This paper aims to evaluate whether the list...
Software Process Simulation (SPS) research has increased since 1998 when the first ProSim Workshop was held. This paper aims to reveal how SPS has evolved during the past 10 years based on the preliminary results from the systematic literature review of SPS publications from 1998 to 2007. Trends over the period showed that interest in continuous mo...
When conducting a systematic literature review, researchers usually determine the relevance of primary studies on the basis of the title and abstract. However, experience indicates that the abstracts for many software engineering papers are of too poor a quality to be used for this purpose. A solution adopted in other domains is to employ structure...
Data-intensive analogy has been proposed as a means of software cost estimation as an alternative to other data-intensive methods such as linear regression. Unfortunately, there are drawbacks to the method. There is no mechanism to assess its appropriateness for a specific dataset. In addition, heuristic algorithms are necessary to select...
We have recently undertaken a large-scale Systematic Literature Review (SLR) of a research question concerning the Technology Acceptance Model (TAM). At the end of the study, we observed some anomalies during the analysis of the extracted data. In our attempts to identify the cause of the anomalies, we found a number of mistakes that had been made...
Software process simulation modeling (SPSM) has become an increasingly active research area since its introduction in the late 1980s. Particularly during the last ten years the related research community and the number of publications have been growing. The objective of this research is to provide insights about the evolution of SPSM research durin...
We report our experiences with adapting the systematic review procedures to help consolidate software engineering knowledge, seen as a step towards being able to employ evidence-based practices to assist with decision making in IT. We describe the different studies performed, and illustrate our procedures through a fuller description of one of our...
Software process modeling has become an essential technique for managing software development processes. However, purely quantitative process modeling requires a detailed understanding and accurate measurement of the software process, which relies on reliable and precise historical data. This paper presents a semi-quantitative process modeling approach to...
We developed a novel method called Analogy-X to provide statistical inference procedures for analogy-based software effort estimation. Analogy-X is a method to statistically evaluate the relationship between useful project features and target features such as effort to be estimated, which ensures the dataset used is relevant to the prediction prob...
Evaluating the reliability of maturity level (ML) ratings is crucial for providing confidence in the results of software process assessments. This study investigates the dimensions underlying the maturity construct in the Capability Maturity Model (CMM) ...
Attempts to perform systematic literature reviews have identified a problem with the quality of software engineering abstracts for papers describing empirical studies. Structured abstracts have been found useful for improving the quality of abstracts in many other disciplines. However, there have been no studies of the value of structured abstracts...
Scenario-based methods for evaluating software architecture require a large number of stakeholders to be collocated for evaluation meetings. Collocating stakeholders is often an expensive exercise. To reduce expense, we have proposed a framework for supporting software architecture evaluation process using groupware systems. This paper present...
Software cost estimation using analogy is an important area in software engineering research. Previous research has demonstrated that analogy is a viable alternative to other conventional estimation methods in terms of predictive accuracy. One of the important research areas for analogy is how to determine suitable project feature weights. This can...
Software process simulation modeling (SPSM) research has increased since the first ProSim workshop was held in 1998 and Kellner, Madachy and Raffo (KMR) discussed the "why, what and how" of process simulation. This paper aims to assess how SPSM has evolved during the past 10 years, in particular whether the reasons for SPSM, the simulation paradigms, to...
Although surveys are an extremely common research method, survey-based research is not an easy option. In this chapter, we use examples of three software engineering surveys to illustrate the advantages and pitfalls of using surveys. We discuss the six most important stages in survey-based research: setting the survey’s objectives; selecting the mos...