Martin John Shepperd, PhD
Brunel University London · Department of Computer Science
About
190 Publications · 98,675 Reads
11,646 Citations
Introduction
I am Professor of Software Technology and Modelling and was Head of Department (2013-16).
My research interests principally focus on empirical software engineering and associated statistical and machine learning methods.
Additional affiliations
September 2005 - present
Brunel University
September 1984 - January 1991
February 1991 - August 2005
Education
September 1987 - February 1991
Publications
Publications (190)
Background. The ability to predict defect-prone software components would be valuable. Consequently, there have been many empirical studies to evaluate the performance of different techniques endeavouring to accomplish this effectively. However no one technique dominates and so designing a reliable defect prediction model remains problematic.
Objec...
In 2014 we published a meta-analysis of software defect prediction studies [1]. This suggested that the most important factor in determining results was Research Group i.e., who conducts the experiment is more important than the classifier algorithms being investigated. A recent re-analysis [2] sought to argue that the effect is less strong than or...
Context
Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results.
Objective
To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems.
Method
A new framework is p...
This paper reports on the use of search techniques to help optimise a case-based reasoning (CBR) system for predicting software project effort. A major problem, common to ML techniques in general, has been dealing with large numbers of case features, some of which can hinder the prediction process. Unfortunately searching for the optimal feature su...
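To make the idea of a wrapper search over feature subsets more concrete, here is a minimal Python sketch. It is not the paper's implementation: the search strategy (greedy forward selection), the 1-nearest-neighbour analogy estimator, the leave-one-out MMRE score and the toy data are all assumptions for illustration.

def estimate_1nn(features, train_rows, train_efforts, query):
    # Predict effort as that of the most similar past project
    # (Euclidean distance over the chosen feature subset).
    best_dist, best_effort = float("inf"), None
    for row, effort in zip(train_rows, train_efforts):
        dist = sum((row[f] - query[f]) ** 2 for f in features) ** 0.5
        if dist < best_dist:
            best_dist, best_effort = dist, effort
    return best_effort

def loo_mmre(features, rows, efforts):
    # Leave-one-out mean magnitude of relative error for a feature subset.
    mres = []
    for i in range(len(rows)):
        pred = estimate_1nn(features, rows[:i] + rows[i + 1:],
                            efforts[:i] + efforts[i + 1:], rows[i])
        mres.append(abs(efforts[i] - pred) / efforts[i])
    return sum(mres) / len(mres)

def forward_select(all_features, rows, efforts):
    # Greedy wrapper search: add whichever feature most reduces LOO MMRE,
    # stopping as soon as no candidate improves the score.
    selected, best_score = [], float("inf")
    while True:
        candidates = [f for f in all_features if f not in selected]
        scored = [(loo_mmre(selected + [f], rows, efforts), f) for f in candidates]
        if not scored:
            break
        score, feature = min(scored)
        if score >= best_score:
            break
        selected, best_score = selected + [feature], score
    return selected, best_score

# Invented toy data: three candidate features per project, plus known effort.
projects = [{0: 10, 1: 3, 2: 100}, {0: 12, 1: 4, 2: 90},
            {0: 40, 1: 2, 2: 300}, {0: 45, 1: 5, 2: 310}]
efforts = [120.0, 140.0, 400.0, 430.0]
print(forward_select([0, 1, 2], projects, efforts))

In practice the feature space is far larger and noisier than this toy example, which is exactly where exhaustive evaluation of subsets becomes infeasible and heuristic search pays off.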
Provides the software estimation research community with a better understanding of the meaning of, and relationship between, two statistics that are often used to assess the accuracy of predictive models: the mean magnitude of relative error (MMRE) and the number of predictions within 25% of the actual, pred(25). It is demonstrated that MMRE and pred(...
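For readers unfamiliar with the two statistics, a minimal Python sketch of how they are typically computed follows; the sample figures are invented purely for illustration.

def mmre(actuals, predictions):
    # Mean magnitude of relative error: mean of |actual - predicted| / actual.
    mres = [abs(a - p) / a for a, p in zip(actuals, predictions)]
    return sum(mres) / len(mres)

def pred(actuals, predictions, level=0.25):
    # Proportion of predictions whose relative error is within `level` of the actual.
    mres = [abs(a - p) / a for a, p in zip(actuals, predictions)]
    return sum(1 for m in mres if m <= level) / len(mres)

# Hypothetical effort figures (e.g. person-hours), purely for illustration.
actual   = [100.0, 250.0, 80.0, 400.0]
estimate = [120.0, 200.0, 85.0, 520.0]
print(mmre(actual, estimate))        # about 0.19
print(pred(actual, estimate, 0.25))  # 0.75, i.e. three of the four estimates within 25%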
Dataset of the paper "A novel aggregation-based dominance for Pareto-based evolutionary algorithms to configure software product lines"
Context
The retraction of research papers, for whatever reason, is a growing phenomenon. However, although retracted paper information is publicly available via publishers, it is somewhat distributed and inconsistent.
Objective
The aim is to assess: (i) the extent and nature of retracted research in Computer Science (CS) (ii) the post-retraction c...
Context
Software engineering researchers have undertaken many experiments investigating the potential of software defect prediction algorithms. Unfortunately, some widely used performance metrics are known to be problematic, most notably F1, yet it remains in widespread use.
Objective
To investigate the potential impact of using F1 on the validit...
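As background to why the choice of metric matters, here is a small Python sketch computing F1 alongside the Matthews correlation coefficient (MCC), which is often suggested as a less problematic alternative. The confusion-matrix counts are invented and the comparison is illustrative, not taken from the paper.

import math

def f1(tp, fp, fn, tn):
    # F1 is the harmonic mean of precision and recall; note that tn never appears.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mcc(tp, fp, fn, tn):
    # The Matthews correlation coefficient uses all four cells of the confusion matrix.
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator

# Invented, imbalanced example: 50 defective modules vs 950 non-defective.
tp, fp, fn, tn = 30, 100, 20, 850
print(f1(tp, fp, fn, tn))   # about 0.33; the 850 true negatives play no part
print(mcc(tp, fp, fn, tn))  # about 0.32; ranges from -1 to +1 and uses all four cells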
Among infant researchers there is growing concern regarding the widespread practice of undertaking studies that have small sample sizes and employ tests with low statistical power (to detect a wide range of possible effects). For many researchers, issues of confidence may be partially resolved by relying on replications. Here, we bring further evid...
Context: There is considerable diversity in the range and design of computational experiments to assess classifiers for software defect prediction. This is particularly so, regarding the choice of classifier performance metrics. Unfortunately some widely used metrics are known to be biased, in particular F1. Objective: We want to understand the ext...
Background
Unsupervised machine learners have been increasingly applied to software defect prediction. It is an approach that may be valuable for software practitioners because it reduces the need for labeled training data.
Objective
Investigate the use and performance of unsupervised learning techniques in software defect prediction.
Method
We c...
Context: Conducting experiments is central to machine learning research, to benchmark, evaluate and compare learning algorithms. Consequently, it is important that we conduct reliable, trustworthy experiments. Objective: We investigate the incidence of errors in a sample of machine learning experiments in the domain of software defect prediction....
I welcome the contribution from Falessi et al. [1] hereafter referred to as F++ , and the ensuing debate. Experimentation is an important tool within empirical software engineering, so how we select participants is clearly a relevant question. Moreover as F++ point out, the question is considerably more nuanced than the simple dichotomy it might ap...
Context: Most research into software defect prediction ignores the differing amount of effort entailed in searching for defects between software components. The result is sub-optimal solutions in terms of allocating testing resources. Recently effort-aware (EA) defect prediction has sought to redress this deficiency. However, there is a gap between...
Context: Software defect prediction (SDP) is an important challenge in the field of software engineering, hence much research work has been conducted, most notably through the use of machine learning algorithms. However, class-imbalance typified by few defective components and many non-defective ones is a common occurrence causing difficulties for...
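One simple way to mitigate such imbalance, random oversampling of the minority (defective) class, is sketched below in Python. It illustrates the general idea only; it is not the technique evaluated in the paper, and the data are invented.

import random

def oversample_minority(rows, labels, seed=0):
    # Duplicate randomly chosen minority-class (label 1) instances until the
    # two classes are the same size, then shuffle the combined training set.
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    indices = majority + minority + extra
    rng.shuffle(indices)
    return [rows[i] for i in indices], [labels[i] for i in indices]

# Invented example: 2 defective modules among 6.
rows = [[10], [12], [40], [45], [30], [22]]
labels = [1, 0, 0, 1, 0, 0]
balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(balanced_labels.count(1), balanced_labels.count(0))  # 4 4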
Context: The role of expert judgement is essential in our quest to improve software project planning and execution. However, its accuracy is dependent on many factors, not least the avoidance of judgement biases, such as the anchoring bias, arising from being influenced by initial information, even when it's misleading or irrelevant. This strong ef...
CONTEXT: There has been a rapid growth in the use of data analytics to underpin evidence-based software engineering. However, the combination of complex techniques, diverse reporting standards and complex underlying phenomena is causing some concern as to the reliability of studies. OBJECTIVE: Our goal is to provide guidance for producers and consu...
Context: There is growing interest in establishing software engineering as an evidence-based discipline. To that end, replication is often used to gain confidence in empirical findings, as opposed to reproduction where the goal is showing the correctness, or validity of the published results.
Objective: To consider what is required for a replicatio...
Context
Concerns have been raised from many quarters regarding the reliability of empirical research findings and this includes software engineering. Replication has been proposed as an important means of increasing confidence.
Objective
We aim to better understand the value of replication studies, the level of confirmation between replication and...
The relative pros and cons of using students or practitioners in experiments in empirical software engineering have been discussed for a long time and continue to be an important topic. Following the recent publication of “Empirical software engineering experts on the use of students and professionals in experiments” by Falessi, Juristo, Wohlin, Tu...
The need to replicate results is a central tenet throughout science, and empirical software engineering is no exception. The reasons are threefold.
Context: We revisit our review of data quality within the context of empirical software engineering eight years on from our PROMISE 2008 article.
Objective: To assess the extent and types of techniques used to manage quality within data sets. We consider this a particularly interesting question in the context of initiatives to promote sharing and s...
Context: Test-driven development (TDD) is an agile practice claimed to improve the quality of a software product, as well as the productivity of its developers. A previous study (i.e., baseline experiment) at the University of Oulu (Finland) compared TDD to a test-last development (TLD) approach through a randomized controlled trial. The results fa...
Context: It is unclear that current approaches to evaluating or comparing competing software cost or effort models give a realistic picture of how they would perform in actual use. Specifically, we're concerned that the usual practice of using all data with some holdout strategy is at variance with the reality of a data set growing as projects comp...
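The contrast can be made concrete with a small Python sketch, which is assumed for illustration and not the paper's procedure: instead of a random holdout over the full data set, each project is predicted only from projects completed before it, so the training set grows just as it would in practice.

def chronological_mmre(projects, fit, predict, min_history=3):
    # `projects` is ordered by completion date; each one is predicted from the
    # history available at that point, mimicking how the data set actually grows.
    errors = []
    for i in range(min_history, len(projects)):
        model = fit(projects[:i])
        actual = projects[i]["effort"]
        errors.append(abs(actual - predict(model, projects[i])) / actual)
    return sum(errors) / len(errors)

# A deliberately naive baseline model just to make the sketch runnable:
# predict the mean effort of all previously completed projects.
def fit(history):
    return sum(p["effort"] for p in history) / len(history)

def predict(model, project):
    return model

projects = [{"effort": e} for e in [100, 120, 90, 150, 200, 180, 220, 160]]
print(chronological_mmre(projects, fit, predict))  # MMRE over the growing-data predictions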
Context: In recent years there has been growing concern about conflicting experimental results in empirical software engineering. This has been paralleled by awareness of how bias can impact research results.
Objective: To explore the practicalities of blind analysis of experimental results to reduce bias.
Method: We apply blind analysis to a real...
A meta-analysis indicated that some areas of computer science research are subject to researcher bias. However, rather than mistrust all scientific research, researchers should examine research to determine its validity.
Computer programming is notoriously difficult to learn. To this end, regular practice in the form of application and reflection is an important enabler of student learning. However, educators often find that first-year B.Sc. students do not readily engage in such activities. Providing each student with a programmable robot, however, could be used to fa...
Context: Software effort estimation is one of the most important activities in the software development process. Unfortunately, estimates are often substantially wrong. Numerous estimation methods have been proposed including Case-based Reasoning (CBR). In order to improve CBR estimation accuracy, many researchers have proposed feature weighting te...
This chapter reviews the background and extent of the software project cost prediction problem. Given the importance of the topic, there has been a great deal of research activity over the past 40 years, most of which has focused on developing formal cost prediction systems. The problem is that presently there is limited evidence to suggest formal...
Background--Self-evidently empirical analyses rely upon the quality of their data. Likewise, replications rely upon accurate reporting and using the same rather than similar versions of datasets. In recent years, there has been much interest in using machine learners to classify software modules into defect-prone and not defect-prone categories. Th...
For this special issue, the guest editors asked a panel of six established experts in software analytics to highlight what they thought were the most important, or overlooked, aspect of this field. They all pleaded for a much broader view of analytics than seen in current practice: software analytics should go beyond developers (Ahmed Hassan) and n...
Trust plays an important role in enabling software development teams to function effectively. Trust between individual team members has been shown to improve the independence of software teams and reduce the amount of project management effort required by those teams. Our main aims are to investigate (i) the impact communication has on trust betwee...
Recently there has been a welcome move to realign software engineering as an evidence-based practice. Many research groups are actively conducting empirical research e.g. to compare different fault prediction models or the value of various architectural patterns. However, this brings some challenges. First, for a particular question, how can we loc...
In recent years there has been a huge growth in using statistical and machine learning methods to find useful prediction systems for software engineers. Of particular interest is predicting project effort and duration and defect behaviour. Unfortunately, though results are often promising, no single technique dominates and there are clearly complex i...
The Special Issue 2012 of the Empirical Software Engineering journal presents articles on repeatable results in software engineering prediction. The first paper discussed, Murphy's "The Difficulties of Building Generic Reliability Models for Software", argues that the computer industry is capable of producing generic predictive models, but only if...
BACKGROUND -- whilst substantial effort has been invested in developing and evaluating knowledge-based techniques for project prediction, little is known about the interaction between them and expert users. OBJECTIVE -- the aim is to explore the interaction of cognitive processes and personality of software project managers undertaking tool-support...
We redesigned our undergraduate computing programmes to address problems of motivation and outdated content. The primary vehicle for the new curriculum was the group project which formed a central spine for the entire degree right from the first year. In terms of results, thus far this programme has been successfully run once. Failures, drop outs a...
The inherent uncertainty of the software development process presents particular challenges for software effort prediction. We need to systematically address missing data values, outlier detection, feature subset selection and the continuous evolution of predictions as the project unfolds, and all of this in the context of data-starvation and noisy...
In this keynote I explore what exactly we mean by data quality, techniques to assess data quality and the very significant challenges that poor data quality can pose. I believe we neglect data quality at our peril since - whether we like it or not - our research results are founded upon data and our assumptions that data quality issues do not co...
Background: There has been much research into building formal (metrics-based) prediction systems with the aim of improving resource estimation and planning of software projects. However the 'objectivity' of such systems is illusory in the sense that many inputs need themselves to be estimated by the software engineer. Method: We review the uptake o...
We consider the roles of algorithm and human and their inter-relationships. As a vehicle for some of our ideas we describe an empirical investigation of software professionals using analogy-based tools and unaided search in order to solve various prediction problems. We conclude that there exists a class of software engineering problems which mi...
BACKGROUND—Predicting defect-prone software components is an economically important activity and so has received a good deal of attention. However, making sense of the many, and sometimes seemingly inconsistent, results is difficult.
OBJECTIVE—We propose and evaluate a general framework for software defect prediction that supports 1) unbiased and...
BACKGROUND: Prediction e.g. of project cost is an important concern in software engineering. PROBLEM: Although many empirical validations of software engineering prediction systems have been published, no one approach dominates and sense-making of conflicting empirical results is proving challenging. METHOD: We propose a new approach to evaluatin...
BACKGROUND: In reality project managers are constrained by the incremental nature of data collection. Specifically, project observations are accumulated one project at a time. Likewise within-project data are accumulated one stage or phase at a time. However, empirical researchers have given limited attention to this perspective. PROBLEM: Consequen...
BACKGROUND – the systematic review is becoming a more commonly employed research instrument in empirical software engineering. Before undue reliance is placed on the outcomes of such reviews it would seem useful to consider the robustness of the approach in this particular research context. OBJECTIVE – the aim of this study is to assess the reliabi...
As a result of restructuring our computing undergraduate programmes, we introduced group projects as an integrating central spine for Levels 1 and 2. The task includes requirements discovery, software development, team working, negotiation and project management. We found careful team formation and fostering of close staff-group relationships to be...
Inheritance is a fundamental feature of the Object-Oriented (OO) paradigm. It is used to promote extensibility and reuse in OO systems. Understanding how systems evolve, and specifically, trends in the movement and re-location of classes in OO hierarchies can help us understand and predict future maintenance effort. In this paper, we explore how an...
Software effort prediction clearly plays a crucial role in software project management. In keeping with more dynamic approaches to software development, it is not sufficient to only predict the whole-project effort at an early stage. Rather, the project manager must also dynamically predict the effort of different stages or activities during the so...
The aim of this research is to investigate the relationship between personality and expert prediction behaviour when estimating software project effort using analogical reasoning. For some years we have been developing tools and techniques for estimation by analogy (EBA). However the variability of results from using these tools and techniques can...
A challenging problem for e-assessment is automatic marking of diagrams. There are a number of difficulties not least that much of the meaning of a diagram resides in the labels and hence label matching is an important process in the e-assessment of diagrams. Previous research has shown that the labels used by the students in the diagrams can be di...
In this short paper we explore a problematic aspect of automated assessment of diagrams. Diagrams have partial and sometimes inconsistent semantics. Typically much of the meaning of a diagram resides in the labels; however, the choice of labelling is largely unrestricted. This means a correct solution may utilise differing yet semantically equivalent...
We consider software project cost estimation from a problem solving perspective. Taking a cognitive psychological approach, we argue that the algorithmic basis for CBR tools is not representative of human problem solving and this mismatch could account for inconsistent results. We describe the fundamentals of problem solving, focusing on experts so...
Missing data is a widespread problem that can affect the ability to use data to construct effective prediction systems. We investigate a common machine learning technique that can tolerate missing values, namely C4.5, to predict cost using six real world software project databases. We analyze the predictive performance after using the k-NN missing...
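For readers unfamiliar with the imputation step, a minimal Python sketch of k-NN imputation follows; the data and parameter choices are invented and the study's own implementation may differ. A missing value is replaced by the mean of that feature among the k most similar cases, where similarity is measured over the features both cases actually have.

def knn_impute(rows, k=3):
    # rows: list of dicts mapping feature name -> value, with None marking a gap.
    completed = [dict(r) for r in rows]
    for i, row in enumerate(rows):
        for feat, value in row.items():
            if value is not None:
                continue
            neighbours = []
            for j, other in enumerate(rows):
                if j == i or other.get(feat) is None:
                    continue
                # Distance over the features observed in both cases.
                shared = [f for f in row
                          if row[f] is not None and other.get(f) is not None]
                if not shared:
                    continue
                dist = sum((row[f] - other[f]) ** 2 for f in shared) ** 0.5
                neighbours.append((dist, other[feat]))
            neighbours.sort()
            donors = [v for _, v in neighbours[:k]]
            if donors:
                completed[i][feat] = sum(donors) / len(donors)
    return completed

# Invented example: the second project is missing its team-size value.
data = [{"size": 10.0, "team": 3.0}, {"size": 12.0, "team": None},
        {"size": 40.0, "team": 8.0}, {"size": 11.0, "team": 4.0}]
print(knn_impute(data, k=2))  # fills the gap with 3.5, the mean of its two nearest donors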
OBJECTIVE - to assess the extent and types of techniques used to manage quality within software engineering data sets. We consider this a particularly interesting question in the context of initiatives to promote sharing and secondary analysis of data sets. METHOD - we perform a systematic review of available empirical software engineering studies....
Previous studies of Object-Oriented (OO) software have reported avoidance of the inheritance mechanism and cast doubt on the wisdom of 'deep' inheritance levels. From an evolutionary perspective, the picture is unclear - we still know relatively little about how, over time, changes tend to be applied by developers. Our conjecture is that an inherit...
Intelligent data analysis techniques are useful for better exploring real-world data sets. However, real-world data sets are almost always accompanied by missing data, which is a major factor affecting data quality. At the same time, good intelligent data exploration requires quality data. Fortunately, Missing Data Imputation Techniques (MDITs) can be...
The availability of multi-organisation data sets has made it possible for individual organisations to build and apply management models, even if they do not have data of their own. In the absence of any data this may be a sensible option, driven by necessity. However, if both cross-company (or global) and within-company (or local) data are availabl...
Data quality is an important aspect of empirical analysis. This paper compares three noise handling methods to assess the benefit of identifying and either filtering or editing problematic instances. We compare a 'do nothing' strategy with (i) filtering, (ii) robust filtering and (iii) filtering followed by polishing. A problem is that it is not pos...
The objective of this paper is to consider research progress in the field of software project economics with a view to identifying important challenges and promising research directions. I argue that this is an important sub-discipline since this will underpin any cost-benefit analysis used to justify the resourcing, or otherwise, of a software pro...
Unfortunately a side effect of the introduction of some e-learning systems into HEIs has been the loss of flexibility and the imposition of new processes that can result in additional clerical and administrative burdens. In this paper we describe the development of a software tool named PreMark that fits between the e-learning tool (in our case Web...
The aim of this investigation is to perform an independent study of the various emerging e-learning standards. This paper presents a summary of these standards in order to make them more accessible and understandable, and provide preliminary evidence as to their utility and adoption by the various UK higher and further education institutions. Recent...
This paper aims to provide a basis for the improvement of software-estimation research through a systematic review of previous work. The review identifies 304 software cost estimation papers in 76 journals and classifies the papers according to research topic, estimation approach, research approach, study context and data set. A Web-based library o...
Citations and related work are crucial in any research to position the work and to build on the work of others. A high citation count is an indication of the influence of specific articles. The importance of citations means that it is interesting to ...
Effort prediction is a very important issue for software project management. Historical project data sets are frequently used to support such prediction. However, these data sets often contain missing data, which makes prediction more difficult. One common practice is to ignore the cases with missing data, but this makes the originally small...