
Ludmila Kuncheva - Bangor University
Publications: 213
Reads: 87,521
Citations: 21,189
Publications (213)
Detection of spatial areas where biodiversity is at risk is of paramount importance for the conservation and monitoring of ecosystems. Large terrestrial mammalian herbivores are keystone species as their activity not only has deep effects on soils, plants, and animals, but also shapes landscapes, as large herbivores act as allogenic ecosystem engin...
The aim of online clustering is to discover a structure in running data. Adding label constraints or pairwise constraints to this has been shown to improve the clustering accuracy. In this study we present an analysis of how different hyperparameters – proportion of constraints, initial number of clusters, and batch window size – affect most recent and...
A keyword search on constrained clustering on Web-of-Science returned just under 3,000 documents. We ran automatic analyses of those, and compiled our own bibliography of 183 papers which we analysed in more detail based on their topic and experimental study, if any. This paper presents general trends of the area and its sub-topics by Pareto analys...
Individual animal recognition and re-identification from still images or video are useful for research in animal behaviour, environment preservation, biology and more. We propose to use Restricted Set Classification (RSC) for classifying multiple animals simultaneously from the same image. Our literature review revealed that this problem has not be...
In classification problems, the purpose of feature selection is to identify a small, highly discriminative subset of the original feature set. In many applications, the dataset may have thousands of features and only a few dozen samples (sometimes termed 'wide'). This study is a cautionary tale demonstrating why feature selection in such cases...
Over the past few decades, the remarkable prediction capabilities of ensemble methods have been used within a wide range of applications. Maximization of base-model ensemble accuracy and diversity are the keys to the heightened performance of these methods. One way to achieve diversity for training the base models is to generate artificial/syntheti...
The present study compares the development experiences and the nature and microstructure of practice activities of super-elite and elite cricket batsmen, domains of expertise previously unexplored simultaneously within a truly elite sample. The study modeled the development of super-elite and elite cricket batsmen using non-linear machine learning...
This article tells the story of our collaboration on prototype selection and what has happened since. In prototype classification, the data live in some metric space ℝ^n equipped with a distance. Depending on which discipline or school (or continent...
Segmentation of the liver from 3D computer tomography (CT) images is one of the most frequently performed operations in medical image analysis. In the past decade, Deep Learning Models (DMs) have offered significant improvements over previous methods for liver segmentation. The success of DMs is usually owed to the user's expertise in deep learning...
Random Balance strategy (RandBal) has been recently proposed for constructing classifier ensembles for imbalanced, two-class data sets. In RandBal, each base classifier is trained with a sample of the data with a random class prevalence, independent of the a priori distribution. Hence, for each sample, one of the classes will be undersampled while...
Research exploring the development of expertise has mostly adopted linear methods to identify precursors of expertise, assessing statistical differences between groups of isolated features (variables) by way of attaching importance; e.g., deliberate practice hours (Ericsson et al., 1993). However, confining the complex nature of expertise developme...
Many methods exist for generating keyframe summaries of videos. However, relatively few methods consider on-line summarisation, where memory constraints mean it is not practical to wait for the full video to be available for processing. We propose a classification (taxonomy) for on-line video summarisation methods based upon their descriptive and d...
This multidisciplinary study used pattern recognition analyses to examine the developmental biographies of 16 Great British Olympic and World Champions ("Super-Elite") and 16 matched international athletes who had not won major medals ("Elite"). Athlete, coach and parent interviews (260 total interview hours) combined in-depth qualitative and quant...
In Statistical Learning, the Vapnik-Chervonenkis (VC) dimension is an important combinatorial property of classifiers. To our knowledge, no theoretical results yet exist for the VC dimension of edited nearest-neighbour (1NN) classifiers with reference set of fixed size. Related theoretical results are scattered in the literature and their implicati...
Large volumes of egocentric video data are being continually collected every day. While the standard video summarisation approach offers all-purpose summaries, here we propose a method for selective video summarisation. The user can query the video with an unlimited vocabulary of terms. The result is a time-tagged summary of keyframes related to th...
Many existing methods for video summarisation are not suitable for on-line applications, where computational and memory constraints mean that feature extraction and frame selection must be simple and efficient. Our proposed method uses RGB moments to represent frames, and a control-chart procedure to identify shots from which keyframes are then sel...
Despite the existence of a large number of approaches for generating summaries from egocentric video, online video summarisation has not been fully explored yet. We present an online video summarisation algorithm to generate keyframe summaries during video capture. Event boundaries are identified using control charts and a keyframe is subsequently...
Visualising the content of a video through a keyframe summary has been a long-standing quest in computer vision. Using real egocentric videos, this paper explores the suitability of seven feature representations of the video frames for the purpose of online summarisation. Computational speed is an essential requirement in this setup. We found that...
This paper draws a parallel between similarity-based categorisation models developed in cognitive psychology and the nearest neighbour classifier (1-NN) in machine learning. Conceived as a result of the historical rivalry between prototype theories (abstraction) and exemplar theories (memorisation), recent models of human categorisation seek a comp...
A natural way of handling imbalanced data is to attempt to equalise the class frequencies and train the classifier of choice on balanced data. For two-class imbalanced problems, the classification success is typically measured by the geometric mean (GM) of the true positive and true negative rates. Here we prove that GM can be improved upon by inst...
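The geometric mean referred to above can be computed directly from the counts of correct and incorrect decisions in each class. The following is a minimal sketch only: the function name `geometric_mean` and the label convention (1 for the positive class, 0 for the negative class) are assumptions made for illustration and do not come from the paper.

```python
import math

def geometric_mean(y_true, y_pred, positive=1):
    """GM = sqrt(TPR * TNR) for a two-class problem.

    The label convention (positive=1) is an illustrative assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against an empty class in the evaluation data.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)

# Toy imbalanced example: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
gm = geometric_mean(y_true, y_pred)

# A classifier that always predicts the majority class scores GM = 0.
gm_trivial = geometric_mean(y_true, [0] * len(y_true))
```

Unlike plain accuracy, GM drops to zero when one class is ignored altogether, which is why it is a common choice for imbalanced two-class problems.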
In the Restricted Set Classification approach (RSC), a set of instances must be labelled simultaneously into a given number of classes, while observing an upper limit on the number of instances from each class. In this study we expand RSC by incorporating prior probabilities for the classes and demonstrate the improvement on the classification accu...
High-dimensional data with very few instances are typical in many application domains. Selecting a highly discriminative subset of the original features is often the main interest of the end user. The widely-used feature selection protocol for such type of data consists of two steps. First, features are selected from the data (possibly through cros...
Large numbers of data streams are today generated in many fields. A key challenge when learning from such streams is the problem of concept drift. Many methods, including many prototype methods, have been proposed in recent years to address this problem. This paper presents a refined taxonomy of instance selection and generation methods for the cla...
Detecting change in multivariate data is a challenging problem, especially when class labels are not available. There is a large body of research on univariate change detection, notably in control charts developed originally for engineering applications. We evaluate univariate change detection approaches (including those in the MOA framework) buil...
A keyframe summary of a video must be concise, comprehensive and diverse. Current video summarisation methods may not be able to enforce diversity of the summary if the events have highly similar visual content, as is the case of egocentric videos. We cast the problem of selecting a keyframe summary as a problem of prototype (instance) selection fo...
Given the great interest in creating keyframe summaries from video, it is surprising how little has been done to formalise their evaluation and comparison. User studies are often carried out to demonstrate that a proposed method generates a more appealing summary than one or two rival methods. But larger comparison studies cannot feasibly use such...
A keyframe summary, or "static storyboard", is a collection of frames from a video designed to summarise its semantic content. Many algorithms have been proposed to extract such summaries automatically. How best to evaluate these outputs is an important but little-discussed question. We review the current methods for matching frames between two sum...
The position of a developing embryo or foetus relative to members of the same or opposite sex can have profound effects on its resulting anatomy, physiology and behavior. Here we treat intrauterine position as a combinatorial problem and determine the theoretical probability of having 0, 1 or 2 adjacent foetuses of the opposite sex for species with...
We consider a problem where a set X of N objects (instances) coming from c classes have to be classified simultaneously. A restriction is imposed on X in that the maximum possible number of objects from each class is known, hence we dubbed the problem who-is-there? We compare three approaches to this problem: (1) independent classification whereby...
This study brings together systematised views of two related areas: data editing for the nearest neighbour classifier and adaptive learning in the presence of concept drift. The growing number of studies in the intersection of these areas warrants a closer look. We revise and update the taxonomies of the two areas proposed in the literature and arg...
We aim to dispel the blind faith in theoretical criteria for optimisation of the edited nearest neighbour classifier and its version called the Voronoi classifier. Three criteria from past and recent literature are considered: two bounds using Vapnik-Chervonenkis (VC) dimension and a probabilistic criterion derived by a Bayesian approach. We demons...
Pattern recognition concerns assigning objects to classes. The objects are described by features (variables or measurements) organized as p-dimensional points in some feature space. A classifier is a formula, an algorithm or a technique that can assign a class label to any given point in the feature space. Pattern recognition comprises supervised l...
A change detection algorithm for multi-dimensional data reduces the input space to a single statistic and compares it with a threshold to signal change. This study investigates the performance of two methods for estimating such a threshold: bootstrapping and control charts. The methods are tested on a challenging dataset of emotional facial express...
Consider a multi-class classification problem. Given is a set of objects, for which it is known that there is at most one object from each class. The problem is to identify the missing classes. We propose to apply the Hungarian assignment algorithm to the logarithms of the estimated posterior probabilities for the given objects. Each object is ther...
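The idea of assigning objects to distinct classes by maximising the sum of log-posteriors can be illustrated on a toy problem. The sketch below is not the authors' implementation: it replaces the Hungarian algorithm with an exhaustive search over assignments, which reaches the same optimum but is only practical for a handful of classes (for larger problems a routine such as `scipy.optimize.linear_sum_assignment` would be the usual choice). The function name and the posterior values are hypothetical.

```python
import math
from itertools import permutations

def assign_distinct_classes(posteriors):
    """Give each of n objects a different class (n <= c) so that the sum of
    log posteriors is maximal. Exhaustive search, for small problems only;
    it stands in for the Hungarian algorithm named in the abstract."""
    n = len(posteriors)
    c = len(posteriors[0])
    best_score, best_labels = -math.inf, None
    for labels in permutations(range(c), n):
        # Small constant avoids log(0) for zero posterior estimates.
        score = sum(math.log(posteriors[i][k] + 1e-12)
                    for i, k in enumerate(labels))
        if score > best_score:
            best_score, best_labels = score, labels
    return list(best_labels)

# Hypothetical posterior estimates: 2 objects, 3 classes.
posteriors = [
    [0.6, 0.3, 0.1],
    [0.7, 0.2, 0.1],
]
labels = assign_distinct_classes(posteriors)
missing = sorted(set(range(3)) - set(labels))
```

Classifying each object independently would label both objects as class 0. The joint assignment instead gives class 1 to the first object and class 0 to the second, leaving class 2 as the missing class.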
This chapter explains the types of classifier outputs such as class labels, ranked class labels, numerical support for the classes, and oracle. It then provides a probabilistic framework for combining label outputs. Despite the condescending names it has received, the Naïve Bayes (NB) combiner has been acclaimed for its rigorous statistical underpi...
This chapter presents methods that use a decision profile (DP(x)) to find the overall support for each class, and subsequently label the input x in the class with the largest support. Next, it explains how to get probability outputs. Calibrating the classifiers' outputs is important, especially for heterogeneous ensembles. An example of such outpu...
This introductory chapter talks about fundamentals of pattern recognition. It explains basic concepts including class, feature and dataset involved in pattern recognition. Depending on the classifier model, the ordering of the categories and the scaling of the values may have a positive, negative, or neutral effect on the relevance of the feature....
The presumption in classifier selection is that there is an oracle that can identify the best expert for a particular input x. This expert's decision is accepted as the decision of the ensemble for x. This chapter addresses the following questions: (1) how do we build the individual classifiers; (2) should they be stable or unstable; (3) homogeneous...
This chapter illustrates the difference between the performance of a bagging ensemble and a random forest ensemble. AdaBoost is the only ensemble method featured among the “Top 10 algorithms in datamining” by Wu et al. In problems with a large number of features, a natural ensemble-building heuristic is to use different feature subsets to train the...
This chapter details some of the most popular base classifier models. Naïve Bayes, also known as “Idiot's Bayes”, is a simple and often surprisingly accurate classification technique. Accurate estimates of the marginal pdfs can be obtained from much smaller amounts of data compared to those for the joint pdf. This makes the Naïve Bayes classifier so attrac...
Common sense suggests that the classifiers in the ensemble should be as accurate as possible and should not make coincident errors. This chapter talks about diversity in classifier ensembles. Ensemble-creating methods which rely on inducing diversity in an intuitive manner have proven their value. The diversity of the classifier outputs is therefor...
It is important that feature selection experiments are “clean,” an issue that has been often overlooked. In this context, “clean” means that the testing data which evaluates the quality of a classifier and a feature subset must not have been seen at any point during the training. This chapter introduces approaches and methods for ensemble feature s...
When classifiers are deployed in real-world applications, it is assumed that the distribution of the incoming data matches the distribution of the data used to train the classifier. This assumption is often incorrect, which necessitates some form of change detection or adaptive classification. While there has been a lot of work on change detection...
Affective gaming (AG) is a cross-disciplinary area drawing upon psychology, physiology, electronic engineering and computer science, among others. This paper presents a historical overview of affective gaming, bringing together psychophysiological system developments, a time-line of video game graphical advancements and industry trends, thereby off...
We propose a probabilistic framework for classifier combination, which gives rigorous optimality conditions (minimum classification error) for four combination methods: majority vote, weighted majority vote, recall combiner and the naive Bayes combiner. The framework is based on two assumptions: class-conditional independence of the classifier outp...
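To make the contrast between majority vote and weighted majority vote concrete, here is a minimal sketch. It assumes independent classifiers with known individual accuracies and uses the textbook log-odds weights log(p / (1 - p)); whether this matches the exact form derived in the paper is an assumption, and the function name and the toy accuracies are hypothetical.

```python
import math
from collections import defaultdict

def weighted_majority_vote(votes, accuracies):
    """Combine class labels using log-odds weights log(p / (1 - p)).

    Assumes every accuracy lies strictly between 0 and 1. The weighting is
    the textbook choice for independent classifiers, used here as an
    illustration rather than as the paper's exact result.
    """
    support = defaultdict(float)
    for label, p in zip(votes, accuracies):
        support[label] += math.log(p / (1.0 - p))
    return max(support, key=support.get)

votes = ["a", "b", "b"]

# One strong classifier outweighs two weak ones.
weighted = weighted_majority_vote(votes, [0.9, 0.6, 0.6])

# With equal accuracies the rule reduces to the plain majority vote.
plain = weighted_majority_vote(votes, [0.7, 0.7, 0.7])
```

In the first call the single vote for "a" carries weight log(9), which exceeds the combined weight 2·log(1.5) of the two votes for "b", so the weighted vote disagrees with the simple majority.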
Chapter 4 discusses the fusion of label outputs. Four types of classifier outputs are listed: class labels (abstract level), ranked class labels (rank level), degree of support for the classes (measurement level) and correct/incorrect decision (oracle level). Combination methods for class label outputs are presented including majority vote, plurali...
This paper describes a general method to address partial occlusions for human detection in still images. The random subspace method (RSM) is chosen for building a classifier ensemble robust against partial occlusions. The component classifiers are chosen on the basis of their individual and combined performance. The main contribution of this work l...
Kappa-error diagrams are used to gain insights about why an ensemble method is better than another on a given data set. A point on the diagram corresponds to a pair of classifiers. The x-axis is the pairwise diversity (kappa), and the y-axis is the averaged individual error. In this study, kappa is calculated from the 2 × 2 correct/wrong contingenc...
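A single point of a kappa-error diagram can be computed from the correct/wrong outputs of two classifiers. The sketch below uses the standard Cohen's kappa on the 2 × 2 correct/wrong table and is meant only as an illustration; the function name and the toy data are hypothetical, and the abstract is truncated before it states how the paper treats this quantity.

```python
def kappa_error_point(correct_1, correct_2):
    """Return (kappa, averaged individual error) for a pair of classifiers.

    correct_k[j] is truthy if classifier k labels object j correctly.
    Kappa is the standard Cohen's kappa on the 2 x 2 correct/wrong table.
    """
    n = len(correct_1)
    pairs = list(zip(correct_1, correct_2))
    a = sum(1 for u, v in pairs if u and v) / n          # both correct
    b = sum(1 for u, v in pairs if u and not v) / n      # only first correct
    c = sum(1 for u, v in pairs if not u and v) / n      # only second correct
    d = sum(1 for u, v in pairs if not u and not v) / n  # both wrong
    p_obs = a + d                                   # observed agreement
    p_exp = (a + b) * (a + c) + (c + d) * (b + d)   # agreement by chance
    kappa = (p_obs - p_exp) / (1 - p_exp) if p_exp < 1 else 1.0
    avg_error = ((c + d) + (b + d)) / 2
    return kappa, avg_error

correct_1 = [1, 1, 1, 0, 1, 0, 1, 1]
correct_2 = [1, 0, 1, 1, 1, 0, 0, 1]
kappa, avg_error = kappa_error_point(correct_1, correct_2)
```

Low kappa means the two classifiers tend to fail on different objects (high diversity), so desirable pairs sit towards the bottom-left corner of the diagram: low kappa and low error.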
Developing accurate, reliable and easy to use diagnostic tests is based upon identifying a small set of highly discriminative biomarkers. This task can be cast as feature selection within a pattern recognition context. Medical data are usually of the "wide" type where the number of features is substantially larger than the number of instances. With...
While there is a lot of research on change detection based on the streaming classification error, finding changes in multidimensional unlabelled streaming data is still a challenge. Here we propose to apply principal component analysis (PCA) to the training data, and mine the stream of selected principal components for change in the distribution. A...
Suppose that the only available information in a multi-class problem consists of expert estimates of the conditional probabilities of occurrence for a set of binary features. The aim is to select a subset of features to be measured in subsequent data collection experiments. In the absence of any information about the dependencies between the features, we ass...
Functional magnetic resonance imaging (fMRI) provides a spatially accurate measure of brain activity. Real-time classification allows the use of fMRI in neurofeedback experiments. With limited labelled data available, a fixed pre-trained classifier may be inaccurate. We propose that streaming fMRI data may be classified using a classifier ensemble...
Functional Magnetic Resonance Imaging serves to identify networks and regions in the brain engaged in various mental activities, represented as a set of voxels in the 3D image. It is important to be able to measure how similar two selected voxel sets are. The major flaw of the currently used correlation-based and overlap-based measures is that the...
The diagnosis of Chronic Obstructive Pulmonary Disease (COPD) is based on symptoms, clinical examination, exposure to risk factors (smoking and certain occupational dusts) and confirmation of lung airflow obstruction on spirometry. However, most people with COPD remain undiagnosed and controversies regarding spirometry persist. Developing accurate and relia...
Event-related potential data can be used to index perceptual and cognitive operations. However, they are typically high-dimensional and noisy. This study examines the original raw data and six feature-extraction methods as a pre-processing step before classification. Four traditionally used feature-extraction methods were considered: principal comp...
We introduce a system called AMBER (Advanced Multimodal Biometric Emotion Recognition), which combines Electroencephalography (EEG) with Electro Dermal Activity (EDA) and pulse sensors to provide low cost, portable real-time emotion recognition. A single-subject pilot experiment was carried out to evaluate the ability of the system to distinguish b...
Change detection in streaming data relies on a fast estimation of the probability that the data in two consecutive windows come from different distributions. Choosing the criterion is one of the multitude of questions that need to be addressed when designing a change detection procedure. This paper gives a log-likelihood justification for two well...
Consider a set-classification task where c objects must be labelled simultaneously in c classes, knowing that there is only one object coming from each class (full-class set). Such problems may occur in automatic attendance registration systems, simultaneous tracking of fast moving objects and more. A Bayes-optimal solution to the full-class set cl...
Cryptanalysis attempts to identify the weaknesses in the algorithms used to encrypt code or the methods used to generate keys. In this study, we use pattern recognition techniques for identification of encryption algorithms for block ciphers. The following block cipher algorithms, DES, IDEA, AES, and RC operating in ECB mode were considered. Eight dif...
We introduce Learn++.MF, an ensemble-of-classifiers based algorithm that employs random subspace selection to address the missing feature problem in supervised classification. Unlike most established approaches, Learn++.MF does not replace missing values with estimated ones, and hence does not need specific assumptions on the underlying data distri...
The advent of real-time fMRI pattern classification opens many avenues for interactive self-regulation where the brain's response is better modelled by multivariate, rather than univariate techniques. Here we test three on-line linear classifiers, applied to a real fMRI dataset, collected as part of an experiment on the cortical response to emotion...
Beyond Quality of Service and billing, one of the most important applications of traffic identification is in the field of network security. Despite their simplicity, current approaches based on port numbers are highly unreliable. This paper proposes an identification approach, based on a cascade of decision trees. The approach uses the sign patter...
Functional magnetic resonance imaging (fMRI) is becoming a forefront brain-computer interface tool. To decipher brain patterns, fast, accurate and reliable classifier methods are needed. The support vector machine (SVM) classifier has been traditionally used. Here we argue that state-of-the-art methods from pattern recognition and machine learning,...
Functional magnetic resonance imaging (fMRI) is a non-invasive and powerful method for analysis of the operational mechanisms of the brain. fMRI classification poses a severe challenge because of the extremely large feature-to-instance ratio. Random Subspace ensembles (RS) have been found to work well for such data. To enable a theoretical analysis...
Although diversity in classifier ensembles is desirable, its relationship with the ensemble accuracy is not straightforward. Here we derive a decomposition of the majority vote error into three terms: average individual accuracy, “good” diversity and “bad” diversity. The good diversity term is taken out of the individual error whereas the bad diver...
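A decomposition of this kind can be checked numerically. The sketch below is one way to realise it for 0/1 loss, assuming two classes and an odd number of classifiers so that ties cannot occur: wrong votes cast when the majority is correct are counted as "good" diversity, and correct votes cast when the majority is wrong are counted as "bad" diversity. These definitions and the function name are assumptions made for illustration, since the abstract is truncated before it states the exact terms.

```python
def majority_vote_decomposition(oracle):
    """Split the majority vote error into individual error, good diversity
    and bad diversity.

    oracle[j][i] is 1 if classifier i is correct on object j, else 0.
    Assumes two classes and an odd number of classifiers (no ties).
    """
    n = len(oracle)
    L = len(oracle[0])
    e_ind = sum(L - sum(row) for row in oracle) / (n * L)
    e_maj = good = bad = 0.0
    for row in oracle:
        v_plus = sum(row)       # correct votes on this object
        v_minus = L - v_plus    # wrong votes on this object
        if v_plus > v_minus:    # majority correct: disagreement is harmless
            good += v_minus / (n * L)
        else:                   # majority wrong: correct votes are wasted
            bad += v_plus / (n * L)
            e_maj += 1 / n
    return e_maj, e_ind, good, bad

# Toy oracle outputs: 4 objects, 3 classifiers.
oracle = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]
e_maj, e_ind, good, bad = majority_vote_decomposition(oracle)
```

With these definitions the identity e_maj = e_ind - good + bad holds object by object, so it also holds for the averages over any data set.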
Classification of brain images obtained through functional magnetic resonance imaging (fMRI) poses a serious challenge to pattern recognition and machine learning due to the extremely large feature-to-instance ratio. This calls for revision and adaptation of the current state-of-the-art classification methods. We investigate the suitability of the...
We consider classification of sequential data in the presence of frequent and abrupt concept changes. The current practice is to use the data after the change to train a new classifier. However, if the window with the new data is too small, the classifier will be undertrained and hence less accurate than the "old" classifier. Here we propose a met...
Classification in changing environments (commonly known as concept drift) requires adaptation of the classifier to accommodate the changes. One approach is to keep a moving window on the streaming data and constantly update the classifier on it. Here we consider an abrupt change scenario where one set of probability distributions of the classes is...
Classical approaches for network traffic classification are based on port analysis and packet inspection. Recent studies indicate that network protocols can be recognised more accurately using the flow statistics of the TCP connection. We propose a classifier selection ensemble for a fast and accurate verification of network protocols. Using the re...
This paper investigates the stability of an automatic system for classifying kerogen material from images of sieved rock samples. The system comprises four stages: image acquisition, background removal, segmentation, and classification of the segmented kerogen pieces as either inertinite or vitrinite. Depending upon a segmentation parameter d, call...
Any change in the classification problem in the course of online classification is termed changing environments. Examples of changing environments include change in the underlying data distribution, change in the class definition, adding or removing a feature. The two general strategies for handling changing environments are (i) constant update of...
We propose a strategy for updating the learning rate parameter of online linear classifiers for streaming data with concept drift. The change in the learning rate is guided by the change in a running estimate of the classification error. In addition, we propose an online version of the standard linear discriminant classifier (O-LDC) in which the in...
We study streaming data where the true labels come with a delay. The question is whether the online nearest neighbour classifier (IB2 and IB3 here) should employ the unlabelled data. Three strategies are examined: do-nothing, replace and forget. Experiments with 28 data sets show that IB2 benefits from unlabelled data, while IB3 does not.
The abundance of unlabelled data alongside limited labelled data has provoked significant interest in semi-supervised learning methods. “Naïve labelling” refers to the following simple strategy for using unlabelled data in on-line classification. A new data point is first labelled by the current classifier and then added to the training set togethe...
We develop the classification part of a system that analyses transmitted light microscope images of dispersed kerogen preparation. The system automatically extracts kerogen pieces from the image and labels each piece as either inertinite or vitrinite. The image pre-processing analysis consists of background removal, identification of kerogen materi...
Identification of fossil material under a microscope is the basis of micropalaeontology. Our task is to locate and count the pieces of inertinite and vitrinite in images of sieve-sampled rock. The classical watershed algorithm oversegments the objects because of their irregular shapes. In this paper we propose a method for locating multiple objects i...
We derive a tight dependency-related bound on the difference between the Naïve Bayes (NB) error and Bayes error for two binary features and two equiprobable classes. A measure of discrepancy of feature dependencies is proposed for multiple features. Its correlation with NB is shown using 23 real data sets.
Simple classifiers, including LDC, have often been praised for their robustness and accuracy. Here we consider an online version of LDC applied to streaming data with concept drift. The classifier is trained on a moving window containing the latest N observations. Current approaches to determining the window size are mostly heuristic. The talk pr...