Chapter

Incremental Extreme Learning Machine for Binary Data Stream Classification


Abstract

Classifier ensembles have shown the ability to classify drifting data streams. The following paper proposes an ensemble consisting of a single hidden layer feedforward neural network and an Extreme Learning Machine. For this purpose, a new incremental version of the Extreme Learning Machine is also proposed. The motivations behind such an approach are described in detail and supported by the conducted research. The achieved results show when the architecture might be most useful and indicate possible directions for the future development of this method.


References
Article
Concept drift primarily refers to an online supervised learning scenario when the relation between the input data and the target variable changes over time. Assuming a general knowledge of supervised learning in this article, we characterize adaptive learning processes; categorize existing strategies for handling concept drift; overview the most representative, distinct, and popular techniques and algorithms; discuss evaluation methodology of adaptive algorithms; and present a set of illustrative applications. The survey covers the different facets of concept drift in an integrated way to reflect on the existing scattered state of the art. Thus, it aims at providing a comprehensive introduction to the concept drift adaptation for researchers, industry analysts, and practitioners.
Article
Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
Article
Most streaming decision models evolve continuously over time, run in resource-aware environments, and detect and react to changes in the environment generating data. One important issue, not yet convincingly addressed, is the design of experimental work to evaluate and compare decision models that evolve over time. This paper proposes a general framework for assessing predictive stream learning algorithms. We defend the use of prequential error with forgetting mechanisms to provide reliable error estimators. We prove that, in stationary data and for consistent learning algorithms, the holdout estimator, the prequential error, and the prequential error estimated over a sliding window or using fading factors all converge to the Bayes error. The use of prequential error with forgetting mechanisms proves advantageous in assessing performance and in comparing stream learning algorithms. It is also worthwhile to use the proposed methods for hypothesis testing and for change detection. In a set of experiments in drift scenarios, we evaluate the ability of a standard change detection algorithm to detect change using three prequential error estimators. These experiments point out that the use of forgetting mechanisms (sliding windows or fading factors) is required for fast and efficient change detection. In comparison to sliding windows, fading factors are faster and memoryless, both important requirements for streaming applications. Overall, this paper is a contribution to a discussion on best practice for performance assessment when learning is a continuous process and the decision models are dynamic and evolve over time.
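The fading-factor prequential estimator described above can be sketched in a few lines: each new 0/1 loss is added while the accumulated past is discounted by a factor alpha (the class name and default alpha are illustrative assumptions):

```python
class FadingPrequentialError:
    """Prequential (test-then-train) error with a fading factor:
    recent losses dominate, old losses decay geometrically, and no
    window of past examples needs to be stored."""

    def __init__(self, alpha=0.99):
        self.alpha = alpha
        self.s = 0.0   # discounted sum of losses
        self.n = 0.0   # discounted example count

    def update(self, loss):
        self.s = loss + self.alpha * self.s
        self.n = 1.0 + self.alpha * self.n
        return self.s / self.n
```

Unlike a sliding window, this estimator keeps only two scalars, which is why the paper calls fading factors faster and memoryless.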
Conference Paper
It is clear that the learning speed of feedforward neural networks is in general far slower than required, and it has been a major bottleneck in their applications for past decades. Two key reasons behind this may be: 1) slow gradient-based learning algorithms are extensively used to train neural networks, and 2) all the parameters of the networks are tuned iteratively by such learning algorithms. Unlike these traditional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single-hidden layer feedforward neural networks (SLFNs), which randomly chooses the input weights and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide the best generalization performance at extremely fast learning speed. The experimental results, based on real-world benchmark function approximation and classification problems including large complex applications, show that the new algorithm can produce the best generalization performance in some cases and can learn much faster than traditional popular learning algorithms for feedforward neural networks.
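The training step described above — random input weights, output weights determined analytically — fits in a few lines. In this sketch the function names and the tanh activation are illustrative choices; the output weights come from the Moore-Penrose pseudoinverse, as in the original ELM formulation:

```python
import numpy as np

def elm_fit(X, y, n_hidden=50, seed=0):
    """Minimal ELM sketch: the hidden layer is random and never trained;
    only the linear output weights are solved in closed form."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random biases
    H = np.tanh(X @ W + b)                        # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y                  # least-squares output weights
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta
```

There is no iterative tuning at all, which is the source of the extreme training speed the abstract refers to.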
Article
stream-learn is a Python package compatible with scikit-learn and developed for the drifting and imbalanced data stream analysis. Its main component is a stream generator, which allows producing a synthetic data stream that may incorporate each of the three main concept drift types (i.e., sudden, gradual and incremental drift) in their recurring or non-recurring version, as well as static and dynamic class imbalance. The package allows conducting experiments following established evaluation methodologies (i.e., Test-Then-Train and Prequential). Besides, estimators adapted for data stream classification have been implemented, including both simple classifiers and state-of-the-art chunk-based and online classifier ensembles. The package utilises its own implementations of prediction metrics for imbalanced binary classification tasks to improve computational efficiency.
Article
In recent years, the availability of time series streaming information has been growing enormously. Learning from real-time data has been receiving increasingly more attention since the last decade. Online learning encounters changes in the distribution of data while extracting considerable information from data streams. Hidden data contexts, which are not known to the learning algorithms, are known as concept drift. A classifier classifies incoming instances using past training instances of the data stream, and its accuracy deteriorates because of concept drift. Traditional classifiers are not expected to learn the patterns in a non-stationary distribution of data. For any real-time use, the classifier needs to detect concept drift and adapt over time. In the real-time scenario, we have to deal with semi-supervised and unsupervised data, which provide few or no labeled examples. The motivation behind this paper is to introduce a survey providing a broad categorization of concept drift detectors with their key points, limitations, and advantages. Finally, the article suggests research trends, research challenges, and future work. The adaptive mechanisms are also incorporated in this survey.
Conference Paper
Despite the fact that real-life data streams may often be characterized by the dynamic changes in the prior class probabilities, there is a scarcity of articles trying to clearly describe and classify this problem as well as suggest new methods dedicated to resolving this issue. The following paper aims to fill this gap by proposing a novel data stream taxonomy defined in the context of prior class probability and by introducing the Dynamic Statistical Concept Analysis (DSCA) – prior probability estimation algorithm. The proposed method was evaluated using computer experiments carried out on 100 synthetically generated data streams with various class imbalance characteristics. The obtained results, supported by statistical analysis, confirmed the usefulness of the proposed solution, especially in the case of discrete dynamically imbalanced data streams (DDIS).
Article
The imbalanced data classification remains a vital problem. The key is to find methods that classify both the minority and the majority class correctly. The paper presents a classifier ensemble for classifying binary, non-stationary and imbalanced data streams, where the Hellinger Distance is used to prune the ensemble. The paper includes an experimental evaluation of the method. The first experiment checks the impact of the base classifier type on the quality of the classification. In the second experiment, the Hellinger Distance Weighted Ensemble (HDWE) method is compared to selected state-of-the-art methods using a statistical test with two base classifiers. The method was thoroughly tested on many imbalanced data streams, and the obtained results confirmed the HDWE method's usefulness.
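The Hellinger Distance used for pruning can be computed directly for two discrete distributions. How HDWE builds those distributions from the stream is not described in this abstract, so the sketch below shows only the distance itself:

```python
from math import sqrt

def hellinger(p, q):
    """Hellinger distance between two discrete distributions:
    0 for identical distributions, 1 for distributions with
    disjoint support. Inputs are probability vectors of equal length."""
    return sqrt(sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(p, q))) / sqrt(2)
```

A useful property for imbalanced streams is that, unlike accuracy-based criteria, the Hellinger distance between class-conditional distributions is insensitive to the class imbalance ratio itself.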
Chapter
Learning from the non-stationary imbalanced data stream is a serious challenge to the machine learning community. There is a significant number of works addressing the issue of classifying non-stationary data stream, but most of them do not take into consideration that the real-life data streams may exhibit high and changing class imbalance ratio, which may complicate the classification task. This work attempts to connect two important, yet rarely combined, research trends in data analysis, i.e., non-stationary data stream classification and imbalanced data classification. We propose a novel framework for training base classifiers and preparing the dynamic selection dataset (DSEL) to integrate data preprocessing and dynamic ensemble selection (DES) methods for imbalanced data stream classification. The proposed approach has been evaluated on the basis of computer experiments carried out on 72 artificially generated data streams with various imbalance ratios, levels of label noise and types of concept drift. In addition, we consider six variations of preprocessing methods and four DES methods. Experimentation results showed that dynamic ensemble selection, even without the use of any data preprocessing, can outperform a naive combination of the whole pool generated with the use of preprocessing methods. Combining DES with preprocessing further improves the obtained results.
Article
Due to the variety of modern real-life tasks in which the analyzed data is often not a static set, data stream mining has gained substantial attention in the machine learning community. The main property of such systems is the large amount of data arriving sequentially, creating an endless stream of objects. Taking into consideration limited resources such as memory and computational power, it is widely accepted that each instance can be processed at most once and is not stored, making re-evaluation impossible. In the following work, we focus on the data stream classification task, where the parameters of a classification model may vary over time, so the model should be able to adapt to the changes. This requires a forgetting mechanism ensuring that outdated samples do not impact the model. The most popular approaches are based on so-called windowing, which requires storing a batch of objects; when new examples arrive, the least relevant ones are forgotten. Objects in a new window are used to retrain the model, which is cumbersome, especially for online learners, and contradicts the principle of processing each object at most once. Therefore, this work employs the inbuilt forgetting mechanism of neural networks. Additionally, to reduce the need for expensive (sometimes even impossible) object labeling, we focus on active learning, which asks for labels only for interesting examples that are crucial for appropriate model updating. The characteristics of the proposed methods were evaluated in computer experiments performed on a diverse pool of data streams, and the results confirmed the usefulness of the proposed strategy.
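The active learning idea described above — requesting labels only for informative examples — is often realized with an uncertainty test; the following minimal sketch (the function name and threshold are illustrative assumptions, not the paper's exact rule) queries a label only when the classifier is undecided:

```python
def query_label(prob_positive, width=0.15):
    """Uncertainty-based labeling rule sketch: request the true label
    only when the classifier's positive-class probability falls inside
    the uncertainty region around 0.5."""
    return abs(prob_positive - 0.5) < width
```

Examples far from the decision boundary are skipped, so the labeling budget is spent on the instances most likely to change the model.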
Conference Paper
Multiple classifier systems are used to improve the performance of base classifiers. One of the most important steps in the formation of multiple classifier systems is the integration process, in which the base classifiers' outputs are combined. The most commonly used classifier outputs are class labels, the ranking list of possible classes, or confidence levels. In this paper, we propose an integration process which takes place in the "geometry space", meaning that we use the decision boundary in the integration process. The results of the experiment based on several data sets show that the proposed integration algorithm is a promising method for the development of multiple classifier systems.
Article
In many applications of information systems learning algorithms have to act in dynamic environments where data are collected in the form of transient data streams. Compared to static data mining, processing streams imposes new computational requirements for algorithms to incrementally process incoming examples while using limited memory and time. Furthermore, due to the non-stationary characteristics of streaming data, prediction models are often also required to adapt to concept drifts. Out of several new proposed stream algorithms, ensembles play an important role, in particular for non-stationary environments. This paper surveys research on ensembles for data stream classification as well as regression tasks. Besides presenting a comprehensive spectrum of ensemble approaches for data streams, we also discuss advanced learning concepts such as imbalanced data streams, novelty detection, active and semi-supervised learning, complex data representations and structured outputs. The paper concludes with a discussion of open research problems and lines of future research.
Article
Extreme learning machine (ELM) is a competitive machine learning technique, which is simple in theory and fast in implementation. The network types are "generalized" single hidden layer feedforward networks, which are quite diversified in the form of variety in feature mapping functions or kernels. To deal with data with imbalanced class distribution, a weighted ELM is proposed which is able to generalize to balanced data. The proposed method maintains the advantages of the original ELM: (1) it is simple in theory and convenient in implementation; (2) a wide range of feature mapping functions or kernels is available for the proposed framework; (3) the proposed method can be applied directly to multiclass classification tasks. In addition, after integrating the weighting scheme: (1) the weighted ELM is able to deal with data with imbalanced class distribution while maintaining good performance on well-balanced data, as the unweighted ELM does; (2) by assigning different weights to each example according to users' needs, the weighted ELM can be generalized to cost-sensitive learning.
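The weighting scheme described above can be sketched as weighted ridge regression on the hidden-layer outputs, with per-sample weights inversely proportional to class frequency (one common choice in the weighted ELM literature; function and parameter names here are illustrative):

```python
import numpy as np

def weighted_elm_fit(X, y, n_hidden=50, C=100.0, seed=0):
    """Weighted ELM sketch for imbalanced binary data: fixed random
    hidden layer, output weights from weighted regularized least squares
    where minority-class samples receive larger weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    w = 1.0 / np.bincount(y)[y]                   # inverse class frequency
    T = np.where(y == 1, 1.0, -1.0)               # +/-1 regression targets
    # Solve (H^T diag(w) H + I/C) beta = H^T diag(w) T
    A = H.T @ (H * w[:, None]) + np.eye(n_hidden) / C
    beta = np.linalg.solve(A, H.T @ (w * T))
    return W, b, beta

def weighted_elm_predict(model, X):
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta > 0).astype(int)
```

The closed-form solve keeps the ELM's speed advantage; only the diagonal weight matrix changes relative to the unweighted version.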
Article
Many organizations today have more than very large databases; they have databases that grow without limit at a rate of several million records per day. Mining these continuous data streams brings unique opportunities, but also new challenges. This paper describes and evaluates VFDT, an anytime system that builds decision trees using constant memory and constant time per example. VFDT can incorporate tens of thousands of examples per second using off-the-shelf hardware. It uses Hoeffding bounds to guarantee that its output is asymptotically nearly identical to that of a conventional learner. We study VFDT's properties and demonstrate its utility through an extensive set of experiments on synthetic data. We apply VFDT to mining the continuous stream of Web access data from the whole University of Washington main campus.
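The Hoeffding bound that VFDT uses to decide when it has seen enough examples to pick a split attribute is a one-line formula, epsilon = sqrt(R^2 ln(1/delta) / (2n)):

```python
from math import log, sqrt

def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound as used by VFDT: with probability at least
    1 - delta, the mean observed over n samples of a variable with the
    given range is within epsilon of the true mean."""
    return sqrt(value_range ** 2 * log(1.0 / delta) / (2.0 * n))
```

When the observed gap between the best and second-best split criterion exceeds this epsilon, the split decision is statistically safe, which is what makes the tree asymptotically nearly identical to one trained on all the data.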
P. Zyblewski, Classifier selection for imbalanced data stream classification (2021)