ABSTRACT: We have previously described an incremental learning algorithm, Learn++.NC, for learning from new datasets that may include new concept classes without accessing previously seen data. We now propose
an extension, Learn++.UDNC, that allows the algorithm to incrementally learn new concept classes from unbalanced datasets. We describe the algorithm
in detail and provide experimental results on two representative scenarios (on synthetic as well as real-world
data), along with comparisons to other approaches for incremental learning and/or unbalanced datasets.
Keywords: Incremental Learning; Ensembles of Classifiers; Learn++; Unbalanced Data
ABSTRACT: We have previously introduced an incremental learning algorithm, Learn++, which learns novel information from consecutive data sets by generating an ensemble of classifiers with each data set and combining them by weighted majority voting. However, Learn++ suffers from an inherent "outvoting" problem when asked to learn a new class ω_new introduced by a subsequent data set, as earlier classifiers not trained on this class are guaranteed to misclassify ω_new instances. The collective votes of earlier classifiers, for an inevitably incorrect decision, then outweigh the votes of the new classifiers' correct decision on ω_new instances, until there are enough new classifiers to counteract the unfair outvoting. This forces Learn++ to generate an unnecessarily large number of classifiers. This paper describes Learn++.NC, specifically designed for efficient incremental learning of multiple new classes using significantly fewer classifiers. To do so, Learn++.NC introduces dynamically weighted consult and vote (DW-CAV), a novel voting mechanism for combining classifiers: individual classifiers consult with each other to determine which ones are most qualified to classify a given instance, and decide how much weight, if any, each classifier's decision should carry. Experiments on real-world problems indicate that the new algorithm performs remarkably well with substantially fewer classifiers, not only compared to its predecessor Learn++, but also compared to several other algorithms recently proposed for similar problems.
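The outvoting problem described in this abstract can be illustrated with a small sketch. The first function is ordinary weighted majority voting. The second is a deliberately simplified "consult" step, written only to show the idea of discounting classifiers that never saw a class; its update rule, the function names, and the example weights are assumptions and should not be read as the published DW-CAV procedure.

```python
from collections import defaultdict

def weighted_majority_vote(votes, weights):
    """Return the label with the largest total voting weight."""
    totals = defaultdict(float)
    for label, weight in zip(votes, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

def consult_and_vote(votes, weights, known_classes):
    """Hypothetical simplification: discount classifiers that were never
    trained on a class that the classifiers trained on it agree on."""
    classes = set().union(*known_classes)
    # Support for class c = weighted share of votes for c among the
    # classifiers that were actually trained on c.
    support = {}
    for c in classes:
        trained = [(v, w) for v, w, k in zip(votes, weights, known_classes) if c in k]
        total = sum(w for _, w in trained)
        support[c] = sum(w for v, w in trained if v == c) / total if total else 0.0
    # Shrink each classifier's weight by the support of every class it does not know.
    adjusted = []
    for w, k in zip(weights, known_classes):
        for c in classes - k:
            w *= 1.0 - support[c]
        adjusted.append(w)
    return weighted_majority_vote(votes, adjusted)

# Three earlier classifiers know only classes "a" and "b"; one later
# classifier also knows "new". The test instance belongs to "new".
votes = ["a", "a", "b", "new"]
weights = [1.0, 1.0, 1.0, 1.5]
known_classes = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"a", "b", "new"}]

plain = weighted_majority_vote(votes, weights)
consulted = consult_and_vote(votes, weights, known_classes)
```

In this toy case plain voting returns "a" because the earlier classifiers collectively outweigh the single later one, while the discounted vote returns "new".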
ABSTRACT: We describe an ensemble-of-classifiers-based algorithm for incremental learning in nonstationary environments. In this formulation, we assume that the learner is presented with a series of training datasets, each of which is drawn from a different snapshot of a distribution that is drifting at an unknown rate. Furthermore, we assume that the algorithm must learn the new environment in an incremental manner, that is, without having access to previously available data. Instead of a time window over incoming instances or age-based forgetting (as used by most ensemble-based nonstationary learning algorithms), a strategic weighting mechanism is employed that tracks the classifiers' performances over drifting environments to determine appropriate voting weights. Specifically, the proposed approach generates a single classifier for each dataset that becomes available, and then combines them through a dynamically modified weighted majority voting, where the voting weights themselves are computed as weighted averages of the classifiers' individual performances over all environments. We describe the implementation details of this approach, as well as its initial results on simulated nonstationary environments.
Multiple Classifier Systems, 7th International Workshop, MCS 2007, Prague, Czech Republic, May 23-25, 2007, Proceedings; 01/2007
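The weighting idea in the abstract above, voting weights computed as weighted averages of each classifier's performance over the environments it has seen, can be sketched as follows. The abstract does not give the averaging scheme, so the linear emphasis on recent environments, the function name, and the accuracy values are assumptions chosen for illustration.

```python
def voting_weight(accuracies, recency=None):
    """Weighted average of one classifier's accuracies, oldest environment first.

    `recency` holds one non-negative weight per environment. The default
    (1, 2, 3, ...) is an assumed linear emphasis on recent environments.
    """
    if recency is None:
        recency = list(range(1, len(accuracies) + 1))
    total = sum(recency)
    return sum(a * r for a, r in zip(accuracies, recency)) / total

# A classifier trained three environments ago whose accuracy has decayed
# under drift, and a classifier trained on the current environment.
old_weight = voting_weight([0.9, 0.7, 0.5])
new_weight = voting_weight([0.8])
```

With these numbers the older classifier keeps a nonzero but smaller voting weight than the newer one, instead of being discarded by a fixed time window.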
ABSTRACT: We have previously introduced the Learn++ algorithm, which provides surprisingly promising performance for incremental learning as well as data fusion applications.
In this contribution we show that the algorithm can also be used to estimate the posterior probability, or the confidence,
of its decision on each test instance. On three increasingly difficult tests that are specifically designed to compare the posterior
probability estimates of the algorithm to those of the optimal Bayes classifier, we have observed that the estimated posterior
probability approaches that of the Bayes classifier as the number of classifiers in the ensemble increases. This satisfying
and intuitively expected outcome shows that ensemble systems can also be used to estimate the confidence of their output.
Multiple Classifier Systems, 6th International Workshop, MCS 2005, Seaside, CA, USA, June 13-15, 2005, Proceedings; 01/2005
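The abstract above does not state how the ensemble's confidence is computed. As a minimal sketch under that caveat, one simple option is to normalize the total voting weight received by each class so that the values sum to one. The normalization, the function name, and the example votes are assumptions, not the formula from the paper.

```python
def vote_confidence(votes, weights):
    """Normalize per-class voting weight into values that sum to 1."""
    totals = {}
    for label, weight in zip(votes, weights):
        totals[label] = totals.get(label, 0.0) + weight
    norm = sum(totals.values())
    return {label: total / norm for label, total in totals.items()}

# Two classifiers vote for "a" and one for "b".
confidence = vote_confidence(["a", "a", "b"], [2.0, 1.0, 1.0])
```

Here class "a" receives three quarters of the total voting weight, which this sketch would report as the ensemble's confidence in that decision.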
ABSTRACT: An ensemble-based algorithm, Learn++.MT2, is introduced as an enhanced alternative to our previously reported incremental learning algorithm, Learn++. Both algorithms are capable of incrementally learning novel information from new datasets that consecutively become available, without requiring access to the previously seen data. In this contribution, we describe Learn++.MT2, which specifically targets incremental learning from distinctly unbalanced data, where the amount of data that becomes available varies significantly from one database to the next. The problem of unbalanced data within the context of incremental learning is discussed first, followed by a description of the proposed solution. Initial, yet promising, results indicate considerable improvement in the generalization performance and the stability of the algorithm.
ABSTRACT: An ensemble-of-classifiers-based algorithm, Learn++, was recently introduced that is capable of incrementally learning new information from datasets that consecutively become available, even if the new data introduce additional classes that were not formerly seen. The algorithm does not require access to previously used datasets, yet it is capable of largely retaining the previously acquired knowledge. However, Learn++ suffers from the inherent "out-voting" problem when asked to learn new classes, which causes it to generate an unnecessarily large number of classifiers. This paper proposes a modified version of this algorithm, called Learn++.MT, that not only reduces the number of classifiers generated, but also provides performance improvements. The out-voting problem, the new algorithm, and its promising results on two benchmark datasets as well as on one real-world application are presented.
Multiple Classifier Systems, 5th International Workshop, MCS 2004, Cagliari, Italy, June 9-11, 2004, Proceedings; 01/2004