
Accurate and Efficient Selection of Voting Ensembles

Authors:

Abstract

In this paper, we present a method for efficient selection of heterogeneous majority voting ensembles. Given a set A of algorithms, the set of possible voters V over A is exponential in the size of A. Thus, it is not computationally feasible to select the best ensemble by evaluating each possibility. Instead, we compute the classification accuracies of a small subset of A ∪ V, and use these values to predict the accuracy of all the remaining elements in the union. We demonstrate that this procedure, called landmarking, estimates performance well, allowing for the selection of good voting ensembles at significantly reduced computational cost. We also conduct statistical tests to show a link between the correlation of performance patterns and the diversity of a pair of algorithms.
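The abstract's procedure can be sketched in code: evaluate a handful of "landmark" ensembles exactly, then use those measurements to predict the accuracy of every other candidate and pick the best predicted one. The sketch below is our own illustration under stated assumptions, not the paper's method: the function names are hypothetical, and the prediction model (a least-squares fit from mean member accuracy to ensemble accuracy) is a deliberately simple stand-in for whatever landmarking model the authors use.

```python
from itertools import combinations
from random import Random

def majority_vote(predictions):
    """Combine per-model 0/1 predictions by simple majority."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0

def ensemble_accuracy(members, preds, labels):
    """Exact accuracy of a majority-voting ensemble on a labeled set."""
    correct = 0
    for i, y in enumerate(labels):
        if majority_vote([preds[m][i] for m in members]) == y:
            correct += 1
    return correct / len(labels)

def landmark_select(preds, labels, n_landmarks=5, seed=0):
    """Evaluate only a few landmark ensembles exactly, then rank the
    rest by a cheap proxy (mean member accuracy) calibrated on them.
    This proxy is an assumption for illustration, not the paper's model."""
    algos = list(preds)
    # Odd-sized committees avoid ties under majority voting.
    candidates = [c for r in (3, 5) for c in combinations(algos, r)]
    rng = Random(seed)
    landmarks = rng.sample(candidates, min(n_landmarks, len(candidates)))
    solo = {a: ensemble_accuracy((a,), preds, labels) for a in algos}

    # Fit accuracy ~ a * mean_member_accuracy + b by least squares
    # on the landmark ensembles only.
    xs = [sum(solo[m] for m in c) / len(c) for c in landmarks]
    ys = [ensemble_accuracy(c, preds, labels) for c in landmarks]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs) or 1e-9
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    b = my - a * mx

    def predicted(c):
        return a * (sum(solo[m] for m in c) / len(c)) + b

    # Only len(landmarks) + len(algos) exact evaluations were needed,
    # versus one per candidate for exhaustive search.
    return max(candidates, key=predicted)
```

The point of the structure is the cost accounting: exhaustive search needs one exact evaluation per element of V, while landmarking needs exact evaluations only for the individual algorithms and the few landmark ensembles.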

No full-text available


This is the first of two papers that use off-training set (OTS) error to investigate the assumption-free relationship between learning algorithms. This first paper discusses the senses in which there are no a priori distinctions between learning algorithms. (The second paper discusses the senses in which there are such distinctions.) In this first paper it is shown, loosely speaking, that for any two algorithms A and B, there are “as many” targets (or priors over targets) for which A has lower expected OTS error than B as vice versa, for loss functions like zero-one loss. In particular, this is true if A is cross-validation and B is “anti-cross-validation” (choose the learning algorithm with largest cross-validation error). This paper ends with a discussion of the implications of these results for computational learning theory. It is shown that one cannot say: if empirical misclassification rate is low, the Vapnik-Chervonenkis dimension of your generalizer is small, and the training set is large, then with high probability your OTS error is small. Other implications for “membership queries” algorithms and “punting” algorithms are also discussed.
References

[Authors, 2005] Estimating the Performance of Heterogeneous Majority Voting Ensembles via Landmarking. Submitted to the 6th International Workshop on Multiple Classifier Systems, 2005.

[Blake and Merz, 1998] Blake, C., and Merz, C. UCI Repository of Machine Learning Databases. University of California, Irvine, Department of Information and Computer Sciences, 1998.