Conference Paper

Approximate Splitting for Ensembles of Trees using Histograms.

In: Proceedings of the Second SIAM International Conference on Data Mining, Arlington, VA, USA, April 11-13, 2002
Source: DBLP
  • ABSTRACT: Many dimension reduction methods have been proposed to discover the intrinsic, lower-dimensional structure of a high-dimensional dataset. However, determining critical features in datasets that consist of a large number of features is still a challenge. In this paper, through a series of carefully designed experiments on real-world datasets, we investigate the performance of different dimension reduction techniques, ranging from feature subset selection to methods that transform the features into a lower-dimensional space. We also discuss methods that calculate the intrinsic dimensionality of a dataset in order to understand the reduced dimension. Using several evaluation strategies, we show how these different methods can provide useful insights into the data. These comparisons enable us to provide guidance to a user on the selection of a technique for their dataset.
    01/2012;
  • ABSTRACT: Multiclassifiers (classifier ensembles) are currently an area of interest within Pattern Recognition. This thesis presents three multiclassifier methods: "Cascades for nominal data", "Disturbing Neighbors", and "Random Feature Weights". Cascades allow classifiers that require numerical inputs to improve their results by taking, as additional inputs, the probability estimates of another classifier that can handle nominal data. "Disturbing Neighbors" augments the training set of each base classifier with the output of an NN classifier; the NN classifier of each base classifier is obtained at random. "Random Feature Weights" is a method that uses trees as base classifiers and modifies their merit function with a random weight (a minimal sketch of this idea follows the list below). The thesis also contributes new diagrams for visualizing the diversity of the base classifiers: Kappa-Error Movement Diagrams and Kappa-Error Relative Movement Diagrams.
    01/2010;
  • ABSTRACT: The idea of ensemble methodology is to build a predictive model by integrating multiple models. It is well known that ensemble methods can be used to improve prediction performance. Researchers from various disciplines, such as statistics and AI, have considered the use of ensemble methodology. This paper reviews existing ensemble techniques and can serve as a tutorial for practitioners interested in building ensemble-based systems.
    Artificial Intelligence Review 01/2010; 33:1-39.
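
The "Random Feature Weights" idea summarized in the second abstract above can be illustrated with a minimal sketch, assuming the random weight simply multiplies a standard information-gain merit for each attribute and that an exponent parameter p controls how strongly the weights perturb the split choice. The names information_gain and rfw_split_scores, and this exact multiplicative form, are illustrative assumptions rather than the thesis's formulation, which the abstract does not spell out.

import numpy as np

def information_gain(x_col, y, threshold):
    # Standard entropy-based gain of splitting labels y on x_col <= threshold.
    def entropy(labels):
        if len(labels) == 0:
            return 0.0
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return float(-np.sum(p * np.log2(p)))
    left, right = y[x_col <= threshold], y[x_col > threshold]
    children = (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
    return entropy(y) - children

def rfw_split_scores(X, y, thresholds, feature_weights, p=2.0):
    # Assumed Random-Feature-Weights form: the gain of each candidate split is
    # multiplied by that feature's random weight raised to the power p.
    return np.array([
        information_gain(X[:, j], y, thresholds[j]) * feature_weights[j] ** p
        for j in range(X.shape[1])
    ])

# Toy usage: each tree in the ensemble would draw its own weight vector,
# biasing different trees toward splitting on different features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)
weights = rng.uniform(size=X.shape[1])      # one random weight per feature (per tree)
thresholds = np.median(X, axis=0)           # one candidate threshold per feature
scores = rfw_split_scores(X, y, thresholds, weights)
print("weighted gains:", scores, "-> split on feature", int(np.argmax(scores)))

Because every tree draws its own weight vector, different trees are nudged toward splitting on different features, which is the source of base-classifier diversity the abstract refers to.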