Article

Maximum Margin Bayesian Network Classifiers

Dept. of Electr. Eng., Graz Univ. of Technol., Graz, Austria
IEEE Transactions on Pattern Analysis and Machine Intelligence (Impact Factor: 5.69). 04/2012; DOI: 10.1109/TPAMI.2011.149
Source: IEEE Xplore

ABSTRACT We present a maximum margin parameter learning algorithm for Bayesian network classifiers using a conjugate gradient (CG) method for optimization. In contrast to previous approaches, we maintain the normalization constraints on the parameters of the Bayesian network during optimization, i.e., the probabilistic interpretation of the model is not lost. This enables us to handle missing features in discriminatively optimized Bayesian networks. In experiments, we compare the classification performance of maximum margin parameter learning to conditional likelihood and maximum likelihood learning approaches. Discriminative parameter learning significantly outperforms generative maximum likelihood estimation for naive Bayes and tree augmented naive Bayes structures on all considered data sets. Furthermore, maximizing the margin dominates the conditional likelihood approach in terms of classification performance in most cases. We provide results for a recently proposed maximum margin optimization approach based on convex relaxation [1]. While the classification results are highly similar, our CG-based optimization is computationally up to orders of magnitude faster. Margin-optimized Bayesian network classifiers achieve classification performance comparable to support vector machines (SVMs) using fewer parameters. Moreover, we show that unanticipated missing feature values during classification can be easily processed by discriminatively optimized Bayesian network classifiers, a case where discriminative classifiers usually require mechanisms to complete unknown feature values in the data first.

0 Bookmarks
 · 
314 Views
  • Source
    Conference Paper: Integer Bayesian Networks
    European Conference on Machine Learning (ECML); 01/2014
  • Source
    IEEE Transactions on Pattern Analysis and Machine Intelligence 01/2014; · 5.69 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Stochastic discriminative EM (sdEM) is an online-EM-type algorithm for discriminative training of probabilistic generative models belonging to the exponential family. In this work, we introduce and justify this algorithm as a stochastic natural gradient descent method, i.e. a method which accounts for the information geometry in the parameter space of the statistical model. We show how this learning algorithm can be used to train probabilistic generative models by minimizing different discriminative loss functions, such as the negative conditional log-likelihood and the Hinge loss. The resulting models trained by sdEM are always generative (i.e. they define a joint probability distribution) and, in consequence, allows to deal with missing data and latent variables in a principled way either when being learned or when making predictions. The performance of this method is illustrated by several text classification problems for which a multinomial naive Bayes and a latent Dirichlet allocation based classifier are learned using different discriminative loss functions.
    10/2014;

Full-text (2 Sources)

Download
172 Downloads
Available from
May 26, 2014