Matthew J. Beal’s research while affiliated with State University of New York and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (39)


Graphical Models and Variational Methods
  • Chapter

June 2001

·

17 Citations

Zoubin Ghahramani

·

Matthew J. Beal

A major problem in modern probabilistic modeling is the huge computational complexity involved in typical calculations with multivariate probability distributions when the number of random variables is large. Because exact computations are infeasible in such cases and Monte Carlo sampling techniques may reach their limits, there is a need for methods that allow for efficient approximate computations. One of the simplest approximations is based on the mean field method, which has a long history in statistical physics. The method is widely used, particularly in the growing field of graphical models. Researchers from disciplines such as statistical physics, computer science, and mathematical statistics are studying ways to improve this and related methods and are exploring novel application areas. Leading approaches include the variational approach, which goes beyond factorizable distributions to achieve systematic improvements; the TAP (Thouless-Anderson-Palmer) approach, which incorporates correlations by including effective reaction terms in the mean field theory; and the more general methods of graphical models. Bringing together ideas and techniques from these diverse disciplines, this book covers the theoretical foundations of advanced mean field methods, explores the relation between the different approaches, examines the quality of the approximation obtained, and demonstrates their application to various areas of probabilistic modeling. (Bradford Books imprint.)
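As a concrete illustration of the mean field idea (our own sketch, not taken from the book), the classic toy case approximates a correlated bivariate Gaussian by a product of independent Gaussians; the coordinate-ascent updates below follow from minimising KL(q‖p), and all variable names and settings are our own assumptions:

```python
import numpy as np

# Target: p(x) = N(mu, Lambda^{-1}) with a correlated precision matrix Lambda.
mu = np.array([1.0, -1.0])
Lam = np.array([[2.0, 0.9],
                [0.9, 2.0]])

# Mean field: q(x) = q1(x1) q2(x2), with qi = N(m_i, 1/Lambda_ii).
# Coordinate ascent on KL(q || p) gives the classic updates:
#   m1 = mu1 - (Lam12 / Lam11) * (m2 - mu2)
#   m2 = mu2 - (Lam21 / Lam22) * (m1 - mu1)
m = np.zeros(2)
for _ in range(50):
    m[0] = mu[0] - (Lam[0, 1] / Lam[0, 0]) * (m[1] - mu[1])
    m[1] = mu[1] - (Lam[1, 0] / Lam[1, 1]) * (m[0] - mu[0])

print("mean-field means:", m)                      # converge to the true mean mu
print("mean-field variances:", 1 / np.diag(Lam))   # underestimate the true marginals
```

The run shows the characteristic behaviour of mean field on Gaussians: the means are recovered exactly while the variances are underestimated, since each factor ignores the correlation.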


Propagation Algorithms for Variational Bayesian Learning

February 2001

·

179 Reads

·

300 Citations

Advances in Neural Information Processing Systems

Variational approximations are becoming a widespread tool for Bayesian learning of graphical models. We provide some theoretical results for the variational updates in a very general family of conjugate-exponential graphical models. We show how the belief propagation and the junction tree algorithms can be used in the inference step of variational Bayesian learning. Applying these results to the Bayesian analysis of linear-Gaussian state-space models we obtain a learning procedure that exploits the Kalman smoothing propagation, while integrating over all model parameters. We demonstrate how this can be used to infer the hidden state dimensionality of the state-space model in a variety of synthetic problems and one real high-dimensional data set.
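For linear-Gaussian state-space models, the propagation the abstract refers to is Kalman filtering followed by Rauch-Tung-Striebel smoothing. Below is a minimal numpy sketch of that forward-backward pass (our own illustration; in the variational Bayesian version the parameters A, C, Q, R would be replaced by their variational expectations):

```python
import numpy as np

def kalman_smoother(y, A, C, Q, R, mu0, V0):
    """Kalman filter (forward) + RTS smoother (backward) for
    x_t = A x_{t-1} + w_t,  y_t = C x_t + v_t,  w ~ N(0,Q), v ~ N(0,R)."""
    T, k = len(y), A.shape[0]
    mu_f, V_f = np.zeros((T, k)), np.zeros((T, k, k))
    mu_p, V_p = mu0, V0                          # predictive mean / covariance
    for t in range(T):
        S = C @ V_p @ C.T + R                    # innovation covariance
        K = V_p @ C.T @ np.linalg.inv(S)         # Kalman gain
        mu_f[t] = mu_p + K @ (y[t] - C @ mu_p)
        V_f[t] = V_p - K @ C @ V_p
        mu_p, V_p = A @ mu_f[t], A @ V_f[t] @ A.T + Q
    mu_s, V_s = mu_f.copy(), V_f.copy()
    for t in range(T - 2, -1, -1):               # backward RTS pass
        mp, Vp = A @ mu_f[t], A @ V_f[t] @ A.T + Q
        J = V_f[t] @ A.T @ np.linalg.inv(Vp)     # smoother gain
        mu_s[t] = mu_f[t] + J @ (mu_s[t + 1] - mp)
        V_s[t] = V_f[t] + J @ (V_s[t + 1] - Vp) @ J.T
    return mu_s, V_s

# Toy usage: a 1-d random walk observed in noise.
A = C = Q = R = np.eye(1)
y = np.cumsum(np.random.default_rng(0).normal(size=(30, 1)), axis=0)
means, covs = kalman_smoother(y, A, C, Q, R, np.zeros(1), np.eye(1))
```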




Graphical Models and Variational Methods

October 2000

·

29 Reads

·

80 Citations

We review the use of variational methods for approximating inference and learning in probabilistic graphical models. In particular, we focus on variational approximations to the integrals required for Bayesian learning. For models in the conjugate-exponential family, a generalisation of the EM algorithm is derived that iterates between optimising the hyperparameters of the distribution over parameters and inferring the hidden variable distributions. These approximations make use of available propagation algorithms for probabilistic graphical models. We give two case studies of how the variational Bayesian approach can be used to learn model structure: inferring the number of clusters and dimensionalities in a mixture of factor analysers, and inferring the dimension of the state space of a linear dynamical system. Finally, importance sampling corrections to the variational approximations are discussed, along with their limitations.
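A schematic of the variational EM iteration the abstract describes, on a deliberately simple conjugate model (a two-component Gaussian mixture with known variances and mixing weights and Gaussian priors on the means; all names and settings here are our own toy assumptions, not the paper's models):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
N, K, sigma2, tau2 = len(x), 2, 1.0, 10.0   # known noise var, prior var
pi = np.full(K, 1.0 / K)                     # known mixing weights

m = rng.normal(size=K)                       # q(mu_k) = N(m_k, s2_k)
s2 = np.ones(K)
for _ in range(50):
    # VB-E step: responsibilities use E[log N(x | mu_k)] under q(mu_k),
    # which brings in E[mu_k^2] = m_k^2 + s2_k (the Bayesian correction).
    logr = (np.log(pi)
            - 0.5 * (x[:, None] ** 2 - 2 * x[:, None] * m
                     + m ** 2 + s2) / sigma2)
    r = np.exp(logr - logr.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # VB-M step: conjugate Gaussian update of the distribution over parameters.
    Nk = r.sum(axis=0)
    s2 = 1.0 / (1.0 / tau2 + Nk / sigma2)
    m = s2 * (r * x[:, None]).sum(axis=0) / sigma2

print("posterior means of the components:", m)
```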


Variational Inference for Bayesian Mixtures of Factor Analysers

September 2000

·

159 Reads

·

364 Citations

Advances in Neural Information Processing Systems

We present an algorithm that infers the model structure of a mixture of factor analysers using an efficient and deterministic variational approximation to full Bayesian integration over model parameters. This procedure can automatically determine the optimal number of components and the local dimensionality of each component (i.e. the number of factors in each factor analyser). Alternatively it can be used to infer posterior distributions over number of components and dimensionalities. Since all parameters are integrated out the method is not prone to overfitting. Using a stochastic procedure for adding components it is possible to perform the variational optimisation incrementally and to avoid local maxima. Results show that the method works very well in practice and correctly infers the number and dimensionality of nontrivial synthetic examples. By importance sampling from the variational approximation we show how to obtain unbiased estimates of the true evidence, the exa...
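The importance-sampling correction mentioned at the end is easy to sketch: draw parameters from the variational posterior q and average the weights p(D, θ)/q(θ), which gives an unbiased estimate of the evidence p(D). A toy version on a conjugate Gaussian-mean model, where the exact evidence is available for comparison (all settings are our own, and q is deliberately imperfect so the weights vary):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = rng.normal(1.0, 1.0, size=20)        # data; likelihood is N(theta, 1)
tau2 = 4.0                               # prior: theta ~ N(0, tau2)

# An approximate posterior q(theta): right mean, inflated variance.
post_var = 1.0 / (1.0 / tau2 + len(x))
post_mean = post_var * x.sum()
q = norm(post_mean, 1.3 * np.sqrt(post_var))

theta = q.rvs(5000, random_state=rng)
log_w = (norm.logpdf(x[:, None], theta, 1.0).sum(axis=0)   # log p(D | theta)
         + norm.logpdf(theta, 0.0, np.sqrt(tau2))          # + log p(theta)
         - q.logpdf(theta))                                # - log q(theta)
log_evidence = np.log(np.mean(np.exp(log_w - log_w.max()))) + log_w.max()

# Exact evidence: the x_i are jointly Gaussian, mean 0, cov I + tau2 * 11^T.
Sigma = np.eye(len(x)) + tau2
_, logdet = np.linalg.slogdet(Sigma)
exact = -0.5 * (len(x) * np.log(2 * np.pi) + logdet
                + x @ np.linalg.solve(Sigma, x))
print(log_evidence, exact)               # the two should agree closely
```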



Variational Inference for Bayesian Mixture of Factor Analysers

June 1999

·

39 Reads

·

63 Citations

We present an algorithm that infers the model structure of a mixture of factor analysers using an efficient and deterministic variational approximation to full Bayesian integration over model parameters. This procedure can automatically determine the optimal number of components and the local dimensionality of each component (i.e. the number of factors in each factor analyser). Alternatively it can be used to infer posterior distributions over number of components and dimensionalities. Since all parameters are integrated out the method is not prone to overfitting. Using a stochastic procedure for adding components it is possible to perform the variational optimisation incrementally and to avoid local maxima. Results show that the method works very well in practice and correctly infers the number and dimensionality of nontrivial synthetic examples. 1 Introduction Factor analysis (FA) is a method for modelling correlations in multidimensional data. The model assumes that each p-dimensi...
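For reference, the factor analysis generative model the abstract builds on assumes each observed vector is a linear map of a lower-dimensional latent factor plus diagonal Gaussian noise; a small sketch, with dimensions and names chosen by us for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
p, k, n = 5, 2, 10000                 # observed dim, latent dim, samples

W = rng.normal(size=(p, k))           # factor loadings
mu = rng.normal(size=p)               # mean
psi = rng.uniform(0.1, 0.5, size=p)   # diagonal noise variances

z = rng.normal(size=(n, k))           # latent factors z ~ N(0, I_k)
eps = rng.normal(size=(n, p)) * np.sqrt(psi)
x = mu + z @ W.T + eps                # x = W z + mu + eps

# The model covariance is W W^T + Psi; the sample covariance should match.
print(np.allclose(np.cov(x.T), W @ W.T + np.diag(psi), atol=0.1))
```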


Speeding Up Multi-class SVM Evaluation via Principle Component Analysis and Recursive Feature Elimination
  • Article
  • Full-text available

128 Reads

·

4 Citations

Support Vector Machines (SVM) have been shown to yield state-of-the-art performance in many pattern analysis applications. Feature selection methods for SVMs are often used to reduce the complexity of learning and evaluation. In this article we propose to combine a standard method, Recursive Feature Elimination (RFE), with Principal Component Analysis (PCA) to produce a multi-class SVM framework for evaluation speedup. In addition, we propose using the Leave-One-Out error to guide the feature elimination and determine the minimum size of the feature subset. RFE together with PCA is able to compress the size of the SVM classifier model and speed up SVM evaluation significantly. Experimental results on the MNIST benchmark database and other commonly used datasets show that RFE and PCA can speed up the evaluation of SVM by an order of magnitude while maintaining comparable accuracy.
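A rough scikit-learn analogue of the pipeline the abstract describes (PCA for compression, then RFE over the projected features with a linear multi-class SVM). This is our own reconstruction, not the authors' code; the dataset and parameter choices are placeholders:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# PCA compresses the 64 raw features; RFE then recursively drops the PCA
# components to which the linear multi-class SVM assigns the least weight.
clf = make_pipeline(
    PCA(n_components=40),
    RFE(LinearSVC(dual=False, max_iter=5000), n_features_to_select=20),
    LinearSVC(dual=False, max_iter=5000),
)
clf.fit(X_tr, y_tr)
print("accuracy with 20 of 40 PCA components kept:", clf.score(X_te, y_te))
```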


Citations (32)


... The parameters of the rest of the DGP depend on the discrete state ψ_t. The objective is to infer the sequence of underlying discrete states that best "explains" the observed data (Ostendorf et al., 1996; Ghahramani & Hinton, 2000; Beal et al., 2001; Fox et al., 2007; Van Gael et al., 2008; Linderman et al., 2017). In this context, non-stationarity arises from the switching behaviour of the underlying discrete process. ...

Reference:

BONE: a unifying framework for Bayesian online learning in non-stationary environments
The Infinite Hidden Markov Model
  • Citing Chapter
  • November 2002

... Instead of evaluating the likelihood, the algorithms in this category operate under the assumption that simulating data under the model (or a surrogate thereof) facilitates an understanding of the likelihood. Representatives for these algorithms are Bayesian synthetic likelihood (Price et al. 2018), specific versions of Variational Bayes (Beal and Ghahramani 2003; Jordan et al. 1999; Blei et al. 2017), Integrated nested Laplace (Rue et al. 2009), and, possibly the most popular one, Approximate Bayesian computation (ABC) (Tavaré et al. 1997; Pritchard et al. 1999; Beaumont et al. 2002; Marjoram et al. 2003; Csilléry et al. 2010; Beaumont 2010; Sisson et al. 2007, 2019). In this work, we focus exclusively on ABC, which has proven to facilitate successful calibration in the context of ABMs in biological applications, e.g., (Lambert et al. 2018; Wang et al. 2024). ...

The Variational Bayesian EM Algorithm for Incomplete Data: With Application to Scoring Graphical Model Structures
  • Citing Chapter
  • July 2003
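Of the likelihood-free methods the snippet lists, rejection ABC is the simplest to sketch: sample parameters from the prior, simulate data, and keep the parameters whose simulated summary statistics land within a tolerance ε of the observed ones. A toy Gaussian-mean example, with all settings our own:

```python
import numpy as np

rng = np.random.default_rng(3)
x_obs = rng.normal(1.5, 1.0, size=50)
s_obs = x_obs.mean()                      # summary statistic

accepted = []
while len(accepted) < 1000:
    theta = rng.normal(0.0, 3.0)          # draw from the prior
    x_sim = rng.normal(theta, 1.0, size=50)
    if abs(x_sim.mean() - s_obs) < 0.1:   # tolerance epsilon
        accepted.append(theta)

print("ABC posterior mean ~", np.mean(accepted))
```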

... This is the conventional negative evidence lower bound objective (ELBO) (Beal and Ghahramani, 2000) up to a constant. In contrast to variational latent variable models such as variational autoencoders (VAEs) (Kingma and Welling, 2014), here the space modeled by the prior and variational posterior is grounded by observed data. ...

Gatsby Computational Neuroscience Unit
  • Citing Article
  • January 2000
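For reference, the evidence lower bound the snippet invokes is the standard identity (our notation, with observations x and latents z):

```latex
\log p(x)
  = \mathbb{E}_{q(z)}\big[\log p(x,z) - \log q(z)\big]
    + \mathrm{KL}\big(q(z)\,\|\,p(z \mid x)\big)
  \;\ge\; \underbrace{\mathbb{E}_{q(z)}\big[\log p(x,z) - \log q(z)\big]}_{\text{ELBO}},
```

so minimising the negative ELBO tightens the bound by driving q(z) towards the true posterior.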

... To reduce the complexity of learning and evaluation, the Lei team combined SVM-based recursive feature elimination with principal component analysis to generate a multi-class SVM framework. Experiments showed that this method can improve the evaluation speed of SVM by an order of magnitude while maintaining comparable accuracy [10]. ...

Speeding Up Multi-class SVM Evaluation via Principle Component Analysis and Recursive Feature Elimination

... The VB algorithm for HSMMs allows inference on parameters, hidden states and models by approximating the joint posterior of hidden states and parameters with a simpler variational density. The usual Mean Field Approximation (Ghahramani et al., 2000) requires the approximate posterior to factorise over subsets of parameters and hidden variables: ...

Graphical model and variational methods
  • Citing Chapter
  • January 2001
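The factorisation the snippet is about to state (the formula itself is cut off in the excerpt) is presumably the standard mean-field split between the parameters θ and the hidden state sequence s_{1:T}; in our notation:

```latex
q(\theta, s_{1:T}) \;=\; q(\theta)\, q(s_{1:T}),
```

with each factor updated in turn using expectations taken under the other.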

... In other cases, the total error rate ε_t is used, defined as ε_t = FRR · P(ω_1) + FAR · P(ω_2), where P(ω_1) and P(ω_2) are the a priori probabilities of the classes of genuine signatures (ω_1) and forgeries (ω_2) [281]-[283]. The receiver operating characteristic (ROC) curve analysis is also applied to FRR versus FAR evaluation since it shows the ability of a system to discriminate genuine signatures from forged ones [see Fig. 10(b)] [309], [311]. ...

Machine learning approaches for person identification and verification
  • Citing Article
  • May 2005

Proceedings of SPIE - The International Society for Optical Engineering

... Goodness of fit tests can be successfully used in various areas, such as signature verification, automatic speaker identification, detection of radio frequency, economics, and data reconstruction (Biswas et al. 2008;Cho et al. 2013;Güner et al. 2009;Srinivasan et al. 2005). ...

Signature verification using Kolmogorov-Smirnov statistic
  • Citing Article
  • February 2005

... Low-level visual cues (e.g., intensity, contrast, homogeneity, etc.) in combination with shallow machine learning techniques have been used for specific object classification tasks [22], [23] with fair performance, but these methods are mainly for classification as they tend to aggregate global features into compact representations which are less suitable for performing object detection. Performance improvement has also been sought by resorting to graphical models [24], [25], [26], such as Conditional Random Fields (CRFs), but despite their capabilities to capture fine edge details, these methods are still not as effective as expected. Our hypothesis is that the main reason for unsatisfactory performance is that tables (mainly) and charts usually cover large areas and, as such, they need methods able to account for long-range dependencies. ...

Segmentation and labeling of documents using conditional random fields
  • Citing Article
  • March 2007

Proceedings of SPIE - The International Society for Optical Engineering

... Our approach clusters users with similar behaviors, originating sets of trajectories that share characteristic traits. A recommendation agent analyzes contextual information and the current user's past interactions in order to establish its most significant coupling with one of the clusters of trajectories, following a Multi-Armed Bandit policy, and to provide the most appropriate suggestions. ...

On profiling mobility and predicting locations of wireless users
  • Citing Article
  • May 2006
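The Multi-Armed Bandit policy the snippet above mentions can be sketched minimally as ε-greedy selection over clusters, where each arm is a cluster of trajectories and the reward records whether a suggestion was accepted. This is entirely our own toy illustration, not the cited paper's method:

```python
import numpy as np

rng = np.random.default_rng(4)
K, eps, steps = 5, 0.1, 10000
true_accept = rng.uniform(0.1, 0.9, size=K)  # hidden acceptance rate per cluster

counts, values = np.zeros(K), np.zeros(K)    # pulls and running mean reward
for _ in range(steps):
    # epsilon-greedy: mostly exploit the best-looking cluster, sometimes explore.
    arm = rng.integers(K) if rng.random() < eps else int(np.argmax(values))
    reward = float(rng.random() < true_accept[arm])  # was the suggestion accepted?
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]

print("best cluster found:", np.argmax(values), "true best:", np.argmax(true_accept))
```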