Article · Publisher preview available

Dynamic Hybrid Random Fields for the Probabilistic Graphical Modeling of Sequential Data: Definitions, Algorithms, and an Application to Bioinformatics


Abstract and Figures

The paper introduces a dynamic extension of the hybrid random field (HRF), called dynamic HRF (D-HRF). The D-HRF is aimed at the probabilistic graphical modeling of arbitrary-length sequences of sets of (time-dependent) discrete random variables under Markov assumptions. Suitable maximum likelihood algorithms for learning the parameters and the structure of the D-HRF are presented. The D-HRF inherits the computational efficiency and the modeling capabilities of HRFs, subsuming both dynamic Bayesian networks and Markov random fields. The behavior of the D-HRF is first evaluated empirically on synthetic data drawn from probabilistic distributions having known form. Then, D-HRFs (combined with a recurrent autoencoder) are successfully applied to the prediction of the disulfide-bonding state of cysteines from the primary structure of proteins in the Protein Data Bank.
[Figure] The graphical components of a hybrid random field for the variables $X_1, \ldots, X_4$. Since each node $X_i$ has its own Bayesian network (where nodes in $\mathcal{MB}_i(X_i)$ are shaded), there are four different DAGs $\mathcal{G}_1, \ldots, \mathcal{G}_4$. Relatives of $X_i$ that are not in $\mathcal{MB}_i(X_i)$ are dashed.
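The factorization described in the caption above (each variable keeps its own Bayesian network conditioned on its Markov blanket, and the model is scored by the product of these local conditionals) can be sketched as follows. The toy model, variable names, blankets, and probability values are all hypothetical, chosen only to illustrate the pseudo-likelihood computation:

```python
from typing import Dict, Tuple

# Hypothetical toy HRF over four binary variables X1..X4.  Each variable
# stores its Markov blanket and a conditional table mapping a blanket
# assignment to P(X_i = 1 | blanket).
hrf: Dict[str, Tuple[Tuple[str, ...], Dict[Tuple[int, ...], float]]] = {
    "X1": (("X2",), {(0,): 0.2, (1,): 0.7}),
    "X2": (("X1", "X3"), {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}),
    "X3": (("X2",), {(0,): 0.3, (1,): 0.8}),
    "X4": (("X3",), {(0,): 0.4, (1,): 0.6}),
}

def pseudo_likelihood(assignment: Dict[str, int]) -> float:
    """Product over all variables of P(x_i | Markov blanket of x_i)."""
    p = 1.0
    for var, (blanket, table) in hrf.items():
        key = tuple(assignment[b] for b in blanket)
        p_one = table[key]  # P(X_i = 1 | blanket assignment)
        p *= p_one if assignment[var] == 1 else 1.0 - p_one
    return p
```

For instance, `pseudo_likelihood({"X1": 1, "X2": 1, "X3": 1, "X4": 0})` multiplies the four local conditionals 0.7, 0.9, 0.8, and 0.4. This is the (unnormalized) pseudo-joint that makes HRF parameter and structure learning tractable compared with computing a global partition function.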
Neural Process Lett (2018) 48:733–768
https://doi.org/10.1007/s11063-017-9730-3
Marco Bongini 1 · Antonino Freno 2 · Vincenzo Laveglia 1,3 · Edmondo Trentin 1
Published online: 26 October 2017
© Springer Science+Business Media, LLC 2017
Keywords Probabilistic graphical model · Hybrid random field · Dynamic Bayesian network · Recurrent autoencoder · Disulfide bond
Edmondo Trentin (corresponding author)
trentin@dii.unisi.it
Marco Bongini
bongini@dii.unisi.it
Antonino Freno
antonino.freno@zalando.de
Vincenzo Laveglia
vincenzo.laveglia@unifi.it
1 DIISM, Università di Siena, Via Roma, 56, 53100 Siena, Italy
2 Zalando SE, Charlottenstrasse, 4, 10969 Berlin, Germany
3 DINFO, Università di Firenze, Via di S. Marta, 3, 50139 Florence, Italy
…, Y_L corresponding to the individual states of nature are known in advance. We propose a hybrid neural/Markovian realization [27] of -HMMs for the probabilistic graphical modelling of p(Y | W). Using standard HMM notation [28], an -HMM H is formally defined as H = (S, π, A, B E) where: S = {S_1, …} …
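The citation snippet above defines an HMM by its tuple of states, initial probabilities π, transition matrix A, and emission model B. A minimal forward-recursion sketch for computing the sequence likelihood p(Y | H) under such a model follows; the two-state parameters are hypothetical and serve only to make the recursion concrete:

```python
import numpy as np

# Hypothetical two-state HMM over a two-symbol alphabet:
# pi = initial state probabilities, A = transition matrix,
# B[state, symbol] = emission probabilities.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])

def forward_likelihood(obs):
    """p(Y | H) via the forward recursion: alpha_1 = pi * b(y_1),
    alpha_{t+1} = (alpha_t A) * b(y_{t+1}), likelihood = sum(alpha_T)."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()
```

Summing `forward_likelihood` over every observation sequence of a fixed length yields 1, which is a quick sanity check that the recursion implements a proper distribution over sequences.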
Article
Full-text available
A difficult and open problem in artificial intelligence is the development of agents that can operate in complex environments which change over time. The present communication introduces the formal notions, the architecture, and the training algorithm of a machine capable of learning and decision-making in evolving structured environments. These environments are defined as sets of evolving relations among evolving entities. The proposed machine relies on a probabilistic graphical model whose time-dependent latent variables satisfy a Markov assumption. The likelihood of such variables given the structured environment is estimated via a probabilistic variant of the recursive neural network.
Article
Full-text available
We present a method for learning treewidth-bounded Bayesian networks from data sets containing thousands of variables. Bounding the treewidth of a Bayesian network greatly reduces the complexity of inference. Yet, being a global property of the graph, the treewidth bound considerably increases the difficulty of the learning process. We propose a novel algorithm for this task, able to scale to large domains and large treewidths. Our approach consistently outperforms the state of the art on data sets with up to ten thousand variables.
Book
Comprehensive introduction to the neural network models currently under intensive study for computational applications. It also provides coverage of neural network applications in a variety of problems of both theoretical and practical interest.
Chapter
This chapter addresses the problem of learning the parameters from data. It also discusses score-based structure learning and constraint-based structure learning. The method for learning all parameters in a Bayesian network follows readily from the method for learning a single parameter. The chapter presents a method for learning the probability of a binomial variable and extends this method to multinomial variables. It also provides guidelines for articulating prior beliefs concerning probabilities. The chapter illustrates the constraint-based approach by showing how to learn a directed acyclic graph (DAG) faithful to a probability distribution. Structure learning consists of learning the DAG in a Bayesian network from data. It is necessary to know which DAG satisfies the Markov condition with the probability distribution P that is generating the data. The process of learning such a DAG is called "model selection." A DAG includes a probability distribution P if the DAG does not entail any conditional independencies that are not in P. In score-based structure learning, a score is assigned to each DAG based on the data, such that in the limit of large data sets the highest-scoring DAG includes the generating distribution. After scoring the DAGs, the scores are used, possibly along with prior probabilities, to learn a DAG. The most straightforward score, the Bayesian score, is the probability of the data D given the DAG. Once a DAG has been learned from data, the parameters can be learned as well. The result is a Bayesian network that can be used to do inference. In the constraint-based approach, a DAG is found for which the Markov condition entails all and only those conditional independencies that are in the probability distribution P of the variables of interest. The chapter applies structure learning to inferring causal influences from data and presents learning packages. It presents examples of learning Bayesian networks and of causal learning.
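The Bayesian score mentioned above is the marginal likelihood of the data given a DAG. For a single binomial variable with a Beta prior, that marginal likelihood has a closed form in terms of Beta functions. The sketch below computes this one-variable building block; the uniform Beta(1, 1) default and the function name are illustrative choices, not the chapter's notation:

```python
from math import exp, lgamma

def log_marginal_binomial(successes: int, failures: int,
                          a: float = 1.0, b: float = 1.0) -> float:
    """Log marginal likelihood of binary data under a Beta(a, b) prior:
    log B(a + s, b + f) - log B(a, b).  This per-family term is the
    basic ingredient of the Bayesian (BD) score used to compare DAGs."""
    def log_beta(x: float, y: float) -> float:
        # Beta function via log-gamma for numerical stability.
        return lgamma(x) + lgamma(y) - lgamma(x + y)
    return log_beta(a + successes, b + failures) - log_beta(a, b)
```

With a uniform prior, one success and no failures gives marginal likelihood 1/2, and one success plus one failure gives 1/6; in a full score, these local terms are summed in log space over every variable and parent configuration of the candidate DAG.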
Conference Paper
Cysteines in a protein have a tendency to form mutual disulfide bonds. This affects the secondary and tertiary structure of the protein. Therefore, automatic prediction of the bonding state of cysteines from the primary structure of proteins has long been a relevant task in bioinformatics. The paper investigates the feasibility of a predictor based on a hybrid approach that combines the dynamic encoding capabilities of a recurrent autoencoder with the short-term/long-term dependencies modeling capabilities of a dynamic probabilistic graphical model (a dynamic extension of the hybrid random field). Results obtained using 1797 proteins from the May 2010 version of the Protein Data Bank show an average accuracy of 85% by relying only on the sub-sequences of the residue chains with no additional attributes (like global descriptors, or evolutionary information provided by multiple alignment).
Conference Paper
This paper develops a maximum pseudo-likelihood algorithm for learning the structure of the dynamic extension of the hybrid random field introduced in the companion paper [5]. The technique turns out to be a viable method for capturing the statistical (in)dependencies among the random variables within a sequence of patterns. Complexity issues are tackled by means of adequate strategies from the classic literature on probabilistic graphical models. Finally, a preliminary empirical evaluation is presented.
Article
Robust acoustic modeling is essential in the development of automatic speech recognition systems applied to spoken human-computer interaction. To this end, traditional hidden Markov models (HMM) may be improved by hybridizing them with artificial neural networks (ANN). Crucially, ANNs require input values that do not compromise their numerical stability. In spite of the relevance feature normalization has on the success of ANNs in real-world applications, the issue is mostly overlooked on the false premise that "any normalization technique will do". The paper proposes a gradient-ascent, maximum-likelihood algorithm for feature normalization. Relying on mixtures of logistic densities, it ensures ANN-friendly values that are distributed over the (0, 1) interval in a uniform manner. Some nice properties of the approach are discussed. The algorithm is applied to the normalization of acoustic features for a hybrid ANN/HMM speech recognizer. Experiments on real-world continuous speech recognition tasks are presented. The hybrid system turns out to be positively affected by the proposed technique.