Ruslan R. Salakhutdinov's research while affiliated with the University of Toronto and other institutions

Publications (5)

Article
Full-text available
We introduce a Deep Boltzmann Machine model suitable for modeling and extracting latent semantic representations from a large unstructured collection of documents. We overcome the apparent difficulty of training a DBM through judicious parameter tying. This parameter tying enables an efficient pretraining algorithm and a state initialization scheme that...
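The parameter-tying idea can be illustrated with a minimal, hypothetical sketch: a two-layer Boltzmann machine whose second hidden layer mirrors the visible layer and reuses its weight matrix, leaving a single W to learn. All names and sizes below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Illustrative sketch only: a two-layer DBM energy in which the second
# hidden layer h2 has the same dimensionality as the visibles and reuses
# the first layer's weight matrix W, so both layers share one parameter set.
rng = np.random.default_rng(0)
n_vis, n_hid = 8, 4
W = rng.normal(scale=0.01, size=(n_vis, n_hid))  # the single tied weight matrix

def energy(v, h1, h2, W):
    """E(v, h1, h2) = -v^T W h1 - h2^T W h1, with W shared across layers."""
    return -(v @ W @ h1) - (h2 @ W @ h1)

v  = rng.integers(0, 2, n_vis)
h1 = rng.integers(0, 2, n_hid)
h2 = rng.integers(0, 2, n_vis)  # tied layer mirrors the visible layer's size
print(energy(v, h1, h2, W))
```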
Article
Many practitioners who use the EM algorithm complain that it is sometimes slow. When does this happen, and what can be done about it? In this paper, we study the general class of bound optimization algorithms - including Expectation-Maximization, Iterative Scaling and CCCP - and their relationship to direct optimization algorithms such as gradient-based...
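As a concrete instance of the bound-optimization class, here is a hedged sketch of EM for a two-component 1-D Gaussian mixture: each E-step constructs a lower bound on the log-likelihood and each M-step maximizes that bound in closed form. The data and parameter names are illustrative.

```python
import numpy as np

# EM for a two-component 1-D Gaussian mixture (toy data, illustrative names).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

pi, mu, sigma = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(50):
    # E-step: responsibilities (this constructs the lower bound); the shared
    # 1/sqrt(2*pi) constant cancels in the ratio, so it is omitted.
    p0 = pi * np.exp(-0.5 * ((x - mu[0]) / sigma[0]) ** 2) / sigma[0]
    p1 = (1 - pi) * np.exp(-0.5 * ((x - mu[1]) / sigma[1]) ** 2) / sigma[1]
    r = p0 / (p0 + p1)
    # M-step: maximize the bound in closed form.
    pi = r.mean()
    mu = np.array([(r * x).sum() / r.sum(),
                   ((1 - r) * x).sum() / (1 - r).sum()])
    sigma = np.sqrt(np.array([(r * (x - mu[0]) ** 2).sum() / r.sum(),
                              ((1 - r) * (x - mu[1]) ** 2).sum() / (1 - r).sum()]))
print(pi, mu, sigma)
```

Each such update never decreases the likelihood, which is the defining property of the bound-optimization algorithms the paper studies.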
Article
The recent proliferation of richly structured probabilistic models raises the question of how to automatically determine an appropriate model for a dataset. We investigate this question for a space of matrix decomposition models which can express a variety of widely used models from unsupervised learning. To enable model selection, we organize these...
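A hedged sketch of the organizing idea: treat a generic matrix as a grammar symbol and expand it with production rules to enumerate compositional model structures. The rules below are illustrative stand-ins, not the paper's actual grammar.

```python
# Enumerate expressions reachable after a fixed number of rule expansions.
# 'G' stands for a generic matrix; the rules are illustrative (product,
# sum, and a transpose-like form), not the paper's grammar.
rules = ["GG", "G+G", "GT"]

def expand(expr, depth):
    if depth == 0:
        yield expr
        return
    for i, ch in enumerate(expr):
        if ch == "G":
            for rule in rules:
                yield from expand(expr[:i] + "(" + rule + ")" + expr[i + 1:], depth - 1)

for model in sorted(set(expand("G", 2)))[:5]:
    print(model)
```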
Article
Full-text available
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other...
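A minimal sketch of the idea in NumPy: randomly zero half of the hidden activations on each training case and use all of them at test time. The rescaling shown ("inverted" dropout) is a common variant assumed here, not necessarily the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, training=True):
    """Randomly omit feature detectors during training."""
    if not training:
        return activations  # all detectors active at test time
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)  # rescale to keep the expectation

h = rng.normal(size=(4, 8))  # a batch of hidden activations
print(dropout(h))
```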
Conference Paper
We apply a constrained Hidden Markov Model architecture to the problem of simultaneous localization and surveying from sensor logs of mobile agents navigating in unknown environments. We show the solution of this problem for the case of one robot and extend our model to the more interesting case of multiple agents that interact with each other through...
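A hedged sketch of HMM-based localization from a sensor log: states are cells of a toy 1-D corridor, and the standard forward recursion updates a belief over the agent's position. The map, motion model, and sensor model are illustrative assumptions, not the paper's constrained architecture.

```python
import numpy as np

n = 5
T = np.zeros((n, n))  # motion model: move right w.p. 0.8, stay w.p. 0.2
for s in range(n):
    T[s, min(s + 1, n - 1)] += 0.8
    T[s, s] += 0.2

walls = np.array([1, 0, 0, 1, 0])  # map feature visible from each cell

def obs_lik(z):
    return np.where(walls == z, 0.9, 0.1)  # sensor is correct 90% of the time

belief = np.full(n, 1.0 / n)  # uniform prior over position
for z in [1, 0, 0]:           # a short sensor log
    belief = obs_lik(z) * (T.T @ belief)  # predict, then weight by observation
    belief /= belief.sum()
print(belief)
```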

Citations

... One extension of the RSM consists of adding independent variables, in a manner analogous to the conditional RBM (Mnih et al. 2011). Another possibility is a deep RSM encompassing two or more layers of hidden variables in place of just one hidden layer (Salakhutdinov et al. 2013). This extension will entail an increase in the computing time needed for estimation and inference, but might lead to a further improvement in model performance. ...
... While the optimization problem given by (1) may in general be nonconvex and highly nonlinear with many local minima, the design problem may become convex if the initial design b* is selected sufficiently close to the final optimum. This functional form is advantageous for optimization, as many existing optimization algorithms perform very well given convex functions [80,81]. ...
... The Automatic Statistician framework (Grosse et al. 2012; Duvenaud et al. 2013; Lloyd et al. 2014; Ghahramani 2015; Steinruecken et al. 2019) aims to address the challenge of automating model discovery. The framework adopts search procedures over a space of models, enumerating possible compositional Gaussian Process (GP) kernel functions generated from base kernels and compositional grammar rules. ...
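A hedged sketch of the compositional-kernel idea: base kernels are combined by sum and product rules, yielding a space of candidate GP models (scoring by marginal likelihood is omitted). The kernel forms are standard textbook ones; the enumeration below is an illustrative stand-in for the framework's search procedure.

```python
import numpy as np

def rbf(x, y, ell=1.0):
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / ell ** 2)

def periodic(x, y, period=1.0):
    d = np.abs(x[:, None] - y[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2)

base = {"RBF": rbf, "PER": periodic}
composites = {}
for n1, k1 in base.items():
    for n2, k2 in base.items():
        # Default-argument binding freezes k1/k2 inside each lambda.
        composites[f"{n1}+{n2}"] = lambda x, y, k1=k1, k2=k2: k1(x, y) + k2(x, y)
        composites[f"{n1}*{n2}"] = lambda x, y, k1=k1, k2=k2: k1(x, y) * k2(x, y)

x = np.linspace(0, 3, 4)
print(sorted(composites), composites["RBF*PER"](x, x).shape)
```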
... A Global Average Pooling layer is used in place of max-pooling to prevent the loss of information contained in the pooling neighbourhood of an activation [25]. A regularization technique is applied to the fully connected dense layer of the sequential model via a Dropout layer [26] to prevent overfitting of the network on the training data. The dropout rate is varied from 0.2 to 0.5 within each model series created for each of the architectures used, so that a well-generalized model with the best possible prediction capability can be obtained. ...
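A hedged sketch of the described setup in Keras: Global Average Pooling in place of max-pooling before the classifier head, and a Dropout layer regularizing the dense layer, with the rate swept from 0.2 to 0.5 across model variants. The input shape, layer widths, and class count are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(dropout_rate):
    return tf.keras.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
        layers.Conv2D(64, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),  # pools each feature map to one value
        layers.Dense(128, activation="relu"),
        layers.Dropout(dropout_rate),     # regularizes the dense layer
        layers.Dense(10, activation="softmax"),
    ])

# One model per dropout rate in the swept range.
models = {rate: build_model(rate) for rate in (0.2, 0.3, 0.4, 0.5)}
```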