Mehdi Mirza's research while affiliated with Université de Montréal and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (13)
Humans learn a predictive model of the world and use this model to reason about future events and the consequences of actions. In contrast to most machine predictors, we exhibit an impressive ability to generalize to unseen scenarios and reason intelligently in these settings. One important aspect of this ability is physical intuition (Lake et al.,...
Theano is a Python library that allows one to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being...
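The define-then-evaluate workflow the abstract describes can be sketched in plain Python. This is a toy symbolic-expression graph, not Theano's real API: the names `Expr`, `Var`, `Op`, and `evaluate` are hypothetical illustrations of the idea that an expression is built once as a graph and then evaluated (or, in Theano, compiled) with concrete inputs.

```python
# Toy expression graph illustrating the "define, then evaluate" pattern.
# All class and function names here are hypothetical, not Theano's API.

class Expr:
    """Base class: operators build graph nodes instead of computing."""
    def __add__(self, other): return Op("+", self, other)
    def __mul__(self, other): return Op("*", self, other)

class Var(Expr):
    """A named symbolic input."""
    def __init__(self, name):
        self.name = name

class Op(Expr):
    """A binary operation node in the expression graph."""
    def __init__(self, op, a, b):
        self.op, self.a, self.b = op, a, b

def evaluate(node, env):
    """Walk the graph and compute its value given concrete inputs.

    Theano instead compiles such a graph to optimized CPU/GPU code;
    a recursive interpreter is the simplest stand-in for that step.
    """
    if isinstance(node, Var):
        return env[node.name]
    a, b = evaluate(node.a, env), evaluate(node.b, env)
    return a + b if node.op == "+" else a * b

# Define the expression once...
x, y = Var("x"), Var("y")
expr = x * y + x            # symbolic: x*y + x

# ...then evaluate it with different bindings.
result = evaluate(expr, {"x": 2.0, "y": 3.0})   # 2*3 + 2 = 8.0
```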
The task of the emotion recognition in the wild (EmotiW) Challenge is to assign one of seven emotions to short video clips extracted from Hollywood style movies. The videos depict acted-out emotions under realistic conditions with a large degree of variation in attributes such as pose and illumination, making it worthwhile to explore approaches whi...
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximiz...
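The adversarial value function from this framework, V(D, G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))], can be written out numerically. The sketch below is a deliberately minimal 1-D setup (a logistic discriminator and a linear generator are my own simplifying assumptions, not the paper's multilayer-perceptron models) just to make the two expectations concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, w, b):
    """Logistic discriminator: estimated probability that x is real."""
    return 1.0 / (1.0 + np.exp(-np.clip(w * x + b, -30, 30)))

def generator(z, theta):
    """Toy linear generator mapping noise z to samples."""
    return theta * z

def value(x_real, z, w, b, theta):
    """V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))].

    D is trained to ascend this value, G to descend it.
    """
    d_real = discriminator(x_real, w, b)
    d_fake = discriminator(generator(z, theta), w, b)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

x_real = rng.normal(3.0, 1.0, size=1000)   # samples from the "data" distribution
z = rng.normal(0.0, 1.0, size=1000)        # noise from the prior p_z

# Evaluate V for one fixed (D, G) pair; training would alternate
# gradient steps on (w, b) and theta against this objective.
v = value(x_real, z, w=1.0, b=-1.5, theta=0.1)
```

Since D outputs probabilities strictly inside (0, 1), both log terms are negative, so V is always below zero; training moves it toward the equilibrium where D cannot tell the two sources apart.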
Catastrophic forgetting is a problem faced by many machine learning models and algorithms. When trained on one task, then trained on a second task, many machine learning models "forget" how to perform the first task. This is widely believed to be a serious problem for neural networks. Here, we investigate the extent to which the catastrophic forge...
In this paper we present the techniques used for the University of Montréal's team submissions to the 2013 Emotion Recognition in the Wild Challenge. The challenge is to classify the emotions expressed by the primary human subject in short video clips extracted from feature-length movies. This involves the analysis of video clips of acted scenes la...
Pylearn2 is a machine learning research library. This does not just mean that it is a collection of machine learning algorithms that share a common API; it means that it has been designed for flexibility and extensibility in order to facilitate research projects that involve new or unusual use cases. In this paper we give a brief history of the lib...
The ICML 2013 Workshop on Challenges in Representation Learning (http://deeplearning.net/icml2013-workshop-competition) focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge. We describe the datasets created for these challenges and summarize the results o...
We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and because it is a natural companion to dropout) designed to both facilitate optimization by dropout and improve t...
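The maxout unit named in the abstract is simple to state: each output is the maximum of k affine functions of the input, h_i = max_j (xᵀW_{·ij} + b_{ij}). Below is a minimal NumPy sketch of that definition (the shapes and helper name are my own conventions, not the paper's code). Setting one of two pieces to the zero function recovers ReLU, which is a handy sanity check.

```python
import numpy as np

def maxout(x, W, b):
    """Maxout layer: h[n, i] = max_j (x[n] @ W[:, i, j] + b[i, j]).

    W has shape (d_in, d_out, k), b has shape (d_out, k); the output is
    the elementwise max over the k affine pieces, so each unit learns a
    piecewise-linear convex activation rather than using a fixed one.
    """
    z = np.einsum("nd,dok->nok", x, W) + b   # (n, d_out, k)
    return z.max(axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2, 5))   # k = 5 linear pieces per unit
b = rng.normal(size=(2, 5))
h = maxout(x, W, b)              # shape (4, 2)

# ReLU as a special case: pieces {x @ w, 0} give max(x @ w, 0).
W_relu = np.zeros((3, 1, 2))
W_relu[0, 0, 0] = 1.0            # first piece selects feature 0
b_relu = np.zeros((1, 2))
h_relu = maxout(x, W_relu, b_relu)[:, 0]
```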
We introduce the multi-prediction deep Boltzmann machine (MP-DBM). The MPDBM can be seen as a single probabilistic model trained to maximize a variational approximation to the generalized pseudolikelihood, or as a family of recurrent nets that share parameters and approximately solve different inference problems. Prior methods of training DBMs eith...
We propose a semi-supervised approach to solve the task of emotion recognition in 2D face images using recent ideas in deep learning for handling the factors of variation present in data. An emotion classification algorithm should be both robust to (1) remaining variations due to the pose of the face in the image after centering and alignment, (2)...
Citations
... Therefore, Biswas et al. [41] used a smooth function to approximate the |x| function. They derived a general approximation formula for the maximum function from the smooth approximation of |x|, which can smoothly approximate the general maxout [42] family, ReLU, leaky ReLU, and variants such as Swish. In addition, the authors also prove that the GELU function is a special case of the SMU. ...
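The construction this snippet refers to rests on the identity max(a, b) = (a + b + |a − b|)/2: any smooth approximation of |x| yields a smooth maximum, and hence smooth versions of ReLU-like activations. The sketch below uses sqrt(x² + μ²) as the smoothing of |x|; that particular choice is my illustration of the construction, not necessarily the erf-based form used in the SMU paper itself.

```python
import math

def smooth_abs(x, mu=1e-2):
    """Smooth approximation of |x|; exact in the limit mu -> 0."""
    return math.sqrt(x * x + mu * mu)

def smooth_max(a, b, mu=1e-2):
    """max(a, b) = (a + b + |a - b|) / 2, with |.| replaced by its
    smooth approximation, giving a differentiable maximum."""
    return 0.5 * (a + b + smooth_abs(a - b, mu))

def smooth_relu(x, mu=1e-2):
    """ReLU = max(x, 0), recovered from the smooth max; other members
    of the maxout family follow by changing the two affine pieces."""
    return smooth_max(x, 0.0, mu)
```

The approximation error is bounded by μ/2, so μ directly trades smoothness against fidelity to the hard maximum.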
... A. Ferchichi et al. To deal with the challenges mentioned above, we explored one of the most significant achievements in DL: Generative Adversarial Networks (GANs) (Goodfellow et al., 2014), which can implicitly learn rich distributions over ST data and work with multi-model outputs (Gao et al., 2022). The GAN framework was recently presented as a way to create generative DL models using adversarial training. ...
... processing videos is more memory- and compute-intensive. One simple solution is to directly apply a CNN on individual frames for either prediction or feature extraction and aggregate the network outputs in the temporal dimension to obtain a video-level classification [19,20]. However, handling sequential data is not a natural fit for CNNs, and frame-level aggregation cannot explicitly exploit the temporal dependency among frames in the video. ...
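The frame-level aggregation scheme criticized above can be made concrete. In this sketch, `frame_features` is a hypothetical stand-in for a per-frame CNN, and aggregation is temporal average pooling; the point the snippet makes falls out immediately, since mean pooling is order-invariant and therefore blind to temporal dependencies.

```python
import numpy as np

def frame_features(frame):
    """Hypothetical stand-in for a per-frame CNN feature extractor."""
    return np.array([frame.mean(), frame.std()])

def classify_video(frames, w):
    """Apply the per-frame extractor independently to each frame, then
    aggregate over the temporal dimension by average pooling to get a
    single video-level score -- the simple scheme described above."""
    feats = np.stack([frame_features(f) for f in frames])  # (T, d)
    pooled = feats.mean(axis=0)                            # average over time
    return float(pooled @ w)                               # video-level score

rng = np.random.default_rng(0)
video = rng.random((16, 8, 8))   # 16 frames of 8x8 "pixels"
w = np.array([1.0, -1.0])
score = classify_video(video, w)

# Reversing the frame order gives the identical score: the aggregation
# cannot see temporal structure at all.
score_reversed = classify_video(video[::-1], w)
```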
... They use CNN based image classifiers taking as input an image of a block tower and returning a probability for the tower to fall. Lerer et al. (2016); Mirza et al. (2017) also include a decoding module to predict final positions of these blocks. Groth et al. (2018) investigate the ability of such a model to actively position shapes in stable tower configurations. ...
... DLTS was implemented in Python 3 using Keras 1.1.0 (Ketkar 2017) with Theano 0.8.2 (Al-Rfou et al. 2016) as the backend for the implementation of the deep neural networks. While the neural networks were trained using six cores for a few days, the results reported in the tables use a single core. ...
... DBMs are known to fail to train from random initialization, which is called the joint training problem (Goodfellow et al., 2016), and a widely known solution to this is greedy layer-wise pretraining. Although there have been attempts to train without pretraining (Montavon & Müller, 2012; Goodfellow et al., 2013), it is still difficult to show good generative performance. Our method shows strong generative performance with end-to-end training and no pretraining, and thus can be considered a potential solution to the joint training problem. ...
... (1) Generative Adversarial Networks (GANs) (Goodfellow et al., 2020) consist of two competing neural networks: a generator creating realistic data samples and a discriminator distinguishing between real and generated samples (Pan et al., 2019). Both networks are trained in tandem, resulting in an adversarial competition in which the data generation capability improves over time (Janiesch et al., 2021). ...
... CV researchers use the term "disentanglement" to illustrate the extraction of object features, such as shape, appearance, pose, or specific parts of the object, from the image data [46]. Salah Rifai and his colleague computer scientists [47] stated, "A central challenge in CV is to disentangle the various factors of variation that explain an image, such as object pose, identity, or various other attributes" (p. 808). ...
... The background information is removed during preprocessing. The next step is to normalise the face image to a size of 227×227 pixels [19][20]. The three images represent different samples after pre-processing. ...
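The normalisation step mentioned here is just a resize of the cropped face to a fixed network input size. A minimal sketch, assuming a grayscale crop and using nearest-neighbour sampling (a simplification; real pipelines typically use bilinear interpolation from an image library):

```python
import numpy as np

def resize_nearest(img, size=(227, 227)):
    """Nearest-neighbour resize of a 2-D grayscale face crop to the
    fixed input resolution (227x227 here, as in the cited pipeline)."""
    h, w = img.shape
    rows = np.arange(size[0]) * h // size[0]   # source row per output row
    cols = np.arange(size[1]) * w // size[1]   # source col per output col
    return img[rows][:, cols]

face = np.random.default_rng(0).random((120, 90))  # a cropped face region
normalised = resize_nearest(face)                  # shape (227, 227)
```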
... However, current SSL methods mainly optimize global-level objectives for encoders, and remain suboptimal for downstream tasks [20], [23], [24]. Besides, catastrophic forgetting [25], [26] can occur in two-stage training, where model generality may be lost when fine-tuning on a downstream task with only a few labeled images. This motivates us to learn better semantically sensitive local representations for both the encoder and decoder in a semi-supervised setting. ...