Michael Villamizar

Idiap Research Institute | IDIAP

PhD

About

46
Publications
4,074
Reads
423
Citations
Additional affiliations
September 2012 - June 2016
Institute of Robotics and Industrial Informatics
Position
  • PostDoc Position

Publications (46)
Preprint
We propose to leverage Transformer architectures for non-autoregressive human motion prediction. Our approach decodes elements in parallel from a query sequence, instead of conditioning on previous predictions as in state-of-the-art RNN-based approaches. In this way, our approach is less computationally intensive and potentially avoids error acc...
Preprint
We propose to leverage recent advances in reliable 2D pose estimation with Convolutional Neural Networks (CNN) to estimate the 3D pose of people from depth images in multi-person Human-Robot Interaction (HRI) scenarios. Our method is based on the observation that using the depth information to obtain 3D lifted points from 2D body landmark detection...
Article
Full-text available
We present an efficient and accurate people detection approach based on deep learning to detect people attacks and intrusion in video surveillance scenarios. Unlike other approaches using background segmentation and pre-processing techniques, which are not able to distinguish people from other elements in the scene, we propose WatchNet++ that is a d...
Preprint
Achieving robust multi-person 2D body landmark localization and pose estimation is essential for human behavior and interaction understanding, as encountered, for instance, in HRI settings. Accurate methods have been proposed recently, but they usually rely on rather deep Convolutional Neural Network (CNN) architectures, thus requiring large computatio...
Article
Achieving robust multi-person 2D body landmark localization and pose estimation is essential for human behavior and interaction understanding, as encountered, for instance, in HRI settings. Accurate methods have been proposed recently, but they usually rely on rather deep Convolutional Neural Network (CNN) architectures, thus requiring large computatio...
Article
We present a novel method for semantic text document analysis which, in addition to localizing text, labels it with user-defined semantic categories. More precisely, it consists of a fully-convolutional and sequential network that we apply to the particular case of slide analysis to detect title, bullets and standard text. Our contributions ar...
Preprint
We propose to combine recent Convolutional Neural Network (CNN) models with depth imaging to obtain a reliable and fast multi-person pose estimation algorithm applicable to Human Robot Interaction (HRI) scenarios. Our hypothesis is that depth images contain fewer structures and are easier to process than RGB images while keeping the required inform...
Chapter
This chapter explains an adaptive on-line object detection and classification technique for robust perception under varying scene conditions, for example partial cast shadows, changes in illumination conditions or changes in the angle of the object target view. This approach continuously updates the target model upon arrival of new data, being...
Article
Full-text available
We present an efficient, online, and interactive approach for computing a classifier, called Wild Lady Ferns (WiLFs), for face learning and detection using little human supervision. More precisely, on the one hand, WiLFs combine online boosting and extremely randomized trees (random ferns) to progressively compute an efficient and discriminative cla...
Chapter
Convolutional Neural Networks (CNN) are the leading models for human body landmark detection from RGB vision data. However, as such models impose a high computational load, an alternative is to rely on depth images which, due to their simpler nature, can allow the use of less complex CNNs and hence can lead to a faster detector. As learning CNNs...
Article
Full-text available
We propose an efficient and robust method for the recognition of objects exhibiting multiple intra-class modes, where each one is associated with a particular object appearance. The proposed method, called random clustering ferns, combines synergically a single and real-time classifier, based on the boosted assembling of extremely randomized trees...
Article
Full-text available
In recent years, there has been a growing interest in enabling autonomous social robots to interact with people. However, many questions remain unresolved regarding the social capabilities robots should have in order to perform this interaction in an ever more natural manner. In this paper, we tackle this problem through a comprehensive study of va...
Article
In this paper we introduce the Boosted Random Ferns (BRFs) to rapidly build discriminative classifiers for learning and detecting object categories. At the core of our approach we use standard random ferns, but we introduce four main innovations that let us bring ferns from an instance to a category level, and still retain efficiency. First, we def...
Chapter
Full-text available
This paper presents a robust and real-time method for people detection in urban and crowded environments. Unlike other conventional methods which either focus on single features or compute multiple and independent classifiers specialized in a particular feature space, the proposed approach creates a synergic combination of appearance and depth cues...
Article
We present a fast and online human-robot interaction approach that progressively learns multiple object classifiers using scanty human supervision. Given an input video stream recorded during the human-robot interaction, the user just needs to annotate a small fraction of frames to compute object specific classifiers based on random ferns which sha...
Conference Paper
Full-text available
We propose a robust and efficient method to estimate the pose of a camera with respect to complex 3D textured models of the environment that can potentially contain more than 100,000 points. To tackle this problem we follow a top-down approach where we combine high-level deep network classifiers with low-level geometric approaches to come up with...
Conference Paper
Full-text available
In this paper, we present an object recognition approach that additionally discovers intra-class modalities exhibiting highly correlated visual information. Unlike more conventional approaches based on computing multiple specialized classifiers, the proposed approach combines a single classifier, Boosted Random Ferns (BRFs), with probabili...
Conference Paper
Full-text available
We propose a Human Robot Interaction approach to efficiently model the appearance of all relevant objects in the robot's environment. Given an input video stream recorded while the robot is navigating, the user just needs to annotate a very small number of frames to build specific classifiers for each of the objects of interest. At the core...
Conference Paper
Full-text available
We present a method for efficiently detecting natural landmarks that can handle scenes with highly repetitive patterns and targets whose appearance changes progressively. At the core of our approach lies a Random Ferns classifier that models the posterior probabilities of different views of the target using multiple and independent Ferns, each cont...
Conference Paper
Full-text available
During the last decade, there has been a growing interest in making autonomous social robots able to interact with people. However, there are still many open issues regarding the social capabilities that robots should have in order to perform these interactions more naturally. In this paper we present the results of several experiments conducted at...
Chapter
This chapter presents some real-life examples using the interactive multimodal framework; in this work, the robot is capable of learning through human assistance. The basic idea is to use the human feedback to improve the learning behavior of the robot when it deals with human beings. We show two different prototypes that have been developed for the...
Article
In this paper we show that the performance of binary classifiers based on Boosted Random Ferns can be significantly improved by appropriately bootstrapping the training step. This results in a classifier which is both highly discriminative and computationally efficient and is particularly suitable when only small sets of training images are availab...
Conference Paper
Full-text available
We present an Online Random Ferns (ORFs) classifier that progressively learns and builds enhanced models of object appearances. During the learning process, we allow human intervention to assist the classifier and discard false positive training samples. The amount of human intervention is minimized and integrated within the online learning, su...
Conference Paper
Full-text available
We present an experimental evaluation of Boosted Random Ferns in terms of the detection performance and the training data. We show that adding an iterative bootstrapping phase during the learning of the object classifier increases its detection rates, given that additional positive and negative samples are collected (bootstrapped) for retraining...
Conference Paper
Full-text available
Cast shadows make detecting objects more difficult because they locally modify image intensity and color. Shadows may appear or disappear in an image when the object, the camera, or both are free to move through a scene. This work evaluates the performance of an object detection method based on boosted HOG paired with three different ima...
Article
Full-text available
We propose a new algorithm for detecting multiple object categories that exploits the fact that different categories may share common features but with different geometric distributions. This yields an efficient detector which, in contrast to existing approaches, considerably reduces the computation cost at runtime, where the feature computation st...
Conference Paper
Full-text available
We present a new approach for building an efficient and robust classifier for the two-class problem that localizes objects that may appear in the image under different orientations. In contrast to other works that address this problem using multiple classifiers, each one specialized for a specific orientation, we propose a simple two-step approach...
Conference Paper
Full-text available
In this work we present a robust detection method in outdoor scenes under cast shadows using color-based invariant gradients in combination with HoG local features. The method achieves good detection rates in urban scene classification and person detection, outperforming traditional methods based on intensity gradient detectors which are sensitive to...
Conference Paper
Full-text available
The present paper addresses pedestrian detection using local boosted features that are learned from a small set of training images. Our contribution is to use two boosting steps. The first one learns discriminant local features corresponding to pedestrian parts and the second one selects and combines these boosted features into a robust class class...
Conference Paper
Full-text available
In this work a new robust color- and contour-based object detection method in images with varying shadows is presented. The method relies on a physics-based contour detector that emphasizes material changes and a contour-based boosted classifier. The method has been tested in a sequence of outdoor color images presenting varying shadows using two cl...
Conference Paper
Full-text available
In this article, scale and orientation invariant object detection is performed by matching intensity level histograms. Unlike other global measurement methods, the present one uses a local feature description that allows small changes in the histogram signature, giving robustness to partial occlusions. Local features over the object histogram are e...
Conference Paper
Full-text available
We present a framework for object recognition based on simple scale and orientation invariant local features that when combined with a hierarchical multiclass boosting mechanism produce robust classifiers for a limited number of object classes in cluttered backgrounds. The system extracts the most relevant features from a set of training samp...
Conference Paper
Full-text available
We present a framework for object detection that is invariant to object translation, scale, rotation, and to some degree, occlusion, achieving high detection rates, at 14 fps in color images and at 30 fps in gray scale images. Our approach is based on boosting over a set of simple local features. In contrast to previous approaches, and to efficient...
Article
Object recognition entails identifying instances of known objects in sensory data by searching for a match between features in a scene and features on a model. The key elements that make object recognition feasible are the use of diverse sensory input forms such as stereo imagery or range data, appropriate low level processing of the sensory input,...
Article
Full-text available
Abstract: This article presents a system for object recognition based on simple local features invariant to scale and orientation which, when trained with a supervised classification mechanism, produces robust classifiers for a limited number of object classes. The system extracts the most releva...
