About
14 Publications
3,558 Reads
420 Citations
Education
August 2009 - July 2013
September 2005 - February 2008
September 2002 - July 2005
Publications (14)
We propose a new method for estimating the relative pose between two images, where we jointly learn keypoint detection, description extraction, matching and robust pose estimation. While our architecture follows the traditional pipeline for pose estimation from geometric computer vision, all steps are learnt in an end-to-end fashion, including feat...
Image-based tracking of animals in their natural habitats can provide rich behavioural data, but is very challenging due to complex and dynamic background and target appearances. We present an effective method to recover the positions of terrestrial animals in cluttered environments from video sequences filmed using a freely moving monocular camera...
We address the problem of temporally aligning semantically similar videos, for example two videos of cars on different tracks. We present an alignment method that establishes frame-to-frame correspondences such that the two cars are seen from a similar viewpoint (e.g. facing right), while also being temporally smooth and visually pleasing. Unlike p...
Internet videos provide a wealth of data that could be used to learn the appearance or expected behaviors of many object classes. However, most supervised methods cannot exploit this data directly, as they require a large amount of time-consuming manual annotations. As a step towards solving this problem, we propose an automatic system for organizi...
We propose a motion-based method to discover the physical parts of an articulated object class (e.g. head/torso/leg of a horse) from multiple videos. The key is to find object regions that exhibit consistent motion relative to the rest of the object, across multiple videos. We can then learn a location model for the parts and segment them accuratel...
We investigate the problem of automatically discovering the visual aspects of an object class. Existing methods discover aspects from still images under strong supervision, as they require time-consuming manual annotation of the objects’ location (e.g. bounding boxes). Instead, we explore using video, which enables automatic localisation by motion...
We propose an unsupervised approach for discovering characteristic motion patterns in videos of highly articulated objects performing natural, unscripted behaviors, such as tigers in the wild. We discover consistent patterns in a bottom-up manner by analyzing the relative displacements of large numbers of ordered trajectory pairs through time, such...
Given unstructured videos of deformable objects, we automatically recover spatiotemporal correspondences to map one object to another (such as animals in the wild). While traditional methods based on appearance fail in such challenging conditions, we exploit consistency in object motion between instances. Our approach discovers pairs of short video...
We develop a Bayesian modeling approach for tracking people in 3D from monocular video with unknown cameras. Modeling in 3D provides natural explanations for occlusions and smoothness discontinuities that result from projection, and allows priors on velocity and smoothness to be grounded in physical quantities: meters and seconds vs. pixels and fra...
We develop a comprehensive Bayesian generative model for understanding indoor scenes. While it is common in this domain to approximate objects with 3D bounding boxes, we propose using strong representations with finer granularity. For example, we model a chair as a set of four legs, a seat and a backrest. We find that modeling detailed geometry imp...
The task of inferring the 3D layout of indoor scenes from images has seen many recent advancements. Understanding the basic 3D geometry of these environments is important for higher level applications, such as object recognition and robot navigation. In this chapter, we present our Bayesian generative model for understanding indoor environments. We...
We propose a method for understanding the 3D geometry of indoor environments (e.g. bedrooms, kitchens) while simultaneously identifying objects in the scene (e.g. beds, couches, doors). We focus on how modeling the geometry and location of specific objects is helpful for indoor scene understanding. For example, beds are shorter than they are wide,...
We present a method for automatically aligning words to image regions that integrates specific object classifiers (e.g., "car" detectors) with weak models based on appearance features. Previous strategies have largely focused on the latter, and thus have not exploited progress on object category recognition. Hence, we augment region labeling with o...
We propose a top-down approach for understanding indoor scenes such as bedrooms and living rooms. These environments typically have the Manhattan world property that many surfaces are parallel to three principal ones. Further, the 3D geometry of the room and objects within it can largely be approximated by non-overlapping simple structures such as...