Luca Del Pero

Blippar

PhD

About

14 Publications
3,558 Reads
420 Citations
Additional affiliations
October 2013 - present
University of Edinburgh
Position
  • Postdoctoral researcher
Description
  • Weakly supervised learning from Internet videos
August 2009 - July 2013
University of Arizona
Position
  • Research Assistant
Description
  • Understanding indoor scenes from monocular images. People tracking from monocular video. Alignment of images with associated text.
Education
August 2009 - July 2013
University of Arizona
Field of study
  • PhD in Computer Science
September 2005 - February 2008
Polytechnic University of Turin
Field of study
  • Master of Science in Computer Engineering
September 2002 - July 2005
Polytechnic University of Turin
Field of study
  • Bachelor in Computer Engineering

Publications (14)
Preprint
Full-text available
We propose a new method for estimating the relative pose between two images, where we jointly learn keypoint detection, description extraction, matching and robust pose estimation. While our architecture follows the traditional pipeline for pose estimation from geometric computer vision, all steps are learnt in an end-to-end fashion, including feat...
Conference Paper
Image-based tracking of animals in their natural habitats can provide rich behavioural data, but is very challenging due to complex and dynamic background and target appearances. We present an effective method to recover the positions of terrestrial animals in cluttered environments from video sequences filmed using a freely moving monocular camera...
Conference Paper
We address the problem of temporally aligning semantically similar videos, for example two videos of cars on different tracks. We present an alignment method that establishes frame-to-frame correspondences such that the two cars are seen from a similar viewpoint (e.g. facing right), while also being temporally smooth and visually pleasing. Unlike p...
Article
Full-text available
Internet videos provide a wealth of data that could be used to learn the appearance or expected behaviors of many object classes. However, most supervised methods cannot exploit this data directly, as they require a large amount of time-consuming manual annotations. As a step towards solving this problem, we propose an automatic system for organizi...
Conference Paper
We propose a motion-based method to discover the physical parts of an articulated object class (e.g. head/torso/leg of a horse) from multiple videos. The key is to find object regions that exhibit consistent motion relative to the rest of the object, across multiple videos. We can then learn a location model for the parts and segment them accuratel...
Article
We investigate the problem of automatically discovering the visual aspects of an object class. Existing methods discover aspects from still images under strong supervision, as they require time-consuming manual annotation of the objects’ location (e.g. bounding boxes). Instead, we explore using video, which enables automatic localisation by motion...
Conference Paper
Full-text available
We propose an unsupervised approach for discovering characteristic motion patterns in videos of highly articulated objects performing natural, unscripted behaviors, such as tigers in the wild. We discover consistent patterns in a bottom-up manner by analyzing the relative displacements of large numbers of ordered trajectory pairs through time, such...
Article
Full-text available
Given unstructured videos of deformable objects, we automatically recover spatiotemporal correspondences to map one object to another (such as animals in the wild). While traditional methods based on appearance fail in such challenging conditions, we exploit consistency in object motion between instances. Our approach discovers pairs of short video...
Conference Paper
Full-text available
We develop a Bayesian modeling approach for tracking people in 3D from monocular video with unknown cameras. Modeling in 3D provides natural explanations for occlusions and smoothness discontinuities that result from projection, and allows priors on velocity and smoothness to be grounded in physical quantities: meters and seconds vs. pixels and fra...
Conference Paper
We develop a comprehensive Bayesian generative model for understanding indoor scenes. While it is common in this domain to approximate objects with 3D bounding boxes, we propose using strong representations with finer granularity. For example, we model a chair as a set of four legs, a seat and a backrest. We find that modeling detailed geometry imp...
Chapter
The task of inferring the 3D layout of indoor scenes from images has seen many recent advancements. Understanding the basic 3D geometry of these environments is important for higher level applications, such as object recognition and robot navigation. In this chapter, we present our Bayesian generative model for understanding indoor environments. We...
Conference Paper
We propose a method for understanding the 3D geometry of indoor environments (e.g. bedrooms, kitchens) while simultaneously identifying objects in the scene (e.g. beds, couches, doors). We focus on how modeling the geometry and location of specific objects is helpful for indoor scene understanding. For example, beds are shorter than they are wide,...
Conference Paper
Full-text available
We present a method for automatically aligning words to image regions that integrates specific object classifiers (e.g., "car" detectors) with weak models based on appearance features. Previous strategies have largely focused on the latter, and thus have not exploited progress on object category recognition. Hence, we augment region labeling with o...
Conference Paper
Full-text available
We propose a top-down approach for understanding indoor scenes such as bedrooms and living rooms. These environments typically have the Manhattan world property that many surfaces are parallel to three principal ones. Further, the 3D geometry of the room and objects within it can largely be approximated by non-overlapping simple structures such as...
