Active Vision from Multiple Cues.
ABSTRACT: Active vision involves processes for stabilisation and fixation on objects of interest. To provide robust performance for
such processes it is necessary to consider integration and processing as closely coupled processes. In this paper we discuss
methods for integration of cues and present a unified architecture for active vision. The performance of the approach is illustrated
by a few examples.
ABSTRACT: Segmenting semantically meaningful whole objects from images is a challenging problem, and it becomes especially so without higher level common sense reasoning. In this paper, we present an interactive segmentation framework that integrates image appearance and boundary constraints in a principled way to address this problem. In particular, we assume that small sets of pixels, which are referred to as seed pixels, are labeled as the object and background. The seed pixels are used to estimate the labels of the unlabeled pixels using Dirichlet process multiple-view learning, which leverages 1) multiple-view learning that integrates appearance and boundary constraints and 2) Dirichlet process mixture-based nonlinear classification that simultaneously models image features and discriminates between the object and background classes. With the proposed learning and inference algorithms, our segmentation framework is experimentally shown to produce both quantitatively and qualitatively promising results on a standard dataset of images. In particular, our proposed framework is able to segment whole objects from images given insufficient seeds.
IEEE Transactions on Image Processing 12/2011; 21(4):2119-29. · 3.11 Impact Factor
ABSTRACT: This paper presents a general approach for the simultaneous tracking of multiple moving targets using a generic active stereo setup. The problem is formulated on the plane, where cameras are modeled as “line scan cameras,” and targets are described as points with unconstrained motion. We propose to control the active system parameters in such a manner that the images of the targets in the two views are related by a homography. This homography is specified during the design stage and, thus, can be used to implicitly encode the desired tracking behavior. Such formulation leads to an elegant geometric framework that enables a systematic and thorough analysis of the problem at hand. The benefits of the approach are illustrated by applying the framework to two distinct stereo configurations. In the first case, we assume two pan-tilt-zoom cameras, with rotation and zoom control, which are arbitrarily placed in the working environment. It is proved that such a stereo setup can track up to N = 3 free-moving targets, while assuring that the image location of each target is the same for both views. The second example considers a robot head with neck pan motion and independent eye rotation. For this case, it is shown that it is not possible to track more than N = 2 targets because of the lack of zoom. The theoretical framework is used to derive the control equations, and the implementation of the tracking behavior is described in detail. The correctness of the results is confirmed through simulations and real tracking experiments.
IEEE Transactions on Robotics 07/2010; · 2.65 Impact Factor