ABSTRACT: We present a classifier unifying local features based representation and subspace based learning. We also propose a novel method to merge kernel eigen spaces (KES) in feature space. Subspace methods have traditionally been used with the full appearance of the image. Recently local features based bag-of-features (BoF) representation has performed impressively on classification tasks. We use KES with BoF vectors to construct class specific subspaces and use the distance of a query vector from the database KESs as the classification criterion. The use of local features makes our approach invariant to illumination, rotation, scale, small affine transformation and partial occlusions. The system allows hierarchy by merging the KES in the feature space. The classifier performs competitively on the challenging Caltech-101 dataset under normal and simulated occlusion conditions. We show hierarchy on a dataset of videos collected over the internet.
19th International Conference on Pattern Recognition (ICPR 2008), December 8-11, 2008, Tampa, Florida, USA; 01/2008
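The classification rule described in the abstract above (one kernel eigenspace per class, with a query assigned to the class whose eigenspace is nearest) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the RBF kernel, the use of feature-space reconstruction error as the "distance", the toy 2-D vectors standing in for BoF histograms, and the function names `fit_kes`, `distance_to_kes`, and `classify`. The merging of eigenspaces is not covered.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_kes(X, n_components=5, gamma=0.5):
    """Build one kernel eigenspace from the training vectors of a single class."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    evals, evecs = np.linalg.eigh(H @ K @ H)     # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:n_components] # keep the largest ones
    evals, evecs = evals[idx], evecs[:, idx]
    keep = evals > 1e-10                         # drop numerically null directions
    alphas = evecs[:, keep] / np.sqrt(evals[keep])  # unit-norm axes in feature space
    return {"X": X, "K": K, "alphas": alphas, "gamma": gamma}


def distance_to_kes(model, x):
    """Squared reconstruction error of phi(x) with respect to the eigenspace."""
    K = model["K"]
    k_x = rbf_kernel(x[None, :], model["X"], model["gamma"])[0]
    # Kernel values between the centered query and the centered training points.
    k_c = k_x - K.mean(axis=1) - k_x.mean() + K.mean()
    # Squared norm of the centered query; k(x, x) = 1 for the RBF kernel.
    norm_sq = 1.0 - 2.0 * k_x.mean() + K.mean()
    proj = model["alphas"].T @ k_c               # coordinates in the eigenspace
    return max(norm_sq - float(proj @ proj), 0.0)


def classify(models, x):
    """Return the label of the class whose eigenspace is closest to x."""
    return min(models, key=lambda label: distance_to_kes(models[label], x))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical toy data: two well-separated clusters in place of BoF vectors.
    models = {
        "a": fit_kes(rng.normal(0.0, 0.3, (30, 2))),
        "b": fit_kes(rng.normal(5.0, 0.3, (30, 2))),
    }
    print(classify(models, np.array([0.1, -0.1])))
```

Reconstruction error is only one plausible reading of "distance of a query vector from the database KESs"; a different distance would change `distance_to_kes` but leave the per-class structure unchanged.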
ABSTRACT: We propose a novel framework for object detection and localization in images containing appreciable clutter and occlusions. The problem is cast in a statistical hypothesis testing framework. The image under test is converted into a set of local features using affine invariant local region detectors, described using the popular SIFT descriptor. Due to clutter and occlusions, this set is expected to contain features which do not belong to the object. We sample subsets of local features from this set and test for the alternative hypothesis of object present against the null hypothesis of object absent. Further, we use a method similar to the recently proposed spatial scan statistic to refine the object localization estimates obtained from the sampling process. We demonstrate the results of our method on two datasets, TUD Motorbikes and TUD Cars. The TUD Cars database has background clutter. The TUD Motorbikes dataset is recognized to have substantial variation in terms of scale, background, illumination, viewpoint and occlusions.
Sixth Indian Conference on Computer Vision, Graphics & Image Processing, ICVGIP 2008, Bhubaneswar, India, 16-19 December 2008; 01/2008
ABSTRACT: We have attempted the problem of novel view synthesis of scenes containing man-made objects from images taken by arbitrary, uncalibrated cameras. Under the assumption of availability of the correspondence of three vanishing points, in general position, we propose two techniques. The first is a transfer-based scheme which synthesizes new views with only a translation of the virtual camera and computes z-buffer values for handling occlusions in synthesized views. The second is a reconstruction-based scheme which synthesizes arbitrary new views in which the camera can undergo rotation as well as translation. We present experimental results to establish the validity of both formulations.
ABSTRACT: We propose a technique for view synthesis of scenes with static objects as well as objects that translate independently of the camera motion. Assuming the availability of three vanishing points in general position in the given views, we set up an affine coordinate system in which the static and moving points are reconstructed and the translations of the dynamic objects are recovered. We then describe how to synthesize new views corresponding to a completely new camera specified in the affine space with new translations for the dynamic objects. As the extent of the synthesized scene is restricted by the availability of corresponding points, we use a voxel-based volumetric scene reconstruction scheme to obtain a scene model and synthesize views of the entire scene. We present experimental results to validate our technique.
Computer Vision - ACCV 2006, 7th Asian Conference on Computer Vision, Hyderabad, India, January 13-16, 2006, Proceedings, Part I; 01/2006
ABSTRACT: This paper addresses the problem of synthesizing novel views of a scene using images taken by an uncalibrated translating camera. We propose a method for synthesis of views corresponding to translational motion of the camera. Our scheme can handle occlusions and changes in visibility in the synthesized views. We give a characterisation of the viewpoints corresponding to which views can be synthesized. Experimental results have established the validity and effectiveness of the method. Our synthesis scheme can also be used to detect translational pan motion of the camera in a given video sequence. We have also presented experimental results to illustrate this feature of our scheme.
ABSTRACT: We propose a scheme for view synthesis of scenes containing man-made objects from images taken by arbitrary, uncalibrated cameras. Under the assumption of availability of the correspondence of three vanishing points, in general position, our scheme computes z-buffer values that can be used for handling occlusions in the synthesized view. This requires the computation of the infinite homography. We also present an alternate formulation of the technique which works with the same assumptions but does not require infinite homography computation. We present experimental results to establish the validity of both formulations.
In this paper we have addressed the problem of synthesizing new views of a scene using multiple images taken by uncalibrated cameras. We make no assumptions on the motion or internal parameters of the cameras. In fact, our scheme can be used to synthesize novel views using frames extracted from a motion picture and we provide examples of the same. Our technique for view synthesis requires that the correspondence of three vanishing points, in general position, be available in the given views. The knowledge of the vanishing points is used to compute the infinite homography between two given views. We also give an alternate formulation that does not require computation of the infinite homography and is motivated by (3). In both formulations we compute a z-buffer value for each corresponding point that can be used to resolve changes in visibility in the new view correctly. We have proposed a technique for view synthesis under the simpler assumption of translating cameras in (11). In (3), three vanishing points in general position are used to set up a coordinate system in the world. Two points in a single image are said to be corresponding points if one is on one of the coordinate planes and the other is on the line through the first point parallel to the coordinate axis perpendicular to the plane. Reconstruction from a single image can be done, but only for those points for which at least one of the corresponding points is known. Our technique uses two or more views to reconstruct each point visible in at least two of the given views. Also, (3) uses two views to make certain measurements, for example, the affine distance of the camera centres to a chosen world plane. However, they require the ratio of the distances of two reference points from the world plane to be known. We require only that the correspondence of three vanishing points be given.
ICVGIP 2004, Proceedings of the Fourth Indian Conference on Computer Vision, Graphics & Image Processing, Kolkata, India, December 16-18, 2004; 01/2004
ABSTRACT: This paper addresses the problem of invariant-based recognition of quadric configurations from a single image. These configurations consist of a pair of rigidly connected translationally repeated quadric surfaces. This problem is approached via a reconstruction framework. A new mathematical framework, using relative affine structure, on the lines of Luong and Vieville (1996), has been proposed. Using this mathematical framework, translationally repeated objects have been projectively reconstructed, from a single image, with four image point correspondences of the distinguished points on the object and its translate. This has been used to obtain a reconstruction of a pair of translationally repeated quadrics. We have proposed joint projective invariants of a pair of proper quadrics. For the purpose of recognition of quadric configurations, we compute these invariants for the pair of reconstructed quadrics. Experimental results on synthetic and real images establish the discriminatory power and stability of the proposed invariant-based recognition strategy. As a specific example, we have applied this technique for discriminating images of monuments which are characterized by translationally repeated domes modeled as
IEEE Transactions on Pattern Analysis and Machine Intelligence 07/2001; · 4.80 Impact Factor
ABSTRACT: In this paper we propose a reconstruction based recognition scheme for objects with repeated components, using a single image of such a configuration, in which one of the repeated components may be partially occluded. In our strategy we reconstruct each of the components with respect to the same frame and use these to compute invariants. We propose a new mathematical framework for the projective reconstruction of affinely repeated objects. This uses the repetition explicitly and hence is able to handle substantial occlusion of one of the components. We then apply this framework to the reconstruction of a pair of repeated quadrics. The image information required for the reconstruction is the outline conic of one of the quadrics and the correspondence between any four points which are images of points in general position on the quadric and its repetition. Projective invariants computed using the reconstructed quadrics have been used for recognition. The recognition strategy has been applied to images of monuments with multi-dome architecture. Experiments have established the discriminatory ability of the invariants.
Journal of Mathematical Imaging and Vision 01/2001; 14(1):5-20. · 1.77 Impact Factor
ABSTRACT: This paper proposes an invariance based recognition scheme for scenes with multiple repeated components. The scheme considers three-component subsets which characterize the scene completely. Each such three-component subset is reconstructed using single image based information. We have developed a mathematical framework for the projective reconstruction based on relative affine structure of each such three-component building block. This is extended to the case when each of the components is a quadric. We have also obtained a set of projective invariants of three quadrics. Although the reconstruction scheme is general and applicable to all multiple repeated components, it requires the computation of the infinite homography. The infinite homography and hence the reconstruction scheme are only image computable with the given information in the case of translational repetition. We therefore develop a recognition strategy for the specific case of translationally repeated quadrics. As a recognition strategy for scenes with multiple translationally repeated quadric components, we propose to compute and store invariant values for each such three-component subset. Experiments on real data have shown the applicability of this approach for recognition of aerial images of power plants. The discriminatory power of the invariants and the stability of the recognition results have also been experimentally demonstrated.