Harpreet S. Sawhney

Microsoft

About

165
Publications
21,063
Reads
13,206
Citations

Publications (165)
Chapter
Matching street-level images to a database of airborne images is hard because of extreme viewpoint and illumination differences. Color/gradient distributions or local descriptors fail to match, forcing us to rely on the structure of self-similarity of patterns on facades. We propose to capture this structure with a novel “scale-selective self-simila...
Article
We propose a new zero-shot Event-Detection method by Multi-modal Distributional Semantic embedding of videos. Our model embeds object and action concepts as well as other available modalities from videos into a distributional semantic space. To our knowledge, this is the first Zero-Shot event detection model that is built on top of distributional s...
Conference Paper
Full-text available
We propose a new zero-shot Event Detection method by Multi-modal Distributional Semantic embedding of videos. Our model embeds object and action concepts as well as other available modalities from videos into a distributional semantic space. To our knowledge, this is the first Zero-Shot event detection model that is built on top of distributional s...
Article
Full-text available
We present an algorithm to estimate depth in dynamic video scenes. We propose to learn and infer depth in videos from appearance, motion, occlusion boundaries, and geometric context of the scene. Using our method, depth can be estimated from unconstrained videos with no requirement of camera pose estimation, and with significant background/foregrou...
Article
We present snap-n-eat, a mobile food recognition system. The system can recognize food and estimate the calorific and nutrition content of foods automatically without any user intervention. To identify food items, the user simply snaps a photo of the food plate. The system detects the salient region, crops its image, and subtracts the background ac...
Patent
A computer implemented method for deriving an attribute entity network (AEN) from video data is disclosed, comprising the steps of: extracting at least two entities from the video data; tracking the trajectories of the at least two entities to form at least two tracks; deriving at least one association between at least two entities by detecting at...
Patent
A method and apparatus for determining a geographic location of a scene in a captured depiction comprising extracting a first set of features from the captured depiction by algorithmically analyzing the captured depiction, matching the extracted features of the captured depiction against a second set of extracted features associated with reference...
Article
The problem of training a classifier from a handful of positive examples, without having to supply class specific negatives is of great practical importance. The proposed approach to solving this problem builds on the idea of training LDA classifiers using only class specific foreground images and a large collection of unlabelled images, as describ...
Patent
A computer implemented method for determining a vehicle type of a vehicle detected in an image is disclosed. An image having a detected vehicle is received. A number of vehicle models having salient feature points is projected on the detected vehicle. A first set of features derived from each of the salient feature locations of the vehicle models i...
Patent
A computer implemented method for detecting the presence of one or more pedestrians in the vicinity of the vehicle is disclosed. Imagery of a scene is received from at least one image capturing device. A depth map is derived from the imagery. A plurality of pedestrian candidate regions of interest (ROIs) is detected from the depth map by matching e...
Conference Paper
Detecting pedestrians at a distance from large-format wide-area imagery is a challenging problem because of low ground sampling distance (GSD) and low frame rate of the imagery. In such a scenario, the approaches based on appearance cues alone mostly fail because pedestrians are only a few pixels in size. Frame-differencing and optical flow based a...
Patent
The present invention relates to a method and apparatus for detecting and tracking vehicles. One embodiment of a system for detecting and tracking an object (e.g., vehicle) in a field of view includes a moving object indication stage for detecting a candidate object in a series of input video frames depicting the field of view and a track associati...
Conference Paper
Important diagnostic criteria for glaucoma are changes in the 3D structure of the optic disc due to optic nerve damage. We propose an automatic approach for detecting these changes in 3D models reconstructed from fundus images of the same patient taken at different times. For each time session, only two uncalibrated fundus images are required. The a...
Conference Paper
We propose a novel hybrid model that exploits the strength of discriminative classifiers along with the representational power of generative models. Our focus is on detecting multimodal events in time varying sequences. Discriminative classifiers have been shown to achieve higher performances than the corresponding generative likelihood-based class...
Patent
Full-text available
A computer implemented method for matching video data to a database containing a plurality of video fingerprints of the type described above, comprising the steps of calculating at least one fingerprint representing at least one query frame from the video data; indexing into the database using the at least one calculated fingerprint to find a set o...
Patent
The present invention relates to a method and system for creating a strong classifier based on motion patterns wherein the strong classifier may be used to determine an action being performed by a body in motion. When creating the strong classifier, action classification is performed by measuring similarities between features within motion patterns...
Conference Paper
Complex event detection is very challenging in open sources such as YouTube videos, which usually comprise very diverse visual content involving various object, scene and action concepts. Not all of them, however, are relevant to the event. In other words, a video may contain a lot of "junk" information which is harmful for recognition. Hence, we...
Conference Paper
Despite recent advances, automatic blood vessel extraction from low quality retina images remains difficult. We propose an interactive approach that enables a user to efficiently obtain near perfect vessel segmentation with a few mouse clicks. Given two seed points, the approach seeks an optimal path between them by minimizing a cost function. In c...
Article
Full-text available
This paper presents a novel method to recover 3D structure of the optic disc in the retina from two uncalibrated fundus images. Retinal images are commonly uncalibrated when acquired clinically, creating rectification challenges as well as significant radiometric and blur differences within the stereo pair. By exploiting structural peculiarities of...
Conference Paper
We present a novel approach for multi-modal affect analysis in human interactions that is capable of integrating data from multiple modalities while also taking into account temporal dynamics. Our fusion approach, Joint Hidden Conditional Random Fields (JHCRFs), combines the advantages of purely feature-level (early fusion) approaches with l...
Conference Paper
Full-text available
In this work, we propose a novel video representation for activity recognition that models video dynamics with attributes of activities. A video sequence is decomposed into short-term segments, which are characterized by the dynamics of their attributes. These segments are modeled by a dictionary of attribute dynamics templates, which are implement...
Patent
The present invention provides a computer implemented process for detecting multi-view multi-pose objects. The process comprises training of a classifier for each intra-class exemplar, training of a strong classifier and combining the individual exemplar-based classifiers with a single objective function. This function is optimized using the two ne...
Patent
A computer implemented method for automatically detecting and classifying acoustic signatures across a set of recording conditions is disclosed. A first acoustic signature is received. The first acoustic signature is projected into a space of a minimal set of exemplars of acoustic signature types derived from a larger set of exemplars using a wrapp...
Patent
The present invention is a system and a method for processing stereo images utilizing a real time, robust, and accurate stereo matching system and method based on a coarse-to-fine architecture. At each image pyramid level, non-centered windows for matching and adaptive upsampling of coarse-level disparities are performed to generate estimated dispa...
Conference Paper
We present a novel method for matching ground-based query images to a georeferenced LIDAR 3D dataset acquired from an airborne platform in urban environments. We are addressing two main technical challenges: (i) different modalities between the query and the reference data (electro-optical vs. LIDAR) that impose unique challenges to the matching pr...
Conference Paper
We propose to use action, scene and object concepts as semantic attributes for classification of video events in in-the-wild content, such as YouTube videos. We model events using a variety of complementary semantic attribute features developed in a semantic concept space. Our contribution is to systematically demonstrate the advantages of this conce...
Patent
Full-text available
A computer-implemented method for estimating a volume of at least one food item on a food plate is disclosed. A first and second plurality of images are received from different positions above a food plate, wherein angular spacing between the positions of the first plurality of images is greater than angular spacing between the positions of the sec...
Patent
Full-text available
A method and apparatus for recognizing an object, comprising providing a set of scene features from a scene, pruning a set of model features, generating a set of hypotheses associated with the pruned set of model features for the set of scene features, pruning the set of hypotheses, and verifying the set of pruned hypotheses is provided.
Conference Paper
Multimedia event detection has drawn a lot of attention in recent years. Given a recognized event, in this paper, we conduct a pilot study of the multimedia event recounting problem, which answers the question of why a video is recognized as a given event, i.e., on what evidence the decision is made. In order to provide a semantic recounting of the m...
Conference Paper
Matching street-level images to a database of airborne images is hard because of extreme viewpoint and illumination differences. Color/gradient distributions or local descriptors fail to match, forcing us to rely on the structure of self-similarity of patterns on facades. We propose to capture this structure with a novel “scale-selective self-simila...
Patent
A method for extracting a 3D terrain model for identifying at least buildings and terrain from LIDAR data is disclosed, comprising the steps of generating a point cloud representing terrain and buildings mapped by LIDAR; classifying points in the point cloud, the point cloud having ground and non-ground points, the non-ground points representing bu...
Conference Paper
Low-level appearance as well as spatio-temporal features, appropriately quantized and aggregated into Bag-of-Words (BoW) descriptors, have been shown to be effective in many detection and recognition tasks. However, their efficacy for complex event recognition in unconstrained videos has not been systematically evaluated. In this paper, we use the...
Article
It is true that the teeth of man have been hurting for many thousands of years. The causes of the pain and effective methods to relieve or prevent the pain tend to follow traditional routes like random treatment, starting with materials of natural origin and then work by the aggressively curious, the intelligent and the innovative to better underst...
Article
Photodynamic therapy (PDT), also known as photoradiation therapy, phototherapy, or photochemotherapy, involves the use of a photoactive dye (photosensitizer) that is activated by exposure to light of a specific wavelength in the presence of oxygen. PDT is a very promising and noninvasive treatment modality. At present, PDT is an alternative method of...
Conference Paper
We study the feasibility of solving the challenging problem of geolocalizing ground level images in urban areas with respect to a database of images captured from the air such as satellite and oblique aerial images. We observe that comprehensive aerial image databases are widely available while complete coverage of urban areas from the ground is at...
Article
We propose a novel similarity measure of two image sequences based on shapeme histograms. The idea of shapeme histogram has been used for single image/texture recognition, but is used here to solve the sequence-to-sequence matching problem. We develop techniques to represent each sequence as a set of shapeme histograms, which captures different var...
Conference Paper
Analyzing change in the 3D structure of the optic disc over time has long been recognized as central to the diagnosis of glaucoma but has been inadequately addressed by computer vision methods. Currently, clinicians examine stereo pairs from different time instants for interval changes indicative of glaucoma. Due to the clinical procedures in captu...
Conference Paper
We describe a vehicle tracking algorithm using input from a network of nonoverlapping cameras. Our algorithm is based on a novel statistical formulation that uses joint kinematic and image appearance information to link local tracks of the same vehicles into global tracks with longer persistence. The algorithm can handle significant spatial separat...
Conference Paper
We present a novel LIDAR streaming architecture for real-time, on-board processing using unmanned robots. We propose a two-level 3D data structure that allows pipelined and streaming processing of the 3D data as it arrives from a moving robot: (i) at the coarse level, the incoming 3D scans are stored in memory in a dense 3D voxel grid with a relati...
Article
Full-text available
To assess the suitability of digital stereo images for optic disc evaluations in glaucoma. Stereo color optic disc images in both digital and 35-mm slide film formats were acquired contemporaneously from 29 subjects with various cup-to-disc ratios (range, 0.26-0.76; median, 0.475). Using a grading scale designed to assess image quality, the ease of...
Article
Localizing blood vessels in eye images is a crucial step in the automated and objective diagnosis of eye diseases. Most previous research has focused on extracting the centerlines of vessels in large field of view images. However, for diagnosing diseases of the optic disk region, like glaucoma, small field of view images have to be analyzed. One ne...
Conference Paper
We present an approach that uses detailed 3D models to detect and classify objects into fine levels of vehicle categories. Unlike other approaches that use silhouette information to fit a 3D model, our approach uses complete appearance from the image. Each 3D model has a set of salient location markers that are determined a priori. These salient loc...
Conference Paper
This paper presents a joint probabilistic relation graph approach to simultaneously detect and track a large number of vehicles in low frame rate aerial videos. Due to low frame rate, low spatial resolution and sheer number of moving objects, detection and tracking in wide area video poses unique challenges. In this paper, we explore vehicle behavi...
Conference Paper
We present a real-time pedestrian detection system based on structure and appearance classification. We discuss several novel ideas that contribute to low false-alarm rates and high detection rates, while at the same time achieving computational efficiency: (i) At the front end of our system we employ stereo to detect pedestrians in 3D range maps...
Article
We describe a real-time wide area surveillance system (WA-ACTV) for the automatic tracking of vessels using a network of PTZ cameras. The system is capable of optimally managing hundreds of PTZ cameras to simultaneously track a large number of vessels. The tracked vessels are fingerprinted using a scale-invariant part-based representation and subs...
Article
We present a real-time pedestrian detection system which uses cues derived from structure and appearance classification. We discuss several novel ideas to achieve computational efficiency while improving on both detection and false-alarm rates: (i) At the front end of our system we employ stereo to detect pedestrians in 3D range maps, and to classif...
Article
Gunshot recordings have the potential for both tactical detection and forensic evaluation particularly to ascertain information about the type of firearm and ammunition used. Perhaps the most significant challenge to such an analysis is the effect of recording conditions on the audio signature of recorded data. In this paper we present a first stud...
Article
This paper presents a relational graph based approach to track thousands of vehicles from persistent wide area airborne surveillance (WAAS) videos. Due to the low ground sampling distance and low frame rate, vehicles usually have small size and may travel a long distance between consecutive frames, so WAAS videos pose great challenges to correct assoc...
Article
We present an on-the-move LIDAR-based object detection system for autonomous and semi-autonomous unmanned vehicle systems. In this paper we make several contributions: (i) we describe an algorithm for real-time detection of objects such as doors and stairs in indoor environments; (ii) we describe efficient data structures and algorithms for process...
Conference Paper
We present a system that improves accuracy of food intake assessment using computer vision techniques. Traditional dietetic methods suffer from the drawback of either inaccurate assessment or complex lab measurement. Our solution is to use a mobile phone to capture images of foods, recognize food types, estimate their respective volumes and finally...
Conference Paper
We propose a principled statistical approach for using 3D information and scene context to reduce the number of false positives in stereo based pedestrian detection. Current pedestrian detection algorithms have focused on improving the discriminability of 2D features that capture the pedestrian appearance, and on using various classifier architectu...
Conference Paper
We propose a real-time action detection system based on a novel action representation and an effective learning method with a small training set. We represent actions with a new feature that measures the “global” distance from a set of action exemplars, where action exemplars are constructed from a vocabulary that encodes “local” instantaneous...
Article
We apply a unique hierarchical audio classification technique to weapon identification using gunshot analysis. The Audio Classification classifies each audio segment as one of ten weapon classes (e.g., 9mm, 22, shotgun etc.) using low-complexity Gaussian Mixture Models (GMM). The first level of hierarchy consists of classification into broad weapons...
Article
Full-text available
We describe a novel scalable approach for the management of a large number of Pan-Tilt-Zoom (PTZ) cameras deployed outdoors for persistent tracking of humans and vehicles, without resorting to the large fields of view of associated static cameras. Our system, Active Collaborative Tracking - Vision (ACT-Vision), is essentially a real-time operating...
Conference Paper
We describe a low-cost vision-based sensing and positioning system that enables intelligent vehicles of the future to autonomously drive in an urban environment with traffic. The system was built by integrating Sarnoff's algorithms for driver awareness and vehicle safety with commercial off-the-shelf hardware on a robot vehicle. We implemented a mo...
Article
In this paper, we present a fully integrated real-time computer vision system that can detect and track multiple humans in a wide area using a network of stereo cameras. Continuous human identities are achieved by fusing video tracking with different kinds of biometric devices. The system also provides immersive visualization which enables the use...
Article
Predicting motion of humans, animals and other objects which move according to internal plans is a challenging problem. Most existing approaches operate in two stages: (a) learning typical motion patterns by observing an environment and (b) predicting ...
Conference Paper
This paper presents an approach to extracting and using semantic layers from low altitude aerial videos for scene understanding and object tracking. The input video is captured by low flying aerial platforms and typically consists of strong parallax from non-ground-plane structures. A key aspect of our approach is the use of geo-registration of vid...
Article
In this paper, we present a new feature to model a class of events that consist of complex interactions among multiple entities captured by tracks and inter-object relationships over space and time. Existing approaches represent these events using features that measure only pairwise relationships between entities at a time, such as relative distanc...
Conference Paper
We present a novel building segmentation system for densely built areas, containing thousands of buildings per square kilometer. We employ solely sparse LIDAR (Light/Laser Detection Ranging) 3D data, captured from an aerial platform, with resolution less than one point per square meter. The goal of our work is to create segmented and delineated bui...
Conference Paper
This paper proposes a novel approach to discover a set of class specific “composite features” as the feature pool for the detection and classification of complex objects using AdaBoost. Each composite feature is constructed from the combination of multiple individual features. Unlike previous works that design features manually or with cert...
Conference Paper
Full-text available
We propose a robust object recognition method based on approximate 3D models that can effectively match objects under large viewpoint changes and partial occlusion. The specific problem we solve is: given two views of an object, determine if the views are for the same or different object. Our domain of interest is vehicles, but the approach c...
Conference Paper
Full-text available
In this paper, we study how to build a vision-based system for global localization with accuracies within 10 cm for robots and humans operating both indoors and outdoors over wide areas covering many square kilometers. In particular, we study the parameters of building a landmark database rapidly and utilizing that database online for real-tim...
Article
This paper proposes a novel unsupervised algorithm learning discriminative features in the context of matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem, which aims to compute the probability of vehicle images from two distinct cameras being from the same vehicle...
Conference Paper
We propose an efficient action retrieval system that is based on a novel action representation and an effective video matching method. We represent actions with a hierarchical encoding scheme that at low-level measures local body parts motions, which then evolves into encoding of instantaneous global body motions and finally high-level description...
Conference Paper
Fingerprinting is the process of mapping content or fragments of it, into unique, discriminative hashes called fingerprints. In this paper, we propose an automated video identification algorithm that employs fingerprinting for storing videos inside its database. When queried using a degraded short video segment, the objective of the system is to re...
Conference Paper
We present an action recognition scheme that integrates multiple cue modalities, including shape, motion and depth, to recognize human gestures in video sequences. In the proposed approach we extend the classification framework that is commonly used in 2D object recognition to 3D spatio-temporal space for recognizing actions. Specifically, a boos...
Conference Paper
Video cameras are no longer being used only in their traditional role of providing "viewable pixels", but are rapidly becoming sources of intelligent information about the world. More recently 3D cameras are being developed to directly provide 3D measurements of objects and scenes. Appearance and geometry of objects and scenes, and the temporal dyn...
Conference Paper
This paper presents a novel framework, prototype embedding and embedding transition (PEET), for matching objects, especially vehicles, that undergo drastic pose, appearance, and even modality changes. The problem of matching objects seen under drastic variations is reduced to matching embeddings of object appearances instead of matching the object...
Article
This paper addresses the problem of matching vehicles across multiple sightings under variations in illumination and camera poses. Since multiple observations of a vehicle are separated by large temporal and/or spatial gaps, prohibiting the use of standard frame-to-frame data association, we employ features extracted over a sequence during one...
Conference Paper
Full-text available
Our goal is to create a visual odometry system for robots and wearable systems such that localization accuracies of centimeters can be obtained for hundreds of meters of distance traveled. Existing systems have achieved approximately a 1% to 5% localization error rate whereas our proposed system achieves close to 0.1% error rate, a ten-fold reducti...
Conference Paper
This paper presents an action analysis method based on robust string matching using dynamic programming. Similar to matching text sequences, atomic actions based on semantic and structural features are first detected and coded as spatio-temporal characters or symbols. These symbols are subsequently concatenated to form a unique set of strings for e...
Article
Full-text available
We propose a new method for rapid 3D object indexing that combines feature-based methods with coarse alignment-based matching techniques. Our approach achieves a sublinear complexity on the number of models, maintaining at the same time a high degree of performance for real 3D sensed data that is acquired in largely uncontrolled settings. The key c...
Article
This paper describes a model-based 3D object recognition system, which makes use of 3D data acquired by LIDAR sensors. The system is based on a coarse-to-fine scheme for object indexing and verification to achieve high efficiency and accuracy. The system employs rotationally invariant semi-local spin image features for object representation and for...
Conference Paper
This paper proposes a novel method to exploit model similarity in model-based 3D object recognition. The scenario consists of a large 3D model database of vehicles, and rapid indexing and matching needs to be done without sequential model alignment. In this scenario, the competition amongst shape features from similar models may pose serious challen...
Conference Paper
Full-text available
Using the variational approaches to estimate optical flow between two frames, the flow discontinuities between different motion fields are usually not distinguished even when an anisotropic diffusion operator is applied. In this paper, we propose a multi-cue driven adaptive bilateral filter to regularize the flow computation, which is able to ach...
Conference Paper
Full-text available
We present a novel approach for identifying 3D objects from a database of models, highly similar in shape, using range data acquired in unconstrained settings from a limited number of viewing directions. We also address the challenging case of identifying targets not present in the database. The method is based on learning offline saliency...
Article
This paper describes a model-based 3D object recognition system, which makes use of 3D data acquired by LIDAR sensors. The system is based on a coarse-to-fine scheme for object indexing and verification to achieve high efficiency and accuracy. The system employs rotationally invariant semi-local spin image features for object representation and for...
Article
Histograms of shape signature or prototypical shapes, called shapemes, have been used effectively in previous work for 2D/3D shape matching and recognition. We extend the idea of shapeme histogram to recognize partially observed query objects from a database of complete model objects. We propose representing each model object as a collection of sha...
Conference Paper
Full-text available
In this paper, we propose a robust heterogeneous feature based image alignment method that utilizes points, lines and regions in a unified framework. The image motion is decomposed into progressively complex components, i.e., translation, similarity, affine, and projective motion models, and alignment is obtained with deliberately selected suitab...
Conference Paper
This paper proposes a novel approach for multi-view multi-pose object detection using discriminative shape-based exemplars. The key idea underlying this method is motivated by numerous previous observations that manually clustering multi-view multi-pose training data into different categories and then combining the separately trained two-class class...
Conference Paper
Tracking objects over a long period of time in realistic environments remains a challenging problem for ground and aerial video surveillance. Matching objects and verifying their identities across multiple spatial and temporal gaps proves to be an effective way to extend tracking range. When an object track is lost due to occlusion or other reasons...
Conference Paper
We propose a novel method for identifying road vehicles between two nonoverlapping cameras. The problem is formulated as a same-different classification problem: probability of two vehicle images from two distinct cameras being from the same vehicle or from different vehicles. The key idea is to compute the probability without matching the two vehi...
Conference Paper
This paper proposes a method for matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem: probability of two observations from two distinct cameras being from the same vehicle or from different vehicles. We employ a measurement vector consisting of three independent edg...
Conference Paper
We present a fully integrated real-time system to track humans with a network of stereo sensors over a wide area. The processing includes single camera tracking and multi-camera fusion. Each single camera detects and tracks humans in its own view and a multi-camera fusion module combines all the local tracks of the same human into a global track. W...
Article
Visual recognition of objects through multiple observations is an important component of object tracking. We address the problem of vehicle matching when multiple observations of a vehicle are separated in time such that frames of observations are not contiguous, thus prohibiting the use of standard frame-to-frame data association. We employ featur...
Article
This paper proposes a joint feature-based model indexing and geometric constraint based alignment pipeline for efficient and accurate recognition of 3D objects from a large model database. Traditional approaches either first prune the model database using indexing without geometric alignment or directly perform recognition based alignment. The inde...
Conference Paper
Histograms of shape signatures or prototypical shapes, called shapemes, have been used effectively in previous work for 2D/3D shape matching and recognition. We extend the idea of the shapeme histogram to recognize partially observed query objects from a database of complete model objects. We propose to represent each model object as a collection of shapem...
Article
Realistic and interactive telepresence has been a hot research topic in recent years. Enabling telepresence using depth-based new view rendering requires the compression and transmission of video as well as dynamic depth maps from multiple cameras. The telepresence application places additional requirements on the compressed representation of depth...
Conference Paper
Full-text available
This paper proposes a joint feature-based model indexing and geometric constraint based alignment pipeline for efficient and accurate recognition of 3D objects from a large model database. Traditional approaches either first prune the model database using indexing without geometric alignment or directly perform recognition based alignment. The inde...
Article
We propose a novel similarity measure of two image sequences based on shapeme histograms. The idea of shapeme histogram has been used for single image/texture recognition, but is used here to solve the sequence-to-sequence matching problem. We develop techniques to represent each sequence as a set of shapeme histograms, which captures different va...
Article
Video information can provide an inexpensive source of information about the world. For many applications such as surveillance, situation awareness and navigation, the utility of this video information is increased if we are able to assign precise geocoordinates to the pixels in the video acquired from an airborne platform. Many video-capture platf...
Article
In a typical security and monitoring system a large number of networked cameras are installed at fixed positions around a site under surveillance. There is generally no global view or map that shows the guard how the views of different cameras relate to one another. Individual cameras may be equipped with pan, tilt and zoom capabilities, and the gu...
Conference Paper
Reconstruction-based super-resolution from motion video has been an active area of study in computer vision and video analysis. Image alignment is a key component of super-resolution algorithms. Almost all previous super-resolution algorithms have assumed that standard methods of image alignment can provide accurate enough alignment for creating su...
Article
This article presents a complete approach for automated construction of mosaics from images and video, constituting a practical end-to-end system. Local alignment of spatially overlapping frames followed by global consistency provides spatial continuity, while compositing via multiresolution blending provides photometric continuity, so that the mos...
Article
Full-text available
Decomposing video frames into coherent 2D motion layers is a powerful method for representing videos. Such a representation provides an intermediate description that enables applications such as object tracking, video summarization and visualization, video insertion, and sprite-based video compression. Previous work on motion layer analysis has lar...
Conference Paper
View-based 3D video streaming requires the compression and transmission of depth maps along with the video sequence. The requirements for representing these depth maps include moderately high compression, preservation of depth discontinuities, low-complexity decoding, and a form that is suitable for real-time rendering using graphics cards...
Article
Decomposing video frames into coherent two-dimensional motion layers is a powerful method for representing videos. Such a representation provides an intermediate description that enables applications such as object tracking, video summarization and visualization, video insertion, and sprite-based video compression. Previous work on motion layer ana...
Conference Paper
Full-text available
Videos and 3D models have traditionally existed in separate worlds and as distinct representations. Although texture maps for 3D models have been traditionally derived from multiple still images, real-time mapping of live videos as textures on 3D models has not been attempted. This paper presents a system for rendering multiple live videos in real-...
Conference Paper
Full-text available
Reconstruction-based super-resolution algorithms require very accurate alignment and good choice of filters to be effective. Often these requirements are hard to satisfy, for example, when we adopt optical flow as the motion model. In addition, the condition of having enough sub-samples may vary from pixel to pixel. We propose an alternative super-...
