About
165 publications
21,063 reads
13,206 citations
Publications (165)
Matching street-level images to a database of airborne images is hard because of extreme viewpoint and illumination differences. Color/gradient distributions or local descriptors fail to match, forcing us to rely on the structure of self-similarity of patterns on facades. We propose to capture this structure with a novel “scale-selective self-simila...
We propose a new zero-shot Event-Detection method by Multi-modal Distributional Semantic embedding of videos. Our model embeds object and action concepts as well as other available modalities from videos into a distributional semantic space. To our knowledge, this is the first Zero-Shot event detection model that is built on top of distributional s...
We present an algorithm to estimate depth in dynamic video scenes. We propose
to learn and infer depth in videos from appearance, motion, occlusion
boundaries, and geometric context of the scene. Using our method, depth can be
estimated from unconstrained videos with no requirement of camera pose
estimation, and with significant background/foregrou...
We present snap-n-eat, a mobile food recognition system. The system can recognize food and estimate the calorific and nutrition content of foods automatically without any user intervention. To identify food items, the user simply snaps a photo of the food plate. The system detects the salient region, crops its image, and subtracts the background ac...
A computer implemented method for deriving an attribute entity network (AEN) from video data is disclosed, comprising the steps of: extracting at least two entities from the video data; tracking the trajectories of the at least two entities to form at least two tracks; deriving at least one association between at least two entities by detecting at...
A method and apparatus for determining a geographic location of a scene in a captured depiction comprising extracting a first set of features from the captured depiction by algorithmically analyzing the captured depiction, matching the extracted features of the captured depiction against a second set of extracted features associated with reference...
The problem of training a classifier from a handful of positive examples, without having to supply class specific negatives is of great practical importance. The proposed approach to solving this problem builds on the idea of training LDA classifiers using only class specific foreground images and a large collection of unlabelled images, as describ...
A computer implemented method for determining a vehicle type of a vehicle detected in an image is disclosed. An image having a detected vehicle is received. A number of vehicle models having salient feature points is projected on the detected vehicle. A first set of features derived from each of the salient feature locations of the vehicle models i...
A computer implemented method for detecting the presence of one or more pedestrians in the vicinity of the vehicle is disclosed. Imagery of a scene is received from at least one image capturing device. A depth map is derived from the imagery. A plurality of pedestrian candidate regions of interest (ROIs) is detected from the depth map by matching e...
Detecting pedestrians at a distance from large-format wide-area imagery is a challenging problem because of low ground sampling distance (GSD) and low frame rate of the imagery. In such a scenario, the approaches based on appearance cues alone mostly fail because pedestrians are only a few pixels in size. Frame-differencing and optical flow based a...
The present invention relates to a method and apparatus for detecting and tracking vehicles. One embodiment of a system for detecting and tracking an object (e.g., vehicle) in a field of view includes a moving object indication stage for detecting a candidate object in a series of input video frames depicting the field of view and a track associati...
Important diagnostic criteria for glaucoma are changes in the 3D structure of the optic disc due to optic nerve damage. We propose an automatic approach for detecting these changes in 3D models reconstructed from fundus images of the same patient taken at different times. For each time session, only two uncalibrated fundus images are required. The a...
We propose a novel hybrid model that exploits the strength of discriminative classifiers along with the representational power of generative models. Our focus is on detecting multimodal events in time varying sequences. Discriminative classifiers have been shown to achieve higher performances than the corresponding generative likelihood-based class...
A computer implemented method for matching video data to a database containing a plurality of video fingerprints of the type described above, comprising the steps of calculating at least one fingerprint representing at least one query frame from the video data; indexing into the database using the at least one calculated fingerprint to find a set o...
The present invention relates to a method and system for creating a strong classifier based on motion patterns wherein the strong classifier may be used to determine an action being performed by a body in motion. When creating the strong classifier, action classification is performed by measuring similarities between features within motion patterns...
Complex event detection is very challenging in open-source video such as YouTube videos, which usually comprise very diverse visual content involving various object, scene, and action concepts. Not all of them, however, are relevant to the event. In other words, a video may contain a lot of "junk" information which is harmful for recognition. Hence, we...
Despite recent advances, automatic blood vessel extraction from low quality retina images remains difficult. We propose an interactive approach that enables a user to efficiently obtain near perfect vessel segmentation with a few mouse clicks. Given two seed points, the approach seeks an optimal path between them by minimizing a cost function. In c...
This paper presents a novel method to recover 3D structure of the optic disc in the retina from two uncalibrated fundus images. Retinal images are commonly uncalibrated when acquired clinically, creating rectification challenges as well as significant radiometric and blur differences within the stereo pair. By exploiting structural peculiarities of...
We present a novel approach for multi-modal affect analysis in human interactions that is capable of integrating data from multiple modalities while also taking into account temporal dynamics. Our fusion approach, Joint Hidden Conditional Random Fields (JHCRFs), combines the advantages of purely feature-level (early fusion) approaches with l...
In this work, we propose a novel video representation for activity recognition that models video dynamics with attributes of activities. A video sequence is decomposed into short-term segments, which are characterized by the dynamics of their attributes. These segments are modeled by a dictionary of attribute dynamics templates, which are implement...
The present invention provides a computer implemented process for detecting multi-view multi-pose objects. The process comprises training of a classifier for each intra-class exemplar, training of a strong classifier and combining the individual exemplar-based classifiers with a single objective function. This function is optimized using the two ne...
A computer implemented method for automatically detecting and classifying acoustic signatures across a set of recording conditions is disclosed. A first acoustic signature is received. The first acoustic signature is projected into a space of a minimal set of exemplars of acoustic signature types derived from a larger set of exemplars using a wrapp...
The present invention is a system and a method for processing stereo images utilizing a real time, robust, and accurate stereo matching system and method based on a coarse-to-fine architecture. At each image pyramid level, non-centered windows for matching and adaptive upsampling of coarse-level disparities are performed to generate estimated dispa...
We present a novel method for matching ground-based query images to a georeferenced LIDAR 3D dataset acquired from an airborne platform in urban environments. We are addressing two main technical challenges: (i) different modalities between the query and the reference data (electro-optical vs. LIDAR) that impose unique challenges to the matching pr...
We propose to use action, scene and object concepts as semantic attributes for classification of video events in in-the-wild content, such as YouTube videos. We model events using a variety of complementary semantic attribute features developed in a semantic concept space. Our contribution is to systematically demonstrate the advantages of this conce...
A computer-implemented method for estimating a volume of at least one food item on a food plate is disclosed. A first and second plurality of images are received from different positions above a food plate, wherein angular spacing between the positions of the first plurality of images is greater than angular spacing between the positions of the sec...
A method and apparatus for recognizing an object, comprising providing a set of scene features from a scene, pruning a set of model features, generating a set of hypotheses associated with the pruned set of model features for the set of scene features, pruning the set of hypotheses, and verifying the set of pruned hypotheses is provided.
Multimedia event detection has drawn a lot of attention in recent years. Given a recognized event, in this paper, we conduct a pilot study of the multimedia event recounting problem, which answers the question of why a video is recognized as a given event, i.e., on what evidence the decision is made. In order to provide a semantic recounting of the m...
A method for extracting a 3D terrain model for identifying at
least buildings and terrain from LIDAR data is disclosed,
comprising the steps of generating a point cloud representing
terrain and buildings mapped by LIDAR; classifying points in the
point cloud, the point cloud having ground and non-ground
points, the non-ground points representing bu...
Low-level appearance as well as spatio-temporal features, appropriately quantized and aggregated into Bag-of-Words (BoW) descriptors, have been shown to be effective in many detection and recognition tasks. However, their efficacy for complex event recognition in unconstrained videos has not been systematically evaluated. In this paper, we use the...
It is true that the teeth of man have been hurting for many thousands of years. The causes of the pain and effective methods to relieve or prevent the pain tend to follow traditional routes like random treatment, starting with materials of natural origin and then work by the aggressively curious, the intelligent and the innovative to better underst...
Photodynamic therapy (PDT), also known as photoradiation therapy, phototherapy, or photochemotherapy, involves the use of a photoactive dye (photosensitizer) that is activated by exposure to light of a specific wavelength in the presence of oxygen. PDT is a very promising and noninvasive treatment modality. At present, PDT is an alternative method of...
We study the feasibility of solving the challenging problem of geolocalizing ground level images in urban areas with respect to a database of images captured from the air such as satellite and oblique aerial images. We observe that comprehensive aerial image databases are widely available while complete coverage of urban areas from the ground is at...
We propose a novel similarity measure of two image sequences based on shapeme histograms. The idea of shapeme histogram has been used for single image/texture recognition, but is used here to solve the sequence-to-sequence matching problem. We develop techniques to represent each sequence as a set of shapeme histograms, which captures different var...
Analyzing change in the 3D structure of the optic disc over time has long been recognized as central to the diagnosis of glaucoma but has been inadequately addressed by computer vision methods. Currently, clinicians examine stereo pairs from different time instants for interval changes indicative of glaucoma. Due to the clinical procedures in captu...
We describe a vehicle tracking algorithm using input from a network of nonoverlapping cameras. Our algorithm is based on a novel statistical formulation that uses joint kinematic and image appearance information to link local tracks of the same vehicles into global tracks with longer persistence. The algorithm can handle significant spatial separat...
We present a novel LIDAR streaming architecture for real-time, on-board processing using unmanned robots. We propose a two-level 3D data structure that allows pipelined and streaming processing of the 3D data as it arrives from a moving robot: (i) at the coarse level, the incoming 3D scans are stored in memory in a dense 3D voxel grid with a relati...
To assess the suitability of digital stereo images for optic disc evaluations in glaucoma.
Stereo color optic disc images in both digital and 35-mm slide film formats were acquired contemporaneously from 29 subjects with various cup-to-disc ratios (range, 0.26-0.76; median, 0.475). Using a grading scale designed to assess image quality, the ease of...
Localizing blood vessels in eye images is a crucial step in the automated and objective diagnosis of eye diseases. Most previous research has focused on extracting the centerlines of vessels in large field of view images. However, for diagnosing diseases of the optic disk region, like glaucoma, small field of view images have to be analyzed. One ne...
We present an approach that uses detailed 3D models to detect and classify objects into fine levels of vehicle categories. Unlike other approaches that use silhouette information to fit a 3D model, our approach uses complete appearance from the image. Each 3D model has a set of salient location markers that are determined a priori. These salient loc...
This paper presents a joint probabilistic relation graph approach to simultaneously detect and track a large number of vehicles in low frame rate aerial videos. Due to low frame rate, low spatial resolution and sheer number of moving objects, detection and tracking in wide area video poses unique challenges. In this paper, we explore vehicle behavi...
We present a real-time pedestrian detection system based on structure and appearance classification. We discuss several novel ideas that contribute to having low-false alarms and high detection rates, while at the same time achieving computational efficiency: (i) At the front end of our system we employ stereo to detect pedestrians in 3D range maps...
We describe a real-time wide area surveillance system (WA-ACTV) for the automatic tracking of vessels using a network of PTZ cameras. The system is capable of optimally managing hundreds of PTZ cameras to simultaneously track a large number of vessels. The tracked vessels are fingerprinted using a scale-invariant part-based representation and subs...
We present a real-time pedestrian detection system which uses cues derived from structure and appearance classification. We discuss several novel ideas to achieve computational efficiency while improving on both detection and false-alarm rates: (i) At the front end of our system we employ stereo to detect pedestrians in 3D range maps, and to classif...
Gunshot recordings have the potential for both tactical detection and forensic evaluation particularly to ascertain information about the type of firearm and ammunition used. Perhaps the most significant challenge to such an analysis is the effect of recording conditions on the audio signature of recorded data. In this paper we present a first stud...
This paper presents a relational graph based approach to track thousands of vehicles from persistent wide area airborne surveillance (WAAS) videos. Because of the low ground sampling distance and low frame rate, vehicles are usually small and may travel a long distance between consecutive frames, so WAAS videos pose great challenges to correct assoc...
We present an on-the-move LIDAR-based object detection system for
autonomous and semi-autonomous unmanned vehicle systems. In this paper
we make several contributions: (i) we describe an algorithm for
real-time detection of objects such as doors and stairs in indoor
environments; (ii) we describe efficient data structures and algorithms
for process...
We present a system that improves accuracy of food intake assessment using computer vision techniques. Traditional dietetic methods suffer from either inaccurate assessment or complex lab measurement. Our solution is to use a mobile phone to capture images of foods, recognize food types, estimate their respective volumes and finally...
We propose a principled statistical approach for using 3D information and scene context to reduce the number of false positives in stereo based pedestrian detection. Current pedestrian detection algorithms have focused on improving the discriminability of 2D features that capture the pedestrian appearance, and on using various classifier architectu...
We propose a real-time action detection system based on a novel action representation and an effective learning method with a small training set. We represent actions with a new feature that measures the "global" distance from a set of action exemplars, where action exemplars are constructed from a vocabulary that encodes "local" instantaneous...
We apply a unique hierarchical audio classification technique to weapon identification using gunshot analysis. The audio classifier assigns each audio segment to one of ten weapon classes (e.g., 9mm, 22, shotgun, etc.) using low-complexity Gaussian Mixture Models (GMMs). The first level of the hierarchy consists of classification into broad weapons...
We describe a novel scalable approach for the management of a large number of Pan-Tilt-Zoom (PTZ) cameras deployed outdoors for persistent tracking of humans and vehicles, without resorting to the large fields of view of associated static cameras. Our system, Active Collaborative Tracking - Vision (ACT-Vision), is essentially a real-time operating...
We describe a low-cost vision-based sensing and positioning system that enables intelligent vehicles of the future to autonomously drive in an urban environment with traffic. The system was built by integrating Sarnoff's algorithms for driver awareness and vehicle safety with commercial off-the-shelf hardware on a robot vehicle. We implemented a mo...
In this paper, we present a fully integrated real-time computer vision system that can detect and track multiple humans in
a wide area using a network of stereo cameras. Continuous human identities are achieved by fusing video tracking with different
kinds of biometric devices. The system also provides immersive visualization which enables the use...
Predicting motion of humans, animals and other objects which move according to internal plans is a challenging problem. Most existing approaches operate in two stages: (a) learning typical motion patterns by observing an environment and (b) predicting ...
This paper presents an approach to extracting and using semantic layers from low altitude aerial videos for scene understanding and object tracking. The input video is captured by low flying aerial platforms and typically consists of strong parallax from non-ground-plane structures. A key aspect of our approach is the use of geo-registration of vid...
In this paper, we present a new feature to model a class of events that consist of complex interactions among multiple entities captured by tracks and inter-object relationships over space and time. Existing approaches represent these events using features that measure only pairwise relationships between entities at a time, such as relative distanc...
We present a novel building segmentation system for densely built areas, containing thousands of buildings per square kilometer. We employ solely sparse LIDAR (Light/Laser Detection Ranging) 3D data, captured from an aerial platform, with resolution less than one point per square meter. The goal of our work is to create segmented and delineated bui...
This paper proposes a novel approach to discover a set of class-specific "composite features" as the feature pool for the detection and classification of complex objects using AdaBoost. Each composite feature is constructed from the combination of multiple individual features. Unlike previous works that design features manually or with cert...
We propose a robust object recognition method based on approximate 3D models that can effectively match objects under large viewpoint changes and partial occlusion. The specific problem we solve is: given two views of an object, determine if the views are for the same or different object. Our domain of interest is vehicles, but the approach c...
In this paper, we study how to build a vision-based system for global localization with accuracies within 10 cm for robots and humans operating both indoors and outdoors over wide areas covering many square kilometers. In particular, we study the parameters of building a landmark database rapidly and utilizing that database online for real-tim...
This paper proposes a novel unsupervised algorithm learning discriminative features in the context of matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem, which aims to compute the probability of vehicle images from two distinct cameras being from the same vehicle...
We propose an efficient action retrieval system that is based on a novel action representation and an effective video matching method. We represent actions with a hierarchical encoding scheme that at low-level measures local body parts motions, which then evolves into encoding of instantaneous global body motions and finally high-level description...
Fingerprinting is the process of mapping content or fragments of it, into unique, discriminative hashes called fingerprints.
In this paper, we propose an automated video identification algorithm that employs fingerprinting for storing videos inside
its database. When queried using a degraded short video segment, the objective of the system is to re...
We present an action recognition scheme that integrates multiple modalities of cues, including shape, motion and depth, to
recognize human gestures in video sequences. In the proposed approach we extend the classification framework that is commonly
used in 2D object recognition to 3D spatio-temporal space for recognizing actions. Specifically, a boos...
Video cameras are no longer being used only in their traditional role of providing "viewable pixels", but are rapidly becoming sources of intelligent information about the world. More recently 3D cameras are being developed to directly provide 3D measurements of objects and scenes. Appearance and geometry of objects and scenes, and the temporal dyn...
This paper presents a novel framework, prototype embedding and embedding transition (PEET), for matching objects, especially vehicles, that undergo drastic pose, appearance, and even modality changes. The problem of matching objects seen under drastic variations is reduced to matching embeddings of object appearances instead of matching the object...
This paper addresses the problem of matching vehicles across multiple sightings under variations in illumination and camera poses. Since multiple observations of a vehicle are separated by large temporal and/or spatial gaps, prohibiting the use of standard frame-to-frame data association, we employ features extracted over a sequence during one...
Our goal is to create a visual odometry system for robots and wearable systems such that localization accuracies of centimeters can be obtained for hundreds of meters of distance traveled. Existing systems have achieved approximately a 1% to 5% localization error rate whereas our proposed system achieves close to 0.1% error rate, a ten-fold reducti...
This paper presents an action analysis method based on robust string matching using dynamic programming. Similar to matching
text sequences, atomic actions based on semantic and structural features are first detected and coded as spatio-temporal characters
or symbols. These symbols are subsequently concatenated to form a unique set of strings for e...
We propose a new method for rapid 3D object indexing that combines feature-based methods with coarse alignment-based matching techniques. Our approach achieves a sublinear complexity on the number of models, maintaining at the same time a high degree of performance for real 3D sensed data that is acquired in largely uncontrolled settings. The key c...
This paper describes a model-based 3D object recognition system, which
makes use of 3D data acquired by LIDAR sensors. The system is based on a
coarse-to-fine scheme for object indexing and verification to achieve
high efficiency and accuracy. The system employs rotationally invariant
semi-local spin image features for object representation and for...
This paper proposes a novel method to exploit model similarity in model-based 3D object recognition. The scenario consists
of a large 3D model database of vehicles, and rapid indexing and matching needs to be done without sequential model alignment.
In this scenario, the competition amongst shape features from similar models may pose serious challen...
When variational approaches are used to estimate optical flow between two frames, the flow discontinuities between different motion fields are usually not distinguished even when an anisotropic diffusion operator is applied. In this paper, we propose a multi-cue driven adaptive bilateral filter to regularize the flow computation, which is able to ach...
We present a novel approach for identifying 3D objects from a database of models, highly similar in shape, using range data
acquired in unconstrained settings from a limited number of viewing directions. We also address the challenging case
of identifying targets not present in the database. The method is based on learning offline saliency...
Histograms of shape signature or prototypical shapes, called shapemes, have been used effectively in previous work for 2D/3D shape matching and recognition. We extend the idea of shapeme histogram to recognize partially observed query objects from a database of complete model objects. We propose representing each model object as a collection of sha...
In this paper, we propose a robust heterogeneous feature based image alignment method that utilizes points, lines and regions in a unified framework. The image motion is decomposed into progressively complex components, i.e., translation, similarity, affine, and projective motion models, and alignment is obtained with deliberately selected suitab...
This paper proposes a novel approach for multi-view multi-pose object detection using discriminative shape-based exemplars. The key idea underlying this method is motivated by numerous previous observations that manually clustering multi-view multi-pose training data into different categories and then combining the separately trained two-class class...
Tracking objects over a long period of time in realistic environments remains a challenging problem for ground and aerial video surveillance. Matching objects and verifying their identities across multiple spatial and temporal gaps proves to be an effective way to extend tracking range. When an object track is lost due to occlusion or other reasons...
We propose a novel method for identifying road vehicles between two nonoverlapping cameras. The problem is formulated as a same-different classification problem: probability of two vehicle images from two distinct cameras being from the same vehicle or from different vehicles. The key idea is to compute the probability without matching the two vehi...
This paper proposes a method for matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem: the probability of two observations from two distinct cameras being from the same vehicle or from different vehicles. We employ a measurement vector consisting of three independent edg...
We present a fully integrated real-time system to track humans with a network of stereo sensors over a wide area. The processing includes single camera tracking and multi-camera fusion. Each single camera detects and tracks humans in its own view and a multi-camera fusion module combines all the local tracks of the same human into a global track. W...
Visual recognition of objects through multiple observations is an important component of object tracking. We address the problem of vehicle matching when multiple observations of a vehicle are separated in time such that frames of observations are not contiguous, thus prohibiting the use of standard frame-to-frame data association. We employ featur...
This paper proposes a joint feature-based model indexing and geometric constraint based alignment pipeline for efficient and accurate recognition of 3D objects from a large model database. Traditional approaches either first prune the model database using indexing without geometric alignment or directly perform recognition based alignment. The inde...
Histograms of shape signatures or prototypical shapes, called shapemes, have been used effectively in previous work for 2D/3D
shape matching and recognition. We extend the idea of shapeme histogram to recognize partially observed query objects from a
database of complete model objects. We propose to represent each model object as a collection of shapem...
Realistic and interactive telepresence has been a hot research topic in recent years. Enabling telepresence using depth-based new view rendering requires the compression and transmission of video as well as dynamic depth maps from multiple cameras. The telepresence application places additional requirements on the compressed representation of depth...
Video information can provide an inexpensive source of information about the world. For many applications such as surveillance,
situation awareness and navigation, the utility of this video information is increased if we are able to assign precise geocoordinates
to the pixels in the video acquired from an airborne platform. Many video-capture platf...
In a typical security and monitoring system a large number of networked cameras are installed at fixed positions around a site under surveillance. There is generally no global view or map that shows the guard how the views of different cameras relate to one another. Individual cameras may be equipped with pan, tilt and zoom capabilities, and the gu...
Reconstruction-based super-resolution from motion video has been an active area of study in computer vision and video analysis.
Image alignment is a key component of super-resolution algorithms. Almost all previous super-resolution algorithms have assumed that standard methods of image alignment can provide accurate enough alignment for creating su...
This article presents a complete approach for automated construction of mosaics from images and video, constituting a practical end-to-end system. Local alignment of spatially overlapping frames followed by global consistency provides spatial continuity, while compositing via multiresolution blending provides photometric continuity, so that the mos...
Decomposing video frames into coherent 2D motion layers is a powerful method for representing videos. Such a representation provides an intermediate description that enables applications such as object tracking, video summarization and visualization, video insertion, and sprite-based video compression. Previous work on motion layer analysis has lar...
View-based 3D video streaming requires the compression and transmission of depth maps along with the video sequence. The requirements for representing these depth maps include moderately high compression, preservation of depth discontinuities, low complexity decoding, and to be in a form that is suitable for real-time rendering using graphics cards...
Decomposing video frames into coherent two-dimensional motion layers is a powerful method for representing videos. Such a representation provides an intermediate description that enables applications such as object tracking, video summarization and visualization, video insertion, and sprite-based video compression. Previous work on motion layer ana...
Videos and 3D models have traditionally existed in separate worlds and as distinct representations. Although texture maps for 3D models have been traditionally derived from multiple still images, real-time mapping of live videos as textures on 3D models has not been attempted. This paper presents a system for rendering multiple live videos in real-...
Reconstruction-based super-resolution algorithms require very accurate alignment and good choice of filters to be effective. Often these requirements are hard to satisfy, for example, when we adopt optical flow as the motion model. In addition, the condition of having enough sub-samples may vary from pixel to pixel. We propose an alternative super-...