About
Publications: 424
Reads: 40,715
Citations: 23,588
Publications (424)
We consider scenarios where we have zero instances of real pedestrian data (e.g., a newly installed surveillance system in a novel location in which no labeled real data or unsupervised real data exists yet) and a pedestrian detector must be developed prior to any observations of pedestrians. Given a single image and auxiliary scene information in...
We propose an Ensemble of Robust Constrained Local Models for alignment of faces in the presence of significant occlusions and of any unknown pose and expression. To account for partial occlusions, we introduce Robust Constrained Local Models, which comprise a deformable shape and local landmark appearance model and reason over binary occlusion...
We introduce the concept of a Visual Compiler that generates a scene specific pedestrian detector and pose estimator without any pedestrian observations. Given a single image and auxiliary scene information in the form of camera parameters and geometric layout of the scene, the Visual Compiler first infers geometrically and photometrically accurate...
A projector manipulates outgoing light rays, while a camera records incoming ones. Combining these optically inverse devices, especially in a coaxial manner, creates the possibility of a new computer-vision technology. The "Smart Headlight," currently under development at Carnegie Mellon's Robotics Institute, is one example: a device that can "eras...
This paper describes the development of the IAS-Society and the trends in the Intelligent Autonomous Systems conferences. The first IAS conference was held in 1986 and was the first conference on this topic. The Society, founded in 1994, laid the basis for the organization of the IAS conferences. The topics presented at the successive IAS conferenc...
We propose a mechanism that exploits the singular configuration in a closed-loop four-bar linkage that can produce a high impulsive torque (a high torque for a short period in time) at the start of motion and high angular velocity during the successive motion. Such characteristics make the mechanism suitable for executing with high energy efficienc...
Face alignment is the problem of automatically locating detailed facial landmarks across different subjects, illuminations, and viewpoints. Previous methods can be divided into two broad categories. 2D-based methods locate a relatively small number of 2D fiducial points in real time while 3D-based methods fit a high-resolution 3D model offline at a...
This paper examines the capturability of a guidance law that provides a desired angular acceleration of the heading angle to a pursuer. The guidance law was originally proposed by Fajen and Warren [13] to explain the behavior of walking humans. It is shown that the relative rotational motion between a pursuer and an evader can be repr...
Achieving sub-pixel accuracy with face alignment algorithms is a difficult task given the diversity of appearance in real world facial profiles. To capture variations in perspective, occlusion, and illumination with adequate precision, current face alignment approaches rely on detecting facial landmarks and iteratively adjusting deformable models t...
Object representation is useful for many computer vision tasks, such as object detection, recognition, and tracking. Computer vision tasks must handle situations where unknown objects appear and must detect and track some object which is not in the trained database. In such cases, the system must learn or, otherwise derive, descriptions of new obje...
The primary goal of an automotive headlight is to improve safety in low light and poor weather conditions. But, despite decades of innovation on light sources, more than half of accidents occur at night even with less traffic on the road. Recent developments in adaptive lighting have addressed some limitations of standard headlights; however, they...
The primary goal of a vehicular headlight is to improve safety in low-light and poor weather conditions. The typical headlight, however, has very limited flexibility - switching between high and low beams, turning off beams toward the opposing lane or rotating the beam as the vehicle turns - and is not designed for all driving environments. Thus, des...
A close relationship exists between the advancement of face recognition algorithms and the availability of face databases that vary, in a controlled manner, the factors affecting facial appearance. The CMU PIE database has been very influential in advancing research in face recognition across pose and illumination. Despite its success, the PIE database ha...
This paper presents a novel stereo-based visual odometry approach that provides state-of-the-art results in real time, both indoors and outdoors. Our proposed method follows the procedure of computing optical flow and stereo disparity to minimize the re-projection error of tracked feature points. However, instead of following the traditional appro...
Objects in a real world image cannot have arbitrary appearance, sizes and locations due to geometric constraints in 3D space. Such a 3D geometric context plays an important role in resolving visual ambiguities and achieving coherent object detection. In this paper, we develop a RANSAC-CRF framework to detect objects that are geometrically coherent...
Finding satisfactory scientific literature is still a very time-consuming task. In the last decade, several tools have been proposed to approach this task; however, only a few of them actually analyse the whole document in order to select and present it to the user, and even fewer tools offer any kind of explanation of why a given item was retrieved/reco...
Alignment of 3D objects from 2D images is one of the most important and well studied problems in computer vision. A typical object alignment system consists of a landmark appearance model which is used to obtain an initial shape and a shape model which refines this initial shape by correcting the initialization errors. Since errors in landmark init...
The human body is structurally symmetric. Tracking-by-detection approaches for human pose suffer from double counting, where the same image evidence is used to explain two separate but symmetric parts, such as the left and right feet. Double counting, if left unaddressed, can critically affect subsequent processes, such as action recognition, afford...
In sports, wearable gaze tracking devices can enrich the viewer experience and be a powerful training tool. Because devices can be used for long periods of time, often outside, it is desirable that they do not use active illumination (infra-red light sources) for safety reasons and to minimize the interference of the sun. Unlike traditional wearabl...
An urban operation of unmanned aerial vehicles (UAVs) demands a high level of autonomy for tasks presented in a cluttered environment. While fixed-wing UAVs have been well suited for long-endurance missions at a high altitude, their navigation inside an urban area brings more challenges in motion planning and control. The inability to hover and low...
This paper presents a convenient self-calibration method for an inertial measurement unit (IMU) using matrix factorization. Using limited information about applied loads (accelerations or angular rates) available from natural references, the proposed method can linearly solve all the parameters of an IMU in any configuration of its inertial compone...
The main obstacles for a straightforward use of association rules as candidate business rules are the excessive number of rules discovered even on small datasets, and the fact that contradicting rules are generated. This paper shows that Association Rule Classification algorithms, such as CBA, solve both these problems, and provides a practical gui...
Wearable devices with gaze tracking can assist users in many daily-life tasks. When used for extended periods of time, it is desirable that such devices do not employ active illumination for safety reasons and to minimize interference from other light sources such as the sun. Most non active-illumination methods for gaze tracking attempt to locate...
Knowing how well an activity is performed is important for home rehabilitation. We would like to not only know if a motion is being performed correctly, but also in what way the motion is incorrect so that we may provide feedback to the user. This paper describes methods for assessing human motion quality using body-worn tri-axial accelerometers an...
We propose a multiview method for reconstructing a folded cloth surface on which regularly-textured color patches are printed. These patches provide not only easy pixel-correspondence between multiviews but also the following two new functions. (1) Error recovery: errors in 3D surface reconstruction (e.g. errors in occlusion boundaries and shaded r...
The articles in this special issue focus on advancements in quality of life technology, or QoLT. This is defined as intelligent systems that augment body and mind functions for self-determination for older adults and people with disabilities. QoLT systems can take many forms: they could be a device that a person carries or wears, a mobile system th...
For understanding the behavior, intent, and environment of a person, the surveillance metaphor is traditional; that is, install cameras and observe the subject, and his/her interaction with other people and the environment. Instead, we argue that the first-person vision (FPV), which senses the environment and the subject's activities from a wearabl...
Within the task of collaborative filtering, two challenges for computing conditional probabilities exist. First, the amount of training data available is typically sparse with respect to the size of the domain. Thus, support for higher-order interactions is generally not present. Second, the variables that we are conditioning upon vary for each quer...
This paper presents the capturability analysis of a three-dimensional guidance law that provides a desired angular acceleration of the heading angle to a pursuer. The relative rotational motion between a pursuer and an evader is represented by an equation similar to the equation of motion for a spherical pendulum with a disturbance. A set of suffic...
Autonomous vehicles must be capable of localizing even in GPS denied situations. In this paper, we propose a real-time method to localize a vehicle along a route using visual imagery or range information. Our approach is an implementation of topometric localization, which combines the robustness of topological localization with the geometric accura...
We propose a unified model for human motion prior with multiple actions. Our model is generated from sample pose sequences of the multiple actions, each of which is recorded from real human motion. The sample sequences are connected to each other by synthesizing a variety of possible transitions among the different actions. For kinematically-realis...
During low-light conditions, drivers rely mainly on headlights to improve visibility. But in the presence of rain and snow, headlights can paradoxically reduce visibility due to light reflected off of precipitation back towards the driver. Precipitation also scatters light across a wide range of angles that disrupts the vision of drivers in oncomin...
A wearable gaze tracking device can assist users in daily life. For long-term use, a non-active method that does not employ an infrared illumination system is desirable from a safety standpoint. It is well known that eye model constraints substantially improve the accuracy and robustness of gaze estimation. However, the eye model needs to b...
Therapies using adult stem cells often require mechanical manipulation such as injection or incorporation into scaffolds. However, force-induced rupture and mechanosensitivity of cells during manipulation is largely ignored. Here, we image cell mechanical structures and perform a biophysical characterization of three different types of human adult...
Our aim is to utilize smart surfaces to separate specific populations of cells from a heterogeneous sample. There are a number of diseases (e.g., malaria and various cancers) that alter the elasticity of biological cells. In this work, we use the mechanical stiffness of the cells as a key parameter since it can reveal the presence of disease. By in...
Service-based applications have become more and more multi-layered in nature, as we tend to build software as a service on top of infrastructure as a service. Most existing SOA monitoring and adaptation techniques address layer-specific issues. These techniques, if used in isolation, cannot deal with real-world domains, where changes in one layer o...
Cell separation technology is a key tool for biological studies and medical diagnostics that relies primarily on chemical labeling to identify particular phenotypes. An emergent method of sorting cells based on differential rolling on chemically patterned substrates holds potential benefits over existing technologies, but the underlying mechanisms...
The saliency of regions or objects in an image can be significantly boosted if they recur in multiple images. Leveraging this idea, cosegmentation jointly segments common regions from multiple images. In this paper, we propose CoSand, a distributed cosegmentation approach for a highly variable large-scale image collection. The segmentation task is...
We propose an approach to identify and segment objects from scenes that a person (or robot) encounters in Activities of Daily Living (ADL). Images collected in those cluttered scenes contain multiple objects. Each image provides only a partial, possibly very different view of each object. An object instance discovery program must be able to link pi...
Gaze estimation is a key technology to understand a person's interests and intents, and it is becoming more popular in daily situations such as driving scenarios. Wearable gaze estimation devices are used for long periods of time; therefore, active light sources are not desirable from a safety point of view. Gaze estimation that does not rely on active...
Lidar and visual imagery have been broadly utilized in computer vision and mobile robotics applications because these sensors provide complementary information. However, in order to convert data between the local coordinate systems, we must estimate the rigid body transformation between the sensors. In this paper, we propose a robust-weighted extr...
Existing approaches to nonrigid structure from motion assume that the instantaneous 3D shape of a deforming object is a linear combination of basis shapes. These bases are object dependent and therefore have to be estimated anew for each video sequence. In contrast, we propose a dual approach to describe the evolving 3D structure in trajectory spac...
We propose a method for geometric calibration of an active vision system, composed of a projector and a camera, using structured light projection. Unlike existing methods of self-calibration for projector-camera systems, our method estimates the intrinsic parameters of both the projector and the camera as well as extrinsic parameters except a globa...
One of the fundamental requirements of an autonomous vehicle is the ability to determine its location on a map. Frequently, solutions to this localization problem rely on GPS information or use expensive three dimensional (3D) sensors. In this paper, we describe a method for long-term vehicle localization based on visual features alone. Our approac...
In this paper, we discuss adaptive control of a space robot system with an attitude-controlled base to which the robot is attached. We first derive the system kinematic and dynamic equations based on Lagrangian dynamics and the linear momentum conservation law. Using the dynamic model developed, we discuss the problem of linear parameterization in t...
The fusion of stereo and laser range finders (LIDARs) has been proposed as a method to compensate for each individual sensor's deficiencies - stereo output is dense, but noisy for large distances, while LIDAR is more accurate, but sparse. However, stereo usually performs poorly on textureless areas and on scenes containing repetitive structures, an...
The fast and accurate computation of surface normals from a point cloud is a critical step for many 3D robotics and automotive problems, including terrain estimation, mapping, navigation, object segmentation, and object recognition. To obtain the tangent plane to the surface at a point, the traditional approach applies total least squares to its sm...
We present a drift-free attitude estimation method that uses image line segments for the correction of accumulated errors in integrated gyro rates when an unmanned aerial vehicle (UAV) operates in urban areas. Since man-made environments generally exhibit strong regularity in structure, a set of line segments that are either parallel or orthogonal...
In this paper we present an approach to robustly align facial features to a face image even when the face is partially occluded. Previous methods are vulnerable to partial occlusion of the face, since it is assumed, explicitly or implicitly, that there is no significant occlusion. In order to cope with this difficulty, our approach relies on two sc...
This paper presents a short-baseline real-time stereo vision system that is capable of the simultaneous and robust estimation of the ego-motion and of the 3D structure and the independent motion of thousands of points of the environment. Kalman filters estimate the position and velocity of world points in 3D Euclidean space. The six degrees of freedom...
Recently, techniques for measuring and modeling the human body have been attracting attention, because human models are useful for ergonomic design in manufacturing. We aim to accurately measure the shape of the human foot, which will be useful for the design of shoes. For this purpose, shape measurement of the foot in motion is obviously important, because foot shape in the...
We have developed a system for image-based rendering of real-world objects on web-browsers. The light field of the object is reconstructed from uncalibrated images using structure-from-motion techniques. The reconstructed model is then rendered on web-browsers using our algorithm of approximated perspective transformation, which allows a user to br...
This paper discusses the advantages of singular configurations of a two-link robot arm in achieving tasks of pulling or lifting a heavy object. Optimal base location and arm motion for minimizing the joint torques are examined by numerical simulations, and the base location where the robot arm is near a singular configuration at the start time of t...
We address the problem of interactive search for a target of interest in surveillance imagery. Our solution consists of iteratively learning a distance metric for retrieval, based on user feedback. The approach employs (retrieval) rank based constraints and convex optimization to efficiently learn the distance metric. The algorithm uses both user l...
This paper evaluates the dynamic and kinematic properties of a prismatic mechanism and shows its capabilities in performing home manipulation tasks when integrated into a robotic arm. Our design is motivated by the observation that human hand motions often follow a linear trajectory when manipulating everyday objects. We present the mechanical de...
We propose a singularity-based mechanism (SBM) to exploit the singular configuration that improves the angular acceleration instead of constraining the movement. The tradeoff between the responsiveness and the range of motion is achieved by varying a length of linkage in the SBM. In this paper, we clarify the responsiveness of the SBM using the dyn...
In previous work, we have developed a “Glance-Look” model, which has replicated a broad profile of data on the semantic Attentional Blink (AB) task and characterized how attention deployment is modulated by emotion. The model relies on a distinction between two levels of meaning: implicational and propositional, which are supported by two correspon...
In this paper, we describe methods for assessment of exercise quality using body-worn tri-axial accelerometers. We assess exercise quality by building a classifier that labels incorrect exercises. The incorrect performances are divided into a number of classes of errors as defined by a physical therapist. We focus on exercises commonly prescribed f...
We present a multi-layered display that uses water drops as voxels. Water drops refract most incident light, making them excellent wide-angle lenses. Each 2D layer of our display can exhibit arbitrary visual content, creating a layered-depth (2.5D) display. Our system consists of a single projector-camera system and a set of linear drop generator m...
Detecting the boundaries of objects is a key step in separating foreground objects from the background, which is useful for robotics and computer vision applications, such as object detection, recognition, and tracking. We propose a new method for detecting object boundaries using planar laser scanners (LIDARs) and, optionally, co-registered imager...
Detecting and segmenting cell regions in microscopic images is a challenging task, because cells typically do not have rich features, and their shapes and appearances are highly irregular and flexible. Furthermore, cells often form clusters, rendering the existing joint detection and segmentation algorithms unable to segment out individual cells. W...
We propose a fully-automated mitosis event detector using hidden conditional random fields for cell populations imaged with time-lapse phase contrast microscopy. The method consists of two stages that jointly optimize recall and precision. First, we apply model-based microscopy image preconditioning and volumetric segmentation to identify candidate...
There has been a recent push in extraction of 3D spatial layout of scenes. However, none of these approaches model the 3D interaction between objects and the spatial layout. In this paper, we argue for a parametric representation of objects in 3D, which allows us to incorporate volumetric constraints of the physical world. We show that augmenting c...
Dynamic weather such as rain and snow causes complex spatio-temporal intensity fluctuations in videos. Such fluctuations can adversely impact vision systems that rely on small image features for tracking, object detection and recognition. While these effects appear to be chaotic in space and time, we show that dynamic weather has a predictable glo...
The Animals to Animats Conference brings together researchers from ethology, psychology, ecology, artificial intelligence, artificial life, robotics, engineering, and related fields to further understanding of the behaviors and underlying mechanisms that allow natural and synthetic agents (animats) to adapt and survive in uncertain environments. Th...