Norimichi Ukita

Nara Institute of Science and Technology, Ikoma, Nara, Japan

Publications (54) · 14.73 Total impact

  • Edilson de Aguiar, Norimichi Ukita
    ABSTRACT: We propose a new approach to represent and manipulate a mesh-based character animation preserving its time-varying details. Our method first decomposes the input mesh animation into coarse and fine deformation components. A model for the coarse deformations is constructed by an underlying kinematic skeleton structure and blending skinning weights. Thereafter, a non-linear probabilistic model is used to encode the fine time-varying details of the input animation. The user can manipulate the corresponding skeleton-based component of the input, which can be done by any standard animation package, and the final result is generated including its important time-varying details. By converting an input sample animation into our new hybrid representation, we are able to maintain the flexibility of mesh-based methods during animation creation while allowing for practical manipulations using the standard skeleton-based paradigm. We demonstrate the performance of our method by converting and manipulating several mesh animations generated by different performance capture approaches and apply it to represent and manipulate cloth simulation data.
    Computers & Graphics 02/2014; 38:10-17. DOI:10.1016/j.cag.2013.07.007 · 1.03 Impact Factor
  • Norimichi Ukita, Daniel Kaulen, Carsten Röcker
    ABSTRACT: The development of a widely applicable automatic motion coaching system requires one to address a lot of issues including motion capturing, motion analysis and comparison, error detection as well as error feedback. In order to cope with this complexity, most existing approaches focus on a specific motion sequence or exercise. As a first step towards the development of a more generic system, this paper systematically analyzes different error and feedback types. A prototype of a feedback system that addresses multiple modalities is presented. The system makes it possible to evaluate the applicability of the proposed feedback techniques for arbitrary types of motions in a next step.
    International Conference on Physiological Computing Systems, Lisbon, Portugal; 01/2014
  • Norimichi Ukita, Atsushi Nakazawa
    ABSTRACT: This paper discusses the usefulness of human body-parts tracking for acquiring subtle cues in social interactions. While many kinds of body-parts tracking algorithms have been proposed, we focus on particle filtering-based tracking using prior models, which has several advantages for research on social interactions. As a first step for extracting subtle cues from videos of social interaction behaviors, the advantages, disadvantages, and prospective properties of body-parts tracking using prior models are summarized with actual results.
    2013 IEEE International Conference on Computer Vision Workshops (ICCVW); 12/2013
  • Norimichi Ukita
    ABSTRACT: This paper proposes human motion models of multiple actions for 3D pose tracking. A training pose sequence of each action, such as walking and jogging, is separately recorded by a motion capture system and modeled independently. This independent modeling of action-specific motions allows us 1) to optimize each model in accordance with only its respective motion and 2) to improve the scalability of the models. Unlike existing approaches with similar motion models (e.g. switching dynamical models), our pose tracking method uses the multiple models simultaneously for coping with ambiguous motions. For robust tracking with the multiple models, particle filtering is employed so that particles are distributed simultaneously in the models. Efficient use of the particles can be achieved by locating many particles in the model corresponding to an action that is currently observed. For transferring the particles among the models in quick response to changes in the action, transition paths are synthesized between the different models in order to virtually prepare inter-action motions. Experimental results demonstrate that the proposed models improve accuracy in pose tracking.
    Image and Vision Computing 06/2013; 31(6–7):448–459. DOI:10.1016/j.imavis.2012.09.010 · 1.58 Impact Factor
  • Norimichi Ukita, Shigenobu Fujine, Norihiro Hagita
    ABSTRACT: We have developed a system with multiple pan-tilt cameras for capturing high-resolution videos of a moving person. This system controls the cameras so that each camera captures the best view of the person (i.e. one of the body parts such as the head, torso, and limbs) based on criteria for camera-work optimization. For achieving this optimization in real time, time-consuming pre-processes, which give useful clues for the optimization, are performed in a training stage. Specifically, a target performance (e.g. a dance) is captured to acquire the configuration of the body parts at each frame. In a real capture stage, the system compares an online-reconstructed shape with those in the training data for fast retrieval of the configuration of the body parts. The retrieved configuration is used by an efficient scheme for optimizing the camera work. Experimental results show the camera work optimized in accordance with given criteria. High-resolution 3D videos produced by the proposed system are also shown as a typical use of high-resolution videos.
    Proceedings of the 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission; 10/2012
  • Norimichi Ukita, Takeo Kanade
    ABSTRACT: We propose a multiview method for reconstructing a folded cloth surface on which regularly-textured color patches are printed. These patches provide not only easy pixel-correspondence between multiviews but also the following two new functions. (1) Error recovery: errors in 3D surface reconstruction (e.g. errors in occlusion boundaries and shaded regions) can be recovered based on the spatio-temporal consistency of the patches. (2) Single-view hole filling: patches that are visible only from a single view can be extrapolated from the reconstructed ones based on the regularity of the patches. Using these functions for improving 3D reconstruction also produces the patch configuration on the reconstructed surface, showing how the cloth is deformed from its reference shape. Experimental results demonstrate the above improvements and the accurate patch configurations produced by our method.
    Computer Vision and Image Understanding 08/2012; 116(8):869–881. DOI:10.1016/j.cviu.2012.04.001 · 1.36 Impact Factor
  • Kazuki Matsuda, Norimichi Ukita
    ABSTRACT: This paper proposes a method for reconstructing a smooth and accurate 3D surface. Recent machine vision techniques can reconstruct accurate 3D points and normals of an object. The reconstructed point cloud is used for generating its 3D surface by surface reconstruction. The more accurate the point cloud, the more correct the surface becomes. To improve the surface, we propose a way to integrate the advantages of existing techniques for point reconstruction. Specifically, robust and dense reconstruction with Shape-from-Silhouettes (SfS) and accurate stereo reconstruction are integrated. Unlike gradual shape shrinking by space carving, our method obtains 3D points by SfS and stereo independently and accepts the correctly reconstructed points. Experimental results show the improvement by our method.
    IEICE Transactions on Information and Systems 07/2012; E95.D(7):1811-1818. DOI:10.1587/transinf.E95.D.1811 · 0.19 Impact Factor
  • N. Ukita
    ABSTRACT: This paper proposes contour-based features for articulated pose estimation. Most recent methods are designed using tree-structured models with appearance evaluation only within the region of each part. While these models allow us to speed up global optimization in localizing all the parts, useful appearance cues between neighboring parts are missing. Our work focuses on how to evaluate parts connectivity using contour cues. Unlike previous works, we locally evaluate parts connectivity only along the orientation between neighboring parts, within the region where they overlap. This adaptive localization of the features is required for suppressing bad effects due to nuisance edges such as those of background clutter and clothing textures, as well as for reducing computational cost. Discriminative training of the contour features further improves estimation accuracy. Experimental results verify the effectiveness of our contour-based features.
    Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on; 06/2012
  • Norimichi Ukita, Takeo Kanade
    ABSTRACT: We propose a unified model for human motion prior with multiple actions. Our model is generated from sample pose sequences of the multiple actions, each of which is recorded from real human motion. The sample sequences are connected to each other by synthesizing a variety of possible transitions among the different actions. For kinematically-realistic transitions, our model integrates nonlinear probabilistic latent modeling of the samples and interpolation-based synthesis of the transition paths. While naive interpolation makes unexpected poses, our model rejects them (1) by searching for smooth and short transition paths by employing the good properties of the observation and latent spaces and (2) by avoiding using samples that unexpectedly synthesize the nonsmooth interpolation. The effectiveness of the model is demonstrated with real data and its application to human pose tracking.
    Computer Vision and Image Understanding 04/2012; 116(4):500-509. DOI:10.1016/j.cviu.2011.11.005 · 1.36 Impact Factor
  • Source
    ABSTRACT: We propose a method for calibrating the topology of distributed pan-tilt cameras (i.e. the structure of routes among and within FOVs) and its probabilistic model. To observe as many objects as possible for as long as possible, pan-tilt control is an important issue in automatic calibration as well as in tracking. In a calibration period, each camera should be controlled towards an object that goes through an unreliable route whose topology is not calibrated yet. This camera control allows us to efficiently establish the topology model. After the topology model is established, the camera should be directed towards the route with the biggest possibility of object observation. We propose a camera control framework based on the mixture of the reliability of the estimated routes and the probability of object observation. This framework is applicable both to camera calibration and object tracking by adjusting weight variables. Experiments demonstrate the efficiency of our camera control scheme for establishing the camera topology model and tracking objects as long as possible.
    IEICE Transactions on Information and Systems 02/2012; E95.D(2):626-635. DOI:10.1587/transinf.E95.D.626 · 0.19 Impact Factor
  • Edilson de Aguiar, Norimichi Ukita
    ABSTRACT: We propose a new approach to represent and manipulate a mesh-based character animation preserving its time-varying details. Our method first decomposes the input mesh animation into coarse and fine deformation components. A model for the coarse deformations is constructed by an underlying kinematic skeleton structure and blending skinning weights. Thereafter, a non-linear probabilistic model is used to encode the fine time-varying details of the input animation. The user can manipulate the corresponding skeleton-based component of the input, which can be done by any standard animation package, and the final result is generated including its important time-varying details. By converting an input sample animation into our new hybrid representation, we are able to maintain the flexibility of mesh-based methods during animation creation while allowing for practical manipulations using the standard skeleton-based paradigm. We demonstrate the performance of our method by converting and editing several mesh animations generated by different performance capture approaches.
    Graphics, Patterns and Images (SIBGRAPI), 2012 25th SIBGRAPI Conference on; 01/2012
  • N. Ukita, K. Matsuda, N. Hagita
    ABSTRACT: This paper proposes a method for reconstructing accurate 3D surface points. To this end, robust and dense reconstruction with Shape-from-Silhouettes (SfS) and accurate multiview stereo are integrated. Unlike gradual shape shrinking and/or bruteforce large space search by existing space carving approaches, our method obtains 3D points by SfS and stereo independently, and then selects correct ones from them. The point selection is achieved in accordance with spatial consistency and smoothness of 3D point coordinates and normals. The globally optimized points are selected by graph-cuts. Experimental results demonstrate that our method outperforms existing approaches.
    Pattern Recognition (ICPR), 2012 21st International Conference on; 01/2012
  • M. Hirai, N. Ukita, M. Kidode
    ABSTRACT: We present a real-time method for estimating the pose of a human body using its 3D volume obtained from synchronized videos. The method achieves pose estimation by pose regression from its 3D volume. While the 3D volume allows us to estimate the pose robustly against self-occlusions, 3D volume analysis incurs a large computational cost. We propose fast and stable volume tracking with efficient volume representation in a low-dimensional dynamical model. Experimental results demonstrated that pose estimation of a body with significantly deformable clothing could run at around 60 fps.
    Pattern Recognition (ICPR), 2010 20th International Conference on; 09/2010
  • Norimichi Ukita, Akira Makino, Masatsugu Kidode
    ABSTRACT: In this research, we focus on how to track a target region that lies next to similar regions (e.g. a forearm and an upper arm) in zoom-in images. Many previous tracking methods express the target region (i.e. a part in a human body) with a single model such as an ellipse, a rectangle, and a deformable closed region. With the single model, however, it is difficult to track the target region in zoom-in images without confusing it and its neighboring similar regions (e.g. "a forearm and an upper arm" and "a small region in a torso and its neighboring regions") because they might have the same texture patterns and do not have the detectable border between them. In our method, a group of feature points in a target region is extracted and tracked as the model of the target. Small differences between the neighboring regions can be verified by focusing only on the feature points. In addition, (1) the stability of tracking is improved using particle filtering and (2) tracking robust to occlusions is realized by removing unreliable points using random sampling. Experimental results demonstrate the effectiveness of our method even when occlusions occur.
    IEICE Transactions on Information and Systems 07/2010; E93.D(7):1682-1689. DOI:10.1587/transinf.E93.D.1682 · 0.19 Impact Factor
  • Norimichi Ukita, Michiro Hirai, Masatsugu Kidode
    ABSTRACT: We propose a method for estimating the pose of a human body using its approximate 3D volume (visual hull) obtained in real time from synchronized videos. Our method can cope with loose-fitting clothing, which hides the human body and produces non-rigid motions and critical reconstruction errors, as well as tight-fitting clothing. To follow the shape variations robustly against erratic motions and the ambiguity between a reconstructed body shape and its pose, the probabilistic dynamical model of human volumes is learned from training temporal volumes refined by error correction. The dynamical model of a body pose (joint angles) is also learned with its corresponding volume. By comparing the volume model with an input visual hull and regressing its pose from the pose model, pose estimation can be realized. In our method, this is improved by double volume comparison: 1) comparison in a low-dimensional latent space with probabilistic volume models and 2) comparison in an observation volume space using geometric constraints between a real volume and a visual hull. Comparative experiments demonstrate that our method is effective and faster than existing methods.
    Computer Vision, 2009 IEEE 12th International Conference on; 11/2009
  • N. Ukita, K. Terashita, M. Kidode
    ABSTRACT: We propose a method for calibrating the topology of distributed pan-tilt cameras (i.e., the structure of routes among FOVs) and its probabilistic model, which is useful for multi-object tracking in a wide area. To observe as many objects as possible for as long as possible, pan-tilt control is an important issue in automatic calibration as well as in tracking. If only one object is observed by a camera and its neighboring cameras, the camera should point towards this object both in the calibration and tracking periods. However, if there are multiple objects, in the calibration period, the camera should be controlled towards an object that goes through an unreliable route in which a sufficient number of object detection results have not been observed. This control allows us to efficiently establish the reliable topology model. After the reliable topology model is established, on the other hand, the camera should be directed towards the route with the biggest possibility of object observation. We therefore propose a camera control framework based on the mixture of the reliability of the estimated routes and the probability of object observation. This framework is applicable both to camera calibration and object tracking by adjusting weight variables. Experiments demonstrate the efficiency of our camera control scheme for establishing the camera topology model and tracking objects as long as possible.
    Distributed Smart Cameras, 2009. ICDSC 2009. Third ACM/IEEE International Conference on; 10/2009
  • Source
    ABSTRACT: This paper proposes a free-viewpoint imaging method that can be used in a complicated scene such as an office room by using sparsely located cameras. In our method, a free-viewpoint image is generated from multiple image patches obtained by dividing observed images. The quality of the generated image strongly depends on how to divide the observed images. In an incorrect patch in the generated image, the images projected from different cameras differ significantly. With this property, the incorrect patches can be detected. These patches are then re-divided. We demonstrated the effectiveness of our method by generating free-viewpoint images from the real images observed by the cameras in an office room.
    Pattern Recognition, 2008. ICPR 2008. 19th International Conference on; 01/2009
  • Naoko Enami, Norimichi Ukita, Masatsugu Kidode
    ABSTRACT: In this paper, we propose a matching method for images captured at different times and under different capturing conditions. Our method is designed for change detection in streetscapes using ordinary automobiles that have an off-the-shelf car-mounted camera and a GPS. Therefore, we should analyze low-resolution and low frame-rate images captured asynchronously. To cope with this difficulty, previous and current panoramic images are created from sequential images which are rectified based on the view direction of a camera, and then compared. In addition, in order to allow the matching method to be applicable to images captured under varying conditions, (1) for different lanes, enlarged/reduced panoramic images are compared with each other, and (2) robustness to noise and changes in illumination is improved by the edge features. To confirm the effectiveness of the proposed method, we conducted experiments matching real images captured under various capturing conditions.
    Distributed Smart Cameras, 2008. ICDSC 2008. Second ACM/IEEE International Conference on; 10/2008
  • Norimichi Ukita, Ryosuke Tsuji, Masatsugu Kidode
    ABSTRACT: We propose a real-time method for simultaneously refining the reconstructed volume of a human body with loose-fitting clothing and identifying body-parts in it. Time-series volumes, which are acquired by a slow but sophisticated 3D reconstruction algorithm, with body-part labels are obtained offline. The time-series sample volumes are represented by trajectories in the eigenspaces using PCA. An input visual hull reconstructed online is projected into the eigenspace and compared with the trajectories in order to find similar high-precision samples with body-part labels. The hierarchical search taking into account 3D reconstruction errors can achieve robust and fast matching. Experimental results demonstrate that our method can refine the input visual hull including loose-fitting clothing and identify its body-parts in real time.
    Computer Vision - ECCV 2008, 10th European Conference on Computer Vision, Marseille, France, October 12-18, 2008, Proceedings, Part III; 01/2008
  • Source
    ABSTRACT: This paper proposes a method for precise overlapping of projected images from multiple steerable projectors. When they are controlled simultaneously, two problems are revealed: (1) even a slight positional error of the projected image, which does not matter in the case of a single projector, causes misalignments of multiple projected images that can be perceived clearly when using multiple projectors; and (2) as the projectors usually do not have architectures for their synchronization, it is impossible to display a moving image that is formed by precisely tiling or overlaying the multiple projected images. To overcome (1), a method is proposed that measures the misalignments in advance on every plane in the environment, and hence displays the image without the misalignment. For (2), a consideration and a new proposal for the synchronization of multiple projectors are also discussed.
    2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 18-23 June 2007, Minneapolis, Minnesota, USA; 06/2007

Publication Stats

291 Citations
14.73 Total Impact Points

Institutions

  • 2003–2013
    • Nara Institute of Science and Technology
      • Graduate School of Information Science
      Ikoma, Nara, Japan
  • 2009
    • Carnegie Mellon University
      Pittsburgh, Pennsylvania, United States
  • 2000–2005
    • Kyoto University
      • Graduate School of Informatics
      • Department of Intelligence Sciences and Technology
      Kyoto, Kyoto, Japan