Gaussian Process Dynamical Models for Human Motion

Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, Ontario M5S 2E4 Canada.
IEEE Transactions on Pattern Analysis and Machine Intelligence (Impact Factor: 5.78). 03/2008; 30(2):283-98. DOI: 10.1109/TPAMI.2007.1167
Source: PubMed


We introduce Gaussian process dynamical models (GPDMs) for nonlinear time series analysis, with applications to learning models of human pose and motion from high-dimensional motion capture data. A GPDM is a latent variable model. It comprises a low-dimensional latent space with associated dynamics, as well as a map from the latent space to an observation space. We marginalize out the model parameters in closed form by using Gaussian process priors for both the dynamics and the observation mappings. This results in a nonparametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach and compare four learning algorithms on human motion capture data, in which each pose is 50-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces.
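
The structure described in the abstract, a GP prior on the observation mapping plus a GP prior on the latent dynamics, can be illustrated with a rough sketch of the resulting log density. This is not the authors' implementation: the function names (`rbf_kernel`, `gp_marginal_log_lik`, `gpdm_log_joint`), the use of a plain RBF kernel for both mappings, and the fixed hyperparameters are all assumptions made for illustration, and the sketch omits the initial-state prior, any per-dimension output scaling, and the hyperparameter priors that a full model would include.

```python
import numpy as np

def rbf_kernel(A, B, variance=1.0, lengthscale=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_marginal_log_lik(inputs, targets, noise=1e-2):
    """Log marginal likelihood of `targets` under a zero-mean GP on `inputs`.

    The mapping parameters are integrated out in closed form; each target
    dimension is treated as an independent GP sharing one kernel matrix.
    """
    n, d = targets.shape
    K = rbf_kernel(inputs, inputs) + noise * np.eye(n)
    L = np.linalg.cholesky(K)                    # K = L L^T
    alpha = np.linalg.solve(L, targets)          # L^{-1} targets
    log_det = 2.0 * np.sum(np.log(np.diag(L)))   # log |K|
    quad = np.sum(alpha**2)                      # tr(targets^T K^{-1} targets)
    return -0.5 * (d * log_det + quad + n * d * np.log(2.0 * np.pi))

def gpdm_log_joint(X, Y):
    """Simplified log joint: observation term plus first-order dynamics term.

    X: (N, q) latent trajectory, Y: (N, D) observed poses.
    """
    obs = gp_marginal_log_lik(X, Y)           # latent positions -> poses
    dyn = gp_marginal_log_lik(X[:-1], X[1:])  # x_{t-1} -> x_t
    return obs + dyn

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))    # hypothetical 3-D latent trajectory
    Y = rng.normal(size=(20, 50))   # hypothetical 50-D poses
    print(gpdm_log_joint(X, Y))
```

In a learning setting, a quantity of this kind would be maximized with respect to the latent positions `X` and the kernel hyperparameters; the random data here only exercises the code path.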



Available from: David J. Fleet, Sep 03, 2015
    • "A better solution is to use Gaussian Processes (GPs) which are non-linear, non-parametric models [7]. They have been successfully applied in various tasks including speech and music processing [8] [9] [10]. Previously, we have also used GPs for static music emotion recognition [11]. "
    • "Motion generation. Generation of naturalistic human motion using probabilistic models trained on motion capture data has previously been addressed in the context of computer graphics and machine learning. Prior work has tackled synthesis of stylized human motion using bilinear spatiotemporal basis models [1], Hidden Markov Models [3], linear dynamical systems [21], and Gaussian process latent variable models [46] [40], as well as multilinear variants thereof [12] [45]. Unlike methods based on Gaussian processes, we use a parametric representation and a simple, scalable supervised training method that makes it practical to train on large datasets. "
    ABSTRACT: We propose the Encoder-Recurrent-Decoder (ERD) model for recognition and prediction of human body pose in videos and motion capture. The ERD model is a recurrent neural network that incorporates nonlinear encoder and decoder networks before and after recurrent layers. We test instantiations of ERD architectures in the tasks of motion capture (mocap) generation, body pose labeling and body pose forecasting in videos. Our model handles mocap training data across multiple subjects and activity domains, and synthesizes novel motions while avoiding drift over long periods of time. For human pose labeling, ERD outperforms a per-frame body part detector by resolving left-right body part confusions. For video pose forecasting, ERD predicts body joint displacements across a temporal horizon of 400ms and outperforms a first-order motion model based on optical flow. ERDs extend previous Long Short Term Memory (LSTM) models in the literature to jointly learn representations and their dynamics. Our experiments show such representation learning is crucial for both labeling and prediction in space-time. We find that this distinguishes the spatio-temporal visual domain from 1D text, speech or handwriting, where straightforward hard-coded representations have shown excellent results when directly combined with recurrent units.
    • "Such a shift from detection–diagnosis–mitigation of anomalies to prediction–prognosis–prevention is seen across various engineering domains beyond manufacturing and healthcare, including telecommunication and utility networks, infrastructure, and lifeline systems. In particular, the rich causal and dynamic information discernible from time series data has made forecasting of the evolution of complex biological, physical, and engineering system dynamics (Lang et al., 2007; Duy and Peters, 2008; Wang et al., 2008), crucial for their preventative control, possible, and [...]"
    ABSTRACT: Forecasting the evolution of complex systems is noted as one of the ten grand challenges of modern science. Time series data from complex systems capture the dynamic behaviors and causalities of the underlying processes and provide a tractable means to predict and monitor system state evolution. However, the nonlinear and nonstationary dynamics of the underlying processes pose a major challenge for accurate forecasting. For most real-world systems, the vector field of state dynamics is a nonlinear function of the state variables, i.e., the relationship connecting intrinsic state variables with their autoregressive terms and exogenous variables is nonlinear. Time series emerging from such complex systems exhibit aperiodic (chaotic) patterns even under steady state. Also, since real-world systems often evolve under transient conditions, the signals obtained therefrom tend to exhibit myriad forms of nonstationarity. Nonetheless, methods reported in the literature focus mostly on forecasting linear and stationary processes. This paper presents a review of these advancements in nonlinear and nonstationary time series forecasting models and a comparison of their performances in certain real-world manufacturing and health informatics applications. Conventional approaches do not adequately capture the system evolution (from the standpoint of forecasting accuracy, computational effort, and sensitivity to quantity and quality of a priori information) in these applications.
    IIE Transactions 01/2015; DOI:10.1080/0740817X.2014.999180 · 1.37 Impact Factor