Article

Interactive Partner Control in Close Interactions for Real-Time Applications


Abstract

This article presents a new framework for synthesizing motion of a virtual character in response to the actions performed by a user-controlled character in real time. In particular, the proposed method can handle scenes in which the characters are closely interacting with each other such as those in partner dancing and fighting. In such interactions, coordinating the virtual characters with the human player automatically is extremely difficult because the system has to predict the intention of the player character. In addition, the style variations from different users affect the accuracy in recognizing the movements of the player character when determining the responses of the virtual character. To solve these problems, our framework makes use of the spatial relationship-based representation of the body parts called interaction mesh, which has been proven effective for motion adaptation. The method is computationally efficient, enabling real-time character control for interactive applications. We demonstrate its effectiveness and versatility in synthesizing a wide variety of motions with close interactions.


... However, careful design of the motion editing algorithms (such as [8,9]) and parameter tuning are required to avoid interpenetration of body parts. Synthesizing a virtual partner/opponent from the user's movement has been explored in VR dancing [10] and sword fighting [11]. While the aforementioned approaches can generate the reactive motion (i.e. ...
... With the availability of interaction datasets [22,23,13,24], data-driven approaches are becoming more popular. By retrieving pre-recorded interactions, virtual partner/opponent can be synthesized based on the user's motion in dancing [10] and sword fighting [11] in VR. Kundu et al. [12] proposed using Cross-Conditioned Recurrent Networks for synthesizing human-human interactions. ...
... Further exploring the usage of the latent space, such as interpolation and extrapolation, learned using our model as well as learning a topology-aware latent space [40] for avoiding interpenetration are potential future directions. Using existing close interaction editing methods such as Interaction Mesh [8] and Aura Mesh [19] as a post-processing step to clean up the interpenetration as well as maintain the contact points between the characters can be another solution as demonstrated in [10]. In terms of the quality of the synthesized motion, artefacts such as foot sliding can be found in the synthesized motions since there is no explicit loss term on the stepping pattern in our proposed network and this is quite common to other GAN-based motion synthesis methods [14] such as MotionCLIP [41]. ...
Preprint
Synthesizing multi-character interactions is a challenging task due to the complex and varied interactions between the characters. In particular, precise spatiotemporal alignment between characters is required in generating close interactions such as dancing and fighting. Existing work in generating multi-character interactions focuses on generating a single type of reactive motion for a given sequence, which results in a lack of variety in the resultant motions. In this paper, we propose a novel way to create realistic human reactive motions which are not present in the given dataset by mixing and matching different types of close interactions. We propose a Conditional Hierarchical Generative Adversarial Network with Multi-Hot Class Embedding to generate the Mix and Match reactive motions of the follower from a given motion sequence of the leader. Experiments are conducted on both noisy (depth-based) and high-quality (MoCap-based) interaction datasets. The quantitative and qualitative results show that our approach outperforms the state-of-the-art methods on the given datasets. We also provide an augmented dataset with realistic reactive motions to stimulate future research in this area. The code is available at https://github.com/Aman-Goel1/IMM
... The topology and the dimension of the meshes vary over time according to the changing poses of the characters, which allows representing the varying spatial relationship over time. We customize the process to generate the interaction mesh [38] such that the resultant mesh is more suitable for interaction comparison. In particular, we would like to have a uniform distribution of vertices to ensure that the comparison is not biased to body parts with more joints. ...
... In particular, we would like to have a uniform distribution of vertices to ensure that the comparison is not biased to body parts with more joints. Therefore, on top of the set of vertices generated by the joint positions of the characters in [38], we include a set of vertices by uniformly sampling the skeleton structure of the characters using a predefined length. This allows us to maintain a more uniform density for the mesh, such that the interaction comparison based on the mesh is not biased to specific joints. ...
... where DT is the Delaunay Tetrahedralization process, and E t DT is the set of edges created. Different from [38] that considers all edges, we filter E t DT by removing all edges connecting to the same character, as those edges do not contribute to the interaction. The resultant set of edges, E t , is regarded as the interaction mesh of frame t. ...
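The construction described in the excerpts above (Delaunay tetrahedralization of the two characters' vertices, followed by filtering out edges that connect joints of the same character) can be sketched as follows. This is a simplified illustration: the function name is mine, and it uses joint positions only, omitting the extra vertices sampled uniformly along the skeleton that the paper adds for uniform mesh density.

```python
import numpy as np
from scipy.spatial import Delaunay
from itertools import combinations

def interaction_mesh_edges(joints_a, joints_b):
    """Edges of the interaction mesh between two characters.

    joints_a, joints_b: (N, 3) arrays of joint positions. The full
    method also samples extra vertices along the skeleton for a more
    uniform vertex density; here we use the joints only.
    """
    pts = np.vstack([joints_a, joints_b])
    char = np.r_[np.zeros(len(joints_a)), np.ones(len(joints_b))]
    tet = Delaunay(pts)  # Delaunay tetrahedralization of all vertices
    edges = set()
    for simplex in tet.simplices:
        for a, b in combinations(simplex, 2):
            # keep only edges connecting the two different characters,
            # since intra-character edges do not describe the interaction
            if char[a] != char[b]:
                edges.add((min(a, b), max(a, b)))
    return sorted(edges)
```

Filtering after tetrahedralization (rather than tetrahedralizing each character separately) preserves the Delaunay property of the surviving inter-character edges.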
Article
Full-text available
Traditional methods for motion comparison consider features from individual characters. However, the semantic meaning of many human activities is usually defined by the interaction between them, such as a high-five interaction of two characters. There is little success in adapting interaction-based features for activity comparison, as such features either lack a fixed topology or are high-dimensional. In this paper, we propose a unified framework for activity comparison from the interaction point of view. Our new metric evaluates the similarity of interaction by adapting the Earth Mover's Distance onto a customized geometric mesh structure that represents spatial-temporal interactions. This allows us to compare different classes of interactions and discover their intrinsic semantic similarity. We created five interaction databases of different natures, covering both two-character (synthetic and real-people) and character-object interactions, which are open for public use. We demonstrate how the proposed metric aligns well with the semantic meaning of the interaction. We also apply the metric in interaction retrieval and show how it outperforms existing ones. The proposed method can be used for unsupervised activity detection in monitoring systems and activity retrieval in smart animation systems.
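A heavily simplified sketch of the idea of comparing interactions with the Earth Mover's Distance: here two interactions are compared via the 1D Wasserstein distance between their distributions of inter-character distances. Both function names are mine, and collapsing the mesh to a bag of edge lengths discards the structure that the paper's full metric preserves.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def edge_length_distribution(joints_a, joints_b):
    """All pairwise distances between the two characters' joints,
    used here as a crude stand-in for interaction-mesh edge lengths."""
    diff = joints_a[:, None, :] - joints_b[None, :, :]
    return np.linalg.norm(diff, axis=-1).ravel()

def interaction_distance(pair1, pair2):
    """1D Earth Mover's Distance between the edge-length
    distributions of two interactions (a simplification of the
    paper's EMD over the full spatial-temporal mesh)."""
    return wasserstein_distance(edge_length_distribution(*pair1),
                                edge_length_distribution(*pair2))
```

Even this toy version captures the intuition that two characters far apart interact differently from two characters in close contact, regardless of which joints are involved.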
... In doing so, we identify suitable responses by aligning the observed motion to the templates in our library. The resulting response is then optimized utilizing Interaction Meshes [8] to allow for fine-grained adaptation to the observed human motion. We extend the Interaction Mesh approach by adding a context-aware decision layer that allows multiple two-person interactions to be triggered. ...
... Ho et al. [8] propose a method based on two-person motion capture data with a two-stage process. First, the postures of the afterwards-active interaction partner, i.e. the human, and the virtual agent are organized in a kd-tree. ...
... In doing so, crucial characteristics of an interaction as well as small details of motions are preserved and used to animate a virtual agent. We extended the approach presented by Ho et al. [8] to situations where the temporal context of interactions plays an important role. ...
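The first stage of the retrieval described above, organizing example postures in a kd-tree and looking up the closest match to the observed human pose, can be sketched with scipy's `cKDTree`. The class name and the use of flattened joint positions as the feature vector are my assumptions; any pose descriptor could serve as the key.

```python
import numpy as np
from scipy.spatial import cKDTree

class PostureLibrary:
    """Nearest-posture lookup over pre-recorded two-person examples.

    Each example stores the human's posture (the query key) together
    with the partner posture recorded alongside it, so retrieving the
    closest human pose also yields a plausible partner response.
    """
    def __init__(self, human_postures, partner_postures):
        self.keys = np.asarray([p.ravel() for p in human_postures])
        self.partners = partner_postures
        self.tree = cKDTree(self.keys)  # kd-tree over posture features

    def query(self, observed_posture):
        """Return the partner posture of the closest stored example
        and the distance to that example."""
        dist, idx = self.tree.query(observed_posture.ravel())
        return self.partners[idx], dist
```

The kd-tree makes each per-frame lookup logarithmic in the number of stored examples, which is what makes this retrieval step viable at interactive rates.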
Conference Paper
Full-text available
We address the problem of creating believable animations for virtual humans that need to react to the body movements of a human interaction partner in real-time. Our data-driven approach uses prerecorded motion capture data of two interacting persons and performs motion adaptation during the live human-agent interaction. Extending the interaction mesh approach, our main contribution is a new scheme for efficient identification of motions in the prerecorded animation data that are similar to the live interaction. A global low-dimensional posture space serves to select the most similar interaction example, while local, more detail-rich posture spaces are used to identify poses closely matching the human motion. Using the interaction mesh of the selected motion example, an animation can then be synthesized that takes into account both spatial and temporal similarities between the prerecorded and live interactions.
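The "global low-dimensional posture space" used above to select the most similar interaction example can be sketched with a plain PCA projection: embed all stored postures, project the live posture, and pick the nearest stored example in the latent space. This numpy-only sketch is illustrative; the class name, feature choice, and number of components are my assumptions, not the paper's exact setup.

```python
import numpy as np

class PosturePCA:
    """Global low-dimensional posture space via PCA (numpy SVD)."""

    def __init__(self, postures, n_components=3):
        X = np.asarray(postures, dtype=float)   # (examples, features)
        self.mean = X.mean(axis=0)
        # principal axes from the SVD of the centered data
        _, _, Vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.components = Vt[:n_components]
        # coordinates of every stored posture in the latent space
        self.coords = (X - self.mean) @ self.components.T

    def most_similar(self, posture):
        """Index of the stored posture closest in the latent space."""
        z = (np.asarray(posture, dtype=float) - self.mean) @ self.components.T
        return int(np.argmin(np.linalg.norm(self.coords - z, axis=1)))
```

In the full system this coarse global match would only pre-select a candidate; the detail-rich local posture spaces then refine which pose within that example best matches the human motion.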
... An alternative approach for preserving spatial relationships between interactants is the use of IMs [1]. First introduced within the computer animation community to animate virtual characters in real time [15], [16], IMs are also applied in the field of robotics. Ivan and colleagues, for example, compute optimal motion paths for a robotic arm in semi-dynamic environments [2]. ...
... Spatial adaptation and coordination using IMs has been demonstrated extensively by the computer animation community [2], [4], [15]. One eminent feature of IMs is the ability to adapt full body behaviors to new situations. ...
... One eminent feature of IMs is the ability to adapt full body behaviors to new situations. So far, however, proposed methods rely on fully connected graphs [2] or Delaunay tetrahedralization [15] for net generation. These approaches include all joints equally into the topology yielding densely interconnected nets as shown in Fig. 4, left. ...
Conference Paper
Full-text available
We present a data-driven imitation learning system for learning human-robot interactions from human-human demonstrations. During training, the movements of two interaction partners are recorded through motion capture and an interaction model is learned. At runtime, the interaction model is used to continuously adapt the robot’s motion, both spatially and temporally, to the movements of the human interaction partner. We show the effectiveness of the approach on complex, sequential tasks by presenting two applications involving collaborative human-robot assembly. Experiments with varied object hand-over positions and task execution speeds confirm the capabilities for spatio-temporal adaption of the demonstrated behavior to the current situation.
... Such methods either use motion capture data of two interacting persons (e.g. [19][20][21]) or compose separately recorded motions that would match an interaction into a table [22] or a graph structure [23]. Most of these methods are designed to control the motions of both interacting characters. ...
... All motions are completely synthesized. An approach closer to the requirements of our application has been published in [19]. Here, the motions of a user are captured based on optical tracking and matching movements of a virtual partner (dancer, fighter) are computed. ...
... In fact, our system has been developed as a tool to investigate existing fight strategies and develop new ones. From a practical viewpoint, it must be noted that in [19] very restrictive assumptions have been made to limit the size of the database. This includes the consideration of only infight situations, a limited motion repertoire, and the uniqueness of the query result. ...
... Our method builds upon previous findings from the computer graphics community (Ho et al. 2013) as well as the robotics community (Ben Amor 2010) and extends them to triadic setups. In the remainder of the paper we will introduce relevant related work, propose our methodology, and perform a set of experiments to evaluate the results using objective and subjective measures. ...
... During the course of the experiment, 246 handovers were recorded, with 83 using triadic IMs and 80 using IM topologies created by Delaunay triangulation (Ho et al. 2013); 3 recordings were dropped due to inconsistencies in motion capture readings. (Fig. 8 illustrates 7 handovers.) ...
... In the remaining 8%, the object either slipped out of the gripper or dropped immediately. In contrast, using the approaches of Ho et al. (2013) and Huang et al. (2015), only around 40% of all interactions succeeded. (Fig. 9 shows participants of the survey and how they handed the object over.) ...
Article
Full-text available
We propose an imitation learning methodology that allows robots to seamlessly retrieve and pass objects to and from human users. Instead of hand-coding interaction parameters, we extract relevant information such as joint correlations and spatial relationships from a single task demonstration of two humans. At the center of our approach is an interaction model that enables a robot to generalize an observed demonstration spatially and temporally to new situations. To this end, we propose a data-driven method for generating interaction meshes that link both interaction partners to the manipulated object. The feasibility of the approach is evaluated in a within user study which shows that human–human task demonstration can lead to more natural and intuitive interactions with the robot.
... The virtual character can handle close interactions with a user-controlled character in real time. The framework relies on an interaction mesh, which is a spatial relationship-based representation of the body parts [22]. By using a Motion Analysis Eagle Digital optical motion capture system, the interactive partner is created for dancing and fighting scenarios. ...
... The most similar approach to my work is the creation of a virtual salsa dance partner, which uses a motion capture or virtual reality system [22][31]. The main difference is that my approach only requires the model to be trained on YouTube videos, which are free and easily obtainable. ...
... Another potential direction is to create a virtual dance partner where the user can record themselves through their laptop camera and the virtual dance partner would display on their screen in real-time. Incorporating depth, the system could also create an interactive virtual dance partner in 3D [31][22]. OpenPose can estimate 3D skeleton keypoints, so the user could be captured in real-time and the virtual dance partner could be recreated. ...
Article
My work focuses on taking a single person as input and predicting the intentional movement of one dance partner based on the other dance partner's movement. Human pose estimation has been applied to dance and computer vision, but many existing applications focus on a single individual or multiple individuals performing. Currently there are very few works that focus specifically on dance couples combined with pose prediction. This thesis is applicable to the entertainment and gaming industry by training people to dance with a virtual dance partner. Many existing interactive or virtual dance partners require a motion capture system, multiple cameras, or a robot, which makes them expensive. This thesis does not use a motion capture system and combines OpenPose with swing dance YouTube videos to create a virtual dance partner. By taking in the current dancer's moves as input, the system predicts the dance partner's corresponding moves in the video frames. In order to create a virtual dance partner, datasets that contain information about the skeleton keypoints are necessary to predict a dance partner's pose. There are existing dance datasets for a specific type of dance, but these datasets do not cover swing dance. Furthermore, the dance datasets that do include swing have a limited number of videos. The contribution of this thesis is a large swing dataset that contains three different types of swing dance: East Coast, Lindy Hop and West Coast. I also provide a basic framework to extend the work to create a real-time and interactive dance partner.
... Ho et al. present "a new framework for synthesizing motion of a virtual character in response to the actions performed by a user-controlled character in real time". According to the authors, "the proposed method can handle scenes in which the characters are closely interacting with each other such as those in partner dancing and fighting" [44]. This novel approach to dance interaction might be able to contribute to dance learning and education, for example in the case of ballroom dances. ...
... The method of Ho et al. [44] could be a solution for a virtual partner that plays the role of the leader or the follower. Of all the systems that have been considered, the only one that is an educational system and can provide a form of collaboration is Whatever Dance toolbox. ...
Article
Full-text available
Motion Capture and whole-body interaction technologies have been experimentally proven to contribute to the enhancement of dance learning and to the investigation of bodily knowledge, innovating at the same time the practice of dance. Designing and implementing a dance interactive learning system with the aim to achieve effective, enjoyable, and meaningful educational experiences is, however, a highly demanding interdisciplinary and complex problem. In this work, we examine the interactive dance training systems that are described in the recent bibliography, proposing a framework of the most important design parameters, which we present along with particular examples of implementations. We discuss the way that the different phases of a common workflow are designed and implemented in these systems, examining aspects such as the visualization of feedback to the learner, the movement qualities involved, the technological approaches used, as well as the general context of use and learning approaches. Our aim is to identify common patterns and areas that require further research and development toward creating more effective and meaningful digital dance learning tools.
... However, the model is not necessarily applicable to other interactions, nor is it suited to human motion input. Ho et al. [24] outline a more general method for human-character interaction animation. It queries a motion database with the current state of the user and character to search for similar recorded clips, editing the retrieved motion to match it to the actual situation. ...
... Their approach is showcased in a character able to "high-five" a user in a virtual environment with different styles. Vogt et al. [67] present a data-driven method conceptually similar to Ho et al. [24], but with several improvements, including a better low-dimensional representation of the current queried pose and both spatial and temporal matching to stored data. ...
... Finally, position constraints should prevent body overlap or other appearances that do not conform to the objective world. Researchers have proposed topology coordinates [14,15] and interaction mesh [13,16] to solve these problems in close interaction [6,17,47]. In our work, the proposed model mainly solves the problems of stability and human dynamics of the generated animation as well as adaptability with various interactions. ...
Article
Full-text available
In this paper, we propose a generative recurrent model for human-character interaction. Our model is an encoder-recurrent-decoder network. The recurrent network is composed by multiple layers of long short-term memory (LSTM) and is incorporated with an encoder network and a decoder network before and after the recurrent network. With the proposed model, the virtual character’s animation is generated on the fly while it interacts with the human player. The coming animation of the character is automatically generated based on the history motion data of both itself and its opponent. We evaluated our model based on both public motion capture databases and our own recorded motion data. Experimental results demonstrate that the LSTM layers can help the character learn a long history of human dynamics to animate itself. In addition, the encoder–decoder networks can significantly improve the stability of the generated animation. This method can automatically animate a virtual character responding to a human player.
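The encoder-recurrent-decoder idea above can be sketched with a single numpy LSTM cell: a linear encoder maps the concatenated poses of both characters to a latent vector, the LSTM carries the motion history, and a linear decoder emits the character's next pose. This is an untrained, shape-level sketch with my own names and randomly initialized weights, not the paper's trained multi-layer network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class EncoderLSTMDecoder:
    """Untrained sketch of an encoder-recurrent-decoder network."""

    def __init__(self, pose_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = 2 * pose_dim  # both characters' poses, concatenated
        s = 0.1
        self.We = rng.normal(scale=s, size=(hidden, in_dim))       # encoder
        self.Wx = rng.normal(scale=s, size=(4 * hidden, hidden))   # LSTM input weights
        self.Wh = rng.normal(scale=s, size=(4 * hidden, hidden))   # LSTM recurrent weights
        self.b = np.zeros(4 * hidden)
        self.Wd = rng.normal(scale=s, size=(pose_dim, hidden))     # decoder
        self.hidden = hidden

    def step(self, x, h, c):
        """One LSTM step on the encoded input; returns the decoded pose."""
        z = self.Wx @ (self.We @ x) + self.Wh @ h + self.b
        H = self.hidden
        i, f, o = (sigmoid(z[k * H:(k + 1) * H]) for k in range(3))
        g = np.tanh(z[3 * H:])
        c = f * c + i * g            # cell state update
        h = o * np.tanh(c)           # hidden state
        return self.Wd @ h, h, c

    def rollout(self, history):
        """history: (T, 2*pose_dim) motion of both characters.
        Returns the predicted next pose of the virtual character."""
        h = c = np.zeros(self.hidden)
        for x in history:
            y, h, c = self.step(x, h, c)
        return y
```

Feeding the whole history through the recurrent state is what lets the model condition the character's next pose on long-term dynamics of both itself and its opponent, which the abstract credits for the stability of the generated animation.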
... A Markov process is used in the lower layer, and a clustering technique is used in the upper layer. Ho et al. [16] developed a framework for synthesizing motion of a virtual character in response to the actions performed by a user-controlled character in real time. ...
... Our research is related to studies that create interactive motions between two characters. The interaction mesh-based method 3,4 represents the spatial relationship between two interacting bodies in terms of a mesh consisting of edges that connect two body parts. This method can be applied to various scenarios for close interaction between persons or between a human and an environment. ...
Conference Paper
We present an avatar animation technique for a telepresence system that allows for the hand contact, especially handshaking, between remote users. The key idea is that, while the avatar follows the remote user's motion normally, it modifies the motion to create and maintain hand contact with the local user when the two users try to engage hand contact. To this end, we develop the support vector machine (SVM)-based classifiers to recognize the users' intention for contact interaction, and online motion generation method to create realistic image sequence of an avatar to realize the continuous contact with the user. A user study has been conducted to verify the effect of our method on the social telepresence.
... Our research is related to studies that create interactive motions between two characters. The interaction mesh-based method 3,4 represents the spatial relationship between two interacting bodies in terms of a mesh consisting of edges that connect two body parts. This method can be applied to various scenarios for close interaction between persons or between a human and an environment. ...
Article
In avatar-mediated telepresence, remote users find it difficult to engage and maintain contact, such as a handshake, with each other without a haptic device. We address the problem of adjusting an avatar's pose to promote multifinger contact interaction between remote users. To this end, we first construct a contact point database for nine types of contact interactions between hands through contact experiments with human subjects. We then develop an optimization-based framework to compute the avatar's pose that realizes the desired contact learned from the experiment while maintaining the naturalness of the hand pose. We show that our method improves the quality of hand interaction for the predefined set of social interactions.
... A Markov process is used in the first layer, and a clustering technique is used in the second layer. Finally, a framework is developed [20] for synthesizing the motion of a virtual character in response to the actions performed by a user-controlled character in real time. ...
Article
Full-text available
Learning a couple dance such as salsa is challenging, as it requires one to understand and assimilate all the dance skills (guidance, rhythm, style) correctly. Salsa is traditionally learned by attending a dancing class with a teacher and practicing with a partner; however, the difficulty of accessing such classes and the variability of the dance environment can impact the learning process. Understanding how people learn using a virtual reality platform could bring interesting knowledge in motion analysis and can be the first step toward a complementary learning system at home. In this paper, we propose an interactive learning application in the form of a virtual reality game that aims to help users improve their salsa dancing skills. The application was designed upon previous literature and expert discussion and has different components that simulate salsa dance: a virtual partner with interactive control to dance with, visual and haptic feedback, and a game mechanic with dance tasks. The application is tested on a two-class panel of 20 regular dancers and 20 non-dancers, and their learning is evaluated and analyzed through the extraction of Musical Motion Features and the Laban Movement Analysis system. Both motion analysis frameworks were compared before and after training and show a convergence of the profile of non-dancers toward the profile of regular dancers, which validates the learning process. The work presented here has profound implications for future studies of motion analysis, couple dance learning, and human-human interaction.
... Researchers also attempted different forms of interactive and evaluation feedback to ensure the effectiveness of the evaluation results in similar self-training dancing system [14,15,16]. Jacky et al. [2] proposed a framework that first identified dance movement with neural network algorithm and then calculated the movement quality of the different body parts through the DTW comparison algorithm. ...
Article
Full-text available
The capabilities of general motion evaluation algorithms are significantly limited in analyzing the stylistic qualities and expressions of dance movement. This study proposes a novel dance self-learning framework on the basis of the principles of Laban movement analysis (LMA) to facilitate trainees in automatically analyzing dance movements and correcting dance techniques without an expert. First, a "shape-effort" feature description model was presented in this framework to reflect the subtleties of dance movement. The evaluation of body-shape performance was obtained via open-end dynamic time warping algorithm. Next, rhythm was qualitatively assessed by curve fitting, whereas effort was measured by using standard deviation. Finally, constructive instructions were generated in this framework on basis of the assessment scores of the movement of the trainees. The framework was implemented in cave automatic virtual environment, and its effectiveness and feasibility were verified through experiments. Results demonstrate that the feature description model with 23 LMA parameters can be used in describing dance movements. Multi-mode feedback with direct instructions for the problems in question satisfies the learning habits of the trainee. The quality of the trainees' movements achieves an average of 10% overall improvement by using the framework. Body-shape performance acquires the most improvement of 18%, followed by effort. This study provides a new research method for evaluation and training of dance movements.
... Human motion data have been used in a wide range of VR applications, such as virtual training [Pronost et al. 2008; Kyan et al. 2015] and virtual rehabilitation [Celiktutan et al. 2013], for analyzing the performance of the human subject as well as animating virtual avatars to interact with the subject to enhance the realism of the system. In particular, pre-captured human motions can be used as examples to guide the movement of virtual characters in response to the performance of the human subject [Ho et al. 2013; Pronost et al. 2008]. However, when retrieving relevant examples from the motion database, maintaining the temporal coherency of the poses over successive frames is a crucial factor in producing realistic movement of the virtual characters. ...
Conference Paper
Full-text available
Retrieving pre-captured human motion for analyzing and synthesizing virtual character movement have been widely used in Virtual Reality (VR) and interactive computer graphics applications. In this paper, we propose a new human pose representation, called Spatial Relations of Human Body Parts (SRBP), to represent spatial relations between body parts of the subject(s), which intuitively describes how much the body parts are interacting with each other. Since SRBP is computed from the local structure (i.e. multiple body parts in proximity) of the pose instead of the information from individual or pairwise joints as in previous approaches, the new representation is robust to minor variations of individual joint location. Experimental results show that SRBP outperforms the existing skeleton-based motion retrieval and classification approaches on benchmark databases.
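The idea of describing a pose by relations between body parts, rather than individual joints, can be sketched by grouping joints into parts and recording, for each pair of parts, the minimum inter-joint distance. Both the part grouping and the min-distance feature are my simplifications for illustration; SRBP's actual descriptor is richer than this.

```python
import numpy as np
from itertools import combinations

# Illustrative grouping of joint indices into body parts
# (assumed 17-joint skeleton layout, not the paper's definition).
PARTS = {
    "head":      [0, 1],
    "torso":     [2, 3, 4],
    "left_arm":  [5, 6, 7],
    "right_arm": [8, 9, 10],
    "left_leg":  [11, 12, 13],
    "right_leg": [14, 15, 16],
}

def part_relation_features(joints):
    """For each unordered pair of body parts, the minimum distance
    between their joints -- a crude proxy for how much the parts
    interact. Because the feature aggregates over a group of joints,
    it is less sensitive to small errors in any single joint."""
    feats = []
    for pa, pb in combinations(sorted(PARTS), 2):
        A, B = joints[PARTS[pa]], joints[PARTS[pb]]
        d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
        feats.append(d.min())
    return np.array(feats)
```

Aggregating over local groups of joints is what gives this style of descriptor its robustness to minor variations of individual joint locations, which the abstract identifies as the key advantage over joint-pair features.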
... Our approach is different from the method for retargeting motions for interacting characters [5] or the method for creating character motion responsive to user's input [6] in that our avatar creates adaptive contact behavior online given the un-predefined input motions from the remote users. Figure 1 shows the overview of our system. ...
... For example, one can compare the user-performed motion obtained from depth sensors with a set of motions in the database and understand the nature of the motion as well as how it should affect the real-time render (Bleiweiss et al. 2010). Alternatively, with an interaction database, one can generate a virtual character that acts according to the posture of the user, in order to create a two-character dancing animation, which is difficult to capture due to hardware limitations (Ho et al. 2013). While it is possible to utilize the posture captured by depth sensors for driving the animation of virtual characters, the generated animation may not be physically correct and dynamically plausible. ...
Chapter
Full-text available
Depth sensors have become one of the most popular means of generating human facial and posture information in the past decade. By coupling a depth camera and computer vision based recognition algorithms, these sensors can detect human facial and body features in real time. Such a breakthrough has fused many new research directions in animation creation and control, which also has opened up new challenges. In this chapter, we explain how depth sensors obtain human facial and body information. We then discuss on the main challenge on depth sensor-based systems, which is the inaccuracy of the obtained data, and explain how the problem is tackled. Finally, we point out the emerging applications in the field, in which human facial and body feature modeling and understanding is a key research problem.
... In this section, we will first review existing examples of AR in various industries. While Virtual Reality (VR) and interactive computer graphics have been used for teaching and learning, such as partner dancing [3], visualizing wrestling [4], [5] and boxing [6], [7] skills, in the last two decades, more attention has been paid on vision-based frameworks which make use of cameras and sensors. By capturing the information from the surrounding using cameras and sensors, useful feedback can be provided to the user, such as posture monitoring [8] and interacting with virtual objects using body movement [9], [10]. ...
Conference Paper
Full-text available
Connecting network cables to network switches is a time-consuming and inefficient task, and requires extensive documentation and preparation beforehand to ensure no service faults are encountered by the users. In this paper, a new AR smartphone application that overlays network switch information over the user’s vision is designed and developed for real working environment to increase user’s efficiency in working with a network switch. Specifically, the prototype of the AR App is developed on the Android platform using both the Unity game engine and Vuforia AR library and connecting to the network switch to retrieve network information through telnet. By using the camera on the smartphone for capturing the visual information from the working environment, i.e. the network switch in this App, the network switch information such as speed, types, etc. will be overlaid on each port on the smartphone screen. A user study was conducted to evaluate the effectiveness of the AR App to assist users in performing network tasks. In particular, participants were tasked with connecting switchports to a patch panel to match up corresponding configurations. After three tests, it was found that the times for completion and mistakes made were reduced in the final test when compared to the first. This highlights the positive effects of the application in improving the user’s efficiency.
Article
This article presents a framework for real-time analysis and visualization of ballet dance movements performed within a Cave Virtual Reality Environment (CAVE). A Kinect sensor captures and extracts dance-based movement features, from which a topology-preserving "posture space" is constructed using a spherical self-organizing map (SSOM). Recordings of dance movements are parsed into gestural elements by projection onto the SSOM to form unique trajectories in posture space. Dependencies between postures in a trajectory are modeled using a Markovian empirical transition matrix, which is then used to recognize attempted movements. This allows for quantitative assessment and feedback of a student's performance, delivered using concurrent, localized visualizations together with a performance score based on incremental dynamic time warping (IDTW).
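The scoring component above rests on dynamic time warping. As a rough, unoptimized sketch (plain DTW rather than the incremental variant the authors use), assuming each frame is summarized as a fixed-length posture feature vector:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping between two feature sequences.

    a, b: arrays of shape (T, D) -- one D-dimensional posture per frame.
    Returns the accumulated alignment cost (lower = more similar).
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# identical sequences align with zero cost
seq = np.random.rand(20, 6)
assert dtw_distance(seq, seq) == 0.0
```

An incremental variant would update only the newest column of `D` per incoming frame, which is what makes real-time scoring feasible.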
Article
Interaction meshes are a promising approach for generating natural behaviors of virtual characters during ongoing user interactions. In this paper, we propose several extensions to the interaction mesh approach based on statistical analyses of the underlying example interactions. By applying principal component analysis and correlation analysis in addition to joint distance calculations, both the interaction mesh topology and the constraints used for mesh optimization can be generated in an automated fashion that accounts for the spatial and temporal contexts of the interaction. Copyright © 2015 John Wiley & Sons, Ltd.
Conference Paper
Full-text available
Synthesizing close interactions such as dancing and fighting between characters is a challenging problem in computer animation. While encouraging results are presented in [Ho et al. 2010], the high computation cost makes the method unsuitable for interactive motion editing and synthesis. In this paper, we propose an efficient multiresolution approach in the temporal domain for editing and adapting close character interactions based on the Interaction Mesh framework. In particular, we divide the original large spacetime optimization problem into multiple smaller problems such that the user can observe the adapted motion while playing back the movements at run-time. Our approach is highly parallelizable, and achieves high performance by making use of multi-core architectures. The method can be applied to a wide range of applications including motion editing systems for animators and motion retargeting systems for humanoid robots.
Article
This study aims to develop a controller for use in the online simulation of two interacting characters. This controller is capable of generalizing two sets of interaction motions of the two characters based on the relationships between the characters. The controller can exhibit similar motions to a captured human motion while reacting in a natural way to the opponent character in real time. To achieve this, we propose a new type of physical model called a coupled inverted pendulum on carts that comprises two inverted pendulum on a cart models, one for each individual, which are coupled by a relationship model. The proposed framework is divided into two steps: motion analysis and motion synthesis. Motion analysis is an offline preprocessing step, which optimizes the control parameters to move the proposed model along a motion capture trajectory of two interacting humans. The optimization procedure generates a coupled pendulum trajectory which represents the relationship between two characters for each frame, and is used as a reference in the synthesis step. In the motion synthesis step, a new coupled pendulum trajectory is planned reflecting the effects of the physical interaction, and the captured reference motions are edited based on the planned trajectory produced by the coupled pendulum trajectory generator. To validate the proposed framework, we used a motion capture data set showing two people performing kickboxing. The proposed controller is able to generalize the behaviors of two humans to different situations such as different speeds and turning speeds in a realistic way in real time.
Article
This article proposes a novel framework for the real-time capture, assessment, and visualization of ballet dance movements as performed by a student in an instructional, virtual reality (VR) setting. The acquisition of human movement data is facilitated by skeletal joint tracking captured using the popular Kinect camera system, while instruction and performance evaluation are provided in the form of 3D visualizations and feedback through a CAVE virtual environment, in which the student is fully immersed. The proposed framework is based on the unsupervised parsing of ballet dance movement into a structured posture space using the spherical self-organizing map (SSOM). A unique feature descriptor is proposed to more appropriately reflect the subtleties of ballet dance movements, which are represented as gesture trajectories through posture space on the SSOM. This recognition subsystem is used to identify the category of movement the student is attempting when prompted (by a virtual instructor) to perform a particular dance sequence. The dance sequence is then segmented and cross-referenced against a library of gestural components performed by the teacher. This facilitates alignment and score-based assessment of individual movements within the context of the dance sequence. An immersive interface enables the student to review his or her performance from a number of vantage points, each providing a unique perspective and spatial context suggestive of how the student might make improvements in training. An evaluation of the recognition and virtual feedback systems is presented.
Article
This paper presents a computer-based system for assessment and training of ballet dance in a CAVE virtual reality environment. The system utilizes a Kinect sensor to capture the student's dance and extracts features from skeleton joints. The system depends on a structured posture space, which comprises a set of dance elements representing key moments ('postures', typically held so briefly that they are experienced as fleeting moments in a flux) in the dance movements whose performance we are attempting to assess. The recording captured from the Kinect allows the parsing of dance movement into a structured posture space using the spherical self-organizing map (SSOM). From this, a unique descriptor can be obtained by following gesture trajectories through posture space on the SSOM, which appropriately reflects the subtleties of ballet dance movements. Consequently, the system can recognize the category of movement the student is attempting, which allows us to make a quantitative assessment of individual movements. Based on the experimental results, the proposed system appears to be very effective for recognition and generalizes across instances of movement. It is thus possible to construct assessment and visualization of ballet dance movements performed by the student in an instructional, virtual reality setting.
Article
A motion retargeting process is necessary as the body size and proportion of the actors are generally different from those of the target characters. However, the original spatial relationship between the multiple characters and the environment is easily broken when using previous motion retargeting methods, which are generally performed for each character independently. Therefore, time‐consuming manual adjustments by animators are usually required to obtain satisfactory results. To address these issues, we present a novel multicharacter motion retargeting method that preserves various types of spatial relationships between characters and environments. We establish a unified deformation‐based framework for the motion retargeting of multiple characters (more than two) or nonhuman characters with complex interactions. Also, an interactive motion editing interface with immediate feedback to the user is provided. We experimentally show that our method achieves a speedup when compared with previous motion retargeting methods.
Chapter
Posing character has always been playing an important role in character animation and interactive applications such as computer games. However, such a task is time-consuming and labor-intensive. In order to improve the efficiency in character posing, researchers in computer graphics have been working on a wide variety of semi- or fully automatic approaches in creating full-body poses, ranging from traditional approaches like inverse kinematics (IK), data-driven approaches which make use of captured motion data, as well as direct pose manipulation through intuitive interfaces. In this book chapter, we will introduce the aforementioned techniques and also discuss their applications in animation production.
Chapter
This chapter presents gesture recognition methods and their application to a dance training system in an instructional, virtual reality (VR) setting. The proposed system is based on the unsupervised parsing of dance movement into a structured posture space using the spherical self-organizing map (SSOM). A unique feature descriptor is obtained from the gesture trajectories through posture space on the SSOM. For recognition, various methods are explored for trajectory analysis, which include sparse coding, posture occurrence, posture transition, and the hidden Markov model. Within the system, the dance sequence of a student can be segmented online and cross-referenced against a library of gestural components performed by the teacher. This facilitates the assessment of the student dance, as well as provides visual feedback for effective training.
Article
Applying motion‐capture data to multi‐person interaction between virtual characters is challenging because one needs to preserve the interaction semantics while also satisfying the general requirements of motion retargeting, such as preventing penetration and preserving naturalness. An efficient means of representing interaction semantics is by defining the spatial relationships between the body parts of characters. However, existing methods consider only the character skeleton and thus are not suitable for capturing skin‐level spatial relationships. This paper proposes a novel method for retargeting interaction motions with respect to character skins. Specifically, we introduce the aura mesh, which is a volumetric mesh that surrounds a character's skin. The spatial relationships between two characters are computed from the overlap of the skin mesh of one character and the aura mesh of the other, and then the interaction motion retargeting is achieved by preserving the spatial relationships as much as possible while satisfying other constraints. We show the effectiveness of our method through a number of experiments.
Chapter
Full-text available
VR technologies are widely adopted for training purposes by providing users with an educational virtual experience. In this work, we propose an immersive VR system that helps choreographers and dancers facilitate their dance rehearsal experience. The system integrates motion capture devices and head-mounted displays (HMDs). The motions of the dancers, their partners, and the choreographers are captured and projected into a virtual dancing scene at an interactive frame rate. The dancers, who are wearing the HMDs, are allowed to observe the synthesized virtual performances within a virtual stage space from several selected third-person views: the audience's view, the dancing partner's view, and the choreographer's view. Such synthesized external self-images augment the dancers' perception of their performance and their understanding of the choreography. Feedback from the participants indicates the proposed system is effective, and the preliminary experimental results agree with our observations.
Article
Full-text available
Creating realistic characters that can react to the users' or another character's movement can greatly benefit computer graphics, games and virtual reality. However, synthesizing such reactive motions in human-human interactions is a challenging task due to the many different ways two humans can interact. While a number of studies have successfully adapted the generative adversarial network (GAN) to synthesizing single human actions, very few have modelled human-human interactions. In this paper, we propose a semi-supervised GAN system that synthesizes the reactive motion of a character given the active motion from another character. Our key insights are two-fold. First, to effectively encode the complicated spatial-temporal information of a human motion, we empower the generator with a part-based long short-term memory (LSTM) module, such that the temporal movement of different limbs can be effectively modelled. We further include an attention module such that the temporal significance of the interaction can be learned, which enhances the temporal alignment of the active-reactive motion pair. Second, as the reactive motion of different types of interactions can be significantly different, we introduce a discriminator that not only tells if the generated movement is realistic or not, but also tells the class label of the interaction. This allows the use of such labels in supervising the training of the generator. We experiment with the SBU and the HHOI datasets. The high quality of the synthetic motion demonstrates the effective design of our generator, and the discriminability of the synthesis also demonstrates the strength of our discriminator.
Chapter
Unlike single-character motion retargeting, multi-character motion retargeting (MCMR) algorithms should be able to retarget each character's motion correctly while maintaining the interaction between them. Existing MCMR solutions mainly focus on small-scale changes between interacting characters. However, many retargeting applications require large-scale transformations. In this paper, we propose a new algorithm for large-scale MCMR. We build on the idea of interaction meshes, which are structures representing the spatial relationship among characters. We introduce a new distance-based interaction mesh that embodies the relationship between characters more accurately by prioritizing local connections over global ones. We also introduce a stiffness weight for each skeletal joint in our mesh deformation term, which defines how undesirable it is for the interaction mesh to deform around that joint. This parameter increases the adaptability of our algorithm for large-scale transformations and reduces optimization time considerably. We compare the performance of our algorithm with the current state-of-the-art MCMR solution for several motion sequences under four different scenarios. Our results show that our method not only improves the quality of retargeting, but also significantly reduces computation time.
Conference Paper
Full-text available
Automatic dance synthesis has become more and more popular due to the increasing demand in computer games and animations. Existing research generates dance motions without much consideration for the context of the music. In reality, professional dancers make choreography according to the lyrics and music features. In this research, we focus on a particular genre of dance known as sign dance, which combines gesture-based sign language with full body dance motion. We propose a system to automatically generate sign dance from a piece of music and its corresponding sign gesture. The core of the system is a Sign Dance Model trained by multiple regression analysis to represent the correlations between sign dance and sign gesture/music, as well as a set of objective functions to evaluate the quality of the sign dance. Our system can be applied to music visualization, allowing people with hearing difficulties to understand and enjoy music.
Article
This article presents a global 3D human pose estimation method for markerless motion capture. Given two calibrated images of a person, it first obtains the 2D joint locations in the images using a pre-trained 2D Pose CNN, then constructs the 3D pose based on stereo triangulation. To improve the accuracy and the stability of the system, we propose two efficient optimization techniques for the joints. The first one, called cross-view refinement, optimizes the joints based on epipolar geometry. The second one, called cross-joint refinement, optimizes the joints using bone-length constraints. Our method automatically detects and corrects the unreliable joint, and consequently is robust against heavy occlusion, symmetry ambiguity, motion blur, and highly distorted poses. We evaluate our method on a number of benchmark datasets covering indoors and outdoors, which showed that our method is better than or on par with the state-of-the-art methods. As an application, we create a 3D human pose dataset using the proposed motion capture system, which contains about 480K images of both indoor and outdoor scenes, and demonstrate the usefulness of the dataset for human pose estimation.
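The stereo triangulation step described above can be sketched with the standard linear (DLT) method. This is a generic illustration under the usual pinhole model, not the authors' refined pipeline; the cross-view and cross-joint refinements are omitted:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one joint from two calibrated views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: 2D joint locations (u, v) in each image.
    Returns the 3D point X such that x_i ~ P_i @ [X, 1].
    """
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # least-squares solution: right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

With noise-free observations the homogeneous point lies exactly in the null space of `A`; with noisy 2D detections the SVD yields the algebraic least-squares estimate.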
Article
Creating realistic human movement is a time-consuming and labour-intensive task. The major difficulty is that the user has to edit individual joints while maintaining an overall realistic and collision-free posture. Previous research suggests the use of data-driven inverse kinematics, such that one can focus on the control of a few joints while the system automatically composes a natural posture. However, as a common problem of kinematics synthesis, penetration of body parts is difficult to avoid in complex movements. In this paper, we propose a new data-driven inverse kinematics framework that conserves the topology of the synthesized postures. Our system monitors and regulates the topology changes using the Gauss Linking Integral (GLI), such that penetration can be efficiently prevented. As a result, complex motions with tight body movements, as well as those involving interaction with external objects, can be simulated with minimal manual intervention. Experimental results show that, using our system, the user can create high-quality human motion in real time by controlling a few joints with a mouse or a multi-touch screen. The movement generated is both realistic and penetration free. Our system is best applied to interactive motion design in computer animations and games.
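The Gauss Linking Integral referred to above can be approximated numerically for two polyline curves. A minimal midpoint-rule sketch (a generic discretization, not the authors' implementation), assuming both curves are closed:

```python
import numpy as np

def gauss_linking(curve1, curve2):
    """Discrete Gauss linking integral between two closed polylines.

    curve1, curve2: (N, 3) vertex arrays; the last vertex connects back
    to the first. Returns a value near +/-1 for singly linked loops and
    near 0 for unlinked ones.
    """
    total = 0.0
    seg1 = np.roll(curve1, -1, axis=0) - curve1
    seg2 = np.roll(curve2, -1, axis=0) - curve2
    mid1 = curve1 + 0.5 * seg1          # segment midpoints (quadrature nodes)
    mid2 = curve2 + 0.5 * seg2
    for a, da in zip(mid1, seg1):
        r = a - mid2                    # (M, 3) separation vectors
        cross = np.cross(da, seg2)      # (M, 3) triple-product numerator
        total += np.sum(np.einsum('ij,ij->i', cross, r)
                        / np.linalg.norm(r, axis=1) ** 3)
    return total / (4.0 * np.pi)
```

Tracking how this value changes as limbs move is one way to detect topology changes (and hence impending penetration) between body parts.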
Article
Full-text available
Fast searching of content in large motion databases is essential for efficient motion analysis and synthesis. In this work we demonstrate that identifying locally similar regions in human motion data can be practical even for huge databases, if medium-dimensional (15–90 dimensional) feature sets are used for kd-tree-based nearest-neighbor searches. On the basis of kd-tree-based local neighborhood searches we devise a novel fast method for global similarity searches. We show that knn-searches can be used efficiently within the problems of (a) numerical and logical similarity searches, (b) reconstruction of motions from sparse marker sets, and (c) building so-called fat graphs, tasks for which previous algorithms required preprocessing time quadratic in the size of the database and were thus only applicable to small collections of motions. We test our techniques on the two largest freely available motion capture databases, the CMU and HDM05 motion databases, comprising more than 750 min of motion capture data, proving that our approach is not only theoretically applicable but also solves the problem of fast similarity searches in huge motion databases in practice. Keywords: Computer Graphics [I.3.7]: Three-Dimensional Graphics and Realism [Animation]; Information Storage and Retrieval [H.3]: Information Search and Retrieval
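The kd-tree-based nearest-neighbor search described above can be reproduced with an off-the-shelf structure such as SciPy's cKDTree. A minimal sketch, with random 30-dimensional vectors standing in for pose features:

```python
import numpy as np
from scipy.spatial import cKDTree

# Each row is one frame's medium-dimensional pose feature vector
# (e.g. 15-90 joint-position coordinates), stacked over the database.
rng = np.random.default_rng(0)
database = rng.random((10000, 30))

tree = cKDTree(database)          # build once; queries are then sub-linear

query = database[42]              # find frames similar to frame 42
dists, idx = tree.query(query, k=5)
assert idx[0] == 42 and dists[0] == 0.0   # nearest neighbour is the frame itself
```

Global similarity search then stitches such per-frame neighborhoods across time, which is the idea the abstract builds on.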
Conference Paper
Full-text available
We have proposed a dance partner robot, which has been developed as a platform for realizing effective human-robot coordination with physical interactions. In this paper, we improve an estimation system for dance steps, which estimates the next dance step intended by a human. For estimating the dance step, time-series data of the force/moment applied by the human to the robot are utilized. The time-series data measured while the human and the robot dance include uncertainty such as time lag and variation across repeated trials, because a human cannot always apply exactly the same force/moment to the robot. In order to handle time-series data with such uncertainty, hidden Markov models are utilized in designing the dance step estimation system. With the proposed system, the robot successfully estimates the next dance step based on the human's intention.
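HMM-based step estimation of this kind can be illustrated with the scaled forward algorithm for discrete HMMs. A minimal sketch (not the authors' implementation), assuming the force/moment readings have been quantized into discrete symbols and each candidate step has its own trained model:

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM).

    obs: sequence of observation symbol indices (e.g. quantized
         force/moment readings during a step).
    pi:  (S,) initial state distribution.
    A:   (S, S) transitions, A[i, j] = P(state j | state i).
    B:   (S, O) emissions, B[s, o] = P(symbol o | state s).
    """
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        scale = alpha.sum()          # rescale to avoid numerical underflow
        loglik += np.log(scale)
        alpha /= scale
    return loglik

def recognize(obs, step_models):
    """Pick the dance step whose HMM best explains the observations."""
    return max(step_models, key=lambda name: forward_loglik(obs, *step_models[name]))
```

Recognition is then a simple argmax of per-model likelihoods over the incoming sensor stream.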
Conference Paper
Full-text available
A design methodology to build miniature humanoid robots is discussed. Although light and small bodies make aggressive motion experiments much safer and smoother, they often cause self-collisions and restrict the space available to mount mechatronic components. To overcome kinematic difficulties, including the former issue, a technique to modularize and assign joints is proposed through our prototyped robot. As a solution to the latter issue, a portable core control unit which stores a stand-alone electronic system is also introduced through the second version of our humanoid, whose system centers around it.
Article
Full-text available
In this paper, a new dance training system based on motion capture and virtual reality (VR) technologies is proposed. Our system is inspired by the traditional way to learn new movements: imitating the teacher's movements and listening to the teacher's feedback. A prototype of our proposed system is implemented, in which a student can imitate the motion demonstrated by a virtual teacher projected on the wall screen. Meanwhile, the student's motions are captured and analyzed by the system, based on which feedback is given to the student. The results of user studies showed that our system can successfully guide students to improve their skills. The subjects agreed that the system is interesting and can motivate them to learn.
Conference Paper
Full-text available
We present a method to decompose an arbitrary 3D piecewise linear complex (PLC) into a constrained Delaunay tetrahedralization (CDT). It resolves the problem of the non-existence of a CDT by updating the input PLC into another PLC which is topologically and geometrically equivalent to the original one and does have a CDT. Based on a strong CDT existence condition, the redefinition is done by segment splitting and vertex perturbation. Once the CDT exists, a practically fast cavity retetrahedralization algorithm recovers the missing facets. This method has been implemented and tested on various examples. In practice, it behaves robustly and efficiently for relatively complicated 3D domains.
Conference Paper
Full-text available
It is difficult to create scenes where multiple avatars are fighting or competing with each other. Manually creating the motions of avatars is time consuming due to the correlation of the movements between the avatars. Capturing the motions of multiple avatars is also difficult as it requires a huge amount of post-processing. In this paper, we propose a new method to generate a realistic scene of avatars densely interacting in a competitive environment. The motions of the avatars are captured individually, which makes the data easier to obtain. We propose a new algorithm called the temporal expansion approach, which maps the continuous-time action plan to a discrete space such that turn-based evaluation methods can be used. As a result, many mature game algorithms such as min-max search and α-β pruning can be applied. Using our method, avatars plan their strategies taking into account the reaction of the opponent. Fighting scenes with multiple avatars are generated to demonstrate the effectiveness of our algorithm. The proposed method can also be applied to other kinds of continuous activities that require strategy planning, such as sports games.
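Once the continuous action plan is mapped to a discrete, turn-based space, standard min-max search with α-β pruning applies. A generic sketch over a toy game tree, where leaf values stand in for action rewards (this is the textbook algorithm, not the paper's full planner):

```python
def alphabeta(node, maximizing, alpha=float('-inf'), beta=float('inf')):
    """Minimax with alpha-beta pruning over a tree of nested lists.

    Internal nodes are lists of children; leaves are numeric payoffs
    (here: the reward of one avatar's action plan against the opponent).
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break               # opponent will never allow this branch
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

# small example tree: the maximizing avatar can guarantee a payoff of 5
tree = [[[3, 5], [6, 9]], [[1, 2], [0, -1]]]
assert alphabeta(tree, True) == 5
```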
Conference Paper
Full-text available
We have implemented an interactive dancing game using optical 3D motion capture technology. We propose a Progressive Block Matching algorithm to recognize the dance moves performed by the player in real time. This enables a virtual partner to recognize and respond to the player's movement without a noticeable delay. The completion progress of a move is tracked progressively, and the virtual partner's move is rendered in synchronization with the player's current action. Our interactive dancing game contains moves of various difficulty levels, suitable for both novices and skilled players. Through animating the virtual partner in response to the player's movements, the player becomes immersed in the virtual environment. A user test was performed for a subjective evaluation of our game, and the feedback from the subjects was positive.
Conference Paper
Full-text available
In this paper, mimetic communication is extended to human-robot interaction tasks in which physical contact transitions must be handled. Mimetic communication consists of imitation learning for learning low-level motion primitives and a higher-level interaction learning stage in which the information about the human-robot contacts is also included. For the imitation learning, Cartesian marker data from a motion capture system are used. A modification of the low-level marker trajectory following algorithm is presented, which allows the trajectory of the motion primitive to be reshaped in accordance with the human hand motion in real time. Moreover, for performing safe contact motion, an appropriate impedance controller is integrated into the setting. All the presented concepts are evaluated in experiments with a humanoid robot.
Conference Paper
Full-text available
A fast online gait planning method is proposed. Based on an approximate dynamical biped model whose mass is concentrated at the COG, the general solution of the equation of motion is obtained analytically. The dynamical constraint on the external reaction force due to underactuation is resolved by boundary condition relaxation, namely, by admitting some error between the desired and actually reached state. It can create responsive motions that require strong instantaneous acceleration by accepting discontinuity of the ZMP trajectory, which is designed as an exponential function. A semi-automatic continuous gait planning method is also presented. It generates a physically feasible reference trajectory of the whole body from only the next desired foot placement. The validity of the proposed method is verified through both simulations and experiments with a small anthropomorphic robot.
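The analytically solvable concentrated-mass model referred to above is commonly the linear inverted pendulum. A sketch of its closed-form solution (generic textbook model, not the paper's planner; the parameter name `z_c` for COG height is an assumption):

```python
import numpy as np

def lipm_state(x0, v0, t, z_c=0.8, g=9.81):
    """Analytic state of the linear inverted pendulum model (LIPM).

    The COG dynamics x'' = (g / z_c) x have the closed-form solution
        x(t) = x0 cosh(t/Tc) + Tc v0 sinh(t/Tc),  Tc = sqrt(z_c / g),
    so the state at any future time is available without integration,
    which is what makes fast online gait planning possible.
    """
    Tc = np.sqrt(z_c / g)
    x = x0 * np.cosh(t / Tc) + Tc * v0 * np.sinh(t / Tc)
    v = (x0 / Tc) * np.sinh(t / Tc) + v0 * np.cosh(t / Tc)
    return x, v
```

A useful check is the conserved "orbital energy" v² − (x/Tc)², which the closed form preserves exactly.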
Conference Paper
Full-text available
In this paper we present the European project Open Dance and, in particular, our contribution to the 3D simulation of folk dances and their presentation on the web. Our aim is to provide a learning framework for folk dances. First we describe the conceptual and learning model that we apply, focusing on the requirements of dance education. Then we digitize folk dances originating from several regions of Europe, using an optical motion capture system as the recording device. We allow dance teachers and students to use our web3D platform and interact with the animated dancers, aiming at a better understanding of the dances. Students interact with the platform and observe how the virtual dance teachers perform. The evaluation of the system shows that the increased usability of our approach enhances the learning process. Our long-term objective is to create an online dance learning community and allow dance teachers to create their own dance lessons online.
Conference Paper
Full-text available
Figure 1: The interactions of articulated avatars are generated by maximizing a reward function defined by the relative pose between avatars, the effectiveness of actions, and/or user-defined constraints. This framework of synthesizing character animation is efficient and flexible enough to support a variety of practical applications, including (a) interactive character control using high-level motion descriptions such as punches, kicks, avoids and dodges, (b) real-time massive character interactions by a large number of automated avatars, and (c) collaborative motion synthesis such as carrying large luggage by two persons. Efficient computation of strategic movements is essential to control virtual avatars intelligently in computer games and 3D virtual environments. Such a module is needed to control non-player characters (NPCs) to fight, play team sports or move through a mass crowd. Reinforcement learning is an approach to achieve real-time optimal control. However, the huge state space of human interactions makes it difficult to apply existing learning methods to control avatars when they have dense interactions with other characters. In this research, we propose a new methodology to efficiently plan the movements of an avatar interacting with another. We make use of the fact that the subspace of meaningful interactions is much smaller than the whole state space of two avatars. We efficiently collect samples by exploring the subspace where dense interactions between the avatars occur and favor samples that have high connectivity with the other samples. Using the collected samples, a finite state machine (FSM) called an Interaction Graph is composed. At run-time, we compute the optimal action of each avatar by min-max search or dynamic programming on the Interaction Graph. The methodology is applicable to controlling NPCs in fighting and ball-sports games.
Conference Paper
Full-text available
Fast searching of content in large motion databases is essential for efficient motion analysis and synthesis. In this work we demonstrate that identifying locally similar regions in human motion data can be practical even for huge databases, if medium-dimensional (15–90 dimensional) feature sets are used for kd-tree-based nearest-neighbor searches. On the basis of kd-tree-based local neighborhood searches we devise a novel fast method for global similarity searches. We show that knn-searches can be used efficiently within the problems of (a) numerical and logical similarity searches, (b) reconstruction of motions from sparse marker sets, and (c) building so-called "fat graphs", tasks for which previous algorithms required preprocessing time quadratic in the size of the database and were thus only applicable to small collections of motions. We test our techniques on the two largest freely available motion capture databases, the CMU and HDM05 motion databases, comprising more than 750 min of motion capture data, proving that our approach is not only theoretically applicable but also solves the problem of fast similarity searches in huge motion databases in practice.
Article
Full-text available
An ANSI C code for sparse LU factorization is presented that combines a column pre-ordering strategy with a right-looking unsymmetric-pattern multifrontal numerical factorization. The pre-ordering and symbolic analysis phase computes an upper bound on fill-in, work, and memory usage during the subsequent numerical factorization. User-callable routines are provided for ordering and analyzing a sparse matrix, computing the numerical factorization, solving a system with the LU factors, transposing and permuting a sparse matrix, and converting between sparse matrix representations. The simple user interface shields the user from the details of the complex sparse factorization data structures by returning simple handles to opaque objects. Additional user-callable routines are provided for printing and extracting the contents of these opaque objects. An even simpler way to use the package is through its MATLAB interface. UMFPACK is incorporated as a built-in operator in MATLAB 6.5 as x = A\b when A is sparse and unsymmetric.
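From Python, UMFPACK is most conveniently reached through SciPy rather than the C API sketched above: `scipy.sparse.linalg.spsolve` dispatches to UMFPACK when scikit-umfpack is installed (falling back to SuperLU otherwise). A minimal illustration on a small unsymmetric system:

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

# small unsymmetric sparse system A x = b
A = csc_matrix(np.array([[4.0, 1.0, 0.0],
                         [0.0, 3.0, 2.0],
                         [1.0, 0.0, 5.0]]))
b = np.array([9.0, 8.0, 7.0])

x = spsolve(A, b)    # LU-factorize and solve; uses UMFPACK if available
assert np.allclose(A @ x, b)
```

Large deformation-based motion editing problems, such as those cited above, reduce to exactly this kind of sparse linear solve at each optimization step.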
Article
Full-text available
Figure 1: Our system can retarget motions of close interactions to characters of different morphologies. A judo interaction (red/orange pair) retargeted to characters of different sizes.
Abstract: This paper presents a new method for editing and retargeting motions that involve close interactions between body parts of single or multiple articulated characters, such as dancing, wrestling, and sword fighting, or between characters and a restricted environment, such as getting into a car. In such motions, the implicit spatial relationships between body parts/objects are important for capturing the scene semantics. We introduce a simple structure called an interaction mesh to represent such spatial relationships. By minimizing the local deformation of the interaction meshes of animation frames, such relationships are preserved during motion editing while reducing the number of inappropriate interpenetrations. The interaction mesh representation is general and applicable to various kinds of close interactions. It also works well for interactions involving contacts and tangles as well as those without any contacts. The method is computationally efficient, allowing real-time character control. We demonstrate its effectiveness and versatility in synthesizing a wide variety of motions with close interactions.
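A minimal illustration of the interaction-mesh idea: each joint is represented by its Laplacian coordinate (its offset from the centroid of its mesh neighbors), and editing penalizes changes to these coordinates. The mesh connectivity below is hand-picked for the toy example rather than obtained by Delaunay tetrahedralization as in the paper.

```python
import numpy as np

def laplacian_coords(verts, neighbors):
    """Offset of each vertex from the centroid of its mesh neighbors."""
    return np.array([verts[list(nb)].mean(axis=0) * -1 + verts[i]
                     for i, nb in enumerate(neighbors)])

def deformation_energy(orig, edited, neighbors):
    """Sum of squared changes in Laplacian coordinates (the quantity minimized)."""
    d = laplacian_coords(edited, neighbors) - laplacian_coords(orig, neighbors)
    return float((d ** 2).sum())

# Four interacting joints (e.g. two per character) and their mesh edges.
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [1., 1., 0.]])
neighbors = [[1, 2], [0, 3], [0, 3], [1, 2]]

# A rigid translation changes no spatial relationship: zero energy.
translated = verts + np.array([5., 0., 0.])

# Moving a single joint distorts the mesh: positive energy.
distorted = verts.copy()
distorted[3] += np.array([0., 0., 1.])
```

Because the energy is invariant to translation but sensitive to relative displacement, minimizing it preserves the spatial relationships between close body parts while the motion is edited.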
Article
Full-text available
“Mimesis” theory, studied in the cognitive science field, and the “mirror neurons” found in the biology field show that the behavior generation process is not independent of the behavior cognition process; the two processes are closely related. During behavioral imitation, a human being does not simply perform joint coordinate transformation, but recognizes the parent's behavior, understands it after abstracting it into symbols, and then generates its own behavior. Focusing on these facts, we propose a new method that carries out the behavior cognition and behavior generation processes at the same time. We also propose a mathematical model based on hidden Markov models that integrates four abilities: (1) symbol emergence; (2) behavior recognition; (3) self-behavior generation; (4) acquisition of motion primitives. Finally, the feasibility of this method is shown through several experiments on a humanoid robot.
Article
Full-text available
In human motion control applications, the mapping between a control specification and an appropriate target motion often defies an explicit encoding. We present a method that allows such a mapping to be defined by example, given that the control specification is recorded motion. Our method begins by building a database of semantically meaningful instances of the mapping, each of which is represented by synchronized segments of control and target motion. A dynamic programming algorithm can then be used to interpret an input control specification in terms of mapping instances. This interpretation induces a sequence of target segments from the database, which is concatenated to create the appropriate target motion. We evaluate our method on two examples of indirect control. In the first, we synthesize a walking human character that follows a sampled trajectory. In the second, we generate a synthetic partner for a dancer whose motion is acquired through motion capture.
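The dynamic-programming interpretation of a control motion described above can be sketched as a Viterbi-style chain over synchronized (control, target) segment pairs. The database, features, and transition penalty below are toy assumptions, not the paper's actual data.

```python
import numpy as np

def interpret(control_segs, database, trans_penalty=0.1):
    """Viterbi-style DP: pick one database mapping instance per control segment,
    trading match cost against a penalty for switching target segments."""
    n, m = len(control_segs), len(database)
    cost = np.full((n, m), np.inf)
    back = np.zeros((n, m), dtype=int)
    match = np.array([[np.linalg.norm(c - ctrl) for ctrl, _ in database]
                      for c in control_segs])
    cost[0] = match[0]
    for t in range(1, n):
        for j in range(m):
            trans = np.array([0.0 if k == j else trans_penalty for k in range(m)])
            prev = cost[t - 1] + trans
            back[t, j] = int(np.argmin(prev))
            cost[t, j] = prev[back[t, j]] + match[t, j]
    # Backtrack the cheapest chain and emit the induced target segments.
    path = [int(np.argmin(cost[-1]))]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return [database[j][1] for j in path]

# Toy database: 1-D control feature -> labeled target segment.
database = [(np.array([0.0]), "step_left"), (np.array([1.0]), "step_right")]
controls = [np.array([0.1]), np.array([0.9]), np.array([0.05])]
targets = interpret(controls, database)
```

Concatenating the returned target segments (with blending at the seams) yields the synthesized partner motion; in the paper's dancing example, the control segments come from the captured dancer and the targets from the recorded partner.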
Article
Full-text available
Human motion indexing and retrieval are important for animators due to the need to search for motions in a database that can be blended and concatenated. Most previous research on human motion indexing and retrieval computes the Euclidean distance of joint angles or joint positions. Such approaches are difficult to apply to cases in which multiple characters are closely interacting with each other, as the relationships between the characters are not encoded in the representation. In this research, we propose a topology-based approach to index the motions of two human characters in close contact. We compute and encode how the two bodies are tangled based on the concept of rational tangles. The encoded relationships, which we define as a TangleList, are used to determine the similarity of pairs of postures. Using our method, we can index and retrieve motions such as one person piggy-backing another, one person assisting another in walking, and two persons dancing / wrestling. Our method is useful for managing a motion database of multiple characters. We can also produce motion graph structures of two characters closely interacting with each other by interpolating and concatenating topologically similar postures and motion clips, which are applicable to 3D computer games and computer animation.
Article
Full-text available
The main purpose of this paper is to realize effective human-robot coordination with physical interaction. A dance partner robot has been proposed as a platform for it. To realize effective human-robot coordination, recognizing human intention is one of the key issues. This paper focuses on an estimation method for dance steps, which estimates the next dance step intended by a human. In estimating the dance step, time series data of force/moment applied by the human to the robot are used. The time series data of force/moment measured while dancing include uncertainty such as time lag and variation across repeated trials, because the human cannot always apply exactly the same force/moment to the robot. In order to treat time series data including such uncertainty, hidden Markov models are utilized in designing the dance step estimation method. With the proposed method, the robot successfully estimates the next dance step based on human intention.
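The forward-algorithm scoring underlying such HMM step estimators can be sketched as follows: each candidate step is modeled as a discrete HMM over quantized force/moment symbols, and the step whose model best explains the observed sequence is chosen. The transition/emission parameters and symbols below are invented for illustration.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM with prior pi,
    transition matrix A, emission matrix B over discrete symbols)."""
    alpha = pi * B[:, obs[0]]
    logp = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        logp += np.log(alpha.sum())
        alpha /= alpha.sum()
    return logp

pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
# One emission matrix per candidate step, over 2 quantized force symbols.
models = {
    "step_forward": np.array([[0.8, 0.2], [0.7, 0.3]]),
    "step_back":    np.array([[0.2, 0.8], [0.3, 0.7]]),
}
obs = [0, 0, 1, 0, 0]   # quantized force/moment readings from the follower
predicted = max(models, key=lambda k: forward_loglik(obs, pi, A, models[k]))
```

Because the forward pass marginalizes over hidden state sequences, this scoring tolerates exactly the time lag and trial-to-trial variation the paper describes.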
Article
Full-text available
Real-time control of three-dimensional avatars is an important problem in the context of computer games and virtual environments. Avatar animation and control is difficult, however, because a large repertoire of avatar behaviors must be made available, and the user must be able to select from this set of behaviors, possibly with a low-dimensional input device. One appealing approach to obtaining a rich set of avatar behaviors is to collect an extended, unlabeled sequence of motion data appropriate to the application. In this paper, we show that such a motion database can be preprocessed for flexibility in behavior and efficient search and exploited for real-time avatar control. Flexibility is created by identifying plausible transitions between motion segments, and efficient search through the resulting graph structure is obtained through clustering. Three interface techniques are demonstrated for controlling avatar motion using this data structure: the user selects from a set of available choices, sketches a path through an environment, or acts out a desired motion in front of a video camera. We demonstrate the flexibility of the approach through four different applications and compare the avatar motion to directly recorded human motion.
Article
A ballroom dance is a performance between a male and a female dancer and consists of predefined steps. The dance is led by the male dancer, and the female dancer continues to dance by estimating the following step through physical interaction between them. A dance partner robot, PBDR, has been proposed as a research platform for human-robot coordination. It dances a waltz with a male dancer by estimating the following step led by the male dancer. The step estimator has been designed based on the hidden Markov model. The parameters of the hidden Markov model are determined based on a set of time series data of force/moment applied to the upper body of the robot by the male dancer. The proposed method is effective for the male dancer from whom the teaching data are collected, although the success rate of the step estimation with another male dancer is not always high. In this paper, a step estimation method for a dance partner robot is proposed which updates the parameters of the hidden Markov model at each step transition and improves the success rate of the dance step estimation for any dance partner. Experimental results illustrate the effectiveness of the proposed method.
Article
In this paper, we present a technique for retargeting motion: the problem of adapting an animated motion from one character to another. Our focus is on adapting the motion of one articulated figure to another figure with identical structure but different segment lengths, although we use this as a step when considering less similar characters. Our method creates adaptations that preserve desirable qualities of the original motion. We identify specific features of the motion as constraints that must be maintained. A spacetime constraints solver computes an adapted motion that re-establishes these constraints while preserving the frequency characteristics of the original signal. We demonstrate our approach on motion capture data.
Article
In this paper, we propose a new method to efficiently synthesize character motions that involve close contacts such as wearing a T-shirt, passing the arms through the strings of a knapsack, or piggy-back carrying an injured person. We introduce the concept of topology coordinates, in which the topological relationships of the segments are embedded into the attributes. As a result, the computation for collision avoidance can be greatly reduced for complex motions that require tangling the segments of the body. Our method can be combined with other prevalent frame-based optimization techniques such as inverse kinematics.
Conference Paper
This paper proposes an approach to construct a system which allows humanoid robots to recognize human behaviors and predict future behaviors. The system consists of two modules: the "motion symbol tree" and the "motion symbol graph". Human demonstrator motion patterns are stored as motion symbols, which abstract the motion data using Hidden Markov Models. The stored motion patterns are organized into a hierarchical tree structure, which represents the similarity among the motion patterns and provides abstracted motion patterns. This hierarchical structure is the motion symbol tree. Concatenated sequences of motion patterns are stochastically represented as transitions between the motion patterns using an N-gram model, and the causality among human behaviors is extracted. This structure is the motion symbol graph. The behavioral hierarchy and transition model make it possible to predict human behaviors during observation and to generate sequences of motion patterns automatically while maintaining a natural motion stream, as if the system were a "crystal ball" reflecting future behaviors. The experiments demonstrate the validity of the proposed framework on large-scale motion data.
Conference Paper
As an example platform for realizing effective human-robot coordination with physical interaction, a dance partner robot has been developed which can dance together with a human by estimating the next dance step intended by the human. Generating the robot's active motion so as to adapt to its user is one of the essential functions of next-generation robotic technology. This paper proposes a cooperative motion generation method for the robot, implemented by adjusting the length of the dance step stride based on the physical interaction between the human and the robot. Experimental results illustrate the validity of the proposed method.
Conference Paper
This paper discusses advanced human-robot coordination systems. In order to realize the coordination, robots have to behave not only as followers but also as leaders when they execute tasks with humans. As an example of the advanced human-robot coordination systems, a male-type dance partner robot is developed, which behaves as a male dancer and executes ballroom dances with a human. In ballroom dances, a male dancer leads a female dancer, and selects the next step based on information such as the relative position between themselves and other dance couples, their positions on the dance floor, and so on. This paper addresses the step selection problem, which involves collision avoidance with other dance couples. Hidden Markov models are used to estimate their dance step trajectories. Experiments and simulations are performed to illustrate the validity of the proposed method.
Conference Paper
A volume-based realistic communication system called Haptic Communication is described. The system allows participants to interact in real time with others at remote locations on the network using haptic perception (sense of touch) of soft objects in virtual environments. We constructed the system so that it provides a sense of touch at remote locations in real time. First, an adaptive volume model represents virtual soft objects in PCs at remote locations. Next, the reflection force of the soft object is calculated rapidly and accurately from the parameters of positions and forces at contacting points transmitted via network at each PC. Eventually, the haptic and visual information are rendered by a haptic device (PHANToM) and a volume graphic software in the PCs. We investigated the efficiency of our system via experiments on a simulation of needle insertion with high force feedback rates at two remote locations on a WAN between Ritsumeikan University, Biwako Kusatsu Campus and Shiga Medical University. The experiment results show that the delay due to network traffic is negligible.
Conference Paper
We describe a discriminative method for distinguishing natural-looking from unnatural-looking motion. Our method is based on physical and data-driven features of motion to which humans seem sensitive. We demonstrate that our technique is significantly more accurate than current alternatives. We use this technique as the testing part of a hypothesize-and-test motion synthesis procedure. The mechanism we build using this procedure can quickly provide an application with a transition of user-specified duration from any frame in a motion collection to any other frame in the collection. During pre-processing, we search all possible 2-, 3-, and 4-way blends between representative samples of motion obtained using clustering. The blends are automatically evaluated, and the recipe (i.e., the representatives and the set of weighting functions) that created the best blend is cached. At run-time, we build a transition between motions by matching a future window of the source motion to a representative, matching the past of the target motion to a representative, and then applying the blend recipe recovered from the cache to source and target motion. People seem sensitive to poor contact with the environment like sliding foot plants. We determine appropriate temporal and positional constraints for each foot plant using a novel technique, then apply an off-the-shelf inverse kinematics technique to enforce the constraints. This synthesis procedure yields good-looking transitions between distinct motions with very low online cost.
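The blend-based transition at the heart of such procedures can be illustrated with a smoothstep weighting over equal-length windows. The clips here are toy 1-D signals; a real system blends whole joint trajectories and additionally enforces the foot-plant constraints described above.

```python
import numpy as np

def transition(source_tail, target_head):
    """Blend equal-length windows of two clips with a smoothstep weight."""
    n = len(source_tail)
    t = np.linspace(0.0, 1.0, n)
    w = 3 * t**2 - 2 * t**3          # smoothstep: 0 -> 1, zero slope at both ends
    return (1 - w) * source_tail + w * target_head

src = np.full(5, 1.0)   # last frames of the source motion (toy signal)
tgt = np.full(5, 3.0)   # first frames of the target motion (toy signal)
blend = transition(src, tgt)
```

The zero-slope endpoints of the smoothstep weight avoid velocity discontinuities where the transition joins the untouched source and target frames.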
Conference Paper
This paper presents a physics-based method for creating complex multi-character motions from short single-character sequences. We represent multi-character motion synthesis as a spacetime optimization problem where constraints represent the desired character interactions. We extend standard spacetime optimization with a novel timewarp parameterization in order to jointly optimize the motion and the interaction constraints. In addition, we present an optimization algorithm based on block coordinate descent and continuations that can be used to solve the large problems multiple characters usually generate. This framework allows us to synthesize multi-character motion drastically different from the input motion. Consequently, a small input motion dataset is sufficient to express a wide variety of multi-character motions.
Article
Creating controllable, responsive avatars is an important problem in computer games and virtual environments. Recently, large collections of motion capture data have been exploited for increased realism in avatar animation and control. Large motion sets have the advantage of accommodating a broad variety of natural human motion. However, when a motion set is large, the time required to identify an appropriate sequence of motions is the bottleneck for achieving interactive avatar control. In this paper, we present a novel method of precomputing avatar behavior from unlabelled motion data in order to animate and control avatars at minimal runtime cost. Based on dynamic programming, our method finds a control policy that indicates how the avatar should act in any given situation. We demonstrate the effectiveness of our approach through examples that include avatars interacting with each other and with the user.
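The dynamic-programming precomputation of a control policy can be sketched as value iteration over a small abstract state space. The states, transitions, and rewards below are toy stand-ins for the motion-data states the paper uses.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, iters=200):
    """Deterministic value iteration.
    P[a][s] = next state under action a; R[a][s] = immediate reward."""
    n_actions, n_states = R.shape
    V = np.zeros(n_states)
    for _ in range(iters):
        # Q-value of taking action a in each state, then acting optimally.
        Q = np.array([R[a] + gamma * V[P[a]] for a in range(n_actions)])
        V = Q.max(axis=0)
    return Q.argmax(axis=0)   # precomputed policy: best action per state

# 3 abstract states; action 0 stays put, action 1 advances toward goal state 2.
P = np.array([[0, 1, 2], [1, 2, 2]])
R = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
policy = value_iteration(P, R)
```

The resulting table is the "how to act in any given situation" lookup: at runtime the avatar just indexes its current state, with no search, which is the paper's point about minimal runtime cost.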
Article
In this paper, we present an interactive dancing game based on motion capture technology. We address the problem of real-time recognition of the user's live dance performance in order to determine the interactive motion to be rendered by a virtual dance partner. The real-time recognition algorithm is based on a human body partition indexing scheme with flexible matching to determine the end of a move as well as to detect unwanted motion. We show that the system can recognize the live dance motions of users with good accuracy and render the interactive dance move of the virtual partner. Copyright © 2011 John Wiley & Sons, Ltd.
Article
This paper proposes a new framework to simulate the real-time attack-and-defense interactions by two virtual wrestlers in 3D computer games. The characters are controlled individually by two different players: one player controls the attacker and the other controls the defender. A finite state machine of attacks and defenses based on topology coordinates is precomputed and used to control the virtual wrestlers during game play. As the states are represented by topology coordinates, which are an abstract representation of the spatial relationship of the bodies, the players have many more degrees of freedom to control the virtual characters even during attacks and defenses. Experimental results show the methodology can simulate realistic competitive interactions of wrestling in real time, which is difficult with previous methods. Copyright © 2010 John Wiley & Sons, Ltd.
Article
This paper presents an inverse kinematics system based on a learned model of human poses. Given a set of constraints, our system can produce the most likely pose satisfying those constraints, in realtime. Training the model on different input data leads to different styles of IK. The model is represented as a probability distribution over the space of all possible poses. This means that our IK system can generate any pose, but prefers poses that are most similar to the space of poses in the training data. We represent the probability with a novel model called a Scaled Gaussian Process Latent Variable Model. The parameters of the model are all learned automatically; no manual tuning is required for the learning component of the system. We additionally describe a novel procedure for interpolating between styles.
Article
Large motion data sets often contain many variants of the same kind of motion, but without appropriate tools it is difficult to fully exploit this fact. This paper provides automated methods for identifying logically similar motions in a data set and using them to build a continuous and intuitively parameterized space of motions. To find logically similar motions that are numerically dissimilar, our search method employs a novel distance metric to find "close" motions and then uses them as intermediaries to find more distant motions. Search queries are answered at interactive speeds through a precomputation that compactly represents all possibly similar motion segments. Once a set of related motions has been extracted, we automatically register them and apply blending techniques to create a continuous space of motions. Given a function that defines relevant motion parameters, we present a method for extracting motions from this space that accurately possess new parameters requested by the user. Our algorithm extends previous work by explicitly constraining blend weights to reasonable values and having a run-time cost that is nearly independent of the number of example motions. We present experimental results on a test data set of 37,000 frames, or about ten minutes of motion sampled at 60 Hz.
Article
We present a new approach to realtime character animation with interactive control. Given a corpus of motion capture data and a desired task, we automatically compute near-optimal controllers using a low-dimensional basis representation. We show that these controllers produce motion that fluidly responds to several dimensions of user control and environmental constraints in realtime. Our results indicate that very few basis functions are required to create high-fidelity character controllers which permit complex user navigation and obstacle-avoidance tasks.
Article
A simple but highly effective approach for transforming high-performance implementations of matrix-matrix multiplication on cache-based architectures into implementations of other commonly used matrix-matrix computations (the level-3 BLAS) is presented. Exceptional performance is demonstrated on various architectures.
Conference Paper
A method of real-time recognition of body motion for virtual dance collaboration system is described. Fourteen feature values are extracted from motion captured body motion data, and the dimension of data is reduced by using principal component analysis (PCA). In the training phase, templates for motion recognition are constructed from training samples of several types of motion. In the recognition phase, feature values obtained from a real dancer's motion data are projected to the subspace obtained by PCA, and the system recognizes the real dancer's motion by comparing with the motion templates. In this paper, the method and the experiments using seven kinds of basic motions are presented. The recognition experiment proved that the method could be used for motion recognition. A preliminary experiment in which a real dancer and a virtual dancer collaborate with body motion was also carried out.
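A compact sketch of the PCA-plus-template recognition pipeline described above, with synthetic features standing in for the fourteen motion feature values the paper extracts; class names and sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two toy motion classes in a 14-D feature space, clustered around distinct means.
mean_a, mean_b = np.zeros(14), np.full(14, 3.0)
train = np.vstack([mean_a + 0.1 * rng.standard_normal((20, 14)),
                   mean_b + 0.1 * rng.standard_normal((20, 14))])
labels = np.array(["spin"] * 20 + ["wave"] * 20)

# PCA via SVD of the centered training data; keep the top 2 components.
center = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - center, full_matrices=False)

def project(x):
    """Project a feature vector into the 2-D PCA subspace."""
    return (x - center) @ Vt[:2].T

# One template per motion class: the projected class mean.
templates = {lbl: project(train[labels == lbl].mean(axis=0))
             for lbl in ("spin", "wave")}

def recognize(feature):
    """Recognize a live pose by its nearest template in the subspace."""
    p = project(feature)
    return min(templates, key=lambda l: np.linalg.norm(templates[l] - p))
```

In the recognition phase, each incoming frame's feature vector is projected and matched against the stored templates; dimensionality reduction keeps this cheap enough for real-time collaboration.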
Article
In March 2001 we started to collect the CMU Motion of Body (MoBo) database. To date the database contains 25 individuals walking on a treadmill in the CMU 3D room. The subjects perform four different activities: slow walk, fast walk, incline walk and walking with a ball. All subjects are captured using six high resolution color cameras distributed evenly around the treadmill. In this technical report we describe the capture setup, the collection procedure and the organization of the database.
Article
In this paper we present a novel method for creating realistic, controllable motion. Given a corpus of motion capture data, we automatically construct a directed graph called a motion graph that encapsulates connections among the database. The motion graph consists both of pieces of original motion and automatically generated transitions. Motion can be generated simply by building walks on the graph. We present a general framework for extracting particular graph walks that meet a user's specifications. We then show how this framework can be applied to the specific problem of generating different styles of locomotion along arbitrary paths.
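Motion-graph construction by thresholded pose similarity can be sketched as follows, with scalar toy poses standing in for a real pose-distance metric over joint configurations.

```python
import numpy as np

def build_motion_graph(poses, threshold):
    """Edges: consecutive frames (original motion), plus generated
    transitions between sufficiently similar non-adjacent frames."""
    n = len(poses)
    edges = {i: [] for i in range(n)}
    for i in range(n - 1):
        edges[i].append(i + 1)                  # continuity of original clips
    for i in range(n):
        for j in range(n):
            if abs(i - j) > 1 and abs(poses[i] - poses[j]) < threshold:
                edges[i].append(j)              # automatically generated transition
    return edges

# Toy 1-D "poses": frames 0 and 3 are similar, as are frames 1 and 4.
poses = np.array([0.0, 1.0, 2.0, 0.05, 1.1])
graph = build_motion_graph(poses, threshold=0.2)
```

Motion is then generated by walking this graph: any path alternates between pieces of original motion and the generated transitions, which is how different styles of locomotion are strung along arbitrary paths.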
Article
There are no exponential factors in space, implying that the data structure is practical even for very large data sets in high-dimensional spaces, irrespective of ε. ANN is written as a testbed for a class of nearest neighbor searching algorithms, particularly those based on orthogonal decompositions of space. These include k-d trees [3, 4], balanced box-decomposition trees [2], and other related spatial data structures (see Samet [5]). The library supports a number of different methods for building search structures. It also supports two methods for searching these structures: standard tree-ordered search [1] and priority search [2]. In priority search, the cells of the data structure are visited in increasing order of distance from the query point. In addition to the library, two programs are provided for testing and evaluating the performance of various search methods.
GROSS, R., AND SHI, J. 2001. The CMU Motion of Body (MoBo) database. Tech. rep. CMU-RI-TR-01-18.
ARIKAN, O., FORSYTH, D. A., AND O'BRIEN, J. F. 2003. Motion synthesis from annotations. ACM Trans. Graph. 22, 3, 402-408.
NAKAMURA AND YAMANE LABORATORY. 2005. Animatronic humanoid robot project in prototype robot exhibition, Aichi EXPO. http://www.ynl.t.u-tokyo.ac.jp/research/expo2005/expo2005-e.html.