
1551-3203 (c) 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See

http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2017.2760912, IEEE

Transactions on Industrial Informatics

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. XX, NO. X, MM 20YY 1

Multi-Contact Interaction Force Sensing

from Whole-Body Motion Capture

Tu-Hoa Pham, Stéphane Caron, and Abderrahmane Kheddar, Senior Member, IEEE

Abstract—We present a novel technique that unobtrusively estimates forces exerted by human participants in multi-contact interaction with rigid environments. Our method uses motion capture only, thus circumventing the need to set up cumbersome force transducers at all potential contacts between the human body and the environment. This problem is particularly challenging, as the knowledge of a given motion only characterizes the resultant force, which can generally be caused by an infinity of force distributions over individual contacts. We collect and release a large-scale dataset on how humans instinctively regulate interaction forces in diverse multi-contact tasks and motions. The force estimation framework we propose leverages physics-based optimization and neural networks to reconstruct force distributions that are physically realistic and compatible with real interaction force patterns. We show the effectiveness of our approach on various locomotion and multi-contact scenarios.

Index Terms—Force sensing from motion capture, neural networks, physics-based optimization, whole-body, multi-contact.

I. INTRODUCTION

HUMAN motions result from skilled control of the physical interactions with the environment through contacts. Thus, haptic perception is a fundamental theme towards action understanding and control. The monitoring of contact forces is already widely used in various fields such as robot learning from demonstration and control [1], [2], physics-based animation [3], [4], and visual tracking [5], [6]. Measurement of contact forces is usually achieved by mounting force transducers at pre-fixed contact locations, making it a costly, cumbersome and intrusive process that is difficult to use in daily settings. Mounting force transducers on the person obstructs their natural motion and is not sustainable for daily use. In contrast, the accurate monitoring of interaction forces from motion capture alone, which can readily be achieved using consumer-grade cameras [7], [8], would enable a wide range of applications in personal robotics, human-computer interaction, and rehabilitation [9] as a new unobtrusive biosensor for the healthcare Internet of Things [10].

However, this problem is very difficult due to the indeterminacy of force distributions in multi-contact. Indeed, while the knowledge of external and internal forces uniquely determines the resulting kinematics for a given articulated system, even a perfectly known motion does not suffice to fully characterize the underlying forces in multi-contact. Instead, the resultant force can be distributed in infinitely many different ways on a given set of contacts. To illustrate this indeterminacy, consider a human participant standing still with both feet on the ground. Even in this elementary static biped stance, the participant can exert tangential forces that cancel each other out, e.g., by pushing their feet apart. While substantial work has been dedicated to the problem of force indeterminacy during gait, general contact configurations (e.g., involving hands) have been comparatively less studied in the literature (Section II).

Manuscript received November 21, 2016; revised May 12, 2017; accepted October 2, 2017.

T.-H. Pham, S. Caron and A. Kheddar are with the Interactive Digital Humans group of CNRS-University of Montpellier LIRMM, UMR5506, Montpellier, France. T.-H. Pham and A. Kheddar are also with the CNRS-AIST Joint Robotics Laboratory, UMI3218/RL, Tsukuba, Japan.

We address the force distribution problem in multi-contact by combining the benefits of machine learning techniques and physics-based optimization, capturing the variability in the way humans naturally regulate interaction forces while ensuring their physical compatibility with the observed motion.

• We formulate an optimization problem allowing the estimation of physically valid forces either from motion observations alone or from a reference signal (Section III).
• We construct a novel dataset on human whole-body kinodynamics containing 2.4 h of synchronized force and motion measurements under diverse configurations of tasks, participants and contacts (Section IV).
• We propose two neural network architectures allowing the prediction of contact force distributions from motion observations as well as their interactive correction by physics-based optimization (Section V).
• We validate our approach with ground-truth force measurements on various multi-contact scenarios and assess the respective contributions of physics-based optimization and neural networks (Section VI).

Finally, we discuss the limitations, applications and future extensions of our work (Section VII). Besides a significantly extended dataset, our current work enhances the earlier approach of [11] with: an improved formulation of the optimization problem accounting for motion measurement uncertainties, the consideration of individual contact normals in the learning features enabling more fine-grained predictions by neural network models, as well as algorithmic descriptions and extensive validation experiments that have not been presented before. To foster research on this new topic and encourage alternative implementations, we make the whole-body kinodynamics dataset and algorithms publicly available¹.

¹ https://github.com/jrl-umi3218/WholeBodyKinodynamics


II. RELATED WORK

Research on human-computer interaction has resulted in multiple techniques for whole-body motion capture from markerless visual observations [7], [8], magnetic trackers [12] or wearable inertial sensors [13], [14]. Force sensors were notably used in conjunction with inertial sensors and vision to improve motion reconstruction in [3], [4]. Instead of physical force sensors, numerical models were used to compute physically plausible distributions supporting visual observations in hand-object tracking [5], [6]. The problem of estimating the real forces applied on the environment was tackled in the case of deformable objects [15] and, conversely, by considering the human body elastic [16]. In the inspiring work of [17], ground reaction forces were computed with a spring-based contact model to estimate internal joint torques during locomotion. General contact configurations are commonly addressed in simulation and robotics using constrained optimization [18], which alone may not result in the forces humans instinctively apply, as illustrated in Section III-C.

Inverse optimization approaches in kinesiology research address the force distribution indeterminacy by modeling the objective function(s) supposedly optimized by the central nervous system [19]. However, such approaches are difficult due to the redundancy of the human body and the difficulty of observing physiological parameters without invasive surgery [20]. The variability of inverse dynamics solutions with different body segment inertial parameter (BSIP) models was notably discussed in [21], [22]. To address this issue, [23] introduced an optimization framework for the online estimation of robot and human BSIPs from motion and force-torque measurements. An alternative approach for BSIP reconstruction was proposed in [24], along with a data-driven approach to estimate contact forces from motion tracking between the feet and the ground.

Recent successes in the control of robot arms [25], [26] and general articulated characters [27] using neural networks illustrated their ability to account for complex model uncertainties. Neural networks were also used to resolve force indeterminacy during gait [28] and manipulation [29]. To account for temporal continuity, recurrent neural networks (RNN) [30] with long short-term memory (LSTM) [31] neurons were used in [32], [33], still for manipulation. Whole-body interactions were first addressed using an RNN in combination with a second-order cone program (SOCP) [34] for physics-based optimization in [11]. Our current study generalizes this idea to more complex multi-contact scenarios, supported by an extended dataset that is significantly more diverse in terms of contact configurations, tasks and participants.

III. WHOLE-BODY CONTACT FORCE OPTIMIZATION

A. Equations of Motion and Friction Constraints

We consider an articulated system of rigid bodies subject to $N_\tau$ internal joint torques $\tau = \left(\tau^{(i)}_1, \ldots, \tau^{(i)}_{N_\tau}\right)^T$ and $N_F$ external wrenches $F_k = (\tau_k, f_k)^T$, with $\tau_k$ and $f_k$ the respective external torque and force at contact $k$, expressed in the global frame. With the position and orientation of a chosen base link, the number of degrees of freedom is $N_{\mathrm{DoF}} = N_\tau + 6$. We denote by $q, \dot{q}, \ddot{q}$ the respective generalized coordinates, velocity and acceleration of the articulated system. The whole-body equations of motion can be expressed as:

$$H(q)\ddot{q} + C(q,\dot{q}) = \begin{bmatrix} 0_6 \\ \tau \end{bmatrix} + \sum_{k=1}^{N_F} J_k^T F_k, \quad (1)$$

with:
• $H(q)$ the $N_{\mathrm{DoF}} \times N_{\mathrm{DoF}}$ mass matrix,
• $C(q,\dot{q})$ the $N_{\mathrm{DoF}} \times 1$ bias vector of the Coriolis, centrifugal and gravity terms,
• $J_k$ the $N_{\mathrm{DoF}} \times 6$ Jacobian matrix of contact $k$,
• $0_6$ the $6 \times 1$ internal wrench directly applied at the root of the kinematic tree in case of linkage with the environment (zero for the case of the floating base).

We assume the parameters of the dynamic model to be known [23], [24]. For each contact $k$, we denote by $z_k$ the (uniquely defined) normal vector oriented from the environment to the body, and by $x_k$ and $y_k$ two orthogonal vectors in the tangential plane. We thus obtain a local decomposition for each external wrench $F_k$ in the contact frame $\mathcal{C}_k = (x_k, y_k, z_k)$:

$${}^{\mathcal{C}_k}F_k = \left(\tau^x_k, \tau^y_k, \tau^z_k, f^x_k, f^y_k, f^z_k\right)^T, \quad \text{with} \quad \begin{cases} \tau_k = \tau^x_k x_k + \tau^y_k y_k + \tau^z_k z_k \\ f_k = f^x_k x_k + f^y_k y_k + f^z_k z_k \end{cases} \quad (2)$$

Having chosen $z_k$ oriented towards the body, each normal force component is such that:

$$f^z_k \geq 0. \quad (3)$$

With $\mu_k$ the friction coefficient at contact $k$, the tangential force is constrained by the normal component as follows:

$$\left\|f^x_k x_k + f^y_k y_k\right\|_2 \leq \mu_k f^z_k. \quad (4)$$

Contact torque constraints are usually obtained by discretizing the contact surface into individual contact points subject to 3D forces only. Closed-form formulae were derived for rectangular support areas in [35]. We observed in our experiments that such constraints could be violated due to motion tracking uncertainties and therefore omitted them in this study.
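The unilateral and friction constraints of Eqs. (3)-(4) can be checked numerically for a candidate contact force; a minimal sketch (function name and tolerance are ours, not from the paper):

```python
import numpy as np

def friction_cone_ok(f_local, mu, tol=1e-9):
    """Check Eqs. (3)-(4) for a contact force expressed in the local
    contact frame (x_k, y_k, z_k), z_k being the contact normal.

    f_local : (f_x, f_y, f_z) components of the force
    mu      : friction coefficient at this contact
    """
    fx, fy, fz = f_local
    if fz < -tol:                                # Eq. (3): unilateral contact
        return False
    return np.hypot(fx, fy) <= mu * fz + tol     # Eq. (4): Coulomb friction

# A purely normal force always lies inside the cone; a mostly
# tangential one violates it for a moderate friction coefficient.
print(friction_cone_ok([0.0, 0.0, 100.0], mu=0.5))   # True
print(friction_cone_ok([80.0, 0.0, 100.0], mu=0.5))  # False
```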

B. Physics-Based Optimization

In this section, we discuss the extraction of physically plausible force distributions compatible with a given motion, characterized by generalized coordinates $q$, velocities $\dot{q}$ and accelerations $\ddot{q}$. Such force distributions can be obtained as solutions of a second-order cone program (SOCP) of the form:

$$\begin{aligned} \min\ & C(x) = \tfrac{1}{2} x^T P x + r^T x \\ \text{s.t.}\ & \left\|A_j x + b_j\right\|_2 \leq c_j^T x + d_j, \quad j = 1, \ldots, m \\ & E x \leq f \\ & G x = h, \end{aligned} \quad (5)$$

with $x$ a vector of $N_x = 6 + N_\tau + 6N_F$ force variables:

$$x = \left(F_{\mathrm{MEW}}, \tau, \left({}^{\mathcal{C}_k}F_k\right)_{k=1,N_F}\right)^T. \quad (6)$$


Here, $F_{\mathrm{MEW}}$ represents a measurement error wrench (MEW) applied to the floating base of the kinematic tree. This wrench is $0_6$ in the ideal case of perfect measurements and dynamic model. However, trying to enforce the strict constraint $F_{\mathrm{MEW}} = 0_6$ on noisy measurements and with an approximate dynamic model results in infeasible SOCP problems. To allow for uncertainties, we relax this constraint and instead make the solver enforce it at best (i.e., minimizing $\|F_{\mathrm{MEW}}\|$), as detailed thereafter.

Inequality constraints. In Eq. (5), the linear inequality matrices $E$, $f$ and the cone inequality matrices $A_j$, $b_j$, $c_j$, $d_j$ can directly be computed from Eqs. (3) and (4), respectively.

Equality constraints. We consider the whole-body equations of motion. Given an instance of $(q, \dot{q}, \ddot{q})$, the term $h$ in Eq. (5) corresponds directly to the left-hand side of Eq. (1):

$$h = H(q)\ddot{q} + C(q,\dot{q}). \quad (7)$$

$h$ is a vector of $N_{\mathrm{DoF}}$ elements. The matrix $G$ in Eq. (5) is here of size $N_{\mathrm{DoF}} \times N_x$ and can be decomposed using selection matrices $G_\tau$ and $(G_{F_k})_{k=1,N_F}$ such that:

$$G_\tau x = \begin{bmatrix} F_{\mathrm{MEW}} \\ \tau \end{bmatrix} \quad \text{and} \quad G_{F_k} x = F_k. \quad (8)$$

Note that each $G_{F_k}$ must incorporate the rotation matrix between the contact frame $\mathcal{C}_k$ and the world frame. We obtain:

$$G = G_\tau + \sum_{k=1}^{N_F} J_k^T G_{F_k}. \quad (9)$$

Cost function. Having incorporated the previous constraints in the SOCP, physically plausible force distributions can be computed by minimizing a chosen cost function depending only on the optimization variables, e.g., a weighted sum of the squared $L_2$ norms of the optimization variables:

$$C_{\alpha,\beta,\gamma}(x) = \alpha \left\|F_{\mathrm{MEW}}\right\|^2 + \beta \|\tau\|^2 + \gamma \sum_{k=1}^{N_F} \|F_k\|^2. \quad (10)$$

In practice, it is preferable to set $\alpha$ greater than $\beta$ and $\gamma$ so that $F_{\mathrm{MEW}}$ is only used when the observed motion is otherwise unfeasible. The two other parameters $\beta$ and $\gamma$ can be tuned to minimize either internal joint torques or applied contact wrenches. Alternatively, when target values $\tilde{F}_k$ for the contact wrenches are available (e.g., from force-torque sensors), it is possible to extract force distributions in their vicinity that are also guaranteed to be physically plausible, by minimizing the discrepancy to the optimized wrenches in the SOCP cost function [36]:

$$C^{\tilde{F}_k}_{\alpha,\beta,\gamma}(x) = \alpha \left\|F_{\mathrm{MEW}}\right\|^2 + \beta \|\tau\|^2 + \gamma \sum_{k=1}^{N_F} \left\|F_k - \tilde{F}_k\right\|^2. \quad (11)$$

Contact forces and internal joint torques occurring during gait are typically in the order of 100 N and 1 N·m, respectively [21]. In our experiments, we chose $\alpha = 10^2$, $\beta = 10^{-2}$ and $\gamma = 1$ so that $F_{\mathrm{MEW}}$ only compensates unfeasible raw motion measurements and internal joint torques can vary as needed to prioritize matching optimized and target contact wrenches.

[Fig. 1 plots: vertical (Z) contact forces over a 6 s trial — net force, left foot, right foot, left hand, right hand (in N) — comparing force sensors, SOCP min $L_2$, and SOCP min discrepancy.]

Fig. 1. Force sensor noise and uncertainties in the resultant force (top plot) can be corrected using physics-based optimization. In multi-contact, directly minimizing the norm of the individual forces (green) results in forces that are physically plausible but significantly differ from real measurements (red). By minimizing the discrepancy to the latter (blue), we reconstruct forces that are both physically plausible and in agreement with natural force distributions.

C. Motivating Example: Triple Contact Indeterminacy

We illustrate the crucial role played by the SOCP cost function. We consider a participant standing still next to a table and taking support on it using the right hand, then the left. We represent the vertical component of the measured forces in Fig. 1. In addition, we compute force distributions of minimal $L_2$ norm using Eq. (10) and minimizing the discrepancy to the sensor measurements using Eq. (11).

With the participant standing still, the equations of motion dictate that the net contact forces (top plot) should mostly oppose the participant's weight. However, individual force sensor uncertainties result in rather noisy force estimates. In contrast, all SOCP variants accurately reconstruct the net force directly from the measured kinematics. Using the cost function of Eq. (10) results in forces that are physically plausible but may greatly differ from actual measurements. Using the cost function of Eq. (11) enables the reconstruction of force distributions that are both physically plausible and in the vicinity of target forces when available. The aim of our work is to circumvent the need for force sensors. Thus, in the following, we train recurrent neural networks to predict such target force distributions directly from motion observations.
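The role of the cost function can be reproduced on a toy version of the problem. The sketch below is our own illustration (hypothetical 70 kg participant, two vertical contact forces only): dropping the friction-cone inequalities, which are inactive in a static stance, reduces the SOCP of Eq. (5) to an equality-constrained QP solvable through its KKT system. The Eq. (10)-style cost splits the weight evenly, while the Eq. (11)-style cost reproduces feasible targets exactly.

```python
import numpy as np

# Static two-contact stance: the net vertical force must equal the
# weight, but its split between the two feet is indeterminate.
m, g = 70.0, 9.81          # hypothetical participant mass [kg]
w = m * g                  # weight to support [N]

def solve_qp(targets):
    """min sum_i (f_i - t_i)^2  s.t.  f_1 + f_2 = w, via the KKT system."""
    t = np.asarray(targets, dtype=float)
    # KKT matrix [2I, G^T; G, 0] with equality constraint G = [1, 1]
    K = np.array([[2.0, 0.0, 1.0],
                  [0.0, 2.0, 1.0],
                  [1.0, 1.0, 0.0]])
    rhs = np.array([2 * t[0], 2 * t[1], w])
    return np.linalg.solve(K, rhs)[:2]          # drop the multiplier

print(solve_qp([0.0, 0.0]))      # Eq. (10)-style: even split, 343.35 N each
print(solve_qp([500.0, 186.7]))  # Eq. (11)-style: feasible targets reproduced
```

Targets of zero play the role of the minimal-norm cost; nonzero targets stand in for the sensor wrenches of Eq. (11).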

IV. WHOLE-BODY KINODYNAMICS DATASET

A. Experimental Setup

We depict our complete acquisition system in Fig. 2.

(a) Inertial motion capture system. (b) Captured motion and forces. (c) Shoes and gloves instrumented with force-torque sensors.

Fig. 2. Acquisition system for whole-body kinematics and contact forces.

Whole-body motion. We track the whole-body motion using the Xsens MVN Awinda inertial motion capture system [13], composed of 17 inertial measurement units (IMU) strapped at specified landmarks on the participant's body. The motion capture system is battery-powered and wireless, transmitting accelerometer, gyroscope and magnetometer measurements to the computer at 100 Hz. The motion of the human body, modeled as a 23-segment skeleton, is then readily provided in the form of the 6-DoF pose, velocity and acceleration of each segment. We enable the dynamics analysis of Section III by converting these quantities into generalized coordinates, velocities and accelerations $(q, \dot{q}, \ddot{q})$, with a kinematic tree composed of 23 segments linked by 22 spherical joints and rooted at the participant's pelvis. Measuring the participant's segment dimensions and weight, we compute the BSIPs using the anthropomorphic tables of [37]. Inertial motion capture systems by themselves do not provide absolute positioning and are prone to drift compared to marker-based tracking methods; we are working towards attenuating this problem. Still, the choice of this motion capture system was motivated by the strong occlusions that are inherent to whole-body interactions with the environment and hinder vision-based motion capture systems (e.g., Vicon). In contrast, the inertial motion capture system allows us to explore various interaction scenarios in uncontrolled and cluttered environments, e.g., when crouching under a table. In the future, the system could even be employed outdoors, or used in combination with a limited number of visual sensors to solve the issue of drift and absolute positioning.

Contact forces. We measure the contact forces exerted by the participant onto the environment at both the feet and the hands. Contact forces at the feet are monitored using instrumented shoes (Xsens ForceShoe). Each shoe is equipped with two force-torque sensors and two IMUs, providing contact forces measured individually at the heel and toes, and transmitted to the computer via Bluetooth at 50 Hz. We monitor contact forces exerted at the hands using two additional force-torque sensors (ATI Mini-45) attached to gloves worn by the participant during interaction experiments. The force-torque sensors are wired to dedicated acquisition cards on the computer and measurements are also recorded at 50 Hz. Both ForceShoe and ATI sensor signals are linearly interpolated to 100 Hz, matching the motion capture sampling rate. In comparison to the static force plates commonly used in gait analysis, wearable force sensors can be less accurate. Still, a major advantage of our lightweight setup is that it enables the efficient and continuous acquisition of contact forces over arbitrary contact configurations, highly dynamic motions, and relatively unrestricted movement areas. In contrast, using static force plates considerably reduces the range of possible tasks, contacts and motions.
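The 50 Hz to 100 Hz upsampling can be sketched with `numpy.interp`; the signal values below are synthetic, only the sampling rates come from the text:

```python
import numpy as np

# Force-torque channels sampled at 50 Hz are linearly interpolated
# onto the 100 Hz motion-capture clock.
t_force = np.arange(0.0, 1.0, 1 / 50)                 # 50 Hz timestamps [s]
f_z = 700.0 + 10.0 * np.sin(2 * np.pi * t_force)      # one force channel [N]

t_mocap = np.arange(0.0, 1.0, 1 / 100)                # 100 Hz target clock
f_z_100 = np.interp(t_mocap, t_force, f_z)            # linear interpolation

print(len(t_force), len(t_mocap), len(f_z_100))       # 50 100 100
```

At timestamps shared by both clocks the interpolated signal coincides with the original samples; in between, values are linear blends of the two neighbors.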

B. Newton-Euler Equations and Signal Synchronization

Each type of sensor used in this work (i.e., motion capture suit, force-sensing shoes, ATI Mini-45 sensors) is individually monitored using a dedicated acquisition program. Therefore, raw measurements need to be temporally synchronized with each other before further analysis. This step is performed using the Newton-Euler equations taken at the center of mass $G$ of the whole-body articulated system. For each body segment $s$ of the 23-element set $\mathcal{S}$, we denote by $m_s$ its mass and $G_s$ its center of mass. In the global frame, we denote by $v_s$ the linear velocity of $G_s$ and $R_s$ its orientation matrix. In the segment frame, we denote by $\omega_s$ and $I_s$ its local angular velocity and inertia tensor, respectively. With $m$ the total mass of the articulated system and $G$ its centroid, the linear momentum $\mathbf{P}$ and angular momentum $\mathbf{L}_G$ at $G$ are defined by:

$$\mathbf{P} = \sum_{s \in \mathcal{S}} m_s v_s, \qquad \mathbf{L}_G = \sum_{s \in \mathcal{S}} m_s \overrightarrow{GG_s} \times v_s + R_s I_s \omega_s. \quad (12)$$

With $\dot{\mathbf{L}}_G$ and $\dot{\mathbf{P}}$ the time derivatives of the angular and linear momenta, respectively, $g$ the gravity vector and ${}^G F_k$ the contact wrench at contact $k$ transformed to $G$, the Newton-Euler equations for centroidal dynamics state that:

$$\begin{bmatrix} \dot{\mathbf{L}}_G \\ \dot{\mathbf{P}} \end{bmatrix} = \begin{bmatrix} 0 \\ m g \end{bmatrix} + \sum_{k=1}^{N_F} {}^G F_k. \quad (13)$$

We gather gravity, linear and angular momenta as a centroidal wrench $w_G$ due to contact forces, taken at $G$ [38]:

$$w_G = \begin{bmatrix} \dot{\mathbf{L}}_G \\ \dot{\mathbf{P}} - m g \end{bmatrix}. \quad (14)$$

With $P_k$ the location of contact $k$, Eq. (13) becomes:

$$w_G = \sum_{k=1}^{N_F} \begin{bmatrix} \tau_k + \overrightarrow{GP_k} \times f_k \\ f_k \end{bmatrix}. \quad (15)$$

$w_G$ is a purely kinematic term that can be directly computed from the whole-body pose and its derivatives using Eq. (14), but also from the contact forces using Eq. (15). Thus, synchronizing motion capture and force measurements amounts to synchronizing $w_G$ estimates from kinematics and forces.
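Equations (14) and (15) thus give two independent estimates of the same wrench, which is what makes synchronization possible. A sketch with made-up static-stance numbers (mass, geometry and force values are all illustrative):

```python
import numpy as np

m = 70.0                                   # hypothetical total mass [kg]
g = np.array([0.0, 0.0, -9.81])            # gravity vector

def wrench_from_kinematics(Ld_G, Pd):
    """Eq. (14): w_G = (dL_G/dt ; dP/dt - m*g), stacked as 6 values."""
    return np.hstack([Ld_G, Pd - m * g])

def wrench_from_forces(contacts, G):
    """Eq. (15): sum over contacts of (tau_k + GP_k x f_k ; f_k)."""
    w = np.zeros(6)
    for P_k, tau_k, f_k in contacts:
        w += np.hstack([tau_k + np.cross(P_k - G, f_k), f_k])
    return w

# Static stance: momenta are constant, so w_G = (0 ; -m*g), and the
# weight is split between two foot contacts symmetric about G.
G = np.array([0.0, 0.0, 0.9])
feet = [(np.array([0.0, +0.1, 0.0]), np.zeros(3), np.array([0.0, 0.0, m * 9.81 / 2])),
        (np.array([0.0, -0.1, 0.0]), np.zeros(3), np.array([0.0, 0.0, m * 9.81 / 2]))]
print(np.allclose(wrench_from_kinematics(np.zeros(3), np.zeros(3)),
                  wrench_from_forces(feet, G)))   # True
```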

For this purpose, we start each experiment by having the participant walk a few seconds, then take support on a table with the left and right hand, alternately. To synchronize kinematic and ForceShoe signals, we plot the components of their respective estimates $w_G^{\mathrm{kin}}$ and $w_G^{\mathrm{shoe}}$ during the walking phase and select by hand a constant time shift that best matches the two signals. We then compute the residual wrench $w_G^{\mathrm{res}} = w_G^{\mathrm{kin}} - w_G^{\mathrm{shoe}}$. When the participant leans on a table with one hand, $w_G^{\mathrm{res}}$ should be equal to the wrench $w_G^{\mathrm{hand}}$ measured by the corresponding force-torque sensor. Again, we find a constant time shift that best matches $w_G^{\mathrm{hand}}$ and $w_G^{\mathrm{res}}$, thus synchronizing the hand sensors with the kinematic-ForceShoe signals.
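The text selects these constant time shifts by hand; the same matching can be sketched automatically with a discrete cross-correlation. The signals and lag below are synthetic and purely illustrative:

```python
import numpy as np

fs = 100.0                                  # common sampling rate [Hz]
t = np.arange(0.0, 10.0, 1 / fs)

# A transient event (e.g., the onset of a hand support) seen by two
# sensors, the second one delayed by an unknown constant shift.
w_kin = np.exp(-((t - 3.0) / 0.2) ** 2)
w_shoe = np.roll(w_kin, 23)                 # true lag: 23 samples = 0.23 s

# Full cross-correlation; the argmax gives the lag of w_shoe w.r.t. w_kin.
corr = np.correlate(w_shoe, w_kin, mode="full")
lag = int(np.argmax(corr)) - (t.size - 1)
print(lag / fs)                             # 0.23
```

With real, noisy wrench components, each of the six channels can be correlated separately and the shifts averaged, mirroring the by-hand matching described above.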

Following the temporal synchronization, we perform the following signal processing. All measurements are subject to noise, e.g., from the sensors themselves or due to interference in the transmission (both wired and wireless). We attenuate it by smoothing all signals with a Gaussian filter of kernel $\sigma = 0.05$ s. In addition, a slowly varying bias can appear in the force-torque measurements with repeated stress and battery drain. We estimate this bias through time by averaging the signals that persist when the sensors are not in contact with the environment, which should only be caused by the inertia of the moving parts attached to the sensing surface (e.g., the force shoe's external sole). Since the inertial motion capture system does not provide absolute positioning, we could not reliably identify the occurrence of contacts with the environment based solely on the whole-body motion observations. Therefore, we identified them by direct thresholding on the force sensor measurements. Still, this material limitation does not affect the generality of our approach and can be fully circumvented with additional visual observations (see also [39] for the retrieval of contact points without environment knowledge). Finally, we correct the remaining force sensor uncertainties by combining their measurements with the motion capture data using the SOCP approach illustrated in Section III-C. In the following, we call ground truth the SOCP-corrected sensor measurements (relative to the dynamic model).
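The smoothing and contact-detection steps can be sketched as follows, using a truncated Gaussian kernel with σ = 0.05 s at 100 Hz; the 50 N threshold and the signal values are our own illustration, not the paper's:

```python
import numpy as np

fs, sigma_s = 100.0, 0.05
sigma = sigma_s * fs                        # kernel std in samples (5)
half = int(4 * sigma)                       # truncate at 4 sigma
k = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma) ** 2)
k /= k.sum()                                # normalized Gaussian kernel

# Synthetic normal-force channel: 2.5 s of contact, then lift-off.
rng = np.random.default_rng(1)
f_z = np.where(np.arange(500) < 250, 400.0, 0.0)
f_z_noisy = f_z + 5.0 * rng.standard_normal(f_z.size)
f_z_smooth = np.convolve(f_z_noisy, k, mode="same")

# Contact detection by direct thresholding on the smoothed force.
in_contact = f_z_smooth > 50.0              # hypothetical 50 N threshold
print(in_contact[100], in_contact[400])     # True False
```

The bias estimation described above would subtract, from each channel, the average residual observed over the `in_contact == False` samples.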

C. Experiments

The importance of collecting ground-truth measurements on how humans naturally distribute contact forces, not only during locomotion but across a variety of multi-contact configurations, was established in [11]. Thus, in this work, we purposefully explore a wide range of motion dynamics as well as diverse contact configurations that exhibit strong force distribution indeterminacy. The following tasks were chosen from daily activities to cover a spectrum of three features: number of contacts involved, orientation of hand contacts (when applicable) and effort required to perform the motion:

• Walking, i.e., with always at least one foot on the ground (1 contact, low effort). Straight and curved paths were considered separately;
• Running, i.e., with at most one foot on the ground (1 contact, medium effort);
• Hopping on one foot, e.g., forward or in place (1 contact, high effort);
• Balancing the upper body while keeping both feet static, e.g., leg stretching or performing arm motions (2 contacts, low effort);
• Jumping using both feet, e.g., forward or to the side (2 contacts, high effort);
• Taking support on a table with one hand, e.g., to reach for an object further on the table (3 contacts, horizontal hand contact, low effort);
• Crouching and standing while taking support with one hand on a table, e.g., to reach for an object under the table (3 contacts, horizontal hand contact, high effort);
• Leaning against a wall with one hand (3 contacts, vertical hand contact, low effort);
• Leaning on a wall with one hand and reaching forward, e.g., to look around a corner or grab an object (3 contacts, vertical hand contact, high effort);
• Taking support on a table with both hands (4 contacts, horizontal hand contacts, low to high effort);
• Leaning on a wall with both hands, e.g., to stretch or push a heavy object (4 contacts, vertical hand contacts, low to high effort).

Contact is a complementarity condition involving the dual geometric and force spaces. The first two features we used to categorize our tasks (namely, number and orientations of contacts) ensure coverage of the geometric part of the condition, while the last one (perceived effort) aims for coverage of the force space. Considering the two variants (straight and curved) of the walking experiments separately, we thus construct a repertoire of twelve motion types, six of them involving contacts between the feet and the ground only and the six others involving both feet and hands. We illustrate this dataset in Table I.

Six volunteers, three males and three females, took part in our study. Their weights (between 45.0 kg and 86.0 kg, plus the 5.0 kg acquisition system), heights (between 1.57 m and 1.92 m), and individual body segment lengths were measured to initialize the motion capture skeletal tracking model and BSIPs following the procedure described in Section IV-A.

Before each experiment, all sensors (i.e., inertial motion capture system, force-sensing shoes and glove-mounted force sensors) were calibrated and reset following the manufacturers' recommended acquisition procedure to reduce the effects of measurement drift and hysteresis. We divided the 12 motion types of Table I into two sequences of 6 motions. Each sequence consisted of 3 tasks involving the hands and 3 tasks involving only the feet, executed in alternation for one minute each. Participants were given time between consecutive tasks to put on or take off the instrumented gloves, so that locomotion tasks were not constrained by unnecessary force sensor wires. In total, each task was executed twice by each participant. For one participant, we observed force measurement errors of abnormal magnitude on the right-hand sensor and discarded the corresponding recordings from the dataset. For another participant, the motion capture system disconnected during a hopping task. Overall, our new dataset on human whole-body kinodynamics in multi-contact totals 2.4 h of synchronized motion and force measurements, classified into 12 task primitives.

V. CAPTURING HUMAN FORCE DISTRIBUTION PATTERNS

A. Learning Features

TABLE I
INTERACTION CONFIGURATIONS FROM THE WHOLE-BODY KINODYNAMICS DATASET.

         | Feet only                               | Feet and hands
Motion   | Hop   Run   Jump  Walk (×2)  Balancing  | Take support  Crouch   Lean     Lean and reach  Take support  Lean
Contacts | 0/1F  0/1F  0/2F  1/2F       2F         | 1/2F-1H       1/2F-1H  1/2F-1H  1/2F-1H         1/2F-2H       1/2F-2H

(The "Human action" and "Motion-force data" rows of the printed table show illustrative photographs and force plots for each task.)

Let $\mathcal{K}$ denote a set of (input) whole-body kinematic features, and $\mathcal{D}$ a set of (output) contact force features. The desired contact force estimation mapping $\mathcal{F}$ is of the form:

$$\mathcal{D} = \mathcal{F}(\mathcal{K}). \quad (16)$$

We model this mapping $\mathcal{F}$ using a neural network trained on our whole-body kinodynamics dataset. The dynamic features $\mathcal{D}$ simply correspond to the set of contact wrenches $F_k$ we seek to estimate. A straightforward approach to construct the set of kinematic features $\mathcal{K}$ could be to take all the remaining parameters appearing in the whole-body equations of motion of Eq. (1), e.g., the mass $H$ and bias $C$ matrices, joint accelerations $\ddot{q}$, and Jacobian matrices $J_k$ representing the contact configuration. However, doing so would result in a particularly large number of parameters that can make the neural network training process difficult. We instead propose to construct a selection of high-level kinematic features based on the Newton-Euler equations of Eq. (13), which extract the gist of locomotory dynamics. In particular, from the formulation of Eq. (15), we take as first input features the centroidal wrench $w_G$, which can be computed from kinematics only with Eq. (14), and the contact positions relative to the center of mass, $\overrightarrow{GP_k}$. Since these quantities are expressed in the world frame, we account for translational and rotational invariances by transforming them to a reference frame $\mathcal{G}$ of origin $G$ and fixed with respect to a chosen body segment (e.g., the pelvis). Walking straight to the North is thus locally equivalent to walking straight to the East. To facilitate the modeling of the mapping of Eq. (16) with a neural network, we construct $\mathcal{K}$ as a fixed-size input vector. We continuously monitor $N_c$ potential contacting body segments over time and encode their activity with parameters $\delta_{k,i}$ such that:

$$\delta_{k,i} = \begin{cases} 1 & \text{if contact } k \text{ is active at time step } i, \\ 0 & \text{otherwise.} \end{cases} \quad (17)$$

In our experiments, we considered the forces applied at the heels and toes separately in both the SOCP and the neural network model, so that N_c = 6 including the hand palms. Finally, in addition to the contact locations, we consider their orientation through the contact normals z_k. Denoting by ^G w_G,

Fig. 3. Direct and feedback whole-body network architectures. (a) WBN_D: forces are directly computed from the kinematics and contact configuration, mapping each K_i to raw features D_i^(raw). (b) WBN_F: force predictions are corrected between consecutive time steps, with the SOCP-corrected D_{i-1} fed back alongside K_i.

^G P_k, ^G z_k the respective coordinates of w_G, P_k, z_k in the frame G, the complete input features at time step i are:

    K_i = [^G w_{G,i}, (^G P_{k,i}, δ_{k,i}, ^G z_{k,i})_{k=1,...,N_c}]^T.    (18)

Similarly, the output features are the target wrenches in G:

    D_i = [^G F_{k,i}]_{k=1,...,N_c}^T.    (19)
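As a concrete illustration, the assembly of the fixed-size input vector K_i of Eq. (18) can be sketched as follows. This is a minimal NumPy sketch with simplified shapes and a rotation-only frame transform; the function and variable names are ours, not the paper's implementation:

```python
import numpy as np

def build_input_features(w_G, contact_positions, contact_normals, active, R_ref):
    """Assemble the fixed-size kinematic feature vector K_i.

    w_G: (6,) centroidal wrench in the world frame.
    contact_positions: (Nc, 3) contact points P_k relative to the CoM G.
    contact_normals: (Nc, 3) contact normals z_k in the world frame.
    active: (Nc,) activity flags delta_k (1.0 if contact k is active, else 0.0).
    R_ref: (3, 3) rotation of the reference frame G (e.g., pelvis-fixed).
    """
    # Express force and moment parts of the wrench in the reference frame,
    # making the features invariant to the world heading.
    w_ref = np.concatenate([R_ref.T @ w_G[:3], R_ref.T @ w_G[3:]])
    feats = [w_ref]
    for k in range(len(active)):
        p_k = R_ref.T @ contact_positions[k]   # ^G P_k
        z_k = R_ref.T @ contact_normals[k]     # ^G z_k
        # Zero-out inactive contacts so the vector keeps a fixed size.
        feats.append(active[k] * np.concatenate([p_k, [active[k]], z_k]))
    return np.concatenate(feats)
```

Zeroing inactive slots rather than dropping them is one way to keep K_i fixed-size, which is what allows a single network input layer to handle varying contact configurations.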

B. Neural Network Architecture

We model the evolution of motion and force distributions as time series using RNNs with LSTM neurons in order to account for temporal continuity between consecutive samples. In this section, we propose two neural network architectures to be used in conjunction with physics-based optimization. The first architecture, WBN_D (whole-body network, direct), directly maps the observed motion to the underlying forces:

    D_i = WBN_D(K_i).    (20)

Once trained, this network is used as follows. At each step, a new vector of kinematic features K_i is fed to WBN_D, yielding raw output features D_i^(raw). Since the RNN model does not enforce the equations of motion, the corresponding forces F_{k,i}^(raw) may not be readily compatible with the observed motion. We compute physically plausible forces F_{k,i} in their vicinity using the SOCP of Eq. (5) with the cost function of Eq. (11).
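To give intuition for this correction step, the sketch below replaces the actual cone-constrained SOCP of Eq. (5) with a much simpler stand-in: a closed-form least-squares projection of the raw per-contact forces onto the single linear constraint that they sum to the net force required by the motion. Friction cones and moment equality are deliberately omitted, so this is illustrative only, not the paper's method:

```python
import numpy as np

def project_to_net_force(F_raw, net_force):
    """Least-squares correction of raw per-contact forces.

    F_raw: (Nc, 3) raw forces predicted by the network.
    net_force: (3,) net contact force required by the observed motion.

    Minimizing sum_k ||F_k - F_k^raw||^2 subject to sum_k F_k = net_force
    has a closed-form solution: spread the residual equally over contacts.
    """
    residual = net_force - F_raw.sum(axis=0)
    return F_raw + residual / len(F_raw)
```

The full formulation additionally constrains each corrected force to its friction cone and matches the complete centroidal wrench (moments included), which is what requires a second-order cone program rather than this closed form.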


TABLE II
FORCE ESTIMATION ERRORS ON FULL TESTING SET (23 min)

                 Raw      SOCP correction
Force sensors    1.6%     (ground truth)
SOCP min. L2     N/A      7.0%
WBN_D            8.3%     6.4%
WBN_F            6.6%     5.8%

Fig. 4. SOCP correction on single contact motion (hop on the right foot). The panels plot the net force and the left- and right-foot forces (in N) along the tangent (X, Y) and normal (Z) axes over a 2 s window, comparing ground truth, WBN_D (raw) and WBN_D + SOCP.

Alternatively, we enable the interactive correction of RNN predictions by constructing a network WBN_F (whole-body network, feedback) that takes as inputs both the current kinematics and the distribution at the previous time step:

    D_i = WBN_F(K_i, D_{i-1}).    (21)

When using WBN_F for prediction, we initialize D_0 to the distribution of minimal L2 norm following Eq. (10). At each time step, K_i and D_{i-1} are fed together to WBN_F, yielding raw predictions D_i^(raw). By SOCP correction, we reconstruct physically accurate forces F_{k,i} and extract the corresponding dynamic features D_i, used for prediction at the next time step.

We depict the two proposed architectures in Fig. 3. Note that for WBN_D, raw predictions D_i^(raw) may be corrected independently from each other, enabling opportunities for parallel computing if desired. In contrast, the intertwined RNN-SOCP approach of WBN_F imposes a sequential prediction process.
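The sequential WBN_F prediction-correction process can be sketched as the following loop, where `network` and `socp_correct` are placeholders for the trained RNN and the optimization step (stubbed here for illustration):

```python
import numpy as np

def run_feedback_prediction(kinematic_features, network, socp_correct, D0):
    """Sequential WBN_F-style prediction: each corrected distribution
    D_i is fed back as an input at step i+1.

    kinematic_features: iterable of K_i vectors.
    network: callable (K_i, D_prev) -> raw dynamic features D_i^(raw).
    socp_correct: callable (K_i, D_raw) -> physically consistent D_i.
    D0: initial distribution (e.g., the minimal-L2 solution of Eq. (10)).
    """
    D_prev, outputs = D0, []
    for K_i in kinematic_features:
        D_raw = network(K_i, D_prev)     # raw RNN prediction
        D_i = socp_correct(K_i, D_raw)   # enforce the equations of motion
        outputs.append(D_i)
        D_prev = D_i                     # feedback to the next time step
    return outputs
```

The feedback edge (`D_prev = D_i`) is precisely what rules out parallel correction: unlike WBN_D, each step's input depends on the previous step's corrected output.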

VI. RESULTS

A. Prediction-Correction Framework

We implement the two proposed neural network architectures within the Torch7 framework [40] as two LSTM hidden layers of size 256 followed by a linear output layer of size 6N_c, the number of output features. We partition the whole-body kinodynamics dataset into three subsets of respective sizes 70%, 15% and 15% for training, validation and testing. We train the neural networks by minimizing a mean square error regression criterion using mini-batch stochastic gradient descent and dropout to avoid overfitting [41]. We estimated from the dataset that participants maintained each contact on average for 2.07 s. We thus set the length of the training batches to 2.0 s. The SOCP correction is implemented separately using the CVXOPT library for convex optimization [42]. We run the prediction process for each task of the testing set and compute the root mean square errors (RMSE) between reconstructed forces and ground truth distributions. We normalize the RMSEs with the range of the normal forces measured in the testing set, f^z_max = 1378 N. For the sake of completeness, we also quantify the force sensor measurement uncertainties, the estimation errors for distributions computed by straightforward minimization of their L2 norm, and the prediction errors for the neural networks alone, without SOCP correction. We report the resulting normalized RMSE (NRMSE) in Table II.
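The NRMSE metric above is a plain normalization of the RMSE by the measured force range; a minimal sketch (names are ours):

```python
import numpy as np

def normalized_rmse(estimated, ground_truth, f_max):
    """Root mean square error between estimated and ground-truth forces,
    normalized by the range of normal forces in the testing set (f_max)."""
    rmse = np.sqrt(np.mean((np.asarray(estimated) - np.asarray(ground_truth)) ** 2))
    return rmse / f_max
```

For instance, a constant 137.8 N error normalized by f^z_max = 1378 N gives an NRMSE of 10%.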

Expectedly, the lowest estimation errors are attained using physical force sensors, which directly measure the applied contact wrenches. Still, this level of accuracy was obtained using costly, cumbersome force sensors. Table II yields three major outcomes. First, we confirm the previous observation that physics-based optimization alone does not suffice to address the issue of force indeterminacy in multi-contact, since the L2-minimizing cost function provides the worst results of the second column. Second, even without SOCP correction (first column), the accuracy of WBN_F exceeds that of WBN_D, and even that of the L2-minimizing SOCP alone. Thus, RNNs can successfully capture interaction force patterns even without enforcing the equations of motion during training. In particular, the better performance of WBN_F compared to WBN_D shows that providing the RNN with past forces as inputs helps handle force indeterminacy, i.e., associating a given motion (unique K_i) to multiple possible distributions (different D_i). Third, combining RNN and SOCP yields the best results overall, improving the accuracy of WBN_D and WBN_F by 23% and 11% respectively, and that of the SOCP alone by 17%.

B. Accuracy in Multi-Contact Indeterminacy

The effectiveness of the SOCP at correcting inaccurate force predictions is particularly visible for the hopping sequence depicted in Fig. 4. Indeed, for this motion, the presence of only one foot on the ground at each instant makes it straightforward for the SOCP to enforce that the force exerted at the only contact is exactly causing the acceleration of the centroid. We further investigate the respective contributions of RNN and SOCP by separating experiments with only feet from those with both feet and hands. Since the former involve relatively large impulses (e.g., during jumping), we normalize the estimation errors of each category by the range of their respective measurements. We report the resulting NRMSEs in Table III. For both categories, combining RNN and SOCP yields significant improvements compared to either in isolation. Importantly, the NRMSEs of all three estimation methods are larger when also considering hand contacts, which illustrates the increased multi-contact indeterminacy.

Finally, we further decompose the tasks involving feet and hands and assess the estimation accuracy by body segment in Table IV. For all configurations, again, the SOCP greatly improves the accuracy of both neural network architectures. However, while for the feet the three estimation methods yield comparable NRMSEs, the estimation errors of the SOCP alone on the hands (rightmost column) are now significantly larger than those of the WBN variants. This result shows that recurrent neural networks are well suited to tackle the issue of force indeterminacy in multi-contact, for which physics-based optimization can serve as a valuable complement.


TABLE III
ESTIMATION ERRORS BY CONTACT CONFIGURATION

                 Feet only (13 min)        Feet + hands (10 min)
                 f^z_max = 1378 N          f^z_max = 750 N
                 Raw      SOCP             Raw      SOCP
Force sensors    2.1%     (ground truth)   1.9%     (ground truth)
SOCP min. L2     N/A      8.7%             N/A      9.6%
WBN_D            9.9%     7.6%             12.2%    9.4%
WBN_F            7.5%     6.6%             10.3%    9.3%

TABLE IV
ESTIMATION ERRORS BY SEGMENT ON FEET + HAND TASKS (10 min)

                 Feet: f^z_max = 750 N     Hands: f^z_max = 177 N
                 Raw      SOCP             Raw      SOCP
Force sensors    2.0%     (ground truth)   5.7%     (ground truth)
SOCP min. L2     N/A      10.8%            N/A      21.9%
WBN_D            14.2%    11.0%            14.4%    10.5%
WBN_F            12.0%    10.8%            13.2%    12.9%

We depict sample force reconstruction results for two-, three- and four-contact motions in Fig. 5. In all cases, we confirm that the net force is reconstructed accurately by all methods, as expected. In the two-contact balancing scenario, we see that the WBN_D network fails to capture weight shifts between feet (Z component, rightmost column) and tends to predict uniform distributions, while the WBN_F network tracks them suitably thanks to its ability to capture time-dependent variations. With more contacts, time dependency over the pressure distribution becomes less significant and both networks perform reasonably.

VII. DISCUSSION AND FUTURE WORK

Our work establishes that the estimation of interaction forces, a problem that pertains to the human sense of touch, can be tackled through the lens of motion capture. The dual optimization and learning framework we propose extends the state of the art in capturing human force distribution patterns beyond gait analysis, to the general multi-contact configurations used to interact with the environment. This important result makes it possible to completely circumvent costly, cumbersome and intrusive transducing technologies with any whole-body tracking system. Indeed, while we collected our (public) dataset using inertial motion capture, the RNN architectures we propose only rely on centroidal dynamics, making them agnostic with respect to the actual motion capture system employed. Meanwhile, an SOCP can be formulated for any whole-body kinematic model. As such, our framework is readily compatible with existing markerless visual tracking techniques, thus enabling novel interfaces for in-home, unobtrusive force monitoring for personal robotics or rehabilitation.

In its current implementation, our work has some limitations. First, we consider the academic point-contact model, whereas in practice contacts occur between surfaces, yielding additional complementarity conditions [35] that are, as we observed, difficult to take into account under motion-tracking

Fig. 5. Force profiles in various contact configurations, comparing ground truth, WBN_D + SOCP and WBN_F + SOCP along the tangent (X, Y) and normal (Z) axes. (a) Two contacts: upper-body balancing with static feet. (b) Three contacts: taking support on a table with one hand (alternating). (c) Four contacts: leaning against a wall with two hands at the same time. Net forces are measured in the world frame, while contact forces are reported in their respective (local) contact frames.

uncertainties. Contacts also involve a certain amount of deformation that we did not model. Assessing the contact force with a portable force sensor also affects the natural motion behavior. Instead, one could distribute force sensing devices in the experimented environment, but at the cost of many more sensing units. Considering all body limbs for contact would presently be difficult, as wearable force sensing suits do not exist in the current state of the technology. We therefore chose to focus on foot and hand contacts, at the expense of other


kinds of interaction, such as shoulder or waist contacts (e.g., for seated motions).

To deal with these limitations and consider other features, our work can (and should) be extended to arbitrary contact configurations and motions. While a short-term solution could be to collect additional force and motion measurements (e.g., with force sensors at the knees and elbows), we anticipate that the increased level of instrumentation would strongly interfere with natural interaction behaviors, or even render some impossible (e.g., performing a cartwheel). Instead, our future work involves considering the distribution of contact forces as an inverse optimal control problem, i.e., finding optimization criteria privileging the forces measured in reality. Note that in the meantime, we ensured that the forces estimated by our framework are always at least physically plausible (if not resembling human forces) by making the SOCP formulation independent of the acquired dataset. In the long term, we also plan to apply our framework to force-based robot learning from demonstration, on-line multi-contact motion retargeting, and knowledge-based multi-contact planning and control [43].

ACKNOWLEDGMENTS

All participants gave informed consent prior to participation and the study was approved by the Ethics and Safety Committee of the University of Montpellier, France. This work was partially supported by the H2020 RIA COMANOID project (www.comanoid.eu), by JSPS Grant-in-Aid for Scientific Research (B) Number 16H02886 ("Cutting-edge multi-contact behaviors") and by the Bpifrance project ROMEO 2 (www.projetromeo.com).

REFERENCES

[1] L. Rozo, P. Jiménez, and T. Carme, "A robot learning from demonstration framework to perform force-based manipulation tasks," Intelligent Service Robotics, vol. 6, no. 1, pp. 33–51, 2013.
[2] J. Englsberger, P. Kozlowski, and C. Ott, "Biologically inspired deadbeat controller for bipedal running in 3D," in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015, pp. 989–996.
[3] S. Ha, Y. Bai, and C. K. Liu, "Human motion reconstruction from force sensors," in ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2011, pp. 129–138.
[4] P. Zhang, K. Siu, J. Zhang, C. K. Liu, and J. Chai, "Leveraging depth cameras and wearable pressure sensors for full-body kinematics and dynamics capture," ACM Trans. on Graphics, vol. 33, no. 6, p. 221, 2014.
[5] N. Kyriazis and A. A. Argyros, "Physically plausible 3D scene tracking: The single actor hypothesis," in IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 9–16.
[6] Y. Wang, J. Min, J. Zhang, Y. Liu, F. Xu, Q. Dai, and J. Chai, "Video-based hand manipulation capture through composite motion control," ACM Trans. on Graphics, vol. 32, no. 4, p. 43, 2013.
[7] C. Tran and M. M. Trivedi, "3-D posture and gesture recognition for interactivity in smart spaces," IEEE Trans. on Industrial Informatics, vol. 8, no. 1, pp. 178–187, 2012.
[8] D. Michel, K. Panagiotakis, and A. A. Argyros, "Tracking the articulated motion of the human body with two RGBD cameras," Machine Vision and Applications, vol. 26, no. 1, pp. 41–54, 2015.
[9] A. González, M. Hayashibe, and P. Fraisse, "Subject-specific center of mass estimation for in-home rehabilitation: Kinect-Wii board vs. Vicon-force plate," in Converging Clinical and Engineering Research on Neurorehabilitation. Springer, 2013, pp. 705–709.
[10] G. Yang, L. Xie, M. Mäntysalo, X. Zhou, Z. Pang, L. Da Xu, S. Kao-Walter, Q. Chen, and L.-R. Zheng, "A health-IoT platform based on the integration of intelligent packaging, unobtrusive bio-sensor, and intelligent medicine box," IEEE Trans. on Industrial Informatics, vol. 10, no. 4, pp. 2180–2191, 2014.
[11] T.-H. Pham, A. Bufort, S. Caron, and A. Kheddar, "Whole-body contact force sensing from motion capture," in IEEE/SICE International Symposium on System Integration, 2016, pp. 58–63.
[12] S. J. Lee, Y. Motai, and H. Choi, "Tracking human motion with multichannel interacting multiple model," IEEE Trans. on Industrial Informatics, vol. 9, no. 3, pp. 1751–1763, 2013.
[13] D. Roetenberg, H. Luinge, and P. Slycke, "Xsens MVN: full 6DOF human motion tracking using miniature inertial sensors," Xsens Motion Technologies BV, Tech. Rep., 2009.
[14] H. Ghasemzadeh and R. Jafari, "Physical movement monitoring using body sensor networks: A phonological approach to construct spatial decision trees," IEEE Trans. on Industrial Informatics, vol. 7, no. 1, pp. 66–77, 2011.
[15] M. Mohammadi, T. L. Baldi, S. Scheggi, and D. Prattichizzo, "Fingertip force estimation via inertial and magnetic sensors in deformable object manipulation," in IEEE Haptics Symposium, 2016, pp. 284–289.
[16] Y. Zhu, C. Jiang, Y. Zhao, D. Terzopoulos, and S.-C. Zhu, "Inferring forces and learning human utilities from videos," in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3823–3833.
[17] M. A. Brubaker, L. Sigal, and D. J. Fleet, "Estimating contact dynamics," in IEEE International Conference on Computer Vision, 2009, pp. 2389–2396.
[18] S. Nakaoka, S. Hattori, F. Kanehiro, S. Kajita, and H. Hirukawa, "Constraint-based dynamics simulator for humanoid robots with shock absorbing mechanisms," in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2007, pp. 3641–3647.
[19] V. M. Zatsiorsky, Kinetics of Human Motion. Human Kinetics, 2002.
[20] B. I. Prilutsky and V. M. Zatsiorsky, "Optimization-based models of muscle coordination," Exercise and Sport Sciences Reviews, vol. 30, no. 1, p. 32, 2002.
[21] G. Rao, D. Amarantini, E. Berton, and D. Favier, "Influence of body segments' parameters estimation models on inverse dynamics solutions during gait," Journal of Biomechanics, vol. 39, no. 8, pp. 1531–1536, 2006.
[22] A. Muller, C. Germain, C. Pontonnier, and G. Dumont, "A comparative study of 3 body segment inertial parameters scaling rules," Computer Methods in Biomechanics and Biomedical Engineering, vol. 18, no. sup1, pp. 2010–2011, 2015.
[23] J. Jovic, A. Escande, K. Ayusawa, E. Yoshida, A. Kheddar, and G. Venture, "Humanoid and human inertia parameter identification using hierarchical optimization," IEEE Trans. on Robotics, vol. 32, no. 3, pp. 726–735, 2016.
[24] X. Lv, J. Chai, and S. Xia, "Data-driven inverse dynamics for human motion," ACM Trans. on Graphics (SIGGRAPH Asia), vol. 35, no. 6, pp. 163:1–163:12, November 2016.
[25] C. Yang, Y. Jiang, Z. Li, W. He, and C.-Y. Su, "Neural control of bimanual robots with guaranteed global stability and motion precision," IEEE Trans. on Industrial Informatics, 2016.
[26] S. Levine, P. Pastor, A. Krizhevsky, J. Ibarz, and D. Quillen, "Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection," International Journal of Robotics Research, 2016.
[27] I. Mordatch, K. Lowrey, G. Andrew, Z. Popovic, and E. V. Todorov, "Interactive control of diverse complex characters with neural networks," in Advances in Neural Information Processing Systems, 2015, pp. 3132–3140.
[28] S. E. Oh, A. Choi, and J. H. Mun, "Prediction of ground reaction forces during gait based on kinematics and a neural network model," Journal of Biomechanics, vol. 46, no. 14, pp. 2372–2380, 2013.
[29] T.-H. Pham, A. Kheddar, A. Qammaz, and A. A. Argyros, "Towards force sensing from vision: Observing hand-object interactions to infer manipulation forces," in IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 2810–2819.
[30] J. L. Elman, "Finding structure in time," Cognitive Science, vol. 14, no. 2, pp. 179–211, 1990.
[31] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[32] T.-H. Pham, N. Kyriazis, A. A. Argyros, and A. Kheddar, "Hand-object contact force estimation from markerless visual tracking," IEEE Trans. on Pattern Analysis and Machine Intelligence, 2017, to appear.
[33] C. Fermüller, F. Wang, Y. Yang, K. Zampogiannis, Y. Zhang, F. Barranco, and M. Pfeiffer, "Prediction of manipulation actions," International Journal of Computer Vision, pp. 1–17, 2017.
[34] S. P. Boyd and L. Vandenberghe, Convex Optimization. New York, NY, USA: Cambridge University Press, 2004.
[35] S. Caron, Q.-C. Pham, and Y. Nakamura, "Stability of surface contacts for humanoid robots: Closed-form formulae of the contact wrench cone for rectangular support areas," in IEEE International Conference on Robotics and Automation, 2015, pp. 5107–5112.
[36] Y. Nakamura, K. Yamane, Y. Fujita, and I. Suzuki, "Somatosensory computation for man-machine interface from motion-capture data and musculoskeletal human model," IEEE Trans. on Robotics, vol. 21, no. 1, pp. 58–66, 2005.
[37] R. Dumas, L. Chèze, and J.-P. Verriest, "Adjustments to McConville et al. and Young et al. body segment inertial parameters," Journal of Biomechanics, vol. 40, no. 3, pp. 543–553, 2007.
[38] S. Caron, Q.-C. Pham, and Y. Nakamura, "ZMP support areas for multi-contact mobility under frictional constraints," IEEE Trans. on Robotics, vol. 33, no. 1, pp. 67–80, 2016.
[39] S. Lengagne, Ö. Terlemez, S. Laturnus, T. Asfour, and R. Dillmann, "Retrieving contact points without environment knowledge," in IEEE-RAS International Conference on Humanoid Robots, 2012, pp. 841–846.
[40] R. Collobert, K. Kavukcuoglu, and C. Farabet, "Torch7: A Matlab-like environment for machine learning," in BigLearn, NIPS Workshop, 2011.
[41] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
[42] M. Andersen, J. Dahl, and L. Vandenberghe, "CVXOPT: A Python package for convex optimization," abel.ee.ucla.edu/cvxopt, 2013.
[43] K. Bouyarmane and A. Kheddar, "Humanoid robot locomotion and manipulation step planning," Advanced Robotics, vol. 26, no. 10, pp. 1099–1126, 2012.

Tu-Hoa Pham is currently a postdoctoral researcher at IBM Research Tokyo. He received the Dipl.-Ing. Supaéro degree from ISAE, the M.Sc. in Mathematics from Université Paul Sabatier (France, 2013) and the Ph.D. in robotics from Université de Montpellier (France, 2016), conducted between the CNRS–AIST Joint Robotics Laboratory, Japan, and CNRS–UM LIRMM, France. His research interests include robot vision and learning for the monitoring of human activities and learning from demonstration.

Stéphane Caron is a researcher in humanoid locomotion at CNRS–University of Montpellier LIRMM (France). An alumnus of the École Normale Supérieure (ENS Paris), he stayed one year at the Technicolor Lab, Palo Alto (California), before joining the Nakamura Lab for his doctoral studies. He received the Ph.D. in Mechano-Informatics from the University of Tokyo (Japan) in 2016, with a thesis on multi-contact motion planning for humanoid robots. His research interests include contact interaction, numerical optimization and predictive control, all related to the broader field of humanoid locomotion.

Abderrahmane Kheddar (M'04, SM'12) received the B.S. in Computer Science from the Institut National d'Informatique (ESI), Algiers, and the M.Sc. and Ph.D. degrees in robotics, both from the Université Pierre et Marie Curie, Paris. He is presently Directeur de Recherche at CNRS and the Director of the CNRS–AIST Joint Robotics Laboratory (JRL), UMI3218/RL, Tsukuba, Japan. He is also leading the Interactive Digital Humans (IDH) team at CNRS–University of Montpellier LIRMM, France. His research interests include haptics, humanoids and thought-based control using brain-machine interfaces. He is a founding member of the IEEE/RAS chapter on haptics, and the co-chair and founding member of the IEEE/RAS Technical Committee on model-based optimization. He is a member of the steering committee of the IEEE Brain Initiative, an Editor of the IEEE Transactions on Robotics, and on the editorial board of other robotics journals; he is a founding member of the IEEE Transactions on Haptics and served on its editorial board for three years (2007–2010). He is an IEEE Senior Member, a titular member of the National Academy of Technology of France, and a knight in the National Order of Merit.