Content uploaded by Marie Charbonneau

Author content

All content in this area was uploaded by Marie Charbonneau on Jul 21, 2018

Content may be subject to copyright.

A Control Architecture with Online Predictive Planning for

Position and Torque Controlled Walking of Humanoid Robots

Stefano Dafarra1,2, Gabriele Nava1,2, Marie Charbonneau1,2, Nuno Guedelha1,2, Francisco Andrade1,2,

Silvio Traversaro1, Luca Fiorio1, Francesco Romano3, Francesco Nori3, Giorgio Metta1, and Daniele Pucci1

Abstract— A common approach to the generation of walking

patterns for humanoid robots consists in adopting a layered

control architecture. This paper proposes an architecture com-

posed of three nested control loops. The outer loop exploits

a robot kinematic model to plan the footstep positions. In

the mid layer, a predictive controller generates a Center of

Mass trajectory according to the well-known table-cart model.

Through a whole-body inverse kinematics algorithm, we can

deﬁne joint references for position controlled walking. The

outcomes of these two loops are then interpreted as inputs of

a stack-of-task QP-based torque controller, which represents

the inner loop of the presented control architecture. This

resulting architecture allows the robot to walk also in torque

control, guaranteeing higher level of compliance. Real world

experiments have been carried on the humanoid robot iCub.

I. INTRODUCTION

Despite decades of research in the subject, robust bipedal

locomotion of humanoid robots is still a challenge for the

Robotics community. The unpredictability of the terrain,

the nonlinearity of the robot-environment models, and the

low efﬁciency of the robot actuators - that are a far cry

from the human musculoskeletal system - are only a few

complexities that render robot bipedal locomotion an active

research domain. In this context, feedback control algorithms

for robust bipedal locomotion are of primary importance.

This paper contributes towards this direction by presenting an

on-line predictive kinematic planner for foot-step positioning

and center-of-mass (CoM) trajectories. This planner is also

integrated with a stack-of-task torque controller, which en-

sures a degree of robot compliance and further increases the

overall system robustness to external perturbations.

A recent approach for bipedal locomotion control that

became popular during the DARPA Robotics Challenge [1]

consists in deﬁning a hierarchical control architecture. Each

layer of this architecture receives inputs from the robot and

its surrounding environment, and provides references to the

the layer below. The lower the layer, the shorter the time

horizon that is used to evaluate the outputs. Also, lower

layers usually employ more complex models to evaluate the

outputs, but the shorter time horizon often results in faster

computations for obtaining these outputs.

This project has received funding from the European Unions Horizon

2020 research and innovation programme under grant agreement No. 731540

(An.Dy).

1iCub Facility Department, Istituto Italiano di Tecnologia, 16163 Gen-

ova, Italy (e-mail: name.surname@iit.it)

2Universit`

a degli Studi di Genova, DIBRIS

3Romano (fraromano@google.com) and Nori (fnori@google.com) are

with Google DeepMind, London, UK.

From higher to lower layers, trajectory optimization for

foot-step planning,receding horizon control (RHC), and

whole-body quadratic programming control represent a com-

mon stratiﬁcation of the control architecture for bipedal

locomotion control.

Trajectory optimization for foot-step planning is in charge

of ﬁnding a sequence of robot footholds, which is also funda-

mental for maintaining the robot balance. This optimisation

can be achieved by considering both kinematic and dynamic

robot models [2], [3], [4], and with different optimisation

techniques, such as the Mixed Integer Programming [5]. The

robust-and-repeatable application of these sophisticated tech-

niques, however, still require time-consuming gain tuning

since they usually consider complex models characterizing

the hazardous environment surrounding the robot.

For a certain number of applications, the terrain can be

considered to be ﬂat. In these cases, it is known that the

human upper body is usually kept tangent to the walking

path [6], [7] all the more so because stepping aside, i.e.

perpendicular to the path, is energetically inefﬁcient [8]. All

these considerations suggest to use a simple kinematic model

to generate the walking trajectories: the unicycle model (see,

e.g., [9]). This model can be used to plan footsteps in a

corridor with turns and junctions using cameras [10], or

to perform evasive robot motions [11]. In all these cases,

however, the robot velocity is kept to a constant value.

Receding horizon control (RHC), also referred to as Model

Predictive Control (MPC) [12], is often in charge of ﬁnding

feasible trajectories for the robot center of mass along the

walking path. The computational burden to ﬁnd feasibility

regions, however, usually calls for using simpliﬁed models

to characterise the robot dynamics. The Linear Inverted

Pendulum [13] and the Capture Point [14] models represent

two widespread simpliﬁed robot models. These simpliﬁed

linear models have allowed on-line RHC, also providing

references for the footstep locations [15], [16], [17].

Whole-body quadratic-programming (QP) controllers are

instantaneous algorithms that usually ﬁnd (desired) joint

torques and contact forces achieving some desired robot

accelerations. In this framework, the generated joint-torques

and contact forces can satisfy inequality constraints, which

allow to meet friction constraints. The desired accelerations,

that QP controllers track, shall ensure the stabilisation of

reference positions coming from the RHC layer. Although

the reference positions may be stabilised by a joint-position

control loop, joint-torque based controllers have shown to

ensure a degree of compliance, which also allows safe inter-

arXiv:1807.05395v1 [cs.RO] 14 Jul 2018

actions with the environment [18], [19]. From the modeling

point of view, full-body ﬂoating-base models are usually

employed in QP controllers. These controllers are often

composed of several tasks, organised with strict or weighted

hierarchies [20], [21], [22], [23], [24].

This paper presents a reactive control architecture for

bipedal robot locomotion, namely, the entire architecture uses

on-line feedback from the robot and user-desired quantities.

This architecture implements the above three layers, and can

be launched on both position and torque controlled robots.

The trajectory optimization for foot-step planning is

achieved by a planning module that uses a simpliﬁed kine-

matic robot model, namely, the unicycle model. Feet po-

sitions are updated depending on the robot state and on

a desired direction coming from a joypad, which gives a

human user teleoperating the robot the possibility of deﬁning

desired walking paths. Differently from [10], we do not

ﬁx the robot velocity. Compared to [15], [16], [17], we do

not assume the robot to be always in single support. As a

consequence, the robot avoids to step in place continuously

if the desired robot position does not change, or changes

slowly. Another drawback of these approaches is that feet

rotations are planned separately from linear positions, and

this drawback is overcome by our approach.

Once footsteps are deﬁned, a receding horizon control

(RHC) module generates kinematically feasible trajectories

for the robot center of mass and joint trajectories by using the

LIP model [25] and whole-body inverse kinematics. Hence,

we separate the generation of footsteps from the computation

of feasible CoM trajectories. The implemented RHC module

runs at a frequency of 100 Hz.

The CoM and feet trajectories generated by the RHC

module are then streamed as desired values to either a

joint-position control loop, or to a whole-body quadratic-

programming (QP) controller running at 100 Hz. This latter

controller generates desired joint torques, ensuring a degree

of robustness and robot compliance. The desired joint torques

are then stabilised by a low-level joint torque controller

running at 1 kHz. Experimental validations of the proposed

approach are carried out on the iCub humanoid robot [26],

with both position and joint torque control experiments.

In light of the above, this paper presents a torque-control,

on-line RHC architecture for bipedal robot locomotion.

II. BACKGROU ND

A. Notation

In this paper we are going to use the following notation:

•The ith component of a vector xis denoted as xi.

•Iis a ﬁxed inertial frame with respect to (w.r.t.) which

the robot’s absolute pose is measured.

•e1:= (1,0)Tand e2:= (0,1)Tdenote the canonical

basis vectors of R2.

•R2(θ)∈SO(2) is the rotation matrix of an angle θ∈R;

S2=R2(π/2) is the unitary skew-symmetric matrix.

•given a function of time f(t)the dot notation denotes

the time derivative, i.e. ˙

f:=df

dt. Higher order deriva-

tives are denoted with a corresponding amount of dots.

Ox

y

v

ωθ

Bx

By

d

F

2m

Fig. 1: Notation. The unicycle model is a planar model of

a robot having two wheels placed at a distance 2m,m∈

Rwith a coinciding rotation axis. Hence, this mobile robot

cannot move sideways, i.e. along the wheel axis, but it can

turn by moving the wheel at different velocity. Bis a frame

attached to the robot whose origin is located in the middle of

the wheels axis. Point Fis attached to the robot. Its position

expressed in Bis given by d∈R2, a constant vector.

•1n∈Rn×ndenotes the identity matrix of dimension n.

•S(·)is the skew-symmetric operation associated with

the cross product in R3.

•The vee operator denoted by (·)∨is the inverse of the

S(·)operation.

•The Euclidean norm of a vector of coordinates v∈Rn

is denoted by kvk.

•ARB∈SO(3) and ATB∈SE(3) denote the rotation

and transformation matrices which transform a vector

expressed in the Bframe into a vector expressed in the

Aframe.

B. The unicycle model

The unicycle model, represented in Fig. 1. is described by

the following model equations [27]:

˙x=vR2(θ)e1,(1a)

˙

θ=ω, (1b)

with v∈Rand ω∈Rthe robot’s rolling and rotational

velocity, respectively. x∈R2is the unicycle position in the

inertial frame I, while θ∈Rrepresents the angle around the

z−axis of Iwhich aligns the inertial reference frame with a

unicycle ﬁxed frame. The variables vand ωare considered

as kinematic control inputs.

A reasonable control objective for this kind of model is

to asymptotically stabilize the point Fabout a desired point

F∗whose position is deﬁned as x∗

F. For this purpose, deﬁne

the error ˜xas

˜x:=xF−x∗

F.(2)

so that the control objective is equivalent to the asymptotic

stabilization of ˜xto zero. Since

xF=x+R2(θ)d,

then by differentiation it yields

˙xF= ˙x+ωR2(θ)S2d. (3)

Eq. (3) describes the output dynamics. Substituting Eq. (1)

into Eq. (3), we can rewrite the output dynamics as:

˙xF=B(θ)u, (4)

where u= [v ω]is the vector of control inputs. It is easy to

show that det B(θ) = d1, which means that when the control

point Fis not located on the wheels’ axis, its stabilization

to an arbitrary reference position F∗can be achieved by the

use of simple feedback laws. For example, if we deﬁne

u=B(θ)−1( ˙x∗

F−K˜x)(5)

with Ka positive deﬁnite matrix, then we have

˙

˜x=−K˜x. (6)

Thus, the origin of the error dynamics is an asymptotically

stable equilibrium. Notice that this control law is not deﬁned

when d1= 0.

C. Zero Moment Point Preview Control

The MPC controller implemented in this work has been

derived from the Zero Moment Point (ZMP) preview control

described in [25]. This algorithm adopts a simpliﬁed model,

viz. the table-cart model, based on the Linear Inverted

Pendulum (LIP) approximate model. The humanoid model is

assimilated to an inverted pendulum that is linearised around

the vertical position. The ZMP can be related to the CoM

position and acceleration by the following equation:

xZMP =P2DxCoM −xCoM,z

gP2D¨xCoM (7)

where gis the gravitational constant, xZMP ∈R2denotes the

Zero Moment Point (ZMP) [28] coordinate, while xCoM ∈R3

is the center of mass 3D coordinate. xCoM,z , i.e. the CoM

height, is assumed constant and P2Dis the matrix projecting

the CoM on the 2D plane, i.e. it considers only the xand y

coordinates of the center of mass.

Assuming to have a desired ZMP trajectory x∗

ZMP(t),

we want to track this signal at every time instant. One

possibility is to consider the ZMP as an output of following

the dynamical system:

˙χ=AZMP χ+BZMPu, y =CZMPχ(8)

where the new state variable χand control uare deﬁned as

χ:=

xCoM,x,y

˙xCoM,x,y

¨xCoM,x,y

∈R6, u :=...

xCoM,x,y∈R2.(9)

The system matrices are deﬁned as in the following:

AZMP =

02×21202×2

02×202×212

02×202×202×2

, BZMP =04×2

12

CZMP =h1202×2−xCoM,z

g12i.

(10)

Deﬁning a cost function J

J=Ztf

0

kx∗

ZMP −CZMPχk2

Q+kuk2

Rdτ. (11)

where Qand Rare two weight matrices of suitable dimen-

sion. Jpenalizes the output tracking error plus a regulariza-

tion term on the effort. In order to minimize Jgiven Eq. (8)

we can implement a Linear Quadratic controller. This simple

controller allows to track a desired ZMP trajectory.

D. System Modeling

The full model of a humanoid robot is composed of n+ 1

rigid bodies, called links, connected by njoints with one

degree of freedom each. We also assume that none of the

links has an a priori constant pose with respect to an inertial

frame, i.e. the system is free ﬂoating.

The robot conﬁguration space can then be characterized

by the position and the orientation of a frame attached to

a robot’s link, called base frame B, and the joint conﬁg-

urations. Thus, the robot conﬁguration space is the group

Q=R3×SO(3) ×Rn. An element q∈Qcan be deﬁned

as the following triplet: q= (IpB,IRB, s)where IpB∈R3

denotes the position of the base frame with respect to the

inertial frame, IRB∈R3×3is a rotation matrix representing

the orientation of the base frame, and s∈Rnis the joint

conﬁguration.

The velocity of the multi-body system can be characterized

by the algebra Vof Qdeﬁned by: V=R3×R3×Rn.

An element of Vis then a triplet ν= (I˙pB,IωB,˙s) =

(vB,˙s), where IωBis the angular velocity of the base frame

expressed w.r.t. the inertial frame, i.e. I˙

RB=S(IωB)IRB.

A more detailed description of the ﬂoating base model is

provided in [29].

We also assume that the robot is interacting with the envi-

ronment exchanging ncdistinct wrenches1. The application

of the Euler-Poincar´

e formalism [30, Ch. 13.5]:

M(q) ˙ν+C(q, ν )ν+G(q) = Bτ +

nc

X

k=1

J>

Ckfk(12)

where M∈Rn+6×n+6 is the mass matrix, C∈

R(n+6)×(n+6) accounts for Coriolis and centrifugal effects,

G∈Rn+6 is the gravity term, B= (0n×6,1n)>is a selector

matrix, τ∈Rnis a vector representing the internal actuation

torques, and fk∈R6denotes the k-th external wrench

applied by the environment on the robot. The Jacobian

JCk=JCk(q)is the map between the robot’s velocity ν

and the linear and angular velocity at the k-th contact link.

Lastly, it is assumed that a set of holonomic constraints

acts on System (12). These holonomic constraints are of the

form c(q) = 0, and may represent, for instance, a frame

having a constant pose w.r.t. the inertial frame. In the case

this frame corresponds to the location at which a rigid contact

occurs on a link, we represent the holonomic constraint as

JCk(q)ν= 0.

Hence, the holonomic constraints associated with all the

rigid contacts can be represented as

J(q)ν=

JC1(q)

· · ·

JCnc(q)

ν=JbJjν=JbvB+Jj˙s= 0,(13)

with Jb∈R6nc×6, Jj∈R6nc×n. The base frame velocity is

denoted by vB∈R6.

1As an abuse of notation, we deﬁne as wrench a quantity that is not the

dual of a twist, but a 6D force/moment vector.

III. ARCHITECTURE

In this section we summarise the components constituting

the presented architecture, namely:

•the footstep planner,

•the RH controller,

•the stack-of-task-balancing controller.

A. The Footstep Planner

Consider the unicycle model presented in Section II-B.

Assuming to know the reference trajectory for point Fup to

time T, we can integrate the closed-loop system described in

Section II-B to obtain the trajectory spanned by the unicycle.

The next step consists in discretizing the unicycle trajectories

at a ﬁxed rate 1/dt. This passage allows us to search for the

best foot placement option in a smaller space, constituted

by the set of discretization points taken from the original

unicycle trajectories.

Given a discrete instant k∈N, we can deﬁne xk=x(kdt)

as the position of the unicycle at instant k, while θk=θ(kdt)

is its orientation along the z−axis. We can use the discretized

trajectories to reconstruct the desired position of the left and

right foot, xl

kand xr

k, using the following relations:

xl

k=xk+R2(θk)0

m, xr

k=xk+R2(θk)0

−m.(14a)

θl

k=θr

k=θk.(14b)

A step contains two phases: double support and single

support. During double support, both robot feet are on

ground and depending on the foot, we can distinguish two

different states: switch-in if the foot is being loaded, switch-

out otherwise. In single support instead, a foot is in a swing

state if it is moving, stance otherwise. The instant in which

a foot lands on the ground is called impact time,tf

imp. After

this event, the foot will experience the following sequence

of states: switch-in →stance →switch-out →swing.This

sequence is terminated by an another impact time. At the

beginning of the switch-out phase, the other foot has landed

on the ground with an impact time t∼f

imp. The step duration,

∆tis then deﬁned as:

∆t=t∼f

imp −tf

imp.

We deﬁne additional quantities relating the two feet when in

double support (ds):

•the orientation difference when in double support ∆θ=

|θl

ds −θr

ds|.

•The feet distance ∆x=kxl

ds −xr

dsk.

•The position of the left foot expressed on the frame

rigidly attached to the right foot, rPl.

These quantities will be used to determine the feet trajecto-

ries starting from the unicycle ones.

Given the discretized unicycle trajectories, a possible

policy consists in ﬁxing the duration of a step, namely

∆t=const, or in ﬁxing its length, i.e. setting ∆xto a

constant. Both these two strategies are not desirable. In the

former case the robot will always take (eventually) very short

steps at maximum speed. In the latter, if the unicycle is

advancing slowly (because of slow moving references), the

robot will take steps always at maximum length sacriﬁcing

the walking speed. In view of these considerations, we would

like the planner to modify step length and speed depending

on the reference trajectory. Since we would like to avoid

ﬁxing any variable, it is necessary to deﬁne a cost function.

It is composed of two parts. The ﬁrst part weights the squared

inverse of ∆t, thus penalizing fast steps, while the second

penalize the squared 2-norm of ∆x, avoiding to take long

steps. Thus, the cost function φcan be written as:

φ=kt

1

∆2

t

+kxk∆xk2,(15)

where ktand kxare two positive numbers. Depending on

their ratio, the robot will take long-and-slow or short-and-

fast steps. Notice that φis not deﬁned when ∆t= 0, but the

robot would not be able to take steps so fast. Thus, we need

to bound ∆t:

tmin ≤∆t≤tmax (16)

where the upper bound avoids a step to be too slow.

Regarding ∆xit needs to be lower than an upper-bound

dmax, bounding the swinging foot into a circular area drawn

around the stance foot with radius dmax. Another constraint

to be considered is the relative rotation of the two feet. In

particular the absolute value of ∆θmust be lower than θmax.

Finally, in order to avoid the robot to twist its legs, we

simply resort to a bound on the y−component of rPlvector.

Indeed, it corresponds to the width of the step. By applying

a lower-bound wmin on it, we avoid the left foot to be on

the right of the other foot.

Finally we can write the constraints deﬁned above, to-

gether with φ, as an optimization problem:

minimize

timp

Kt

1

∆2

t

+Kxk∆xk2(17a)

s.t. :tmin ≤∆t≤tmax (17b)

∆x≤dmax (17c)

|∆θ| ≤ θmax (17d)

wmin ≤rPl,y.(17e)

The decision variables are the impact times. If we select

an instant kto be the impact time for a foot, then we can

obtain the corresponding foot position and orientation using

Eq. (14) and Eq. (15).

The solution of the optimization problem is obtained

through a simple iterative algorithm. The initialization can

be done by using the measured position of the feet. Starting

from this conﬁguration we can integrate the unicycle tra-

jectories assuming to know the reference trajectories. Once

we discretize them, we can iterate over kuntil we ﬁnd a set

of points which satisfy the conditions deﬁned above. Among

the remaining points we can evaluate φto ﬁnd our optimum.

Once we planned footsteps for the full time horizon, we

can proceed in interpolating the feet trajectories and other

relevant quantities for the walking motion, e.g. the Zero

Moment Point [28].

While walking, references may change and it is not

desirable to wait the end of the planned trajectories before

changing them. Thus the idea is to “merge” two trajecto-

ries instead of serializing them. When generating a new

trajectory, the robot is supposed to be standing on two

feet and the ﬁrst switch needs to last half of its normal

time. In view of these considerations, the middle instant of

the double support phase is a particularly suitable point to

merge two trajectories. The choice of this point eases the

merging process since the feet are not supposed to move

in that instant, hence the initialization of these trajectories

does not need to know their initial desired velocity and/or

acceleration. Notice that there may be different merge points

along the trajectories, depending on the number of (full)

double support phases. Two trivial merge points are the very

ﬁrst instant of the trajectories themselves and the ﬁnal point

(serialization case).

This method can be assimilated to the Receding Horizon

Principle [31]. In fact, we plan trajectories for an horizon T

but only a portion of them will be actually used, namely the

ﬁrst generated step. This simple strategy allows us to change

the unicycle reference trajectory on-line (through the use of a

joystick for example) directly affecting the robot motion. In

addition, we could correct the new trajectories with the actual

position of the feet. This is particularly suitable when the foot

placement is not perfect, as in torque-controlled walking.

B. The Receding Horizon Controller

The receding horizon controller used in our architecture

inherits from the basic formulation described in Sec. II-C.

Nevertheless, differently from [25], we have added time-

varying constraints to bound the ZMP inside the convex hull

given by the supporting foot or feet. In this way we increase

the robustness of the controller, avoiding overshoots that may

cause the robot to tip around one of the foot edges. These

constraints are modeled as linear inequalities acting on the

state variables, i.e.

Z(t)χ−z(t)≤0.(18)

Note that the constraint matrix and vector do depend on

the time. In particular, we can exploit the knowledge on the

desired feet positions to constraint the ZMP throughout the

whole horizon, while retaining linearity

To summarize, the full optimal control problem can be

represented by the following minimisation problem:

min

χ,u Ztf

0

kx∗

ZMP −CZMPχk2

Q+kuk2

Rdτ.

s.t. ˙χ=Aχ +Bu, ∀t∈[t0, tf)

Z(t)χ−z(t)≤0

χ(0) = ¯χ.

(19)

The problem in Eq. (19) is solved at each control sampling

time by means of a Direct Multiple Shooting method [32],

[33]. This choice has been driven by the necessity of for-

mulating state constraints, Eq. (18), throughout the whole

horizon.

Applying the Receding Horizon Principle, only the ﬁrst

output of the MPC controller is used. Since the control input

is the CoM jerk, we integrate it in order to obtain a desired

CoM velocity and position. The latter is also sent to the

inverse kinematics (IK) library [34, InverseKinematics sub-

library], together with the desired feet positions to retrieve

the desired joints coordinates. Both the MPC an the IK

modules are implemented using IPOpt [35].

C. The Stack of Tasks Balancing Controller

The balancing controller has as objective the stabilization

of the center of mass position, the left and right feet positions

and orientations by deﬁning joint torques. The velocities

associated to these tasks are stacked together into Υ:

Υ = h˙p>

Gv>

Lv>

Ri>

(20)

where ˙pG∈R3is the linear velocity of the center of mass,

frame, vL∈R6and vR∈R6are vectors of linear and

angular velocities of the frames attached to the left and right

feet.

Letting JG,JLand JRdenote respectively the Jacobians

of the center of mass position, left and right foot conﬁg-

urations, JΥcan be deﬁned as a stack of the Jacobians

associated to each task:

JΥ=hJ>

GJ>

LJ>

Ri>

(21)

Furthermore, the task velocities Υcan be computed from

νusing Υ = JΥν. By deriving this expression, the task

acceleration is

˙

Υ = ˙

JΥν+JΥ˙ν(22)

In view of (12) and (22), the task accelerations ˙

Υcan be

formulated as a function of the control input ucomposed of

joint torques τand contact wrenches fk:

˙

Υ(u) = ˙

JΥν+JΥM−1(Bτ +

nc

X

k=1

J>

Ckfk−h).(23)

In our formulation we want ˙

Υ(u)to track a speciﬁed task

acceleration ˙

Υ∗, namely:

˙

Υ(u) = ˙

Υ∗.(24)

The reference accelerations ˙

Υ∗are computed using a simple

PD control strategy for what concerns the linear terms. In

parallel, the rotational terms are obtained by adopting a PD

controller on SO(3) (see [36, Section 5.11.6, p.173]), thus

avoiding to use a parametrization like Euler angles.

While the task deﬁned above can be considered as high

priority tasks, we may conceive other possible objectives

to shape the robot motion. One example regards the torso

orientation, which we would like to keep in an upright

posture. Similarly to what have been done above, we have

T=JTνwith JTthe torso Jacobian. By differentiation we

would obtain a result similar to Eq. (23). Since this task is

considered as low priority, we are interested in minimizing

the squared Euclidean distance of ˙

T(u)from a desired value

˙

T∗, i.e

min

τ,fk

1

2k˙

T(u)−˙

T∗k2.(25)

The reference acceleration ˙

T∗is obtained through a PD

controller on SO(3).

Finally, we also added another lower priority postural task

to track joint conﬁgurations. Similarly as before, the joint

accelerations ¨s(u), which depends upon the control inputs

through the dynamics equations Eq. (12), are kept close to

a reference ¨s∗:

min

τ,fk

1

2k¨s(u)−¨s∗k2.(26)

The postural desired accelerations are deﬁned through a

PD+gravity compensation control law, [23].

Considering the contact wrenches fkas control inputs, we

need to ensure their actual feasibility given the contacts. They

are modeled as unilateral, with a limited friction coefﬁcient.

An additional condition regards the Center of Pressure,

CoP, which needs to lie inside the contact patch. All these

conditions can be grouped in a set of inequality constraints

applied to the contact wrenches

Cfk≤b∀k≤nc.(27)

Finally, we can group Eq.s (24)−(27) in the following QP

formulation:

min τ, fkΓ(28a)

s.t. :˙

Υ(u) = ˙

Υ∗(28b)

Cfk≤b∀k≤nc(28c)

where Γ = 1

2k¨s(u)−¨s∗k2+wT

2k˙

T(u)−˙

T∗k2+wτ

2kτk2.

With respect to Eq. (25) and (26), an additional term is

inserted, namely the 2-norm of the joint torques, which

serves as a regularization. Among several feasible solutions,

we are mostly interested in the one which requires the least

effort. By changing the weights wwe can tune the relative

importance of each term.

Let us focus on the feet contact wrenches fland fr.

During the switch phases we expect one of the two wrenches

to vanish in order to smoothly deactivate the corresponding

contact. In order to ease this process we added a cost term in

Γ, equal to wf

2Frkflk+Flkfrkwhere Fr(respectively

Fl) is the normalized load that the right (respectively left)

foot is supposed to carry. It is 1when the corresponding

foot is supposed to hold the full weight of the robot, 0

when unloaded. This information is provided by the planner

described in Sec III-A. The QP problem of Eq. (28) is solved

at a rate of 100Hz through qpOASES [37].

IV. EXP ERI MEN TS

The presented architecture is composed by three different

parts. In this section, we are going to show three different

experiments whose aim is to test each part in an incremental

way. All the experiments have been performed on the iCub

robot, controlling 23 Dofs either in position or in torque

control. The complete experiments videos are available as

multimedia attachements to this paper

MPC RRR Inverse

Kinematics Robot

u≡...

xCoM

¨xCoM

˙xCoM

xCoM

xd

ZMP xfeet

sds, ˙s

Forward

Kinematics

xCoM

˙xCoM

¨xCoM xCoM

Fig. 2: A ﬁrst skeleton of the architecture, composed only by

the MPC (RH) controller connected to the inverse kinematics

(IK). Their output is directly sent to the robot.

(a) t=t0(b) t=t0+ 1s(c) t=t0+ 2s(d) t=t0+ 3s

Fig. 3: Snapshots extracted from the accompanying video.

The robot is walking straight in position control.

A. Test of the RH controller in position control

In the ﬁrst experiment we used the RH module to control

the robot joints in position control. As depicted in the

architecture of Figure 2 the control loop is closed on the

CoM position only, avoiding to stream in the controller

noisy measurements like joints velocities. The desired feet

trajectories xfeet and the desired ZMP proﬁle xd

ZMP are

obtained by an ofﬂine planner described in [38].

Both the RHC and the IK are running on the iCub

head, which is equipped with a 4th generation Intel R

Core

i7@1.7GHz and 8GB of RAM. The whole architecture takes

in average less than 8ms for a control loop.

The robot is taking steps of 14cm long in 1.25s(of which

0.65sin double support). Fig. 3 presents some snapshots of

the robot while walking.

B. Adding the unicycle planner

The controller has been improved by connecting the uni-

cycle planner, as shown in Fig. 4. This allow us to adapt the

robot walking direction in an online fashion. As an example,

by using a joystick we can drive a reference position x∗

F

for the point Fattached to the unicycle. Depending on how

much we press the joypad, this point moves further away

from the robot.

Using this planner, the step timings and dimensions are

not ﬁxed a priori. In this particular experiment, a single step

could last between 1.3sand 5.0s, while the swing foot can

land in a radius of 0.175mfrom the stance foot. Fig. 5

presents some snapshots of the robot while negotiating a

right turn.

MPC RRR Inverse

Kinematics Robot

u≡...

xCoM

¨xCoM

˙xCoM

xCoM

xd

ZMP xfeet

sds, ˙s

Forward

Kinematics

xCoM

˙xCoM

¨xCoM xCoM

Unicycle

Planner

x∗

F

Fig. 4: The scheme of Figure 2 has been upgrade with the

unicycle planner which is responsible of providing online

references.

(a) t=t0(b) t=t0+ 1s(c) t=t0+ 2s(d) t=t0+ 3s

Fig. 5: Thanks to the unicycle planner, the robot walks while

turning right.

C. Complete Architecture

Finally, we plugged also the task based balancing con-

troller presented in Sec. III-C. The overall architecture is

depicted in Figure 6, highlighting that the stack of task

balancing controller is now in charge of sending commands

the robot. In order to draw comparisons with the ﬁrst

experiments, we followed again a straight trajectory. Even if

the task is similar, it is necessary to use the Unicycle Planner

now in order to cope with feet positioning errors (see Fig.

8). In fact, trajectories can be updated in order to take into

account possible feet misplacements. Fig. 7 shows the CoM

tracking capabilities of the presented balancing controller.

During this experiment, the minimum stepping time had to be

increased to 3s(the maximum is still 5s), while maintaining

the same maximum step length of the previous experiment.

Even if we had to slow down a bit the walking speed, the

results are still promising. While walking, the robot is much

smoother in its motion, reducing the probabilities of falling

in case of an unforeseen disturbances.

V. CONCLUSIONS

This architecture is able to cope with variable walking

speed and advance while turning while keeping certain

degree of compliance to better adapt to the online changes

of desired reference trajectories. Focusing on torque control,

experiments have shown some bottlenecks that, for the time

being, prevented to achieve higher performances. These

issue are mainly related to the estimation of joint torques

MPC RRR Inverse

Kinematics

Balancing

Controller

u≡...

xCoM

¨xCoM

˙xCoM

xCoM

xd

ZMP xfeet

sd

s, ˙s

Forward

Kinematics

xCoM

˙xCoM

¨xCoM xCoM

Unicycle

Planner

x∗

F

Robot

s, ˙sτ

Fig. 6: The complete architecture. The output of the IK is no

longer sent to the robot, but to the stack-of-task balancing

controller, together with the desired CoM and feet position.

Joint positions and velocities are taken as feedback from the

robot.

0 10 20 30 40 50 60

-0.2

0

0.2

0.4

0.6

Fig. 7: Center of Mass (CoM) tracking when taking some

steps in torque control. The quantities are expressed in an

inertial frame I

(limiting the performances of the joint torque controller) and

to the ﬂoating base estimation. As a future work, we plan

to increase the robustness of the architecture against these

uncertainties.

Even if the results presented on this paper have to be

considered as preliminary, they enlighten the ﬂexibility prop-

erties of the proposed architecture. The result on torque

control walking is promising, since its inherent compliance

can increase the robustness of the walking motion, against,

for example, un-modeled ground slope variations.

REFERENCES

[1] S. Feng, E. Whitman, X. Xinjilefu, and C. G. Atkeson, “Optimization-

based full body control for the darpa robotics challenge,” Journal of

Field Robotics, vol. 32, no. 2, pp. 293–312, 2015.

[2] H. Dai, A. Valenzuela, and R. Tedrake, “Whole-body motion planning

with centroidal dynamics and full kinematics,” in 2014 IEEE-RAS

International Conference on Humanoid Robots, pp. 295–302.

[3] A. Herzog, N. Rotella, S. Schaal, and L. Righetti, “Trajectory gen-

eration for multi-contact momentum control,” in Humanoid Robots

(Humanoids), 2015 IEEE-RAS 15th International Conference on.

IEEE, 2015, pp. 874–880.

[4] J. Carpentier, S. Tonneau, M. Naveau, O. Stasse, and N. Mansard, “A

versatile and efﬁcient pattern generator for generalized legged locomo-

tion,” in Robotics and Automation (ICRA), 2016 IEEE International

Conference on. IEEE, 2016, pp. 3555–3561.

[5] R. Deits and R. Tedrake, “Footstep planning on uneven terrain

with mixed-integer convex optimization,” in Humanoid Robots (Hu-

manoids), 2014 14th IEEE-RAS International Conference on.

0 10 20 30 40 50 60

-0.1

0

0.1

0.2

0.3

0.4

Fig. 8: Tracking of the left foot positions. Notice that after

the step is taken, the desired values are updated according

to the measured position.

[6] D. Flavigne, J. Pettr´

ee, K. Mombaur, J.-P. Laumond, et al., “Reactive

synthesizing of human locomotion combining nonholonomic and

holonomic behaviors,” in Biomedical Robotics and Biomechatronics

(BioRob), 2010 3rd IEEE RAS and EMBS International Conference

on. IEEE, 2010, pp. 632–637.

[7] K. Mombaur, A. Truong, and J.-P. Laumond, “From human to hu-

manoid locomotionan inverse optimal control approach,” Autonomous

robots, vol. 28, no. 3, pp. 369–383, 2010.

[8] M. L. Handford and M. Srinivasan, “Sideways walking: preferred

is slow, slow is optimal, and optimal is expensive,” Biology letters,

vol. 10, no. 1, p. 20131006, 2014.

[9] P. Morin and C. Samson, Handbook of Robotics. Springer, 2008, ch.

Motion control of wheeled mobile robots, pp. 799–826.

[10] A. Faragasso, G. Oriolo, A. Paolillo, and M. Vendittelli, “Vision-based

corridor navigation for humanoid robots,” in Robotics and Automation

(ICRA), 2013 IEEE International Conference on, pp. 3190–3195.

[11] M. Cognetti, D. De Simone, L. Lanari, and G. Oriolo, “Real-time

planning and execution of evasive motions for a humanoid robot,” in

Robotics and Automation (ICRA), 2016 IEEE International Conference

on. IEEE, 2016, pp. 4200–4206.

[12] D. Mayne, J. Rawlings, C. Rao, and P. Scokaert, “Constrained model

predictive control: Stability and optimality,” Automatica, vol. 36, no. 6,

pp. 789 – 814, 2000.

[13] S. Kajita, F. Kanehiro, K. Kaneko, K. Yokoi, and H. Hirukawa, “The

3d linear inverted pendulum mode: a simple modeling for a biped

walking pattern generation,” in Intelligent Robots and Systems, 2001.

Proceedings. IEEE/RSJ International Conference on, pp. 239–246.

[14] J. Pratt, J. Carff, S. Drakunov, and A. Goswami, “Capture point: A

step toward humanoid push recovery,” in Humanoid Robots, 2006 6th

IEEE-RAS International Conference on, Dec 2006, pp. 200–207.

[15] H. Diedam, D. Dimitrov, P.-B. Wieber, K. Mombaur, and M. Diehl,

“Online walking gait generation with adaptive foot positioning through

linear model predictive control,” in 2008 IEEE/RSJ International

Conference on Intelligent Robots and Systems. IEEE, pp. 1121–1126.

[16] M. Missura and S. Behnke, “Balanced walking with capture steps,” in

Robot Soccer World Cup. Springer, 2014, pp. 3–15.

[17] M. Bombile and A. Billard, “Capture-point based balance and reac-

tive omnidirectional walking controller,” in IEEE RAS International

Conference on Humanoid Robots, no. EPFL-CONF-231920, 2017.

[18] L. Saab, O. E. Ramos, F. Keith, N. Mansard, P. Sou`

eres, and J. Y.

Fourquet, “Dynamic whole-body motion generation under rigid con-

tacts and other unilateral constraints,” IEEE Transactions on Robotics,

vol. 29, no. 2, pp. 346–362, April 2013.

[19] C. Ott, M. A. Roa, and G. Hirzinger, “Posture and balance control

for biped robots based on contact force optimization,” in 2011 11th

IEEE-RAS International Conference on Humanoid Robots, pp. 26–33.

[20] B. J. Stephens and C. G. Atkeson, “Dynamic balance force control

for compliant humanoid robots,” in 2010 IEEE/RSJ International

Conference on Intelligent Robots and Systems, pp. 1248–1255.

[21] A. Herzog, L. Righetti, F. Grimminger, P. Pastor, and S. Schaal, “Bal-

ancing experiments on a torque-controlled humanoid with hierarchical

inverse dynamics,” in 2014 IEEE/RSJ International Conference on

Intelligent Robots and Systems, Sept 2014, pp. 981–988.

[22] S.-H. Lee and A. Goswami, “A momentum-based balance controller

for humanoid robots on non-level and non-stationary ground,” Au-

tonomous Robots, vol. 33, no. 4, pp. 399–414, Nov 2012.

[23] G. Nava, F. Romano, F. Nori, and D. Pucci, “Stability analysis and

design of momentum-based controllers for humanoid robots,” in 2016

IEEE/RSJ International Conference on Intelligent Robots and Systems

(IROS), Oct 2016, pp. 680–687.

[24] D. Pucci, F. Romano, S. Traversaro, and F. Nori, “Highly dynamic

balancing via force control,” in 2016 IEEE-RAS 16th International

Conference on Humanoid Robots (Humanoids), pp. 141–141.

[25] S. Kajita, F. Kanehiro, K. Kaneko, K. Fujiwara, K. Harada, K. Yokoi,

and H. Hirukawa, “Biped walking pattern generation by using preview

control of zero-moment point,” in 2003 IEEE International Conference

on Robotics and Automation, Sept, pp. 1620–1626 vol.2.

[26] L. Natale, C. Bartolozzi, D. Pucci, A. Wykowska, and G. Metta, “icub:

The not-yet-ﬁnished story of building a robot child,” Science Robotics,

vol. 2, no. 13, 2017.

[27] D. Pucci, L. Marchetti, and P. Morin, “Nonlinear control of unicycle-

like robots for person following,” in Intelligent Robots and Systems

(IROS), 2013 IEEE/RSJ International Conference on, pp. 3406–3411.

[28] M. Vukobratovi´

c and B. Borovac, “Zero-moment point: thirty ﬁve

years of its life,” International journal of humanoid robotics, vol. 1,

no. 01, pp. 157–173, 2004.

[29] S. Traversaro, D. Pucci, and F. Nori, “A uniﬁed view of the

equations of motion used for control design of humanoid robots,” On

line, 2017. [Online]. Available: https://traversaro.github.io/preprints/

changebase.pdf

[30] J. E. Marsden and T. S. Ratiu, Introduction to Mechanics and Symme-

try: A Basic Exposition of Classical Mechanical Systems. Springer

Publishing Company, Incorporated, 2010.

[31] H. Michalska and D. Q. Mayne, “Receding horizon control of nonlin-

ear systems,” in Decision and Control, 1989., Proceedings of the 28th

IEEE Conference on. IEEE, 1989, pp. 107–108.

[32] H. Bock and K. Plitt, “A multiple shooting algorithm for direct solution

of optimal control problems*,” IFAC Proceedings Volumes, vol. 17,

no. 2, pp. 1603 – 1608, 9th IFAC World Congress: A Bridge Between

Control Science and Technology, Budapest, Hungary, 2-6 July 1984.

[33] M. Diehl, H. G. Bock, H. Diedam, and P.-B. Wieber, “Fast direct

multiple shooting algorithms for optimal robot control,” in Fast

motions in biomechanics and robotics. Springer Berlin Heidelberg,

2006, pp. 65–93.

[34] Dynamic Interaction Control Lab - Istituto Italiano di Tecnologia,

“iDynTree library,” https://github.com/robotology/idyntree, 2016.

[35] A. W¨

achter and L. T. Biegler, “On the implementation of an interior-

point ﬁlter line-search algorithm for large-scale nonlinear program-

ming,” Mathematical Programming, vol. 106, no. 1, pp. 25–57.

[36] R. Olfati-Saber, “Nonlinear control of underactuated mechanical sys-

tems with application to robotics and aerospace vehicles,” Ph.D.

dissertation, Massachusetts Institute of Technology, 2001.

[37] H. Ferreau, C. Kirches, A. Potschka, H. Bock, and M. Diehl,

“qpOASES: A parametric active-set algorithm for quadratic program-

ming,” Mathematical Programming Computation, vol. 6, no. 4, pp.

327–363, 2014.

[38] Y. Hu, J. Eljaik, K. Stein, F. Nori, and K. Mombaur, “Walking of

the icub humanoid robot in different scenarios: implementation and

performance analysis,” in Humanoid Robots (Humanoids), 2016 IEEE-

RAS 16th International Conference on, pp. 690–696.