A Control Architecture with Online Predictive Planning for
Position and Torque Controlled Walking of Humanoid Robots
Stefano Dafarra1,2, Gabriele Nava1,2, Marie Charbonneau1,2, Nuno Guedelha1,2, Francisco Andrade1,2,
Silvio Traversaro1, Luca Fiorio1, Francesco Romano3, Francesco Nori3, Giorgio Metta1, and Daniele Pucci1
Abstract— A common approach to the generation of walking
patterns for humanoid robots consists in adopting a layered
control architecture. This paper proposes an architecture com-
posed of three nested control loops. The outer loop exploits
a robot kinematic model to plan the footstep positions. In
the mid layer, a predictive controller generates a Center of
Mass trajectory according to the well-known table-cart model.
Through a whole-body inverse kinematics algorithm, we can
define joint references for position-controlled walking. The
outcomes of these two loops are then interpreted as inputs of
a stack-of-task QP-based torque controller, which represents
the inner loop of the presented control architecture. The
resulting architecture allows the robot to walk also in torque
control, guaranteeing a higher level of compliance. Real-world
experiments have been carried out on the humanoid robot iCub.
I. INTRODUCTION
Despite decades of research in the subject, robust bipedal
locomotion of humanoid robots is still a challenge for the
Robotics community. The unpredictability of the terrain,
the nonlinearity of the robot-environment models, and the
low efﬁciency of the robot actuators - that are a far cry
from the human musculoskeletal system - are only a few
complexities that render robot bipedal locomotion an active
research domain. In this context, feedback control algorithms
for robust bipedal locomotion are of primary importance.
This paper contributes towards this direction by presenting an
on-line predictive kinematic planner for foot-step positioning
and center-of-mass (CoM) trajectories. This planner is also
integrated with a stack-of-task torque controller, which en-
sures a degree of robot compliance and further increases the
overall system robustness to external perturbations.
A recent approach for bipedal locomotion control that
became popular during the DARPA Robotics Challenge 
consists in deﬁning a hierarchical control architecture. Each
layer of this architecture receives inputs from the robot and
its surrounding environment, and provides references to
the layer below. The lower the layer, the shorter the time
horizon that is used to evaluate the outputs. Also, lower
layers usually employ more complex models to evaluate the
outputs, but the shorter time horizon often results in faster
computations for obtaining these outputs.
This project has received funding from the European Union's Horizon
2020 research and innovation programme under grant agreement No. 731540.
1iCub Facility Department, Istituto Italiano di Tecnologia, 16163 Genova, Italy (e-mail: email@example.com)
2Università degli Studi di Genova, DIBRIS
3Romano (firstname.lastname@example.org) and Nori (email@example.com) are
with Google DeepMind, London, UK.
From higher to lower layers, trajectory optimization for
foot-step planning, receding horizon control (RHC), and
whole-body quadratic programming control represent a
common stratification of the control architecture for bipedal
locomotion.
Trajectory optimization for foot-step planning is in charge
of ﬁnding a sequence of robot footholds, which is also funda-
mental for maintaining the robot balance. This optimisation
can be achieved by considering both kinematic and dynamic
robot models , , , and with different optimisation
techniques, such as Mixed Integer Programming . The
robust-and-repeatable application of these sophisticated
techniques, however, still requires time-consuming gain tuning,
since they usually consider complex models characterizing
the hazardous environment surrounding the robot.
For a certain number of applications, the terrain can be
considered to be ﬂat. In these cases, it is known that the
human upper body is usually kept tangent to the walking
path ,  all the more so because stepping aside, i.e.
perpendicular to the path, is energetically inefficient . All
these considerations suggest using a simple kinematic model
to generate the walking trajectories: the unicycle model (see,
e.g., ). This model can be used to plan footsteps in a
corridor with turns and junctions using cameras , or
to perform evasive robot motions . In all these cases,
however, the robot velocity is kept at a constant value.
Receding horizon control (RHC), also referred to as Model
Predictive Control (MPC) , is often in charge of ﬁnding
feasible trajectories for the robot center of mass along the
walking path. The computational burden to ﬁnd feasibility
regions, however, usually calls for using simpliﬁed models
to characterise the robot dynamics. The Linear Inverted
Pendulum  and the Capture Point  models represent
two widespread simpliﬁed robot models. These simpliﬁed
linear models have allowed on-line RHC, also providing
references for the footstep locations , , .
Whole-body quadratic-programming (QP) controllers are
instantaneous algorithms that usually ﬁnd (desired) joint
torques and contact forces achieving some desired robot
accelerations. In this framework, the generated joint torques
and contact forces can satisfy inequality constraints, which
allows friction constraints to be met. The desired accelerations,
that QP controllers track, shall ensure the stabilisation of
reference positions coming from the RHC layer. Although
the reference positions may be stabilised by a joint-position
control loop, joint-torque based controllers have been shown
to ensure a degree of compliance, which also allows safe
interactions with the environment , . From the modeling
point of view, full-body floating-base models are usually
employed in QP controllers. These controllers are often
composed of several tasks, organised with strict or weighted
hierarchies , , , , .
arXiv:1807.05395v1 [cs.RO] 14 Jul 2018
This paper presents a reactive control architecture for
bipedal robot locomotion, namely, the entire architecture uses
on-line feedback from the robot and user-desired quantities.
This architecture implements the above three layers, and can
be launched on both position and torque controlled robots.
The trajectory optimization for foot-step planning is
achieved by a planning module that uses a simpliﬁed kine-
matic robot model, namely, the unicycle model. Feet po-
sitions are updated depending on the robot state and on
a desired direction coming from a joypad, which gives a
human user teleoperating the robot the possibility of deﬁning
desired walking paths. Differently from , we do not
ﬁx the robot velocity. Compared to , , , we do
not assume the robot to be always in single support. As a
consequence, the robot avoids stepping in place continuously
if the desired robot position does not change, or changes
slowly. Another drawback of these approaches is that feet
rotations are planned separately from linear positions; our
approach overcomes this drawback as well.
Once footsteps are deﬁned, a receding horizon control
(RHC) module generates kinematically feasible trajectories
for the robot center of mass and joint trajectories by using the
LIP model  and whole-body inverse kinematics. Hence,
we separate the generation of footsteps from the computation
of feasible CoM trajectories. The implemented RHC module
runs at a frequency of 100 Hz.
The CoM and feet trajectories generated by the RHC
module are then streamed as desired values to either a
joint-position control loop, or to a whole-body quadratic-
programming (QP) controller running at 100 Hz. This latter
controller generates desired joint torques, ensuring a degree
of robustness and robot compliance. The desired joint torques
are then stabilised by a low-level joint torque controller
running at 1 kHz. Experimental validations of the proposed
approach are carried out on the iCub humanoid robot ,
with both position and joint torque control experiments.
In light of the above, this paper presents a torque-controlled,
on-line RHC architecture for bipedal robot locomotion.
II. BACKGROUND
In this paper we are going to use the following notation:
• The i-th component of a vector x is denoted as x_i.
• I is a fixed inertial frame with respect to (w.r.t.) which the robot's absolute pose is measured.
• e1 := (1, 0)^T and e2 := (0, 1)^T denote the canonical basis vectors of R^2.
• R2(θ) ∈ SO(2) is the rotation matrix of an angle θ ∈ R; S2 = R2(π/2) is the unitary skew-symmetric matrix.
• Given a function of time f(t), the dot notation denotes the time derivative, i.e. ḟ := df/dt. Higher order derivatives are denoted with a corresponding number of dots.
Fig. 1: Notation. The unicycle model is a planar model of
a robot having two wheels placed at a distance 2m, m ∈ R,
with a coinciding rotation axis. Hence, this mobile robot
cannot move sideways, i.e. along the wheel axis, but it can
turn by moving the wheels at different velocities. B is a frame
attached to the robot whose origin is located in the middle of
the wheel axis. Point F is attached to the robot. Its position
expressed in B is given by d ∈ R^2, a constant vector.
• 1_n ∈ R^{n×n} denotes the identity matrix of dimension n.
• S(·) is the skew-symmetric operator associated with the cross product in R^3.
• The vee operator, denoted by (·)∨, is the inverse of S(·).
• The Euclidean norm of a vector of coordinates v ∈ R^n is denoted by ‖v‖.
• ^A R_B ∈ SO(3) and ^A T_B ∈ SE(3) denote the rotation and transformation matrices which transform a vector expressed in the B frame into a vector expressed in the A frame.
B. The unicycle model
The unicycle model, represented in Fig. 1, is described by
the following model equations :

ẋ = R2(θ) e1 v,   θ̇ = ω, (1)
with v ∈ R and ω ∈ R the robot's rolling and rotational
velocity, respectively. x ∈ R^2 is the unicycle position in the
inertial frame I, while θ ∈ R represents the angle around the
z-axis of I which aligns the inertial reference frame with a
unicycle fixed frame. The variables v and ω are considered
as kinematic control inputs.
A reasonable control objective for this kind of model is
to asymptotically stabilize the point F about a desired point
F* whose position is defined as x*_F. For this purpose, define
the error x̃ as

x̃ := xF − x*_F, (2)
so that the control objective is equivalent to the asymptotic
stabilization of x̃ to zero. Since

xF = x + R2(θ) d,

then by differentiation it yields

ẋF = ẋ + ω R2(θ) S2 d. (3)

Eq. (3) describes the output dynamics. Substituting Eq. (1)
into Eq. (3), we can rewrite the output dynamics as:

ẋF = B(θ) u,   B(θ) := R2(θ) [e1  S2 d], (4)
where u = [v, ω]^T is the vector of control inputs. It is easy to
show that det B(θ) = d1, which means that, when the control
point F is not located on the wheels' axis, its stabilization
to an arbitrary reference position F* can be achieved by the
use of simple feedback laws. For example, if we define

u = B(θ)^{-1} (ẋ*_F − K x̃), (5)

with K a positive definite matrix, then we have

˙x̃ = −K x̃. (6)
Thus, the origin of the error dynamics is an asymptotically
stable equilibrium. Notice that this control law is not defined
when d1 = 0.
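To make the feedback law concrete, the following is a minimal numerical sketch (not the paper's implementation): it simulates the unicycle under the law of Eq. (5) with a constant reference and arbitrarily chosen gain K and control point d, and checks that F converges to the target.

```python
import numpy as np

def R2(theta):
    """2D rotation matrix."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

S2 = R2(np.pi / 2)          # unitary skew-symmetric matrix
d  = np.array([0.2, 0.0])   # control point F in the body frame (d1 != 0)
K  = 2.0 * np.eye(2)        # positive definite gain (illustrative)

def simulate(x_star, x0, theta0, dt=0.01, T=5.0):
    """Stabilize point F about x_star with u = B(theta)^-1 (-K x_tilde)."""
    x, theta = x0.copy(), theta0
    for _ in range(int(T / dt)):
        xF = x + R2(theta) @ d
        x_tilde = xF - x_star
        B = R2(theta) @ np.column_stack((np.array([1.0, 0.0]), S2 @ d))
        v, omega = np.linalg.solve(B, -K @ x_tilde)   # feedback law, constant reference
        x = x + dt * v * (R2(theta) @ np.array([1.0, 0.0]))  # unicycle kinematics
        theta = theta + dt * omega
    return x + R2(theta) @ d

xF_final = simulate(np.array([1.0, 0.5]), np.zeros(2), 0.0)
```

Since det B(θ) = d1 = 0.2 ≠ 0, the law is well defined and the error dynamics contracts exponentially.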
C. Zero Moment Point Preview Control
The MPC controller implemented in this work has been
derived from the Zero Moment Point (ZMP) preview control
described in . This algorithm adopts a simpliﬁed model,
viz. the table-cart model, based on the Linear Inverted
Pendulum (LIP) approximate model. The humanoid model is
assimilated to an inverted pendulum that is linearised around
the vertical position. The ZMP can be related to the CoM
position and acceleration by the following equation:

xZMP = P2D ( xCoM − (xCoM,z / g) ẍCoM ), (7)

where g is the gravitational constant, xZMP ∈ R^2 denotes the
Zero Moment Point (ZMP)  coordinate, while xCoM ∈ R^3
is the center of mass 3D coordinate. xCoM,z, i.e. the CoM
height, is assumed constant, and P2D is the matrix projecting
the CoM on the 2D plane, i.e. it considers only the x and y
coordinates of the center of mass.
Assuming to have a desired ZMP trajectory x*_ZMP,
we want to track this signal at every time instant. One
possibility is to consider the ZMP as the output of the
following dynamical system:

χ̇ = AZMP χ + BZMP u,   y = CZMP χ, (8)

where the new state variable χ and control u are defined as

χ := (P2D xCoM, P2D ẋCoM, P2D ẍCoM) ∈ R^6,   u := P2D x⃛CoM ∈ R^2. (9)

The system matrices are defined as in the following:

AZMP = [ 02 12 02 ; 02 02 12 ; 02 02 02 ],   BZMP = [ 04×2 ; 12 ],   CZMP = [ 12  02  −(xCoM,z / g) 12 ]. (10)
Defining a cost function J as

J = ∫ ( ‖y − x*_ZMP‖²_Q + ‖u‖²_R ) dt, (11)

where Q and R are two weight matrices of suitable dimension,
J penalizes the output tracking error plus a regularization
term on the control effort. In order to minimize J given Eq. (8),
we can implement a Linear Quadratic controller. This simple
controller allows tracking of a desired ZMP trajectory.
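The state-space construction above can be sketched as follows; the block layout of AZMP, BZMP and CZMP is our reconstruction of the standard table-cart system, with an illustrative CoM height:

```python
import numpy as np

g, z_com = 9.81, 0.53     # gravity, constant CoM height (illustrative value)
I2, O2 = np.eye(2), np.zeros((2, 2))

# State chi = (CoM pos, vel, acc) projected on the plane; input u = CoM jerk.
A = np.block([[O2, I2, O2], [O2, O2, I2], [O2, O2, O2]])
B = np.vstack((np.zeros((4, 2)), I2))
C = np.hstack((I2, O2, -(z_com / g) * I2))  # y = ZMP (table-cart model, Eq. (7))

# Forward-Euler simulation with a constant jerk, checking the ZMP output.
dt, chi, u = 0.001, np.zeros(6), np.array([0.1, 0.0])
for _ in range(1000):
    chi = chi + dt * (A @ chi + B @ u)
zmp = C @ chi
```

After one second of constant jerk, the CoM acceleration equals the accumulated jerk and the ZMP output reproduces Eq. (7) term by term.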
D. System Modeling
The full model of a humanoid robot is composed of n + 1
rigid bodies, called links, connected by n joints with one
degree of freedom each. We also assume that none of the
links has an a priori constant pose with respect to an inertial
frame, i.e. the system is free floating.
The robot conﬁguration space can then be characterized
by the position and the orientation of a frame attached to
a robot's link, called base frame B, and the joint
configuration. Thus, the robot configuration space is the group
Q = R^3 × SO(3) × R^n. An element q ∈ Q can be defined
as the following triplet: q = (IpB, IRB, s), where IpB ∈ R^3
denotes the position of the base frame with respect to the
inertial frame, IRB ∈ R^{3×3} is a rotation matrix representing
the orientation of the base frame, and s ∈ R^n is the joint
configuration.
The velocity of the multi-body system can be characterized
by the algebra V of Q, defined by V = R^3 × R^3 × R^n.
An element of V is then a triplet ν = (IṗB, IωB, ṡ) =
(vB, ṡ), where IωB is the angular velocity of the base frame
expressed w.r.t. the inertial frame, i.e. IṘB = S(IωB) IRB.
A more detailed description of the ﬂoating base model is
provided in .
We also assume that the robot is interacting with the
environment exchanging nc distinct wrenches1. The application
of the Euler-Poincaré formalism leads to the equations of
motion [30, Ch. 13.5]:

M(q) ν̇ + C(q, ν) ν + G(q) = Bτ + Σ_{k=1}^{nc} J_Ck^T f_k, (12)
where M ∈ R^{(n+6)×(n+6)} is the mass matrix, C ∈
R^{(n+6)×(n+6)} accounts for Coriolis and centrifugal effects,
G ∈ R^{n+6} is the gravity term, B = (0_{n×6}, 1_n)^T is a selector
matrix, τ∈Rnis a vector representing the internal actuation
torques, and fk∈R6denotes the k-th external wrench
applied by the environment on the robot. The Jacobian
JCk=JCk(q)is the map between the robot’s velocity ν
and the linear and angular velocity at the k-th contact link.
Lastly, it is assumed that a set of holonomic constraints
acts on System (12). These holonomic constraints are of the
form c(q) = 0, and may represent, for instance, a frame
having a constant pose w.r.t. the inertial frame. In the case
this frame corresponds to the location at which a rigid contact
occurs on a link, we represent the holonomic constraint as
J_Ck(q) ν = 0.
Hence, the holonomic constraints associated with all the
rigid contacts can be represented as

[ J_C1^T · · · J_Cnc^T ]^T ν = [ Jb  Jj ] [ vB^T  ṡ^T ]^T = 0, (13)

with Jb ∈ R^{6nc×6} and Jj ∈ R^{6nc×n}. The base frame velocity is
denoted by vB ∈ R^6.
1As an abuse of notation, we define as wrench a quantity that is not the
dual of a twist, but a 6D force/moment vector.
III. ARCHITECTURE
In this section we summarise the components constituting
the presented architecture, namely:
•the footstep planner,
•the RH controller,
•the stack-of-tasks balancing controller.
A. The Footstep Planner
Consider the unicycle model presented in Section II-B.
Assuming the reference trajectory for point F to be known up
to time T, we can integrate the closed-loop system described
in Section II-B to obtain the trajectory spanned by the unicycle.
The next step consists in discretizing the unicycle trajectories
at a fixed rate 1/dt. This passage allows us to search for the
best foot placement option in a smaller space, constituted
by the set of discretization points taken from the original
trajectory.
Given a discrete instant k ∈ N, we can define xk = x(k dt)
as the position of the unicycle at instant k, while θk = θ(k dt)
is its orientation along the z-axis. We can use the discretized
trajectories to reconstruct the desired positions of the left and
right foot, x^l_k and x^r_k, using the following relations:

x^l_k = xk + m R2(θk) e2,   x^r_k = xk − m R2(θk) e2, (14)
θ^l_k = θ^r_k = θk. (15)
A step contains two phases: double support and single
support. During double support, both robot feet are on the
ground and, depending on the foot, we can distinguish two
different states: switch-in if the foot is being loaded, switch-
out otherwise. In single support, instead, a foot is in a swing
state if it is moving, stance otherwise. The instant in which
a foot lands on the ground is called impact time, t^f_imp. After
this event, the foot will experience the following sequence
of states: switch-in → stance → switch-out → swing. This
sequence is terminated by another impact time. At the
beginning of the switch-out phase, the other foot has landed
on the ground with an impact time t̃^f_imp. The step duration
∆t is then defined as

∆t := t^f_imp − t̃^f_imp.
We define additional quantities relating the two feet when in
double support (ds):
•the orientation difference when in double support, ∆θ = θ^l_ds − θ^r_ds;
•the feet distance, ∆x = ‖x^l_ds − x^r_ds‖;
•the position of the left foot expressed in the frame
rigidly attached to the right foot, rPl.
These quantities will be used to determine the feet trajecto-
ries starting from the unicycle ones.
Given the discretized unicycle trajectories, a possible
policy consists in fixing the duration of a step, namely
∆t = const, or in fixing its length, i.e. setting ∆x to a
constant. Neither of these two strategies is desirable. In the
former case, the robot will eventually take very short
steps at maximum speed. In the latter, if the unicycle is
advancing slowly (because of slowly moving references), the
robot will always take steps of maximum length, sacrificing
the walking speed. In view of these considerations, we would
like the planner to modify step length and speed depending
on the reference trajectory. Since we would like to avoid
fixing any variable, it is necessary to define a cost function.
It is composed of two parts: the first weights the squared
inverse of ∆t, thus penalizing fast steps, while the second
penalizes the square of the feet distance ∆x, avoiding long
steps. Thus, the cost function φ can be written as

φ = kt (1/∆t)² + kx ∆x²,
where kt and kx are two positive numbers. Depending on
their ratio, the robot will take long-and-slow or short-and-fast
steps. Notice that φ is not defined when ∆t = 0; in any case,
the robot would not be able to take steps so fast. Thus, we
need to bound ∆t:

tmin ≤ ∆t ≤ tmax, (16)

where the upper bound avoids a step being too slow.
Regarding ∆x, it needs to be lower than an upper bound
dmax, bounding the swinging foot into a circular area of radius
dmax drawn around the stance foot. Another constraint
to be considered is the relative rotation of the two feet: in
particular, the absolute value of ∆θ must be lower than θmax.
Finally, in order to avoid the robot twisting its legs, we
simply resort to a bound on the y-component of the rPl vector,
which corresponds to the width of the step. By applying
a lower bound wmin on it, we prevent the left foot from being
on the right of the other foot.
Finally, we can write the constraints defined above,
together with φ, as an optimization problem:

min_{t^f_imp} φ (17a)
s.t. tmin ≤ ∆t ≤ tmax, (17b)
∆x ≤ dmax, (17c)
|∆θ| ≤ θmax, (17d)
e2^T rPl ≥ wmin. (17e)

The decision variables are the impact times. If we select
an instant k to be the impact time for a foot, then we can
obtain the corresponding foot position and orientation using
Eq. (14) and Eq. (15).
The solution of the optimization problem is obtained
through a simple iterative algorithm. The initialization can
be done by using the measured position of the feet. Starting
from this configuration, we can integrate the unicycle
trajectories, assuming the reference trajectories to be known.
Once we discretize them, we can iterate over k until we find
the set of points which satisfy the conditions defined above.
Among the remaining points, we can evaluate φ to find the
optimum.
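The iterative search can be sketched as follows. This is an illustrative, one-step stand-in for the planner: the feet-rotation constraint is reduced to a bound on the trajectory heading, the step-width constraint is omitted, and the gains are arbitrary, while the timing and distance bounds are borrowed from the experiment section.

```python
import numpy as np

def plan_next_step(xs, thetas, dt, x_stance, t_prev_imp,
                   k_t=1.0, k_x=5.0, t_min=1.3, t_max=5.0,
                   d_max=0.175, theta_max=np.deg2rad(25)):
    """Pick the discretized unicycle sample minimizing phi = k_t/dt^2 + k_x*||dx||^2
    among the samples satisfying the timing, distance and rotation constraints."""
    best_k, best_phi = None, np.inf
    for k in range(len(xs)):
        t_imp = k * dt
        delta_t = t_imp - t_prev_imp
        delta_x = xs[k] - x_stance
        if not (t_min <= delta_t <= t_max):
            continue                              # Eq. (17b)
        if np.linalg.norm(delta_x) > d_max:
            continue                              # Eq. (17c)
        if abs(thetas[k]) > theta_max:            # Eq. (17d), stance heading assumed zero
            continue
        phi = k_t / delta_t**2 + k_x * np.dot(delta_x, delta_x)
        if phi < best_phi:
            best_k, best_phi = k, phi
    return best_k, best_phi

# Toy trajectory: the unicycle advances at 0.05 m/s along x.
dt = 0.1
ts = np.arange(0, 6, dt)
xs = np.column_stack((0.05 * ts, np.zeros_like(ts)))
k_opt, phi_opt = plan_next_step(xs, np.zeros_like(ts), dt,
                                x_stance=np.zeros(2), t_prev_imp=0.0)
```

With these numbers the cost trades step duration against step length and selects an intermediate impact time, rather than the fastest or the longest admissible step.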
Once the footsteps are planned for the full time horizon, we
can proceed to interpolate the feet trajectories and other
relevant quantities for the walking motion, e.g. the Zero
Moment Point .
While walking, references may change, and it is not
desirable to wait for the end of the planned trajectories before
changing them. Thus, the idea is to “merge” two trajectories
instead of serializing them. When generating a new
trajectory, the robot is supposed to be standing on two
feet, and the first switch needs to last half of its normal
time. In view of these considerations, the middle instant of
the double support phase is a particularly suitable point at
which to merge two trajectories. The choice of this point eases
the merging process, since the feet are not supposed to move
at that instant; hence the initialization of these trajectories
does not need to know their initial desired velocity and/or
acceleration. Notice that there may be different merge points
along the trajectories, depending on the number of (full)
double support phases. Two trivial merge points are the very
first instant of the trajectories themselves and their final point.
This method can be assimilated to the Receding Horizon
Principle . In fact, we plan trajectories for a horizon T,
but only a portion of them will actually be used, namely the
first generated step. This simple strategy allows us to change
the unicycle reference trajectory on-line (through the use of a
joystick, for example), directly affecting the robot motion. In
addition, we can correct the new trajectories with the actual
position of the feet. This is particularly suitable when the foot
placement is not perfect, as in torque-controlled walking.
B. The Receding Horizon Controller
The receding horizon controller used in our architecture
inherits from the basic formulation described in Sec. II-C.
Nevertheless, differently from , we have added time-
varying constraints to bound the ZMP inside the convex hull
given by the supporting foot or feet. In this way we increase
the robustness of the controller, avoiding overshoots that may
cause the robot to tip around one of the foot edges. These
constraints are modeled as linear inequalities acting on the
state variables, i.e.

Ac(t) χ ≤ bc(t). (18)

Note that the constraint matrix and vector depend on
time. In particular, we can exploit the knowledge of the
desired feet positions to constrain the ZMP throughout the
whole horizon, while retaining linearity.
To summarize, the full optimal control problem can be
represented by the following minimisation problem:

min_u J (19)
s.t. χ̇ = Aχ + Bu, ∀t ∈ [t0, tf),
Ac(t) χ ≤ bc(t),
χ(0) = χ̄.
The problem in Eq. (19) is solved at each control sampling
time by means of a Direct Multiple Shooting method ,
. This choice has been driven by the necessity of
formulating the state constraints, Eq. (18), throughout the whole
horizon.
Applying the Receding Horizon Principle, only the first
output of the MPC controller is used. Since the control input
is the CoM jerk, we integrate it in order to obtain a desired
CoM velocity and position. The latter is also sent to the
inverse kinematics (IK) library [34, InverseKinematics sub-
library], together with the desired feet positions, to retrieve
the desired joint coordinates. Both the MPC and the IK
modules are implemented using Ipopt .
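The receding horizon principle can be illustrated with a single-axis, unconstrained sketch: at every cycle a batch least-squares problem over the horizon is solved, only the first jerk is applied, and the model is integrated forward. This is not the paper's multiple-shooting/Ipopt implementation, and the ZMP constraints of Eq. (18) are omitted; horizon and weights follow common preview-control choices.

```python
import numpy as np

g, z, dt, N, r = 9.81, 0.53, 0.02, 80, 1e-6   # 1.6 s horizon, cheap jerk penalty
Ad = np.array([[1.0, dt, dt**2 / 2], [0.0, 1.0, dt], [0.0, 0.0, 1.0]])
Bd = np.array([dt**3 / 6, dt**2 / 2, dt])     # exact triple-integrator discretization
Cz = np.array([1.0, 0.0, -z / g])             # ZMP output of the table-cart model

# Batch prediction over the horizon: zmp_pred = Pz @ x0 + Pu @ U (one axis shown).
powers = [np.linalg.matrix_power(Ad, i) for i in range(N + 1)]
Pz = np.vstack([Cz @ powers[i + 1] for i in range(N)])
Pu = np.zeros((N, N))
for i in range(N):
    for j in range(i + 1):
        Pu[i, j] = Cz @ powers[i - j] @ Bd

def rhc_step(x0, zmp_ref):
    """Minimize ||zmp_pred - zmp_ref||^2 + r ||U||^2 and return only the first jerk."""
    H = Pu.T @ Pu + r * np.eye(N)
    U = np.linalg.solve(H, Pu.T @ (zmp_ref - Pz @ x0))
    return U[0]

# Receding horizon closed loop: track a 5 cm ZMP reference step.
x, ref = np.zeros(3), 0.05 * np.ones(N)
for _ in range(300):                          # 6 s of closed loop
    x = Ad @ x + Bd * rhc_step(x, ref)
zmp = float(Cz @ x)
```

Only U[0] is ever applied; the remaining horizon is discarded and replanned at the next cycle, exactly as described above for the full architecture.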
C. The Stack of Tasks Balancing Controller
The balancing controller has as objective the stabilization
of the center of mass position and of the left and right feet
positions and orientations by defining joint torques. The
velocities associated with these tasks are stacked together
into Υ:

Υ = [ ṗG^T  vL^T  vR^T ]^T, (20)

where ṗG ∈ R^3 is the linear velocity of the center of mass
expressed in the inertial frame, and vL ∈ R^6 and vR ∈ R^6 are
the vectors of linear and angular velocities of the frames
attached to the left and right foot, respectively.
Letting JG, JL and JR denote respectively the Jacobians
of the center of mass position and of the left and right foot
configurations, JΥ can be defined as a stack of the Jacobians
associated to each task:

JΥ = [ JG^T  JL^T  JR^T ]^T. (21)

Furthermore, the task velocities Υ can be computed from
ν using Υ = JΥ ν. By differentiating this expression, the task
accelerations are obtained:

Υ̇ = JΥ ν̇ + J̇Υ ν. (22)
In view of (12) and (22), the task accelerations Υ̇ can be
formulated as a function of the control input u, composed of
joint torques τ and contact wrenches fk:

Υ̇(u) = JΥ M^{-1} ( Bτ + Σ_k J_Ck^T fk − Cν − G ) + J̇Υ ν. (23)

In our formulation, we want Υ̇(u) to track a specified task
acceleration Υ̇*:

Υ̇(u) = Υ̇*. (24)
The reference accelerations Υ̇* are computed using a simple
PD control strategy for what concerns the linear terms. In
parallel, the rotational terms are obtained by adopting a PD
controller on SO(3) (see [36, Section 5.11.6, p. 173]), thus
avoiding the use of a parametrization like Euler angles.
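A minimal sketch of a PD controller on SO(3), of the kind used for the rotational terms: the orientation error is extracted with the vee operator rather than through Euler angles. Gains and the example rotation are illustrative.

```python
import numpy as np

def vee(S):
    """Inverse of the skew-symmetric operator S(.)."""
    return np.array([S[2, 1], S[0, 2], S[1, 0]])

def so3_pd(R, R_des, omega, omega_des, kp=25.0, kd=10.0):
    """PD on SO(3): the orientation error is the vee of the skew-symmetric
    part of R_des^T R; returns a desired angular acceleration."""
    e_R = 0.5 * vee(R_des.T @ R - R.T @ R_des)
    e_w = omega - omega_des
    return -kp * e_R - kd * e_w

# Example: small rotation about z away from the desired upright orientation.
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
domega = so3_pd(R, np.eye(3), np.zeros(3), np.zeros(3))
```

The resulting acceleration acts only about z and opposes the rotation, with no singularities of the kind an Euler-angle parametrization would introduce.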
While the tasks defined above can be considered as high
priority tasks, we may conceive other possible objectives
to shape the robot motion. One example regards the torso
orientation, which we would like to keep in an upright
posture. Similarly to what has been done above, we have
T = JT ν, with JT the torso Jacobian. By differentiation, we
would obtain a result similar to Eq. (23). Since this task is
considered as low priority, we are interested in minimizing
the squared Euclidean distance of Ṫ(u) from a desired value
Ṫ*:

‖Ṫ(u) − Ṫ*‖². (25)

The reference acceleration Ṫ* is obtained through a PD
controller on SO(3).
Finally, we also added another lower priority postural task
to track joint configurations. Similarly as before, the joint
accelerations s̈(u), which depend upon the control inputs
through the dynamics equation Eq. (12), are kept close to
a reference s̈*:

‖s̈(u) − s̈*‖². (26)

The postural desired accelerations are defined through a
PD plus gravity compensation control law , .
Considering the contact wrenches fk as control inputs, we
need to ensure their actual feasibility given the contacts. They
are modeled as unilateral, with a limited friction coefficient.
An additional condition regards the Center of Pressure,
CoP, which needs to lie inside the contact patch. All these
conditions can be grouped in a set of inequality constraints
applied to the contact wrenches:

A fk ≤ b. (27)
Finally, we can group Eqs. (24)–(27) in the following QP:

min_{τ, fk} Γ (28a)
s.t. Υ̇(u) = Υ̇*, (28b)
A fk ≤ b, (28c)

where Γ = ½ ( wT ‖Ṫ(u) − Ṫ*‖² + ws ‖s̈(u) − s̈*‖² + wτ ‖τ‖² ).
With respect to Eq. (25) and (26), an additional term is
inserted, namely the squared 2-norm of the joint torques, which
serves as a regularization. Among several feasible solutions,
we are mostly interested in the one which requires the least
effort. By changing the weights w, we can tune the relative
importance of each term.
Let us focus on the feet contact wrenches fl and fr.
During the switch phases, we expect one of the two wrenches
to vanish in order to smoothly deactivate the corresponding
contact. In order to ease this process, we added a cost term in
Γ penalizing, with weight wf, the wrench of the foot being
unloaded; Fr (respectively Fl) is the normalized load that the
right (respectively left) foot is supposed to carry. It is 1 when
the corresponding foot is supposed to hold the full weight of
the robot, and 0 when unloaded. This information is provided
by the planner described in Sec. III-A. The QP problem of
Eq. (28) is solved at a rate of 100 Hz through qpOASES .
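The structure of the resulting QP can be sketched on a toy problem: the hard equality tasks (Eq. (24)) are enforced through the KKT system of the equality-constrained QP, while soft tasks and the torque regularizer enter the cost. The inequality constraints (27) and the qpOASES solver are omitted here, and the matrices are random stand-ins, not robot dynamics.

```python
import numpy as np

def solve_sot_qp(A_hi, b_hi, soft_tasks, n_u, w_tau=1e-3):
    """Weighted stack-of-tasks QP: hard equality tasks A_hi u = b_hi plus soft
    weighted tasks sum_i w_i ||A_i u - b_i||^2 and a torque-like regularizer,
    solved through the KKT system (inequalities omitted in this sketch)."""
    H = w_tau * np.eye(n_u)          # regularization term of Gamma
    f = np.zeros(n_u)
    for w, A, b in soft_tasks:       # accumulate w * ||A u - b||^2
        H += w * A.T @ A
        f += w * A.T @ b
    m = A_hi.shape[0]
    KKT = np.block([[H, A_hi.T], [A_hi, np.zeros((m, m))]])
    sol = np.linalg.solve(KKT, np.concatenate((f, b_hi)))
    return sol[:n_u]                 # u = (tau, f_k) stacked

# Toy problem: 4 inputs, one 2-row hard task, one 3-row soft task.
rng = np.random.default_rng(0)
A_hi, b_hi = rng.standard_normal((2, 4)), rng.standard_normal(2)
A_s, b_s = rng.standard_normal((3, 4)), rng.standard_normal(3)
u = solve_sot_qp(A_hi, b_hi, [(1.0, A_s, b_s)], n_u=4)
```

Among the inputs that satisfy the hard task exactly, the regularizer selects the least-effort solution, mirroring the role of the torque term in Γ.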
IV. EXPERIMENTS
The presented architecture is composed of three different
parts. In this section, we show three different experiments
whose aim is to test each part in an incremental way. All
the experiments have been performed on the iCub robot,
controlling 23 DoFs either in position or in torque control.
The complete experiment videos are available as multimedia
attachments to this paper.
Fig. 2: A first skeleton of the architecture, composed only of
the MPC (RH) controller connected to the inverse kinematics
(IK). Their output is directly sent to the robot.
(a) t = t0  (b) t = t0 + 1 s  (c) t = t0 + 2 s  (d) t = t0 + 3 s
Fig. 3: Snapshots extracted from the accompanying video.
The robot is walking straight in position control.
A. Test of the RH controller in position control
In the first experiment, we used the RH module to control
the robot joints in position control. As depicted in the
architecture of Figure 2, the control loop is closed on the
CoM position only, avoiding streaming noisy measurements,
such as joint velocities, into the controller. The desired feet
trajectories xfeet and the desired ZMP profile x^d_ZMP are
obtained by an offline planner described in .
Both the RHC and the IK run on the iCub head, which is
equipped with a 4th generation Intel® processor and 8 GB of
RAM. The whole architecture takes on average less than 8 ms
per control loop.
The robot takes steps 14 cm long in 1.25 s (of which
0.65 s in double support). Fig. 3 presents some snapshots of
the robot while walking.
B. Adding the unicycle planner
The controller has been improved by connecting the
unicycle planner, as shown in Fig. 4. This allows us to adapt the
robot walking direction in an online fashion. As an example,
by using a joystick we can drive a reference position x*_F
for the point F attached to the unicycle. Depending on how
much we press the joypad, this point moves further away
from the robot.
Using this planner, the step timings and dimensions are
not fixed a priori. In this particular experiment, a single step
could last between 1.3 s and 5.0 s, while the swing foot can
land within a radius of 0.175 m from the stance foot. Fig. 5
presents some snapshots of the robot while negotiating a turn.
Fig. 4: The scheme of Figure 2 has been upgraded with the
unicycle planner, which is responsible for providing online
references.
(a) t = t0  (b) t = t0 + 1 s  (c) t = t0 + 2 s  (d) t = t0 + 3 s
Fig. 5: Thanks to the unicycle planner, the robot walks while turning.
C. Complete Architecture
Finally, we plugged in also the task-based balancing
controller presented in Sec. III-C. The overall architecture is
depicted in Figure 6, highlighting that the stack-of-tasks
balancing controller is now in charge of sending commands
to the robot. In order to draw comparisons with the first
experiments, we followed again a straight trajectory. Even if
the task is similar, it is now necessary to use the Unicycle
Planner in order to cope with feet positioning errors (see Fig.
8). In fact, trajectories can be updated in order to take into
account possible feet misplacements. Fig. 7 shows the CoM
tracking capabilities of the presented balancing controller.
During this experiment, the minimum stepping time had to be
increased to 3 s (the maximum is still 5 s), while maintaining
the same maximum step length of the previous experiment.
Even if we had to slow down the walking speed a bit, the
results are still promising. While walking, the robot is much
smoother in its motion, reducing the probability of falling
in case of unforeseen disturbances.
V. CONCLUSIONS
This architecture is able to cope with variable walking
speed and to advance while turning, while keeping a certain
degree of compliance to better adapt to online changes
of the desired reference trajectories. Focusing on torque control,
experiments have shown some bottlenecks that, for the time
being, prevented achieving higher performance. These
issues are mainly related to the estimation of joint torques
Fig. 6: The complete architecture. The output of the IK is no
longer sent to the robot, but to the stack-of-tasks balancing
controller, together with the desired CoM and feet positions.
Joint positions and velocities are taken as feedback from the
robot.
Fig. 7: Center of Mass (CoM) tracking when taking some
steps in torque control. The quantities are expressed in an
inertial frame I.
(limiting the performance of the joint torque controller) and
to the floating base estimation. As future work, we plan
to increase the robustness of the architecture against these
issues.
Even if the results presented in this paper have to be
considered as preliminary, they highlight the flexibility
of the proposed architecture. The result on torque-controlled
walking is promising, since its inherent compliance
can increase the robustness of the walking motion against,
for example, un-modeled ground slope variations.
REFERENCES
 S. Feng, E. Whitman, X. Xinjilefu, and C. G. Atkeson, “Optimization-
based full body control for the DARPA Robotics Challenge,” Journal of
Field Robotics, vol. 32, no. 2, pp. 293–312, 2015.
 H. Dai, A. Valenzuela, and R. Tedrake, “Whole-body motion planning
with centroidal dynamics and full kinematics,” in 2014 IEEE-RAS
International Conference on Humanoid Robots, pp. 295–302.
 A. Herzog, N. Rotella, S. Schaal, and L. Righetti, “Trajectory gen-
eration for multi-contact momentum control,” in Humanoid Robots
(Humanoids), 2015 IEEE-RAS 15th International Conference on.
IEEE, 2015, pp. 874–880.
 J. Carpentier, S. Tonneau, M. Naveau, O. Stasse, and N. Mansard, “A
versatile and efﬁcient pattern generator for generalized legged locomo-
tion,” in Robotics and Automation (ICRA), 2016 IEEE International
Conference on. IEEE, 2016, pp. 3555–3561.
 R. Deits and R. Tedrake, “Footstep planning on uneven terrain
with mixed-integer convex optimization,” in Humanoid Robots (Hu-
manoids), 2014 14th IEEE-RAS International Conference on.
Fig. 8: Tracking of the left foot positions. Notice that after the step is taken, the desired values are updated according to the measured position.