PreprintPDF Available

Reactive Locomotion Decision-Making and Robust Motion Planning for Real-Time Perturbation Recovery

Preprints and early-stage research may not have been peer reviewed yet.

Abstract and Figures

In this paper, we examine the problem of push recovery for bipedal robot locomotion and present a reactive decision-making and robust planning framework for locomotion resilient to external perturbations. Rejecting perturbations is an essential capability of bipedal robots and has been widely studied in the locomotion literature. However, adversarial disturbances and aggressive turning can lead to negative lateral step width (i.e., crossed-leg scenarios) with unstable motions and self-collision risks. These motion planning problems are computationally difficult and have not been explored under a hierarchically integrated task and motion planning method. We explore a planning and decision-making framework that closely ties linear-temporal-logic-based reactive synthesis with trajec-tory optimization incorporating the robot's full-body dynamics, kinematics, and leg collision avoidance constraints. Between the high-level discrete symbolic decision-making and the low-level continuous motion planning, behavior trees serve as a reactive interface to handle perturbations occurring at any time of the locomotion process. Our experimental results show the efficacy of our method in generating resilient recovery behaviors in response to diverse perturbations from any direction with bounded magnitudes.
Content may be subject to copyright.
Reactive Locomotion Decision-Making and Robust Motion Planning
for Real-Time Perturbation Recovery
Zhaoyuan Gu, Nathan Boyd, and Ye Zhao
Abstract In this paper, we examine the problem of push
recovery for bipedal robot locomotion and present a reactive
decision-making and robust planning framework for locomotion
resilient to external perturbations. Rejecting perturbations is
an essential capability of bipedal robots and has been widely
studied in the locomotion literature. However, adversarial
disturbances and aggressive turning can lead to negative lateral
step width (i.e., crossed-leg scenarios) with unstable motions
and self-collision risks. These motion planning problems are
computationally difficult and have not been explored under a
hierarchically integrated task and motion planning method. We
explore a planning and decision-making framework that closely
ties linear-temporal-logic-based reactive synthesis with trajec-
tory optimization incorporating the robot’s full-body dynamics,
kinematics, and leg collision avoidance constraints. Between
the high-level discrete symbolic decision-making and the low-
level continuous motion planning, behavior trees serve as a
reactive interface to handle perturbations occurring at any time
of the locomotion process. Our experimental results show the
efficacy of our method in generating resilient recovery behaviors
in response to diverse perturbations from any direction with
bounded magnitudes.
As legged robots are increasingly deployed in complex
environments, the need for robots to accomplish tasks
through symbolic planning and decision-making becomes
more apparent. Although locomotion robustness has been
extensively explored at the motion planning level, resilience
to uncertainties and external disturbances at the task planning
level has been largely overlooked. Hierarchically integrated
task and motion planning (TAMP) is capable of handling
logical and whole-body dynamics objectives simultaneously.
Unexpected errors or even failures at the lower-level can lead
to expensive re-planning at the higher task planning level.
On the other hand, high-level discrete task plans can result
in infeasible low-level motion plans. With these cascading
effects, novel TAMP methods are imperative to make robust
locomotion decisions resilient to environmental perturbations
and enable robots to efficiently recompute plans at both task
and motion planning levels.
At the motion planning level, push recovery of bipedal
locomotion has been extensively studied in previous works
and inspired by human locomotion biomechanics [1], [2].
Various strategies such as hip, ankle, and foot placement
The authors are with the Laboratory for Intelligent Decision
and Autonomous Robots, Woodruff School of Mechanical
Engineering, Georgia Institute of Technology. {zgu78, nboyd31,
This work was funded by the NSF grant # IIS-1924978 and Georgia Tech
Institute for Robotics and Intelligent Machines Seed Grant.
Fig. 1: a) Human is forced to cross legs to recover from an external
disturbance. b) Human must execute leg crossing to traverse stepping stones.
c) An illustration of recovery motion of bipedal robot Cassie.
strategies are proposed to handle external perturbations [3]–
[5]. However, many of these push recovery strategies employ
reduced-order models (RoMs) such as inverted pendulum or
centroidal momentum models, making it difficult to guaran-
tee leg self-collision avoidance. This challenge arises from
solving full-leg kinematic constraints in these RoMs. It is
a strong assumption to state that a robot will never be in
close contact with itself in highly dynamic locomotion. Liu
et al. [6] demonstrated a complete control framework that
considers self-collision under various disturbances, but does
not consider more complicated multi-step or non-periodic
recoveries. Reactive approaches for high dimensional robots
have also been explored [7]–[9], which rely on a distance
metric to generate safe repulsive motions, but can lead
to significant motion plan discrepancies. Behavior libraries
have also been used to generate robust real-time walking
in unstructured or constrained environments [10], [11]. In
addition, few motion planning strategies incorporate higher-
level task planning.
In high-level task planning, reactivity is critical to account
for environmental changes at runtime. Temporal-logic-based
reactive synthesis [12]–[14] has been widely explored to
find strategies that generate formally-guaranteed safe and
provably correct robot actions in response to environmental
events. However, this method has been under-explored for
dynamic locomotion problems until recent years. Recent
works [15]–[17] adopted linear temporal logic (LTL) to
synthesize reactive locomotion navigation plans over rough
terrains. However, the feasibility of executing synthesized
task plans on high degree-of-freedom legged robots is un-
explored. To address this challenge, this study leverages
trajectory optimization (TO) as the low-level motion planner
to verify the synthesized task feasibility.
Behavior Trees (BTs), as graphical mathematical models,
arXiv:2110.03037v1 [cs.RO] 6 Oct 2021
Trajectory Optimization
LTL Reactive Synthesis
External Disturbance
Behavior Tree
Locomotion Subtree
Constrained Whole-Body
Motion Generator
Two-Player Game
(c) (d)
Fig. 2: Block diagram of the proposed framework. a) Experiments of Cassie disturbed during stable walking; b) The high-level task planner synthesis,
employing an LTL two-player game; c) The BTs act as a middle layer that reactively execute subtrees based on real-time environmental disturbances; d) A
whole-body motion planner is used to generate feasible motions and refine LTL specifications ψ. The high-level task planner and the phase-space planner
are integrated in an online fashion as shown by the solid black arrows.
have been widely explored to schedule autonomous tasks
and handle unexpected environmental changes [18], [19].
Their reactive and modular structure can authorize multiple
behavioral plans and achieve fault-tolerant task executions
[20], [21]. [22] devised finite state machine (FSM) con-
trollers for unexpected terrain height variation, but relied
on large handmade state machines. Intuitively speaking,
BTs can be viewed as a feature-rich, acyclic version of
FSM for complex behavior execution. Recent LTL-based
reactive synthesis work [15] proposed reactive TAMP in
combination with robust reachability analysis for dynamic
maneuvers and disturbance rejection, but only accounts for
perturbations applied at specific instances. Formal methods
handling perturbation at any locomotion phase require fur-
ther investigation. The BTs naturally handle the continuous
environmental perturbations by designing actions online to
amend the synthesized discrete automaton.
This study addresses the push recovery problem for legged
robots subject to external perturbations that can happen any-
time. We propose a combined TAMP framework composed
of hierarchical planning layers operating at different temporal
and spatial scales (Fig. 2). First, the LTL planning designs
safety-guaranteed decisions on keyframe states, including
center of mass (CoM) state or foot placements, in response
to the keyframe perturbations. When perturbations occur at
non-keyframe instants, analytical Riemannian manifolds are
used to recalculate a new keyframe transition online for the
current walking step. BTs are integrated to allow updated
keyframes to be any continuous value within the allowable
range, instead of a finite set of discrete values quantified
in the LTL-based planner. Finally, full-body legged motions
are generated using kinodynamic-aware TO for non-periodic
multi-step locomotion with self-collision constraints.
The core contributions of this paper are summarized as:
We present a hierarchically integrated LTL-BT TAMP
framework for dynamic locomotion that reacts to envi-
ronmental changes or new temporal logic specifications
Our LTL-BT planner is capable of reacting to con-
tinuous environmental perturbations for resilient task
We employ Riemannian manifolds to quantify loco-
motion keyframe robustness margins and design robust
transitions enabled by the reactive task planner.
We propose a collision-aware, kinodynamic TO that
generates collision-free and non-periodic full-body mo-
tions and use this TO to refine feasibility specifications
in reactive synthesis.
This section details the symbolic decision-making and mo-
tion planning framework (Fig. 2). Our hierarchical reactive
framework is composed of (i) LTL-level reactive synthesis
handling perturbations at keyframe instants, (ii) BT for robust
execution of one walking step (OWS) between keyframe
instances, (iii) full-body motion primitive generation from
kinodynamic-aware TO.
A. Keyframe-based Non-periodic Locomotion
To define a multi-step walking motion for bipedal robot
walking, we separate the entire trajectory into multiple OWS
phases that start and end at a keyframe state [23]. The
ith OWS cycle can be represented by a discrete keyframe
transition pair (ki, ki+1 ). The keyframe contains the sagittal
and lateral CoM apex state, as well as the stance foot index
(Sec. II-B). During the keyframe transition of OWS, the CoM
trajectory obeys the locomotion dynamics. In the sagittal
plane, forward and backward numerical integration is used
to solve the contact switching time of a OWS (t1and t2
for the first-half and second-half OWS phases, respectively).
The next sagittal keyframe will determine the next lateral
keyframe state. Namely, given two consecutive keyframe
states in the sagittal plane, the next lateral keyframe can be
calculated by meeting the t1and t2timing constraint, due
to the simultaneous contact switch in both directions. Then
the lateral keyframe transition is determined as well as the
lateral CoM state (pswitch,˙pswitch)at contact switch instant.
The next lateral foot placement can be computed with the
analytical solution:
pfoot =pswitch +(e2ωasymt21) ˙pswitch
(e2ωasymt2+ 1)ωasym
where the asymptote slope ωasym =pg/hapex and hap ex is
the relative apex CoM height with respect to the stance foot
height. gis the gravity constant. Detailed derivations can be
referred in the work of [23].
y [m]
left foothold crossed right foothold
cotangent Riemannian-
space cells
y [m/s]
Fig. 3: An illustration of a phase-space Riemannian partition and non-
deterministic lateral keyframe transition for disturbance recovery.
Compared to periodic walking, where the robot repeats the
same motion pattern periodically, keyframe-based walking
allows for non-periodic walking that better accommodates
rough terrain and environment disturbances. A keyframe
decision-maker is thus necessary to design safe and feasible
keyframe transitions at runtime.
B. LTL Specifications for Push Recovery
As the complexity of locomotion tasks increases, making
safe decisions on keyframe states to recovery from push
becomes intricate. To address this challenge, we employ
reactive synthesis, which is built upon task specifications and
an abstraction of dynamical systems [13], [24]. The tasks are
represented by LTL specifications, which describe temporal
and logical relations of the system properties. The abstraction
(i.e., transition system) is a discrete description of the system
and environment dynamics. An LTL formula operates over
atomic propositions (APs) that can be True (ϕ∨ ¬ϕ) or
False (¬True). The formulas use logical symbols of negation
(¬), disjunction (), and conjunction (). Temporal operators
such as “next” (), “eventually” (), and “always” () are
used as extensions to the propositional logic. Detailed LTL
semantics are omitted due to space limit and can be found
in [25].
To formally guarantee locomotion task completion under
environmental disturbances, we adopt the General Reactivity
of Rank 1 (GR(1)) [26], a fragment of LTL. GR(1) provides
correct-by-construction guarantees of the realizability of LTL
specifications. Provided a transition system TSE and LTL
specification ψ, the reactive synthesis problem aims for a
winning strategy for the robot system such that the execution
path satisfies ψ[17]. If the specification is realizable, an
automaton will be constructed and provide correct transitions
for any environmental actions obeying the assumptions.
Definition 2.1 (Riemannian partition): The transition sys-
tem discretizes the continuous robot state space (i.e., robot’s
CoM phase-space near the apex state) into Riemannian
partitions defined as:
R:=Rposition ∪ Rvelocity
={rp,n, rp,z , rp,p}∪{rv,z , rv,s, rv,m , rv,f }
where the elements in Rposition define the relative position
(negative, zero, positive) of CoM with respect to the stance
foot frame, and Rvelocity defines the CoM apex velocity
(zero, slow, medium, fast). Riemannian partitions are defined
for both sagittal and lateral phase-space, each constitutes 12
Fig. 3 shows a disturbed keyframe state (rp,n, rv,m),
which represents a negative position and medium velocity. A
keyframe state whose CoM velocity is zero in sagittal axis
is noted as (rp)s= (rp,z)s. The Riemannian partitions use
the analytical manifolds of CoM dynamics derived from the
Prismatic Inverted Pendulum Model (PIPM) and align the
robustness margin sets with the locomotion dynamics. More
details will be introduced in Sec. II-E.
Definition 2.2 (Locomotion keyframe): A keyframe Kis
defined as a system apex state composed of the sagittal
partition Rs, the lateral partition Rl, as well as the stance
foot index set Fst ={left,right}(used to identify the leg
crossing or wider lateral step strategies).
K:=Rs∪ Rl∪ Fst.
The system takes actions asys ∈ Asys ⊆ Rs∪ Rl
L ∪ W to decide the next keyframe state kn.L=
{small,medium,large}and W={small,medium,large}
represent the step length and width. l∈ L and w∈ W
are the distances between the current and the next foot
placements projected on sagittal and lateral axis. Note that
Rs,Rlare the relative CoM apex state in foot frame while
L,Ware the global distances between footholds. Together,
they completely define a transition action.
The environment state is represented by a perturbation set
penv ∈ Penv :=Rs∪ Rl∪ {∅} that pushes the system to
a specific Riemannian cell center. In the task planner, we
assume that the environment action is a perturbation only
applied at a keyframe instant. The perturbation induces a
CoM position and velocity jump after applying an external
force to the robot’s pelvis frame. The environment can also
choose to not perturb, i.e., penv =. The system action Asys
and environment action Penv together decides the next apex
keyframe state kn=TSE(kc, asys, penv ). Both actions are a
part of the automaton state S.
Definition 2.3 (Steady state keyframe): A special set of
keyframes are defined as steady state keyframes kss ∈ Kss
during perturbation-free walking.
Kss ={kss|kss =(rp,z , rv,·)s,(rp,·, rv,z )l, fst}
where (rp,z, rv,·)smeans that the sagittal CoM apex position
is on top of the nominal foot placement and can take any
allowable sagittal velocities, while (rp,·, rv,z)lmeans that the
lateral CoM apex position can take any values and the apex
velocity has to be zero rv,z (see Fig. 3).
Let the system start from a steady state kss
sys =
((rp,z, rv,m)s,(rp,z, rv,z )l,right). We have
sinit = (kinit, ainit
sys , pinit
=kss,((rp,z , rv,m)s,(rp,z , rv,z)l),
The robot chooses to maintain stable walking so long
as there is no perturbation from the environment. In the
presence of perturbations, the keyframe state returns to a
steady state within two steps:
k=¬kss (k=kss)(  k=kss )
The feasibility of transitions must be verified by the low-
level full-body TO (Sec. II-F). Certain high-level transitions
should be removed due to infeasible full-body kinematics
and dynamics constraints. In this way, we define a set
of TO-refined task specifications. For example, after the
TO refinement, we obtain all full-body-dynamics-feasible
transitions offline and encode TO-refined specifications. An
example of TO-refined specification can be:
k= ((rp,z, rv,m)s,(rp,z, rv,m)l,right)
a= ((rp,z, rv,m)s,(rp,z, rv,s)l,small,small)
a= ((rp,z, rv,m)s,(rp,z, rv,m)l,small,medium)
In the presence of perturbations, recovering to a steady
state kss requires the next keyframe knto decrease the lateral
apex velocity and minimizes the sagittal apex deviation from
its normal value. For example, a current medium apex
velocity indicates the next apex velocity is either medium,
small or zero:rv=rv,m (rv=rv,m rv,s rv ,z ).
A smaller step width w∈ W will be chosen rather than larger
ones. (w=small w=large)(w=small).
For the recovery motion execution not to be interrupted,
we assume the environment perturbation happens at most
once per two steps: penv =¬∅ ⇒ (penv =)
C. Task Planner Synthesis
Given the LTL specifications above, the task planner
models the robot system and the environment interplay as a
two-player game. We construct the keyframe transition game
structure in the form of a tuple G:= (S, sinit,TSE )with:
S=K × Asys × Penv is the possible automaton state
of the transition system,
sinit = (kinit, ainit
sys , pinit
env)is the initial automaton state
TSE S × S is a transition describing the possible
moves of the robot system and antagonist environment.
In the extreme case where the disturbance is towards the
stance leg (see Fig. 1), the foot placement of the swing leg
would naturally move closer to the stance foothold location
or require a crossed-leg motion. A minimum of two steps
is required to recover, during which self-collision poses a
challenge for making safe decisions. The TO-refined tran-
sition specifications guarantee that the task planner makes
dynamics-informed decisions on keyframe transitions. By
construction, the TO indicates that all the constraints on the
full-body motion are fulfilled and a keyframe transition is
At each keyframe instant, the decision-maker uses the
estimated current system keyframe state kcand plans a
sequence of transitions until the final state goes back to the
steady keyframe state, i.e., kf=kss. The action roll out
produces an action plan P={kc, . . . , kf}.
D. Behavior-Tree-Based Dynamic Replanning
To address continuous perturbations at non-keyframe
instants, we propose a perturbation-aware behavior tree
Looping Decorator
Fallback Node
Sequence Node
Action Node
Condition Node
Fig. 4: An illustration of the PABT structure. The PABT groups a set of
locomotion subtrees Ψi. Each subtree is a fallback tree that encodes a
keyframe transition (kc,i, kn,i )and a Riemannian recalculation action.
(PABT) that allows online local modification and new
keyframe transition assignment. In the case of moderate
perturbations, the PABT complements the reactive synthesis
by locally modifying the keyframe transitions, given the real-
time estimated CoM state (pCoM,˙pCoM). In the case of large
perturbations where the CoM state is perturbed to a different
Riemannian cell, the PABT recalculates a new keyframe
transition based on the original one (kc,d, kn,d)from the
high-level reactive synthesis.
The PABT groups a set of locomotion subtrees Ψ = S
Each Ψiencodes a pair of the current-to-next keyframe states
(kc,i, kn,i). These pairs are represented as condition nodes
in the locomotion subtrees (Fig. 4). The locomotion subtrees
are fallback BTs that execute their action nodes when the
desired keyframe transition from the high-level matches their
condition nodes. For instance, the pre-condition nodes check
if the desired transition kc,d matches with their keyframe
condition kc,i, the same for the post-condition nodes.
The PABT modifies its keyframe transitions locally to han-
dle non-keyframe perturbations. After the modification, the
desired keyframe transition remains feasible despite the CoM
state deviation. The action node Aican also be a keyframe
recalculation procedure. Here we use the recovery strategy
[23] to perform a Riemannian recalculation, which resolves
the motion plan when the CoM state is perturbed off from
the nominal manifold. Specifically, the PABT uses a position
guard strategy (Sec. II-E) to recalculate the keyframe state. It
is worth noting that the modified keyframe state may not be
a Riemannian cell center or even end up in a different cell,
which is suitable for continuous perturbation recovery. The
keyframe transition determined by the synthesized decision-
maker only starts from the Riemannian cell center. The local
BT modifications will handle the discrepancy between the
current (perturbed) CoM state and its Riemannian cell center
by recomputing an updated keyframe transition (see Fig. 5).
The PABT grows as the new action plan Pis commanded
from the task planner. The PABT constructs new subtrees
Ψcthat represent the transitions (kc, kn)from P. The new
subtrees are inserted under the root node as new behaviors.
A tick of the PABT will trigger the corresponding subtree
that matches the subtree conditions. The PABT expansion
and execution process is illustrated in Alg. 1.
E. Riemannian Robustness Margin Design
To quantify robustness margin, we use the Riemannian
distance metric proposed in [23] to measure the deviation
Algorithm 1: Keyframe Decision Making and PABT
1Input: PABT Ψ, Decision-maker DM , current time;
2Set: status = success;
3while status == success do
4kc, time = StateEstimation();
5if time == keyframe instant then
6P=DM (kc);
7for (kc, kn)in Pdo
8Ψc= LocomotionSubtree(kc, kn);
10 end
11 end
/*PABT Riemannian Recalculation */
12 status =Ψ.Tick();
13 (kc, kn)0=Ψ.GetModifiedTransition();
14 end
15 Output: updated PABT Ψ, modified keyframe
transition (kc, kn)0;
of CoM state from the nominal CoM manifolds in the CoM
phase-space. This Riemannian metric discretizes the phase-
space with tangent and cotangent locomotion manifolds,
instead of using na¨
ıve Euclidean-type discretization. The
tangent and cotangent manifolds comply with the PIPM
locomotion dynamics and provide an intuitive trajectory
recalculation strategy for CoM deviations. As shown in
Fig. 3, the CoM state near the stance foothold is partitioned
into 12 Riemannian cells in sagittal and lateral directions,
including top 9 cells with non-zero velocities and bottom
3 cells representing states with zero velocities. The top 9
cells are centered around the nominal sagittal apex CoM
state during walking. The size of each cell represents the
robust margin, which indicates the amount of perturbation
the system can handle without switching to a neighbouring
cell. Detailed derivations of Riemannian manifolds can be
found in the work of [23].
We use the position guard strategy to recalculate the
next CoM apex state. Assuming the CoM state jumps to
CoM)on a new tangent manifold σ0, the recalculated
next CoM apex state is:
(papex,˙papex )=(pfoot,
CoM ±q˙p04
CoM 4ω2
Note that the (papex,˙papex )corresponds to the next keyframe
knat the LTL level. The motion primitive set interpolates a
full-body motion that connects the current CoM state to the
updated next keyframe.
F. Collision-Aware Kinodynamic Trajectory Optimization
The task planner and PABTs generate keyframe transitions
robust to perturbations. However, mapping the transitions to
whole-body trajectories in real-time often poses a challenge
due to the curse of dimensionality. To address this, we
0 0.4 0.6 0.8 1 1.2 1.4 1.6 x(m)
Sagittal Keyframe Disturbance Response.
0 0.2 0.4 0.6 0.8 1
Stable Walking
Crossed-leg Recovery
Wider Step Recovery
Perturbation Jump
Left Footholds
Right Footholds
Riemannian Center
Keyframe Perturbation
Non-Keyframe Perturbation
Lateral Keyframe Disturbance Response.
Fig. 5: Lateral and sagittal responses to diagonal disturbances at keyframe
and non-keyframe instants while walking at 0.5m/s apex velocity. Each
color represents a single step generated by the LTL-BT.
use TO to create a set of motion primitives offline. The
motion primitives represent OWS trajectories that can be
sequentially composed for continuous walking including the
recovery cases. The TO generates desired motions that satisfy
the physical constraints while minimizing the trajectory
cost [27]–[29]. Our TO incorporates self-collision avoidance
and keyframe boundary constraints to ensure that the opti-
mal full-body motion is collision-free and obeys high-level
keyframe decisions. The TO is also used as a verification to
check the feasibility of high-level keyframe transitions. The
nonlinear program (NLP) of TO is formulated as:
arg min
j· Lj(xj
i, uj
s.t. Hj(xj
i) ˙xi+Vj(xj
i) + Gj(xj
i) = ui,(dynamics)
0= ∆j(xj
Nj),(reset map)
λc,z 0,|λc,xy| µλc,z,(friction)
i, uj
i) = 0 (keyframe boundary)
where the domains include D= 2 continuous single stance
phases and 1 velocity reset map. Each single stance contains
Njnodes, which represents a state-control pair nj
i= (xj
i, uj
at the ith instant; the state represents the x= [q; ˙q]with q
denoting the robot generalized coordinate states. The NLP
above solves the optimal state-control trajectory X={nj
by minimizing the pseudo energy L(·) = ||ui||2with weights
jwhile enforcing the physical constraints of the robot.
The physical constraints shape the resultant trajectory.
The dynamics constraint is enforced between node points
using Hermite-Simpson collocation. H,V, and Gdenote
the inertia, coriolis, and gravity matrices of the robot’s rigid
body dynamics. xj+1
0= ∆j(xj
Nj)maps the velocity jump
between two consecutive stance modes. The horizontal con-
tact forces λc,xy are bounded by a linearized friction cone.
The kinematics constraints Ckin
i)ensure that the joint
angles, foot positions, and CoM trajectories are bounded. M
geometric point pairs (gm
r)on two legs are selected as
self-collision constraints (see Fig. 2d). The signed distances
dmare evaluated at each geometric point pair using forward
kinematics F Kgm(xi)for all mM,iNj. The
Fig. 6: Maximum allowable velocity change exerted on the CoM for a single
step at 30increments. Values on the left half resulted in single wider step
recoveries and values on the right half require crossed-leg maneuvers.
minimally allowed distances are denoted as dm
dm(xi) = F Kgm
l(xi)F Kgm
j(xi) = ||dm
2− ||dm(xi)||2
The keyframe transition (kc, kn)commanded from the task
planner is enforced as a boundary condition for the foot
placement, the apex CoM position and velocity.
To demonstrate the robustness of the proposed methods,
we have tested various scenarios in simulation using Matlab
Simulink with a bipedal robot, Cassie. The kinematics and
dynamics functions were generated using the Fast Robot
Optimization and Simulation Toolkit (FROST) [28]. The
NLP solver IPOPT [30] solved the TO problems detailed
in Eq. 3. Our framework, together with a virtual constraint
controller [11], ran at a rate of 2kHz online. Impulse forces
were measured through near discontinuous changes in CoM
velocity. As for the task planner, the SLUGS reactive syn-
thesis toolbox [31] was used to design LTL specifications
with APs and synthesize the keyframe-based automaton.
For our crossed-leg experimentation, we used the 9 par-
titions with non-zero apex velocities for Rc
land Rn
respectively. For each (rc
s, rc
l, rn
s)pair, we used phase-space
planning to find the next allowable rn
l. This Riemannian
abstraction provided 9×9×9 = 729 possible crossed-
leg transitions prior to the full-body TO. The TO stopping
criteria of the constraints and variable bound violation were
set on the order of 103.4pairs of collision points were
selected. Three of them were at the middle of shin, tar-
sus, and toe links. The fourth point pair was located at
the edge of Cassie’s heel spring, as shown in Fig. 2(d).
We evaluated the feasible transitions and automatically
generated specifications into structured SLUGS files. These
specifications encoded the high-level decisions of feasible
keyframe transitions. For stable walking and wider step
recovery scenario, the lateral rs
land rn
lwere the bottom
three Riemannian partitions.
We evaluated the performance of our framework through
multiple push recovery studies. Pushing tasks are of partic-
Fig. 7: Success rate of recovery motion when disturbance happens anytime
during OWS at multiple directions. Three disturbances are used with a)
small 0.1 m/s b) medium 0.2 m/s and c) large 0.3 m/s disturbances
Fig. 8: Tracking controller results for Cassie executing a 0.4m/s laterally
disturbed leg crossing maneuver from a 0.5m/s stable forward walking.
ular interest since they demand fast reactive responses and
often result in large errors between the high-level decision-
maker and the low-level motion planner. In our test, we
applied the perturbations once every other walking steps. The
perturbation can happen at any phase of one walking step.
As shown in Fig. 5, the system was capable of composing
multiple OWS trajectories according to the reactive synthesis
plan. The robot was firstly disturbed to the non-apex velocity
(0.4,0.4) m/s at keyframe instant. The keyframe decision-
maker planned a two-step recovery strategy that took one
crossed-leg step and one succeeding wider step to come back
to a steady state kss. Disturbances at non-keyframe states
required the robot to recalculate a new CoM trajectory to
an updated keyframe state. The PABT locally modified the
desired keyframe transition and allowed the transitions to
start and terminate in non-Riemannian-cell-centers. The re-
active synthesis could still update the keyframe transitions as
long as the CoM state was inside the Riemannian robustness
bound (the grey areas in Fig. 5). This preserved the notion
of continuous recovery rather than that only from a finite set
of discrete keyframe transitions.
In Fig. 6, we compared the maximum impulse velocity
changes the system can recover from in 12 directions during
OWS. We measured the perturbation by the velocity change
instead of force impulse, because the CoM state is measur-
able, whereas the external forces are difficult to quantify in
general. The robot walked sagittally (positive x direction) at
0.5m/s apex velocity. After the perturbation, it was allowed
to recover using up to two steps. While multi-step recovery
strategy can be designed to allow the robot to handle larger
velocity impulses, our study adopted the two-step recovery
strategy as it is the minimum required step number for the
crossed-leg ability. When the push direction was lateral left
(positive y), the robot would take a wider step to come
back at kss; otherwise, when the push direction was lateral
right, the robot needs to adopt the crossed-leg maneuvers.
The perturbations are applied at 4 different phases, with
phase φ= 0% and 90% closer to keyframe states (boundary
phases), and φ= 30% and 60% closer to the contact switch
phase (50%). The result shows that the phases close to
keyframes were better at absorbing large left perturbations.
Closer to the contact switch phase, the right side push is
handled better due to the increased lateral velocity halfway
through the step.
We conducted an experiment to study the recovery suc-
cess rate with 100 trials in 4 directions (Fig. 8). Diagonal
disturbances were applied at 45from the front to the right.
For each trial, the robot took the same was disturbed with
3 instantaneous velocity jumps of 0.1,0.2,0.3m/s. The
perturbations for each trial were spaced evenly (φ= 1%)
for the entire phase duration. Failures primarily occurred at
the point of maximum velocity for the stance phase (right
stance: φ10% and φ90%, left stance: 40% φ
60%). Similar trends were seen in the maximum velocity
disturbances Fig. 6.
Finally, the tracking performance for the system was
evaluated for ±0.4m/s lateral disturbances while the left
leg was in stance, during a stable walking with 0.5m/s apex
velocity. The positive disturbance forced a two-step crossed-
leg recovery due to the stance feet (Fig. 8) and has a RMS
tracking error of 0.0084 m and 0.0593 m/s. For negative
lateral disturbances, the system stabilized within 1 wide step,
similar to methods like capture point [3] with a RMS tracking
error of 0.0039 m and 0.0363 m/s in lateral phase-space.
In this paper, we presented a locomotion framework
for reactive disturbance rejection at the symbolic decision-
making and continuous motion planning level. We combined
reactive synthesis with BTs to demonstrate safe, continuous,
disturbance rejection capabilities. At the low level, the TO
generates full-body locomotion trajectories and refines fea-
sible keyframe specifications in the reactive synthesis to fill
the gap between the high-level decisions making and the
low-level full-body motion planning.
For future work, we plan to explore multiple directions.
We will quantify environment disturbances in the real world
by evaluating results on the Computer Assisted Rehabilita-
tion ENvironment (CAREN) system [32] and outdoors. The
framework can also be generalized to handle more environ-
mental events for either navigation collision avoidance or
collaborative tasks with other robots and humans.
[1] Benjamin Stephens. Push recovery control for force-controlled hu-
manoid robots. Carnegie Mellon University, 2011.
[2] Pierre-brice Wieber. Trajectory free linear model predictive control for
stable walking in the presence of strong perturbations. In IEEE-RAS
International Conference on Humanoid Robots, pages 137–142, 2006.
[3] Jerry Pratt, John Carff, Sergey Drakunov, and Ambarish Goswami.
Capture point: A step toward humanoid push recovery. In IEEE-RAS
international conference on humanoid robots, pages 200–207. IEEE,
[4] Seung-Joon Yi, Byoung-Tak Zhang, Dennis Hong, and Daniel D
Lee. Online learning of a full body push recovery controller for
omnidirectional walking. In IEEE-RAS International Conference on
Humanoid Robots, pages 1–6. IEEE, 2011.
[5] Milad Shafiee, Giulio Romualdi, Stefano Dafarra, Francisco Javier An-
drade Chavez, and Daniele Pucci. Online dcm trajectory generation
for push recovery of torque-controlled humanoid robots, 2019.
[6] Chengju Liu, Jing Ning, Kang An, and Qijun Chen. Ac-
tive balance of humanoid movement based on dynamic task-
prior system. International Journal of Advanced Robotic Systems,
14(3):1729881417710793, 2017.
[7] Chengxu Zhou, Cheng Fang, Xin Wang, Zhibin Li, and Nikos
Tsagarakis. A generic optimization-based framework for reactive
collision avoidance in bipedal locomotion. In IEEE International
Conference on Automation Science and Engineering (CASE), pages
1026–1033, 2016.
[8] Arne-Christoph Hildebrandt, Robert Wittmann, Daniel Wahrmann,
Alexander Ewald, and Thomas Buschmann. Real-time 3d collision
avoidance for biped robots. In IEEE/RSJ International Conference on
Intelligent Robots and Systems, pages 4184–4190, 2014.
[9] Alexander Dietrich, Thomas Wimb¨
ock, Holger T¨
aubig, Alin Albu-
affer, and Gerd Hirzinger. Extensions to reactive self-collision
avoidance for torque and position controlled humanoids. In IEEE
International Conference on Robotics and Automation, pages 3455–
3462, 2011.
[10] Quan Nguyen, Xingye Da, J. Grizzle, and K. Sreenath. Dynamic
walking on stepping stones with gait library and control barrier
functions. In WAFR, 2016.
[11] Yukai Gong, Ross Hartley, Xingye Da, Ayonga Hereid, Omar Harib,
Jiunn-Kai Huang, and Jessy Grizzle. Feedback control of a cassie
bipedal robot: Walking, standing, and riding a segway, 2018.
[12] Hadas Kress-Gazit, Georgios E Fainekos, and George J Pappas.
Temporal-logic-based reactive mission and motion planning. IEEE
transactions on robotics, 25(6):1370–1381, 2009.
[13] Jun Liu, Necmiye Ozay, Ufuk Topcu, and Richard M Murray. Synthe-
sis of reactive switching protocols from temporal logic specifications.
IEEE Transactions on Automatic Control, 58(7):1771–1785, 2013.
[14] Keliang He, Andrew M. Wells, Lydia E. Kavraki, and Moshe Y.
Vardi. Efficient symbolic reactive synthesis for finite-horizon tasks. In
International Conference on Robotics and Automation (ICRA), pages
8993–8999, 2019.
[15] Ye Zhao, Yinan Li, Luis Sentis, Ufuk Topcu, and Jun Liu. Reactive
task and motion planning for robust whole-body dynamic locomotion
in constrained environments, 2018.
[16] Sutej Kulgod, Wentao Chen, Junda Huang, Ye Zhao, and Nikolay
Atanasov. Temporal logic guided locomotion planning and control in
cluttered environments. In 2020 American Control Conference (ACC),
pages 5425–5432, 2020.
[17] Jonas Warnke, Abdulaziz Shamsah, Yingke Li, and Ye Zhao. Towards
safe locomotion navigation in partially observable environments with
uneven terrain. In IEEE Conference on Decision and Control (CDC),
pages 958–965, 2020.
[18] Alejandro Marzinotto, Michele Colledanchise, Christian Smith, and
Petter ¨
Ogren. Towards a unified behavior trees framework for robot
control. In 2014 IEEE International Conference on Robotics and
Automation (ICRA), pages 5420–5427. IEEE, 2014.
[19] Shen Li, Daehyung Park, Yoonchang Sung, Julie A Shah, and Nicholas
Roy. Reactive task and motion planning under temporal logic speci-
fications. arXiv preprint arXiv:2103.14464, 2021.
[20] Michele Colledanchise and Petter ¨
Ogren. Behavior trees in robotics
and AI: An introduction. CRC Press, 2018.
[21] Matteo Iovino, Edvards Scukins, Jonathan Styrud, Petter ¨
Ogren, and
Christian Smith. A survey of behavior trees in robotics and ai. arXiv
preprint arXiv:2005.05842, 2020.
[22] Hae-Won Park, Alireza Ramezani, and J. W. Grizzle. A finite-
state machine for accommodating unexpected large ground-height
variations in bipedal robot walking. IEEE Transactions on Robotics,
29(2):331–345, 2013.
[23] Ye Zhao, Benito R Fernandez, and Luis Sentis. Robust optimal
planning and control of non-periodic bipedal locomotion with a
centroidal momentum model. The International Journal of Robotics
Research, 36(11):1211–1242, 2017.
[24] Hadas Kress-Gazit, Tichakorn Wongpiromsarn, and Ufuk Topcu. Cor-
rect, reactive, high-level robot control. IEEE Robotics & Automation
Magazine, 18(3):65–74, 2011.
[25] Christel Baier and Joost-Pieter Katoen. Principles of model checking.
MIT press, 2008.
[26] Nir Piterman, Amir Pnueli, and Yaniv Sa’ar. Synthesis of reactive(1)
designs. In Verification, Model Checking, and Abstract Interpretation,
pages 364–380. Springer, 2006.
[27] Anil Rao. A survey of numerical methods for optimal control.
Advances in the Astronautical Sciences, 135, 01 2010.
[28] Ayonga Hereid and Aaron D. Ames. Frost: Fast robot optimization
and simulation toolkit. In IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), pages 719–726, 2017.
[29] Mikhail Koptev, Nadia Figueroa, and Aude Billard. Real-time self-
collision avoidance in joint space for humanoid robots. IEEE Robotics
and Automation Letters, 6(2):1240–1247, 2021.
[30] Andreas W¨
achter and Lorenz Biegler. On the implementation of
an interior-point filter line-search algorithm for large-scale nonlinear
programming. Mathematical programming, 106:25–57, 03 2006.
[31] V. Raman R. Ehlers. Slugs: Extensible gr(1) synthesis, 2016.
[32] Brad Isaacson, Thomas Swanson, and Paul Pasquina. The use
of a computer-assisted research environment (caren) for enhancing
wounded warrior rehabilitation regimens. The journal of spinal cord
medicine, 36:296–299, 07 2013.
ResearchGate has not been able to resolve any citations for this publication.
Full-text available
Contact-based decision and planning methods are becoming increasingly important to endow higher levels of autonomy for legged robots. Formal synthesis methods derived from symbolic systems have great potential for reasoning about high-level locomotion decisions and achieving complex maneuvering behaviors with correctness guarantees. This study takes a first step toward formally devising an architecture composed of task planning and control of whole-body dynamic locomotion behaviors in constrained and dynamically changing environments. At the high level, we formulate a two-player temporal logic game between the multi-limb locomotion planner and its dynamic environment to synthesize a winning strategy that delivers symbolic locomotion actions. These locomotion actions satisfy the desired high-level task specifications expressed in a fragment of temporal logic. Those actions are sent to a robust finite transition system that synthesizes a locomotion controller that fulfills state reachability constraints. This controller is further executed via a low-level motion planner that generates feasible locomotion trajectories. We construct a set of dynamic locomotion models for legged robots to serve as a template library for handling diverse environmental events. We devise a replanning strategy that takes into consideration sudden environmental changes or large state disturbances to increase the robustness of the resulting locomotion behaviors. We formally prove the correctness of the layered locomotion framework guaranteeing a robust implementation by the motion planning layer. Simulations of reactive locomotion behaviors in diverse environments indicate that our framework has the potential to serve as a theoretical foundation for intelligent locomotion behaviors.
Full-text available
In this work, we propose a real-time self-collision avoidance approach for whole-body humanoid robot control. To achieve this, we learn the feasible regions of control in the humanoid's joint space as smooth self-collision boundary functions. Collision-free motions are generated online by treating the learned boundary functions as constraints in a Quadratic Program based Inverse Kinematic solver. As the geometrical complexity of a humanoid robot joint space grows with the number of degrees-of-freedom (DoF), learning computationally efficient and accurate boundary functions is challenging. We address this by partitioning the robot model into multiple lower-dimensional submodels. We present a thorough evaluation of several state-of-the-art machine learning techniques to learn such boundary functions. Our approach is validated on the 29-DoF iCub humanoid robot, demonstrating highly accurate real-time self-collision avoidance.
Conference Paper
Full-text available
This study proposes an integrated task and motion planning method for dynamic locomotion in partially observable environments with multi-level safety guarantees. This layered planning framework is composed of a high-level symbolic task planner and a low-level phase-space motion planner. A belief abstraction at the task planning level enables belief estimation of dynamic obstacle locations and guarantees navigation safety with collision avoidance. The high-level task planner, i.e., a two-level navigation planner, employs linear temporal logic for a reactive game synthesis between the robot and its environment while incorporating low-level safe keyframe policies into formal task specification design. The synthesized task planner commands a series of locomotion actions including walking step length, step height, and heading angle changes, to the underlying keyframe decision-maker, which further determines the robot center-of-mass apex velocity keyframe. The low-level phase-space planner uses a reduced-order locomotion model to generate non-periodic trajectories meeting balancing safety criteria for straight and steering walking. These criteria are characterized by constraints on locomotion keyframe states, and are used to define keyframe transition policies via viability kernels. Simulation results of a Cassie bipedal robot designed by Agility Robotics demonstrate locomotion maneuvering in a three-dimensional, partially observable environment consisting of dynamic obstacles and uneven terrain.
Conference Paper
Full-text available
We present planning and control techniques for non-periodic locomotion tasks specified by temporal logic in rough cluttered terrains. Our planning approach is based on a discrete set of motion primitives for the center of mass (CoM) of a general bipedal robot model. A deterministic shortest path problem is solved over the Büchi automaton of the temporal logic task specification, composed with the graph of CoM keyframe states generated by the motion primitives. A low-level controller based on quadratic programming is proposed to track the resulting CoM and foot trajectories. We demonstrate dynamically stable, non-periodic locomotion of a kneed compass gait bipedal robot satisfying complex task specifications.
Conference Paper
Full-text available
This paper presents FROST, an open-source MATLAB toolkit for modeling, trajectory optimization and simulation of hybrid dynamical systems with a particular focus in dynamic locomotion. The design objective of FROST is to provide a unified software environment for developing model-based control and motion planning algorithms for robotic systems whose dynamics is hybrid in nature. In particular, FROST uses directed graphs to describe the underlying discrete structure of hybrid system models, which renders it capable of representing a wide variety of robotic systems. Equipped with a custom symbolic math toolbox in MATLAB using Wolfram Mathematica, one can rapidly prototype the mathematical model of robot kinematics and dynamics and generate optimized code of symbolic expressions to boost the speed of optimization and simulation in FROST. In favor of agile and dynamic behaviors, we utilize virtual constraint based motion planning and feedback controllers for robotic systems to exploit the full-order dynamics of the model. Moreover, FROST provides a fast and tractable framework for planning optimal trajectories of hybrid dynamical systems using advanced direct collocation algorithms. FROST has been successfully used to synthesize dynamic walking in multiple bipedal robots. Case studies of such applications are considered in this paper, wherein different types of walking gaits are generated for two specific humanoid robots and validated in simulation.
Dynamical bipedal walking subject to precise footstep placements is crucial for navigating real world terrain with discrete footholds such as stepping stones, especially as the spacing between the stone locations significantly vary with each step. Here, we present a novel methodology that combines a gait library approach along with control barrier functions to enforce strict constraints on footstep placement. We numerically validate our proposed method on a planar dynamical walking model of MARLO, an underactuated bipedal robot. We show successful single-step transitions from a periodic walking gait with a step length of 10 (cm) to a stepping stone with a 100 (cm) separation (10x step length change), while simultaneously enforcing motor torque saturation and ground contact force constraints. The efficacy of our method is further demonstrated through dynamic walking over a randomly generated set of stepping stones requiring single-step step length changes in the range of [10:100] (cm) with a foot placement precision of 2 (cm).