Conference PaperPDF Available

STOMP: Stochastic trajectory optimization for motion planning


Abstract and Figures

We present a new approach to motion planning using a stochastic trajectory optimization framework. The approach relies on generating noisy trajectories to explore the space around an initial (possibly infeasible) trajectory, which are then combined to produced an updated trajectory with lower cost. A cost function based on a combination of obstacle and smoothness cost is optimized in each iteration. No gradient information is required for the particular optimization algorithm that we use and so general costs for which derivatives may not be available (e.g. costs corresponding to constraints and motor torques) can be included in the cost function. We demonstrate the approach both in simulation and on a mobile manipulation system for unconstrained and constrained tasks. We experimentally show that the stochastic nature of STOMP allows it to overcome local minima that gradient-based methods like CHOMP can get stuck in.
Content may be subject to copyright.
STOMP: Stochastic Trajectory Optimization for Motion Planning
Mrinal Kalakrishnan1Sachin Chitta2Evangelos Theodorou1Peter Pastor1Stefan Schaal1
Abstract We present a new approach to motion planning
using a stochastic trajectory optimization framework. The
approach relies on generating noisy trajectories to explore
the space around an initial (possibly infeasible) trajectory,
which are then combined to produced an updated trajectory
with lower cost. A cost function based on a combination of
obstacle and smoothness cost is optimized in each iteration. No
gradient information is required for the particular optimization
algorithm that we use and so general costs for which derivatives
may not be available (e.g. costs corresponding to constraints
and motor torques) can be included in the cost function. We
demonstrate the approach both in simulation and on a mobile
manipulation system for unconstrained and constrained tasks.
We experimentally show that the stochastic nature of STOMP
allows it to overcome local minima that gradient-based methods
like CHOMP can get stuck in.
Motion planning for manipulation and mobile systems has
received extensive interest in recent years. Motion planning
for avoiding collision has been the most common goal, but
there are other objectives like constraint handling, torque or
energy minimization and achieving smooth paths that may
be important in certain scenarios as well. Domestic and
retail scenarios, in particular, will have lots of cases where
constraint satisfaction may be a prime goal, e.g. carrying
a glass of water. Mobile manipulation systems may need
to minimize energy to conserve power and stay active for
a longer period of time. Non-smooth jerky trajectories can
cause actuator damage. There is definitely a need for a
motion planner that can address such scenarios.
In this paper, we present a new approach to motion
planning that can deal with general constraints. Our ap-
proach involves stochastic trajectory optimization using a
series of noisy trajectories. In each iteration, a series of
such trajectories is generated. The generated trajectories are
simulated to determine their costs which are then used to
update the candidate solution. Since no gradient information
is required in this process, general constraints and additional
non-smooth costs can be optimized.
We demonstrate our approach through both simulation
and experimental results with the PR2 mobile manipulation
robot. We consider end-effector orientation constraints, e.g.
constraints that require the end-effector to keep a grasped
object horizontal through the entire trajectory. We attempt
1Mrinal Kalakrishnan, Evangelos Theodorou, Peter
Pastor, and Stefan Schaal are with the CLMC Labora-
tory, University of Southern California, Los Angeles, USA
2Sachin Chitta is with Willow Garage Inc., Menlo Park, CA 94025,
(a) (b)
Fig. 1. (a) The Willow Garage PR2 robot manipulating objects in a
household environment. (b) Simulation of the PR2 robot avoiding a pole
in a torque-optimal fashion.
to minimize collision costs using a signed distance field
to measure proximity to obstacles. We further impose a
smoothness cost that results in trajectories that can be
directly executed on the robot without further smoothing.
We also demonstrate the optimization of motor torques used
to perform a movement.
Sampling-based motion planning algorithms have proved
extremely successful in addressing manipulation prob-
lems [1], [2], [3], [4], [5]. They have been used to address
task space constraints and to impose torque constraints while
lifting heavy objects [6]. While sampling based planners
result in feasible plans, they are often lacking in the quality
of the paths produced. A secondary shortcutting step is often
required to smooth and improve the plans output by these
planners [7].
Another class of motion planners that have been applied
to manipulation systems are optimization based planners.
Motion planning for cooperating mobile manipulators was
addressed in [8] where an optimal control approach was used
to formulate the motion planning problem with obstacles
and constraints. The problem was solved using a numerical
approach by discretizing and solving for the desired plan.
In [9], a covariant gradient descent technique was used for
motion planning for a 7 DOF manipulator. This algorithm
(called CHOMP) minimized a combination of smoothness
and obstacle costs and required gradients for both. It used
a signed distance field representation of the environment to
derive the gradients for obstacles in the environment. We
adopt a similar cost function for our approach, but in contrast
to CHOMP, our optimization approach can handle general
cost functions for which gradients are not available.
Traditionally, motion planning is defined as the problem of
finding a collision-free trajectory from the start configuration
to the goal configuration. We treat motion planning as an
optimization problem, to search for a smooth trajectory that
minimizes costs corresponding to collisions and constraints.
Specifically, we consider trajectories of a fixed duration T,
discretized into Nwaypoints, equally spaced in time. In
order to keep the notation simple, we first derive the algo-
rithm for a 1-dimensional trajectory; this naturally extends
later to multiple dimensions. This 1-dimensional discretized
trajectory is represented as a vector θRN. We assume
that the start and goal of the trajectory are given, and are
kept fixed during the optimization process.
We now present an algorithm that iteratively optimizes
this discretized trajectory, subject to arbitrary state-dependent
costs. While we keep the cost function general in this section,
Section IV discusses our formulation for obstacle, constraint,
energy, and smoothness costs. We start with the following
optimization problem:
θi) + 1
where ˜
θ=N(θ,Σ)is a noisy parameter vector with
mean θand variance Σ.q(˜
θi)is an arbitrary state-dependent
cost function, which can include obstacle costs, constraints
and torques. Ris a positive semi-definite matrix representing
control costs. We choose Rsuch that θTRθrepresents the
sum of squared accelerations along the trajectory. Let A
be a finite differencing matrix that when multiplied by the
position vector θ, produces accelerations ¨
1 0 0 0 0 0
210· · · 0 0 0
12 1 0 0 0
0 0 0 1 2 1
0 0 0 · · · 0 1 2
0 0 0 0 0 1
Thus, the selection of R=ATAensures that θTRθrepre-
sents the sum of squared accelerations along the trajectory.
Previous work [9] has demonstrated the optimization of
the non-stochastic version of Eqn. 1 using covariant func-
tional gradient descent techniques. In this work, we instead
optimize it using a derivative-free stochastic optimization
method. This allows us to optimize arbitrary costs q(˜
which derivatives are not available, or are non-differentiable
or non-smooth.
Taking the gradient of the expectation in Eqn. 1 with
respect to ˜
θ, we get:
θ E"N
θi) + 1
θ#!= 0 (5)
which leads to:
θ E"N
θi)#! (6)
Further analysis results in:
θ=R1E ˜
θi)#! (7)
The expression above can be written in the form E˜
δθGwhere ˆ
δθGis now the gradient estimate defined
as follows:
θG=E ˜
θi)#! (8)
Previous approaches [9] have used the analytical func-
tional gradient to derive an iterative gradient descent update
rule. While this may be efficient, it requires a cost function
that is smooth and differentiable. Moreover, even though this
has not been proposed [9], for a given cost function J(θ)the
positive definiteness condition of the hessian θθJ(θ)>0
is required to guarantee convergence. Our proposed gradient
estimation is motivated by the limitations of gradient based
optimization when it comes to non-differentiable or non-
smooth cost functions. Inspired by previous work in the
probability matching literature [10] as well as recent work
in the areas of path integral reinforcement learning [11], we
propose an estimated gradient formulated as follows:
Essentially, the equation above is the expectation of δθ
(the noise in the parameter vector ˜
θ) under the probability
metric P= exp 1
θ)where S(˜
θ)is the state de-
pendent cost defined on the trajectory and it is designed as
θ) = hPN
i=1 q(˜
θi)i. Thus the stochastic gradient is now
formulated as follows:
θG=Zexp 1
Even though our optimization procedure is static in the
sense that it does not require the execution of a trajectory
rollout, our gradient estimation process has strong connection
to how the gradient of the value function is computed in
the path integral stochastic optimal control framework [11].
More precisely the goal in the framework of stochastic
optimal control is to find optimal controls that minimize
a performance criterion. In the case of the path integral
stochastic optimal control formalism, these controls are
computed for every state xti as δˆ
δuare the sampled controls and p(x)corresponds to the
5 10 15 20 25 30 35 40 45
1.2 x 10−4
5 10 15 20 25 30 35 40 45
(a) (b)
Fig. 2. (a) Each curve depicts a column/row of the symmetric matrix R1.
(b) 20 random samples of , drawn from a zero mean normal distribution
with covariance Σ=R1.
Start and goal positions x0and xN
An initial 1-D discretized trajectory vector θ
An state-dependent cost function q(θi)
A=finite difference matrix (Eqn 2)
R1= (ATA)1
M=R1, with each column scaled such that the maximum
element is 1/N
Repeat until convergence of trajectory cost Q(θ):
1) Create Knoisy trajectories, ˜
θKwith parameters θ+k,
where k=N(0,R1)
2) For k= 1...K, compute:
a) S(˜
θk,i) = q(˜
b) P˜
3) For i= 1...(N1), compute: [˜
k=1 P(˜
4) Compute δθ=M˜
5) Update θθ+δθ
6) Compute trajectory cost Q(θ) = PN
i=1 q(θi) + 1
probability of every trajectory τistarting from xti and
ending in the terminal state xtN . This probability is defined
as p(x) = exp (S(τi)) with S(τi)being the cost of the
path τi= (xti, ..., xtN ). Thus, p(x)is inversely proportional
to the cost S(τi)and therefore paths with higher cost will
have a lower contribution to the optimal controls than paths
with lower cost. The process above is repeated for every state
xti until the terminal xtN (multistage optimization). Then
the controls are updated according to u=u+δˆ
uand new
trajectories are generated. In our case, we assume that each
state cost q(θi)is purely dependent only on the parameter
θi, and we do not blame future or past costs on the current
state. Hence, we simplify the problem by defining a local
trajectory cost S(θi) = q(θi), i.e., we remove cumulation of
costs. We find that this simplification significantly accelerates
convergence in the experiments presented in Sec. V.
The final update equations for STOMP are presented
in Table I. There are a few points which warrant further
A. Exploration
In order to keep control costs of noisy trajectories low, we
sample the noise from a zero mean normal distribution,
with Σ=R1as the covariance matrix, as shown in
Figure 2(b). This is preferable to sampling with Σ=I
for several reasons: (1) samples have a low control cost
TR, and thus allow exploration of the state space without
significantly impacting the trajectory control cost; (2) these
noisy trajectories may be executed without trouble on a
physical system; (3) the samples do not cause the trajectory
to diverge from the start or goal. Goal-convergent exploration
is highly desirable in trajectory-based reinforcement learning
of point to point movements, where dynamical systems have
been designed that satisfy this property [12].
B. Trajectory updates
After generating Knoisy trajectories, we compute their
costs per time-step S(θk, i)(Table I, Step 2(a)). In Step
2(b), we compute the probabilties P(θk, i)of each noisy
trajectory, per time-step. The parameter λregulates the sen-
sitivity of the exponentiated cost, and can be automatically
optimized per time-step to maximally discriminate between
the experienced costs. We compute the exponential term in
Step 2(b) as:
λS(θk,i)=ehS(θk,i )min S(θk,i)
max S(θk,i)min S(θk,i ), (11)
with hset to a constant, which we chose to be h= 10 in
all our evaluations. The max and min operators are over all
noisy trajectories k. The noisy update for each time-step is
then computed in Step 3 as the probability-weighted convex
combination of the noisy parameters for that time-step.
Finally, in Step 4, we smooth the noisy update using the
Mmatrix, before updating the trajectory parameters in Step
5. The Mmatrix is formed by scaling each column of R1
(shown in Figure 2(a)) such that the highest element in the
column has a magnitude 1/N. This scaling ensures that no
updated parameter exceeds the range that was explored in
the noisy trajectories. Multiplication with Mensures that
the updated trajectory remains smooth, since it is essentially
a projection onto the basis vectors of R1shown in Fig-
ure 2(a).
These trajectory updates can be considered safer than a
standard gradient descent update rule. The new trajectory is
essentially a convex combination of the noisy trajectories
which have already been evaluated, i.e. there are no un-
expected jumps to unexplored parts of the state space due
to a noisy gradient evaluation. Our iterative update rule is
analogous to an expectation-maximization (EM) algorithm,
in which we update the mean of our trajectory sampling
distribution to match the distribution of costs obtained from
sampling in the previous iteration. This procedure guarantees
that the average cost is non-increasing, if the sampling is
assumed to be dense [10], [13]. An additional advantage is
that no gradient step-size parameter is required; the only
open parameter in this algorithm is the magnitude of the
exploration noise.
In this section, we discuss the application of the stochastic
trajectory optimization algorithm in Table I to the problem
of motion planning of a high-dimensional robot manipulator.
We address the design of a cost function that allows planning
for obstacle avoidance, optimization of task constraints, and
minimization of joint torques.
A. Setup
The algorithm in Table I was derived for a one-
dimensional discretized trajectory. Scaling this to multiple
dimensions simply involves performing the sampling and
update steps for each dimension independently in each
iteration. The computational complexity of the algorithm thus
scales linearly with the dimensionality of the problem. For
the application to robot motion planning, we represent the
trajectory in joint space, with a fixed duration and discretiza-
tion. We assume that the start and goal configurations are
provided in joint space.
B. Cost Function
The cost function we use is comprised of obstacle costs
qo, constraint costs qc, and torque costs qt.
q(θ) =
qo(θt) + qc(θt) + qt(θt)(12)
The additional smoothness cost θTRθis already incorpo-
rated in the optimization problem in Equation 1.
1) Obstacle costs: We use an obstacle cost function
similar to that used in previous work on optimization-
based motion planning [9]. We start with a boolean voxel
representation of the environment, obtained either from a
laser scanner or from a triangle mesh model. Although our
algorithm can optimize such non-smooth boolean-valued cost
functions, faster convergence can be achieved by using a
function for which local gradients are available (or can be
estimated by sampling). We compute the signed Euclidean
Distance Transform (EDT) [14] of this voxel map. The
signed EDT d(x), computed throughout the voxel grid,
provides information about the distance to the boundary
of the closest obstacle, both inside and outside obstacles.
Values of the EDT are negative inside obstacles, zero at
the boundary, and positive outside obstacles. Thus, the EDT
provides discretized information about penetration depth,
contact and proximity.
We approximate the robot body Bas a set of overlapping
spheres b∈ B. We require all points in each sphere to be
a distance at least away from the closest obstacle. This
constraint can be simplified as the center of the sphere being
at least +raway from obstacles, where ris the sphere
radius. Thus, our obstacle cost function is as follows:
qo(θt) = X
max(+rbd(xb),0)k˙xbk, (13)
where rbis the radius of sphere b,xbis the 3-D workspace
position of sphere bat time tas computed from the kinematic
model of the robot. A straightforward addition of cost
magnitudes over time would allow the robot to move through
a high-cost area very quickly in an attempt to lower the cost.
(a) (b) (c)
Fig. 3. (a) Simulation setup used to evaluate STOMP as a robot arm
motion planner. (b) Initial straight-line trajectory between two shelves.
(c) Trajectory optimized by STOMP to avoid collision with the shelf,
constrained to maintain the upright orientation of the gripper.
Multiplication of the cost by the magnitude of the workspace
velocity of the sphere (k˙xbk) avoids this effect [9].
2) Constraint costs: We optimize constraints on the end-
effector position and/or orientation by adding the magnitude
of constraint violations to the cost function.
qc(θt) = X
|vc(θt)|, (14)
where Cis the set of all constraints, vcis a function that
computes the magnitude of constraint violation for constraint
3) Torque costs: Given a suitable dynamics model of
the robot, we can compute the feed-forward torque required
at each joint to track the desired trajectory using inverse
dynamics algorithms [15]. The motor torques at every instant
of time are a function of the joint states and their derivatives:
xt), (15)
where xt=θtrepresents the joint states at time t, and
xtare the joint velocities and accelerations respectively,
obtained by finite differentiation of θ.
Minimization of these torques can be achieved by adding
their magnitudes to the cost function:
qt(θt) =
|τt|dt (16)
C. Joint Limits
Joint limits can potentially be dealt with by adding a
term to the cost function that penalizes joint limit violations.
However, we prefer to eliminate them during the exploration
phase by clipping noisy trajectories θ+at the limits, i.e. an
L1projection on to the set of feasible values. Since the noisy
trajectory stays within the limits, the updated trajectory at
each iteration must also respect the limits, since it is a convex
combination of the noisy trajectories. Additionally, since the
convex combination of noise is smoothed through the M
matrix, the resulting updated trajectory smoothly touches the
joint limit as opposed to bumping into it at high speed.
STOMP is an algorithm that performs local optimization,
i.e. it finds a locally optimum trajectory rather than a global
one. Hence, performance will vary depending on the initial
Unconstrained Unconstrained Constrained
Number of 210 / 210 149 / 210 196 / 210
successful plans
Avg. planning time 0.88 ±0.40 0.71 ±0.25 1.86 ±1.25
to success (sec)
Avg. iterations 52.1 ±26.6 167.1 ±113.8 110.1 ±78.0
to success
trajectory used for optimization. STOMP cannot be expected
to solve typical motion planning benchmark problems like
the alpha puzzle [16] in a reasonable amount of time. In
this section, we evaluate the possibility of using STOMP
as a motion planner for a robot arm in typical day-to-day
tasks that might be expected of a manipulator robot. We
conduct experiments on a simulation of the Willow Garage
PR2 robot in a simulated world, followed by a demonstration
of performance on the real robot.
A. Simulation
We created a simulation world containing a shelf with 15
cabinets, as shown in Figure 3(a). Seven of these cabinets
were reachable by the 7 degree-of-freedom right arm of the
PR2. We tested planning of arm trajectories between all
pairs of the reachable cabinets in both directions, resulting
in 42 different planning problems. Since the algorithm is
stochastic in nature, we repeated these experiments 5 times.
In each case, the initial trajectory used was a straight-line
through configuration space of the robot. Each trajectory was
5 seconds long, discretized into 100 time-steps.
These planning problems were repeated in two scenarios
with two different cost functions. The unconstrained scenario
used a cost function which consisted of only obstacle costs
and smoothness costs. Success in this scenario implies the
generation of a collision-free trajectory. In the constrained
scenario, we added a constraint cost term to the cost function.
The task constraint was to keep the gripper upright at all
times, i.e. as though it were holding a glass of water.
Specifically, the roll and pitch of the gripper was constrained
to be within ±0.2 radians. Success in this scenario involves
generation of a collision-free trajectory that satisfies the task
constraint within the required tolerance. We also evaluated
the performance of CHOMP, a gradient-based optimizer [9]
on the unconstrained scenario using the same cost function.
The exploration noise magnitude for STOMP, and the gradi-
ent descent step size for CHOMP were both tuned to achieve
good performance without instability. Both algorithms were
capped at 500 iterations. For STOMP, K= 5 noisy trajectory
samples were generated at each iteration, and an additional
5 best samples from previous iterations used in the update
Table II shows the results obtained from these experi-
ments. STOMP produced a collision-free trajectory in all
(a) (b) (c)
Fig. 4. Planning problem used to evaluate torque minimization. (a) Plan
obtained without torque minimization: arm is stretched. (b,c) Two different
plans obtained with torque minimization. In (b), the entire arm is pulled
down, while in (c) the elbow is folded in: both solutions require lower
gravity compensation torques. Figure 5(b) shows the torques required for
these movements.
trials in the unconstrained scenario. In contrast, CHOMP
fails on nearly 30% of them, presumably due to the gradient
descent getting stuck in local minima of the cost function1.
The execution times are comparable, even though CHOMP
usually requires more iterations to achieve success. This is
because each iteration of STOMP requires multiple trajectory
cost evaluations, but can make larger steps in a more stable
fashion than the CHOMP gradient update.
In the constrained scenario, 93.3% of trials resulted in
plans that were both collision-free and satisfied the task
constraints. Figure 5(a) shows the iterative evolution of
trajectory costs for one of the constrained planning problems,
averaged over ten trials.
These results are obtained when initializing the algorithm
with a na¨
ıve straight-line trajectory through configuration
space, usually infeasible due to collisions with the envi-
ronment (see Fig. 3(b)). The algorithm is able to push the
trajectory out of collision with the environment, as well as
optimize secondary costs like constraints and torques. The
trajectory after optimization is ready for execution on a
robot, i.e., no smoothing is necessary as is commonly used
in conjunction with sampling-based planners [7].
In order to test the part of the cost function that deals with
minimization of torques, we ran 10 trials on the planning
problem shown in Figure 4, with and without the torque term
in the cost function. The resulting torques of the optimized
movements in both cases are shown in Figure 5(b). Since
the movements are rather slow, most of the torques are used
in gravity compensation. Hence the torques at the beginning
and end of the movement stay the same in both cases. The
planner finds parts of the state space towards the middle of
the movement which require lower torques to support the
weight of the arm. Interestingly, the planner usually finds
one of two solutions as shown in Figures 4(b) and 4(c): (1)
the entire arm is pulled down, or (2) the elbow is folded in on
itself, both of which require lower torque than the stretched
out configuration.
B. Real Robot
The attached video shows demonstrations of trajectories
planned using STOMP in a household environment, executed
1This result was obtained using the standard CHOMP gradient update
rule, not the Hamiltonian Monte Carlo variant. Please refer Sec. VI for
50 100 150 200
Iteration number
Trajectory cost
Trajectory cost
± 1 standard deviation
1 2 3 4
Time (sec)
Sum of abs. joint torques (Nm)
No torque opt.
Torque opt.
(a) (b)
Fig. 5. (a) Iterative evolution of trajectory costs for 10 trials of STOMP on
a constrained planning task. (b) Feed-forward torques used in the planning
problem shown in Figure 4, with and without torque optimization, averaged
over 10 trials.
on the Willow Garage PR2 robot.
C. Code, Replication of Results
All of the software written for this work has been pub-
lished under the BSD open source license, and makes use of
the Robot Operating System (ROS) [17]. Further instructions
on installing the software and replicating the results in this
paper can be found at [18].
A Hamiltonian Monte Carlo variant of CHOMP is dis-
cussed in [9], [19], as a principled way of introducing
stochasticity into the CHOMP gradient update rule. While
theoretically sound, we found this method difficult to work
with in practice. It introduces additional parameters which
need to be tuned, and requires multiple random restarts
to obtain a successful solution. In contrast, our algorithm
requires minimal parameter tuning, does not need cost func-
tion gradients, and uses a stable update rule which under
certain assumptions guarantees that the average cost is non-
We have presented an algorithm for planning smooth
trajectories for high-dimensional robotic manipulators in the
presence of obstacles. The planner uses a derivative-free
stochastic optimization method to iteratively optimize cost
functions that may be non-differentiable and non-smooth. We
have demonstrated the algorithm both in simulation and on
a mobile manipulator, for obstacle avoidance, optimization
of task constraints and minimization of motor torques used
to execute the trajectory.
A possibility for future work is to augment this local
trajectory optimizer with a trajectory library approach, which
can recall previous trajectories used in similar situations, and
use them as a starting point for futher optimization [20].
The STOMP algorithm could also be applied to problems
in trajectory-based reinforcement learning, where costs can
only be measured by execution on a real system; we intend
to explore these avenues in future work.
This research was conducted while Mrinal Kalakrish-
nan and Peter Pastor were interns at Willow Garage. This
research was additionally supported in part by National
Science Foundation grants ECS-0326095, IIS-0535282,
IIS-1017134, CNS-0619937, IIS-0917318, CBET-0922784,
EECS-0926052, CNS-0960061, the DARPA program on
Advanced Robotic Manipulation, the Army Research Office,
the Okawa Foundation, and the ATR Computational Neuro-
science Laboratories. Evangelos Theodorou was supported
by a Myronis Fellowship.
[1] D. Berenson, S. S. Srinivasa, D. Ferguson, A. Collet, and J. J.
Kuffner, “Manipulation planning with workspace goal regions,” in
IEEE International Conference on Robotics and Automation, May
[2] R. Diankov, N. Ratliff, D. Ferguson, S. Srinivasa, and J. J. Kuffner.,
“Bispace planning: Concurrent multi-space exploration,” in Robotics:
Science and Systems, Zurich, Switzerland 2008.
[3] J. Kuffner, K. Nishiwaki, S. Kagami, M. Inaba, and H. Inoue, “Motion
planning for humanoid robots,” in In Proc. 20th Int’l Symp. Robotics
Research (ISRR’03), 2003.
[4] R. B. Rusu, I. A. S¸ucan, B. Gerkey, S. Chitta, M. Beetz, and L. E.
Kavraki, “Real-time perception guided motion planning for a personal
robot,” in International Conference on Intelligent Robots and Systems,
St. Louis, USA, October 2009.
[5] I. A. S¸ucan, M. Kalakrishnan, and S. Chitta, “Combining planning
techniques for manipulation using realtime perception,” in IEEE
International Conference on Robotics and Automation, Anchorage,
Alaska, May 2010.
[6] D. Berenson, S. S. Srinivasa, D. Ferguson, and J. J. Kuffner, “Ma-
nipulation planning on constraint manifolds,” in IEEE International
Conference on Robotics and Automation, May 2009.
[7] K. Hauser and V. Ng-Thow-Hing, “Fast Smoothing of Manipulator
Trajectories using Optimal Bounded-Acceleration Shortcuts,” in IEEE
International Conference on Robotics and Automation, 2010.
[8] J. P. Desai and V. Kumar, “Motion planning for cooperating mobile
manipulators,” Journal of Robotic Systems, vol. 16(10), pp. 557–579,
[9] N. Ratliff, M. Zucker, J. Bagnell, and S. Srinivasa, “CHOMP: Gra-
dient optimization techniques for efficient motion planning,” in IEEE
International Conference on Robotics and Automation, 2009, pp. 12–
[10] P. Dayan and G. E. Hinton, “Using em for reinforcement learning,
Neural Computation, vol. 9, 1997.
[11] E. Theodorou, J. Buchli, and S. Schaal, “Reinforcement learning of
motor skills in high dimensions: a path integral approach,” in IEEE
International Conference on Robotics and Automation, 2010.
[12] A. J. Ijspeert, J. Nakanishi, and S. Schaal, “Learning attractor
landscapes for learning motor primitives,” in Advances in Neural
Information Processing Systems 15. MIT Press, 2002, pp. 1547–
[13] D. Wierstra, T. Schaul, J. Peters, and J. Schmidhuber, “Fitness expec-
tation maximization,” Parallel Problem Solving from Nature–PPSN X,
pp. 337–346, 2008.
[14] Q. Ye, “The signed Euclidean distance transform and its applications,
in 9th International Conference on Pattern Recognition, 1988., 1988,
pp. 495–499.
[15] R. Featherstone, Rigid body dynamics algorithms. Springer-Verlag
New York Inc, 2008.
[16] L. Zhang, X. Huang, Y. Kim, and D. Manocha, “D-plan: Efficient
collision-free path computation for part removal and disassembly,
Journal of Computer-Aided Design and Applications, vol. 5, no. 6,
pp. 774–786, 2008.
[17] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. B. Foote, J. Leibs,
R. Wheeler, and A. Y. Ng, “Ros: an open-source robot operating
system,” in International Conference on Robotics and Automation,
ser. Open-Source Software workshop, 2009.
[18] “STOMP: Stochastic trajectory optimization for motion planning,” Kalakrishnan.
[19] N. D. Ratliff, “Learning to search: structured prediction techniques
for imitation learning,” Ph.D. dissertation, Carnegie Mellon University,
[20] N. Jetchev and M. Toussaint, “Trajectory Prediction in Cluttered Voxel
Environments,” in IEEE International Conference on Robotics and
Automation, 2010.
... However, many different strategies have been proposed. For example, one may start from a randomly initialised candidate trajectory and proceed by adding random perturbations to explore the search space and generate approximate gradients, allowing any arbitrary form of cost functional to be encoded [11]. The same approach can be used to search for control signals and a local motion plan concurrently [12]. ...
... In Stochastic Trajectory Optimisation for Motion Planning (STOMP) [11] the authors address this by approximating the gradient from stochastic samples of noisy trajectories, allowing for non-differentiable costs. Another approach used in motion planning are quality diversity algorithms, at the intersection of optimisation and evolutionary strategies, of which Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is the most prominent [21]- [23]. ...
... where H is a Reproducing Kernel Hilbert Space (RKHS) induced by a positive-definite kernel k : X × X → R, and q [εφ ] indicates the particle distribution resulting from taking an update step as in Eq. (11). Recall that an RKHS H associated with a kernel k is a Hilbert space of functions endowed with an inner product ⟨·, ·⟩ such that f (x) = ⟨ f , k(·, x)⟩ for any f ∈ H and any x ∈ X [47]. ...
Full-text available
Motion planning can be cast as a trajectory optimisation problem where a cost is minimised as a function of the trajectory being generated. In complex environments with several obstacles and complicated geometry, this optimisation problem is usually difficult to solve and prone to local minima. However, recent advancements in computing hardware allow for parallel trajectory optimisation where multiple solutions are obtained simultaneously, each initialised from a different starting point. Unfortunately, without a strategy preventing two solutions to collapse on each other, naive parallel optimisation can suffer from mode collapse diminishing the efficiency of the approach and the likelihood of finding a global solution. In this paper we leverage on recent advances in the theory of rough paths to devise an algorithm for parallel trajectory optimisation that promotes diversity over the range of solutions, therefore avoiding mode collapses and achieving better global properties. Our approach builds on path signatures and Hilbert space representations of trajectories, and connects parallel variational inference for trajectory estimation with diversity promoting kernels. We empirically demonstrate that this strategy achieves lower average costs than competing alternatives on a range of problems, from 2D navigation to robotic manipulators operating in cluttered environments.
... Motion planning is a crucial component of autonomous robot systems [1], [2], [3], [4]. It addresses the problem of finding a feasible, smooth, and collision-free path between a start and a goal point in a robot's configuration space, which can subsequently be executed by a lower-level controller [5]. ...
... Commonly used approaches for motion planning are either sampling [6], [7], [8], [9] or optimization based [2], [10], [11], [3]. Sampling-based methods possess a completeness property, assuring a global optimum given infinite compute time [9]. ...
... However, in practice, they often suffer from sample inefficiency and tend to produce non-smooth trajectories [12]. On the other hand, optimization-based planners optimize initial trajectories via either preconditioned gradient descent [2], [10], [11]; or stochastic update rules [3], [13] and can integrate desired properties such as smoothness as costs to be optimized. Nevertheless, optimization-based planners depend on a good initialization and can get trapped in local minima due to the non-convexity of complex problems. ...
Full-text available
Learning priors on trajectory distributions can help accelerate robot motion planning optimization. Given previously successful plans, learning trajectory generative models as priors for a new planning problem is highly desirable. Prior works propose several ways on utilizing this prior to bootstrapping the motion planning problem. Either sampling the prior for initializations or using the prior distribution in a maximum-a-posterior formulation for trajectory optimization. In this work, we propose learning diffusion models as priors. We then can sample directly from the posterior trajectory distribution conditioned on task goals, by leveraging the inverse denoising process of diffusion models. Furthermore, diffusion has been recently shown to effectively encode data multimodality in high-dimensional settings, which is particularly well-suited for large trajectory dataset. To demonstrate our method efficacy, we compare our proposed method - Motion Planning Diffusion - against several baselines in simulated planar robot and 7-dof robot arm manipulator environments. To assess the generalization capabilities of our method, we test it in environments with previously unseen obstacles. Our experiments show that diffusion models are strong priors to encode high-dimensional trajectory distributions of robot motions.
... Modern approaches handle these challenges in different ways. Most motion planning algorithms can be divided into three broad categories: methods based on sampling [13,12,2,3], optimization [31,11,23,22,18,15], and probabilistic inference [26,1,28,27]. These categories are formulated using different mathematical principles. ...
... From (8), we deduce that probabilistic-inference-based motion planners based on variational inference are stochastic extensions of traditional, optimization-based motion planning algorithms such as STOMP and CHOMP [11,31]. The differences are that: (i) instead of a single trajectory, we now return a distribution over trajectories, (ii) the loss contains an extra expectation term, and (iii) we include an additional KL-based regularizer to ensure we do not stray too far from the prior, thereby allowing us to control smoothness and related task-independent properties in a distributional manner. ...
Full-text available
To control how a robot moves, motion planning algorithms must compute paths in high-dimensional state spaces while accounting for physical constraints related to motors and joints, generating smooth and stable motions, avoiding obstacles, and preventing collisions. A motion planning algorithm must therefore balance competing demands, and should ideally incorporate uncertainty to handle noise, model errors, and facilitate deployment in complex environments. To address these issues, we introduce a framework for robot motion planning based on variational Gaussian Processes, which unifies and generalizes various probabilistic-inference-based motion planning algorithms. Our framework provides a principled and flexible way to incorporate equality-based, inequality-based, and soft motion-planning constraints during end-to-end training, is straightforward to implement, and provides both interval-based and Monte-Carlo-based uncertainty estimates. We conduct experiments using different environments and robots, comparing against baseline approaches based on the feasibility of the planned paths, and obstacle avoidance quality. Results show that our proposed approach yields a good balance between success rates and path quality.
... The limitation of this type of methods is that the manipulator may extend the footprint of the robot which makes it difficult to find an available path since a footprint may be in collision with the 2D projected map while the mobile manipulator is collision-free in the 3D environment. Some researchers [12,13] considered motion planning as a trajectory optimization problem and proposed STOMP and CHOMP, where the goal is to find suitable trajectories that minimize a given cost function without violating the given constraints. Some researchers use sampling-based planners such as RRT [14], PRM [15], their variants RRT-Connect [16], RRT* [17], Bidirectional RRT* [18], and Informed RRT* [19] for mobile manipulator motion planning due to their ability to plan in high-dimensional space. ...
... A sampling-based method is used to generate the collision-free configuration (Lines [10][11][12][13][14][15][16][17][18][19]. Each angle is sampled in the selected range near the corresponding angle of the template configuration. ...
Full-text available
This paper focuses on motion planning for mobile manipulators, which includes planning for both the mobile base and the manipulator. A hierarchical motion planner is proposed that allows the manipulator to change its configuration autonomously in real time as needed. The planner has two levels: global planning for the mobile base in two dimensions and local planning for both the mobile base and the manipulator in three dimensions. The planner first generates a path for the mobile base using an optimized A* algorithm. As the mobile base moves along the path with the manipulator configuration unchanged, potential collisions between the manipulator and the environment are checked using the environment data obtained from the on-board sensors. If the current manipulator configuration is in a potential collision, a new manipulator configuration is searched. A sampling-based heuristic algorithm is used to effectively find a collision-free configuration for the manipulator. The experimental results in simulation environments proved that our heuristic sampling-based algorithm outperforms the conservative random sampling-based method in terms of computation time, percentage of successful attempts, and the quality of the generated configuration. Compared with traditional methods, our motion planning method could deal with 3D obstacles, avoid large memory requirements, and does not require a long time to generate a global plan.
... The second method is for the avoidance of dynamic obstacles in real-time operations by selecting an optimal action plan including potential field and cost function-based methods based on an intelligent algorithm. The cost function-based techniques estimate each action plan, and later select the minimum criteria [9]. In the case of the potential field technique, firstly the resultant force is created for dynamic obstacle avoidance and encouraged the robot to obstacle-free movement [10]. ...
Full-text available
In the realm of mobile robotics, navigating around obstacles is a fundamental task, particularly in constantly changing situations. Although deep reinforcement learning (DRL) techniques exist that utilize the positional information of robot’s, environmental states, and input dataset for neural networks. Although, the positional information alone does not provide sufficient insights into the motion trends of obstacles. To solve this issue, this paper presents a dynamic obstacle mobility pattern approach for mobile robots (MRs) that rely on DRL. This method employs the positional details of dynamic obstacles dependent upon time for establishing a movement trend vector. This vector, in conjunction with another mobility state attribute, forms the MR mobility guidance matrix, that essentially conveys the pattern variation of dynamic obstacles trend over a specified interval. Using this matrix, the robot can choose its avoidance action. Also, this methodology uses the DRL-based dynamic policy algorithm for the testing and validation of the proposed technique through Python programming. The experimental outcomes demonstrate that this technique substantially improves the safety of avoiding dynamic obstacles.
... CHOMP [13] uses functional gradient descent to minimize the squared norm of velocity and collision avoiding cost. Extending this spirit, STOMP [14] avoids computing the gradient and performs stochastic analysis with a series of noisy trajectories. TrajOpt [15] approximates the cost and constraints as the sequential convex optimization problem and uses convex-convex collision detection with continuoustime collision checking, while other recent work solves it with sequential quadratic programming [18,19] or optimization methods such as interior point optimization [20] and Gaussian Process [21]. ...
Full-text available
Generating motions to grasp the objects in the scene is critical in robotic manipulation. The feasibility and quality of the output trajectory are closely related to both the grasping pose and the motion to reach the object, and decoupling grasp and motion planning cannot balance these two factors. In this paper, a motion planning method is proposed that comprehensively considers trajectory optimization, trajectory goal selection, and grasp adjustment. Trajectory optimization is formulated as an equality-constrained optimization problem, in which the end configuration of the trajectory can be changed during optimization. The update rule is derived and acts as a basic rule for goal selection and adjustment. The grasp configuration is chosen from a goal set, which helps the optimization to jump out of the local minima and generate smooth and feasible trajectories. We propose a two-stage grasp selection algorithm to balance between exploitation of the trajectory optimization and exploration of the goal set. To further optimize the trajectory, we propose a grasp adjustment method that allows the grasp pose to rotate about the connection line of the contact points in the workspace. To address this complicated problem, we divide the problem into two subproblems: computing the angle to rotate in the workspace and calculating the grasp configuration in the joint space. We design the overall planner and incorporate our method into a complete robotic arm platform with perception and the grasp generation network. All proposed methods are validated by performing comparative simulations and experiments.
... Navigating these intricacies, trajectory optimization [3]- [7] emerges as an alternative paradigm, generating sub-optimal trajectories through a constrained optimization formulation. Direct or collocation methods [3], [4], [6], [7] optimize over control and trajectory spaces, while indirect methods [8] focus exclusively on optimizing control inputs Although deterministic robot motion planning is a wellstudied problem [1] [5], motion planning for robots [9] in the presence of uncertainties remains a challenging task. Scenarios such as UAVs navigating turbulent airstreams [10], [11], legged robots traversing unknown terrains [12], and manipulators grasping with sensor-induced noise [13] underscore the indispensability of motion planning techniques that accommodate dynamic and environmental uncertainties. ...
Full-text available
We consider the motion planning problem under uncertainty and address it using probabilistic inference. A collision-free motion plan with linear stochastic dynamics is modeled by a posterior distribution. Gaussian variational inference is an optimization over the path distributions to infer this posterior within the scope of Gaussian distributions. We propose Gaussian Variational Inference Motion Planner (GVI-MP) algorithm to solve this Gaussian inference, where a natural gradient paradigm is used to iteratively update the Gaussian distribution, and the factorized structure of the joint distribution is leveraged. We show that the direct optimization over the state distributions in GVI-MP is equivalent to solving a stochastic control that has a closed-form solution. Starting from this observation, we propose our second algorithm, Proximal Gradient Covariance Steering Motion Planner (PGCS-MP), to solve the same inference problem in its stochastic control form with terminal constraints. We use a proximal gradient paradigm to solve the linear stochastic control with nonlinear collision cost, where the nonlinear cost is iteratively approximated using quadratic functions and a closed-form solution can be obtained by solving a linear covariance steering at each iteration. We evaluate the effectiveness and the performance of the proposed approaches through extensive experiments on various robot models. The code for this paper can be found in
Natural language instructions are effective at tasking autonomous robots and for teaching them new knowledge quickly. Yet, human instructors are not perfect and are likely to make mistakes at times, and will correct themselves when they notice errors in their own instructions. In this paper, we introduce a complete system for robot behaviors to handle such corrections, during both task instruction and action execution. We then demonstrate its operation in an integrated cognitive robotic architecture through spoken language in two tasks: a navigation and retrieval task and a meal assembly task. Verbal corrections occur before, during, and after verbally taught sequences of tasks, demonstrating that the proposed methods enable fast corrections not only of the semantics generated from the instructions, but also of overt robot behavior in a manner shown to be reasonable when compared to human behavior and expectations.
Full-text available
We present a novel approach to compute a collision-free path for part disassembly simulation and virtual prototyping of part removal. Our algorithm is based on sample-based motion planning that connects collision-free samples in the configuration space using local planning. In order to effectively handle the tight-fitting scenarios, we describe techniques to generate samples in narrow passages and efficient local planning algorithms to connect them with collision-free paths. Our approach is general and makes no assumption about model connectivity or object topology, and can handle polygon soup models that frequently arise in CAD applications. We highlight the performance on many challenging benchmarks including the Alpha puzzle, maintainability of the windscreen wiper motion, and disassembly of a seat from the interior of a car body.
Full-text available
Humanoid robotics hardware and control techniques have advanced rapidly during the last five years. Presently, several companies have announced the commercial availability of various humanoid robot prototypes. In order to improve the autonomy and overall functionality of these robots, reliable sensors, safety mechanisms, and general integrated software tools and techniques are needed. We believe that the development of practical motion planning algorithms and obstacle avoidance software for humanoid robots represents an important enabling technology. This paper gives an overview of some of our recent efforts to develop motion planning methods for humanoid robots for application tasks involving navigation, object grasping and manipulation, footstep placement, and dynamically-stable full-body motions. We show experimental results obtained by implementations running within a simulation environment as well as on actual humanoid robot hardware.
Conference Paper
Full-text available
Trajectory planning and optimization is a fundamental problem in articulated robotics. It is often viewed as a two phase problem of initial feasible path planning around obstacles and subsequent optimization of a trajectory satisfying dynamical constraints. There are many methods that can generate good movements when given enough time, but planning for high-dimensional robot configuration spaces in realistic environments with many objects in real time remains challenging. This work presents a novel way for faster movement planning in such environments by predicting good path initializations. We build on our previous work on trajectory prediction by adapting it to environments modeled with voxel grids and defining a frame invariant prototype trajectory space. The constructed representations can generalize to a wide range of situations, allowing to predict good movement trajectories and speed up convergence of robot motion planning. An empirical comparison of the effect on planning movements with a combination of different trajectory initializations and local planners is presented and tested on a Schunk arm manipulation platform with laser sensors in simulation and hardware.
Conference Paper
Full-text available
We present a novel combination of motion planning techniques to compute motion plans for robotic arms. We compute plans that move the arm as close as possible to the goal region using sampling-based planning and then switch to a trajectory optimization technique for the last few centimeters necessary to reach the goal region. This combination allows fast computation and safe execution of motion plans even when the goals are very close to objects in the environment. The system incorporates realtime sensory inputs and correctly deals with occlusions that can occur when robot body parts block the sensor view of the environment. The system is tested on a 7 degree-of-freedom robot arm with sensory input from a tilting laser scanner that provides 3D information about the environment.
Introduction.- Spatial Vector Algebra.- Dynamics of Rigid Body Systems.- Modelling Rigid Body Systems.- Inverse Dynamics.- Forward Dynamics - Inertia Matrix Methods.- Forward Dynamics - Propagation Methods.- Closed Loop Systems.- Hybrid Dynamics and Other Topics.- Accuracy and Efficiency.- Contact and Impact.
We address the problem of motion planning for nonholonomic cooperating mobile robots manipulating and transporting objects while holding them in a stable grasp. We present a general approach for solving optimal control problems based on the calculus of variations. We specialize this approach to solving the motion planning problem and obtaining trajectories and actuator forces/torques for any maneuver in the presence of obstacles. The approach allows geometric constraints such as joint limits, kinematic constraints such as nonholonomic velocity constraints, and dynamic constraints such as frictional constraints and contact force constraints to be incorporated into the planning scheme. The application of this method is illustrated by computing motion plans for several examples, and these motions plans are implemented on an experimental testbed.
Inverse dynamics is the calculation of the forces required at a robot’s joints in order to produce a given set of joint accelerations. The principal uses of inverse dynamics are in robot control and trajectory planning. In control applications it is usually incorporated as an element in a feedback loop to convert accelerations computed according to some control law into the joint forces which will achieve those accelerations (e.g., see [27], [39], [33]). Rapid execution is essential to achieve real-time control. In trajectory planning, inverse dynamics can be used to check that the forces needed to execute a proposed trajectory do not exceed the actuators’ limits [16], and recently discovered time-scaling properties of inverse dynamics have facilitated its use in planning minimum-time trajectories [52], [23]. Inverse dynamics is also used as a building block for constructing forward dynamics algorithms (see Chapter 5).
Conference Paper
Abstract— This paper gives an overview of ROS, an open- source robot operating,system. ROS is not an operating,system in the traditional sense of process management,and scheduling; rather, it provides a structured communications layer above the host operating,systems,of a heterogenous,compute,cluster. In this paper, we discuss how ROS relates to existing robot software frameworks, and briefly overview some of the available application software,which,uses ROS.
Conference Paper
Existing high-dimensional motion planning algorithms are simultaneously overpowered and underpowered. In domains sparsely populated by obstacles, the heuristics used by sampling-based planners to navigate “narrow passages” can be needlessly complex; furthermore, additional post-processing is required to remove the jerky or extraneous motions from the paths that such planners generate. In this paper, we present CHOMP, a novel method for continuous path refinement that uses covariant gradient techniques to improve the quality of sampled trajectories. Our optimization technique both optimizes higher-order dynamics and is able to converge over a wider range of input paths relative to previous path optimization strategies. In particular, we relax the collision-free feasibility prerequisite on input paths required by those strategies. As a result, CHOMP can be used as a standalone motion planner in many real-world planning queries. We demonstrate the effectiveness of our proposed method in manipulation planning for a 6-DOF robotic arm as well as in trajectory generation for a walking quadruped robot.