Control for Throwing Manipulation by One Joint Robot
Hideyuki Miyashita, Tasuku Yamawaki and Masahito Yashima
Abstract—This paper proposes a throwing manipulation
strategy for a robot with one revolute joint. Throwing enables
the robot not only to move an object beyond its own movable
range, but also to place the object at an arbitrary position in the
vertical plane, even though the robot has only one degree of
freedom. In throwing manipulation, the robot motion is dynamic
and quick, and the contact state between the robot and the object
changes. These properties make it difficult to obtain an exact
model and to solve its inverse problem. In addition, since throwing
manipulation requires more powerful actuators than static
manipulation, the control input must be chosen with the
performance limits of the actuators in mind. This paper proposes
a control strategy based on iterative optimization learning to
overcome these problems and verifies its effectiveness
experimentally.
I. INTRODUCTION
In general, a robot manipulates an object by grasping it.
Grasping gives the robot high stability during manipulation.
However, a grasp limits the flexibility of manipulation.
For instance, the workspace of grasp-based manipulation is
limited by the movable range of the manipulator.
In this paper, we discuss throwing manipulation by a
one-joint robot that can send the object to multiple goal
positions, as shown in Fig. 1. If the robot can accomplish
such throwing manipulation, it can not only move the object
beyond its own movable range, but also place the object at
an arbitrary position in the vertical plane, even though the
robot has only one degree of freedom [1], [2].
Fig. 2(a) shows one application of throwing manipulation,
in which the one-joint robot throws various types of objects
carried by a belt conveyor and stores them in their respective
goal containers. Because the containers are placed at the apex
point of the object's trajectory, the collision impact is
reduced. Fig. 2(b) shows juggling by throwing manipulation.
If the one-joint robot can throw the object to different goal
positions, it can perform human-like dexterous juggling.
In throwing manipulation, the robot motion is dynamic
and quick, and the contact state between the robot and the
object changes [3]. These properties make it difficult to obtain
an exact model and to solve its inverse problem. In addition,
since throwing manipulation requires more powerful actuators
than static manipulation, the control input must be chosen
with the performance limits of the actuators in mind.
The authors are with the Dept. of Mechanical Systems Engineering, National
Defense Academy of Japan, 1-10-20 Hashirimizu, Yokosuka, Kanagawa,
Japan {g47091, yamawaki, yashima}@nda.ac.jp
Fig. 1. Throwing manipulation by one joint robot (goal positions 1, 2, and 3)
Several studies discuss manipulation with a throwing
motion [1], [2], [4]-[9]. Lynch proposes a model-based
controller that uses nonlinear optimization techniques [4].
The experimental results are less successful because of
modeling error. To overcome the modeling error problem,
several studies apply learning control to throwing
manipulation [2], [5], [6]. However, these studies neglect the
performance limits of the actuator and do not attempt
learning control for multiple goals.
This paper discusses a throwing manipulation method
based on iterative optimization learning, which takes into
account the modeling error problem, the performance limits
of the actuator, and the stability of the learning method.
II. MODELING
In this section, we develop the throwing model, which
maps the trajectory parameter that decides the arm trajectory
to the apex point of the object where the object attains its
highest elevation.
Fig. 2. Application of the throwing manipulation: (a) storage task, (b) juggling task (object trajectories 1 and 2)
Fig. 3. Arm trajectory: angular acceleration $\ddot\theta$ [rad/s^2], angular velocity $\dot\theta$ [rad/s], and angle $\theta$ [rad] versus time [s], with time ticks at $-\beta/\alpha$, $-2\beta/\alpha$, $-3\beta/\alpha$, and $-4\beta/\alpha$
A. Arm trajectory
To simplify the modeling, the throwing motion is confined to
the vertical xy plane. We regard the object as a point mass
and neglect air resistance. The reference frame is located on
the axis of the joint. The arm angle $\theta$ is defined as 0 deg
when the arm is horizontal, and counterclockwise rotation is
positive.
It is assumed that the arm trajectory is given by a third-order
polynomial in time $t$:
$$\theta(t) = \frac{\alpha}{6}t^3 + \frac{\beta}{2}t^2 + \theta_s \tag{1}$$
$$\dot\theta(t) = \frac{\alpha}{2}t^2 + \beta t \tag{2}$$
$$\ddot\theta(t) = \alpha t + \beta \tag{3}$$
where $\theta_s$ is the initial angle, and $\alpha$ and $\beta$ are the jerk of the
trajectory and the initial angular acceleration, respectively.
From these equations, the arm trajectory is determined by
three parameters: $\theta_s$, $\alpha$, and $\beta$. To enable the robot
to throw the object in the counterclockwise direction, we
define the ranges of $\alpha$, $\beta$, and $\theta_s$ as follows:
$$-\pi/2 < \theta_s < 0 \tag{4}$$
$$\alpha < 0 \tag{5}$$
$$\beta > 0 \tag{6}$$
Fig. 3 shows the approaching arm trajectory, which reaches
the maximum arm angle at time $t = -2\beta/\alpha$. The object is
thrown into the air before $t = -2\beta/\alpha$, at the moment when
the throwing condition described in Section II-D is satisfied.
We set the return trajectory to be symmetrical to the
approaching trajectory with respect to $t = -2\beta/\alpha$.
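As a quick numerical check of the arm trajectory (1)-(3), the following minimal sketch evaluates the polynomial and confirms that the angular velocity peaks at $t = -\beta/\alpha$ with value $-\beta^2/2\alpha$ and returns to zero at $t = -2\beta/\alpha$. The function name and the parameter values are illustrative assumptions, not values taken from the experiment.

```python
import math

def arm_trajectory(t, alpha, beta, theta_s):
    """Return (theta, theta_dot, theta_ddot) of the arm from Eqs. (1)-(3)."""
    theta = alpha / 6.0 * t**3 + beta / 2.0 * t**2 + theta_s
    theta_dot = alpha / 2.0 * t**2 + beta * t
    theta_ddot = alpha * t + beta
    return theta, theta_dot, theta_ddot

# Illustrative parameters (assumed), chosen to satisfy (4)-(6)
alpha, beta, theta_s = -3000.0, 200.0, -math.pi / 4

t_peak_velocity = -beta / alpha       # angular acceleration crosses zero here
t_peak_angle = -2.0 * beta / alpha    # angular velocity returns to zero here

_, peak_velocity, _ = arm_trajectory(t_peak_velocity, alpha, beta, theta_s)
peak_angle, velocity_at_peak_angle, _ = arm_trajectory(
    t_peak_angle, alpha, beta, theta_s)
```

With $\alpha < 0$ and $\beta > 0$ the arm accelerates counterclockwise from rest, which is why the release must occur before $t = -2\beta/\alpha$.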
B. The apex point of the object trajectory
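We formulate the motion of the thrown object. Assume
that the arm throws the object with the angular velocity
$\dot\theta = \dot\theta_t$ at the angle $\theta = \theta_t$ and that the motion of the
object follows the ballistic flight equation. The apex point of
the free-flying trajectory, where the object attains its highest
point, $(x_m, y_m)$, is called the model's apex point and is
written as
$$\begin{bmatrix} x_m \\ y_m \end{bmatrix} = \begin{bmatrix} l\cos\theta_t \\ l\sin\theta_t \end{bmatrix} + \begin{bmatrix} -l\dot\theta_t\sin\theta_t \\ l\dot\theta_t\cos\theta_t \end{bmatrix} \left( \frac{l\dot\theta_t\cos\theta_t}{g} \right) - \begin{bmatrix} 0 \\ g/2 \end{bmatrix} \left( \frac{l\dot\theta_t\cos\theta_t}{g} \right)^2 \tag{7}$$
where $l$ is the radius of the arm and $g$ is the gravitational
acceleration. As seen from (7), the model's apex point
$(x_m, y_m)$ is determined by $\theta_t$ and $\dot\theta_t$.
Fig. 4. Throwing condition (forces acting on the object at release: $ml\ddot\theta_t$, $ml\dot\theta_t^2$, $mg\cos\theta_t$, $mg\sin\theta_t$, and the friction force $\mu(ml\dot\theta_t^2 - mg\sin\theta_t)$)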
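Equation (7) is the release position plus the release velocity multiplied by the time to apex, minus the gravity term. A minimal sketch of this computation follows; the function name and the sample release state are assumptions for illustration, and the expression is meaningful only when the vertical release velocity $l\dot\theta_t\cos\theta_t$ is positive.

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]

def apex_point(theta_t, theta_dot_t, l, g=G):
    """Model's apex point (x_m, y_m) from Eq. (7) for a point mass."""
    # Release position on the arm tip
    px, py = l * math.cos(theta_t), l * math.sin(theta_t)
    # Release velocity, tangential to the arm
    vx = -l * theta_dot_t * math.sin(theta_t)
    vy = l * theta_dot_t * math.cos(theta_t)
    # Time from release until the vertical velocity vanishes
    t_up = vy / g
    x_m = px + vx * t_up
    y_m = py + vy * t_up - 0.5 * g * t_up**2
    return x_m, y_m

# Illustrative release state (assumed): horizontal arm, 5 rad/s, l = 0.3 m
x_m, y_m = apex_point(theta_t=0.0, theta_dot_t=5.0, l=0.3)
```

For a horizontal arm the release velocity is purely vertical, so the apex lies directly above the arm tip at a height of $(l\dot\theta_t)^2/2g$.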
C. Forward model
It is assumed that the object is thrown at time $t = t_t > 0$
(the throwing time). From (1), (2), and (3), the arm's throwing
angle $\theta_t$, angular velocity $\dot\theta_t$, and angular acceleration $\ddot\theta_t$
can be described by four parameters, $\alpha$, $\beta$, $\theta_s$, and $t_t$. These
parameters form the trajectory parameter $u$, which is
expressed by
$$u = (\alpha, \beta, \theta_s, t_t)^T \tag{8}$$
As seen from (7), the model's apex point $(x_m, y_m)$ can be
expressed by using $\theta_t$ and $\dot\theta_t$. By substituting (1) and (2)
into (7), the model's apex point $(x_m, y_m)$ is described by the
trajectory parameter $u$. The forward model of the throwing,
which relates the model's apex point $(x_m, y_m)$ to the
trajectory parameter $u$, can be expressed by a nonlinear
function $f$ as
$$(x_m, y_m)^T = f(u) \tag{9}$$
D. Throwing condition
We derive the throwing condition from the equilibrium of
forces at the moment when the object is thrown into the air,
as shown in Fig. 4. The object is thrown if the force balance
in the direction perpendicular to the arm's surface satisfies
$$h_T(u) = ml\ddot\theta_t + mg\cos\theta_t + \mu(ml\dot\theta_t^2 - mg\sin\theta_t) = 0 \tag{10}$$
where $m$ is the object's mass and $\mu$ is the frictional
coefficient. Since (10) is expressed in terms of $\theta_t$, $\dot\theta_t$,
and $\ddot\theta_t$, the throwing condition $h_T(u)$ can also be described
by the trajectory parameter $u$.
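For a given arm trajectory, the throwing condition (10) fixes the throwing time $t_t$. The sketch below locates the first sign change of $h_T$ on $(0, -2\beta/\alpha]$ by a coarse scan followed by bisection. This is one possible root-finding approach, not necessarily the one used by the authors. The friction coefficient $\mu = 0.5$ and the values of $\alpha$, $\beta$, and $\theta_s$ are placeholders, since no such values are stated here; $l$ and $m$ follow Section IV-B.

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]

def arm_state(t, alpha, beta, theta_s):
    """Arm angle, angular velocity, and angular acceleration from (1)-(3)."""
    theta = alpha / 6.0 * t**3 + beta / 2.0 * t**2 + theta_s
    theta_dot = alpha / 2.0 * t**2 + beta * t
    theta_ddot = alpha * t + beta
    return theta, theta_dot, theta_ddot

def h_T(t, alpha, beta, theta_s, l, m, mu, g=G):
    """Left-hand side of the throwing condition (10) evaluated at time t."""
    theta, theta_dot, theta_ddot = arm_state(t, alpha, beta, theta_s)
    return (m * l * theta_ddot + m * g * math.cos(theta)
            + mu * (m * l * theta_dot**2 - m * g * math.sin(theta)))

def release_time(alpha, beta, theta_s, l, m, mu, steps=1000, tol=1e-12):
    """First time in (0, -2*beta/alpha] where h_T drops from positive to zero."""
    t_end = -2.0 * beta / alpha
    lo, h_lo = 0.0, h_T(0.0, alpha, beta, theta_s, l, m, mu)
    for k in range(1, steps + 1):
        hi = k * t_end / steps
        h_hi = h_T(hi, alpha, beta, theta_s, l, m, mu)
        if h_lo > 0.0 and h_hi <= 0.0:
            # Bisection inside the bracketing interval [lo, hi]
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if h_T(mid, alpha, beta, theta_s, l, m, mu) > 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        lo, h_lo = hi, h_hi
    return None  # no release within the approaching trajectory

# Placeholder parameters (assumed); mu in particular is hypothetical
alpha, beta, theta_s = -3000.0, 200.0, -math.pi / 4
l, m, mu = 0.3, 0.05, 0.5
t_t = release_time(alpha, beta, theta_s, l, m, mu)
```

With these values every term of $h_T$ is positive while the arm is still accelerating, so the release falls in the deceleration phase between $-\beta/\alpha$ and $-2\beta/\alpha$.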
Fig. 5. Throwing model (input: $u$; output: $(x_m, y_m)^T = f(u)$; constraints: $h_T = 0$, $h_M \le 0$)
E. Constraint on trajectory parameter
This section describes the constraint on the trajectory
parameter. We should take into consideration the constraint on
the actuator performance. As seen from Fig. 3, the maximum
angular acceleration of the arm is $\beta$. The torque required for
this acceleration must not exceed the maximum torque $\tau_{\max}$
of the actuator, which is written as
$$I\beta \le \tau_{\max} \tag{11}$$
where $I$ is the moment of inertia of the arm and the rotor of the
motor. From (6) and (11), we get the inequality constraint
on the parameter $\beta$:
$$0 < \beta \le \tau_{\max}/I \tag{12}$$
Similarly, the maximum angular velocity of the arm is
$-\beta^2/2\alpha$. This velocity must not exceed the maximum
angular velocity $\dot\theta_{\max}$ of the actuator, which is written as
$$-\beta^2/2\alpha \le \dot\theta_{\max} \tag{13}$$
Linearizing (13) within the range of $\beta$ shown in (12), where
$\beta^2 \le \tau_{\max}\beta/I$, yields
$$\alpha + \frac{\tau_{\max}\beta}{2\dot\theta_{\max}I} \le 0 \tag{14}$$
Therefore, the constraints on the actuator performance are
given by (12) and (14).
Finally, combining (4), (5), (12), and (14) yields the
constraint on the trajectory parameter, which is written as
$$h_M(u) \le 0 \tag{15}$$
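The constraint (15) can be read as a list of scalar inequalities. The sketch below stacks (4), (5), (12), and (14) into one vector using the actuator values reported in Section IV-B; the function names are hypothetical, and the strict inequalities of (4)-(6) are treated as non-strict for simplicity.

```python
import math

TAU_MAX = 18.0                  # maximum torque [N m]
THETA_DOT_MAX = 4.0 * math.pi   # maximum angular velocity [rad/s]
INERTIA = 0.074                 # moment of inertia of arm and rotor [kg m^2]

def h_M(alpha, beta, theta_s):
    """Constraint values; the parameters are admissible when all are <= 0."""
    return [
        -math.pi / 2 - theta_s,                      # (4): theta_s > -pi/2
        theta_s,                                     # (4): theta_s < 0
        alpha,                                       # (5): alpha < 0
        -beta,                                       # (12): beta > 0
        beta - TAU_MAX / INERTIA,                    # (12): torque limit
        alpha + TAU_MAX * beta
        / (2.0 * THETA_DOT_MAX * INERTIA),           # (14): velocity limit
    ]

def is_admissible(alpha, beta, theta_s):
    return all(value <= 0.0 for value in h_M(alpha, beta, theta_s))

inside = is_admissible(-3000.0, 200.0, -math.pi / 4)
# Satisfies the original bound (13) but not its linearization (14)
conservative_case = is_admissible(-1800.0, 200.0, -math.pi / 4)
over_torque = is_admissible(-3000.0, 250.0, -math.pi / 4)
```

The second case has a peak velocity $-\beta^2/2\alpha \approx 11.1$ rad/s, below $\dot\theta_{\max} = 4\pi$ rad/s, yet it is rejected. This illustrates that (14) is a sufficient, slightly conservative replacement for (13).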
F. Throwing model
Combining (9), (10), and (15) yields the throwing model
$$\begin{aligned} (x_m, y_m)^T &= f(u) \\ h_T(u) &= 0 \\ h_M(u) &\le 0 \end{aligned} \tag{16}$$
which relates the trajectory parameter $u$ to the model's
apex point $(x_m, y_m)$ under the constraint conditions,
as shown in Fig. 5.
III. ITERATIVE LEARNING CONTROL USING OPTIMIZATION
A. Learning control based on virtual goal apex point
If we could obtain an exact model of the robot, we could
calculate the motion pattern for the arm to throw the object
to the goal. To obtain the exact model, we would need to
consider the kinematics and dynamics, including the
interaction between the robot and the object, the calibration
of the vision sensor, and so on. In general, it is impossible to
obtain such an exact model. The throwing model (16) is
obtained by simplifying the actual motion. Therefore, we
cannot bring the apex point of the object trajectory to the
goal apex point $(x_d, y_d)$ simply by applying to the robot
the control input (trajectory parameter) $u$ obtained from the
throwing model.
Fig. 6. Learning algorithm using virtual goal generator
Instead of setting $(x_d, y_d)$ as the goal apex point for the
controller, we set a virtual goal apex point $(\tilde{x}_d, \tilde{y}_d)$ for the
controller C as shown in Fig. 6, which is updated at each
throwing trial in order to enable the robot to throw the object
to the goal apex point $(x_d, y_d)$. The virtual goal apex point
$(\tilde{x}_d, \tilde{y}_d)$ is obtained by the task-level learning approach [6].
The difference between the goal apex point $(x_d, y_d)$ and
the apex point $(x_c^i, y_c^i)$ measured with a camera can be
written as
$$e^i = (x_d, y_d)^T - (x_c^i, y_c^i)^T \tag{17}$$
where the index $i$ indicates the $i$th throwing.
The virtual goal apex point $(\tilde{x}_d^{i+1}, \tilde{y}_d^{i+1})$ of the $(i+1)$th
throwing is updated by using the $i$th error $e^i$ of (17) as
$$(\tilde{x}_d^{i+1}, \tilde{y}_d^{i+1})^T = (\tilde{x}_d^i, \tilde{y}_d^i)^T + k e^i \tag{18}$$
where $0 < k \le 1$ is a constant parameter which affects the
convergence of the learning.
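The update (17)-(18) can be illustrated with a toy loop. In the sketch below, the function `toy_throw` is a hypothetical stand-in for the controller, robot, and camera: it assumes the object lands at the commanded virtual goal plus a constant bias. Under that assumption the error shrinks by a factor of $(1 - k)$ per trial; a real robot would not behave this simply.

```python
def update_virtual_goal(virtual_goal, goal, measured, k=0.7):
    """One step of the virtual goal generator, Eqs. (17) and (18)."""
    error = (goal[0] - measured[0], goal[1] - measured[1])      # (17)
    new_virtual_goal = (virtual_goal[0] + k * error[0],
                        virtual_goal[1] + k * error[1])         # (18)
    return new_virtual_goal, error

def toy_throw(virtual_goal, bias=(0.03, -0.02)):
    """Hypothetical plant: measured apex = commanded apex + constant bias."""
    return (virtual_goal[0] + bias[0], virtual_goal[1] + bias[1])

goal = (0.10, 0.30)          # goal apex point [m]
virtual_goal = goal          # the first virtual goal equals the goal
error_norms = []
for i in range(6):
    measured = toy_throw(virtual_goal)
    virtual_goal, error = update_virtual_goal(virtual_goal, goal, measured)
    error_norms.append((error[0]**2 + error[1]**2) ** 0.5)
```

A smaller $k$ slows this contraction but makes the update less sensitive to measurement noise, which is the trade-off behind the choice of $k$.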
B. Optimization of trajectory parameter
The controller C in Fig. 6 finds the trajectory parameter
$u$ which achieves throwing to the virtual goal apex
$(\tilde{x}_d, \tilde{y}_d)$ by using the throwing model (16), which is called
the inverse problem of the throwing manipulation. To improve
the performance of the throwing task, it is necessary to
consider various control purposes as well as the throwing
condition and the actuator's constraints. Therefore, in order
to obtain the trajectory parameter $u^{i+1}$ of the $(i+1)$th
throwing, the inverse problem is solved as the nonlinear
programming problem described by the following equations.
$$\begin{aligned} \min:\ J ={}& w_x\left(\tilde{x}_d^{i+1} - x_m(u^{i+1})\right)^2 + w_y\left(\tilde{y}_d^{i+1} - y_m(u^{i+1})\right)^2 \\ &+ w_T\left(-\beta^{i+1}/\alpha^{i+1}\right)^2 + \Delta u^T W_u \Delta u \\ \text{subj. to: }& \text{eq. (16)} \end{aligned} \tag{19}$$
where $w_x$, $w_y$, $w_T$ are weights and $W_u$ is a diagonal
weighting matrix.
Here, (19) is the objective function. The first and second
terms of the right-hand side, which indicate the difference
between the virtual goal apex point $(\tilde{x}_d^{i+1}, \tilde{y}_d^{i+1})$ and the
model's apex point $(x_m, y_m)$, contribute to the derivation of
the trajectory parameter achieving the virtual goal apex point.
The third term $(-\beta^{i+1}/\alpha^{i+1})$, which indicates the time when
the arm reaches the maximum angular velocity, contributes
to the improvement of the motion performance of the arm.
The variable $\Delta u$ in the fourth term indicates the change of
the trajectory parameter, which is described as
$$\Delta u = u^{i+1} - u^i \tag{20}$$
Therefore, the fourth term keeps the trajectory parameter
from changing drastically and improves the stability of the
control system.
We solve the nonlinear optimization problem expressed
by (19) and (16) at each throwing, and update the $(i+1)$th
trajectory parameter $u^{i+1}$. We use the sequential quadratic
programming (SQP) method, whose advantage is its good
convergence [10].
Fig. 7. Iterative learning control algorithm (blocks: virtual goal generator, optimization with the throwing model, PD controller, arm, object, camera, and memory within the learning controller)
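To make the four terms of the objective (19) concrete, the sketch below evaluates $J$ for a candidate $u^{i+1}$, using the forward model obtained by substituting (1) and (2) into (7). The function names are hypothetical, the default weights follow Section IV-A, and the constraints of (16) and the SQP solver itself are deliberately left out.

```python
import math

G = 9.81      # gravitational acceleration [m/s^2]
L_ARM = 0.3   # arm radius [m], Section IV-B

def forward_model(u, l=L_ARM, g=G):
    """f(u) of Eq. (9): model's apex point for u = (alpha, beta, theta_s, t_t)."""
    alpha, beta, theta_s, t_t = u
    theta_t = alpha / 6.0 * t_t**3 + beta / 2.0 * t_t**2 + theta_s   # (1)
    theta_dot_t = alpha / 2.0 * t_t**2 + beta * t_t                  # (2)
    t_up = l * theta_dot_t * math.cos(theta_t) / g
    x_m = l * math.cos(theta_t) - l * theta_dot_t * math.sin(theta_t) * t_up
    y_m = (l * math.sin(theta_t) + l * theta_dot_t * math.cos(theta_t) * t_up
           - 0.5 * g * t_up**2)
    return x_m, y_m

def objective(u_next, u_prev, virtual_goal,
              w_x=1.0, w_y=1.0, w_T=0.01, W_u=(0.1, 0.1, 0.1, 0.1)):
    """Objective J of Eq. (19); W_u holds the diagonal of the weighting matrix."""
    x_m, y_m = forward_model(u_next)
    alpha, beta = u_next[0], u_next[1]
    delta_u = [a - b for a, b in zip(u_next, u_prev)]                # (20)
    return (w_x * (virtual_goal[0] - x_m)**2
            + w_y * (virtual_goal[1] - y_m)**2
            + w_T * (-beta / alpha)**2
            + sum(w * d * d for w, d in zip(W_u, delta_u)))

# Illustrative candidate (assumed values)
u = (-3000.0, 200.0, -math.pi / 4, 0.1)
x_m, y_m = forward_model(u)
J_on_target = objective(u, u, (x_m, y_m))
J_off_target = objective(u, u, (x_m + 0.1, y_m))
```

When the candidate already hits the virtual goal and does not move, only the $w_T$ term remains. A function of this shape could be passed, together with the constraints of (16), to an SQP-type solver such as SciPy's `minimize` with `method="SLSQP"`; that pairing is an assumption here, not the implementation reported in the paper.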
C. Learning algorithm
Fig. 7 shows the flow of the iterative learning control
algorithm. The procedure is as follows.
1) We obtain the trajectory parameter $u^1$ for the first
throwing. We set the goal apex point $(x_d, y_d)$ as the
first virtual goal apex point $(\tilde{x}_d^1, \tilde{y}_d^1)$, and obtain the
trajectory parameter $u^1$ achieving the first virtual goal
apex point by solving (19) under (16). We input the
first trajectory parameter $u^1$ to the robot. The robot
throws the object. We measure the apex point $(x_c^1, y_c^1)$
by using the camera. In the SQP method, the solutions
depend on the initial value. Thus we give a variety of
initial values and obtain all trajectory parameters
given by them. We choose the trajectory parameter that
has the minimum value of the objective function as the
first trajectory parameter. In this procedure, we set the
4th term of the right-hand side of (19) to zero with $W_u = 0$.
2) We calculate the error $e^i$ between the apex point
$(x_c^i, y_c^i)$ measured with the camera and the goal apex point
$(x_d, y_d)$ by using (17). If the norm of the error is less
than the threshold $\varepsilon$, we assume that the
robot accomplishes the desired throwing task and stop
the learning; otherwise we proceed to step 3).
3) We derive the virtual goal apex point $(\tilde{x}_d^{i+1}, \tilde{y}_d^{i+1})$ of
the $(i+1)$th trial from the $i$th measured apex point by
using the virtual goal generator (18).
4) We solve the nonlinear optimization problem (19) by
taking into account the throwing model (16) and obtain
the trajectory parameter $u^{i+1}$.
5) Substituting the trajectory parameters $\alpha$, $\beta$, and $\theta_s$ of
$u^{i+1}$ into (1) and (2) yields the desired arm
trajectories, $\theta_d(t)$ and $\dot\theta_d(t)$. The robot is controlled with
a PD compensator to follow the desired trajectories,
and the object is thrown.
6) We measure several positions of the flying object with
the camera and estimate the object's ballistic
trajectory through least-squares approximation. Then
we calculate the apex point of the object trajectory
$(x_c^{i+1}, y_c^{i+1})$ from the estimated trajectory and return
to step 2).
Fig. 8. Throwing robot system (arm with paddle, object, and camera; arm radius $l$; measured apex point $(x_c, y_c)$)
IV. LEARNING CONTROL EXPERIMENT OF THROWING
A. Experimental condition
We verify that the proposed learning control remains
effective even when the goal apex point is switched partway
through learning. Let the first goal apex point be $(x_{d1}, y_{d1}) =
(0.1, 0.3)$ m, the second goal apex point be $(x_{d2}, y_{d2}) =
(0.15, 0.35)$ m, the weighting factors be $w_x = w_y = 1$,
$w_T = 0.01$, $W_u = \mathrm{diag}(0.1, 0.1, 0.1, 0.1)$, the parameter
of (18) be $k = 0.7$, and the threshold value of the error
norm be $\varepsilon = 5 \times 10^{-3}$ m. To inspect the stability of the
learning control after the error norm satisfies the termination
condition $\|e\| \le \varepsilon$, we continue the learning experiment 10
additional times.
B. Throwing robot system
Fig. 8 shows the throwing robot system used in the
experiment. The arm is driven by an AC motor with
a harmonic drive gear. The angle of the arm is measured
by an encoder. The arm is controlled with simple PD
compensators. The motor has a maximum torque $\tau_{\max} =
18$ N·m and a maximum rotational speed $\dot\theta_{\max} = 4\pi$ rad/s.
The radius of the arm is 0.3 m. To enable the robot
to keep the object on the arm's edge, we install a wall
0.03 m high on the arm's edge. The total moment of inertia,
including that of the arm and that of the rotor of the
actuator, is $I = 0.074$ kg·m². Several positions of
the flying object in the vertical plane are measured at a rate
of 1 kHz with a high-speed camera developed by
Hamamatsu Photonics K.K. These positional data
are used to estimate the object's apex point $(x_c, y_c)$. The
sampling period of the controller is 1 ms. We use a beanbag as
the thrown object, whose mass and diameter are 50 g and
0.05 m, respectively.
Fig. 9. Performance error at each trial (error $e$ [m] versus trial number $i$, with the threshold $\varepsilon$ and the goal switch marked)
Fig. 10. Observed apex point of the object at each trial (object positions $x_c$ and $y_c$ versus trial number $i$, with the goals $x_{d1}$, $x_{d2}$, $y_{d1}$, $y_{d2}$ and the goal switch marked)
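The apex estimation from camera samples (step 6 of the learning algorithm) can be sketched as a least-squares fit of a line to $x(t)$ and a parabola to $y(t)$, with the apex taken at the vertex of the parabola. The code below is one possible pure-Python realization under a point-mass, no-drag assumption; the function names and the synthetic flight data are hypothetical, and the authors' actual fitting procedure is not detailed here.

```python
def polyfit(ts, vs, degree):
    """Least-squares coefficients c[0] + c[1]*t + ... via the normal equations."""
    n = degree + 1
    A = [[sum(t**(j + k) for t in ts) for k in range(n)] for j in range(n)]
    b = [sum(v * t**j for t, v in zip(ts, vs)) for j in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - s) / A[r][r]
    return coeffs

def estimate_apex(ts, xs, ys):
    """Apex (x_c, y_c) of a ballistic trajectory fitted to sampled positions."""
    a0, a1 = polyfit(ts, xs, 1)          # x(t) = a0 + a1*t
    b0, b1, b2 = polyfit(ts, ys, 2)      # y(t) = b0 + b1*t + b2*t^2
    t_apex = -b1 / (2.0 * b2)            # vertex of the fitted parabola
    return a0 + a1 * t_apex, b0 + b1 * t_apex + b2 * t_apex**2

# Synthetic, noise-free flight sampled at 1 kHz for 0.1 s (assumed values)
g, x0, y0, vx, vy = 9.81, 0.25, 0.10, -0.8, 2.0
ts = [k * 1e-3 for k in range(100)]
xs = [x0 + vx * t for t in ts]
ys = [y0 + vy * t - 0.5 * g * t * t for t in ts]
x_c, y_c = estimate_apex(ts, xs, ys)
```

Because the fit uses the whole sampled arc, the apex can be estimated even when it lies outside the sampled interval, as in this synthetic example.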
C. Experimental results and discussion
1) Control stability: Fig. 9 shows the transition of the
error norm $\|e\|$. The error norm is reduced gradually by
repeating the learning, and the error norm of the 8th
throwing satisfies the termination condition, $\|e\| \le \varepsilon$.
After that, the error norms of the 9th to 17th throwings
continue to satisfy the termination condition.
The goal apex point is switched at the 18th throwing. Due
to the change of the goal, the error jumps. However, the
error is attenuated gradually by repeating the learning, and
the error norm of the 23rd throwing is less than $\varepsilon$ and satisfies
the termination condition. After that, the error norms of the
24th to 32nd throwings stay below $\varepsilon$.
Fig. 10 shows the apex point $(x_c^i, y_c^i)$ measured with the
high-speed camera. The measured apex point approaches
the goal apex point gradually as the throwing is repeated, even
though the goal apex point is switched at the 18th throwing.
Fig. 11 shows the norm of the change of the trajectory
parameter, $\|\Delta u\|$, which decreases gradually as the
throwing is repeated. This result indicates that the trajectory
parameter converges to a local optimal point even though
the goal apex point is switched. The convergence of the
trajectory parameter provides not only the convergence of
the error norm $\|e^i\|$ as shown in Fig. 9 but also the stability
of the learning control.
Fig. 11. Transition of $\Delta u$ at each trial ($\|\Delta u\|$ versus trial number $i$, with the goal switch marked)
Fig. 12. Transition of value of $\alpha$ and $\beta$ at each trial (trajectory parameter $\alpha$ versus trajectory parameter $\beta$; plotted numbers are trial numbers)
2) Constraints on actuator’s performance: Fig. 12 shows
the transition of the trajectory parameters $\alpha$ and $\beta$, which
relate to the performance of the actuator as described in
Section II-E. The plotted numbers indicate the iteration
number, and the allowable range of $\alpha$ and $\beta$, given by (12)
and (14), is shaded in gray. The figure shows that $\alpha$ and $\beta$
stay within the allowable range.
At the 19th throwing, after the goal apex is switched to a
higher point, the trajectory parameter changes significantly
and moves onto the borderline of the allowable range. This
indicates that the robot learns a trajectory parameter that
makes the best use of the actuator's performance.
For comparison, Fig. 13 shows the transition of $\alpha$ and $\beta$
for learning control without consideration of the actuator's
performance limits. Unlike in Fig. 12, the trajectory
parameter exceeds the allowable range of the actuator's
performance after the change of the goal apex. Therefore,
the arm fails to track the desired trajectory corresponding to
the trajectory parameter. Fig. 14 shows the transition of the
error norm. The learning fails, as seen from the result that
the error norm oscillates drastically and is not reduced after
the change of the goal apex.