ArticlePDF Available

Abstract and Figures

Following rapidly and precisely a moving target has become the core functionality in robotic systems for transportation, manufacturing, and medical devices. Among existing targeting following methods, vision-based tracking continues to thrive as one of the most popular, and is the closest method to human perception. However, the low sampling rate and the time delays of visual outputs fundamentally hinder real-time applications. In this paper, we show the potential of significant performance gain in vision-based target following when partial knowledge of the target dynamics is available. Specifically, we propose a new framework with Kalman filters and multi-rate model-based prediction (1) to reconstruct fast-sampled 3D target position and velocity data, and (2) to compensate the time delay. Along the path, we study the impact of slow sampling and the delay duration, and we experimentally verify different algorithms with a robot manipulator equipped with an eye-in-hand camera. The results show that our approach can achieve 95% error reduction rate compared with the commonly used visual servoing technique, when the target is moving with high speed and the visual measurements are slow and incapable of providing timely information.
Content may be subject to copyright.
International Joournal of Intelligent Robotics and Applications manuscript No.
(will be inserted by the editor)
Robotic Target Following with Slow and Delayed Visual
Hui Xiao ·Xu Chen
Received: date / Accepted: date
Abstract Following rapidly and precisely a moving tar-
get has become the core functionality in robotic sys-
tems for transportation, manufacturing and surgery.
Among existing targeting following methods, vision-
based tracking continues to thrive as one of the most
popular, and is the closest method to human percep-
tion. However, the low sampling rate and the time de-
lays of visual outputs fundamentally hinder real-time
applications. In this paper, we show the potential of
significant performance gain in vision-based target fol-
lowing when partial knowledge of the target dynamics
is available. Specifically, we propose a new framework
with Kalman filters and multi-rate model-based predic-
tion (1) to reconstruct fast-sampled 3D target position
and velocity data, and (2) to compensate the time delay.
Along the path, we study the impact of slow sampling
and the delay duration, and we experimentally verify
different algorithms with a robot manipulator equipped
with an eye-in-hand camera. The results show that our
approach can achieve 95% error reduction rate com-
pared with the commonly used visual servoing tech-
nique, when the target is moving with high speed and
the visual measurements are slow and incapable of pro-
viding timely information.
Keywords Visual servoing ·Kalman filter ·Delay
Hui Xiao, PhD Student
Mechanical Engineering, University of Washington
3900 E Stevens Way NE, Seattle, WA 98195
Xu Chen, Assistant Professor
Mechanical Engineering, University of Washington
3900 E Stevens Way NE, Seattle, WA 98195
corresponding author
1 Introduction
This paper considers the problem of controlling a robot
to follow a moving target based on only visual feed-
back. Such capability relates to high-impact robotic ap-
plications ranging from autonomous ground or aerial
vehicles that follow a leading target, to surgical robot
arms that track the motions of human organs, and to
robotic manipulators that perform pick-and-place tasks
in a highly dynamic environment (e.g., above the sea or
in a turbulence airflow). In those applications and the
like, control based on visual feedback is a prime target
for innovations. Indeed, vision sensors (e.g. cameras)
are becoming ubiquitous and non-contact imaging is
increasingly preferred in unstructured environments.
In typical vision-based target following, image pro-
cessing first extracts useful information from raw pixel
data, then visual servo control algorithms calculate the
motion commands of the robot. From a controls per-
spective, the goal of visual servoing is to minimize the
errors between the desired and measured visual fea-
tures. Based on how the visual features are defined,
model-based visual servoing approaches can be clas-
sified into position-based and image-based [8,9]. The
position-based visual servoing (PBVS) defines visual
features in the 3D space while the image-based visual
servoing (IBVS) defines visual features in the 2D im-
age space [8]. There are also extended approaches that
use more sophisticated visual features [10,6], or deploy
a hybrid system [20,11, 22] that combines advantages
of IBVS and PBVS while trying to avoid their short-
comings. The recent learning-based approaches directly
train neural network models to predict the control com-
mands [4,24].
The above mentioned approaches are based on the
assumption of a motionless target. A moving target,
2 Hui Xiao, Xu Chen
however, demands additional and careful considerations
in tracking. One design is to add an integral term in the
velocity control law to compensate the object-following
error [30]—effective only if the target moves at a con-
stant velocity. A more commonly used approach is to
estimate the target velocity by Kalman filters [14,19,
16,25], extended Kalman filters [21] or other filtering
techniques [5]. Then one can compute the robot mo-
tion commands, assuming the target is moving at the
estimated velocity.
The major barriers for implementing visual servoing
algorithms for moving target following are that the slow
sampling speed of cameras and the time delays induced
by time-consuming image processing. Most vision-based
robot controls are framed into systems of systems with
multiple sampling rates, where the robot motion con-
troller runs at a higher sampling rate (usually more
than 100Hz) under fast encoder feedback, and the vi-
sual servo loop runs at a lower sampling rate (e.g. 25Hz
for a standard vision system). To integrate the two sam-
pling rate, the simplest method is to upsample the vi-
sual feedbacks by repeating the latest data between the
slow sampling instances. However, such an approach
does not consider target dynamics and can still cause
large following errors. When the target moves with a
high speed, a large tracking error could cause the cam-
era to lose the target in its field of view and trigger a
tracking failure. Many works for tackling the above bar-
riers have been focused on developing advanced imag-
ing hardware or communication protocals [23,7, 13] that
can provide fast and timely image data. However, those
methods often require costly hardware and the speed
improvement is still limited by the time-consumming
image processing.
To reduce the effect of slow-sampling and time de-
lays, visual sensing dynamics compensation (VSDC)
[26,18] was developed and approved to be successful in
target tracking experiments [27]. It formulated a cus-
tomized dual-rate Kalman filter where the prediction
runs at a faster rate and the correction step runs at
a slower rate. On the contrary in this paper, we pro-
vide an approach where state estimation performs op-
timally at sensor speeds and post compensation recov-
ers a fast-sampled and delay-free signal. This is made
possible by multi-rate model-based prediction (MMP)
[12] and knowing (partially) the underlying physics of
the target dynamics. The contribution of this work is
the development of a new vision-based target follow-
ing algorithm that can track a high-speed target when
the vision system is slow and incapable of providing
timely information. Specifically, we provide four differ-
ent target dynamic models to cover a wide range of
motion trajectories, then use Kalman filters to estimate
the target 3D position and velocity sampled at a slow
rate. Based on polynomial models derived from the dy-
namic models, we extend the usages of MMPs to re-
construct a fast-sampled target position and velocity,
and to compensate the time delays of visual measure-
ments. Finally, we derive a target tracking controller
that substantially extends the traditional visual servo-
ing method, which only applies to follow a motionless
target. Compared to other visual servo control involv-
ing the robot dynamics, our controller design inherits
the algorithmic advantages in [8] and assures good con-
vergence properties.
The remainder of this paper is organized as follows.
Section 2 formulates the problem. Section 3 derives the
proposed target following algorithm. Section 4 describes
how to estimate the target motion using Kalman fil-
ters. Section 5 details the algorithm of MMP and the
procedure to use it for interpolation and delay compen-
sation. Numerical simulation and robot experiment are
reported in section 6 and section 7, respectively. Finally,
section 8 concludes the paper. A preliminary version of
this article was presented in [29]. This article is a sub-
stantially extended study that includes new simulations
and the full analysis results.
{A}A 3D coordinate system with origin at porint
AξBThe 3D pose of frame {B}with respect to frame
AvBThe 3D velocity of frame {B}with respect to
frame {A}.
The composition operator of relative poses, e.g.,
ARBThe rotation matrix corresponding to the rela-
tive pose AξB.
AtBThe translation vector corresponding to the rel-
ative pose AξB.
Rx(θ) The rotation matrix that corresponds to the 3D
rotating of θdegrees about xaxis.
A0The transpose of matrix A.
2 Problem Formulation
Coordinate system definition. The target following
system includes a moving target T, a follower robot R,
and a vision system Cstatically attached to the robot.
We define a static world coordinate system {W}and
attach moving coordinate systems {T}and {C}to the
target and to the vision system, respectively. Figure 1
shows the relationship between the coordinate systems,
Robotic Target Following with Slow and Delayed Visual Feedback 3
Fig. 1: Coordinate systems of target following problem
where {C}represents the desired position and orien-
tation of the vision system.
We now discuss the major elements in Figure 1:
The observed target pose in the camera frame CξT:
There are many approaches to estimate the target
pose from images. For example, with a calibrated
camera and a known 3D model of the target, CξT
can be obtained from the solution of Perspective-
n-Point (PnP) problem based on n3D-to-2D point
correspondences [17, 31]. With a stereo camera or an
RGB-D camera, it is possiable to estimated the 3D
pose of the target without using the 3D model of
the target as a prior [3,1]. There are also learning-
based pose estimation proposed recently [28]. In this
paper, we use the PnP algorithm for simplicity and
will assume an estimation Cˆ
ξTis available hereafter.
The camera pose in the world coordinate system
WξC: Because the camera is rigidly attached to the
robot, WξCcan be calculated by the forward kine-
matics of the robot.
The target pose in the world frame WξT: This is un-
known and is estimated by the vision system. Most
important, we consider the case when the target is
moving with unknown velocity WvT(t).
Data acquisition and time delay. In this pa-
per, we consider the case when the image-based pose
estimation Cˆ
ξThas a time delay of τseconds, and is
updated every Tss seconds. However, the controller of
the robot is running with a much shorter period of Tsf
seconds. We assume that the sampling rate of the imag-
ing system can be adjusted such that
Tss =LTsf ,(1)
where Lis an integer greater than one.
Control goal. Similar to the standard PBVS [8],
the vision-based target tracking problem can be formu-
lated as minimizing the visual feature error e(t) given
e(t) = s(t)s(t),(2)
where srepresents a set of visual features and sis the
desired value of s. For the target following problem, we
define the visual featuer as the pose difference between
the desired and the actual camera pose s,CξC. An
estimation of the visual feature is calculated from the
target pose estimation
where CξTis a user-defined value and Tˆ
ξCis the re-
verse of the pose estimation Cˆ
ξT. The goal of target
following is to control the camera velocity CvCsuch
that CξTwill converge to CξT, that is, sconverges to
3 PBVS with a Moving Target
In this paper, we parameterize the feature error as e=
CξC= (CtC, θu), where CtCand θare the transla-
tion vector and the rotation angles from {C}to {C},
respectively. uis a unit vector representing the corre-
sponding rotation axis. Let the relative instantaneous
velocity of the camera with respect to the target be
TvCR6×1,then the relationship between the time
derivative of visual feature error ˙
eand the relative ve-
locity TvCis
where Leis the feature Jacobian and Lθuis given by
2[u]×+ 1sincθ
Here, sinc(x) is the sinus cardinal such that xsinc(x) =
sin(x) and sinc(0)=1. [u]×is a skew-symmetric matrix
defined as
0 -uzuy
uz0 -ux
To ensure a decreasing feature error, we design a veloc-
ity controller
TvC= -λL1
ee= -λTRC0
4 Hui Xiao, Xu Chen
where λ > 0.Then from Eq. (7) and Eq. (4):
e= -λe.(8)
That is to say, the visual feature error will exponentially
decrease to zero. Note that L1
θuθu=θu. Eq. (7) can
then be simplified as
(TvC,l = -λ·TRCCtC
TvC,a = -λθ ·TRCu(9)
where TvC,l and TvC,a are the linear and the angular
parts of the velocity vector TvC, respectively.
Note that in the standard PBVS, the feature Ja-
cobian is defined with respect to the camera’s velocity
in the camera frame CvC[8], i.e., ˙
e=LeCvC. Such
a relationship is valid only if the target is motionless.
In this paper, the feature Jacobian is defined with re-
spect to the relative velocity TvCbetween two moving
frames, such that Eq. (4) is still valid when the target is
maneuvering. To compute the velocity command CvC,
we use a velocity transformation formula
Here, the target velocity TvTis obtained from Kalman
filters, as is described in the following section.
4 Pose and Velocity Estimation
While we can estimate the target pose from the percep-
tion system, the visual measurements are usually noisy.
To reduce the measurement noise and to estimate the
unknown target velocity, we build a set of Kalman fil-
ters that can fit to common robotic target motions.
4.1 Dynamic models of the moving target
We list first the widely used constant velocity model
and constant acceleration model for a moving target,
then derive a constant frequency model and a com-
pound constant frequency model.
4.1.1 Constant velocity model
When the acceleration is relatively small, we can as-
sume a constant speed in the state update equation and
model the acceleration as white noise, which formulates
to the discrete white noise acceleration (DWNA) model
[2]. Define the system state as x= (η, ˙η). Then the sys-
tem state-space model is given by
x(k+ 1) = 1Tss
0 1 x(k) + 1
Tss v(k),(11)
z(k+ 1) = 1 0 x(k+ 1) + ω(k+ 1),(12)
where v(k) is the process noise and ω(k+ 1) is the
measurement noise.
4.1.2 Constant acceleration model
When the acceleration is nearly constant, we use dis-
crete Wiener process acceleration (DWPA) model that
assumes a constant acceleration within each update pe-
riod. Defining the system state as x= (η, ˙η, ¨η), the sys-
tem model is given as
x(k+ 1) =
1Tss 1
0 1 Tss
0 0 1
x(k) +
z(k+ 1) = 100x(k+ 1) + ω(k+ 1).(14)
4.1.3 Constant frequency (CF) model
When the movement is approximately periodic, both
constant velocity model and constant acceleration model
will fail to timely capture the velocity and acceleration
changes. This model mismatch is significant when the
moving frequency is fast. The vast amount of periodic
tasks can be decomposed to sinusodial motions from
the Fourier theory. Consider η(t) = asin(Ωt +ϕ) and
its derivative ˙η(t) = aΩ cos(t +ϕ). Let the system
state be x= (η, ˙η). Note that the second derivative
¨η(t) = aΩ2sin(Ωt+ϕ) = 2η(t). Then we can write
the continuous-time state-space model as
dtx(t) = 0 1
Discretizing Eq. (15) at a sampling time Tss , we have
x(k+ 1)
= eAcTss x(k) = cos(ΩTss ) sin(ΩTss )/Ω
-sin(ΩTss ) cos(ΩTss )x(k).(16)
When the sinusoidal signal has a bias term, i.e., η(t) =
asin(Ωt +ϕ) + b, we can augment the system to in-
clude the bias: x= (η, ˙η, b). The corresponding ex-
tended model with noise is
x(k+ 1)
cos(ΩTss ) sin(ΩTss )/Ω 0
-sin(ΩTss ) cos(ΩTss ) 0
0 0 1
x(k) +
z(k+ 1) = 101x(k+ 1) + ω(k+ 1).(18)
Robotic Target Following with Slow and Delayed Visual Feedback 5
4.1.4 Compound constant frequency (CCF) model
We consider here the case when the core movement is
a mixture of multiple sinusoids, i.e.,
η(t) =
aisin(it+ϕi)+b, Ωi6=j6= 0; i6=j(19)
In order to model the kinematics of the above, let x=
(η1,˙η1, . . . , ηN,˙ηN, b). The state is defined such that the
i-th pair (ηi,˙ηi) corresponds to the i-th frequency com-
ponent, and has the same discrete model as in Eq. (16).
Each (ηi,˙ηi) is independent from other pairs, thus we
have the following state-space model:
x(k+ 1) =
x(k) +
z(k+ 1) = 1 0 1 0 · · · 0 1 x(k+ 1) + ω(k+ 1) (21)
where Aiis a 2 ×2 matrix defined as
Ai=cos(iTss) sin(iTss )/Ωi
-isin(iTss) cos(iTss ), i = 1,· · · , N.
4.2 Linear position and velocity estimation
We decouple the 3D target motions to x,yand zaxes
and estimate the components independently. As a re-
sult, the system order can be reduced, and we only need
to consider the problem of estimating the position and
velocity in one generic axis. Recall that a relative pose
measurement Cˆ
ξTsampled at 1/Tss Hz is available from
the vision system. Combined with the known camera
pose WξC, the target pose measurements in the world
coordinate system can be obtained as WξT=WξC
ξT.Then the position measurement WtT= (tx, ty, tz)
can be extracted from WξT. For the motion in each
axis, we choose the appropriate dynamic model based
on an assessment of the motion type.
x(k+ 1) = F x(x) + v(k) (23)
z(k+ 1) = Hx(k+ 1) + w(k) (24)
Denote the process noise covariance as Qand the mea-
surement noise covariance as R. Kalman filter predic-
tion and update steps, Eqs. (25-31), are applied to filter
the noisy measurement and estimate the position and
velocity for each axis. In the prediction step, we first
compute the estimated state x(k+ 1|k) and state co-
variance P(k+ 1|k).
ˆx(k+ 1|k) = Fˆx(k|k) (25)
P(k+ 1|k) = F P (k|k)F0+Q(26)
Here, (k+1|k) indicates estimated variables given mea-
surements up to and includeing z(k). Similarly, (k+
1|k+1) indicates estimates given measurement include-
ing z(k+1). The measurement prediction ˆz(k+1|k) and
the corresponding prediction covariance S(k+ 1) are
ˆz(k+ 1|k) = Hˆx(k+ 1|k) (27)
S(k+ 1) = R+HP (k+ 1|k)H0(28)
In the correction step, we compute the Kalman filter
gain K(k+ 1) and the updated state estimation and
covairance using latest measurement z(k+ 1).
W(k+ 1) = P(k+ 1|k)H0S(k+ 1)1(29)
ˆx(k+ 1|k+ 1) = ˆx(k+ 1|k)
+W(k+ 1)(z(k+ 1) ˆz(k+ 1|k))
P(k+ 1|k+ 1) =
P(k+ 1|k)W(k+ 1)S(k+ 1)W(k+ 1)0(31)
4.3 Angular position and velocity estimation
In order to build Kalman filters for estimating the ro-
tation angles and velocities, we first extract the mea-
sured rotation matrix WRTfrom WξT, then convert it
to Euler angles (α, β, γ ). Note that there are twelve dif-
ferent representations of the Euler angles and any one
of them can be used. Here we use the roll-pitch-yaw rep-
resentation (i.e., with the rotation order “ZYX”) that
is popular for ships, aircraft and vehicles. Similar to the
linear position estimation, we view each Euler angle as
an individual component and apply Eqs. (25-31) with
the appropriate dynamic model. Note that the physi-
cal meanings of the angular velocity estimates ( ˙α, ˙
β, ˙γ)
depend on the choice of Euler representation. For ex-
ample, in the roll-pitch-yaw representation, ˙
βand ˙γ
are the angular velocities about the rotated yand x-
axes instead of the original ones. We obtain the angu-
lar velocity with respect to the world coordinate system
WvT,a = (ωx, ωy, ωz) by coordinate conversion:
WvT,a = ˙α
y+Ry(β) ˙γ
Expanding Eq. (32), we have
ωx= ˙γcos(α) cos(β)˙
βcos(α) + ˙γcos(β) sin(α)
ωz= ˙α˙γsin(β)
6 Hui Xiao, Xu Chen
5 Interpolation and Delay Compensation
The target position and velocity estimated from the
Kalman filter are only sampled at fss = 1/Tss Hz.
They also inherit the time delays from the visual mea-
surements. In this section, we build a multi-rate model-
based predictor to construct new data points from the
slow sampled Kalman filter outputs. Substantially ex-
tending our past theoretical results [12], the proposed
algorithm also compensates the time delays.
5.1 Multi-rate model-based prediction revisited
Consider a continuous-time signal dc(t) and its two dis-
cretized sequences df[n] and ds[n] with sampling pe-
riods of Tsf and Tss =LTsf ), respectively. That is,
df[Ln] = ds[n] = dc(nTss).If there exists a polynomial
model A(z1) = 1 + a1z1+· · · +amzm(am6= 0)
such that A(z1)df[n] = 0 at the steady state, then
df[nL +k] can be expressed with d[n] as
df[nL +k] = Wk(z1)d[n]
=wk,0d[n] + wk,1d[n1] + ···+wk ,m1d[nm+ 1].
The coefficients of Wk(z1) can be solved from a sys-
tem of linear equations
where MkR[L(m1)+k]×[l(m1)+k]and is given by the
polynomial model A(z1) (see [12] for details).
5.2 Polynomial models of the target movement
Given a discrete sequence d[n], its polynomial model (if
exists) is defined by A(z1) = 1 + a1z1+· · · +amzm
and A(z1)d[n]0 when n→ ∞. We show that the
polynomial model of d[n] can be easily obtained from
its Z-transform.
Lemma 1 If a discrete sequence d[n]has Z-transform
Z{d[n]}=B(z1)/A(z1), then the polynomial model
of d[n]is A(z1).
Proof Suppose a discrete linear time-invariant system
with a transfer function B(z1)/A(z1). From the prop-
erty of the transfer function, d[n] is equivalent to the
Dynamic model Position Velocity
DWNA (1 z1)21z1
DWPA (1 z1)3(1 z1)2
CF 1 2 cos(ΩTsf )z1+z2
i=1 12 cos(iTsf )z1+z2
Table 1: Polynomial models of the position and velocity
profiles for different dynamic models
Fig. 2: Illustration of the interpolation and delay com-
pensation procedure. Here, L=Tss /Tsf = 3.
impulse response of the system. That is, A(z1)d[n] =
B(z1)δ[n],where δ[n] is the delta impulse signal. Then
we have A(z1)d[n] = 0 for n>nb, where nbis the or-
der of polynomial B(z1).
For example, for the constant frequency model, the
ideal velocity profile (sampled at 1/Tsf Hz) is a pure
sinusoid df[n] = aΩ cos(ΩTsf n+ϕ), whose polynomial
model can be found as A(z1) = 1 2 cos(Tsf )z1+
z2by taking the Z-transform of df[n]. Table 1 summa-
rizes the polynomial models of the position and velocity
profiles for each dynamic model mentioned in section
5.3 Interpolation and delay compensation algorithm
As is shown in subsection 5.1, the MMP can construct a
connection between the historical slow-sampled data set
(i.e., ds[n], ds[n1] and so on) and a future data point
df[nL +k] in the fast-sampled sequence. This connec-
tion allows to predict the position and velocity sampled
at 1/Tsf Hz using Kalman filter outputs. Furthermore,
by properly adjusting the kvalue, the measurement
delay time τcan be compensated. In more details, at
each discrete step, we first calculate the index kcsuch
that nL +kccorresponds to the current time tc(see
Figure 2). Then we calculate the length of prediction
steps kp= round(τ/Tsf ) needed to compensate the de-
lay time τ. Finally we use k=kc+kpand apply Eq.
(34). The algorithm to compute the fast-sampled data
dfwith delay compensated is summarized in Algorithm
A simulated interpolation and delay compensation
result is shown in Figure 3. In the simulation, the fast
and slow sampling times are Tsf = 8ms and Tss =
Robotic Target Following with Slow and Delayed Visual Feedback 7
Algorithm 1: Interpolation and delay com-
pensation (one circle)
input : most recent ds[n], delay time τand current
time tc
output: ˆ
1if ds[n]has a new value then
2update the data storage that keeps the most m
recent ds[n] values {ds[n],··· , ds[nm+ 1]};
3tlast tc;
4end if
5kc(tctlast)/Tsf ;
6kpround(τ/Tsf );
8From Eq. (35), solve wk,0,··· , wk,m1;
9From Eq.(34), compute ˆ
df[n] ;
0 0.5 1 1.5 2 2.5 3
Time (seconds)
Fig. 3: Simulated interpolation and delay compensation
48ms, respectively. Thus L=Tss/Tsf = 6. The polyno-
mial model used here corresponds to the CCF dynamic
model with two frequencies 1= 2π×1.2 rad/s and
2= 2π×3.1 rad/s. Based on Table 1, A(z1) can be
derived as 1 3.97z1+ 5.94z23.97z3+z4.The
results shows that MMP can accurately predict the in-
tersample data and compensate the measurement delay
(56ms in this case).
6 Numerical Simulation
In this section, we simulate the performances of the
proposed target following algorithm under various sce-
narios. The simulated algorithm combines the PBVS
considering a moving target (see Section 3), Kalman
filters (see Section 4) and the interpolation and delay
compensation using MMP (see Section 5). An overview
of the system structure is shown in Figure 4. In the sim-
ulation, the target moves following a fixed 2D track as
is showing in the Figure 5. The target started from posi-
tion (0,0) and then went through the full circle multiple
times. Here, the target pose measurement from the pose
estimation block is simulated by adding a zero-mean
Gaussian noise to the true target pose. The noises are
independently distributed across each axis and have a
standard derivation of 0.01 meters. To exclude the ran-
domness in each simulation runs, we did 50 independent
simulation runs for each scenario and use the averaged
following error for comparison.
In the first scenario, the robot controller loop is run-
ning at 125Hz (i.e., Tsf = 8 micro seconds). However,
the pose estimation from the vision system is running
eight times slower and is delayed by 64 micro seconds.
Figure 6 shows the tracking error statistics for the first
four laps. Here, we also provide simulation results when
using the basic PBVS (see [8]) or our modified PBVS
(see Section 3) but without interpolation and delay
compensation. In the first lap, we observed that the
following error reduced from its initial value to a signifi-
cant lower level. This can be seen in the error bars where
the maximum errors in the first lap all start with around
3 meters and the minimum errors are below 0.1 meters.
In the following laps, the simulation all transient to a
steady-state where the following errors are constrained
in a fixed range. The average tracking errors using the
basic PBVS is 87 millimeters in the steady-state, which
is significant considering that the track dimension is
only 320 millimeters by 160 millimeters. Implementing
our improved PBVS which considered a moving target
could reduce the following error. However, the error is
still large due to the slow-sampling and delays in visual
feedback. Finally, using our proposed approach could
reduce the errors to a small range around 3 millime-
ters, yielding a 95% error reduction compared to the
basic PBVS.
In order to analyze the impact of slow-sampling and
time delay, we designed two sets of scenarios where the
target all follow the same track as is shown in Figure 5.
In the first set of scenarios, we fixed the system’s sam-
pling speed and the time delays but varied the target
moving speed. The steady-state following errors for the
first set of scenarios are shown in Figure 7. Not surpris-
ingly, we observe that the object following performance
downgrades as the target speed increases. When the
target moves 12 times faster than the base level, we see
that our improved PBVS actually induced more error
compared with the baseline, while MMP techniques can
still achieve a 80% error reduction rate.
In the second set of scenarios, we fixed the system’s
sampling speed and the target moving speed but varied
the time delays of the vision system. As is shown in Fig-
ure 8, the following performance will downgrade as the
delay time increase when we applied no delay compen-
sation. With Kalman filters and MMP, the approached
algorithm could still constrain the following error to a
small level when delay increases.
8 Hui Xiao, Xu Chen
PBVS with
Moving Target
and Delay
Kalman Filters Pose
image data
pose &
target pose &
desired robot
pose relative to
fast sampled encoder feedback
commands joint current
commands moving target
Fig. 4: The overview of the proposed target following algorithm. The dashed lines represent signals that are updated
every Tss seconds while the solid lines represent fast-sampled signals that will update every Tsf seconds.
0 0.05 0.1 0.15 0.2 0.25 0.3
x (meters)
y (meters)
Fig. 5: The target moving trajectory used in the simu-
Lap 1 Lap 2 Lap 3 Lap 4
Following error (meters)
8.7e-02 8.7e-02 8.7e-02
4.6e-02 4.6e-02 4.5e-02
3.4e-03 3.3e-03 3.3e-03
Basic PBVS
PBVS with velocity compensation
Our approach
Fig. 6: Comparison of the following errors in each lap.
Each error bar indicates the maximun, minimun and
mean errors in a lap.
7 Experiment Results
The proposed target following algorithm was tested on
a dual-arm robot, as is shown in Figure 9. The left
arm in the figure holds a target and the right arm has
1x 2x 6x 8x 10x 12x
Speed Multiplier
Following error (meters)
8.7e-02 1.2e-01 1.3e-01 1.3e-01 1.3e-01 1.2e-01
8.7e-02 1.1e-01 1.4e-01 2.0e-01
3.2e-03 4.3e-03
1.2e-02 1.6e-02 2.0e-02 2.5e-02
Basic PBVS
PBVS with velocity compensation
Our approach
Fig. 7: Comparison of the following error when the
target is moving at different levels of speed. At the
base level, target takes 2 seconds to complete one lap.
Each error bar indicates the maximun, minimun and
the mean values of the errors in the last lap.
0 32 64 96 128 160
Delay (micro seconds)
Following error (meters)
8.7e-02 9.8e-02 1.1e-01 1.2e-01 1.3e-01 1.4e-01
4.5e-02 6.1e-02 7.5e-02 8.9e-02
3.3e-03 3.5e-03 3.3e-03 3.5e-03 3.3e-03 3.6e-03
Basic PBVS
PBVS with velocity compensation
Our approach
Fig. 8: Comparison of the following error when the vi-
sion system has different levels of time dealys. Each er-
ror bar indicates the maximun, minimun and the mean
values of the errors in the last lap.
a camera mounted to the end-effector. The world co-
ordinate system {W}is attached to the base of the
right robot arm. In the experiment, we moved the tar-
get with a 2-D circle trajectory that is parallel to the
Robotic Target Following with Slow and Delayed Visual Feedback 9
Tar get
Cam er a
Fig. 9: Dual-arm robot used to test the target following
W-x-yplane. Specifically, the linear xand yvelocities of
the target is controlled to be sinusoidal with π/2 phase
difference. During the target following experiment, the
end-effector’s position and velocity of the left robot are
assumed to be in the 3D space and are unknown to
the right robot, but is used to calculate the following
errors at a high sampling speed for analysis purposes.
Note that the visual following errors are represented re-
spected to the table coordinate system (see Figure 9),
which has a 45 degree orientation difference from the
robot arm base.
Sampling speed and measurement delay. The
camera used is Mako G192C from Allied Vision Tech-
nology which has 60fps at a full resolution of 1600×1200
pixels. ArUco markers [15] are used for estimating the
target pose in the camera frame. Due to the heavy com-
putation cost of the marker detecting and pose estima-
tion process, the target pose measurements can only
be updated at a maximum rate of about 20Hz. On the
other hand, the robot servo loop is running at a high
sampling rate of 125Hz. To make the fast and slow sam-
pling rate almost integer multiples, we triggered the
camera to acquire images at a fixed rate of 17.85fps,
then L=Tss/Tsf = 7. The measurement delays can be
monitored by adding time stamps to the images when
captured. In our experiment setup, the measurement
delays were about 50ms.
Two scenarios were tested using the proposed tar-
get following algorithm. In the first scenario the tar-
get moved at 0.5Hz, while in the second scenario, the
target moved at a higher frequency of 1Hz. In both
scenarios, the CF model was used for linear xand y
axes for the Kalman filtering; DWNA model was used
for other axes. Then the appropriate polynomial mod-
els were chosen from Table 1 and MMPs were applied
for all axes. Note that if a time-invariant polynomial
Fig. 10: Moving trajectories of the target and robot in
model is assumed, then the MMP parameters w0
k,is can
be calculated off-line to save the on-line computation
Figure 11a shows the target following errors of the
linear x,yand zaxes in the first scenario. Here, we
tested four levels of target following algorithms. The
first level only uses the basic PBVS algorithm which is
designed based on a static target assumption [8]. The
second level considered a moving target, and the esti-
mated target positions and velocities from the Kalman
filters are directly used. The third level not only consid-
ered a moving target but also compensated the delays
using MMP. In order to show the importance of fast
sampling rate of measurement, we down-sampled the
MMP outputs in the third level to the slow sampling
rate 1/Tss Hz. Finally, in the fourth level, both Kalman
filters and MMP are used without down-sampling. Statis-
tic result of the target following error are shown in Fig-
ure 11b, where the 3σvalue plus the absolute mean of
the errors are computed for comparison. The same per-
formance improvements can also be observed by com-
paring the moving trajectories of the target and robot
(see Figure 10).
One can clearly observe the performance improve-
ment when the target velocity estimation, delay com-
pensation, and interpolation techniques are added to
the algorithm.
For the second scenario, the following error of lin-
ear x,yand z-axes are shown in Figure 11c and Figure
11d. The same four levels of target following algorithm
are tested. Different from the first scenario, the first
three levels actually failed to reduce the following er-
ror. This is because when the target moves at a higher
frequency, the consequences of measurement delay and
slow measurement sampling become more significant.
10 Hui Xiao, Xu Chen
As a result, it requires both delay compensation and
interpolation to reduce the following error.
8 Conclusion and Future Work
This paper proposed a method to track a moving target
with fast dynamics with only slow sampled and delayed
visual feedback. We discussed a general method to track
3D target movement with partially known target veloc-
ity. The proposed approach estimates target position
and velocity with Kalman filters and builds multi-rate
model-based predictions to reconstruct fast motion pro-
files and to compensate the time delay. The result from
a dual-arm robot experiment verified the effectiveness
of the proposed algorithm and demonstrated the danger
of slow-sampled and delay visual feedback.
The current proposed approach assumes that the
target follows a time-invariant model during tracking.
Adaptive filtering techniques can be further investigated
for applications with time-variant target models. An-
other future work is extended robustness in presence of
more uncertainties in the interpolation and delay com-
1. Aldoma, A., Marton, Z.C., Tombari, F., Wohlkinger, W.,
Potthast, C., Zeisl, B., Rusu, R.B., Gedikli, S., Vincze,
M.: Tutorial: Point cloud library: Three-dimensional ob-
ject recognition and 6 dof pose estimation. IEEE
Robotics & Automation Magazine 19(3), 80–91 (2012)
2. Bar-Shalom, Y., Li, X.R., Kirubarajan, T.: Estimation
with applications to tracking and navigation: theory al-
gorithms and software. John Wiley & Sons (2004)
3. Barrois, B., W¨ohler, C.: 3d pose estimation of vehicles us-
ing stereo camera. Encyclopedia of Sustainability Science
and Technology pp. 10589–10612 (2012)
4. Bateux, Q., Marchand, E., Leitner, J., Chaumette, F.,
Corke, P.: Training deep neural networks for visual servo-
ing. In: 2018 IEEE International Conference on Robotics
and Automation (ICRA), pp. 3307–3314 (2018)
5. Bensalah, F., Chaumette, F.: Compensation of abrupt
motion changes in target tracking by visual servoing. In:
Proceedings 1995 IEEE/RSJ International Conference on
Intelligent Robots and Systems. Human Robot Interac-
tion and Cooperative Robots, vol. 1, pp. 181–187. IEEE
6. Casta˜no, A., Hutchinson, S.: Visual compliance: Task-
directed visual servo control. IEEE transactions on
Robotics and Automation 10(3), 334–342 (1994)
7. Chalimbaud, P., Berry, F.: Embedded active vision sys-
tem based on an fpga architecture. EURASIP Journal
on Embedded Systems 2007(1), 26–26 (2007)
8. Chaumette, F., Hutchinson, S.: Visual servo control. I.
basic approaches. IEEE Robotics & Automation Maga-
zine 13(4), 82–90 (2006)
9. Chaumette, F., Hutchinson, S.: Visual servo control. ii.
advanced approaches [tutorial]. IEEE Robotics & Au-
tomation Magazine 14(1), 109–118 (2007)
10. Chaumette, F., Rives, P., Espiau, B.: Classification and
realization of the different vision-based tasks. In: Vi-
sual Servoing: Real-Time Control of Robot Manipulators
Based on Visual Sensory Feedback, pp. 199–228. World
Scientific (1993)
11. Chen, J., Dawson, D.M., Dixon, W.E., Behal, A.:
Adaptive homography-based visual servo tracking for a
fixed camera configuration with a camera-in-hand exten-
sion. IEEE Transactions on Control Systems Technology
13(5), 814–825 (2005)
12. Chen, X., Xiao, H.: Multirate forward-model distur-
bance observer for feedback regulation beyond nyquist
frequency. Systems & Control Letters 94, 181–188 (2016)
13. Delbr¨uck, T., Linares-Barranco, B., Culurciello, E.,
Posch, C.: Activity-driven, event-based vision sensors. In:
Proceedings of 2010 IEEE International Symposium on
Circuits and Systems, pp. 2426–2429. IEEE (2010)
14. Firouzi, H., Najjaran, H.: Real-time monocular vision-
based object tracking with object distance and motion
estimation. In: 2010 IEEE/ASME International Confer-
ence on Advanced Intelligent Mechatronics, pp. 987–992.
IEEE (2010)
15. Garrido-Jurado, S., Munoz-Salinas, R., Madrid-Cuevas,
F.J., Medina-Carnicer, R.: Generation of fiducial marker
dictionaries using mixed integer linear programming.
Pattern Recognition 51, 481–491 (2016)
16. Janabi-Sharifi, F., Marey, M.: A kalman-filter-based
method for pose estimation in visual servoing. IEEE
transactions on Robotics 26(5), 939–947 (2010)
17. Lepetit, V., Moreno-Noguer, F., Fua, P.: Epnp: An ac-
curate O(n) solution to the PnP problem. International
journal of computer vision 81(2), 155 (2009)
18. Lin, C.Y., Wang, C., Tomizuka, M.: Pose estimation in in-
dustrial machine vision systems under sensing dynamics:
A statistical learning approach. In: 2014 IEEE Interna-
tional Conference on Robotics and Automation (ICRA),
pp. 4436–4442. IEEE (2014)
19. Luo, R.C., Chou, S.C., Yang, X.Y., Peng, N.: Hybrid
eye-to-hand and eye-in-hand visual servo system for par-
allel robot conveyor object tracking and fetching. In:
IECON 2014-40th Annual Conference of the IEEE In-
dustrial Electronics Society, pp. 2558–2563. IEEE (2014)
20. Malis, E., Chaumette, F., Boudet, S.: 2 1/2 d visual ser-
voing. IEEE Transactions on Robotics and Automation
15(2), 238–250 (1999)
21. Mohebbi, A., Keshmiri, M., Xie, W.F.: An eye-in-hand
stereo visual servoing for tracking and catching moving
objects. In: Proceedings of the 33rd Chinese Control Con-
ference, pp. 8570–8577. IEEE (2014)
22. Morel, G., Liebezeit, T., Szewczyk, J., Boudet, S., Pot, J.:
Explicit incorporation of 2d constraints in vision based
control of robot manipulators. In: Experimental Robotics
VI, pp. 99–108. Springer (2000)
23. Nakabo, Y., Ishikawa, M., Toyoda, H., Mizuno, S.: 1 ms
column parallel vision system and its application of high
speed target tracking. In: Proceedings 2000 ICRA. Mil-
lennium Conference. IEEE International Conference on
Robotics and Automation. Symposia Proceedings (Cat.
No. 00CH37065), vol. 1, pp. 650–655. IEEE (2000)
24. Sadeghi, F.: Divis: Domain invariant visual servoing for
collision-free goal reaching. In: Robotics: Science and
Systems (RSS) (2019)
25. Siradjuddin, I., Behera, L., McGinnity, T.M., Coleman,
S.: A position based visual tracking system for a 7
dof robot manipulator using a kinect camera. In: The
2012 international joint conference on neural networks
(IJCNN), pp. 1–7. IEEE (2012)
Robotic Target Following with Slow and Delayed Visual Feedback 11
(a) Following error when target moves at 0.5Hz.
no tracking level 1 level 2 level 3 level 4
level of the tracking algorithm
Following error (meters)
error in x axis
error in y axis
error in z axis
(b) Following error statistics when target moves at 0.5Hz.
(c) Following error when target moves at 1Hz.
no tracking level 1 level 2 level 3 level 4
level of the tracking algorithm
Following error (meters)
error in x axis
error in y axis
error in z axis
(d) Following error statistics when target moves at 1Hz.
Fig. 11: Following errors for different levels of controls in the experiment.
26. Wang, C., Lin, C.Y., Tomizuka, M.: Visual servoing con-
sidering sensing dynamics and robot dynamics. IFAC
Proceedings Volumes 46(5), 45–52 (2013). 6th IFAC
Symposium on Mechatronic Systems
27. Wang, C., Lin, C.Y., Tomizuka, M.: Design of kinematic
controller for real-time vision guided robot manipulators.
In: 2014 IEEE International Conference on Robotics and
Automation (ICRA), pp. 4141–4146. IEEE (2014)
28. Xiang, Y., Schmidt, T., Narayanan, V., Fox, D.:
PoseCNN: A convolutional neural network for 6d object
pose estimation in cluttered scenes. In: Robotics: Science
and Systems (RSS) (2018)
29. Xiao, H., Chen, X.: Following fast-dynamic targets with
only slow and delayed visual feedback: A kalman filter
and model-based prediction approach. In: Dynamic Sys-
tems and Control Conference (2019)
30. Zhang, M., Liu, X., Xu, D., Cao, Z., Yu, J.: Vision-based
target-following guider for mobile robot. IEEE Transac-
tions on Industrial Electronics (2019)
31. Zheng, Y., Kuang, Y., Sugimoto, S., Astrom, K., Oku-
tomi, M.: Revisiting the PnP problem: A fast, general
and optimal solution. In: Proceedings of the IEEE Inter-
national Conference on Computer Vision, pp. 2344–2351
... where . In Expressions (21) and (22), the derivatives of the angles are used, which depend on the angular velocities measured by the onboard sensors of the AUV-follower, using the following expression [22]: ...
... Equations (20)- (22) are used to estimate the speeds and accelerations of the target points relative to the AUV-follower in the time intervals between updating the data about the current position and the orientation of the AUV-leader. ...
... It can be seen from (23) that the input signal for the model is the acceleration vector of the target point, obtained based on Expression (22). ...
Full-text available
The paper proposes a new method for the synthesis of spatial motion control systems of the AUV-leader and a group of AUV-followers during their cooperative movement in a desired formation. This system allows the provision of an accurate positioning of the followers relative to the leader using information received with a low frequency only from their onboard video cameras. To improve the accuracy of the created system it proposes a method of estimation of the parameters of the movement of the AUV-leader (its speeds and accelerations) to predict its movement relative to the AUV-follower in the time intervals between the updates of information received from video cameras.
... Parameters that are good indicators of slip have been identified in this paper, work can be done on how these parameters can be utilized to control the slip. Additionally, the sampling delay and sampling rate discrepancy between the tactile sensor and camera systems may result in delayed implementation of force control strategies, as observed in previous work [28], [29]. Further research can be done to improve the synchronization of the sensor data from two input sensors. ...
Detection of slip during object grasping and manipulation plays a vital role in object handling. Existing solutions largely depend on visual information to devise a strategy for grasping. Nonetheless, in order to achieve proficiency akin to humans and achieve consistent grasping and manipulation of unfamiliar objects, the incorporation of artificial tactile sensing has become a necessity in robotic systems. In this work, we propose a novel physics-informed, data-driven method to detect slip continuously in real time. The GelSight Mini, an optical tactile sensor, is mounted on custom grippers to acquire tactile readings. Our work leverages the inhomogeneity of tactile sensor readings during slip events to develop distinctive features and formulates slip detection as a classification problem. To evaluate our approach, we test multiple data-driven models on 10 common objects under different loading conditions, textures, and materials. Our results show that the best classification algorithm achieves an average accuracy of 99\%. We demonstrate the application of this work in a dynamic robotic manipulation task in which real-time slip detection and prevention algorithm is implemented.
... one of which is the motion target tracking [5][6][7]. Motion target tracking means that the servo system acquires the position information of the target through the vision sensor (usually the camera), and the servo controller generates control commands to control the servo mechanism to follow the target [8]. A typical servo system for motion target tracking consists of a camera and a robotic manipulator. ...
Conference Paper
Full-text available
A typical vision servo system for motion target tracking consists of a robotic manipulator and a camera mounted at its end. The image acquired by the camera is processed by a vision algorithm to obtain the position of the motion target in 3D space, whereby the servo controller controls the robotic manipulator to track the motion of the target and keep it in the camera’s field of view at all times. In this paper, we proposed an evaluation indicator to evaluate such tracking tasks, and proposed an adaptive damped least-square algorithm to avoid singularities during the motion of the robotic arm. An experimental platform consist of ZED binocular camera and UR5 robotic arm is built to conduct experiments on singularity avoidance and target tracking, verifying that the visual servo system with the singularity avoidance algorithm can achieve a good tracking result.
... Visual servoing still has several unsolved issues, as is the case with vision system latency (Yu et al. (2017); Xiao and Chen (2020)), i.e. the total time required to obtain the visual information. Nowadays, the main cause of latency in vision systems is due to image processing algorithms, which are often computationally intensive, limiting the refresh rate of the commands provided to the robot system (Castelli et al. (2017); Zhang et al. (2017)). ...
This paper develops the application of the Dual Rate Dual Sampling Reference Filtering Control Strategy to 2D and 3D visual feedback control. This strategy allows to overcome the problem of sensor latency and to address the problem of control task failure due to visual features leaving the camera field of view. In particular, a Dual Rate Kalman Filter is used to generate inter-sample estimations of the visual features to deal with the problem of vision sensor latency, whereas a Dual Rate Extended Kalman Filter Smoother is used to generate more convenient visual features trajectories in the image plane. Both 2D and 3D visual feedback control approaches are widely analyzed throughout the paper, as well as the overall system performance using different visual feedback controllers, providing a set of results that highlight t he improvements in terms of solution reachability, robustness, and time domain response. The proposed control strategy has been validated on an industrial system with hard real-time limitations, consisting of a 6 DOF industrial manipulator, a 5 MP camera, and a PLC as controller.
Conference Paper
Full-text available
Although visual feedback has enabled a wide range of robotic capabilities such as autonomous navigation and robotic surgery, low sampling rate and time delays of visual outputs continue to hinder real-time applications. When partial knowledge of the target dynamics is available, however, we show the potential of significant performance gain in vision-based target following. Specifically, we propose a new framework with Kalman filters and multirate model-based prediction (1) to reconstruct fast-sampled 3D target position and velocity data, and (2) to compensate the time delay for general robotic motion profiles. Along the path, we study the impact of modeling choices and the delay duration, build simulation tools, and experimentally verify different algorithms with a robot manipulator equipped with an eye-in-hand camera. The results show that the robot can track a moving target with fast dynamics even if the visual measurements are slow and incapable of providing timely information.
Conference Paper
Full-text available
We present a deep neural network-based method to perform high-precision, robust and real-time 6 DOF visual servoing. The paper describes how to create a dataset simulating various perturbations (occlusions and lighting conditions) from a single real-world image of the scene. A convolutional neural network is fine-tuned using this dataset to estimate the relative pose between two images of the same scene. The output of the network is then employed in a visual servoing control scheme. The method converges robustly even in difficult real-world settings with strong lighting variations and occlusions.A positioning error of less than one millimeter is obtained in experiments with a 6 DOF robot.
Full-text available
A fundamental challenge in digital control arises when the controlled plant is subjected to a fast disturbance dynamics but is only equipped with a relatively slow sensor. Such intrinsic difficulties are, however, commonly encountered in many novel applications such as laser- and electron-beam-based additive manufacturing, human–machine interaction, etc. This paper provides a discrete-time regulation scheme for exact sampled-data rejection of disturbances beyond Nyquist frequency. By introducing a model-based multirate predictor and a forward-model disturbance observer, we show that the inter-sample disturbances can be fully attenuated despite the limitations in sampling and sensing. The proposed control scheme offers several advantages in stability assurance and lucid design intuitions. Verification of the algorithm is conducted on a motion control platform that shares the general characteristics in several advanced manufacturing systems.
A vision-based target-following guider for a mobile robot is presented in this paper. It consists of three parts: the visual tracking part, the target re-detection part, and the visual servo part. In the visual tracking part, the target contour band (TCB) is proposed for irregular sampling, which can improve the performance of the correlation filter-based methods. In the target re-detection part, potential targets are searched by an off-line trained detector. Then an online training module is used to determine the real target and achieve accurate positioning. The interaction matrix of point features is used in the visual servo part for motion control to maintain the relative pose between the target and the robot. The visual tracking part and the target re-detection part are tested respectively on many videos, which are proved to work well. To show the effectiveness of our method, some state of the art methods are also test. The target-following guider is evaluated on mobile robots. In our experiments, the robot can robustly follow the human target over a long distance, which strongly proved the validity of our target-following guider.
Conference Paper
In this paper an image-based eye-in-hand stereo visual servoing system is proposed to perform a task of tracking and catching a moving object in real time. The presented method utilizes the conventional image based visual servoing algorithm for a stereo vision system mounted on the end-effector of 6 degrees of freedom (DOF) robot. In order to estimate the object motion trajectory in two dimensional image planes and predict the future positions, a Kalman filter with a linear model of the moving object and also an extended Kalman filter (EKF) with a non-linear model are employed. The procedure of tracking and catching a moving object is simulated and implemented based on a 6 DOF DENSO 6242G robot. Both simulation and experimental results are presented to verify the effectiveness of the proposed methods. The comparison with the case of a monocular image-based system is carried out.
The objective of this paper is to develop conveyer object tracking and fetching system. The traditional visual servo system for parallel robot is based on Eye-to-hand method which suffers from the narrow field of view (FOV). The basic step of moving assembling is multiple objects tracking which needs broad work space and big FOV of the parallel robot. In this paper, a Hybrid Eye-to-hand and Eye-in-hand Visual Servo System for parallel robot is proposed. Via this hybrid system, the parallel robot can achieve (multiple) object tracking and fetching with priority planning we designed. The eye-to-hand visual system with low resolution camera provides the estimations of the velocities and poses of the workpieces. Also the fetching priority is planned with this system. All aforementioned information is transferred to the eye-in-hand visual servo system for further use. Once the object appeared in the FOV of eye-in-hand system in the estimated position, the high resolution camera of the system is employed to acquire the pose information of the object with high accuracy and track it frame by frame. The hybrid system is successfully demonstrated and implemented on parallel robot which not only broadens the working space of parallel robot but also achieves great efficiency and accuracy.