Ball Tracking and Trajectory Prediction for
Table-Tennis Robots
Hsien-I Lin 1,*, Zhangguo Yu 2 and Yi-Chen Huang 1
1 Graduate Institute of Automation Technology, National Taipei University of Technology, No. 1, Sec. 3, Zhongxiao E. Rd., Taipei 10608, Taiwan; john6402j@gmail.com
2 School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China; yuzg@bit.edu.cn
* Correspondence: sofin@ntut.edu.tw; Tel.: +886-2-27713343
Received: 26 November 2019; Accepted: 30 December 2019; Published: 7 January 2020
Abstract:
Sports robots have become a popular research topic in recent years. For table-tennis
robots, ball tracking and trajectory prediction are the most important technologies. Several methods
were developed in previous research efforts, and they can be divided into two categories: physical
models and machine learning. The former uses algorithms that consider gravity, air resistance,
the Magnus effect, and elastic collision. However, estimating these external forces requires high
sampling frequencies that can only be achieved with high-efficiency imaging equipment. This study
thus employed machine learning to learn the flight trajectories of ping-pong balls, which consist
of two parabolic trajectories: one beginning at the serving point and ending at the landing point
on the table, and the other beginning at the landing point and ending at the striking point of the
robot. We established two artificial neural networks to learn these two trajectories. We conducted
a simulation experiment using 200 real-world trajectories as training data. The mean errors of the
proposed dual-network method and a single-network model were 39.6 mm and 42.9 mm, respectively.
The results indicate that the prediction performance of the proposed dual-network method is better
than that of the single-network approach. We also used the physical model to generate 330 trajectories
for training and the simulation test results show that the trained model achieved a success rate of
97% out of 30 attempts, which was higher than the success rate of 70% obtained by the physical
model. A physical experiment presented a mean error and standard deviation of 36.6 mm and
18.8 mm, respectively. The results also show that even without the time stamps, the proposed method
maintains its prediction performance with the additional advantages of 15% fewer parameters in the
overall network and 54% shorter training time.
Keywords: table-tennis robots; ball tracking and trajectory prediction; artificial neural networks
1. Introduction
Recent years have seen the gradual maturing of sensory, machine vision, and control technology in
smart robots. Several domestic and foreign studies explored the applicability of robots to sports. KUKA
AG Robotics once made a commercial in which one of their robots played table tennis against a human,
and Omron once gave a demonstration with one of their suspended robotic arms playing table tennis
against a human at an automation exhibition. Table-tennis robots use a wide range of technologies,
including object recognition, object tracking, 3D reconstruction, object trajectory prediction, robot
motion planning, and system integration. This, together with the ease with which such robots can be
showcased, has attracted the attention of many researchers.
A ping-pong ball trajectory system combines vision, 3D space, and prediction algorithms, none of
which are dispensable. The vision system must be able to detect and position the ball [1,2]. The data
captured by cameras are two-dimensional (2D), so three-dimensional (3D) data cannot be derived by
Sensors 2020, 20, 333; doi:10.3390/s20020333 www.mdpi.com/journal/sensors
simply searching for the locations of object pixels. Wei et al. [3] proposed a method that uses a single
camera to calculate object positions. In addition to using image recognition to locate the current
pixels of the ball, they also used the shadow of the flying ball on the table to triangulate the spatial
location of the ball. However, it is difficult to detect the shadow of a sphere, and the sources of light in
general environments are complex and unpredictable. Their approach is thus useful only when the
environment contains a single clear light source, with no sunlight or other stray light.
The installation of two or more cameras can be used to establish stereopsis. Detection of an object from
multiple perspectives can increase the dimensions of the image data and enable simple calculations of
3D data. Refs. [
4
,
5
] both adopted two cameras to track the ball in their vision system. To increase the
visual coverage and accuracy of the vision system, Chen et al. [6] installed two high-speed cameras
above the robot and above the opponent, which amounted to four cameras covering the entire table.
Yang et al. [7] used six cameras to cover the table (three on each side) to achieve high precision for
every possible location of the ball.
The processing speed of the vision system is another key factor because it indirectly affects
prediction of the direction of the ball, particularly in prediction methods using physical models.
Liu [8] proposed an onboard stereo camera system to estimate ping-pong ball trajectories in real time under
the asynchronous observations from different cameras. Graphics processing units (GPUs) have
become popular computer components in the recent trend of deep learning in image classification
because they make the graphics card less dependent on the central processing unit (CPU) and improve
the performance of graphics computing. Lampert et al. [9] employed an NVIDIA GeForce GTX280
graphics card and a CUDA framework to accelerate image processing and used four cameras to
establish 3D space. Their system needed only 5 ms to process a 3D location. Furthermore, German
company Nerian launched a real-time 3D stereo vision core that uses a field-programmable gate array
(FPGA) as the hardware for parallel computation [10]. It can complete more tasks per clock cycle, send
the computation results to a personal computer, and present good performance in object recognition
and 3D computation. Several studies used camera arrays to realize high-speed real-time object tracking
and trajectory prediction. Zhang et al. [11] used a total of three cameras, one of which was a pan-tilt
camera with a resolution of 640 × 480 pixels; its sampling period could be less than 5 ms, and it
could simultaneously follow a flying ball and analyze its status.
Accurate prediction of the ball trajectory is vital to the capacity of the robot to hit the ball.
Existing prediction methods can be divided into two categories: physical models and machine learning.
Physical models assess the external forces affecting the trajectory of the ball, such as constant gravity
and air resistance. Balls with self-rotation are subject to a lateral force perpendicular to the plane
formed by the angular velocity vector and the motion velocity vector of the rotation. This force, known
as the Magnus effect, causes deflection in the flight trajectory of the ball. Ping-pong balls are not
heavy, so the deflection is greater than it would be with heavier objects [12,13]. Wang et al. [14]
proposed a physical model to predict ball trajectories of topspin, backspin, rightward spin, leftward
spin, and combined spin. Huang et al. [15] proposed a physical model that considers the self-rotation
of the ball. It uses a real-time vision system (a camera system combining a DSP and an FPGA) to obtain
a series of trajectory points and fits the 3D data to a quadratic polynomial, which is then used to
obtain the current flying velocity. In their experiments, their proposed method had good predictive
abilities when the sampling time was 1 ms. However, if the consecutive trajectory points were not
dense enough or if the sampling frequency was too low, the accuracy of current velocity estimates would be
affected. Inputting these estimates into complex formula calculations would then increase distortion
in the final results. Zhang et al. [11] used two high-resolution cameras to analyze the self-rotation
the ball.
Zhao et al. [16] used an ultrahigh-speed camera to analyze the collision between a ping-pong
ball and the table and developed a physical model for self-rotation and collision effects. Ball status
estimates can also be obtained using various filters, such as a fuzzy filter [17], an extended Kalman
filter [18], and an unscented Kalman filter [19]. Other studies presented analyses of the aerodynamics
and the friction between ping-pong balls and tables [20,21].
Machine learning has also been increasingly applied to table-tennis research in recent years. Some of the
table-tennis robots developed in these studies employed locally weighted regression algorithms to
learn hitting movements [22,23]. In machine learning algorithms, machines analyze data to find
regularities and then use the regularities to make predictions of unknown data. Zhao et al. formulated
the likelihood of the ping-pong ball motion state as a Gaussian Mixture Model (GMM) [24].
Deng et al. [25] proposed a ball-size likelihood to estimate the ball position. Payeur et al. [26] used an artificial neural
network (ANN) to learn and predict the trajectories of moving objects; however, they merely performed
simulation experiments of simple trajectories and did not develop a novel vision system or robot.
Nakashima et al. [27] used a back-propagation artificial neural network (BPANN) to
learn striking points. The inputs are the initial location and velocity difference of the ball in a free-fall
model, and the outputs are the striking point and displacement between striking points estimated
using simple physics. Although their simulation results were good, the model requires 20 trajectory
points for fitting, which implies that the sampling frequency of the vision system must be high.
Zhang et al. [3] also made an attempt with four initial trajectory points and time stamps as the inputs
of the network and the location and velocity of the striking point as the outputs.
Our objective is to track and predict ping-pong ball trajectories so that a robot can hit the ball.
Because physical models for ball flight prediction require advanced vision equipment and are fairly
complex, we adopted machine learning to achieve prediction of
ping-pong ball trajectories with limited equipment. To achieve better prediction, this study included
a ball tracking system, 3D reconstruction, and ball trajectory prediction. The novelty of this work
is that the flight trajectory between the serving point and the end of the robotic arm was viewed as
two parabolas and two separate ANNs were used to learn these two parabolas. The ball trajectory
prediction strategy proposed in this study makes the following contributions:
- We propose an ANN-based approach to learn historical trajectory data, thereby doing away with the need for a complex flight trajectory model.
- The proposed method can swiftly predict where the robot should strike the ball based on the ball location data from only a few instants after it is served, thereby giving the robot time to react.
- The inputs of ANNs are generally accompanied by time stamp data. We verified that removing the time stamp data reduces the parameter demand of the entire network and greatly shortens network training time.
Figure 1 shows the flow of the proposed method. There are three main parts: 3D construction,
ping-pong ball tracking, and ping-pong ball trajectory prediction. 3D construction and ping-pong ball
tracking are used to obtain an accurate current ball position. To hit the ball, we propose dual neural
networks to predict the ball position on the hitting plane. These three parts are explained in Sections 2 and 3.
Figure 1. Flow of the proposed method.
The remainder of this paper is organized as follows. Section 2 introduces the framework for 3D
reconstruction, and Section 3 explains the ball trajectory prediction method proposed in this study.
Section 4 presents the experiment results, and Section 5 contains our conclusion.
2. 3D Reconstruction Framework
2.1. Hardware Setup
Figure 2 displays the 3D space system of this study. The vision system comprises three IDS color
industrial cameras with 1.3 megapixel resolution. The FPS of the cameras was set at 169, and each had
to cover most of the table surface within its field of view. The cameras were placed on both sides of
the table: one on the right (camera#1), one on the left (camera#2), and an auxiliary camera (camera#3).
The farthest visible distance of the camera system was approximately 220 cm to facilitate the widest
range of tracking. In addition, we noted whether the color of the background would interfere with
tracking. The table-tennis robot was a Stäubli industrial robot arm, which has six degrees of freedom,
repeatability of ±0.02 mm, a reach of 670 mm, and a maximum speed of 8.0 m/s at the endpoint.
We used the high-precision Phoenix Technologies Inc. VZ4000v 3D motion capture system, which can
achieve an accuracy of 0.015 mm within a distance of 1.2 m, to analyze 3D errors in the images.
Figure 2. 3D reconstruction system of table-tennis robot.
Camera synchronization is an important issue in the multi-camera vision system. When computer
CPUs use multithreading, the various threads may not be executed at the same time depending on the
resource allocation decisions of the CPU at the time. Thus, the timing at which the cameras capture
images may not be the same. We used a master camera to send image capture signals. The slave
cameras as well as the master camera wait for the signal. Figure 3shows the synchronization process
of the camera system. To validate the synchronization process, we collected one hundred trajectories
with synchronized and unsynchronized procedures. Curve fitting was performed for each trajectory,
and then the mean-square error was analyzed. The mean-square errors with synchronized and
unsynchronized procedures were 7.7 mm and 12.1 mm, respectively. The result validated that the
synchronization procedure helped the multi-camera vision system to acquire synchronized images.
Figure 4 shows the control of the cameras with a sampling time of 0.007 s. Threads 1 to 3 are the
image processing flows for the three cameras. As the ball moves away from camera#1 and camera#2,
the 3D position error increases, so the auxiliary camera#3 is used to reduce the error. When the
x-axis position of the ball is greater than 700 mm (with the robot side as the origin), camera#1 and
camera#2 are used; otherwise, camera#1 or camera#2 is paired with the auxiliary camera#3, depending
on whether the ball falls to the left or right of the table.
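The switching rule above can be expressed as a small dispatch function. The following is a minimal sketch, assuming a sign convention in which negative y means the left half of the table; the 700 mm threshold comes from the text, while the function name and pair labels are illustrative:

```python
def select_camera_pair(x_mm: float, y_mm: float) -> tuple:
    """Choose the stereo pair used for triangulation.

    x_mm: ball position along the x axis (robot side is the origin).
    y_mm: lateral position; negative = left half of the table
          (assumed sign convention, for illustration only).
    """
    if x_mm > 700.0:
        # Ball is far from the robot: the two main cameras see it well.
        return ("camera#1", "camera#2")
    # Ball is near the robot: pair the closer main camera with camera#3.
    main = "camera#2" if y_mm < 0.0 else "camera#1"
    return (main, "camera#3")
```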
Figure 3. Synchronization procedure of the multi-camera vision system.
Figure 4. Control of the cameras.
2.2. 3D Reconstruction
We used two industrial cameras to establish a 3D space. Using the pixel locations of the target
from two perspectives, we positioned the location of the target in the 3D space. The coordinate system
of the cameras was then converted into the coordinate system of the robot arm. In this way, deriving
the 3D location of an object in images from the camera system would also give us the location of
the object in the 3D space. This approach included camera calibration and triangulation. However,
the error increases when the object is further away from the camera system. This is because the pixel
features of objects further away are not as clear and the pixel resolution at this distance has reached its
limits. We will give a complete explanation in Section 2.4.
2.3. Ping-Pong Ball Tracking
To track the ball, the field of view of the cameras must encompass the entire table, which includes
various complex colors and noise. To simplify image segmentation, we painted the ping-pong ball
blue to distinguish it from other objects and the background. A Gaussian blur is first applied to the images
captured by the cameras to remove noise and facilitate the subsequent recognition process. HSV color
space conversion is then used to reduce the impact of bright light sources, and then a threshold value
is easily set for the color of the ball to obtain a binary image. To make the target object more complete,
the morphological operators erosion and dilation are applied, and the median pixel coordinates of the binary
image are then calculated to serve as the ball position. Performing these processes on images with large fields of view
is time-consuming. To save time, we employed the region-of-interest (ROI) operation. Once the camera
tracked the ball, only the ROI in subsequent images is processed. The FPS was set at 169 in this study.
When an entire image of 1280 × 1024 pixels is subjected to the object recognition procedure, the average
frequency is approximately 50.7 times/s. If an ROI is adopted, then subjecting the ROI (an image of
200 × 200 pixels) to object recognition results in an estimated frequency of 514.5 times/s. Clearly, the ROI
mechanism can significantly increase the efficiency of object recognition.
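The median-of-blob step with a 3 × 3 opening can be sketched in NumPy alone. The Gaussian blur and HSV thresholding, typically done with a vision library, are assumed to have already produced the binary mask; all function names here are illustrative:

```python
import numpy as np

def dilate(mask: np.ndarray) -> np.ndarray:
    """3x3 cross-shaped binary dilation implemented by shifting the mask."""
    out = mask.copy()
    out[1:, :] |= mask[:-1, :]
    out[:-1, :] |= mask[1:, :]
    out[:, 1:] |= mask[:, :-1]
    out[:, :-1] |= mask[:, 1:]
    return out

def erode(mask: np.ndarray) -> np.ndarray:
    """Binary erosion as the complement of dilating the complement."""
    return ~dilate(~mask)

def ball_center(binary: np.ndarray, roi=None):
    """Median pixel location of the ball blob, optionally inside an ROI.

    roi = (row0, row1, col0, col1) restricts processing to a sub-image,
    mirroring the ROI speed-up described in the text.
    """
    if roi is not None:
        r0, r1, c0, c1 = roi
        sub = binary[r0:r1, c0:c1]
        offset = np.array([r0, c0])
    else:
        sub, offset = binary, np.array([0, 0])
    sub = dilate(erode(sub))  # opening: removes isolated noise pixels
    rows, cols = np.nonzero(sub)
    if rows.size == 0:
        return None  # ball not visible
    return offset + np.array([int(np.median(rows)), int(np.median(cols))])
```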
2.3.1. Image Projection
Camera calibration was performed to obtain the pixel scale, intrinsic matrix, and extrinsic matrix of
each camera. Figure 5 shows that calibration was conducted using checkerboard data from 30 images
of a 13 × 9 checkerboard with 60 × 60 mm^2 squares captured at various angles, depths, and locations.
Tables 1–4 present the intrinsic and extrinsic parameters of the right and left cameras.
Figure 5. Calibration using checkerboard at different positions.
Table 1. Intrinsic parameters of the right camera.
U Axis V Axis Error+ Error−
Focal Length (pix) 1043.25281 1049.54675 2.24798 2.12426
Principal Point (pix) 633.44849 561.99736 2.89133 2.42647
Pixel Error (pix) 0.26777 0.30308
Table 2. Intrinsic parameters of the left camera.
U Axis V Axis Error+ Error−
Focal Length (pix) 1049.44688 1052.73077 1.52311 1.51341
Principal Point (pix) 597.44982 513.76570 2.18344 1.66587
Pixel Error (pix) 0.23623 0.26618
Table 3. Extrinsic parameters of the right camera.
X Axis Y Axis Z Axis
Translation vector (mm) 669.420640 371.402272 2129.031554
Rotation matrix
0.821611 0.568894 0.036249
0.262626 0.434195 0.861686
0.505947 0.698451 0.506146
Pixel Error (pix) 0.21674 0.33150
Table 4. Extrinsic parameters of the left camera.
X Axis Y Axis Z Axis
Translation vector (mm) 90.227708 381.155522 2279.359066
Rotation matrix
0.850091 0.526621 0.003993
0.265893 0.435734 0.859905
0.454585 0.729936 0.510438
Pixel Error (pix) 0.15424 0.32854
2.3.2. Calculation of 3D Location
As shown in Figure 6, the location of the target object can be obtained when it is within view of
the two cameras. P is the location of the object, and O_Left and O_Right denote the respective origins
of the coordinate systems of the left and right cameras. P_l and P_r represent the pixel locations of the
object in the images taken by the left and right cameras. Figure 6 shows that the plane formed by
O_Left, O_Right, and P is defined as an epipolar plane. This characteristic can be used to identify the
physical relationship between the two cameras [28]. Another approach is to assume that M^l_w and
M^r_w are homogeneous matrices converting world coordinates to left and right camera coordinates
using the extrinsic parameters of the cameras, respectively. With M^r_w as an example, formula
calculations produce M^w_r, as shown in Equation (1). Multiplying M^l_w by M^w_r then gives M^l_r,
as shown in Equation (2). This is the rotation and translation matrix converting the coordinates in the
right camera system to those in the left camera system.

M^w_r = \begin{bmatrix} (R^r_w)^T & -(R^r_w)^T T^r_w \\ 0_{1\times 3} & 1 \end{bmatrix}  (1)

M^l_r = M^l_w M^w_r  (2)
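Equations (1) and (2) amount to inverting and composing 4 × 4 homogeneous transforms. A minimal NumPy sketch, with matrix names following the text and the helper functions being illustrative:

```python
import numpy as np

def invert_homogeneous(M: np.ndarray) -> np.ndarray:
    """Equation (1): inv([R t; 0 1]) = [R^T  -R^T t; 0 1]."""
    R, t = M[:3, :3], M[:3, 3]
    Minv = np.eye(4)
    Minv[:3, :3] = R.T
    Minv[:3, 3] = -R.T @ t
    return Minv

def right_to_left(M_l_w: np.ndarray, M_r_w: np.ndarray) -> np.ndarray:
    """Equation (2): M_l_r = M_l_w @ inv(M_r_w), mapping right-camera
    coordinates to the left-camera frame."""
    return M_l_w @ invert_homogeneous(M_r_w)
```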
Figure 6 shows the centers of the left and right cameras both pointed at target P. The vectors
pointing from the centers of the two cameras to the pixel location of the target object are defined as
\tilde{P}_l and \tilde{P}_r, as shown below:

\tilde{P}_l = \begin{bmatrix} (x'_l - c_{lx}) s_{lx} \\ (y'_l - c_{ly}) s_{ly} \\ f_l \end{bmatrix}  (3)

\tilde{P}_r = \begin{bmatrix} (x'_r - c_{rx}) s_{rx} \\ (y'_r - c_{ry}) s_{ry} \\ f_r \end{bmatrix}  (4)
where x'_l and y'_l are the pixel location of the object in the left image; c_{lx} and c_{ly} are the image
center of the left camera; s_{lx} and s_{ly} are the scale coefficients of the left camera; x'_r and y'_r are
the pixel location of the object in the right image; c_{rx} and c_{ry} are the image center of the right
camera; and s_{rx} and s_{ry} are the scale coefficients of the right camera.
Figure 6. Calculation of 3D location.
In the 3D space, these two vectors intersect at the location of the target object. However, this only
occurs in ideal circumstances. Figure 7 displays the more likely circumstance in which the two
vectors are skew and do not intersect. Thus, we assume that the target object is located at the middle
point of the line segment that is the shortest distance between the two vectors. Based on rigid body
transformation, vector \tilde{P}_r of the right camera can be converted into a vector with regard to the
left camera system using M^l_r, the physical relationship between the two cameras, which comprises
R^l_r and T^l_r. Equation (5) represents the distance to which the vector should extend to reach P_up.
Here, we assume there is an unknown coefficient b. Similarly, Equation (6) represents the distance to
which the left vector should extend to reach P_down, where we assume there is an unknown coefficient a.

P_up = b R^l_r \tilde{P}_r + T^l_r  (5)

P_down = a \tilde{P}_l  (6)

To derive P_mid, we must calculate the directional vector q of the line segment. This vector can be
obtained using the cross product of the left and right vectors, as shown in Equation (7). Please note
that \tilde{P}_r must also undergo the coordinate rotation conversion R^l_r, and the unit vector of q is q/|q|.

q = \tilde{P}_l \times (R^l_r \tilde{P}_r)  (7)

The length of the line segment is unknown, so we suppose that c is the coefficient of the unit
vector of q. Based on basic concepts of vectors, this line segment can be expressed as follows:

P_up = P_down + c q/|q|  (8)
Further derivation gives

a \tilde{P}_l + c \frac{\tilde{P}_l \times R^l_r \tilde{P}_r}{|\tilde{P}_l \times R^l_r \tilde{P}_r|} = b R^l_r \tilde{P}_r + T^l_r  (9)

Suppose that

A = \begin{bmatrix} \tilde{P}_l & -R^l_r \tilde{P}_r & q/|q| \end{bmatrix}_{3\times 3}

Equation (9) can be simplified into Equation (10):

\begin{bmatrix} a \\ b \\ c \end{bmatrix} = A^{-1} T^l_r  (10)
We can then derive coefficients a, b, and c, and use a and c to calculate P_mid, as shown in
Equation (11):

P_mid = a \tilde{P}_l + \frac{1}{2} c \frac{q}{|q|}  (11)

Here, P_mid is the 3D location of the target object with regard to the left camera coordinate system.
Using M^w_l, we can convert the coordinates from the left camera system to the world coordinate
system, as shown in Equation (12). This matrix can be obtained by inverting the extrinsic parameter
matrix M^l_w of the left camera.

P_w = M^w_l P_mid  (12)
Figure 7. Two vectors in a skew relationship.
2.4. Estimation Error Analysis
Even using the algorithm above to calculate the 3D location of the target object, errors still exist in
the data, and they increase with the distance between the target object and the origins of the camera
coordinate systems. This section analyzes the errors in the use of the camera systems to calculate the
locations of the target object. Here, we used a PTI motion capture system to serve as the control group
for the 3D locations obtained using the three cameras. Figure 8 exhibits the 25 trackers that we used.
The world coordinates of the PTI motion capture system overlapped those of the camera systems,
meaning that the camera systems and the PTI motion capture system had the same world coordinates.
To calculate the errors, we employed two stereopsis groups, one using the right camera (camera#1)
and the left camera (camera#2) to calculate the 3D location of the PTI tracker and the other using the
left camera (camera#2) and the auxiliary camera (camera#3). Using cubic interpolation, we derived
a new error distribution graph. Figure 9 displays the distributions of the errors interpolated using
cubic polynomials in the two stereopsis groups. As can be seen, the errors increase with the distances
between the left and right cameras and the trackers and are highest at Tracker No. 21, where the error is
40 mm. With the left camera and the auxiliary camera, the two cameras are closer to Tracker No. 21,
so the error reduces to approximately 25 mm.
Figure 8. Locations of 25 trackers.
Figure 9. Distributions of the errors (a) using the right camera (camera#1) and the left camera
(camera#2); (b) using the left camera (camera#2) and the auxiliary camera (camera#3).
3. Ping-Pong Ball Trajectory Prediction
If a table-tennis robot wants to hit the ball, it must be able to accurately predict where the ball is
flying. When people play table tennis, the ball is subjected to various external forces that impact its
flight direction and velocity, including air resistance, the Magnus effect, gravity, and the rebound force
of collision. Thus, calculating the various physical forces with precision and then predicting the flight
trajectory is difficult. In this section, we present a machine learning approach that enables robots to
predict trajectories and demonstrate that the coefficients of a polynomial regression model are well
suited to representing and predicting ball trajectories.
Figure 10 shows the flight trajectories of a ping-pong ball. The flight trajectories are divided into
two parabolas P1 and P2. The first ANN is responsible for learning the parabola P1 in the figure, with
the anterior positions as the input and the regression coefficient of P1 as the output. The second ANN
learns parabola P2, with the regression coefficient of P1 as the input and the regression coefficient of P2
as the output. The trajectory coefficients above are the parameters of their mathematical expressions.
Once the vision system detects the anterior positions, the system immediately derives the coefficients
of trajectory P2 after the ball's landing point on the table and then calculates a designated striking
point. Suppose f_x, f_y, and f_z denote the mathematical regression formulas of trajectory P2 along the
x, y, and z axes. Using f_x(t), the timing t of the designated striking point (x axis) is first derived.
The resulting t is then substituted into f_y and f_z to obtain the y and z coordinates of the striking point.
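Given the fitted P2 coefficients (first-order in x and y, quadratic in z, per Section 3.1), the striking point follows by solving f_x(t) = x_hit and evaluating f_y and f_z at that t. A minimal sketch; the function name and the numeric values in the check are illustrative:

```python
def striking_point(coeff_x, coeff_y, coeff_z, x_hit):
    """Striking point from the P2 regression coefficients.

    coeff_x, coeff_y: (a0, a1) of the first-order polynomials f_x, f_y.
    coeff_z: (a0, a1, a2) of the quadratic f_z.
    x_hit: x coordinate of the robot's hitting plane.
    Returns (t, y, z): crossing time and the y, z coordinates there.
    """
    t = (x_hit - coeff_x[0]) / coeff_x[1]                # solve f_x(t) = x_hit
    y = coeff_y[0] + coeff_y[1] * t                      # f_y(t)
    z = coeff_z[0] + coeff_z[1] * t + coeff_z[2] * t**2  # f_z(t)
    return t, y, z
```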
Since a regression model must be chosen to predict the ball trajectory, we evaluate several
models, namely exponential, Fourier, Gaussian, and polynomial curve fitting methods, in Section 3.1.
The result shows that polynomial curve fitting is the most suitable model to predict the ball trajectory.
In Section 3.2, we also theoretically verify the existence of the relationship between the polynomial
coefficients and ball trajectory. However, Section 3.2 shows that curve fitting is highly sensitive to
noise in the input data. Thus, we propose two ANNs to model the relationship in Section 3.4.
Figure 10. Diagram of trajectory prediction.
3.1. Selection of Trajectory Regression Model
The data of any parabolic trajectory include time stamps and location data. Thus, we must
decide which type of regression model to use to express the trajectory data. This process mainly
involves finding the relationship between a set of independent and dependent variables and then
establishing an appropriate mathematical equation, which is referred to as a regression model. Below,
we investigate which type of regression model is more suitable for the trajectory data.
We can use experiment data to determine a minimal number of parameters of suitable regression
models. The experiment data comprised 10 random trajectories selected from the training data.
Each flight trajectory consists of two parabolic trajectories: the first from the initial point to the landing
point and the second from the landing point to the end point. We fitted the two trajectories using four
types of regression models, namely, Gaussian, exponential, Fourier, and polynomial, and examined
their results. We selected 10 random trajectories from the testing data to test
R2
results of these
models. The red and blue bars in Figure 11 indicate the mean
R2
results. The Fourier and polynomial
regression models presented reasonable
R2
results for both trajectories, and their test results were
close to each other. However, the polynomial regression model used three parameters, whereas the
Fourier regression model used four parameters. With our objective of minimizing the number of
parameters, we ultimately chose the polynomial regression model to express the flight trajectory data.
The polynomial regression model for the Z axis used three parameters, i.e., a quadratic polynomial.
For the X and Y axes, Figure 12 shows the R^2 of each trajectory; the average R^2 over the 10 random
trajectories was 0.9903 using a first-order polynomial, which is close to 1. Thus, the X and Y axes
were represented by first-order polynomial regression.
Figure 11. Mean R^2 of the regression models.
Figure 12. R^2 of the first-order polynomial regression on the X and Y axes.
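The per-axis model choice can be reproduced with ordinary least squares. The sketch below fits first-order polynomials to x(t) and y(t) and a quadratic to z(t) on a synthetic noise-free trajectory; all numeric values are illustrative, not measured data:

```python
import numpy as np

# Synthetic trajectory samples (time in s, positions in mm); values illustrative.
t = np.linspace(0.0, 0.4, 20)
x = 2000.0 - 3000.0 * t                      # roughly constant-velocity x
y = 100.0 + 400.0 * t                        # roughly constant-velocity y
z = 250.0 + 900.0 * t - 0.5 * 9810.0 * t**2  # gravity-dominated z (mm/s^2)

# First-order fits for the X and Y axes, quadratic fit for Z (Section 3.1).
cx = np.polyfit(t, x, 1)
cy = np.polyfit(t, y, 1)
cz = np.polyfit(t, z, 2)

# R^2 of the z fit: 1 - residual sum of squares / total sum of squares.
z_hat = np.polyval(cz, t)
r2 = 1.0 - np.sum((z - z_hat) ** 2) / np.sum((z - np.mean(z)) ** 2)
```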
3.2. Theoretical Verification of Trajectory and Polynomial Coefficients
When the two parts of the ping-pong ball trajectory can be fitted using a polynomial, it means that
the coefficients in the polynomial effectively express the trajectory. We could therefore use ANNs to
predict the polynomial coefficients of the ball trajectories. The conversion relations between the
polynomial coefficients and trajectories can be expressed using Equation (13). Suppose that (t_k, q_k),
where k = 0, ..., m, are data points in which t_k is the time value and q_k is the positional value.
There exists a unique polynomial q(t) of order n. Equation (14) establishes the relation matrix of
vectors q and a, which is known as the Vandermonde matrix T [29]. Using the pseudo-inverse matrix
of T, Equation (15) gives coefficient vector a, for which the squared error between the coefficient
equation and the trajectory data is minimized.
$$q(t) = a_0 + a_1 t + \cdots + a_n t^n \quad (13)$$

$$q = \begin{bmatrix} q_0 \\ q_1 \\ \vdots \\ q_{m-1} \\ q_m \end{bmatrix} = \begin{bmatrix} 1 & t_0 & \cdots & t_0^n \\ 1 & t_1 & \cdots & t_1^n \\ \vdots & \vdots & & \vdots \\ 1 & t_m & \cdots & t_m^n \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \\ a_n \end{bmatrix} = Ta \quad (14)$$

$$a = T^{+} q \quad (15)$$
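A minimal sketch of Equations (13)–(15) for a quadratic ($n = 2$): the pseudo-inverse solution is realized here through the normal equations $(T^{\mathsf{T}}T)a = T^{\mathsf{T}}q$, solved by Gauss–Jordan elimination; the sample times and coefficients below are made up.

```python
def vandermonde_fit(ts, qs, n=2):
    """Least-squares polynomial coefficients a of Eq. (15), a = T^+ q,
    computed via the normal equations (T^T T) a = T^T q."""
    # Vandermonde matrix T of Eq. (14): row k is [1, t_k, ..., t_k^n].
    T = [[t ** j for j in range(n + 1)] for t in ts]
    # Build the normal equations.
    TtT = [[sum(T[k][i] * T[k][j] for k in range(len(ts)))
            for j in range(n + 1)] for i in range(n + 1)]
    Ttq = [sum(T[k][i] * qs[k] for k in range(len(ts))) for i in range(n + 1)]
    # Gauss-Jordan elimination with partial pivoting on the augmented system.
    m = [row[:] + [rhs] for row, rhs in zip(TtT, Ttq)]
    size = n + 1
    for col in range(size):
        piv = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(size):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]

# Samples from q(t) = 0.5 - 1.2*t - 4.9*t**2 are recovered exactly.
ts = [i * 0.05 for i in range(10)]
qs = [0.5 - 1.2 * t - 4.9 * t ** 2 for t in ts]
a = vandermonde_fit(ts, qs)
print([round(c, 6) for c in a])
```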
The input and output data of an ANN must be interrelated in some way; otherwise, training the model may be too time-consuming, or poor prediction results may be produced. This mathematical theory verifies that using ANNs to learn trajectories is appropriate in this study: the trajectory point data can be mapped onto polynomial coefficient data through the Vandermonde matrix, so a correspondence exists between the elements of these two data sets. When the trajectory points serve as the inputs of the ANN and the trajectory coefficients as the outputs, the model therefore has good learning ability.
3.3. Comparison of Artificial Neural Network and Polynomial Curve Fitting Predictions
In the prediction procedure of this study, once the vision system receives the data from a few
anterior positions, the ANN of the first trajectory can then predict the first trajectory. To demonstrate
the inadequacy of curve fitting and the prediction performance of the ANN for the first trajectory,
we predicted the landing point of the first trajectory using the proposed method and a quadratic
polynomial resulting from curve fitting. We used 30 items of data to train the ANN for the first
trajectory and used the trained ANN and the ten anterior positions in the trajectory to predict the
polynomial parameters of the first trajectory. The red circles and crosses in Figure 13 show the trajectory
obtained during testing. The blue curve presents the polynomial trajectory output by the trained ANN,
and the green curve shows the subsequent trajectory trend after the data from the ten anterior positions
were received. The mean error and standard deviation of the quadratic polynomial were 19.2 mm and 11.8 mm, respectively, whereas those of the ANN were 10.8 mm and 10.0 mm. The errors in the landing point predictions of the ANN were thus smaller than those of the quadratic polynomial. Even when some errors exist in capturing the ten anterior positions of the ball, Figure 13 shows that the ANN is resistant to noise: if an input node contains noise or its data are incomplete, that single input does not greatly impact the overall result.
Figure 13. Trajectory based on ten positions (blue: ANN; green: quadratic polynomial resulting from curve fitting).
3.4. Two ANNs in Trajectory Prediction
Three things can be obtained from a complete flight trajectory: the trajectory before landing, the trajectory after landing, and the ten anterior positions. Figure 14 displays the procedure for generating the required training data. First, the lowest point of the overall trajectory along the Z axis is identified, demarcating the data of the first and second trajectories. The first trajectory runs from the initial point to the lowest point, and the second from the lowest point to the end of the trajectory. Both trajectories are then expressed using polynomial regression to extract the trajectory parameters, together with the data of the ten anterior positions. The ten anterior positions and the parameters of the first trajectory serve as the training data for the first ANN (Network Model 1), and the parameters of the first and second trajectories serve as the training data for the second ANN (Network Model 2).
We defined the 3D polynomial of the first trajectory by Equations (16)–(18), where $a_1$, $b_1$, $a_2$, $b_2$, $a_3$, $b_3$, $c_3$ are the X-, Y-, and Z-axis coefficients of the first trajectory. We defined the 3D polynomial of the second trajectory by Equations (19)–(21), where $a'_1$, $b'_1$, $a'_2$, $b'_2$, $a'_3$, $b'_3$, $c'_3$ are the X-, Y-, and Z-axis coefficients of the second trajectory. Equations (22) and (23) give the corresponding inputs and outputs of the two networks.
$$X = a_1 t + b_1 \quad (16)$$

$$Y = a_2 t + b_2 \quad (17)$$

$$Z = a_3 t^2 + b_3 t + c_3 \quad (18)$$

$$X = a'_1 t + b'_1 \quad (19)$$

$$Y = a'_2 t + b'_2 \quad (20)$$

$$Z = a'_3 t^2 + b'_3 t + c'_3 \quad (21)$$

$$[P_i, t_i]_{i=1,\dots,10} \xrightarrow{\text{Network Model 1}} [a_1, b_1, a_2, b_2, a_3, b_3, c_3] \quad (22)$$

$$[a_1, b_1, a_2, b_2, a_3, b_3, c_3] \xrightarrow{\text{Network Model 2}} [a'_1, b'_1, a'_2, b'_2, a'_3, b'_3, c'_3] \quad (23)$$
Figure 14. Procedure of generating the required training data.
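The segmentation step of Figure 14 can be sketched as follows; the trajectory below is synthetic, and the fitting of each segment would then proceed as in Section 3.2:

```python
def split_at_bounce(points):
    """Split a trajectory [(t, x, y, z), ...] at its lowest Z sample,
    which approximates the landing point on the table."""
    k = min(range(len(points)), key=lambda i: points[i][3])
    first = points[: k + 1]           # initial point .. landing point
    second = points[k:]               # landing point .. end point
    return first, second

# Synthetic bounce: z falls, reaches a minimum at sample 6, then rises.
traj = [(0.005 * i, 0.4 * i, 0.1 * i, abs(i - 6) * 0.05 + 0.02)
        for i in range(13)]
first, second = split_at_bounce(traj)
anterior10 = traj[:10]                # inputs of the first ANN, cf. Eq. (22)
print(len(first), len(second))
```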
4. Experimental Results
In this section, we present a series of experiments conducted to demonstrate the prediction
performance of the proposed method. In Section 4.1, we compare the striking point prediction
performance of the proposed dual-network method and a single ANN [3] using the experimental data.
In Section 4.2, we compare the striking point prediction performance of the proposed dual-network
method and a physical model. Section 4.3 presents an experiment investigating the influence of time
data removal, and Section 4.4 analyzes the errors of the proposed method in a physical experiment.
Figure 15 shows the flight trajectory data collected from human demonstrations, in which the ping-pong
ball was hit from the serving side to a robot arm on the opposite side. The first trajectories passed over
the net and landed on the table on the robot’s side. We collected data from a total of 200 trajectories,
among which 170 trajectories were used as training data and 30 trajectories comprised the testing data.
Once the ANNs were trained, we used the testing data to examine their predictive abilities. We used
a circle with a diameter of 75 mm to simulate the area of the paddle and calculated the errors, standard
deviation, the number of balls that hit the range of the paddle, and the success rates.
Figure 15. Flight trajectory data collected from human demonstrations.
4.1. Dual-Network vs. Single-Network Approach
The authors of [3] used a single ANN to predict the striking points. The inputs were the data of four anterior positions, and the outputs were the striking point and the acceleration of the ball in the X, Y, and Z directions. We conducted an experiment to compare this single-network approach with the proposed method. The inputs of the single network were increased to ten anterior positions so that they were consistent with the inputs of the proposed method. The input, hidden, and output layers of the single-network model were 40, 10, and 6 nodes, respectively. In contrast, the first model of our dual-network approach had input, hidden, and output layers of 40, 10, and 7 nodes, and the second model was set as 7-20-7. Both networks used the mean-square error as the loss function and the hyperbolic tangent sigmoid as the activation function. The number of epochs was 10,000, and the Levenberg–Marquardt algorithm was used for optimization. Table 5 summarizes the details of the two ANNs. After applying the trained models to the testing data, we calculated the overall mean error, standard deviation, and the success rate of balls falling within the boundaries of the paddle. Figure 16 presents the mean errors of the proposed method and the single-network approach (39.6 mm and 42.9 mm), as well as their success rates (89% and 86.7%). Based on these results, we conclude that the prediction performance of the proposed method is better than that of the single-network approach. The difference between the two methods is whether they consider the ball's landing process (including the bounce on the table). A normal ping-pong ball trajectory consists of two trajectories, and the flight of the ball in each offers data. The single-network approach only considers the data of the anterior positions and the final striking point, so its prediction performance is limited.
Figure 16. Performance results of the proposed dual-network method and the single-network approach: (a) mean error; (b) mean success rate.
Table 5. Details of the two ANNs.

Parameter            1st ANN                       2nd ANN
Input node number    40                            7
Hidden node number   10                            20
Output node number   6                             7
Activation fun.      Hyperbolic tangent sigmoid    Hyperbolic tangent sigmoid
Loss fun.            Mean-square error             Mean-square error
Epoch number         10,000                        10,000
Optimizer            Levenberg–Marquardt           Levenberg–Marquardt
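As a sketch of the dual-network method's first model (40-10-7 per Section 4.1), the following shows a forward pass with tanh ("hyperbolic tangent sigmoid") hidden units. The weights here are random placeholders, not the trained network, and the output layer is left linear for simplicity:

```python
import math
import random

def forward(x, w1, b1, w2, b2):
    """One pass through a single-hidden-layer network with tanh hidden units."""
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(wi * hi for wi, hi in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]

random.seed(0)
n_in, n_hid, n_out = 40, 10, 7   # ten (P_i, t_i) inputs -> 7 coefficients
w1 = [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hid)]
b1 = [0.0] * n_hid
w2 = [[random.uniform(-0.1, 0.1) for _ in range(n_hid)] for _ in range(n_out)]
b2 = [0.0] * n_out
x = [random.uniform(-1.0, 1.0) for _ in range(n_in)]   # normalized inputs
y = forward(x, w1, b1, w2, b2)
print(len(y))
```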
4.2. Physical Model Testing
In this section, we use a standard physical model to predict the striking point. The procedure was basically the same: use the data of the ten anterior points, as shown in Figure 17, to calculate the initial velocity in three dimensions and derive the striking point while taking into account gravitational acceleration $g$, air resistance, and elastic collision. The elastic collision is characterized by the coefficient of restitution of the ball, which can be obtained from real-world trajectories. For the sake of accuracy, we added 130 real-world trajectories to calculate this coefficient. The prediction process of the physical model was as follows:
1. Using the 330 trajectories, we obtained the mean velocities before and after the collision and then the mean collision coefficient. Based on the formula for the coefficient of restitution in Equation (24), the mean coefficient was 0.9203.

$$e = \frac{\text{Relative velocity after collision}}{\text{Relative velocity before collision}} \quad (24)$$
2. The Y and Z positions of the testing trajectories at x = 400 mm (the striking point), i.e., $Y_{true}$ and $Z_{true}$, were used to calculate the errors in the final prediction results by selecting 100 random trajectories.
3. The data from the ten anterior positions of the testing trajectories were obtained, and the initial velocities in the X, Y, and Z directions, i.e., $v_x$, $v_y$, and $v_z$, were then derived using polynomial regression.
4. The downward acceleration $a_z$ was defined with air resistance taken into account as shown in Equation (25), where $g$ denotes gravitational acceleration and equals 9.81 m/s²; $m$ is the mass of the ping-pong ball, approximately 2.7 × 10⁻³ kg; $C_d$ is the drag coefficient and equals 0.5; $\rho_a$ denotes the air density, which is 1.29 kg/m³; and $A$ is the cross-sectional area of the ball, roughly 1.3 × 10⁻³ m². The formula indicates that the velocity and acceleration of the object change with time. Here, we set the sampling time $T_s$ to 0.005 s.

$$a_z = g - \frac{C_d \, \rho_a \, A}{2m} v_z^2 \quad (25)$$
5. Using the physical formulas in Equations (26)–(28), we derived the next displacement (the acceleration and velocity are updated at each sampling time).

$$S_z = v_z T_s + \frac{1}{2} a_z T_s^2 \quad (26)$$

$$v_{t+1} = v_t + a_t T_s \quad (27)$$

$$a_{t+1} = g + \frac{C_d \, \rho_a \, A}{2m} v_{t+1}^2 \quad (28)$$
6. When the ball reached the table (Z direction), we used the mean collision coefficient obtained in the first step to calculate the velocity of the ball after landing, as shown in Equation (29). Once this velocity was calculated, we could continue to calculate the position of the ball in the Z direction.

$$v_{after} = v_{before} \cdot e \quad (29)$$
7. Neglecting the influences of friction between the ball and the table surface and of the self-rotation of the ball, we could calculate the displacement of the ball in the X direction at each sampling time using Equations (30) and (31). The velocity and acceleration of the ball in this direction also changed with time, all the way to the striking point (x = 400 mm). The timing of the striking point, $T_{end}$, was then recorded.

$$a_x = \frac{C_d \, \rho_a \, A}{2m} v_x^2 \quad (30)$$

$$S_x = v_x T_s + \frac{1}{2} a_x T_s^2 \quad (31)$$
8. Finally, we used the initial velocity in the Y direction, $v_y$, the air-resistance acceleration $a_y$ in Equation (32), and the timing of the striking point, $T_{end}$, to derive the striking point in the Y direction. Using the predicted Y and Z positions of the striking point, $Y_{predict}$ and $Z_{predict}$, and the actual striking point, $Y_{true}$ and $Z_{true}$ (Step 2), we then calculated the error using Equation (33).

$$a_y = \frac{C_d \, \rho_a \, A}{2m} v_y^2 \quad (32)$$

$$Error = \sqrt{(Y_{predict} - Y_{true})^2 + (Z_{predict} - Z_{true})^2} \quad (33)$$
Figure 17. Striking point prediction using physical model.
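Steps 4–8 above can be sketched as a forward Euler integration in the X–Z plane. The physical constants follow the text; the drag terms are written here to oppose the sign of the velocity, and the initial serve state is a made-up example, not a measured trajectory:

```python
G = 9.81          # gravitational acceleration, m/s^2
M = 2.7e-3        # ball mass, kg
CD = 0.5          # drag coefficient
RHO = 1.29        # air density, kg/m^3
A = 1.3e-3        # cross-sectional area, m^2
K = CD * RHO * A / (2 * M)   # drag constant of Eqs. (25), (30), (32)
E = 0.9203        # mean coefficient of restitution, Eq. (24)
TS = 0.005        # sampling time, s

def predict_strike(x0, z0, vx, vz, x_strike=0.4):
    """March the (x, z) state forward until x reaches the striking plane.
    Drag always opposes the velocity; the bounce rescales vz by e (Eq. (29))."""
    x, z = x0, z0
    while x < x_strike:
        ax = -K * vx * abs(vx)
        az = -G - K * vz * abs(vz)
        x += vx * TS + 0.5 * ax * TS ** 2     # cf. Eq. (31)
        z += vz * TS + 0.5 * az * TS ** 2     # cf. Eq. (26)
        vx += ax * TS                         # cf. Eq. (27)
        vz += az * TS
        if z <= 0.0 and vz < 0.0:             # table contact in the Z direction
            z = 0.0
            vz = -vz * E                      # Eq. (29)
    return x, z

x_end, z_end = predict_strike(x0=-1.0, z0=0.3, vx=3.0, vz=-0.5)
print(round(x_end, 3), round(z_end, 3))
```

The same loop, applied in the Y direction with Eq. (32), yields $Y_{predict}$ for the error of Eq. (33).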
Using the physical prediction method above, we selected 30 testing trajectories to calculate striking
point prediction errors. The success rate of balls hitting the paddle was 70%, and Figure 18 shows
that the overall mean error and standard deviation of striking position were 57.9 mm and 30.3 mm,
respectively. The 70% success rate of the physical model was worse than the 97% achieved by the proposed dual-network method. In reality, the ball trajectory is affected by the friction of the table surface and the Magnus effect, but it is difficult to obtain precise measurements of these without more advanced vision equipment [20].
Figure 18. Striking position error of the physical model.
4.3. Removal of Time Data
The vision system in this study had a fixed sampling time. Based on the training data, the
average time interval was 0.0075 s and the standard deviation was 5.19 × 10⁻⁴ s. The inputs of the first ANN include time stamp data, which comprise nearly constant values. For this reason, we tried removing the time stamps from the input and then tested the resulting performance. We used the data from all 330 trajectories, with 30 trajectories serving as the testing data, and ran ten training sessions for each condition.
As shown in Figure 19, there are no significant differences between the prediction results of the two
conditions. Thus, removing the time stamps from the input of the first ANN does not affect its
prediction performance.
Figure 19. Prediction error of the proposed dual-network method with and without time stamps.
This approach has its advantages. It simplifies the entire network because the inputs of the
first ANN were reduced from 40 nodes to 30 nodes (see Equation (22)), which also means that the
number of parameters in the first ANN decreased. The original prediction model contained a total of
1178 parameters (of which the first ANN accounted for 871 parameters), whereas the current model
had 998 parameters (of which the first ANN accounted for 691 parameters). This represents a 15%
reduction in the overall number of parameters. The two conditions also differed in training time.
Figure 20 shows the training time of the ten tests conducted for each condition. The mean training
time was 457 s for the model with the time stamps and 209 s for the model without the time stamps,
representing a 54% reduction in training time.
Figure 20. Training time of the dual-network method with and without time stamps.
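The parameter counts quoted above are simply dense-layer weights plus biases. For example, the second network's 7-20-7 structure accounts for 307 parameters, consistent with the reported totals (1178 overall, of which 871 are in the first ANN):

```python
def mlp_params(n_in, n_hidden, n_out):
    """Weights + biases of a single-hidden-layer fully connected network."""
    return (n_in * n_hidden + n_hidden) + (n_hidden * n_out + n_out)

second_ann = mlp_params(7, 20, 7)
print(second_ann)  # -> 307
```

Removing input nodes trims only first-layer weights, which is why dropping the ten time-stamp inputs reduces the overall parameter count by 15%.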
4.4. Striking Point Errors
To measure the prediction error of the proposed method, we conducted an experiment as follows.
This experiment was done by covering the front of the paddle held by the robot arm with a piece
of white paper and then covering that with a piece of carbon paper. When the robot arm struck the
ping-pong ball with the paddle, the ball left a mark on the white paper, and the position of the robot arm at that time was also recorded. The distances between the center of the paddle and the striking points (black dots) indicate the prediction errors, which we measured using a vernier caliper. The area that the robot arm could strike was 20 cm × 18 cm, which equals 360 cm². We conducted three sets of experiments with ten trials each, which produced a total of 30 records. Figure 21 displays the first trials of the striking points recorded on carbon paper in our physical experiment. Figure 22 shows the thirty trials; the overall mean error was 36.6 mm, and the standard deviation was 18.8 mm. Figure 23 shows the striking accuracy with respect to the paddle area. The yellow circle represents the mean error, and the green area represents one standard deviation of the error. The area of striking error is clearly much smaller than the paddle area, which means that the robot was able to strike the ball.
Figure 21. Striking points on carbon paper.
Figure 22. Overall striking errors.
Figure 23. Striking accuracy with respect to the paddle area: the green area represents one standard deviation of the striking error around the mean (yellow).
4.5. Experiment Discussion
In Section 4.1, we compared the striking point prediction performance of the dual-network method and a single ANN. In Section 4.2, we compared it with that of the physical model. Table 6 summarizes the prediction errors of the proposed dual-network method, the single network, and the physical model. The results show that the proposed method has a smaller prediction error than the other methods.
Table 6. Comparison of the prediction error among the proposed dual-network method, single network, and physical model.

            Proposed Dual Networks   Single Network   Physical Model
Mean (mm)   39.553                   42.858           57.862
5. Conclusions
This study developed a ping-pong ball trajectory prediction system that included ball recognition,
3D positioning, and ball trajectory prediction. Trajectory prediction performance is key to whether
a table-tennis robot can hit the ping pong ball. However, most existing studies developed physical
models for prediction, which can only achieve good prediction effects if they have high-frequency
vision systems to analyze ball status in real time. Such advanced equipment is not readily accessible
and makes it difficult to conduct fundamental studies. We therefore adopted machine learning to predict
the flight trajectories of ping-pong balls, which uses historical data to learn the regularities within. There
is no need to establish a complex physical model, and fairly good prediction results can be achieved even
with general industrial cameras readily available on the market. Each complete flight trajectory consists
of a landing point on the table and two parabolic trajectories. We used two ANNs to learn the features of
these flight trajectories. The first ANN learns the first flight trajectory. Once the vision system receives the
anterior positions, it can instantly predict the first trajectory. We demonstrated that this approach is superior to curve fitting under limited data, owing to its noise-filtering capability. The second
ANN learns the second flight trajectory. Once the first trajectory is known, the second flight trajectory can
be instantly predicted. The two ANNs were then combined.
A comparison of the ANN and curve fitting approaches revealed that the use of data from
ten anterior positions resulted in mean errors of 10.8 mm in the prediction results of the ANN and
19.2 mm in those of the quadratic polynomial resulting from curve fitting. We conducted a simulation
experiment using 200 real-world trajectories as training data. The mean errors of the proposed
dual-network method and a single-network model were 39.6 mm and 42.9 mm, respectively, and the
mean success rates were 88.99% and 86.66%. These results indicate that the prediction performance of
the proposed dual-network method is better than that of the single-network approach. We also used
a simple physical model to predict striking points. We employed 330 real-world trajectories, and the
resulting mean error and success rate were 57.9 mm and 70%. The success rate of 70% of the physical
model was worse than 97% of the proposed dual-network method. In the proposed dual-network
method, the inputs of the first ANN include time stamps. As our vision system takes samples at fixed
time intervals, little variation exists in the time stamp data. We thus tried removing the time stamps
from the data. We used 330 trajectories for training. The mean errors of the proposed method with
and without the time stamps were 27.5 mm and 26.3 mm, and the mean success rates were 98.67%
and 97.33%. The results show that even without the time stamps, the proposed method maintains
its prediction performance with the additional advantages of 15% fewer parameters in the overall
network and 54% shorter training time. Finally, we tested the striking ability of our robot arm, which
produced a mean error and standard deviation of 36.6 mm and 18.8 mm, respectively.
Author Contributions:
Conceptualization, H.-I.L. and Y.-C.H.; methodology, H.-I.L. and Y.-C.H.; software, Y.-C.H.;
validation, H.-I.L. and Y.-C.H.; formal analysis, H.-I.L. and Y.-C.H.; investigation, H.-I.L. and Y.-C.H.; resources,
H.-I.L. and Z.Y.; data curation, Y.-C.H.; writing—original draft preparation, H.-I.L.; writing—review and editing,
H.-I.L.; visualization, H.-I.L.; supervision, H.-I.L. and Z.Y.; project administration, H.-I.L. and Z.Y.; funding
acquisition, H.-I.L. and Z.Y. All authors have read and agreed to the published version of the manuscript.
Funding:
This research was funded by National Taipei University of Technology grant number NTUT-BIT-105-1.
Conflicts of Interest: The authors declare no conflict of interest.
References

1. Shi, Q.; Li, C.; Wang, C.; Luo, H.; Huang, Q.; Fukuda, T. Design and implementation of an omnidirectional vision system for robot perception. Mechatronics 2017, 41, 58–66.
2. Koç, O.; Maeda, G.; Peters, J. Online optimal trajectory generation for robot table tennis. Robot. Auton. Syst. 2018, 105, 121–137.
3. Zhang, Y.H.; Wei, W.; Yu, D.; Zhong, C.W. A tracking and predicting scheme for ping pong robot. J. Zhejiang Univ. Sci. C 2011, 12, 110–115.
4. Li, H.; Wu, H.; Lou, L.; Kühnlenz, K.; Ravn, O. Ping-pong robotics with high-speed vision system. In Proceedings of the 2012 12th International Conference on Control Automation Robotics & Vision (ICARCV), Guangzhou, China, 5–7 December 2012; pp. 106–111.
5. Liu, J.; Fang, Z.; Zhang, K.; Tan, M. Improved high-speed vision system for table tennis robot. In Proceedings of the 2014 IEEE International Conference on Mechatronics and Automation, Tianjin, China, 3–6 August 2014; pp. 652–657.
6. Chen, X.; Huang, Q.; Zhang, W.; Yu, Z.; Li, R.; Lv, P. Ping-pong trajectory perception and prediction by a PC based high-speed four-camera vision system. In Proceedings of the 2011 9th World Congress on Intelligent Control and Automation, Taipei, Taiwan, 21–25 June 2011; pp. 1087–1092.
7. Yang, P.; Xu, D.; Zhang, Z.; Chen, G.; Tan, M. A vision system with multiple cameras designed for humanoid robots to play table tennis. In Proceedings of the 2011 IEEE International Conference on Automation Science and Engineering, Trieste, Italy, 24–27 August 2011; pp. 737–742.
8. Liu, Y.; Liu, L. Accurate real-time ball trajectory estimation with onboard stereo camera system for humanoid ping-pong robot. Robot. Auton. Syst. 2018, 101, 34–44.
9. Lampert, C.H.; Peters, J. Real-time detection of colored objects in multiple camera streams with off-the-shelf hardware components. J. Real-Time Image Process. 2012, 7, 31–41.
10. Schauwecker, K. Real-Time Stereo Vision on FPGAs with SceneScan. In Forum Bildverarbeitung 2018; KIT Scientific Publishing: Germany, 2018; p. 339.
11. Zhang, Y.; Zhao, Y.; Xiong, R.; Wang, Y.; Wang, J.; Chu, J. Spin observation and trajectory prediction of a ping-pong ball. In Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 31 May–7 June 2014; pp. 4108–4114.
12. Zhao, Y.; Zhang, Y.; Xiong, R.; Wang, J. Optimal state estimation of spinning ping-pong ball using continuous motion model. IEEE Trans. Instrum. Meas. 2015, 64, 2208–2216.
13. Zhang, Z.; Xu, D.; Tan, M. Visual measurement and prediction of ball trajectory for table tennis robot. IEEE Trans. Instrum. Meas. 2010, 59, 3195–3205.
14. Wang, P.; Zhang, Q.; Jin, Y.; Ru, F. Studies and simulations on the flight trajectories of spinning table tennis ball via high-speed camera vision tracking system. Proc. Inst. Mech. Eng. Part P J. Sports Eng. Technol. 2019, 233, 210–226.
15. Huang, Y.; Xu, D.; Tan, M.; Su, H. Trajectory prediction of spinning ball for ping-pong player robot. In Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, USA, 25–30 September 2011; pp. 3434–3439.
16. Zhao, Y.; Xiong, R.; Zhang, Y. Rebound modeling of spinning ping-pong ball based on multiple visual measurements. IEEE Trans. Instrum. Meas. 2016, 65, 1836–1846.
17. Su, H.; Fang, Z.; Xu, D.; Tan, M. Trajectory prediction of spinning ball based on fuzzy filtering and local modeling for robotic ping-pong player. IEEE Trans. Instrum. Meas. 2013, 62, 2890–2900.
18. Zhang, Y.; Xiong, R.; Zhao, Y.; Wang, J. Real-time spin estimation of ping-pong ball using its natural brand. IEEE Trans. Instrum. Meas. 2015, 64, 2280–2290.
19. Wang, Q.; Zhang, K.; Wang, D. The trajectory prediction and analysis of spinning ball for a table tennis robot application. In Proceedings of the 4th Annual IEEE International Conference on Cyber Technology in Automation, Control and Intelligent Systems, Hong Kong, China, 4–7 June 2014; pp. 496–501.
20. Nakashima, A.; Ogawa, Y.; Kobayashi, Y.; Hayakawa, Y. Modeling of rebound phenomenon of a rigid ball with friction and elastic effects. In Proceedings of the 2010 American Control Conference, Baltimore, MD, USA, 30 June–2 July 2010; pp. 1410–1415.
21. Bao, H.; Chen, X.; Wang, Z.T.; Pan, M.; Meng, F. Bouncing model for the table tennis trajectory prediction and the strategy of hitting the ball. In Proceedings of the 2012 IEEE International Conference on Mechatronics and Automation, Chengdu, China, 5–8 August 2012; pp. 2002–2006.
22. Matsushima, M.; Hashimoto, T.; Takeuchi, M.; Miyazaki, F. A learning approach to robotic table tennis. IEEE Trans. Robot. 2005, 21, 767–771.
23. Miyazaki, F.; Matsushima, M.; Takeuchi, M. Learning to dynamically manipulate: A table tennis robot controls a ball and rallies with a human being. In Advances in Robot Control; Springer: Berlin/Heidelberg, Germany, 2006; pp. 317–341.
24. Zhao, Y.; Xiong, R.; Zhang, Y. Model based motion state estimation and trajectory prediction of spinning ball for ping-pong robots using expectation-maximization algorithm. J. Intell. Robot. Syst. 2017, 87, 407–423.
25. Deng, Z.; Cheng, X.; Ikenaga, T. Ball-like observation model and multi-peak distribution estimation based particle filter for 3D ping-pong ball tracking. In Proceedings of the 2017 Fifteenth IAPR International Conference on Machine Vision Applications (MVA), Nagoya, Japan, 8–12 May 2017; pp. 390–393.
26. Payeur, P.; Le-Huy, H.; Gosselin, C.M. Trajectory prediction for moving objects using artificial neural networks. IEEE Trans. Ind. Electron. 1995, 42, 147–158.
27. Nakashima, A.; Takayanagi, K.; Hayakawa, Y. A learning method for returning ball in robotic table tennis. In Proceedings of the 2014 International Conference on Multisensor Fusion and Information Integration for Intelligent Systems (MFI), Beijing, China, 28–29 September 2014; pp. 1–6.
28. Zhang, Z. Determining the epipolar geometry and its uncertainty: A review. Int. J. Comput. Vis. 1998, 27, 161–195.
29. Biagiotti, L.; Melchiorri, C. Trajectory Planning for Automatic Machines and Robots; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2008.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
... In order to meet these challenges, a lot of work had already been carried out. Most of the work is in sports domain such as soccer [16,17], cricket [18,19], basketball [7,20], tennis [21], and ping-pong playing robots [22][23][24][25]. Some work is also carried out for catching robots such as the work in [26] for a ball catching robot where a ball 8.5 cm in diameter was wrapped in retro-reflective foil and its flight trajectory was observed through stereo triangulation by two cameras mounted as the eyes of a catching humanoid robot. ...
... Except for ping-pong playing robots, all of the other above-mentioned work involves non-mechanical throws. The work of ping-pong playing robots such as [22][23][24][25] involves mechanical throw of ball as ping-pong ball is served mechanically by robotic arm. Mechanically thrown objects in production systems are unsymmetrical and not absolutely identical. ...
Article
Full-text available
Industry 4.0 smart manufacturing systems are equipped with sensors, smart machines, and intelligent robots. The automated in-plant transportation of manufacturing parts through throwing and catching robots is an attempt to accelerate the transportation process and increase productivity by the optimized utilization of in-plant facilities. Such an approach requires intelligent tracking and prediction of the final 3D catching position of thrown objects, while observing their initial flight trajectory in real-time, by catching robot in order to grasp them accurately. Due to non-deterministic nature of such mechanically thrown objects’ flight, accurate prediction of their complete trajectory is only possible if we accurately observe initial trajectory as well as intelligently predict remaining trajectory. The thrown objects in industry can be of any shape but detecting and accurately predicting interception positions of any shape object is an extremely challenging problem that needs to be solved step by step. In this research work, we only considered spherical shape objects as their3D central position can be easily determined. Our work comprised of development of a 3D simulated environment which enabled us to throw object of any mass, diameter, or surface air friction properties in a controlled internal logistics environment. It also enabled us to throw object with any initial velocity and observe its trajectory by placing a simulated pinhole camera at any place within 3D vicinity of internal logistics. We also employed multi-view geometry among simulated cameras in order to observe trajectories more accurately. Hence, it provided us an ample opportunity of precise experimentation in order to create enormous dataset of thrown object trajectories to train an encoder-decoder bidirectional LSTM deep neural network. The trained neural network has given the best results for accurately predicting trajectory of thrown objects in real time.
... From this scenario emerged numerous researches aimed at developing new technologies that would support the entire sports sector [3]. Several published works proposed solutions that apply computer vision aiming at processing games in order to help competitors, coaches and audiences [8,13,16]. A complete state-of-the-art review in visual tracking algorithms was performed by Yan et al. [35] and Thomas et al. [29]. ...
Article
Full-text available
Computer vision plays a crucial role in current technological development, understanding a scene from the properties of 2D images. This research line becomes valuable in sports applications, where the scenario can be challenging to take technical decisions only from the observation. This work aims to develop a system based on computer vision for analyzing tennis games. The implemented method captures videos during the game through cameras installed on the court. Machine learning methods and morphological operations will be used over the images to locate the ball position, the court lines and the players location. In addition, the algorithm determines the moment the ball bounces during the game and analyzes whether it occurred in or out of the field. These data are available to players and judges through an Android application, allowing all processed data to be accessed from mobile devices, providing the results quickly and accessible to the user. From the results obtained, the system demonstrated robustness and reliability.
... A dual artificial neural network to mimic the table tennis ball was utilized [25], which divides the table tennis ball trajectory into two segments bounded by the landing point. Here, historical data were used to learn the patterns in them, with a final experimental error of 39.6 mm. ...
Article
Full-text available
The latest developments in computer technology applied to sports have increased the popularity of ping pong. They call for designing an intelligent table tennis robot that can practice with table tennis enthusiasts and professional players. Much research work has already been done in this area. However, this study explores the use of binocular vision to precisely identify, locate, and anticipate the flight trajectory and landing position of table tennis balls. Major issues addressed in this study include identifying fast-moving ping-pong balls, calibrating the cameras, obtaining the cameras' internal and external parameters, and localizing the ping-pong balls in three dimensions. A new target recognition method is proposed in combination with the actual needs of a ping-pong-playing robot. The method combines color segmentation, background subtraction, and ellipse fitting; it can detect the tail range of flying ping-pong balls and find the center position of the balls. Based on the ellipse-fitting analysis of the image, the characteristics of the tail of the flying ping-pong ball are studied. This study can aid in tracking the trajectories of high-speed flying objects, which is helpful for both the aerospace and military industries.
... In the field of sports, live broadcasts of ball games can model and predict the trajectory of the ball on the field in real time to help the audience better experience the game, and offline analysis can help athletes train in a more targeted way. In the military field [1][2][3], the tracking and interception of missiles, the prediction of the movement and landing points of artillery shells, and the trajectory planning of unmanned fighter jets all have great national defense value; in the aerospace field, spacecraft launch and return, satellite orbits around the Earth, and similar tasks require the tracking of high-speed aircraft. In the industrial field, the use of robots to grasp dynamic target objects has gradually become a new application hotspot [4][5][6]. ...
Article
Full-text available
The research on the spatial trajectories of high-speed moving and flying objects has very important research significance and application value in the fields of sports, military, aerospace, and industry. Table tennis balls are small, fly fast, and follow a complex motion model, which makes them very suitable as experimental objects for the study of flying-object trajectories. This study takes table tennis as the research object and builds a trajectory prediction system that combines the constraints of a simple physical motion model with the deviation correction of a dual LSTM neural network. For the problem of extracting the trajectories of flying table tennis balls, high-speed industrial cameras were used to build a trajectory extraction system based on binocular vision. A multi-camera information fusion method based on dynamic weights is proposed for predicting the trajectory of the flying ball. To address the problems that some parameters of the traditional physical motion model of table tennis trajectories are difficult to measure and that the full model is too complicated, this paper proposes a trajectory prediction model that constrains a simple physical model and corrects its bias with the dual LSTM neural network. Experiments show that the proposed method greatly improves the accuracy of the trajectory extraction and prediction system and can achieve a certain success rate of hitting.
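The hybrid idea above, a simple physical model whose systematic error is corrected by a learned component, can be sketched in a few lines. Here a plain least-squares regressor stands in for the paper's dual LSTM, and all data, names, and the drag-like bias term are synthetic assumptions:

```python
import numpy as np

def physics_predict(z0, vz0, t):
    """Drag-free parabola: the 'simple physical model' baseline (height only)."""
    return z0 + vz0 * t - 0.5 * 9.81 * t**2

# synthetic 'true' heights include an unmodeled drag-like bias term
t = np.linspace(0.0, 0.4, 30)
z0, vz0 = 0.3, 2.0
true_z = physics_predict(z0, vz0, t) - 0.08 * t**2

# learn the residual between truth and baseline as a function of time
baseline = physics_predict(z0, vz0, t)
residual = true_z - baseline
A = np.column_stack([t, t**2])            # simple feature map (LSTM stand-in)
w, *_ = np.linalg.lstsq(A, residual, rcond=None)

corrected = baseline + A @ w              # physics + learned bias correction
```

The design choice mirrors the abstract: the physical model supplies structure, while the data-driven part only has to learn the (much smaller) deviation from it.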
... As China's national ball game, table tennis has become an excellent way for people to increase their physical activity. In order to improve the accuracy of table tennis hits and promote the development of the sport, a table tennis capture system is designed in this article [1][2][3][4][5][6][7][8][9][10]. ...
Article
Full-text available
This article takes table tennis as the research object. It extracts a large number of table tennis trajectories from video and applies kinematic analysis to reduce the noise of the extracted target, improving the accuracy of the obtained trajectories. A parameter group is used to simulate the trajectory of the ball: the simulation, implemented in the MATLAB environment, takes an initial velocity and coordinates and completes the capture of the landing point. This article uses the two-dimensional information in the image to estimate the parameter values that form the three-dimensional information and then calculates the three-dimensional information from the estimated parameters. An unscented Kalman filter is used to estimate the trajectory and rotation parameters of the ball, and an algorithm for calculating and updating markers in real time is proposed, which reduces the influence of calculation errors on the estimation process and improves the accuracy of parameter estimation. First, a table tennis movement model is established, and a suitable model is chosen to describe the translation and rotation of the ball. Then, ball flight trajectory estimation algorithms are established based on the extended Kalman filter and unscented Kalman filter algorithms. Finally, an image acquisition system is built to collect trajectory information and surface information images from actual table tennis play; these are processed to obtain the positions of the center of the ball and the marked points on the corresponding sensors. Simulation and experimental results show that the proposed algorithm can effectively estimate the trajectory, linear velocity, and angular velocity of a table tennis ball, and that the selection of the initial value has little effect on the algorithm.
... the method of the center point, the method of selecting as the cluster center the point that correlates with more of the points to be clustered, the method of selecting as the cluster center the point with greater similarity to the points to be clustered, and so on. The purpose of using these methods is to ensure that the selected cluster center points are distributed across the classes and lie close to the class centers [4,[33][34][35]. ...
Article
Full-text available
Enabling users to quickly and accurately search for the text information they need is a current research hotspot. Text clustering can improve the efficiency of information search and is an effective text retrieval method. Keyword extraction and cluster center point selection are key issues in text clustering research. Common keyword extraction algorithms can be divided into three categories: semantic-based algorithms, machine learning-based algorithms, and statistical model-based algorithms. There are three common methods for selecting cluster centers: randomly selecting the initial cluster center point, manually specifying the cluster center point, and selecting the cluster center point according to the similarity between the points to be clustered. Randomly selected initial cluster center points may include "outliers," making the clustering results only locally optimal. Manually specifying the cluster center points is highly subjective, because each person's understanding of the text set differs, and it is not suitable when the text set is large. Selecting the cluster center points according to the similarity between the points to be clustered can place the selected centers in each class, as close as possible to the class centers, but computing them takes a long time. To address this problem, this paper proposes a keyword extraction algorithm based on cluster analysis. The results show that the algorithm does not rely on background knowledge bases, dictionaries, etc., and obtains statistical parameters and builds models through training. Experiments show that the keyword extraction algorithm has high accuracy and can quickly extract the subject content of an English translation.
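The similarity-based center selection discussed above can be sketched directly: pick as the cluster center the point whose summed cosine similarity to all other points is largest. This is a minimal illustration of the selection criterion, not the paper's full algorithm, and the function name is an assumption:

```python
import numpy as np

def similarity_center(points):
    """Index of the point whose total cosine similarity to all other
    points is largest -- the 'most representative' candidate center."""
    x = np.asarray(points, float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = unit @ unit.T                 # pairwise cosine similarities
    np.fill_diagonal(sims, 0.0)          # ignore self-similarity
    return int(np.argmax(sims.sum(axis=1)))

# the middle vector is most similar to both others, so it is chosen
center_idx = similarity_center([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
```

The O(n²) similarity matrix is also why the abstract notes that this selection strategy is slow for large text sets.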
... More specifically, path prediction of a thrown body has become important in human-behavioral-based tasks in robotics, which is also reflected in different robocup competitions such as soccer, basketball, and other sports [4,5,6]. Therefore, crucial tasks which can be influenced by path prediction are shooting and passing in soccer-playing humanoids [7,8,9,10], or ping-pong/tennis-playing humanoids such as [11,12,13,14] in which a Kalman filter based estimation method is used to predict the status of the ball. ...
Chapter
In this paper, we propose a method for predicting the path of the ball on the soccer field for humanoid robots. A cost-function-based k-nearest-neighbor regression method is first proposed to account for the part of the prediction that is based on previously observed data. Next, the autoregression method is utilized to carry out the prediction based on the current ball path. Finally, these two methods are combined to form the final prediction model. Moreover, two different schemes are introduced based upon the proposed model: fixed and adaptive. In the fixed scheme, the prediction is made once during the initial steps of the motion and is used throughout the whole ball movement. In the adaptive scheme, however, the autoregression coefficients are updated at fixed predefined steps during the motion. This helps robustify the prediction against an externally applied disturbance on the ball path. Our proposed method is tested by simulation and practical implementation, and the results demonstrate a high precision rate.
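The autoregressive part of such a model can be sketched as an ordinary least-squares AR fit on a 1-D coordinate series; the order, names, and Fibonacci toy data below are illustrative assumptions rather than the chapter's actual setup:

```python
import numpy as np

def fit_ar(series, order=2):
    """Least-squares AR(order) fit: y[t] = c1*y[t-1] + ... + ck*y[t-k]."""
    y = np.asarray(series, float)
    # column k holds lag k+1 of the series, aligned with targets y[order:]
    X = np.column_stack([y[order - 1 - k: len(y) - 1 - k]
                         for k in range(order)])
    coeffs, *_ = np.linalg.lstsq(X, y[order:], rcond=None)
    return coeffs

def predict_next(series, coeffs):
    """One-step-ahead prediction from the most recent observations."""
    recent = np.asarray(series[-len(coeffs):], float)[::-1]
    return float(coeffs @ recent)

# toy series obeying y[t] = y[t-1] + y[t-2] exactly
series = [1, 1, 2, 3, 5, 8]
coeffs = fit_ar(series)            # recovers [1, 1]
nxt = predict_next(series, coeffs)
```

The "adaptive scheme" in the chapter would simply re-run `fit_ar` on the growing series at predefined intervals.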
... Particularly, the prediction of the performance decline trajectory of an individual from only one measurement would be desirable, as it permits an immediate assessment without requiring results from several previous years that can often not be obtained. ML approaches were successful in the prediction of septic shock onset [18], epileptic seizures [19], the onset of type 2 diabetes mellitus [20], or ball trajectories for table-tennis robots [21]. In sports, the prediction of the potential and performance trajectories of young talents to identify future champions by ML has been demonstrated for archers [22] and in table tennis [23]. ...
Article
Full-text available
The present work aims to accelerate sports development in China and promote technological innovation in the artificial intelligence (AI) field. After analyzing the application and development of AI, it is introduced into sports and applied to table tennis competitions and training. The principle of the trajectory prediction of the table tennis ball (TTB) based on AI is briefly introduced. It is found that the difficulty of predicting TTB trajectories lies in rotation measurement. Accordingly, the rotation and trajectory of TTB are predicted using some AI algorithms. Specifically, a TTB detection algorithm is designed based on the Feature Fusion Network (FFN). For feature exaction, the cross-layer connection network is used to strengthen the learning ability of convolutional neural networks (CNNs) and streamline network parameters to improve the network detection response. The experimental results demonstrate that the trained CNN can reach a detection accuracy of over 98%, with a detection response within 5.3 ms, meeting the requirements of the robot vision system of the table tennis robot. By comparison, the traditional Color Segmentation Algorithm has advantages in detection response, with unsatisfactory detection accuracy, especially against TTB's color changes. Thus, the algorithm reported here can immediately hit the ball with high accuracy. The research content provides a reference for applying AI to TTB trajectory and rotation prediction and has significant value in popularizing table tennis.
Article
Full-text available
When a table tennis ball is hit by a racket, the ball spins and undergoes a complex trajectory in the air. In this article, a model of a spinning ball is proposed for simulating and predicting the ball flight trajectory including the topspin, backspin, rightward spin, leftward spin, and combined spin. The actual trajectory and rotational motion of a flying ball are captured by three high-speed cameras and then reconstructed using a modified vision tracking algorithm. For the purpose of model validation, the simulated trajectory is compared to the reconstructed trajectory, resulting in a deviation of only 2.42%. Such high modeling accuracy makes this proposed method an ideal tool for developing the virtual vision systems emulating the games that can be used to train table tennis players efficiently.
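A spinning-ball flight model of the kind described combines gravity, drag, and a Magnus term proportional to ω × v. The single integration step below is a common simplification with made-up coefficient values, not the article's validated model:

```python
import numpy as np

def step_spinning_ball(p, v, omega, dt=0.002,
                       km=1.2e-4, kd=5e-4, m=0.0027):
    """One Euler step for a ball under gravity, quadratic drag, and a
    Magnus force F_m = km * (omega x v); km, kd, m are assumed values."""
    g = np.array([0.0, 0.0, -9.81])
    drag = -kd * np.linalg.norm(v) * v / m   # opposes velocity
    magnus = km * np.cross(omega, v) / m     # perpendicular to spin and v
    v_new = v + (g + drag + magnus) * dt
    return p + v_new * dt, v_new

# topspin (omega along +y for motion along +x) pushes the ball downward
p0, v0 = np.zeros(3), np.array([3.0, 0.0, 0.0])
p1, v1 = step_spinning_ball(p0, v0, np.array([0.0, 150.0, 0.0]))
```

Iterating this step over a serve's duration reproduces the characteristic dipping (topspin) or floating (backspin) trajectories the article simulates.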
Article
Full-text available
In highly dynamic tasks that involve moving targets, planning is necessary to figure out when, where and how to intercept the target. In robotic table tennis in particular, motion planning can be very challenging due to time constraints, dimension of the search space and joint limits. Conventional planning algorithms often rely on a fixed virtual hitting plane to construct robot striking trajectories. These algorithms, however, generate restrictive strokes and can result in unnatural strategies when compared with human playing. In this paper, we introduce a new trajectory generation framework for robotic table tennis that does not involve a fixed hitting plane. A free-time optimal control approach is used to derive two different trajectory optimizers. The resulting two algorithms, Focused Player and Defensive Player, encode two different play-styles. We evaluate their performance in simulation and in our robot table tennis platform with a high speed cable-driven seven DOF robot arm. The algorithms return the balls with a higher probability to the opponent's court when compared with a virtual hitting plane based method. Moreover, both can be run online and the trajectories can be corrected with new ball observations.
Article
Full-text available
Motion state estimation and trajectory prediction of a spinning ball are two important but challenging issues for both the promotion of the next generation of robotic table tennis systems and the research on motion analysis of spinning-flying objects. Due to the Magnus force acting on the ball, the flying state and spin state are coupled, which makes their accurate estimation a huge challenge. In this paper, we first derive the Extended Continuous Motion Model (ECMM) by clustering the trajectories into multiple categories with a K-means algorithm and fitting each category using Fourier series. The ECMM can easily adapt to all kinds of trajectories. Based on the ECMM, we propose a novel motion state estimation method using the Expectation-Maximization (EM) algorithm, which in turn contributes to an accurate trajectory prediction. In this method, the ECMM is treated as a latent variable, and the likelihood of the motion state is formulated as a Gaussian Mixture Model of the differences between the trajectory predictions and observations. The effectiveness and accuracy of the proposed method are verified by offline evaluation using a collected dataset, as well as by online evaluation in which the humanoid robotic table tennis system "Wu & Kong" successfully hits the high-speed spinning ball.
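The Fourier-series fitting step underlying such a model can be sketched as a linear least-squares fit of a truncated series to one trajectory coordinate; the function names, harmonic count, and toy signal are assumptions for illustration:

```python
import numpy as np

def fourier_design(t, n_harmonics, w):
    """Design matrix [1, cos(kwt), sin(kwt)] for k = 1..n_harmonics."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols += [np.cos(k * w * t), np.sin(k * w * t)]
    return np.column_stack(cols)

def fourier_fit(t, y, n_harmonics=3, period=None):
    """Least-squares truncated-Fourier fit of samples (t, y)."""
    T = period if period else t[-1] - t[0]
    w = 2 * np.pi / T
    coef, *_ = np.linalg.lstsq(fourier_design(t, n_harmonics, w), y,
                               rcond=None)
    return coef, w

# toy trajectory coordinate exactly representable with two harmonics
t = np.linspace(0.0, 1.0, 50)
y = 2 + np.sin(2 * np.pi * t) + 0.5 * np.cos(4 * np.pi * t)
coef, w = fourier_fit(t, y, n_harmonics=2, period=1.0)
y_hat = fourier_design(t, 2, w) @ coef
```

In the full method, a separate fit per K-means cluster gives the per-category trajectory templates that the EM step then weighs against observations.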
Article
Full-text available
To meet the demand of surrounding detection of a humanoid robot, we developed an omnidirectional vision system for robot perception (OVROP) with 5 Degrees of Freedom (DOFs). OVROP has a modular design and mainly consists of three parts: hardware, control architecture and visual processing part (omnidirectional vision and stereovision). As OVROP is equipped with universal hardware and software interfaces it can be applied to various types of robots. Our performance evaluation proves that OVROP can accurately detect and track an object with 360° field of view (FOV). Besides, undistorted omnidirectional perception of surroundings can be achieved through calibrations of both monocular and stereo cameras. Furthermore, our preliminary experimental results show that OVROP can perceive a desired object within 160 ms in most cases. As a result, OVROP can provide detailed information on surrounding environment for full-scope and real-time robot perception.
Conference Paper
Full-text available
A learning method for the point at which a robot should hit a coming ball in table tennis is proposed in this paper. The learning is performed with an artificial neural network. In order to learn the effects of the rotational velocity and the air resistance, the inputs and outputs are defined as the variations of the measured data and the hitting point from those produced by a simple model, which consists of the equations of motion without air resistance and Newton's rebound model without friction. The learning and verification are performed using simulation and experimental data, where the simulation is executed with an aerodynamics model and a table rebound model, and the ball trajectories in the experiment are measured while two humans play table tennis.
Conference Paper
Full-text available
For ping-pong-playing robots, observing the ball and predicting its trajectory accurately in real time is essential. However, most existing vision systems can only provide observations of the ball's position and do not take into consideration the spin of the ball, which is very important in competition. This paper proposes a method to observe and estimate the ball's spin in real time and achieve an accurate prediction. Based on the fact that a spinning ball's motion can be separated into global movement and spinning about its center, we construct an integrated vision system to observe the two motions separately. With a pan-tilt vision system, the spinning motion is observed by recognizing the position of the brand mark on the ball and restoring the 3D pose of the ball. The spin state is then estimated by plane fitting on current and historical observations. With both position and spin information, accurate state estimation and trajectory prediction are realized via an Extended Kalman Filter (EKF). Experimental results show the effectiveness and accuracy of the proposed method.
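The plane-fitting step has a standard linear-algebra form: the brand mark traces a circle in a plane perpendicular to the spin axis, so the normal of the best-fit plane through the observed mark positions estimates that axis. A minimal SVD sketch (names and test data are illustrative, not the paper's pipeline):

```python
import numpy as np

def fit_plane_normal(points):
    """Normal of the least-squares plane through 3D points: the right
    singular vector associated with the smallest singular value."""
    x = np.asarray(points, float)
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered)
    return vt[-1]

# mark observations lying in the z = 0 plane -> spin axis along z
marks = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [2, 0.5, 0]]
normal = fit_plane_normal(marks)
```

With noisy observations the same call returns the least-squares normal, which is why fitting over current and historical observations stabilizes the spin estimate.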
Article
In this paper, an accurate real-time ball trajectory estimation approach for the onboard stereo camera system of a humanoid ping-pong robot is presented. Because asynchronous observations from different cameras greatly reduce the accuracy of trajectory estimation, the proposed approach focuses on increasing estimation accuracy under such asynchronous observations by exploiting the flying ball's motion consistency. An approximate polynomial trajectory model for the flying ball is built to optimize the best parameters from the asynchronous observations in each discrete temporal interval. The experiments show that the proposed approach performs much better than a method that ignores the asynchrony, and that it achieves performance similar to a hardware-triggered synchronization-based method, which cannot be deployed in the real onboard vision system due to the limited bandwidth and real-time output requirement.
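The key idea, fitting one time-parameterized polynomial to observations with mismatched timestamps, can be sketched with a per-axis least-squares fit; the degree, names, and synthetic timestamps are assumptions, not the paper's optimizer:

```python
import numpy as np

def fit_poly_trajectory(times, positions, degree=2):
    """Fit each 3D coordinate as a polynomial in time. Because the fit
    uses each observation's own timestamp, asynchronous samples from
    different cameras can be pooled directly."""
    t = np.asarray(times, float)
    p = np.asarray(positions, float)
    return [np.polyfit(t, p[:, i], degree) for i in range(p.shape[1])]

def eval_at(coeffs, t):
    """Evaluate the fitted trajectory at an arbitrary time t."""
    return np.array([np.polyval(c, t) for c in coeffs])

# irregular (asynchronous) timestamps on a parabolic flight:
# x = 3t, y = 0, z = 1 + 2t - 4.9t^2
ts = np.array([0.0, 0.03, 0.07, 0.12, 0.20, 0.31])
obs = np.column_stack([3 * ts, 0 * ts, 1 + 2 * ts - 4.9 * ts**2])
coeffs = fit_poly_trajectory(ts, obs)
pred = eval_at(coeffs, 0.5)          # extrapolate to a future instant
```

Re-running such a fit over each short temporal interval keeps the polynomial approximation valid despite drag and spin.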
Article
The research on the collision between a spinning ball and a table, as a general issue in the trajectory analysis of spinning-flying objects, is important but complicated. Traditionally, it is studied either by simplifying the collision process as a black box in which the ball's motion state changes abruptly before and after the collision, or by assuming that the coefficient of restitution and the coefficient of friction are simply piecewise linear or even constant during the whole collision process. In this paper, we first analyze the mechanism of the collision dynamics and conclude from mathematical derivations that the collision between a ping-pong ball and a table is a continuous transition process and that its duration is a constant independent of the ball's motion state. These two conclusions are also verified by visual measurement using an ultrahigh-speed camera. We then propose a novel rebound model using the mean value theorem, the momentum theorem, and the angular momentum theorem. The novelty of the proposed model is that its forces and parameters are formulated as continuous functions of the ball's motion state, which is more accurate than treating them as constants. The integrations of these functions over time can be formulated using the mean value theorem, and their expressions are learned using a multilayer perceptron with a large dataset collected by the position vision system and the pan-tilt vision system. The experimental results verify the effectiveness and accuracy of the proposed model.
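For contrast, the constant-coefficient baseline that the paper improves upon can be written in three lines: flip and damp the normal velocity by a restitution coefficient and damp the tangential components by a fixed friction factor. The coefficient values below are assumed, and spin is deliberately ignored:

```python
def rebound(v_in, e=0.88, mu=0.25):
    """Black-box rebound with constant coefficients: restitution e on
    the normal (z) component, a fixed friction factor mu on the
    tangential (x, y) components. Spin coupling is ignored."""
    vx, vy, vz = v_in
    return ((1 - mu) * vx, (1 - mu) * vy, -e * vz)

# incoming ball: forward 2 m/s, sideways 0.5 m/s, downward 3 m/s
v_out = rebound((2.0, 0.5, -3.0))
```

The paper's point is precisely that `e` and `mu` are not constants: replacing them with learned continuous functions of the incoming motion state is what yields the accuracy gain.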
Conference Paper
The identification and trajectory prediction of spinning ball has been a problem for years. In order to improve the accuracy of trajectory prediction we take following measures: firstly the kinematics model of the flight spinning ball is analysed; then based on the Unscented Kalman Filter (UKF), the motion equation and observation equation of the ball's movement trajectory is constructed; finally the BP pattern recognition classifier is used to recognize the pattern according to the predicted flight trajectory. Large number of Matlab simulations and experimental results show that, in comparing with that of EKF, UKF can save 99% of the computing time and also get more accurate prediction. BP classifier outperforms other similar classifiers, and is more suitable for the trajectory recognition of spinning ball movement.