Applied Mathematics and Nonlinear Sciences, 9(1) (2024) 1-19
Applied Mathematics and Nonlinear Sciences
https://www.sciendo.com
†Corresponding author.
Email address: zhouj@wxit.edu.cn
ISSN 2444-8656
https://doi.org/10.2478/amns.2023.2.01714
© 2023 Jie Zhou and Rong Lu, published by Sciendo.
This work is licensed under the Creative Commons Attribution 4.0 License.
Image Recognition Technology Applied to the Design of Mobile Platform for
Warehouse Logistics Robots
Jie Zhou1,†, Rong Lu1
1. Department of Control Technology, Wuxi Institute of Technology, Wuxi, Jiangsu, 214121, China.
Submission Info
Communicated by Z. Sabir
Received February 19, 2023
Accepted July 10, 2023
Available online December 30, 2023
Abstract
This paper first studies the processing flow of image processing technology that preprocesses the image and adopts the
method of polygonal approximation to identify the shape and localize the moving target. Then, the mobile platform of
the warehouse logistics robot is designed. Then, the vision system of the robot was designed using image recognition
technology to realize obstacle collision prediction and route planning. Finally, the robot’s localization and grasping
abilities, trajectory following performance, and semantic segmentation abilities are analyzed using comparative
experiments. The successful localization and grasping rates of the warehouse robots are all higher than 93%, and the
trajectory following the straight line road section is better, with a maximum error of less than 21 mm. The mIoU of this
paper’s method on the Cityscapes dataset is 78.85%, MPA is 86.05%, and PA is 96.89%, with good image segmentation
performance. This study is of great significance for the development of the intelligent logistics field.
Keywords: Image recognition; Collision prediction; Route planning; Mobile platform; Warehouse logistics.
AMS 2010 codes: 97M50
1 Introduction
In recent years, with the rapid development of e-commerce, more and more people have chosen to
buy goods on major e-commerce platforms, resulting in a surge of orders in the warehousing and
logistics industry, and the development of warehousing and logistics is faced with great challenges
[1-2]. As one of the biggest demand points of e-commerce on logistics, warehousing logistics is
undergoing a great change [3]. Traditional warehousing logistics, with its low efficiency, has found it
difficult to meet the needs of the modern e-commerce industry [4]. Intelligent warehousing and
logistics system based on intelligent mobile robot has become one of the research hotspots in the e-
commerce industry and logistics industry [5].
Traditional automation methods have large investments, slow returns, poor flexibility, are difficult to
expand, and require large operating space and site requirements, which have constrained the further
development of the e-commerce logistics industry [6-7]. Modern warehousing and logistics systems
not only need to be efficient, fast, and low-cost but must also be flexible and readily expandable so
that they can respond quickly to changes in demand. Only in this way can they keep up with the
development of the industry and technological progress [8].
In the warehousing logistics work system, high repetitiveness, high labor intensity, and high-risk
factor work can be carried out by intelligent robots [9-10]. Warehouse logistics robots are the link to
improve the efficiency of logistics, and in many practical industrial environments such as
warehousing and logistics, the use of robots to work together can improve the efficiency of the work
greatly [11-12].
Duan, L.M proposed a path planning method based on a modified A* algorithm for batch picking of
warehouse logistics robots; the method effectively improves the speed and accuracy of robotic
goods sorting and realizes intelligent warehouse management [13]. Barenji, A.V. et al. proposed a
new scheduling mechanism for the problem of irrational scheduling schemes, which effectively
solves the multi-robot and task allocation problems that may occur in intelligent warehouse systems
[14]. Dasygenis, M. proposed a new path-planning method that combines Dijkstra’s algorithm and
Kuhn-Munkers algorithm to provide new possibilities for robot pathfinding [15]. Emde, M. et al.
proposed the use of schemes such as lane-guided vehicles to improve the efficiency of returned item
handling. This method utilizes optical markers to guide robots through the warehouse to collect
returned items and place them at workstations, allowing logistics workers to focus on the actual
processing tasks [16].
In this paper, we first study the principles and processes of image processing techniques:
preprocessing images, identifying object shapes, and calculating the center of gravity of an object
using the grayscale center-of-gravity method to locate the position of a moving target. Then, the
warehouse logistics robot mobile platform is designed, and the control methods and specific
structures of the mobile system, mounting platform and main control system are explored. Then,
image recognition technology is used to extract image features and visual saliency detection and
combined with machine control algorithms to realize dynamic obstacle avoidance, target recognition
and grasping, and path planning. Finally, the performance of the robot in recognizing and grasping
target artifacts from complex scenes and the effect of image semantic segmentation are analyzed by
comparing with the ORB algorithm, and the trajectory-following performance of the storage robot is
analyzed by comparing the difference between the actual trajectory and the predetermined trajectory.
2 Image Recognition Technology
2.1 Image Preprocessing
Image preprocessing is an indispensable step in object shape recognition. In actual image acquisition,
the external environment introduces considerable interference, so the image must be preprocessed
to eliminate the influence of irrelevant information. In this paper, the image is filtered using the
Gaussian filtering method. To enhance the reliability of object shape recognition, binary threshold
segmentation based on the Otsu algorithm is used to separate the object from the background.

The Otsu algorithm is a classical optimal threshold selection algorithm for real image segmentation.
It is easy to compute and is insensitive to image brightness and contrast, so it separates the object
from the background well, which improves the accuracy of subsequent object shape recognition.
Whereas other threshold segmentation methods rely on user-defined thresholds, the Otsu method
obtains a suitable binarization threshold automatically, which makes it well suited to engineering
practice.

The principle of the Otsu algorithm is that the optimal threshold splits the image grayscale histogram
into two classes such that the between-class variance is maximized, i.e., the two classes are maximally
separated. The cumulative probability, mean gray level, and variance of each class are defined as
follows:
$$P_0(T) = \sum_{i=0}^{T} p_i \quad (1)$$

$$P_1(T) = \sum_{i=T+1}^{L-1} p_i = 1 - P_0(T) \quad (2)$$

$$\mu_0(T) = \frac{1}{P_0(T)} \sum_{i=0}^{T} i\, p_i \quad (3)$$

$$\mu_1(T) = \frac{1}{P_1(T)} \sum_{i=T+1}^{L-1} i\, p_i \quad (4)$$

$$\sigma_0^2(T) = \frac{1}{P_0(T)} \sum_{i=0}^{T} \left(i - \mu_0(T)\right)^2 p_i \quad (5)$$
Let $\mu$ denote the global mean, and let $\sigma_b^2(T)$ and $\sigma_w^2(T)$ denote the between-class variance and within-class
variance, respectively. Then:

$$\mu = \sum_{i=0}^{L-1} i\, p_i = P_0(T)\mu_0(T) + P_1(T)\mu_1(T) \quad (6)$$
$$\sigma_b^2(T) = P_0(T)\left(\mu_0(T) - \mu\right)^2 + P_1(T)\left(\mu_1(T) - \mu\right)^2 \quad (7)$$

$$\sigma_w^2(T) = P_0(T)\sigma_0^2(T) + P_1(T)\sigma_1^2(T) \quad (8)$$

From this, the optimal threshold is deduced:

$$T^{*} = \mathop{\arg\max}_{0 \le T \le L-1} \sigma_b^2(T) \quad (9)$$
2.2 Shape Recognition
Image feature extraction uses polygonal approximation to obtain the basic contour of the target.
Polygonal approximation identifies the shape of an object through a polygon fitting function, which
is implemented here with the Douglas-Peucker algorithm.

The algorithm is applied as follows. First, make a chord of the target curve: assuming the first point
of the curve is A and the last point is B, connect the first and last points to form a straight line AB,
which is regarded as the chord of the target curve. Then, find the point C on the target curve that is
farthest from the chord, and let the maximum distance between the curve and the chord be d.
Compare d with a pre-set threshold: if d is smaller than the specified threshold, this chord can be
taken as an approximation of the curve segment, and processing of this segment is finished. If d is
greater than the threshold, the segment is divided at point C into two segments, AC and CB, and the
above operation is repeated on each until all curve segments have been processed. In this paper, the
side lengths derived from the output polygon point set are used to determine the shape of the
polygon.
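The steps above can be sketched as a recursive Douglas-Peucker routine; the tolerance and the sample curve are illustrative:

```python
import math

def perpendicular_distance(pt, a, b):
    """Distance from pt to the chord through a and b."""
    (x, y), (x1, y1), (x2, y2) = pt, a, b
    num = abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1)
    den = math.hypot(x2 - x1, y2 - y1)
    return num / den if den else math.hypot(x - x1, y - y1)

def douglas_peucker(points, eps):
    """Keep the farthest point C when its chord distance d exceeds eps,
    splitting the curve at C; otherwise approximate the span by chord AB."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    dists = [perpendicular_distance(p, a, b) for p in points[1:-1]]
    d_max = max(dists)
    if d_max <= eps:
        return [a, b]                  # chord approximates this curve segment
    i = dists.index(d_max) + 1         # index of the split point C
    left = douglas_peucker(points[:i + 1], eps)
    right = douglas_peucker(points[i:], eps)
    return left[:-1] + right           # merge, dropping the duplicated C

# a noisy "L" shape collapses to its three corner points
curve = [(0, 0), (1, 0.05), (2, -0.04), (3, 0), (3.02, 1), (3, 2)]
print(douglas_peucker(curve, 0.1))  # [(0, 0), (3, 0), (3, 2)]
```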
2.3 Moving Target Localization
The grayscale center-of-gravity method is used to calculate the center of gravity of an object and
locate the position of a moving target. The center-of-gravity coordinates are obtained by computing
the moments of the image: if the pixels of the image are written as a density function $f(x, y)$, the
expectation of the pixel points is the moment of origin of the image. The formula for the
$(p, q)$-order moment of origin is:

$$m_{pq} = \sum_{x=1}^{M} \sum_{y=1}^{N} x^{p} y^{q} f(x, y) \quad (10)$$

where $m_{pq}$ denotes the weighted sum over all pixels in the image, with the coordinates $x, y$ of each
pixel multiplied by the corresponding scale factors $x^{p}, y^{q}$.

Since the first-order moments of a figure are related to its shape, the center of gravity of the object
is found using the method of first-order moments in this scheme, and the center of gravity is given
by:

$$C = \left(\frac{m_{10}}{m_{00}}, \frac{m_{01}}{m_{00}}\right) \quad (11)$$
where $m_{00}$ is the sum over all non-zero regions of the image, i.e., the length of the processed
contour:

$$m_{00} = \sum_{x=-r}^{r} \sum_{y=-r}^{r} f(x, y) \quad (12)$$

$m_{10}$ and $m_{01}$ denote the accumulation of the contour in the $x$- and $y$-directions, respectively:

$$m_{10} = \sum_{x=-r}^{r} \sum_{y=-r}^{r} x \cdot f(x, y) \quad (13)$$

$$m_{01} = \sum_{x=-r}^{r} \sum_{y=-r}^{r} y \cdot f(x, y) \quad (14)$$
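Equations (10)-(14) amount to weighted sums over pixel coordinates; a minimal NumPy sketch of the center-of-gravity computation of Eq. (11), on an illustrative binary blob:

```python
import numpy as np

def centroid(f):
    """Center of gravity from first-order image moments (Eq. 11):
    C = (m10/m00, m01/m00) with m_pq = sum of x^p * y^q * f(x, y)."""
    ys, xs = np.indices(f.shape)       # row index = y, column index = x
    m00 = f.sum()                      # zeroth-order moment (Eq. 12)
    m10 = (xs * f).sum()               # accumulation along x (Eq. 13)
    m01 = (ys * f).sum()               # accumulation along y (Eq. 14)
    return m10 / m00, m01 / m00

# a bright 3x3 blob centered at (x=5, y=2) in a dark image
img = np.zeros((10, 10))
img[1:4, 4:7] = 1.0
cx, cy = centroid(img)
print(cx, cy)  # 5.0 2.0
```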
3 Application of image recognition technology in the design of warehouse logistics robots
3.1 Warehouse logistics robot mobile platform design
The warehouse logistics robot system designed in this paper includes a mechanical system, a master
control system, a remote system, and a drive system. The mechanical system, in turn, includes two
subsystems: the mobile platform and the piggyback platform. The following content gives a brief
introduction to some systems.
3.1.1 Mobile system design
The mobile system is the foundation of the mobile robot: the various sensors, controllers, actuators,
and carrying platforms all need the mobile platform as a carrier. At the same time, the mobile
platform must realize the mobile robot's most basic function, namely movement.

Robots move in a variety of ways, the main locomotion mechanisms being wheeled, tracked, legged,
and so on. The tracked structure applied to small ground mobile robots is currently one of the
mainstream directions of mobile robot development, especially for armed, explosive-ordnance, and
special-terrain mobile robots. Figure 1 shows the specific tracked moving mechanism.
Figure 1. Diagram of the moving mechanism of the robot
3.1.2 Mounting platform design
Building on the design of this series of robots, the warehouse logistics robot is equipped with an
upper mounting platform to realize its versatility. The difficulty of the design is that the mounting
platform must have a universal interface to carry different devices and, at the same time, provide a
number of degrees of freedom so that the upper mount can be operated flexibly and controlled easily
without affecting the stability of the robot body.
According to the indicators of the warehouse logistics robot, the robot is required to be able to carry
a variety of devices, such as firefighting, reconnaissance, detection, robotic arms and light weapons.
After considering the motion characteristics of general devices, it is determined that the robot
mounting platform can have two degrees of freedom: one is 360° of rotational freedom around the
vertical direction, and the other is ±90° of pitching freedom around the horizontal direction. The
specified task can be accomplished by a typical device with these two degrees of freedom. A
schematic diagram of the degrees of freedom of the piggyback platform is depicted in Figure 2.
Figure 2. Schematic diagram of the degrees of freedom of the hitching platform
Axis 1 and axis 2 are each driven by a brushless DC servo motor through a worm gear. Worm drives
can transfer power and motion between two axes crossed at 90° and are usually used in medium-
and small-power applications with short work periods. Compared with other transmission methods,
the worm drive has a compact structure, a large transmission ratio, smooth transmission, low noise,
and easy self-locking. To ensure that recoil during the launch process does not affect targeting, the
transmission system must self-lock reliably, and the self-locking characteristic of the worm gear
drive meets this requirement.
3.1.3 Main control system design
The warehouse logistics mobile robot has a variety of environment-sensing and control functions.
For this reason, the robot control system must offer good real-time coordination, reliability,
scalability, and openness. The control system adopts a distributed control mode: the upper level is a
PC-based industrial computer, the lower-level controllers are Elmo Harmonica digital servo drives
(Israel) with independent control and computing capabilities, and the upper machine communicates
with the lower drives over RS232 serial links. According to the system requirements, the industrial
computer uses a Pentium 4 processor with a main frequency of 800 MHz, an 80 GB hard disk, and
512 MB of memory, together with an Advantech full-length motherboard supporting PCI and ISA
slots, two USB interfaces, and two serial interfaces; a multi-serial card is added for expansion, and
the chassis model is Advantech 1P0-6806 (B). A digital magnetic compass detects the robot's
high-precision heading information, an inclination sensor obtains the robot's attitude information,
and GPS localizes the mobile robot. A camera mounted on top of the robot's mounting platform
obtains the video information in front of the mobile robot and transmits it to the industrial computer
through an image capture card. Figure 3 shows the distribution of the ultrasonic sensors and
photoelectric switches.
Figure 3. Distribution of ultrasonic sensor and photoelectric switch installation
3.2 Machine vision for warehouse logistics robots
3.2.1 Feature extraction
Because image formation is easily affected by weather, lighting, or vehicle vibration during
movement, various types of noise are easily introduced into the input image, and this noise tends to
cause redundant computation in data processing; the image must therefore be corrected by filtering
before salient maps are generated and the image is segmented. On the other hand, obstacles in the
agricultural environment generally differ from their surroundings, so the input obstacle image can
be Gaussian filtered to analyze the texture, color, and frequency intensity of different regions of the
image and thus initially locate the pixel region where the salient objects lie.
If the image is defined as $I(u, v)$ and the Gaussian filter as $G(u, v, \sigma)$, the filtering process with a
single filter can be expressed as:

$$R(u, v, \sigma) = I(u, v) * G(u, v, \sigma) \quad (15)$$

$$G(u, v, \sigma) = \frac{1}{2\pi\sigma^{2}} e^{-\frac{u^{2}+v^{2}}{2\sigma^{2}}} \quad (16)$$
When the obstacle information in the image is converted into frequency domain information, the
high-frequency information generally indicates the position and contour of the obstacle in the image,
while the low-frequency information indicates the basic features of the obstacle, such as color and
texture. In order to fully extract the feature information of the obstacle and describe the area where
the obstacle is located and its basic contour as clearly as possible, the low-frequency information in
the frequency domain should be retained to the maximum extent before the computation of the
saliency map, and part of the high-frequency information should be filtered out selectively. The
process is as follows:
$$R(u, v, \sigma_1, \sigma_2) = I(u, v) * \left(G(u, v, \sigma_1) - G(u, v, \sigma_2)\right) \quad (17)$$

The bandwidth between the high and low frequencies is related to the ratio $\rho = \sigma_1/\sigma_2$;
generally $\rho$ is taken as 1.6 to ensure that the outline of the obstacle can be detected accurately.
3.2.2 Visual saliency detection
After Gaussian filtering for image denoising and feature extraction, further computation of pixel
features is required to obtain the saliency map of the operational environment in order to select the
appropriate saliency detection method for the next step of image segmentation.
Some saliency methods capture the locations of all obstacles accurately and highlight them to
different degrees according to their saliency level while suppressing the background well, but they
describe obstacle outlines poorly: the highlighted edges cannot accurately describe the shape of the
obstacles, which may introduce errors into image segmentation. The method adopted here mainly
transforms the RGB image to the Lab color space and then uses a center-surround operator to
segment regions of the image suspected to be obstacles; its classification computation is relatively
simple and can be expressed as follows:
$$S(u, v) = \left\| I_{\mu} - I_{whc}(u, v) \right\| \quad (18)$$

where $I_{\mu}$ is the pixel average of the converted Lab image and $I_{whc}$ is the feature value of a single
vector after Gaussian filtering. This formula calculates the Euclidean distance between the feature
values of all vectors in the image and the average value, which smoothly highlights the regions of
suspected obstacles and facilitates the next segmentation step.
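A minimal sketch of the center-surround computation in Eq. (18), assuming a frequency-tuned style of saliency (per-channel mean minus smoothed pixel features, combined as a Euclidean distance); the box blur here is a cheap stand-in for the Gaussian filtering of the text, and the single-channel test image is illustrative:

```python
import numpy as np

def box_blur(img, r=1):
    """Tiny box blur used as a stand-in for Gaussian smoothing."""
    p = np.pad(img, r, mode="edge")
    out = sum(p[r + dy: r + dy + img.shape[0], r + dx: r + dx + img.shape[1]]
              for dy in range(-r, r + 1) for dx in range(-r, r + 1))
    return out / (2 * r + 1) ** 2

def saliency_map(channels):
    """Eq. (18): Euclidean distance between the per-channel image mean
    I_mu and the smoothed pixel features I_whc. `channels` is a list of
    2-D arrays, e.g. the L, a, b planes of a Lab image."""
    s = np.zeros_like(channels[0], dtype=float)
    for c in channels:
        s += (c.mean() - box_blur(c)) ** 2
    return np.sqrt(s)

# a bright blob on a dark background should light up in the saliency map
img = np.zeros((16, 16)); img[6:10, 6:10] = 1.0
s = saliency_map([img])
print(s[8, 8] > s[0, 0])  # blob interior is more salient than background
```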
3.2.3 Obstacle collision prediction
Collision prediction for obstacles is one of the main reference indexes for the speed control strategy
of warehouse robots when they encounter obstacles. The spatio-temporal occupancy grid map of
obstacles is a commonly used environment modeling method for path planning, which represents
the occupancy state of obstacles, especially moving obstacles, in the current environment in 2.5
dimensions. As shown in Figure 4, it models the environment around the agricultural robotic vehicle
with the current moment as the zero moment and the vertical coordinate as an infinitely extendable
time axis. If a point $P(x, y, t)$ in the coordinate system satisfies $T(P) = 1$, grid cell $(x, y)$ is occupied
by an obstacle at moment $t$; otherwise it is noted as $T(P) = 0$.
Figure 4. Space-time occupation grid map
Figure 5 shows the spatio-temporal obstacle collision model. In a general spatio-temporal grid map,
the motion state of the unmanned vehicle is represented as a cone-like time surface, as shown in
Figure 5(a): the location of the robotic vehicle at the current moment is $(x_0, y_0)$, the location of the
obstacle is $(x_r, y_r)$, and all the possible traveling routes of the vehicle assemble into a cone cutting a
rectangle, which transforms the problem of collision prediction with a moving obstacle into the
problem of a line intersecting a surface in space. However, for unmanned agricultural vehicles with
definite operation targets and fixed operation routes, collision prediction reduces to the intersection
of lines with lines in space, which places very high demands on the accuracy of motion-obstacle
prediction; in that case this description is not appropriate. In this paper, on the basis of motion
prediction of obstacles, the hazard amplification method is adopted to account for uncertainty, as
shown in Figure 5(b): a hazard domain is determined from the dimensions of obstacles and their
threat levels, and the line-line intersection is transformed into the intersection of a line with a
spatial body, in order to guarantee the safety of unmanned vehicle traveling and operation.
(a) Schematic diagram of the conical spatio-temporal obstacle grid; (b) Schematic diagram of the columnar spatio-temporal obstacle grid
Figure 5. Space-time obstacle collision model
The radius $r$ of the danger domain is positively correlated with the size of the obstacle and its degree
of danger: the larger the obstacle and the higher its danger level, the larger the radius of the danger
domain. Let the longest side of the detected obstacle be $R$ and the danger level of the obstacle be
$k_{vau}$; then

$$r = R/2 + \left(m_a k_{vau}\right)^{1/2}$$

where $m_a$ is an adjustment parameter whose actual value is set according to the specific needs of
the tests; in this paper $m_a = 0.2$.
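The danger-domain radius is a one-line formula; a small sketch with illustrative obstacle values (the longest side and danger level below are made up for the example):

```python
import math

def danger_radius(longest_side, k_vau, m_a=0.2):
    """Danger-domain radius r = R/2 + sqrt(m_a * k_vau): grows with the
    obstacle's longest side R and its danger level k_vau. m_a = 0.2 is
    the adjustment parameter used in this paper's tests."""
    return longest_side / 2 + math.sqrt(m_a * k_vau)

print(danger_radius(1.0, 0.8))  # 0.5 + sqrt(0.16), roughly 0.9
```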
Figure 6 shows the straight-line prediction model for moving obstacles. Let the current position of
the warehouse robot be $A(x_0, y_0)$, moving towards the operation target point B. C and D are static
obstacles, E is a dynamic obstacle, and $E(x_{r0}, y_{r0})$ is the current position of the dynamic obstacle.
$v_r$: obstacle velocity; $a_r$: obstacle acceleration; $\theta$: angle between the obstacle's direction of motion and the positive direction of the X-axis
Figure 6. Straight-line prediction model of moving obstacles
The predicted traveling trajectory equations for a moving obstacle are:

$$x_r - x_0 - v_r t \cos\theta - \frac{1}{2} a_r t^{2} \cos\theta = 0$$
$$y_r - y_0 - v_r t \sin\theta - \frac{1}{2} a_r t^{2} \sin\theta = 0 \quad (19)$$

where $v_r$ and $a_r$ are the traveling speed and acceleration of the obstacle, $t$ is the movement time,
and $\theta$ is the angle between the movement direction of the obstacle and the $X$-axis. The speed,
acceleration, and motion direction of the obstacle can be obtained from the a priori information given
by the LiDAR.
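Equation (19), solved for the obstacle position, gives a simple kinematic predictor; the speeds and times below are illustrative:

```python
import math

def predict_obstacle(x0, y0, v_r, a_r, theta, t):
    """Straight-line prediction (Eq. 19): the obstacle moves from (x0, y0)
    with speed v_r and acceleration a_r along direction theta."""
    s = v_r * t + 0.5 * a_r * t * t        # distance traveled in time t
    return x0 + s * math.cos(theta), y0 + s * math.sin(theta)

# obstacle starting at (1, 2), 0.5 m/s, no acceleration, moving along +X
x, y = predict_obstacle(1.0, 2.0, 0.5, 0.0, 0.0, 4.0)
print(x, y)  # 3.0 2.0
```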
Taking the LiDAR as the coordinate origin, let the current moment be $t_0$. If the vehicle deflects by
an angle $\theta_c$ during its movement, the occupancy coordinates $(x_{rt}, y_{rt})$ of the obstacle after the
sensor information acquisition interval $\Delta t$ are obtained as:

$$\begin{bmatrix} x_{rt} \\ y_{rt} \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta_c & -\sin\theta_c & -d_a\sin\theta_c \\ \sin\theta_c & \cos\theta_c & -d_a\cos\theta_c \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_{r0} \\ y_{r0} \\ 1 \end{bmatrix}, \qquad d_a = \left(v_a + v_{car}\right)\Delta t / 2 \quad (20)$$

$$d_r = \sqrt{\left(x_{rt} - x_{r0}\right)^{2} + \left(y_{rt} - y_{r0}\right)^{2}}, \qquad v_r = d_r / \Delta t \quad (21)$$

where $d_a$ is the distance traveled by the vehicle during the interval $\Delta t$ and $d_r$ is the distance moved
by the obstacle during the interval $\Delta t$. It should be noted that the $Y$-axis of each grid map is parallel
to the traveling direction of the warehouse robot, with the direction of the vehicle's motion as the
positive direction; when calculating the obstacle motion parameters, only the self-motion of the
unmanned vehicle in the direction of the $Y$-axis needs to be eliminated. The angle between the
obstacle's motion direction and the $X$-axis is:

$$\theta = \arctan\left(\frac{y_{rt} - y_{r0}}{x_{rt} - x_{r0}}\right) - \theta_c \quad (22)$$
For ease of presentation, the predicted trajectory equations of the obstacle are written in the
following form:

$$x_r = f(t), \qquad y_r = g(t) \quad (23)$$

Then the spatial cylinder equation for the hazard domain of the obstacle is:

$$\left(x - f(t)\right)^{2} + \left(y - g(t)\right)^{2} \le r^{2} \quad (24)$$
Assume the warehouse robot travels at a uniform speed, maintaining its current state of motion; then
the operating time should be proportional to the travel distance and inversely proportional to the
speed. However, the robot does not travel along its operating route or towards its target point in a
perfectly straight line and must constantly adjust its direction of travel, so for the test vehicle in this
study the formula is corrected, based on field operation tests, to:

$$t = \left(1 + p\left(x - x_0\right)\left(y - y_0\right)/q\right)\sqrt{\left(x - x_0\right)^{2} + \left(y - y_0\right)^{2}} \,/\, v_a \quad (25)$$

where $p$ and $q$ are correction coefficients and $v_a$ is the traveling speed of the agricultural vehicle.
The operating path equation of the unmanned operating vehicle is assumed to be:

$$y = kx + b \quad (26)$$

where $k$ is the coefficient of the path equation in the current spatio-temporal coordinates and $b$ is a
constant.
The collision prediction equation is:

$$\begin{cases} \left(x - f(t)\right)^{2} + \left(y - g(t)\right)^{2} \le r^{2} \\ t = \left(1 + p\left(x - x_0\right)\left(y - y_0\right)/q\right)\sqrt{\left(x - x_0\right)^{2} + \left(y - y_0\right)^{2}} \,/\, v_a \\ y = kx + b \\ \text{s.t. } t \ge 0,\ 0 \le x \le x_m,\ 0 \le y \le y_m \end{cases} \quad (27)$$
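Under the assumptions of Eqs. (23)-(27), a time-sampled check of whether the straight path $y = kx + b$ ever enters the obstacle's danger cylinder might look like this. The sampling step, speeds, and obstacle motion are illustrative, and the time correction of Eq. (25) is simplified to uniform speed:

```python
import math

def collides(f, g, r, k, b, v_a, t_max=60.0, dt=0.1):
    """Sample the robot's straight path y = k*x + b at uniform speed v_a
    (Eq. 26) and test whether it enters the obstacle's danger cylinder
    (x - f(t))^2 + (y - g(t))^2 <= r^2 (Eqs. 24 and 27)."""
    t = 0.0
    while t <= t_max:
        # robot position after traveling a distance v_a * t along the path
        dx = v_a * t / math.sqrt(1 + k * k)
        x, y = dx, k * dx + b
        if (x - f(t)) ** 2 + (y - g(t)) ** 2 <= r * r:
            return True
        t += dt
    return False

# obstacle crossing the robot's path: both reach x = 5 at about t = 10 s
f = lambda t: 5.0                 # obstacle fixed at x = 5
g = lambda t: -5.0 + 0.5 * t      # moving up the Y-axis at 0.5 m/s
print(collides(f, g, r=0.8, k=0.0, b=0.0, v_a=0.5))  # True
```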
3.2.4 Path planning algorithms
In this paper, we use the DGA* algorithm, optimized based on the SAS algorithm, to implement path
planning. The valuation function of the DGA* algorithm is:

$$f(i) = g(i) + h(i) \quad (28)$$

where $f(i)$ is the estimated cost from the origin through the current node $i$ to the goal point, $g(i)$ is
the actual cost in the state space from the initial point to the current node $i$, and $h(i)$ is the estimated
cost of the optimal path from the current node to the goal point. When the estimated cost $h(i)$ is less
than or equal to the actual distance from node $i$ to the end point, more nodes are searched and the
efficiency is lower, but the optimal result can be obtained. Conversely, if the estimated cost is greater
than the actual value, fewer nodes are searched, but the optimal solution cannot be obtained every
time. The straight-line distance between the current node and the goal point is therefore generally
chosen as the estimated cost, i.e.:

$$h(i) = \sqrt{\left(x_i - x_D\right)^{2} + \left(y_i - y_D\right)^{2}} \quad (29)$$

where $(x_i, y_i)$ are the coordinates of the current node $i$ and $(x_D, y_D)$ are the coordinates of the
target point D.
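The estimation function $f(i) = g(i) + h(i)$ with the Euclidean $h(i)$ of Eq. (29) can be sketched as a plain grid A*; the guide-field extension of DGA* is omitted, and the grid, start, and goal below are illustrative:

```python
import heapq, math

def a_star(grid, start, goal):
    """Grid A* with f(i) = g(i) + h(i) (Eq. 28), where h is the straight-
    line distance to the goal (Eq. 29); 1 marks an occupied cell."""
    h = lambda p: math.hypot(p[0] - goal[0], p[1] - goal[1])
    open_set = [(h(start), 0.0, start, [start])]
    seen = set()
    while open_set:
        _, g_cost, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        x, y = node
        for nx, ny in ((x+1, y), (x-1, y), (x, y+1), (x, y-1)):
            if 0 <= nx < len(grid) and 0 <= ny < len(grid[0]) and not grid[nx][ny]:
                new_g = g_cost + 1.0   # unit cost per 4-connected move
                heapq.heappush(open_set,
                               (new_g + h((nx, ny)), new_g, (nx, ny),
                                [*path, (nx, ny)]))
    return None

# a wall with one gap forces the path through cell (2, 2)
grid = [[0] * 5 for _ in range(5)]
for col in (0, 1, 3, 4):
    grid[2][col] = 1                   # row 2 blocked except column 2
path = a_star(grid, (0, 0), (4, 4))
print(len(path))  # 9: the shortest 4-connected path has 8 moves
```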
Figure 7 shows node expansion with guide fields. At each expansion step, take the minimum-cost
point among the successor nodes as the node to be expanded next. When the guide number of node
$n$ is $k$ and the path can pass through guide field $G_k$ while exploring the successors of $n$, the guide
number of the successor node becomes $k + 1$, indicating that the search tree has passed through
guide field $G_k$; otherwise, the guide number of the successor node remains $k$, indicating that the
computed paths in the search tree have still not passed through guide field $G_k$. This continues until
the planned path passes through the goal point, i.e., the last guide field is reached, at which point the
algorithm ends and the path search is successful.
Figure 7. The way of node expansion with guide fields
The guiding field of the DGA* algorithm is generally set by the operator based on experience, but
this study introduces the concept of a hazardous domain based on the cloud model, and the hazardous
domain dimensions can be used as the dimensions of the guiding field. Figure 8 shows the
determination method of the guidance field.
Figure 8. Determination method of guide fields
Point A is the location of the current expansion node, line AB is the original operating path of the
operating vehicle, the circular area G is the area occupied by the obstacle, and MN or PQ is parallel
to line AB and tangent to the circular edge of obstacle region G. The steering point of the operating
vehicle should keep a safe distance $d$, measured perpendicular to MN or PQ, from the obstacle
area; from the concept of the hazardous domain, the size of $d$ is $\left(m_a k_{vau}\right)^{1/2}$. The circular areas
containing points E and F are the steering fields set up by the operator for the robot's right and left
turns, and the field centers, points E and F, lie on the edge of the hazardous-area circle. Once the
specific location and size of the guide field are determined, the re-planned path can smoothly avoid
obstacles and realize path planning as long as it passes through the guide field.
4 Results and analysis
4.1 Positioning and Grasping Effect Analysis
The main purpose here is to identify and grasp the target workpiece in a complex scene and to verify
the feasibility and stability of the system. In the experiment, the workpiece is placed in different
positions, the method of this paper is used to match and identify the minimum bounding rectangle
to determine the centroid coordinates of the workpiece, and the positional information of the
workpiece and the time taken to identify and locate it are measured. The pixel coordinates of the
feature points correctly matched by the algorithm are averaged as the centroid coordinates for 3D
reconstruction; each position is repeated 10 times and the average value is taken. The experimental
results of the method in this paper are displayed in Table 1, and the experimental results of the ORB
algorithm are displayed in Table 2.
Table 1. Experimental results of the method in this paper (error unit: mm; time unit: s)

| Serial number | Reconstructed coordinates | Actual coordinates | X-axis error | Y-axis error | Z-axis error | Identification time | Positioning time |
|---|---|---|---|---|---|---|---|
| 1 | (22.6, -7.7, -646.5) | (14.5, -15.9, -641.0) | 7.35 | 8.39 | 1.68 | 0.33 | 0.78 |
| 2 | (13.6, 14.1, -645.4) | (20.5, 10.8, -641.4) | 6.23 | 5.43 | 3.48 | 0.01 | 0.68 |
| 3 | (9.7, -37.4, -641.5) | (1.9, -34.1, -641.0) | 7.84 | 6.20 | 1.51 | 0.15 | 0.56 |
| 4 | (43.5, -108.4, 671.7) | (49.5, -112.5, -640.0) | 5.84 | 5.17 | 1.74 | 0.31 | 0.90 |
| 5 | (169.3, 25.9, -662.3) | (154.2, 153, -640) | 7.32 | 6.56 | 2.18 | 0.43 | 0.74 |
| 6 | (16.8, -13.8, -645.9) | (21, -7.5, -641.0) | 4.49 | 5.68 | 1.82 | 0.36 | 0.53 |
| 7 | (149.7, -18.9, 643.5) | (142.2, -23.2, 640.0) | 6.36 | 5.33 | 3.08 | 0.19 | 0.75 |
| 8 | (59.6, -2.5, -639.7) | (52.3, 4.1, -640.5) | 7.14 | 6.37 | 1.30 | 0.16 | 0.89 |
| 9 | (201.2, -58.1, -643.5) | (195.9, -65.2, -660.0) | 7.32 | 6.06 | 2.98 | 0.17 | 0.63 |
| 10 | (157.6, 43.6, -642.1) | (165.3, 36.4, -646.0) | 6.10 | 6.91 | 1.91 | 0.34 | 0.69 |
Table 2. Experimental results of the ORB algorithm (error unit: mm)

| Serial number | Reconstructed coordinates | Actual coordinates | X-axis error | Y-axis error | Z-axis error |
|---|---|---|---|---|---|
| 1 | (22.6, -7.7, -646.5) | (14.5, -15.9, -641.0) | 10.26 | 10.50 | 2.65 |
| 2 | (13.6, 14.1, -645.4) | (20.5, 10.8, -641.4) | 9.53 | 11.65 | 3.50 |
| 3 | (9.7, -37.4, -641.5) | (1.9, -34.1, -641.0) | 12.46 | 9.58 | 4.47 |
| 4 | (43.5, -108.4, -671.7) | (49.5, -112.5, -640.0) | 9.75 | 7.18 | 3.23 |
| 5 | (169.3, 25.9, -662.3) | (154.2, 15.3, -640.0) | 6.67 | 8.59 | 1.99 |
| 6 | (16.8, -13.8, -645.9) | (21.0, -7.5, -641.0) | 6.20 | 7.77 | 3.03 |
| 7 | (149.7, -18.9, -643.5) | (142.2, -23.2, -640.0) | 10.17 | 9.87 | 4.81 |
| 8 | (59.6, -2.5, -639.7) | (52.3, 4.1, -640.5) | 10.20 | 11.79 | 2.92 |
| 9 | (201.2, -58.1, -643.5) | (195.9, -65.2, -660.0) | 11.25 | 10.15 | 4.24 |
| 10 | (157.6, 43.6, -642.1) | (165.3, 36.4, -646.0) | 8.91 | 9.02 | 3.59 |
Figure 9 shows the comparison of the coordinate-axis errors of the two methods. The average
recognition time of this paper's method is 0.2 s, the average localization time is 0.7 s, and the
average total recognition and localization time is 2.44 s, which is less than the set total
recognition and localization time and meets the target of this paper. For this paper's method, the
average error is 7 mm on the X-axis, 6.5 mm on the Y-axis, and 2.8 mm on the Z-axis, whereas the
average error of the ORB algorithm is 9.9 mm, 9.6 mm, and 3.8 mm on the X-axis, Y-axis, and Z-axis,
respectively. The error of this paper's method is therefore much smaller and meets the accuracy
requirement. The main reason for the error of the ORB algorithm is the uneven distribution of its
feature points, which results in a larger coordinate error. The main sources of error in this
paper's method are hardware error and the noise and incorrectly matched feature points that are not
eliminated during stereo matching.
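The per-axis averages quoted above are plain mean absolute errors over the placements. A minimal sketch of that computation (helper names and the data layout are assumptions for illustration; coordinates in mm):

```python
def axis_errors(reconstructed, actual):
    """Per-axis absolute error between a reconstructed and a
    ground-truth 3D coordinate (mm)."""
    return tuple(abs(r - a) for r, a in zip(reconstructed, actual))

def mean_axis_errors(pairs):
    """Average the X-, Y- and Z-axis errors over all
    (reconstructed, actual) coordinate pairs."""
    errs = [axis_errors(r, a) for r, a in pairs]
    n = len(errs)
    return tuple(sum(e[i] for e in errs) / n for i in range(3))
```

For instance, two pairs with X errors of 1 mm and 2 mm yield an average X-axis error of 1.5 mm.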
Figure 9. Comparison of axis errors between the two methods
The ORB algorithm and this paper's method were used to recognize and grasp several kinds of
workpieces, with 50 attempts per kind, and the numbers of successful recognitions and successful
grasps were recorded. The results show that the grasping success rate of this paper's method
exceeds 93% in every case. Table 3 shows the recognition and grasping success rates.
Table 3. Recognition and grasping success rates

| Logistics piece | ORB: successful recognitions | ORB: successful grasps | ORB: success rate (%) | This paper: successful recognitions | This paper: successful grasps | This paper: success rate (%) |
|---|---|---|---|---|---|---|
| Logistics piece 2 | 47 | 42 | 89.4 | 49 | 46 | 93.88 |
| Logistics piece 3 | 49 | 44 | 89.8 | 49 | 48 | 97.96 |
| Logistics piece 4 | 50 | 47 | 94.0 | 50 | 49 | 98.00 |
| Logistics piece 5 | 48 | 45 | 93.8 | 50 | 48 | 96.00 |
| Logistics piece 6 | 47 | 46 | 97.9 | 48 | 47 | 97.92 |
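The tabulated rates are consistent with defining the grasping success rate as successful grasps divided by successful recognitions (e.g. 46/49 ≈ 93.88%). A one-line sketch of that computation, under that assumption (the function name is hypothetical):

```python
def success_rate(grasps, recognitions):
    """Grasping success rate in percent: successful grasps divided
    by successful recognitions, rounded to two decimals."""
    return round(100.0 * grasps / recognitions, 2)
```

For example, `success_rate(46, 49)` gives 93.88, matching Table 3.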
4.2 Trajectory following performance analysis
To test the trajectory-following performance of the warehouse robot, the robot's motion trajectory
is set as a circle, the robot moves along this preset trajectory, and the difference between the
actual trajectory and the predetermined trajectory is recorded. Figure 10 shows the
trajectory-following diagram of the warehouse robot. The experimental results show that the
robot's trajectory following on the straight sections is good, with a maximum error of less than
21 mm, which meets the needs of practical use. On the curved sections, the test results show that
the frequency of speed changes is reduced and the body stability is good.
Figure 10. Trajectory following diagram of the warehouse robot
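For a circular reference path, the deviation of each recorded pose from the nominal circle is one way the maximum tracking error could be computed. This is an illustrative sketch (the centre, radius, and sampled poses are hypothetical), not the authors' measurement code:

```python
import math

def circle_tracking_error(points, center, radius):
    """Deviation of each recorded (x, y) pose from a circular
    reference path, |distance-to-centre - radius|; returns the
    maximum deviation (mm)."""
    cx, cy = center
    devs = [abs(math.hypot(x - cx, y - cy) - radius) for x, y in points]
    return max(devs)
```

A pose 1020 mm from the centre of a 1000 mm reference circle, for instance, contributes a 20 mm deviation.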
4.3 Comparative analysis of image semantic segmentation effect
4.3.1 Evaluation of the effect of semantic segmentation of images of various categories
To verify the image segmentation effect of this paper's method, training and testing were
completed on the Cityscapes dataset using this paper's method and the ORB algorithm, respectively,
and the experimental results were compared. First, the IoU and CPA of the algorithm were
calculated for each semantic category. Figure 11 shows the IoU and CPA results of this paper's
method on the dataset, and Figure 12 shows those of the ORB algorithm.
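Per-class IoU and CPA are computed from the confusion matrix in the standard way; the following sketch assumes rows index ground-truth classes and columns index predicted classes (the matrix layout is an assumption, not stated in the paper):

```python
def per_class_iou_cpa(conf, c):
    """IoU and class pixel accuracy (CPA) for class c, given a
    confusion matrix whose rows are ground-truth classes and whose
    columns are predicted classes."""
    tp = conf[c][c]                        # pixels of c predicted as c
    gt = sum(conf[c])                      # all pixels labelled c
    pred = sum(row[c] for row in conf)     # all pixels predicted as c
    iou = tp / (gt + pred - tp)            # intersection over union
    cpa = tp / gt                          # per-class pixel accuracy
    return iou, cpa
```

For a toy 2-class matrix [[8, 2], [1, 9]], class 0 gets IoU 8/11 and CPA 0.8.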
Both this paper's method and the ORB algorithm achieve good segmentation accuracy and pixel
recognition accuracy for most semantic categories on the Cityscapes dataset. As the comparison
shows, among the 19 semantic categories of warehouse logistics, this paper's method achieves
higher IoU and CPA than the ORB algorithm for 18 of them; only for the wall category are its IoU
and CPA, 59.24% and 56.50%, slightly lower than the 59.47% and 57.89% of the ORB algorithm. This
paper's method attains a recognition rate of 98.99% for roads, 81.03% for windows, 92.01% for
buildings, 92.38% for greenery, 94.94% for sky, and 94.33% for cardboard boxes, a good
segmentation effect. For walls (49.24%), fences (53.75%), ground (65.53%), tape (55.95%), and
doors (54.64%), the segmentation effect is poorer, and the trend of the corresponding CPA values
is consistent with that of the IoU. The wall category has the lowest IoU and CPA of all
categories. The segmentation accuracy and recognition accuracy of this paper's method for semantic
categories such as ground, goods, and shelves are significantly higher than those of the ORB
algorithm: the IoU is higher by 3.35%, 5.75%, 4.18%, and 7.99%, and the CPA by 3.59%, 4.95%,
7.49%, and 8.67%, respectively.
Figure 11. IoU and CPA results of this paper's method on the dataset
Figure 12. IoU and CPA results of the ORB algorithm on the dataset
The comparison shows that this paper's method outperforms the ORB algorithm on the semantic
categories that are easily mis-segmented, and the corresponding CPA is also higher. Therefore, the
method in this paper performs better than the ORB algorithm.
4.3.2 Evaluation of Overall Image Semantic Segmentation Effectiveness
The overall performance of the ORB algorithm and this paper's method on the Cityscapes dataset is
compared in terms of mIoU, MPA, PA, number of algorithm parameters, and validation time. Table 4
shows the overall comparison of the image semantic segmentation results. As the table shows, the
ORB algorithm achieves 72.02% mIoU, 78.84% MPA, and 95.26% PA on the Cityscapes dataset. When the
deep feature extraction structure is replaced with this paper's method, 78.85% mIoU, 86.05% MPA,
and 96.89% PA are achieved on the Cityscapes dataset, improvements of 6.81%, 7.21%, and 0.61%,
respectively. The validation time for a single image and the number of algorithm parameters also
increase, by 7 ms and 1.63 MB, respectively. The improved segmentation accuracy and pixel
recognition accuracy indicate that the method of this paper is very advantageous in image
segmentation.
Table 4. Comparison results of overall image semantic segmentation

| Evaluation indicator | ORB | Algorithm in this paper |
|---|---|---|
| mIoU (%) | 72.02 | 78.85 |
| MPA (%) | 78.84 | 86.05 |
| PA (%) | 95.26 | 96.89 |
| Verification time (ms) | 232 | 239 |
| Number of algorithm parameters (MB) | 17.51 | 19.28 |
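The aggregate indicators used here are the standard means of the per-class quantities: mIoU averages the per-class IoU, MPA averages the per-class pixel accuracy, and PA is the overall fraction of correctly classified pixels. A sketch under the same assumed confusion-matrix layout (rows = ground truth, columns = predictions; function name hypothetical):

```python
def overall_metrics(conf):
    """mIoU, MPA and PA from a ground-truth-by-prediction confusion
    matrix, matching the aggregate indicators compared in Table 4."""
    n = len(conf)
    total = sum(sum(row) for row in conf)
    diag = sum(conf[i][i] for i in range(n))
    ious, cpas = [], []
    for c in range(n):
        tp = conf[c][c]
        gt = sum(conf[c])
        pred = sum(row[c] for row in conf)
        ious.append(tp / (gt + pred - tp))   # per-class IoU
        cpas.append(tp / gt)                 # per-class pixel accuracy
    miou = sum(ious) / n                     # mean IoU over classes
    mpa = sum(cpas) / n                      # mean pixel accuracy
    pa = diag / total                        # overall pixel accuracy
    return miou, mpa, pa
```

On the toy matrix [[8, 2], [1, 9]] this yields PA = 17/20 = 0.85.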
5 Conclusion
This paper first investigates the use of image recognition technology in the design of warehouse
logistics robot platforms and then analyzes the visual design effect of the robot, including the
localization and grasping effect, the trajectory-following performance, and the semantic
segmentation effect. The following conclusions are drawn:

In terms of the localization and grasping effect, the average recognition time of this paper's
method is 0.2 s, the average localization time is 0.7 s, and the average total recognition and
localization time is 2.44 s, which is less than the set total time of recognition and
localization; the successful grasping rate of this paper's method is higher than 93%, meeting the
goal set in this paper.

In terms of trajectory-following performance, the warehouse robot follows the trajectory well on
straight sections, with a maximum error of less than 21 mm, meeting the demands of actual use. On
curved sections, the test results show that the frequency of speed changes is reduced and the
body's stability is good.

In terms of semantic segmentation, the method in this paper achieves 78.85% mIoU, 86.05% MPA, and
96.89% PA on the Cityscapes dataset, improvements of 6.81%, 7.21%, and 0.61%, respectively, over
the ORB algorithm. The validation time for a single image and the number of algorithm parameters
also increase, by 7 ms and 1.63 MB, respectively. The improved segmentation accuracy and pixel
recognition accuracy indicate that the method in this paper is advantageous in image segmentation.