
Industry 4.0, Intelligent Visual Assisted Picking Approach: 6th International Conference, MIKE 2018, Cluj-Napoca, Romania, December 20–22, 2018, Proceedings


Mario Arbulu¹, Paola Mateus¹, Manuel Wagner¹, Cristian Beltran², and Kensuke Harada²
¹ Universidad Nacional Abierta y a Distancia (UNAD), Bogota 111511, Colombia
² Graduate School of Engineering Science, Department of Systems Innovation, Osaka University, 1-3 Machikaneyama, Toyonaka 560-8531, Japan
Abstract. This work presents a novel intelligent, visually assisted picking approach for industrial manipulator robots. An intelligent algorithm that searches for objects in the working area, based on RANSAC, is proposed. The image analysis then uses the Sobel operator to detect the object configurations; finally, motion planning based on screw theory on SO(3) allows the robot to pick up the selected object and move it to a target place. Results and validation of the whole approach are discussed.
Keywords: Artificial intelligence · Autonomous picking · Artificial vision · Sobel · RANSAC · Screws modeling
1 Introduction
The challenges of Industry 4.0 center on integrating process automation, the cloud, and the IoT. Furthermore, robotic manipulation and autonomy are currently being improved by artificial intelligence algorithms [1,2]. For instance, the World Robot Summit (WRS) motivates researchers around the world to overcome challenges in industrial robotics applications as well [3]. Currently, artificial vision is used to assist robotic systems by extracting visual features from given images. Some research has been done on obtaining the necessary features, such as foreground extraction and the removal of noise and unnecessary objects. A color segmentation method is detailed in [4]. Background modeling through statistical edges is given in [5]. Additionally, image-based visual control systems have been used for trajectory tracking [6]. Vision pre-processing also relies on object detection algorithms [7], visual tracking [8], and color intensity [9]. Some of these are conventional methods, which have limitations in object detection; others extract depth features, which require complex processing [10–12].
Supported by UNAD, Convocatoria 007.
Springer Nature Switzerland AG 2018
A. Groza and R. Prasath (Eds.): MIKE 2018, LNAI 11308, pp. 1–10, 2018.
The object detection and feature extraction for the intelligent vision assistance proposed in this work therefore focus on selecting the working planes of interest, where the objects are located. After that, the Sobel operator [13] is used for edge detection, together with morphological operations and dilation [14,15]. Finally, regions are labeled to obtain the features of the objects of interest: area, position (x, y), centroid, and orientation angle.
2 Theoretical Background
This section details the theoretical background on artificial vision, artificial intelligence, and motion computation, which together make up the overall approach proposed (see Fig. 1). The user sends an inquiry for a set of pieces, which triggers the manipulator's intelligent picking task. This proposal will be applied in the Industrial Assembly Challenge of the WRS, specifically in the kitting task.
Fig. 1. Overall approach proposal.
2.1 Artificial Vision Approach
Horizontal and vertical edge detection is performed with the Sobel algorithm [16] (see Fig. 2(b)).
The edges are detected with a central approximation of the first derivative, as follows:
$$\frac{df(x)}{dx} = \frac{f(x+1) - f(x-1)}{2} \qquad (1)$$
with a mask $\frac{1}{2}[-1\ \ 0\ \ 1]$ for the vertical edges, and the mask $\frac{1}{2}[-1\ \ 0\ \ 1]^{T}$ for the horizontal edges.
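Next, in order to remove the false edges generated, the Sobel operator evaluates the gradient along the x and y coordinates ($G_x$ and $G_y$):
$$\nabla f = [G_x,\ G_y] \qquad (2)$$
where, up to a normalization constant, $G_x = [1\ \ 2\ \ 1]^{T}\,[-1\ \ 0\ \ 1]$, with $[1\ \ 2\ \ 1]^{T}$ being the vertical smoothing proposed by the Sobel operator, and $G_y = [-1\ \ 0\ \ 1]^{T}\,[1\ \ 2\ \ 1]$, with $[1\ \ 2\ \ 1]$ being the horizontal smoothing proposed by the Sobel operator.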
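The gradient evaluation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it assumes the standard unnormalized 3×3 Sobel kernels, a grayscale image stored as a list of rows, and a hypothetical `threshold` parameter for turning the gradient magnitude into a binary edge map.

```python
# Standard (unnormalized) Sobel kernels; assumed here, not taken from the paper.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over a 2D list; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for m in range(1, rows - 1):
        for n in range(1, cols - 1):
            out[m][n] = sum(
                kernel[a + 1][b + 1] * image[m + a][n + b]
                for a in (-1, 0, 1)
                for b in (-1, 0, 1)
            )
    return out


def sobel_edges(image, threshold):
    """Return a binary edge map: 1 where the gradient magnitude >= threshold."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    return [
        [1 if (gx[m][n] ** 2 + gy[m][n] ** 2) ** 0.5 >= threshold else 0
         for n in range(len(image[0]))]
        for m in range(len(image))
    ]


# Synthetic 5x5 image with a vertical step between columns 1 and 2.
step = [[0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_edges(step, threshold=20)
```

On this synthetic step image, the interior pixels next to the intensity jump are marked as edges, while the flat regions and the untouched border stay at 0.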
To obtain better pixel information on each previously detected edge, a square morphological dilation is proposed (see Eq. 3). It applies the logical OR operator over a 9-pixel window $W[I(m, n)]$ centered on the pixel $I(m, n)$. The window is swept over the whole image, generating a new image that corresponds to the square dilation (see Fig. 2(c)).
$$W[I(m,n)] = \begin{bmatrix} I(m-1,n-1) & I(m-1,n) & I(m-1,n+1) \\ I(m,n-1) & I(m,n) & I(m,n+1) \\ I(m+1,n-1) & I(m+1,n) & I(m+1,n+1) \end{bmatrix}$$
where $m$ is the pixel coordinate along x and $n$ is the pixel coordinate along y.
$$\mathrm{Dil}(m,n) = \mathrm{OR}\{W[I(m,n)]\} = \max\{W[I(m,n)]\} \qquad (3)$$
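As an illustration of Eq. 3, the following sketch applies the 3×3 OR/max window to a binary image stored as a list of rows. The handling of border pixels (clipping the window to the image) is an assumption, since the text does not specify it.

```python
def dilate_square(binary):
    """Square 3x3 dilation: each output pixel is the max (logical OR)
    of the input pixels inside its 3x3 neighborhood."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            # Window clipped at the image border (assumed border policy).
            window = [
                binary[i][j]
                for i in range(max(0, m - 1), min(rows, m + 2))
                for j in range(max(0, n - 1), min(cols, n + 2))
            ]
            out[m][n] = max(window)
    return out


# A single foreground pixel grows into a 3x3 square.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 1
dilated = dilate_square(image)
```

For binary images the maximum and the logical OR coincide, which is why Eq. 3 writes both forms.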
To obtain the features and differentiate each of the objects detected in the image, object labeling is performed with the mask $B_{3\times 3}$ (see Fig. 2(d)). The mask sweeps vertically over each pixel of the skeleton, finds adjacent pixels with the value 1, and assigns them a label value.
$$B_{3\times 3} = \begin{bmatrix} P(m-3,n-3) & P(m-3,n) & P(m-3,n+3) \\ P(m,n-3) & P(m,n) & P(m,n+3) \\ P(m+3,n-3) & P(m+3,n) & P(m+3,n+3) \end{bmatrix}$$
where $P(m, n)$ is the evaluated pixel value.
Let $O_k$ be the k-th labeled object. The rectangle that encloses each object, $r_{o_k}$, is obtained as follows: $r_{o_k} = [m_k\ \ n_k\ \ A_k\ \ B_k]$, $k \in \mathbb{N}$, that is, a rectangle with its upper-left vertex at $(m_k, n_k)$, height $B_k$, and width $A_k$. Thus, the centroid $c_{o_k}$ of each object is obtained as the center of this rectangle: $x_k = m_k + A_k/2$, $y_k = n_k + B_k/2$.
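The labeling and bounding-rectangle step can be sketched as follows. This is only an illustration under stated assumptions: it uses a standard 8-connected flood fill instead of the paper's $B_{3\times 3}$ sweep, stores the image as `binary[y][x]`, treats pixels as unit squares so that the rectangle width and height are pixel counts, and the function name and dictionary keys are hypothetical.

```python
from collections import deque


def label_objects(binary):
    """Label 8-connected regions of 1s and return, per object, its area,
    enclosing rectangle (m_k, n_k, A_k, B_k) and rectangle center."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    objects = []
    for y in range(rows):
        for x in range(cols):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue
            k = len(objects) + 1          # next label value
            labels[y][x] = k
            queue, pixels = deque([(x, y)]), []
            while queue:                  # breadth-first flood fill
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < cols and 0 <= ny < rows
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = k
                            queue.append((nx, ny))
            xs = [p[0] for p in pixels]
            ys = [p[1] for p in pixels]
            m_k, n_k = min(xs), min(ys)   # upper-left vertex
            a_k = max(xs) - m_k + 1       # width in pixels
            b_k = max(ys) - n_k + 1       # height in pixels
            objects.append({
                "label": k,
                "area": len(pixels),
                "rect": (m_k, n_k, a_k, b_k),
                "centroid": (m_k + a_k / 2, n_k + b_k / 2),
            })
    return labels, objects


scene = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
labels, objects = label_objects(scene)
```

The orientation angle mentioned in the introduction is not computed in this sketch.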
Fig. 2. (a) Working area image in RGB, (b) Image obtained with the Sobel operator, (c) Square-shaped dilation with hole filling, (d) Object labels: the bounding box is the red square, and the centroid is the red cross on each object.
2.2 Artificial Intelligence Approach
The method in this subsection deals with feature detection, that is, the workflow of feature extraction and matching, whose results are saved in a feature vector. The method used to find an object inside the working area (see Fig. 3) is called "Random Sample Consensus" (RANSAC). Specifically, feature detection is performed with the "Speeded Up Robust Features" (SURF) algorithm, which is based on the Hessian matrix $H(i, j)$ at a given point $x = (i, j)$ of an image $I$:
$$H(i,j) = \begin{bmatrix} L_{xx}(i,j) & L_{xy}(i,j) \\ L_{xy}(i,j) & L_{yy}(i,j) \end{bmatrix} \qquad (4)$$
where $L_{xx}(i, j)$ is the convolution of the second-order derivative of the Gaussian function $g(j)$, $d^{2}g(j)/dx^{2}$, with the image $I$ at the point $x$, and similarly for the elements $L_{xy}(i, j)$ and $L_{yy}(i, j)$ [17].
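As a small worked step that is not spelled out in the text: the SURF detector described in [17] scores candidate interest points with the determinant of this Hessian. For the symmetric matrix of Eq. 4, the determinant expands to:

```latex
\det H(i,j) = L_{xx}(i,j)\,L_{yy}(i,j) - L_{xy}(i,j)^{2}
```

Points where this value reaches a local maximum are candidate features. The box-filter approximations and scale handling of the full SURF detector are not shown here.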
Fig. 3. (a) Object SURF features detection (b) Work space SURF features detection
(c) Object location in the work space.
Applying the RANSAC algorithm removes outliers, which can produce detection errors. It first draws subsets from a data set that contains both inliers and outliers, where $s$ is the subset size and $n$ is the number of data points [18]. Rather than testing every subset, the selection is governed by a probability $q$ (the desired probability of drawing an outlier-free subset), which reduces the computational cost. If the probability of an inlier is $w$, then the probability of an outlier is $\varepsilon = 1 - w$. At least $N$ subset selections are necessary, given by: $N = \log(1 - q)/\log(1 - w^{s})$
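To make the formula for $N$ concrete, here is a minimal sketch that evaluates it and rounds up to a whole number of selections. The function name and the example values of $q$, $w$, and $s$ are illustrative assumptions, not values reported in this work.

```python
import math


def ransac_min_selections(q, w, s):
    """Smallest integer N with N >= log(1 - q) / log(1 - w**s).

    q: desired probability of drawing at least one outlier-free subset
    w: probability that a single data point is an inlier
    s: number of data points in each subset
    """
    if not (0.0 < q < 1.0 and 0.0 < w < 1.0) or s < 1:
        raise ValueError("expected 0 < q < 1, 0 < w < 1 and s >= 1")
    return math.ceil(math.log(1.0 - q) / math.log(1.0 - w ** s))


# Illustrative values only: 99 % confidence, half of the data are inliers,
# four points per subset.
n_selections = ransac_min_selections(q=0.99, w=0.5, s=4)
```

With these example values, $w^{s} = 0.0625$ and the ratio of logarithms is about 71.4, so 72 selections are required. The count grows quickly as the inlier probability $w$ drops or the subset size $s$ increases.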
2.3 Screws Modeling
In order to compute suitable manipulator motion, this section details screw theory on the special Euclidean group SE(3) [19]. The forward kinematics of the 5 DOF manipulator robot of Fig. 4 is given by:
$$g_{th}(\theta) = e^{\hat{\zeta}_1\theta_1}\, e^{\hat{\zeta}_2\theta_2}\, e^{\hat{\zeta}_3\theta_3}\, e^{\hat{\zeta}_4\theta_4}\, e^{\hat{\zeta}_5\theta_5}\, g_{th}(0) \qquad (5)$$
In Eq. 5, $g_{th}(0)$ is the 4 × 4 matrix that describes the initial end-effector configuration (position and orientation); $g_{th}(\theta)$ is the 4 × 4 matrix that describes the target end-effector configuration, where $\theta$ is the 5 × 1 vector of joint rotations. For the i-th joint, the joint rotation angle is $\theta_i$, the twist is $\zeta_i$, and the exponential matrix is $e^{\hat{\zeta}_i\theta_i}$. The product of exponentials applied to the initial end-effector configuration $g_{th}(0)$ models the end-effector motion to a target configuration through successive rotations around the free joint axes.
Fig. 4. Five DOF manipulator arm frames (T, H), joint axes (ωi), rotation angles θi, and the axis crossing points p and k used for modeling.
In order to compute the inverse kinematics, the Paden-Kahan (P-K) subproblems are applied [20]. Thus, to solve the $\theta_3$ joint rotation, the third P-K subproblem is used, because it finds the rotation around a given axis that brings a point to a given distance from another point:
$$\| e^{\hat{\zeta}_1\theta_1}\, e^{\hat{\zeta}_2\theta_2}\, e^{\hat{\zeta}_3\theta_3}\, e^{\hat{\zeta}_4\theta_4}\, e^{\hat{\zeta}_5\theta_5}\, p - k \| = \delta \qquad (6)$$
When the exponential matrices of axes 1 to 5 are applied to the crossing point of axes 4 and 5 ($p$), the rotations $\theta_4$ and $\theta_5$ do not affect that point (see Eq. 6). Furthermore, the distance $\delta = \| g_{th}(\theta)\, g_{th}(0)^{-1}\, p - k \|$ from the resulting rotated point $p$ to the point $k$ is not affected by exponential matrices 1 and 2, because $k$ lies on both of those axes. So, the $\theta_3$ joint rotation angle is solved with the third P-K subproblem through the following simplified expression:
$$\| e^{\hat{\zeta}_3\theta_3}\, p - k \| = \delta \qquad (7)$$
Next, the second P-K subproblem gives the solution for two joint rotation angles whose axes cross; so, the first and second joint rotation angles $\theta_1$ and $\theta_2$ are given by:
$$e^{\hat{\zeta}_1\theta_1}\, e^{\hat{\zeta}_2\theta_2}\, e^{\hat{\zeta}_3\theta_3}\, e^{\hat{\zeta}_4\theta_4}\, e^{\hat{\zeta}_5\theta_5}\, p = p' \qquad (8)$$
When the exponential matrices 1 to 5 are evaluated at the point $p$ (see Eq. 8), only rotations 1 to 3 affect it; thus, the point reaches the position $p' = g_{th}(\theta)\, g_{th}(0)^{-1}\, p$. As the joint rotation $\theta_3$ has already been solved, and axes 1 and 2 cross at the point $k$, the joint rotation angles $\theta_1$ and $\theta_2$ can be solved with the second P-K subproblem as follows, where $p'' = e^{\hat{\zeta}_3\theta_3}\, p$:
$$e^{\hat{\zeta}_1\theta_1}\, e^{\hat{\zeta}_2\theta_2}\, p'' = p' \qquad (9)$$
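As the joint rotation angles $\theta_1$ to $\theta_3$ have been computed, and noting that the joint axes $\omega_4$ and $\omega_5$ cross at $p$, the following expression is obtained by applying rotations $\theta_4$ and $\theta_5$ to the point $k$:
$$e^{\hat{\zeta}_4\theta_4}\, e^{\hat{\zeta}_5\theta_5}\, k = k' \qquad (10)$$
Eq. 10 allows the joint rotations $\theta_4$ and $\theta_5$ to be solved with the second P-K subproblem, where $k' = e^{-\hat{\zeta}_3\theta_3}\, e^{-\hat{\zeta}_2\theta_2}\, e^{-\hat{\zeta}_1\theta_1}\, g_{th}(\theta)\, g_{th}(0)^{-1}\, k$.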
Furthermore, some via points have been selected in 3D space in order to define smooth Cartesian trajectories for approaching, picking, and dispatching the objects to the defined targets. These via points are obtained as orthogonal projections from the object (or piece) locations computed by the artificial vision approach proposed above.
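The product-of-exponentials forward kinematics used in this section can be sketched in plain Python for revolute joints. The sketch below is an illustration only: it uses the fact that the exponential of a revolute twist is a rotation about the joint axis through a point on that axis, and it is demonstrated on a hypothetical planar two-joint arm rather than on the 5 DOF manipulator of Fig. 4, whose axis data are not listed in the text.

```python
import math


def revolute_exponential(omega, q, theta):
    """4x4 matrix exp(zeta_hat * theta) for a revolute joint with unit
    axis omega passing through the point q."""
    wx, wy, wz = omega
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    rot = [  # Rodrigues rotation matrix
        [c + wx * wx * v, wx * wy * v - wz * s, wx * wz * v + wy * s],
        [wy * wx * v + wz * s, c + wy * wy * v, wy * wz * v - wx * s],
        [wz * wx * v - wy * s, wz * wy * v + wx * s, c + wz * wz * v],
    ]
    # Translation part (I - R) q keeps points on the axis fixed.
    trans = [q[i] - sum(rot[i][j] * q[j] for j in range(3)) for i in range(3)]
    return [rot[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def matmul4(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(joints, thetas, g0):
    """g(theta) = exp(z1 t1) ... exp(zn tn) g(0); joints is a list of
    (omega, point_on_axis) pairs."""
    g = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for (omega, q), theta in zip(joints, thetas):
        g = matmul4(g, revolute_exponential(omega, q, theta))
    return matmul4(g, g0)


# Hypothetical planar arm: two z-axis joints, two links of length 1.
z = [0.0, 0.0, 1.0]
joints = [(z, [0.0, 0.0, 0.0]), (z, [1.0, 0.0, 0.0])]
g0 = [[1.0, 0.0, 0.0, 2.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 1.0]]
g = forward_kinematics(joints, [math.pi / 2, -math.pi / 2], g0)
tool_position = [g[0][3], g[1][3], g[2][3]]
```

For this toy arm, turning the first joint by 90° and the second by −90° places the tool at (1, 1, 0), which matches the planar geometry. All joint axes are expressed in the base frame at the zero configuration, so no per-link coordinate frames are needed.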
3 Results
The proposal was validated with simulation and experimental tests. After the user inquiry, the robot successfully performs the kitting task autonomously. The RANSAC approach identifies where the desired piece is (see Fig. 5); next, the image analysis computes the piece positions by border detection with the Sobel operator and shape dilation. Using a pixel-to-mm scale adaptation and the reference translation to the robot base, the piece configurations (position and orientation) are obtained as targets in Cartesian space. These target configurations are the input for computing the robot motion with screw theory (see Fig. 6). The correction factor of the piece configurations, due to the precision of the image analysis, gives a small displacement in the y direction, up to a maximum value of
Fig. 5. (a) Screw detection (b) Packing ring detection.
Fig. 6. Snapshots of intelligent picking from user inquiry, for different type of pieces,
in order to achieve the kitting task. (Robot: Scorbot ER 4u)
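The pixel-to-mm scale adaptation and the translation to the robot base mentioned in this section amount to a simple affine map, sketched below. The scale factor, the offset values, and the assumption that the image axes are aligned with the robot base axes (no rotation between them) are all hypothetical; the text does not report the calibration used.

```python
def pixel_to_robot_base(centroid_px, mm_per_pixel, image_origin_mm):
    """Map an image centroid (pixels) to robot-base coordinates (mm).

    Assumes the image axes are parallel to the robot base x and y axes and
    that image_origin_mm is the base-frame position of pixel (0, 0).
    """
    x_px, y_px = centroid_px
    return (image_origin_mm[0] + mm_per_pixel * x_px,
            image_origin_mm[1] + mm_per_pixel * y_px)


# Illustrative numbers only.
target_mm = pixel_to_robot_base((100, 50), mm_per_pixel=0.5,
                                image_origin_mm=(200.0, -100.0))
```

If the camera were rotated with respect to the robot base, a rotation would have to be applied before the translation.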
4 Conclusions
The RANSAC algorithm removes outliers, which increases the probability of detecting an object inside the working area. The image analysis algorithm detects each object configuration accurately enough to move the robot end-effector for picking any object in the working area. Kinematic motion computation by screw theory allows efficient computation of joint patterns, avoids singularities, and provides a meaningful analytic description. The whole intelligent picking algorithm has been successfully tested on the kitting task. Current research is focused on increasing the robustness of the image analysis, intelligence, and motion planning.
References
1. Indri, M., Grau, A., Ruderman, M.: Guest editorial special section on recent trends
and developments in industry 4.0 motivated robotic solutions. IEEE Trans. Ind.
Inform. 14(4), 1677–1680 (2018).
2. Wan, J., Tang, S., Hua, Q., Li, D., Liu, C., Lloret, J.: Context-aware cloud robotics
for material handling in cognitive industrial Internet of Things. IEEE Internet
Things J. 5(4), 2272–2281 (2018).
3. World Robot Summit 2018.
4. Nieuwenhuis, C., Cremers, D.: Spatially varying color distributions for interactive
multilabel segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 35(5), 1234–
1247 (2013).
5. Rivera, A.R., Murshed, M., Kim, J., Chae, O.: Background modeling through
statistical edge-segment distributions. IEEE Trans. Circuits Syst. Video Technol.
23(8), 1375–1387 (2013).
6. Mezouar, Y., Chaumette, F.: Optimal camera trajectory with image-based control.
Int. J. Robot. Res. 22(10–11), 781–804 (2003)
7. Guo, M., Zhao, Y., Zhang, C., Chen, Z.: Fast object detection based on selective
visual attention. Neurocomputing 144, 184–197 (2014)
8. Stalder, S., Grabner, H., Van Gool, L.: Dynamic objectness for adaptive tracking.
In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012. LNCS, vol.
7726, pp. 43–56. Springer, Heidelberg (2013)
9. Zhu, W., Liang, S., Wei, Y., Sun, J.: Saliency optimization from robust background
detection. In: Proceedings of CVPR, pp. 2814–2821, June 2014
10. Seo, H.J., Milanfar, P.: Static and space-time visual saliency detection by self-
resemblance. J. Vis. 9(12), 15 (2009). Art. no. 15
11. Bai, X., Sapiro, G.: A geodesic framework for fast interactive image and video
segmentation and matting. In: IEEE 11th International Conference on Computer
Vision, pp. 1–8 (2007)
12. Price, B., Morse, B., Cohen, S.: Geodesic graph cut for interactive image segmen-
tation. In: IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, pp. 3161–3168 (2010)
13. Kanopoulos, N., Vasanthavada, N., Baker, R.L.: Design of an image edge detection
filter using the Sobel operator. IEEE J. Solid State Circuits 23(2), 358–367 (1988)
14. Chen, J., Su, C., Grimson, W.E.L., Liu, J., Shiue, D.: Object segmentation of
database images by dual multiscale morphological reconstructions and retrieval
applications. IEEE Trans. Image Process. 21(2), 828–843 (2012).
15. Jin, X.C., Ong, S.H., Jayasooriah, A.: Domain operator for binary morphological
processing. IEEE Trans. Image Process. 4(7), 1042–1046 (1995).
16. Lim, J.S.: Two-Dimensional Signal and Image Processing, pp. 478–488. Prentice
Hall, Englewood Cliffs (1990)
17. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In:
Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 404–417.
Springer, Heidelberg (2006)
18. Yaniv, Z.: Random sample consensus (RANSAC) algorithm, a generic implementa-
tion. Insight J., 1–14 (2010).
19. Mustafa, S.K., Agrawal, S.K.: On the force-closure analysis of n-DOF cable-driven
open chains based on reciprocal screw theory. IEEE Trans. Robot. 28(1), 22–31
20. Dimovski, I., Trompeska, M., Samak, S., Dukovski, V., Cvetkoska, D.: Algorithmic
approach to geometric solution of generalized Paden–Kahan subproblem and
its extension. Int. J. Adv. Robot. Syst., 1–11 (2018).