A Method for Intelligent Quality
Assessment of a Gearbox Using
Antipatterns and Convolutional
Neural Networks
Andrzej Tucholka, Maciej Majewski, Wojciech Kacalak,
and Zbigniew Budniak
Faculty of Mechanical Engineering, Koszalin University of Technology,
Raclawicka 15-17, 75-620 Koszalin, Poland
{andrzej.tucholka,maciej.majewski,wojciech.kacalak,
zbigniew.budniak}@tu.koszalin.pl
http://wm.tu.koszalin.pl
Abstract. Taking a gearbox as a reference structure, the authors apply a
method for grading the quality of mechanical structures using a convolutional
neural network trained with antipatterns found in gearbox constructions.
Antipatterns serve as a quality reference embodied in the neural network,
which classifies tested structures by their match to the antipatterns it was
taught.
The measure of similarity to antipatterns (used for training and
abstracted by the neural network) is interpreted as the quality measure,
so the inverse of the sum of similarities to each of the antipattern classes
used in training is considered a quantitative grade of quality.
Such grading enables automated cross-comparison of structures based
on their quality (defined as their difference from the antipatterns used).
Keywords: Antipattern · Structure quality · Convnet
1 Introduction
Mechanical engineering has been built upon mathematical models that enable
the design of technically advanced solutions. The recent emergence of novel
algorithms (e.g. convolutional neural networks) has created an opportunity to
increase the speed with which mechanical structures can be evaluated. This is
achievable by increasing the ability of computers to detect and analyze
patterns in structured data.
Quality evaluation methods based on simulations require a manual and detailed
definition of the design of each tested structure. To be precise, such an
approach requires a detailed definition of both the simulated construction and
the associated simulation models. Physics-based simulations of such
constructions are widespread, but even highly optimized ones, due to their
inherent complexity, perform slowly and require manual scoping,
configuration, and interpretation.
© Springer International Publishing AG, part of Springer Nature 2019
R. Silhavy (Ed.): CSOC 2018, AISC 764, pp. 57–68, 2019.
https://doi.org/10.1007/978-3-319-91189-2_7
To define a quality grade, simulation results have to be compared against a
pre-defined quality reference (or at least against each other) to identify the
best of the tested designs. Even for trivial design errors, a simulation-based
approach makes it difficult to evaluate the quality of a large population of
elements in a timely manner (e.g. a population of computer-generated
mechanical designs of gearboxes).
A convolutional neural network (Convnet) is used in many object recognition
and classification techniques, and is known for its applications in image
recognition. Most notably, it is used to explore the structured content of an
image [1], but also in text and other structured data classification problems.
The proposed method of intelligent quality assessment is intended to aid
machine designers in the early detection of reproducible design errors. Still,
to define a meaningful quality grade, our method requires a library of
incorrect designs (antipatterns) that can be observed in tested constructions.
Comparing the traditional, simulation-based approach with our method, we
emphasize that our quality results cannot be considered exhaustive, as they do
not attempt to predict the behavior of the construction. This means that the
method cannot be used for reasoning about the performance of the construction.
[Figure 1 diagram. Recoverable labels: technical drawings; data format
normalization; KXML symbolic representation format; antipattern constructions
(quality reference); tested constructions (quality grading). Models:
Convolutional NN, Kohonen NN, Probabilistic NN, Hamming distance,
multiplicative model, additive model. Inputs: feature value matrix / tree,
feature value matrix, feature value list, normalized list / tree with
normalized feature values. Outputs: similarity to antipatterns, confidence
level, distance from the closest neuron, smallest distance, smallest value.]
Fig. 1. Overview of the method and applied models for intelligent quality assessment
While testing different numerical models (Fig. 1) for such pattern
identification, we found neural networks very useful for working with
incomplete datasets (e.g. enabling suggestions while a design is still being
worked on). Additionally, the low computation cost of classifying a structure
with a pre-trained model is an attractive advantage over simpler iterative
models (e.g. the Antipattern Matching Factor [2]).
This paper presents a method for quality assessment [3] of a worm gearbox
using a convolutional neural network as the classification mechanism. A worm
gearbox is a construction characterized by multiple features meant to provide
a casing for the oil, protection, heat dispersion, worm gear positioning, and
other functions. All of these features can malfunction in multiple ways, mainly
due to design errors. We present a method that can be used to automatically
detect such errors and prevent them from being added to mechanical designs.
1.1 Antipatterns as a Quality Reference
Antipatterns representing mechanical design knowledge have proven useful
in enhancing quality-related processes [5]. Using antipatterns as the training
dataset for the network gives the method the ability to perform its
calculations with regard to concrete quality references. In the case of
antipatterns, such references are examples of incorrect and repeatable
solutions that could be observed in the class of tested elements (i.e. worm
gearboxes).
The library of such antipatterns (i.e. gearbox design errors) is used as the
reference that enables quantification of the quality measurements. The
similarity of the tested construction to the library of antipatterns can be
measured and interpreted as a subjective, quantitative quality measure.
Antipatterns can be identified in features of mechanical constructions
(Fig. 2), but can also be a function of the structure of the construction
(Fig. 3). In both cases it is possible to provide a symbolic description of
such constructions and calculate their similarity to a tested element.
[Figure 2 drawing. Recoverable labels: E; dimension 127; axis distance
toleranced as 140±0,15 and as 140±0,05; marking "Incorrect".]
Fig. 2. Antipattern: the precision (0.15 mm) of the spread of the axes of the
worm and the worm wheel can't be greater than the acceptable value (0.06 mm)
It is necessary to mention that, due to the nature of neural networks, the
calculations done to classify the tested element cannot be directly analyzed
for correctness. Furthermore, the classification done by a neural network
results in a subjective confidence level rather than a precise quantitative
value that can be reasoned upon. Still, upon converging on the training data,
the neural network is able to detect structured data patterns.
2 Preparing the Training Dataset
The aim of applying a convolutional neural network is to enable the detection
of similarities in design (i.e. repeating patterns). To allow the network to
create an abstraction over the antipatterns used for training, we can either
transform the data into an image format, or redesign the data transformations
and convolutions to match the symbolic data structure.
The scope of the symbolic representation of antipattern features in the
training data should clearly define the pattern and the incorrect nature of
the antipattern (e.g. see Fig. 2). Additionally, to allow the network to
properly abstract over the antipattern, the error should be placed in
different contexts (i.e. structures of symbolic nodes).
The antipattern visualized in Fig. 2 refers directly to a feature of the
construction: the precision of the spread between the main axes of a gearbox.
As designed, the distance between the axis of the worm and the axis of the worm
wheel is 140 mm, but the precision of this dimension varies. In the incorrect
case, it is twice the acceptable limit. Such a design error in precision can be
easily overlooked, yet it will create an increased tip clearance of the gear
and have a negative impact on the kinematic precision of the reduction gear,
caused by elevated kinematic deviations.
[Figure 3 drawing. Recoverable labels: K; e; r; R; marking "Incorrect".]
Fig. 3. Local shoulders forming an antipattern design - a hot spot
Another, more complex antipattern (Fig. 3) reflects an incorrect relation
between the curvature of the shoulder and the thickness of the wall on which
this shoulder exists.
Having the technical drawings as input can be very convenient, as there are
several neural network designs ready for image processing and detecting
similarities in embedded objects. Still, the nature of technical drawings adds
a set of challenges: rotations, varying notation styles, and the often implied
character of features of designed elements. Furthermore, the technical drawings
themselves require additional notation to define the logic of the antipattern
(e.g. the mathematical relation in Fig. 3).
2.1 Symbolic Representation of the Antipattern Features
of the Worm Gearbox
To represent common antipatterns found in worm gearboxes, we use a symbolic
representation [4]. Challenges in normalizing the data describing the
constructions come mainly from the character of technical drawings: their
varying style and visual irregularities that are not part of the data (e.g.
rotation, visibility, or positioning of the dimensions).
Main structure of the construction's data (Identifier, Construction class)
  Classification model 1 .. N (Identifier, Model class)
    Feature definition 1 .. M (Identifier, Metric, Feature class)
    Node 1 .. K (Identifier, Node class)
      Feature instance 1 .. M (Identifier, Value, Function)
      Node 1 .. L (Identifier, Node class)
        Feature instance 1 .. M (Identifier, Value, Function)
        ... with possible further nestings ...
Fig. 4. The structure of KXML symbolic language
Another approach could be to select a specific computer-interpretable data
format. Notable candidates are the Standard for the Exchange of Product
model data (ISO 10303) and the ECMA-363 Universal 3D File Format. We have
found these to be designed mainly for manufacturing and visualization
purposes, which makes the data normalization process more complex. It is worth
noting that, provided adequate parsing models, the method of comparing
structures with antipatterns could work with any data format.
To avoid the pitfalls of normalizing the data describing constructions, and
the complexity of existing computer-interpretable data formats, we have created
a symbolic language (KXML [4], Fig. 4) capable of representing the symbolic
structure of a mechanical construction. In essence, given a technical drawing,
we manually decompose the construction into a symbolic, object-oriented,
tree-like structure of nodes. Such a data format simplifies further processing
of the structured data. We format this symbolic representation in an
XML-compatible notation to ease data imports, but the format itself is
notation independent.
None of the numerical models used allows for object-oriented input, hence a
natural loss of information occurs during the data transformation. Using the
symbolic language, we can describe complex structures in many ways: using XML-
or JSON-compatible notations, a specialized declarative language similar to
HTML, or a minimized or symbolic notation. We have defined assumptions for
the creation of a language whereby a structural object can be easily divided
into constituent objects, and modifiers can be distinguished as elements for
altering dimensions, shape, structure, or other properties of processed
objects.
To represent the data describing the construction we use four main elements:
(1) a classification model - providing the context for the analysis and
interpretation of the features and structural nodes; (2) definitions of
features - structuring the data used for calculating the properties of the
construction; (3) instances of nodes and their structure - representing the
construction decomposed explicitly into a node structure; (4) instances of
features - containing values or math functions that can be used in the context
of the concrete classification model and their position in the structure of
nodes.
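The four elements can be pictured as plain data classes. The following is only a minimal sketch: the class and field names (`FeatureDefinition`, `FeatureInstance`, `Node`, `ClassificationModel`, `depth`) are hypothetical and not part of KXML, and the example values mirror the hot-spot antipattern of Fig. 3.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class FeatureDefinition:
    """Declares a feature once per classification model (hypothetical fields)."""
    id: str
    unit: str


@dataclass
class FeatureInstance:
    """Value, or a math function kept as a string, of a feature on one node."""
    id: str
    value: Union[float, str]


@dataclass
class Node:
    """Structural node; nodes nest to form the decomposition tree."""
    node_class: str
    features: List[FeatureInstance] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


@dataclass
class ClassificationModel:
    """Context in which the features and nodes are interpreted."""
    model_class: str
    definitions: List[FeatureDefinition] = field(default_factory=list)
    nodes: List[Node] = field(default_factory=list)


def depth(node: Node) -> int:
    """Number of nesting levels from this node down to its deepest descendant."""
    return 1 + max((depth(child) for child in node.children), default=0)


# Example values assumed from the hot-spot antipattern (e = 5, r = 1, R = 2).
cavity = Node("contraction cavity",
              [FeatureInstance("r", 1.0), FeatureInstance("R", 2.0)])
center = Node("thermal center", children=[cavity])
wall = Node("wall", [FeatureInstance("e", 5.0)], [center])
case = Node("case", children=[wall])
model = ClassificationModel(
    "mechanical",
    [FeatureDefinition(name, "mm") for name in ("r", "R", "e")],
    [case],
)
```

Keeping definitions on the model and instances on the nodes reflects the separation described above; inheritance and the node identity mechanism are left out of this sketch.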
<struct class="gearbox" id="g001">
  <model class="mechanical">
    <feature id="r" unit="mm" />
    <feature id="R" unit="mm" />
    <feature id="e" unit="mm" />
    <node class="case">
      <node class="wall">
        <feature id="e" value="5" />
        <node class="thermal center">
          <node class="contraction cavity">
            <feature id="r" value="1" />
            <feature id="R" value="2" />
          </node>
        </node>
      </node>
    </node>
  </model>
</struct>
Listing 1. KXML representation of the antipattern (Fig. 3)
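Because the notation is XML compatible, Listing 1 can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree`; the helper name `flatten` and the tuple layout it returns are assumptions made for illustration, not part of KXML.

```python
import xml.etree.ElementTree as ET

# Listing 1, reproduced as a string so the example is self-contained.
KXML = """
<struct class="gearbox" id="g001">
  <model class="mechanical">
    <feature id="r" unit="mm" />
    <feature id="R" unit="mm" />
    <feature id="e" unit="mm" />
    <node class="case">
      <node class="wall">
        <feature id="e" value="5" />
        <node class="thermal center">
          <node class="contraction cavity">
            <feature id="r" value="1" />
            <feature id="R" value="2" />
          </node>
        </node>
      </node>
    </node>
  </model>
</struct>
"""


def flatten(element, depth=1):
    """Yield (node class, nesting depth, {feature id: value}) in document order."""
    for node in element.findall("node"):
        # Only direct <feature> children are feature instances of this node.
        values = {f.get("id"): float(f.get("value")) for f in node.findall("feature")}
        yield node.get("class"), depth, values
        yield from flatten(node, depth + 1)


root = ET.fromstring(KXML.strip())
model = root.find("model")
# Feature definitions sit directly under <model> and carry a unit, not a value.
units = {f.get("id"): f.get("unit") for f in model.findall("feature")}
rows = list(flatten(model))

for node_class, depth, values in rows:
    print(depth, node_class, values)
```

The flattened `rows` keep the nesting depth next to each node, which is the structural information that a later matrix encoding has to preserve or discard.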
Relations emerging from the structure of the language, through the properties
of those four main elements, make it possible to represent contextual
information between: (A) elementary parts of the construction - through the
structure of classification models, nodes, feature definitions and instances;
(B) class definitions for features and nodes - enabling common reasoning on
values enclosed in feature instances through class definition reuse; (C)
inherited node and feature definitions - allowing for common analysis of
represented elements based on their ancestry and common feature scopes; (D)
feature instances in different classification models - through the node
identity mechanism; (E) indirect numerical relations - included in
mathematical functions relating feature values to each other.
Besides its increased verbosity, which simplifies the integration of the data,
the proposed KXML language (like computer-readable formats in general) makes
it easier to generate large populations of examples of a specific antipattern.
Larger numbers of samples are required for neural networks to be able to
properly abstract over the features they will classify.
For the neural network to converge on the antipattern, it is necessary to
generate multiple example structures to be used during the training process.
The proposed symbolic description makes it easier to generate such populations
of examples for each antipattern that is taught to the network. In the
antipattern visualized in Fig. 3 and described in Listing 1, the incorrect
nature of the design comes from the fact that both the inner (r) and outer (R)
radii of the contraction cavity are smaller than the thickness (e) of the
gearbox's wall.
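The condition stated above (both radii smaller than the wall thickness) is simple enough to generate a training population programmatically. The following is a sketch under assumptions: the function names and the value ranges are illustrative only, and it varies feature values but not the surrounding node context, which the text also asks for.

```python
import random


def is_hot_spot(e, r, R):
    """Antipattern condition as stated in the text: both radii are smaller than e."""
    return r < e and R < e


def generate_examples(count, seed=0):
    """Create (e, r, R) triples in mm that all satisfy the antipattern condition."""
    rng = random.Random(seed)
    examples = []
    for _ in range(count):
        e = rng.uniform(3.0, 12.0)       # wall thickness, assumed range
        R = rng.uniform(0.2, 0.95) * e   # outer radius, kept below e
        r = rng.uniform(0.2, 0.95) * R   # inner radius, kept below R
        examples.append((round(e, 2), round(r, 2), round(R, 2)))
    return examples


population = generate_examples(100)
print(population[:3])
```

A fixed seed keeps the population reproducible between training runs; each triple would still have to be written into a KXML structure before normalization.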
2.2 Data Format Normalization
When it comes to the creation of technological and organizational processes,
an object-based description is a natural mapping of such structures. To
process as much information from the KXML symbolic description as possible, we
have to include the feature value and class information in the context and
structural position of the node class and its nesting. At the same time, we
aim to avoid the complexity brought by the multi-dimensionality of the symbolic
description, in which each feature class would form a separate dimension.
The requirements of the numerical models are much more rigid, mostly
requiring a low-dimensional input. The additive (i.e. Antipattern Matching
Factor) and multiplicative models we have tested require one-dimensional input
and provide only a very basic ability to process tree-like datasets. This
ability does not allow for differentiation between the tree's nodes, and hence
becomes useless if the structure of the input data is not static (which is
impractical for real-life structures).
Neural networks allow for a more complex input structure but, due to
computational complexity, still prefer a low number of dimensions. Introducing
any dimension-reducing model (e.g. Principal Component Analysis) would blur
the relationships in the data even further than a normal neural network
design. Aiming to preserve as much data as possible, we use data matrices
with a lexically normalized structure.
3 Convnet Based Neural Network Design
In designing the algorithm used for antipattern detection, we take
advantage of two main features of convolutional networks: location invariance
and compositionality. Location invariance enables detection of patterns in any
location (position) of the symbolic data representation, while compositionality
enables combining data patterns and non-direct data relations. The availability
of such features in the numerical method used for data classification is a
clear advantage over the simple additive, multiplicative, or deterministic
models proposed earlier [3].
The Convnet-based network design, applied to the detection of antipatterns in
a symbolic representation of a mechanical construction, does not analyze image
pixels or text input but takes the matrix of feature data created from the
KXML symbolic description of the mechanical construction. This approach builds
upon the Convnet designs for text classification, but requires defining a
normalization technique that preserves at least some of the structural
information contained in KXML.
With the internal tree data structure flattened out, the two-dimensional input
matrix contains a map of the feature values of each node in the antipattern
structure (Listing 1).
$$\mathrm{Antipattern1}_{m,n} = \begin{bmatrix} 0 & 0 & 0 \\ 5 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 2 \end{bmatrix}$$
Such an input representation of the data, given an adequate kernel size and
normalized columns, could enable detecting patterns in feature values. The
incorrect nature of the antipattern lies in the values of r and R being too
small compared to the thickness of the wall e. Compared to the symbolic
representation, the above matrix representation omits the dependencies between
parts of the construction. When processing larger matrices (real-sized
designs), this pattern would be visible to the network due to the proximity of
the feature values (within the kernel size).
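The flattening step can be sketched in a few lines. This is an illustration only: the column order (e, r, R) is assumed from the labels in Fig. 5, and the function name `feature_matrix` is hypothetical.

```python
FEATURES = ["e", "r", "R"]  # column order, assumed from the labels in Fig. 5

# One entry per node of Listing 1, in document order: (node class, feature values).
nodes = [
    ("case", {}),
    ("wall", {"e": 5}),
    ("thermal center", {}),
    ("contraction cavity", {"r": 1, "R": 2}),
]


def feature_matrix(nodes, features=FEATURES):
    """Build one row per node and one column per feature; missing features become 0."""
    return [[values.get(name, 0) for name in features] for _, values in nodes]


matrix = feature_matrix(nodes)
for row in matrix:
    print(row)
```

Note that the node classes and their nesting are dropped here, which is exactly the loss of dependencies between parts of the construction mentioned above.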
To include more information from the KXML symbolic description, we have to
include the feature value and class information in the context of the node
class and its nesting. To avoid the creation of additional dimensions, we can
embed such information in a more complex matrix. Here, in addition to the
feature values (columns 5–7), the matrix is expanded with the class of the node
(1st column) and the node's nesting level (columns 2–4).
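$$\mathrm{Antipattern1}_{m,n} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 5 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 & 1 & 2 \end{bmatrix}$$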
Compared to the rather minimal data matrix above, the full construction
contains more nesting levels, and feature definitions increase the m
(horizontal) dimension of the matrix. Its n (vertical) dimension is directly
related to the number of nodes in the description. Each row of the above matrix
is a node and, as antipatterns are formed by relations between features of such
nodes, we have a one-dimensional dataset that can be processed as input.
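One way to reproduce the expanded matrix is sketched below. It rests on an assumed reading of the structural columns: a numeric class identifier is written into the column that corresponds to the node's nesting depth. The identifiers in `CLASS_IDS` and the function name `extended_matrix` are hypothetical.

```python
FEATURES = ["e", "r", "R"]  # feature columns, order assumed from Fig. 5
CLASS_IDS = {               # hypothetical numeric identifiers for node classes
    "case": 1,
    "wall": 2,
    "thermal center": 3,
    "contraction cavity": 4,
}

# (node class, nesting depth, feature values) for each node of Listing 1.
nodes = [
    ("case", 1, {}),
    ("wall", 2, {"e": 5}),
    ("thermal center", 3, {}),
    ("contraction cavity", 4, {"r": 1, "R": 2}),
]


def extended_matrix(nodes, class_ids, features, max_depth):
    """Prefix each feature row with structural columns, one per nesting level."""
    rows = []
    for node_class, depth, values in nodes:
        structure = [0] * max_depth
        structure[depth - 1] = class_ids[node_class]  # class id at the depth column
        rows.append(structure + [values.get(name, 0) for name in features])
    return rows


matrix = extended_matrix(nodes, CLASS_IDS, FEATURES, max_depth=4)
for row in matrix:
    print(row)
```

With deeper constructions `max_depth` grows, and with more feature definitions the feature part grows, so the row width follows the m dimension discussed above.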
3.1 Kernel and Filters
Our approach to designing the neural network, visualized in Fig. 5, processes
a one-dimensional dataset containing the construction's feature data. Due to
the changes in the input structure, it is necessary to define and fine-tune the
dimensions of the kernel along with the size of the step (stride) with which it
will be applied to the dataset. Additionally, data transformations (filters)
can be used to highlight related feature columns for easier detection of
patterns occurring in the data.
To enable the network to detect relations between nodes, the dimensions of
the kernel have to span multiple neighboring nodes. Due to the structure
of the data matrices proposed above, the kernel should contain an arbitrary
number of rows. This number should match the complexity and abstraction
level of the antipatterns.
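The effect of the kernel's row count can be illustrated on the small feature matrix. In this sketch the kernel weights are hand-picked for illustration (in a trained network they would be learned), and the function name `convolve_rows` is hypothetical. The kernel spans full rows and slides down the node list.

```python
# Feature matrix of the antipattern (columns e, r, R; one row per node).
matrix = [
    [0, 0, 0],  # case
    [5, 0, 0],  # wall
    [0, 0, 0],  # thermal center
    [0, 1, 2],  # contraction cavity
]


def convolve_rows(matrix, kernel, stride=1):
    """Slide a full-width kernel down the rows and return one response per position."""
    height = len(kernel)
    responses = []
    for top in range(0, len(matrix) - height + 1, stride):
        window = matrix[top:top + height]
        responses.append(sum(w * x
                             for k_row, m_row in zip(kernel, window)
                             for w, x in zip(k_row, m_row)))
    return responses


# Three-row kernel: e from the first row of the window minus r and R from the third.
kernel_3 = [[1, 0, 0], [0, 0, 0], [0, -1, -1]]
# Two-row kernel: too short to cover the wall and the cavity in one window.
kernel_2 = [[1, 0, 0], [0, -1, -1]]

print(convolve_rows(matrix, kernel_3))  # [0, 2]
print(convolve_rows(matrix, kernel_2))  # [0, 5, -3]
```

Only the three-row kernel sees the wall thickness and both radii in a single window (5 - 1 - 2 = 2); the two-row kernel responds to them separately, so the relation between the nodes is not captured.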
To keep the calculations simple, we have flattened out the data structure,
which removes the need to create tree-like kernels. It is possible that
tree-like kernels could yield better results, as the structural information
would be more accessible to the network.
The filters that can be applied to the data again need to reflect the modified
input data structure. In the example above, the first four columns contain
structure-related data and the last three contain feature data. Hence the
kernel and the network work with multi-dimensional data entities. This
complexity should be reduced by applying filters, namely by inspecting
relations between features and forming feature sets that highlight specific
relations between data items. Such filters make it easier for the network to
notice and learn the patterns in the input data and could be loosely compared
to the edge detection algorithms used in image recognition.
[Figure 5 diagram. Recoverable labels: symbolic representation (columns: node
class, e, r, R; example rows 1000000, 0200500, 0030000, 0004012); feature
extraction from the symbolic description; convolutional layers; subsampling
layers; classification of features; fully connected layers; dropout value;
Softmax; antipattern outputs: sharp shoulder, misaligned axis, axis spread,
spread precision, wall thickness, perpendicularity.]
Fig. 5. A conceptual mapping of one-dimensional dataset and the Convnet design
3.2 Quality Assessment with Distance from Antipatterns
The essence of our method of intelligent quality assessment lies in detecting
the similarity of the tested element to a set of antipatterns defining the
quality reference (Fig. 6). In this context, for the classification algorithm
to be useful in quality grading, it has to provide a quantitative value
representing similarity to the whole library of antipatterns rather than to a
single one.
Detecting just the one most similar antipattern can be useful in practice, for
example when advising a designer about a detected error that needs fixing. In
the case of quality grading, comparing two tested elements requires a
common denominator (i.e. a quality reference), which we aim to achieve with
a library of antipatterns. The quantifiable values of similarity to
antipatterns, as indicated by the output of the network, should consider each
type of antipattern taught to the network. By encompassing all types of design
mistakes, this approach provides a value representative of all the antipatterns
used.
Searching for additional features of numerical models that enable structural
analysis of construction data, we tested a custom-designed neural network
based on Hamming distance, probabilistic neural networks, and Kohonen maps.
Among these three designs, only Kohonen maps allow for processing of
structural data, through the similarity of coordinates resulting from
normalizing the positions of feature instances by their class and node
structure. Still, Kohonen maps proved to be challenging in practical
applications, as they can't differentiate between antipatterns and require
each map to function as a standalone classifier defining similarity to one
antipattern at a time.
[Figure 6 diagram. Recoverable labels: phases "training / learning" and
"testing / classifying"; symbolic representation of the antipattern
constructions; symbolic representation of the construction; normalization of
the feature values and structure for application with ConvNet; Convolutional
Neural Network; similarity to antipattern.]
Fig. 6. General, visual representation of the process
When applying the convolutional neural network, we mimicked the network
designs used for text classification and took advantage of the ConvNet's two
main features: location invariance and compositionality. The former enables
detection of patterns in any location (position) of the symbolic data
representation, while the latter enables combining data patterns and
non-direct data relations. The availability of such features in the numerical
method used for data classification is a clear advantage over other models.
With a Softmax function at the output, the distribution of results needs
to be processed further. First, it is necessary to filter out values with
confidence below a certain threshold. A Softmax activation distributed equally
among all types of antipatterns means that there is no clear similarity to a
specific antipattern class. Because the Softmax output sums to 1, in that case
all of the values of the output array are expected to be close to the inverse
of its size, 1/M, where M is the number of output values.
All k output values (V) exceeding this value (1/M) can be summed up and
used to calculate the similarity to all classes of antipatterns used in the
training process.
$$A = \sum_{n=1}^{k} V_n \tag{1}$$
The value A stands for the similarity to the trained dataset and will always
be less than 1. To represent the distance from the antipatterns and use it as a
measure relative to the library of antipatterns, we simply invert it.
$$Q_{ap} = \frac{1}{A} \tag{2}$$
The resulting quantitative value Qap can be used for defining a quality grade
that is subjective (to the library of antipatterns and the network
configuration). Beyond comparing designs against each other, such a grade,
given a sufficient database of antipatterns, can act as a rough (due to its
non-deterministic nature) quality evaluation.
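The post-processing of the Softmax output in Eqs. (1) and (2) can be sketched as follows. The function names are hypothetical, and the handling of the case where no output exceeds 1/M (returning infinity) is an assumption, since the text does not define the grade for that case.

```python
import math


def softmax(logits):
    """Numerically stable Softmax over a list of raw network outputs."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]


def quality_grade(outputs):
    """Return (A, Qap): A sums the outputs above 1/M (Eq. 1), Qap = 1 / A (Eq. 2)."""
    threshold = 1.0 / len(outputs)  # M is the number of antipattern classes
    A = sum(v for v in outputs if v > threshold)
    # Assumption: with no output above the threshold, the grade is unbounded.
    Qap = 1.0 / A if A > 0 else math.inf
    return A, Qap


# Four antipattern classes, so the threshold is 1/4.
A, Qap = quality_grade([0.5, 0.3, 0.1, 0.1])
print(round(A, 3), round(Qap, 3))  # 0.8 1.25

# A uniform distribution shows no clear similarity to any antipattern class.
print(quality_grade([0.25, 0.25, 0.25, 0.25]))  # (0, inf)
```

A structure that strongly resembles the antipatterns pushes A toward 1 and Qap toward its minimum of 1, while a structure with weak resemblance yields a small A and therefore a higher grade.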
4 Conclusions
Calculation models based on neural networks have created an opportunity to
widen the tool set of mechanical component and assembly designers with
automated algorithms preventing common design errors. We observe several data
integration and processing challenges that still need to be addressed for the
method of automated and intelligent quality assessment to be efficient. The
ability to abstract over the structural data contained in antipatterns found in
mechanical designs was observed partially in Kohonen maps and convolutional
neural networks.
Compared with simple algebraic models, neural networks require additional
configuration and data normalization to adjust the data and the model to fit
each other. The benefits of applying neural networks become visible only with
the ConvNet design, due to local feature detectors (a sliding window of
convolutions) enabling structural analysis of the construction, and the layered
design enabling varying scopes of convolutions on the tested data. It is
possible to further increase the efficiency of the design by introducing
custom kernels and a custom approach to data convolution.
Using a convolutional neural network as the classification algorithm in the
proposed method of intelligent quality assessment provides a novel (vs.
algebraic models) opportunity to detect patterns in the symbolic description of
mechanical constructions. Support for detecting antipatterns can be observed in
the ability to adapt the kernel and filters to work with the symbolic structure
in the form of a list including basic information about the analyzed structure.
On the other hand, the pooling and fully connected layers reduce the ability
to detect more complex structures. The limitations of Convnet-based designs
(complexity of data normalization, required custom kernels and filters) make
the fitness of the Convnet for quality assessment using the proposed method
promising but limited in terms of the depth of analyzed structural complexity.
Algorithmic progress in machine learning, pattern detection, and
dimensionality reduction methods has enabled numerical models to build
abstractions representing structural concepts of a mechanical construction's
design. This ability presents a unique opportunity to dramatically increase the
spread of mechanical design knowledge and the automated reduction of common
design mistakes.
4.1 Future Research
Recently presented capsule networks provide a unique design of capsules (i.e.
neuron groups) that represent a specific type of entity, such as an object or
object part [6]. Considering that the strengths of new neural network designs
lie in the ability to detect structured patterns, the proposed routing
mechanisms will probably further increase the benefits of applying neural
networks to detect antipatterns.
References
1. Farabet, C., Couprie, C., Najman, L., LeCun, Y.: Learning hierarchical features for
scene labeling. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1915–1929 (2013)
2. Kacalak, W., Majewski, M., Tucholka, A.: A method of object-oriented symbolical
description and evaluation of machine elements using antipatterns. J. Mach. Eng.
16(4), 46–69 (2016)
3. Kacalak, W., Majewski, M., Tucholka, A.: Intelligent assessment of structure
correctness using antipatterns. In: International Conference on Computational
Science and Computational Intelligence, pp. 559–564. IEEE Xplore Digital Library.
IEEE (2015)
4. Tucholka, A., Majewski, M., Kacalak, W.: Zorientowany obiektowo, symboliczny
zapis cech, relacji i struktur konstrukcyjnych. Inzynieria Maszyn 20(1), 112–120
(2015)
5. Kacalak, W., Majewski, M., Budniak, Z.: Intelligent automated design of machine
components using antipatterns. In: Jackowski, K., Burduk, R., Walkowiak, K.,
Wozniak, M., Yin, H. (eds.) Intelligent Data Engineering and Automated Learning.
Lecture Notes in Computer Science, vol. 9375, pp. 248–255. Springer, Cham
(2015)
6. Sabour, S., Frost, N., Hinton, G.E.: Dynamic Routing Between Capsules. Computer
Vision and Pattern Recognition, arXiv:1710.09829 (2017)