Online Prediction of Automotive Tempered Glass
Quality using Machine Learning
Abdelmoula Khdoudi
Noureddine Barka ( noureddine_barka@uqar.ca )
University of Quebec at Rimouski
Tawfik Masrour
Ibtissam El Hassani
Choumicha El Mazgualdi
Research Article
Keywords: Manufacturing quality prediction, Machine Learning, Artificial Neural Networks, Random Forest, Glass tempering quality, online prediction
Posted Date: September 19th, 2022
DOI: https://doi.org/10.21203/rs.3.rs-2040065/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Read Full License
Abstract
This study introduces the application of machine learning algorithms to support the manufacturing quality control of a complex process, as an alternative to destructive testing methodologies. The choice of this application field was motivated by the lack of a robust engineering technique to assess production quality in real time, which raises the need for advanced smart manufacturing solutions such as AI in order to save the extremely high cost of destructive tests. Concretely, this paper investigates the performance of machine learning techniques, including Ridge Regression, Linear Regression, Light Gradient Boosting Machine, Lasso Regression and more, for predicting the flat glass tempering quality within the building glass industry. In the first part, we applied the selected machine learning models to a dataset collected manually and made up of the most relevant process parameters of the heating and quenching processes. Evaluating the results of the applied models against several performance indicators, such as Mean Absolute Error, Mean Squared Error and R-Squared, showed that Ridge Regression was the most accurate model. The second part consists of developing a digitalized device connected to the manufacturing process in order to provide predictions in real time. This device operates as an error-proofing system that sends a reverse signal to the machine when the prediction indicates a non-compliant quality of the currently processed product. This study can be expanded to predict the optimal process parameters to use when the predicted values do not meet the desired quality, and can advantageously replace the trial-and-error approach that is generally adopted for defining those parameters. The contribution of our work lies in the introduction of a clear methodology (from idea to industrialization) for the design and deployment of an industrial-grade predictive solution within a new field, glass manufacturing.
1. Introduction, Motivation And State Of The Art
In this paper, we are interested in solving the most critical manufacturing issue in the glass tempering industry, namely thermal tempering quality, using modern machine learning techniques.
Glass tempering quality is a safety and regulatory aspect of this product: a bad tempering quality can lead to user injury during any unsafe breakage (for example, window breakage). On the other hand, a correctly tempered glass is highly strong and resistant to mechanical shocks. Moreover, during breakage, the generated tempered glass fragments are safely small and non-sharp.
Tempered glass has to meet the rigorous safety standards stipulated in several homologation programs. The most common way to define the safety level of tempered glass, with regard to international regulations, is by counting the number of fragments in the standardized fragmentation test. This is a destructive test, also called a punch test, where a tempered glass piece is impacted with a pointed tool at the mid-point of its longest edge. Then, from the breakage pattern, the number of particles is counted within a minimum fragment count area of 50 × 50 mm² (Fig. 1).
From a manufacturing point of view, the glass tempering quality check is a costly process since it is based solely on systematic destructive tests. To the best of our knowledge, there are no proven non-destructive test methods practiced in the industry, which may be due to the difficulty of accurately interpreting the fracture test result based on manufacturing settings.
The goal of our work is to develop a data-driven decision model to help manufacturing organizations predict, in real time and for each processed part, the tempering quality with an acceptable accuracy, and thus reduce quality control cost by lowering the number of destructive tests. Such a prediction system offers further advantages, such as enabling early adjustment of the process parameters based on the evolution of the prediction curve.
The rest of the paper is arranged as follows: Section 2 gives a general overview of the thermal tempering process and the fracture phenomena. Next, the industrial case study, the dataset collection and the methodology adopted for the ML experiments are presented in Section 3. Subsequently, Section 4 presents the main results, followed by a brief interpretation and comparison of the proposed models. The deployment of the final data-driven decision model is then explained in Section 5. Finally, Section 6 provides some concluding remarks.
2. Fracture Of Thermally Tempered Building Glass
2.1. Thermal Tempering of Glass
Thermal tempering is a heat treatment process that consists of raising the temperature of a material to a critical set point for a certain period of time, and then allowing it to cool rapidly at a predetermined rate. As a result of this thermal treatment, the glass becomes stronger but, unlike tempered metals, not harder.
In order to obtain a good-quality tempered glass, precise control of heat transfer during the tempering process is required. This consists of controlling the heating, which takes place through simultaneous radiation and forced convection. Then, the quench rate must be carefully fixed so that the glass is rapidly cooled to below the glass transition temperature (Marshall, 1978).
Tempering gives rise to an extreme temperature difference between the surface and the core (mid-plane) of the piece. Therefore, the glass surface and edges cool off before the core, and this causes permanent stresses in the glass (Narayanaswamy, 1978). The core is under tensile stress, while the zones close to the glass surfaces are under compressive stress.
Due to this residual stress state, thermally tempered glass shows a greater resistance to mechanical shocks and thermal stresses. Moreover, when breakage occurs, tempered glass breaks into small, blunt fragments, eliminating the risk of dangerous shards. Hence, thermally tempered glass is also known as tempered safety glass.
The residual stress state obtained by the tempering process is distributed approximately parabolically across the thickness of a tempered glass, with maximum compression at the free surfaces and maximum tension at the center, as shown in Fig. 2. It represents a balance between compression in the surface layers and tension in the core (mid-plane). Any externally imposed tensile stress at the surface must first overcome this residual compression before any net tensile stress, which can cause failure, can occur.
The residual stress distribution along the plate thickness direction (x-axis) is given by:

$$\sigma(x) = -\sigma_s \left[ 1 - 3\left(\frac{x}{l}\right) + \frac{3}{2}\left(\frac{x}{l}\right)^2 \right] \quad (1)$$

where $\sigma_s$ is the surface compressive stress and $l$ is the half thickness of the glass plate.
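As a quick sanity check of Eq. 1, the snippet below (a minimal sketch; the numerical values of $\sigma_s$ and $l$ are illustrative, not measured) evaluates the profile at the surface and at the mid-plane, recovering the classical result that the mid-plane tension equals half the surface compression.

```python
# Numerical check of Eq. 1; sigma_s = 100 MPa and l = 2 mm are illustrative values.
def residual_stress(x, sigma_s, l):
    """Residual stress across the thickness, per Eq. 1 (x measured from the surface)."""
    r = x / l
    return -sigma_s * (1 - 3 * r + 1.5 * r ** 2)

sigma_s, l = 100.0, 2.0  # surface compression [MPa], half thickness [mm]
print(residual_stress(0.0, sigma_s, l))  # -100.0 -> maximum compression at the surface
print(residual_stress(l, sigma_s, l))    #   50.0 -> mid-plane tension, half the surface value
```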
2.2. Fracture of Tempered Glass
Tempered glass breaks in service when an external load causes tensile stresses in the glass surface that exceed the surface compression due to tempering by more than the tensile strength of the flawed surface. Additionally, fracture may also be initiated by the existence of flaws (such as pores, cracks and bubbles) in the interior, where tempered glass is always under tension (Gardon, 1980). In both cases, the propagation of the fracture is spontaneous. In addition, if the equilibrated residual stress state within the glass plate is sufficiently disturbed and the elastic strain energy in the glass is large enough, this will lead to a complete fragmentation of the glass plate with the creation of many small fragments. Thus, the fragment size depends on the amount of strain energy stored inside the glass: a high stored strain energy causes small fragments, while a small one leads to the creation of larger fragments. It is important to mention that the average size of the fragments represents a rough measure of the quenching quality, or quality of the temper (Fig. 3). More details about the fragmentation process can be found in the literature (Gardon, 1958); (Gardon, 1980); (Akeyoshi, 1967); (Silverman, 2012); (Nielsen, 2010); (Karvinen, 2003); (Aronen, 2018); (Rantala, 2015).
3. Experimental Methodology
This section describes the methodology used for developing our ML-based decision model and the experiments. First, we present the way we constructed the dataset used to train our ML models. Next, we describe the data acquisition process for online prediction. Finally, we present the ML algorithms used for constructing our decision model and set out the evaluation strategy used in the experiments.
3.1. Case study
The automotive glass manufacturing industry produces safety glass for cars, buses, trucks and other transportation systems. The output product is generally either a tempered glass composed of a geometrically curved layer of varying thickness and transparency, or a multi-layer annealed glass strengthened by highly adhesive resins (laminated glass). The production of such parts requires some major material transformation steps, such as the cutting of 2D flat glass shapes on precise CNC machines, the grinding of edges to allow safe handling, and the printing of traceability information on the glass surface. Finally, the pre-processed parts are transferred to the main process operations, which are the heating, the 3D forming using pressing technologies, and then the tempering (or quenching), which consists of applying an accelerated air flow to the hot glass surface to create a material structure transformation. This transformation creates a balanced stress state inside the tempered piece. In order to check the quality level of the tempered glass against international safety and regulation norms, only destructive tests are allowed. Those tests are performed on an hourly basis and generate a large material loss. Each year, a glass manufacturing plant can destroy a total of 30,000 parts solely for the purpose of process quality checks, which can amount to a yearly loss of 150,000 US dollars. For this reason, development efforts should be directed at determining a more economical technique for the glass tempering quality check, ideally based on advanced data analytics and predictive modeling.
The goal of our approach is to construct a sufficiently effective predictive model that learns the physical relationship between the manufacturing settings and the exact fracture (or fragmentation) test result. To reach this goal, we started by selecting the manufacturing variables that can influence the glass tempering quality. Then, we implemented a manual procedure for data collection, which was executed over three months and covered 206 destructive tests, whose numerical fracture results were recorded in a dataset together with the manufacturing settings that led to those results. Next, we proceeded to data preprocessing and cleaning, which helped us eliminate the ambiguous or missing values that inevitably came from the manual data filling procedure. After the described steps, we moved to machine learning algorithm selection and testing, where several models were constructed, tuned and compared. As a result, three models derived from the respective algorithms are worth describing in the present paper: Multiple Linear Regression, Artificial Neural Networks and Random Forests. Then, a validation phase was conducted to select the most accurate model as the solution to our problem. Finally, after model development and validation, a model deployment on the manufacturing shop floor was designed in order to allow autonomous prediction triggering and reaction based on the product flow.
3.2. Data construction (collection) for training
The applicability of artificial intelligence, and especially machine learning, in the industrial field requires some expertise in selecting the adequate problem to solve. One key consideration in problem selection is data availability, which conditions our ability to develop robust and efficient ML models able to generalize. Unfortunately, in our case study, and because of the lack of digitization, no data were available. Our first major task consisted of constructing and collecting the appropriate data for training our ML models.
The data collection process consists, for every destructive test, of recording the process and product parameters, then noting the fragmentation value (Fig. 4; Fig. 5). The data collection process took three months and enabled us to construct a dataset of 206 samples.
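A minimal sketch of this loading and cleaning step is shown below; the file name and column names are hypothetical placeholders, since the actual parameter names are those listed in Fig. 4.

```python
import pandas as pd

# Load the 206 manually recorded tests; "tempering_tests.csv" and the column
# names are hypothetical placeholders for the collected parameters.
df = pd.read_csv("tempering_tests.csv", decimal=",")

# Drop rows with missing or ambiguous entries coming from manual data filling
df = df.dropna()

X = df.drop(columns=["fragment_count"])  # process/product parameters
y = df["fragment_count"]                 # fragmentation test result (target)
print(X.shape, y.shape)
```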
3.3. Exploratory Data Analysis (EDA)
The main task that we perform in this step is the correlation test. To find out whether two categorical variables are related, we use the well-known chi-square test. In this test, the null hypothesis is simply that the two variables tested are independent. The test is accompanied by a test statistic, which participates in the decision to reject or not reject the null hypothesis; this statistic is constructed so as to follow a chi-square law with a certain number of degrees of freedom.
On the other hand, there is a test to determine whether two continuous variables are independent: the Pearson correlation test. The null hypothesis to be tested is identical: the two variables tested are independent. As with the chi-square test, it is accompanied by a test statistic and a p-value that determine whether or not the null hypothesis is rejected.
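The snippet below sketches both tests with SciPy; `df` is the dataset loaded above and the column names are hypothetical placeholders.

```python
import pandas as pd
from scipy.stats import chi2_contingency, pearsonr

# Chi-square independence test between two (hypothetical) categorical variables:
contingency = pd.crosstab(df["glass_type"], df["print_color"])
chi2, p_cat, dof, _ = chi2_contingency(contingency)

# Pearson correlation test between two (hypothetical) continuous variables:
r, p_cont = pearsonr(df["air_pressure_top"], df["air_temperature_top"])

# In both cases the null hypothesis is "the two variables are independent";
# a p-value below the chosen significance level leads to its rejection.
print(p_cat, r, p_cont)
```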
Since our categorical variables have little impact on the industrial process result, we present hereafter the pair plot (Fig. 6) and the correlation matrix (Fig. 7) for our dataset.
As can be seen, some correlation exists between the values of air pressure and air temperature; this is normal behavior in the pneumatic field, as air temperature increases when pressure increases. We can also notice that the pressure and the air temperature applied to the top side of the glass are correlated, respectively, with the air pressure and air temperature applied to the bottom surface of the part. This comes from the fact that a balance between the top and bottom sides of the part, in terms of thermal exchange, must be maintained to avoid certain types of defects, such as shape and optical distortion.
We decided to keep the same variables as inputs and to test different modeling techniques using different machine learning algorithms, some of which, like the ridge regression algorithm, already deal with collinearity and multicollinearity.
3.4. Machine learning models and results
The application of Machine Learning (ML) techniques for predicting product quality and process performance in manufacturing companies is at the heart of the Industry 4.0 strategy. Successful ML models can be expected to make a significant positive impact on global production performance. This is only achievable by selecting appropriate ML algorithms, owning meaningful data and handling a high value-added application. We investigated several supervised learning algorithms before selecting the appropriate ones to apply in our study.
In this section, we present the main results of our study and our findings from these results. We exhibit a comparison of the predictive performance obtained by the ML algorithms under their different configurations. The performance of each of the predictive models was evaluated using several performance indicators, such as mean absolute error, mean squared error, root mean squared error, R-squared, mean absolute percentage error and root mean squared log error. All the models used in this research were developed using Python 3.6 with the PyTorch, scikit-learn, NumPy and pandas packages.
3.4.1 Ridge Regression
Ridge Regression is a variation of linear regression usually used to handle the problem of multicollinearity in a multiple linear regression context. When the independent variables of a linear regression model are highly correlated, least squares estimators are unbiased but their variances are large, so that predictions may be far from the target values. The ridge regression technique consists of adding a regularization penalty, also known as the L2 penalty, to the loss function during training. Thus, the loss function is altered by adding a penalty equivalent to the square of the coefficients' magnitude multiplied by a penalty term $\lambda$, as described in the following equation:

$$\min \sum_{i=1}^{m}\left(y_i - \sum_{j=1}^{n} x_{ij}b_j - b_0\right)^2 + \lambda \sum_{j=0}^{n} b_j^2$$

This is equivalent to minimizing the loss function in Eq. 3 under the following condition:

$$\sum_{j=0}^{n} b_j^2 < c \quad \text{for some } c > 0$$
Therefore, ridge regression puts a constraint on the coefficients $b_j$, so that the optimization function is penalized when the parameters take large values. To sum up, by applying the L2 penalty, ridge regression minimizes the standard errors by shrinking the coefficients of the input variables, in order to avoid the issue of multicollinearity and enhance the accuracy and reliability of the regression estimates.
Applying the ridge regression algorithm to our problem data, with a regularization strength of 1, the default solver and a tolerance of 0.001, gives the results presented in Table 1 below; a reproduction sketch follows the table.
Fig. 8 shows the learning curve (bottom right), the residuals plot (top right), the prediction error (bottom left) and the predictions vs. real values (top left) of the Ridge Regression model.
Table 1. Performance indices of the Ridge Regression model

                     MAE     MSE     RMSE    R²     RMSLE   MAPE
Mean                 14.78   428.51  19.72   0.21   0.22    0.24
Standard deviation   3.99    270.50  6.28    0.29   0.16    0.33
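A minimal sketch of this experiment with scikit-learn is given below, assuming the feature matrix `X` and fragmentation counts `y` prepared earlier; the 10-fold protocol matches the cross-validation strategy used for all the tables in this study.

```python
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_validate

# Ridge with regularization strength alpha=1.0, default solver, tolerance 0.001
model = Ridge(alpha=1.0, tol=0.001)
scores = cross_validate(
    model, X, y, cv=10,
    scoring=("neg_mean_absolute_error", "neg_mean_squared_error", "r2"),
)
print(-scores["test_neg_mean_absolute_error"].mean())  # cross-validated MAE
print(-scores["test_neg_mean_squared_error"].mean())   # cross-validated MSE
print(scores["test_r2"].mean())                        # cross-validated R²
```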
3.4.2 Light Gradient Boosting Machine
Light Gradient Boosting Machine, or LightGBM for short, is an extension of the gradient boosting framework based on the decision tree algorithm, providing higher efficiency, faster training speed and improved predictive performance. The algorithm was first introduced by (Ke, 2017) and is based on two novel techniques, namely Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). GOSS is a new sampling method that filters out data instances by focusing on instances resulting in a larger gradient, while performing random sampling on instances with small gradients, in order to find the best split value. As for EFB, it is a near-lossless approach for reducing the number of effective features by merging sparse, mutually exclusive features and treating them as a single feature. The two techniques combine to provide an efficient and effective implementation of the gradient boosting algorithm, speeding up the learning procedure and enhancing the capability of handling large-scale data.
Unlike other boosting algorithms that split the tree level-wise, LightGBM splits the tree leaf-wise, which means growing the tree by splitting the data at the nodes with the highest loss change (Fig. 9). As such, the leaf-wise algorithm can reduce the loss more than the level-wise algorithm, and hence achieve much better accuracy than other boosting techniques. However, when dealing with smaller datasets, the leaf-wise algorithm may lead to overfitting and increase model complexity, in which case level-wise growth can be a good alternative.
Table 2 shows the performance of the Light Gradient Boosting Machine model with the following parameters (a training sketch follows the list):
Boosting type = Gradient Boosting Decision Tree
Learning rate = 0.1
Max depth = -1
Number of estimators = 100
Number of leaves = 31
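The sketch below shows how such a model can be set up with the `lightgbm` package under these exact settings; `X` and `y` are the features and fragmentation counts from the dataset described earlier.

```python
from lightgbm import LGBMRegressor
from sklearn.model_selection import cross_validate

model = LGBMRegressor(
    boosting_type="gbdt",   # Gradient Boosting Decision Tree
    learning_rate=0.1,
    max_depth=-1,           # no depth limit (leaf-wise growth)
    n_estimators=100,
    num_leaves=31,
)
scores = cross_validate(model, X, y, cv=10, scoring="neg_mean_absolute_error")
print(-scores["test_score"].mean())  # cross-validated MAE
```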
The model interpretation plot (Fig. 11), using SHAP (SHapley Additive exPlanations) values, shows the high impact of the air pressure and the air temperature applied to the top surface of the part. The processing time of the part also shows a high impact on the prediction result.
Table 2. Performance indices of the Light Gradient Boosting Machine model

                     MAE     MSE     RMSE    R²     RMSLE   MAPE
Mean                 15.21   466.02  20.77   0.07   0.25    0.43
Standard deviation   4.21    248.97  5.85    0.54   0.22    0.89
Fig. 10 shows the learning curve (bottom right), the residuals plot (top right), the prediction error (bottom left) and the predictions vs. real values (top left) of the Light Gradient Boosting model.
3.4.3 Lasso Regression
Just like ridge regression, Lasso regression is another variation of linear regression that uses shrinkage to pull data values towards a central point, such as the mean. This particular type of regression uses the L1 regularization technique, adding a penalty equivalent to the absolute value of the coefficients' magnitude. The Lasso technique is of high interest when dealing with multicollinearity issues, and also for performing a feature selection procedure. It consists of identifying the variables, and their corresponding regression parameters, that provide more accurate predictions. This is achieved by applying a constraint on the model parameters that shrinks the regression coefficients towards zero. The loss function for Lasso regression can be written as:

$$\min \sum_{i=1}^{m}\left(y_i - \sum_{j=1}^{n} x_{ij}b_j - b_0\right)^2 + \lambda \sum_{j=0}^{n} \left|b_j\right|$$
This type of regularization imposes a penalty that forces the sum of the absolute values of the coefficients' magnitude to be less than a fixed value, with $\lambda$ denoting the amount of shrinkage. Unlike ridge regression, Lasso regularization can lead to zero coefficients, which are then eliminated from the model; this is suitable for producing simpler models with few coefficients.
This is equivalent to minimizing the loss function in Eq. 3 under the following condition:

$$\sum_{j=0}^{n} \left|b_j\right| < t \quad \text{for some } t > 0$$
Using the Lasso model with the parameters below gives the results displayed in Table 3 (see the sketch after this list):
Alpha = 1.0
Maximum iterations = 1000
Normalization = False
Selection = cyclic
Optimization tolerance = 0.0001
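The sketch below shows the corresponding scikit-learn setup and how the zeroed coefficients can be read back as an implicit feature selection; `X` and `y` are as before (the normalization option is left at its library default).

```python
from sklearn.linear_model import Lasso

model = Lasso(alpha=1.0, max_iter=1000, tol=0.0001, selection="cyclic")
model.fit(X, y)

# A side effect of the L1 penalty: some coefficients are driven exactly to
# zero, which amounts to removing those variables from the model.
kept = [name for name, b in zip(X.columns, model.coef_) if b != 0.0]
print("variables kept by the L1 penalty:", kept)
```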
Table 3. Performance indices of the Lasso Regression model

                     MAE     MSE     RMSE    R²     RMSLE   MAPE
Mean                 15.28   450.56  20.17   0.19   0.24    0.41
Standard deviation   4.14    300.40  6.59    0.24   0.22    0.82
Fig. 12 shows the learning curve (bottom right), the residuals plot (top right), the prediction error (bottom left) and the predictions vs. real values (top left) of the Lasso Regression model.
3.4.4 Multiple Linear Regression (MLR)
The Multiple Linear Regression (MLR) model summarizes the relationship between a set of predictor variables and a response variable, also called a criterion. It involves the estimation of a multiple regression equation whose parameters enter linearly and are estimated by the least squares method (Khdoudi, 2019). The most general form of the regression equation can be expressed as follows:

$$y_i = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n + \epsilon \quad (2)$$

where $y_i$ represents the i-th dependent variable, $x_i$ represents the i-th independent (explanatory) variable, $b_i$ is the i-th partial regression coefficient (or regression weight), $b_0$ is the regression intercept, and $\epsilon$ is the error related to the i-th observation.
As mentioned before, the MLR model is constructed using the least squares method as the objective function, with the goal of minimizing the sum of squared errors between the expected and predicted outputs, as described in the following equation:

$$\min \sum_{i=1}^{m}\left(y_i - \sum_{j=1}^{n} x_{ij}b_j - b_0\right)^2 \quad (3)$$
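Since Eq. 3 is the ordinary least squares objective, it admits a closed-form solution; the NumPy sketch below illustrates this, assuming `X` and `y` as defined earlier.

```python
import numpy as np

# Solve Eq. 3 in closed form: prepend a column of ones so that b0 is estimated
# together with the regression weights.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # minimizes the sum of squared errors
b0, b = coef[0], coef[1:]                     # intercept and regression weights
```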
MLR is seen as one of the most classical black-box ML models used for prediction tasks (Ciulla, 2019). It is intuitive and easy to handle, with strong generalization and self-learning abilities, which makes it popular for many real-time applications, especially industrial ones. Black-box methods generally do not require any knowledge of the physical phenomena; thus these methods, MLR among them, provide good predictions without excessive computational cost.
Besides its self-learning abilities and high degree of generalization, MLR is used in our study because it is one of the models most used for linear problems, and we do not know a priori whether our problem is linear or non-linear. This makes MLR a well-founded option for comparison purposes. Here, we used the MLR implementation proposed in the scikit-learn package with its default hyperparameters. Table 4 presents the results obtained using the predetermined metrics.
Hyperparameters:
Fit_intercept = True
Normalize = False
Table 4. Performance indices of the Linear Regression model

                     MAE     MSE     RMSE    R²     RMSLE   MAPE
Mean                 14.97   439.92  19.95   0.19   0.21    0.19
Standard deviation   4.15    279.21  6.44    0.29   0.12    0.20
Fig. 13 shows the learning curve (bottom right), the residuals plot (top right), the prediction error (bottom left) and the predictions vs. real values (top left) of the Linear Regression model.
3.4.5. Random Forest Regression
Random forest (RF) is a supervised machine learning algorithm used for both prediction and classification problems (El Mazgualdi, 2020) that was proposed by Breiman in 2001 (Breiman, 2001). Known as a bagging ensemble learning technique, RF is made up of a number of decision trees (Fig. 14), such that each tree is generated from a randomly selected subset of the same training data, with replacement. The RF predictions are then produced by aggregating the results (outputs) of all the individual trees that form the forest. Robustness to noise, high interpretability and insusceptibility to overfitting are key advantages of RF over other traditional ML models. Additionally, RF is an efficient ML model that requires very little data preprocessing and feature selection, owing to the way in which it is constructed. For real-world, and especially industrial, applications, RF is one of the most popular and most used ML algorithms in the literature because of its ability to deal with high-dimensional and complex datasets.
Known as one of the most powerful ML algorithms requiring very little data preprocessing and feature selection, RF was a strong candidate for our study. Our RF model is implemented using 120 trees; the number of trees was decided by trial and error using the mean absolute error (MAE) as a metric. The other hyperparameters were left at the scikit-learn defaults. For a better, unbiased model, we used the cross-validation strategy (Arlot, 2010). The k-fold cross-validation technique was adopted since it is simple and easy to use. It consists of randomly dividing the data into k folds of similar size. In our case, we divided our data into 10 folds: 9 folds are used to train the RF model and the remaining fold to test it.
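A sketch of this 10-fold procedure with scikit-learn follows; the 120-tree setting reflects the trial-and-error choice described above, with the remaining hyperparameters left at their defaults.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_validate

# 10-fold cross-validation of the random forest; each fold in turn serves as
# the test set while the other 9 are used for training.
model = RandomForestRegressor(n_estimators=120, min_samples_split=2)
scores = cross_validate(model, X, y, cv=10, scoring="neg_mean_absolute_error")
print(-scores["test_score"].mean(), scores["test_score"].std())
```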
Table 5 presents the performance metrics. The parameters of the random forest algorithm listed below give the best model result (Table 5). In addition, the model's SHAP values show a slightly different interpretation (Fig. 16) compared to the Light Gradient Boosting algorithm, since they highlight the pressure and temperature on both the top and bottom surfaces of the part, and give less importance to the processing time of the part.
Hyperparameters:
Criterion = mean squared error
Minimum samples split = 2
Number of estimators = 100
Table 5. Performance indices of the Random Forest model

                     MAE     MSE     RMSE    R²     RMSLE   MAPE
Mean                 15.30   480.15  20.76   0.08   0.25    0.49
Standard deviation   4.92    316.38  6.98    0.52   0.24    1.07
Fig. 15 shows the learning curve (bottom right), the residuals plot (top right), the prediction error (bottom left) and the predictions vs. real values (top left) of the Random Forest model.
3.4.6. Artificial Neural Networks (ANN)
Artificial Neural Networks (ANN) have been widely used to deal with both classification and prediction problems. They are known for their high accuracy, which makes them particularly valuable for handling complex problems without requiring an exact mathematical description of the underlying phenomena, and for their strong ability to generalize to the unseen part of a population, especially when the sample data contain noisy information. ANNs are inspired by biological neural networks. In the present work, a feed-forward network based on the back-propagation training algorithm is adopted as the particular ANN class (Schmidhuber, 2015). It consists of three building blocks, also called layers, namely the input layer, the hidden layers and the output layer. Each layer is composed of multiple neurons (Fig. 17) that are connected to the neurons of the other layers. The training process of the feed-forward network consists of propagating information, weighted by the connections, from the input layer through all hidden layers to the output layer. Then a back-propagation step adjusts the weights, using gradient descent, based on the residual error between the simulated values of the network and the target outputs (Jain, 1996). The transfer function that relates the neurons of a layer to the neurons of the previous and subsequent layers is expressed as follows:
$$y = f\left(\sum w x + b\right) \quad (4)$$

where $f$ denotes the transfer function, $y$ is the output of the neuron, $b$ is the bias value of the neuron, $x$ is the input vector of the neuron, and $w$ is the weight vector of the neuron.
A key consideration in constructing a robust and powerful ANN model is to correctly determine the structure used, in terms of the number of nodes and hidden layers.
Our ANN model is implemented using an input layer of 9 neurons, which represent our input features, two hidden layers, and an output layer of one node, since we are predicting one final value (the fragmentation count). The number of hidden layers was chosen based on literature recommendations, given that 2 hidden layers can approximate any complex relation between a set of input variables and output variables. For the number of neurons in each hidden layer, we tried different configurations, some obtained by trial and error and others using heuristics and meta-heuristics (Hornik, 1991); (Yamasaki, 1993); (Hunter, 2012). Finally, we adopted the three configurations that gave the best results, one of them using Huang's network architecture for two-hidden-layer feed-forward networks (TLFN) (Huang, 2003); (El Mazgualdi, 2020). The ANN model was constructed using the PyTorch package, and its architecture under its three configurations is presented in Table 6, followed by a training sketch of configuration 3.
Table 6. ANN architecture under its three configurations

                      Configuration 1   Configuration 2   Configuration 3
Hidden layers         2                 2                 2
First layer           38 neurons        38 neurons        128 neurons
Second layer          7 neurons         7 neurons         64 neurons
Activation function   ReLU              ReLU              ReLU
Optimizer             Adam              Adam              Adam
Loss function         MSE loss          MSE loss          MSE loss
Learning rate         0.001             0.01              0.01
Number of epochs      4000              2000              2000
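The sketch below reproduces configuration 3 of Table 6 in PyTorch as a minimal full-batch training loop; the tensor conversion assumes `X` and `y` are the pandas objects prepared earlier.

```python
import torch
from torch import nn

# Configuration 3: 9 input features, two hidden layers of 128 and 64 ReLU
# neurons, one output node (the fragmentation count).
model = nn.Sequential(
    nn.Linear(9, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

X_t = torch.tensor(X.values, dtype=torch.float32)
y_t = torch.tensor(y.values, dtype=torch.float32).unsqueeze(1)

for epoch in range(2000):            # epoch count of configuration 3
    optimizer.zero_grad()
    loss = loss_fn(model(X_t), y_t)  # forward pass and MSE loss
    loss.backward()                  # back-propagation of the residual error
    optimizer.step()                 # gradient-descent weight update (Adam)
```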
The performance indices of the best constructed ANN models are presented in Table 7. Fig. 18 presents the loss curves for both the training and test sets for the best configuration. For each configuration, the two losses follow the same decreasing trend, which reflects the stability and the performance of the model.
Table 7. Performance indices of the Artificial Neural Networks model

                     MAE     MSE     RMSE    R²     RMSLE   MAPE
Mean                 15.47   505.16  21.14   0.11   0.24    0.46
Standard deviation   4.55    352.18  5.95    0.44   0.24    1.12
Fig. 19 shows the learning curve (bottom right), the residuals plot (top right), the prediction error (bottom left) and the predictions vs. real values (top left) of the Artificial Neural Networks model.
4. Result Analysis
4.1. Comparative analysis
The models introduced in the prior sections were all compared with the aim of providing precise real-time predictions of the thermal quality value with a high degree of generalization. The quality value was more closely approximated using Ridge Regression with the cross-validation technique than with the remaining ML models. This conclusion is confirmed by comparing the mean absolute error (MAE), the mean squared error (MSE), the root mean squared error (RMSE) and the R-squared (R²) between predictions and targets for the discussed models. These metrics are considerably smaller for the Ridge Regression model; in second and third position, we observe an acceptable performance from multiple linear regression and the light gradient boosting machine. The artificial neural network, despite being robust and well suited to handling non-linear problems, presents comparatively lower accuracy than the previous algorithms. This could be explained by the fact that ANN models generally require a large dataset, a condition not satisfied in our case since our dataset does not exceed 206 samples.
Finally, we can explain the performance of the ridge regression by the fact that this algorithm performs very well on small datasets with highly correlated input variables, which is precisely the situation revealed by our exploratory data analysis.
4.2 Extended comparison
In this part, we present an extended comparison using additional algorithms. The goal is to explore more results and understand how a set of classical and modern algorithms performs on our dataset. Table 8 shows the cross-validated mean values of the mean absolute error, mean squared error, root mean squared error, R-squared and the other metrics, obtained from models based on the listed algorithms. Table 9 shows the standard deviation of the same metrics and gives an idea of the models' stability and generalization capacity. As an overall conclusion, we can note that the Ridge Regression based model still shows the best result across the 14 tested algorithms.
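Such a comparison table can be produced by running the same 10-fold protocol over a dictionary of candidate regressors; the sketch below shows the idea for a scikit-learn subset of the fourteen algorithms (the metric set is truncated for brevity).

```python
from sklearn.model_selection import cross_validate
from sklearn.linear_model import Ridge, LinearRegression, Lasso, ElasticNet, BayesianRidge
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, AdaBoostRegressor

candidates = {
    "Ridge Regression": Ridge(),
    "Linear Regression": LinearRegression(),
    "Lasso Regression": Lasso(),
    "Elastic Net": ElasticNet(),
    "Bayesian Ridge": BayesianRidge(),
    "Random Forest Regressor": RandomForestRegressor(),
    "Gradient Boosting Regressor": GradientBoostingRegressor(),
    "AdaBoost Regressor": AdaBoostRegressor(),
}
for name, model in candidates.items():
    cv = cross_validate(model, X, y, cv=10,
                        scoring=("neg_mean_absolute_error", "r2"))
    # Mean over the 10 folds, as reported in Table 8
    print(name,
          round(-cv["test_neg_mean_absolute_error"].mean(), 2),
          round(cv["test_r2"].mean(), 2))
```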
Table 8. Performance results (mean value over 10 folds) of the tested algorithms

                                  MAE     MSE     RMSE    R²     RMSLE   MAPE
Ridge Regression                  14.78   428.51  19.72   0.21   0.22    0.24
Linear Regression                 14.97   439.92  19.95   0.19   0.21    0.19
Light Gradient Boosting Machine   15.21   466.02  20.77   0.07   0.25    0.43
Lasso Regression                  15.28   450.56  20.17   0.19   0.24    0.41
Random Forest Regressor           15.30   480.15  20.76   0.08   0.25    0.49
Artificial Neural Networks        15.47   505.16  21.14   0.11   0.24    0.46
Gradient Boosting Regressor       15.68   524.93  21.91   0.04   0.26    0.47
Extra Trees Regressor             15.93   534.92  21.81   0.02   0.26    0.49
Elastic Net                       16.08   488.86  21.07   0.12   0.25    0.46
Bayesian Ridge                    16.20   526.11  21.90   0.06   0.26    0.52
K Neighbors Regressor             16.22   508.18  21.70   0.01   0.26    0.47
AdaBoost Regressor                16.97   516.50  21.74   0.01   0.26    0.51
Least Angle Regression            17.24   582.87  23.08   0.20   0.22    0.17
Orthogonal Matching Pursuit       17.33   526.78  22.06   0.04   0.27    0.53
Table 9. Performance results (standard deviation over 10 folds) of the tested algorithms

                                  MAE     MSE     RMSE    R²     RMSLE   MAPE
Ridge Regression                  3.99    270.50  6.28    0.29   0.16    0.33
Linear Regression                 4.15    279.21  6.44    0.29   0.12    0.20
Light Gradient Boosting Machine   4.21    248.97  5.85    0.54   0.22    0.89
Lasso Regression                  4.14    300.40  6.59    0.24   0.22    0.82
Random Forest Regressor           4.92    316.38  6.98    0.52   0.24    1.07
Artificial Neural Networks        4.55    352.18  5.95    0.44   0.24    1.12
Gradient Boosting Regressor       4.68    304.12  6.68    0.56   0.23    0.99
Extra Trees Regressor             5.99    380.81  7.68    0.55   0.24    1.07
Elastic Net                       4.06    317.94  6.67    0.24   0.23    0.96
Bayesian Ridge                    4.08    337.51  6.81    0.23   0.24    1.12
K Neighbors Regressor             4.87    269.84  6.07    0.44   0.23    0.99
AdaBoost Regressor                4.13    313.54  6.61    0.51   0.23    1.04
Least Angle Regression            4.99    308.07  7.05    0.75   0.08    0.07
Orthogonal Matching Pursuit       4.22    309.52  6.30    0.18   0.24    1.12
For the same comparison purpose, we explore the model plots through the residuals graph (Fig. 20), the prediction error graph (Fig. 21) and the predicted vs. real values graph (Fig. 22). These plots provide some additional insights, such as the strong overfitting shown by the Extra Trees Regressor and the non-convergence of the LARS algorithm for our problem type.
The figures present, row by row (from top left to bottom right), the plots of the following algorithms:
Ridge Regression
Linear Regression
Light Gradient Boosting Machine
Lasso Regression
Random Forest Regressor
Artificial Neural Networks
Gradient Boosting Regressor
Extra Trees Regressor
Elastic Net
Bayesian Ridge
K Neighbors Regressor
AdaBoost Regressor
Least Angle Regression
Orthogonal Matching Pursuit
5. Deployment
5.1. Real Time Data Acquisition
The next step, after training the ML models, is the data acquisition for performing real-time predictions. It consists in building a pipeline that acquires the parameters automatically and uses them to predict the tempering quality of the current glass while the part is being processed.
The contribution of this section is to show a clear roadmap for the integration of a predictive model into a complex manufacturing process. The pipeline connects directly to the machine's programmable logic controller (PLC) in order to capture the measurements and then send them to the prediction software. Once the online prediction is performed, the software displays the product quality value. The software is also able to turn off the machine when the prediction shows a non-compliant tempering quality.
5.2 Physical validation of the model
After the offline validation of the predictive model using the test and validation data, the launch of the model in production required an additional follow-up, in which the production-level predictions were cross-checked by destructive testing and physical quality checks. This step showed that the model error matches the actual average difference observed in production, which is convincing and encouraging for going further in the deployment steps.
5.3 Industrial connectivity of the prediction system with the physical process
As shown in Fig. 23, the industrialization of our glass tempering quality prediction model requires a digitalized link with the tempering machine in order to be triggered by the arrival of a new part, pull the processing parameters, perform the prediction and communicate the result.
Our final developed system, in addition to the previous connectivity feature, also includes a control function that allows the automatic stopping of the machine in case a prediction result shows a bad tempering quality. All the previous functions were implemented using the PLC (Programmable Logic Controller) vendor's communication protocol, in our case Ethernet/IP over TCP/IP.
5.4 Online prediction of automotive tempered glass
Our computer-based system hosts the predictive model behind a graphical user interface. This GUI displays the parameter values at a 200 ms refresh rate and waits for a trigger to perform and visualize the prediction. The trigger is enabled by a virtual tracking of the workpiece at the moment the quenching cycle starts. The actual production rate is 300 pieces per hour, which means our predictive system is also computing and delivering 300 values per hour, plotted as a visual evolution curve. The predicted value is compared with a minimum threshold to judge whether or not the piece presents a good quality.
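The sketch below illustrates this polling-trigger-predict-react loop, assuming an EtherNet/IP-capable, Logix-style controller reachable through the `pycomm3` library; the IP address, tag names and quality threshold are hypothetical placeholders, and `model` is the trained predictor from Section 3.

```python
import time
from pycomm3 import LogixDriver  # assumes an Allen-Bradley-style EtherNet/IP PLC

MIN_FRAGMENTS = 40  # hypothetical threshold; the real limit comes from the safety norm

# Hypothetical tag names for a subset of the process parameters
FEATURE_TAGS = ["AirPressureTop", "AirPressureBottom", "AirTempTop",
                "AirTempBottom", "ProcessingTime"]  # plus the remaining inputs

with LogixDriver("192.168.1.10") as plc:
    while True:
        if plc.read("QuenchCycleStart").value:           # virtual part-tracking trigger
            features = [[plc.read(tag).value for tag in FEATURE_TAGS]]
            prediction = model.predict(features)[0]      # predicted fragmentation count
            if prediction < MIN_FRAGMENTS:               # non-compliant tempering quality
                plc.write(("MachineStopRequest", True))  # reverse signal: stop the machine
        time.sleep(0.2)                                  # 200 ms refresh rate
```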
The client PC (predictive system) communicates with the main PLC through the Ethernet/IP protocol, which is one of the most robust industrial communication solutions. A router is installed to expose the PLC variables; our application then connects in real time through a Wi-Fi network, pulling the measurement data and feeding it to the model (Fig. 24).
The Ethernet/IP protocol was created in 2001 and is increasingly adopted today. It is a proven industrial Ethernet network solution available for industrial automation. Ethernet/IP is a member of a family of networks that implements the Common Industrial Protocol (CIP) at its upper layers. CIP encompasses a comprehensive suite of messages and services for industrial automation applications, including control, security, timing, motion, configuration and information.
In the same way as our predictive system, the main SCADA system also acquires data through the same protocol, using a physical cable. Supervisory Control and Data Acquisition systems collect data from various sensors in a factory, plant or other remote locations, and then send these data to a central computer that manages and controls them. Traditionally, SCADA was designed to run on a private network using wired communication. As the scope becomes larger, wired communication becomes impractical; therefore, wireless communication was integrated into SCADA.
6. Conclusion
In this paper, the performance of several machine learning techniques, including Ridge Regression, Light Gradient Boosting Machine, Random Forest, Lasso Regression, Artificial Neural Networks and more, was evaluated to predict the glass thermal tempering quality. To this end, a data collection step was performed in order to produce a suitable dataset, followed by the development of the most relevant ML models able to deal with the complexity of our industrial case study. The ultimate goal of our work is to show an end-to-end practical methodology for applying manufacturing quality prediction in complex manufacturing processes, which is not clearly covered in the literature. The selected use case justifies the high added value of such techniques, which can replace traditional, costly destructive testing.
The results indicated that the applied models have varying performance for predicting thermal tempering quality; however, the Ridge Regression algorithm presented the best overall predictive performance for the test examples, which is acceptable for practical purposes. Taking this a step forward, we performed an additional follow-up of the new predictions by getting them cross-checked through destructive testing and physical quality evaluation. This step confirmed our findings about the model performance. Finally, we constructed a digitalized device that contains our best performing ML model for predicting product quality in real time. The device is connected to the machine's PLC in order to be triggered by the arrival of a new part, pull the processing parameters, perform the prediction, and then communicate the result. It is also designed as an error-proofing system performing a control function, sending a reverse signal that stops the machine in case of an anomaly. This work can be extended by adding optimization techniques to propose the ideal manufacturing settings that keep the process in the safe interval.
Declarations
Ethical Approval
This article does not involve human or animal participation or data, therefore ethics approval is not
applicable.
Consent to Participate
This article does not involve human or animal participation or data, therefore consent to participate is not
applicable.
Consent to Publish
This article does not involve human or animal participation or data, therefore consent to publication is
not applicable.
Authors' Contributions
The authors' contributions are as follows:
Abdelmoula Khdoudi: Conceptualization and design of study, Implementation, Writing - original draft, Results analysis, Approval of the version of the manuscript to be published.
Noureddine Barka: Conceptualization and design of study, Interpretation of results, Writing - review and editing, Validation, Approval of the version of the manuscript to be published.
Tawfik Masrour: Conceptualization and design of study, Interpretation of results, Writing - review and editing, Validation, Approval of the version of the manuscript to be published.
Ibtissam El Hassani: Conceptualization and design of study, Interpretation of results, Writing - review and editing, Validation, Approval of the version of the manuscript to be published.
Choumicha El Mazgualdi: Conceptualization and design of study, Interpretation of results, Writing - review and editing, Validation, Approval of the version of the manuscript to be published.
Funding
The authors declare that no funding is received in the framework of this paper.
Competing Interests
The authors declare that there are no conflicts of interest or competing interests concerning this paper. In this regard, they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Availability of data and materials
All data, material, and codes used in this paper are available.
References
1. Aronen, A., & Karvinen, R. (2018). Effect of glass temperature before cooling and cooling rate on residual stresses in tempering. Glass Structures & Engineering, 3(1), 3-15.
2. Akeyoshi, K., Kanai, E., Yamamoto, K., & Shima, S. (1967). Study on the physical tempering of glass plates. Rep. Res. Lab. Asahi Glass, 17(1), 23-26.
3. Arlot, S., & Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics Surveys, 4, 40-79.
4. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
5. Ciulla, G., & D'Amico, A. (2019). Building energy performance forecasting: A multiple linear regression approach. Applied Energy, 253, 113500.
6. El Mazgualdi, C., Masrour, T., El Hassani, I., & Khdoudi, A. (2020, March). A Deep Reinforcement Learning (DRL) Decision Model for Heating Process Parameters Identification in Automotive Glass Manufacturing. In International Conference on Artificial Intelligence & Industrial Applications (pp. 77-87). Springer, Cham.
7. El Mazgualdi, C. E., Masrour, T., El Hassani, I., & Khdoudi, A. (2020). Machine learning for KPIs prediction: a case study of the overall equipment effectiveness within the automotive industry. Soft Computing, 1-19.
8. Gardon, R. (1958). Calculation of temperature distributions in glass plates undergoing heat-treatment. Journal of the American Ceramic Society, 41(6), 200-209.
9. Gardon, R. (1980). Thermal tempering of glass. In Glass: Science and Technology, ed. D. R. Uhlmann and N. J. Kreidl.
10. Ke, G., et al. (2017). LightGBM: a highly efficient gradient boosting decision tree. In International Conference on Neural Information Processing Systems, 3149-3157.
11. Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2), 251-257.
12. Huang, G. B. (2003). Learning capability and storage capacity of two-hidden-layer feedforward networks. IEEE Transactions on Neural Networks, 14(2), 274-281.
13. Hunter, D., Yu, H., Pukish III, M. S., Kolbusz, J., & Wilamowski, B. M. (2012). Selection of proper neural network sizes and architectures: A comparative study. IEEE Transactions on Industrial Informatics, 8(2), 228-240.
14. Karvinen, R., Rantala, M., & Pesonen, T. (2003, August). Heat transfer in glass tempering and forming processes. In Advances in Heat Transfer Engineering: 4th Baltic Heat Transfer Conference (pp. 25-27).
15. Khdoudi, A., Masrour, T., & El Mazgualdi, C. (2019, July). Using Machine Learning Algorithms for the prediction of Industrial Process Parameters Based on Product Design. In International Conference on Advanced Intelligent Systems for Sustainable Development (pp. 728-749). Springer, Cham.
16. Marshall, D. B., & Lawn, B. R. (1978). Strength degradation of thermally tempered glass plates. Journal of the American Ceramic Society, 61(1-2), 21-27.
17. Narayanaswamy, O. S. (1978). Stress and structural relaxation in tempering glass. Journal of the American Ceramic Society, 61(3-4), 146-152.
18. Nielsen, J. H., Olesen, J. F., Poulsen, P. N., & Stang, H. (2010). Simulation of residual stresses at holes in tempered glass: a parametric study. Materials and Structures, 43(7), 947-961.
19. Rantala, M. (2015). Heat transfer phenomena in float glass heat treatment processes.
20. Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85-117.
21. Jain, A. K., Mao, J., & Mohiuddin, K. M. (1996). Artificial neural networks: A tutorial. Computer, 29(3), 31-44.
22. Silverman, M. P., Strange, W., Bower, J., & Ikejimba, L. (2012). Fragmentation of explosively metastable glass. Physica Scripta, 85(6), 065403.
23. Yamasaki, M. (1993, September). The lower bound of the capacity for a neural network with multiple hidden layers. In International Conference on Artificial Neural Networks (pp. 546-549). Springer, London.
Figures
Figure 1. The physical fragment count of a tempered glass
Figure 2. Stress distribution across the thickness of a thermally tempered glass plate
Figure 3. Fracture pattern of non-tempered glass (left) vs. tempered glass (right)
Figure 4. The selected input process parameters
Figure 5. Manufacturing process data collection procedure
Figure 6. Pair plot of the dataset's variables
Figure 7. Correlation matrix
Figure 8. Model plots for the Ridge Regression model
Figure 9. Architecture of the Light Gradient Boosting algorithm
Figure 10. Model plots for the Light Gradient Boosting model
Figure 11. SHAP (SHapley Additive exPlanations) plot for the Light Gradient Boosting Machine
Figure 12. Model plots for the Lasso Regression model
Figure 13. Model plots for the Linear Regression model
Figure 14. Random Forest algorithm architecture
Figure 15. Model plots for the Random Forest model
Figure 16. SHAP (SHapley Additive exPlanations) plot for the Random Forest algorithm
Figure 17. Artificial Neural Network architecture
Figure 18. Learning curve for the Artificial Neural Networks model
Figure 19. Model plots for the Artificial Neural Networks model
Figure 20. Residuals plot for the tested algorithms
Figure 21. Error plot for the tested algorithms
Figure 22. Predictions plot for the tested algorithms
Figure 23. Physical device architecture for the predictive system
Figure 24. Industrial network setup