DNNET-Ensemble approach to detecting and identifying
attacks in IoT environments
Cristiano A. de Souza1, Carlos B. Westphall1, Jean D. G. Valencio2
Renato B. Machado2, Wesley dos R. Bezerra1
1Departamento de Informática – Universidade Federal de Santa Catarina (UFSC)
Abstract. The growth of the Internet of Things (IoT) and computing applica-
tions creates a greater possibility of vulnerabilities, which malicious entities
can use to cause damage. This makes special security techniques such as intrusion
detection mechanisms indispensable in modern computer systems. It is important
to detect the attack and identify its category so that specific countermeasures
for that threat category can be applied. However, most existing multiclass detection
approaches have some weaknesses, mainly related to detecting specific
categories of attacks and problems with false positives. This article addresses
this research problem and advances state-of-the-art, bringing contributions to
a two-stage detection architecture called DNNET-Ensemble, combining binary
and multiclass detection. While the benign trafﬁc can be quickly released on
the ﬁrst detection, the intrusive trafﬁc can be subjected to a robust analysis
approach without causing delay issues. Additionally, we propose the DNNET
binary approach for the binary detection level, which can provide more accu-
rate and faster binary detection. The proposed Hybrid Attribute Selection strat-
egy can ﬁnd an optimal subset of attributes through a wrapper method with a
lower training cost due to pre-selection using a ﬁlter method. Furthermore, the
proposed Soft-SMOTE improvement allows operating with a balanced dataset
with a minor training time increase, even in scenarios where there are a large
number of classes with a large imbalance among them. The results obtained
in experiments with renowned intrusion datasets demonstrate that the approach
can achieve superior detection rates and false positives performance compared
to other state-of-the-art approaches.
Internet of Things (IoT) devices have limited resources, so the data they generate must
be transferred through the Internet to be processed and stored in a computational
environment of greater capacity. Since Cloud Computing has latency problems caused by
the distance to the data center, Fog Computing provides services closer to the end
devices (Edge) with less latency [Bonomi et al. 2012]. It stores and processes
information close to IoT devices, reducing the traffic sent to the cloud
[Bonomi et al. 2012] and allowing real-time applications to obtain faster processing
response times. Nonetheless, Smart Environments are not free from security threats and
vulnerabilities, and the growth of IoT increases the likelihood of vulnerabilities,
which malicious entities can use to cause damage. As a consequence, special security
techniques are indispensable in modern computer systems.
Intrusion detection mechanisms are critical to security, aiming to identify
attempted attacks by unauthorized users. However, methods that only detect that an
intrusion is occurring (i.e., binary detection) are insufficient to provide effective
security, since the approach must also be able to mitigate the invasion so that it does
not succeed [Nobakht et al. 2016]. Therefore, it is essential to identify and categorize
the attack in order to select the specific countermeasures that mitigate the related
vulnerability. The classification of the type or category of the attack is also
paramount for the network administrator's decisions: based on the identified category
of a recurrent attack, the administrator can decide to implement actions that correct
the vulnerability exploited by the attack.
Motivations. Most intrusion detection approaches focus on anomaly methods for
binary detection (attack or non-attack) [de Souza et al. 2022]. However, binary meth-
ods cannot identify the type or category of attack. On the other hand, the few ex-
isting multiclass detection approaches that aim to classify the attack in speciﬁc cat-
egories present lower accuracy rates than the binary methods [Prabavathy et al. 2018,
Nguyen et al. 2019]. This is justiﬁed by difﬁculties in identifying speciﬁc types of
attacks [Prabavathy et al. 2018, Almiani et al. 2020, Diro and Chilamkurti 2018]. Fur-
thermore, the approaches present problems related to normal trafﬁc identiﬁcation rates
[Ieracitano et al. 2020, Moustafa et al. 2021]. This metric is extremely important
because it indicates how much of the normal traffic is correctly identified. A high rate
of false positives is a serious problem and, in some cases, degrades the performance of
the network. This is the main reason why breaking down traffic into specific categories
is a more complex problem. Furthermore, IoT and Fog computing environments limit the
design of robust approaches due to the constrained resources available in such environments.
This article addresses this research problem and advances state-of-art, bringing
contributions to a two-stage detection architecture combining binary and multiclass de-
tection. While the benign trafﬁc can be quickly released on the ﬁrst detection, the intrusive
trafﬁc can be subjected to a robust analysis without causing delay issues.
So, we propose the DNNET binary approach for the binary detection level, which
can provide faster binary detection than the DNNKNN [de Souza et al. 2020] approach.
The proposed Hybrid Attribute Selection strategy can ﬁnd an optimal subset of attributes
through a wrapper method having a lower training cost due to pre-selection with a ﬁlter
method. Furthermore, the proposed Soft-SMOTE improvement allows operating with a
balanced dataset without generating a relevant increase in training time, even in scenarios
with a large number of classes and a large imbalance among them.
Therefore, we have advanced the state of the art by providing a complete behavior-
based and false-positive resistant approach called DNNET-Ensemble for detecting and
identifying intrusions in Fog Computing and IoT environments.
The results obtained from experiments with the NSL-KDD and IoTID20 intrusion
datasets demonstrated that the approach outperformed classical machine learning
techniques and state-of-the-art approaches. The proposed approach
obtained higher average balanced accuracy, precision, and recall rates than classical
machine learning and state-of-the-art approaches. It also proved superior to other
approaches at identifying benign traffic, indicating a low rate of false positives,
while requiring a lower computational cost.
Contributions. The main contributions of this work are as follows:
• Proposal of a two-level approach called DNNET-Ensemble for intrusion detection;
• Proposal of the Soft-SMOTE strategy for class balancing with resource constraints;
• Proposal of the Hybrid Attribute Selection strategy to reduce the cost of wrapper
attribute selection approaches;
• Detection and identification results superior to classical machine learning methods
and state-of-the-art approaches;
• Resistance to false positives with lower computational cost requirements.
The remainder of this paper is organized as follows. Section 2 presents recent
works. Section 3 presents a detailed description of the proposed approach. The
experimental evaluation results are presented in Section 4. Finally, Section 5
concludes the paper.
2. Related works
This section presents a literature review on the topic with articles found in the IEEE,
ACM, Elsevier, and Springer databases. Table 1 compares the works found in the state
of the art.
Several works have proposed approaches focused on single methods. Some
are based on neural models such as DNN [Diro and Chilamkurti 2018,
Liang et al. 2022], Deep Belief Networks (DBN) [Vinayakumar et al. 2019], Deep
Recurrent Neural Network (DRNN) [Almiani et al. 2020], AutoEncoder (AE)
[Ieracitano et al. 2020] and Convolutional Neural Network (CNN) [Blanco et al. 2018].
Some of these approaches presented drawbacks regarding false positives. The
normal traffic identification rates obtained were lower than expected, so these
approaches may mistakenly block a large amount of benign traffic. SVM is another widely
used technique. Du et al. [Du et al. 2020] presented a Principal Component Analysis
(PCA) based approach to reduce the data's dimensionality and the SVM classifier's
training time. The work presented some difficulty in detecting privilege escalation attacks.
Furthermore, approaches that work with single classiﬁers can suffer from insta-
bilities. There is no guarantee that a classiﬁer will always perform at its best in all
situations. However, with Ensemble Learning, better classiﬁcation performance than
any single classiﬁer can be achieved [Samat et al. 2014]. The authors Prabavathy et
al. [Prabavathy et al. 2018] have proposed a new multiclass anomaly intrusion detec-
tion technique based on the ensemble learning and Online Sequential-Extreme Learn-
ing Machine (OS-ELM). [Zhao et al. 2022] proposes a hybrid weighted ensemble stack-
ing intrusion detection system. Random Forest (RF), XGBoost, and KNN methods are
used as basic classiﬁers, and Logistic Regression (LR) is selected as meta classiﬁer.
[Albulayhi et al. 2022] proposes an approach with an ensemble method by a majority
vote of four basic classiﬁers: KNN, Decision Tree (DT), a neural model, and a bagging
Table 1. Related works. From the left, the first column gives the reference of
the analyzed work; the second gives the method used for anomaly detection;
the third (M) indicates whether the work supports (✓) or does not support (×)
multiclass detection; the last presents remarks about each work.

Work                          Detection method    M   Observations
[Prabavathy et al. 2018]      OS-ELM              ✓   Low accuracy for some types of attacks
[Diro and Chilamkurti 2018]   DNN                 ✓   Difficulties related to false positives
[Blanco et al. 2018]          CNN+GA              ✓   Difficulties related to false positives
[Almiani et al. 2020]         RNN                 ✓   Lower accuracy than binary methods
[Ieracitano et al. 2020]      AE                  ✓   Difficulties related to false positives
[de Souza et al. 2020]        DNNKNN              ×   KNN's high computational cost
[Du et al. 2020]              PCA-SVM             ✓   Low accuracy for some types of attacks
[Qaddoura et al. 2021b]       DNN+LSTM            ✓   Balancing cost can become high
[Liang et al. 2022]           DNN                 ✓   Preserve data privacy
[Zhao et al. 2022]            Ensemble stacking   ✓   Low accuracy for some types of attacks
[Dat-Thinh et al. 2022]       DT                  ✓   Difficulties related to false positives
[Albulayhi et al. 2022]       Ensemble voting     ✓   Feature selection filter method
[Sarwar et al. 2022]          PSO+XGB+RF          ✓   Low accuracy for some types of attacks
Our work                      DNNET-Ensemble      ✓   No drawbacks/vulnerabilities

method. Although ensemble methods provide greater robustness, they generally require
greater computational capabilities as they work with multiple classiﬁers.
Another important problem in state-of-art is the difﬁculties related to false
positives, generally existing in anomaly-based approaches [Ieracitano et al. 2020,
Dat-Thinh et al. 2022]. The approach proposed in [Dat-Thinh et al. 2022] presents these
difﬁculties, as it blocks all trafﬁc previously identiﬁed as intrusive by the ﬁrst level of
binary detection with the DT method. Nevertheless, DTs are susceptible to overﬁtting
issues [T.K. et al. 2021]. In addition, the second level only classiﬁes the type of attack,
not being able to correct misclassiﬁcations.
The DNN-kNN [de Souza et al. 2020] method, based on neural networks and the
kNN algorithm, can obtain high detection rates for binary detection. However, the kNN
algorithm has disadvantages regarding the computational cost of traffic classification,
which affects the approach's performance. This approach focuses on binary detection only.
Multiclass detection is of paramount importance since identifying the attack in
a category makes it possible to carry out speciﬁc countermeasures for the given type
of threat. Also, the classiﬁcation of the type or category of the attack is required
for the network admin decision process. However, state-of-art multiclass detection ap-
proaches, which aim to classify the attack in speciﬁc categories, have lower accuracy
rates than binary methods [Prabavathy et al. 2018, Sarwar et al. 2022, Zhao et al. 2022].
There are difﬁculties in identifying speciﬁc types of attacks. Furthermore, some ap-
proaches present problems related to normal trafﬁc identiﬁcation [Ieracitano et al. 2020,
Dat-Thinh et al. 2022]. Approaches with false-positive problems can degrade network performance.
Attack detection difficulties are often related to the imbalance of existing training
data. Some works used the Synthetic Minority Oversampling Technique (SMOTE)
to balance the data [Qaddoura et al. 2021a, Qaddoura et al. 2021b]. In particular,
[Qaddoura et al. 2021b] propose a hybrid approach combining two detection steps with
SMOTE for class balancing. However, applying the full SMOTE strategy
in extremely unbalanced scenarios with many classes creates a very large number of
synthetic records, which increases the cost of training and can degrade the machine
learning model's performance.
Some works have tried to ﬁlter the best trafﬁc characteristics with wrapper at-
tribute selection techniques, where classiﬁcation methods are embedded in the selector.
Comparatively, wrapper methods get higher quality attribute sets for detection than ﬁlter
methods. However, wrapper approaches demand more processing and generate higher
computational costs, which can be prohibitive when dealing with large amounts of data.
Therefore, in many cases, the techniques used in the detection, attribute selection,
and class balancing approaches can make the approaches cost high to operate in the Fog-
This article addresses these research problems and advances state-of-art, bringing
contributions to a two-stage detection architecture, wherein benign trafﬁc can be quickly
released on the ﬁrst detection. Thus, intrusive trafﬁc can be subjected to a robust analysis
without causing delay issues.
3. Proposed approach
This section presents a proposal for analyzing and monitoring IoT networks. The
approach is based on anomaly detection and aims to detect and identify the intrusion
category.
Nonetheless, IoT devices often have limited computing resources [Ni et al. 2018].
These limitations make it difficult to run complex anomaly analysis techniques on the
IoT devices themselves. Hence, we propose a two-level detection approach designed to
operate in the fog and the cloud, as proposed in [de Souza et al. 2020]. Although the
cloud has devices with greater computing resources than the IoT [Ni et al. 2018] and the
Fog, it suffers from latency problems caused by the large distance between the IoT
network and the data centers. Fog, in contrast, is closer to the IoT network and can
provide processing and storage mechanisms at the edge of the network
[Bonomi et al. 2012], which makes it possible to detect threats faster.
Initially, the information captured from the monitored network is pre-processed
and sent to the ﬁrst detection level, where a binary detection analysis is performed. The
proposed DNNET classiﬁer is responsible for operating on the ﬁrst level and assigning the
intrusive or non-intrusive label to each event. If it is benign trafﬁc, it is automatically al-
lowed. Otherwise, if the trafﬁc is classiﬁed as intrusive, it is sent to the second level of de-
tection in the cloud layer, where a multiclass classiﬁer identiﬁes the attack category. The
multiclass classiﬁer was adapted from the approach proposed in [de Souza et al. 2022].
This method will classify the event as an attack type or benign. If it is identiﬁed as a type
of attack, this information will be sent to the mitigation module, which will implement
appropriate countermeasures for each intrusion. However, the second level approach can
classify an event as benign, in this case, identifying an event wrongly classiﬁed as intru-
sive by the ﬁrst level. In consequence, the approach allows for the recovery of ﬁrst-level
false positives. In the next sections, details about each detection level are provided.
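The routing between the two levels described above can be sketched as follows. This is only an illustration under stated assumptions: `binary_detector` and `multiclass_classifier` are hypothetical callables standing in for the first-level (fog) and second-level (cloud) models, and the returned dictionary layout is invented for the example.

```python
from typing import Callable, Sequence

BENIGN = "benign"


def analyze_event(
    features: Sequence[float],
    binary_detector: Callable[[Sequence[float]], bool],
    multiclass_classifier: Callable[[Sequence[float]], str],
) -> dict:
    """Route one pre-processed event through the two detection levels."""
    # Level 1 (fog): benign traffic is released immediately.
    if not binary_detector(features):
        return {"action": "allow", "label": BENIGN, "level": 1}
    # Level 2 (cloud): only traffic flagged as intrusive reaches this point.
    label = multiclass_classifier(features)
    if label == BENIGN:
        # The second level recovers a first-level false positive.
        return {"action": "allow", "label": BENIGN, "level": 2}
    # An attack category was identified: hand over to mitigation.
    return {"action": "mitigate", "label": label, "level": 2}
```

Only events flagged at the first level pay the cost of the cloud round trip, which is the property the architecture relies on.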
Ensemble methods provide greater robustness to classification. However, they
have higher processing and training costs than single classifier models. The proposed
approach circumvents this limitation by executing the ensemble method in the
cloud layer, which has greater computational power. Furthermore, the ensemble
approach is only applied to classify previously detected intrusive events into a specific
attack category. As normal events have already been detected and released by the first
stage, there is no need for an urgent response to the events sent to this second
stage of classification, as there is a high chance that they are intrusive. As a result,
the communication latency with the cloud causes minimal delay in the flow of
legitimate traffic.
3.1. First Level - DNNET Binary detection
The hybrid method for binary detection, DNNKNN [de Souza et al. 2020], has a high
detection rate. However, it has a high computational cost at prediction time, as it
performs several comparisons with the instance base during the analysis. As the binary
method is designed to operate in the fog and acts on the first level of analysis, which
is applied to all IoT network traffic, it must be lightweight so as not to cause a
significant delay in the traffic. We therefore optimize the DNNKNN
[de Souza et al. 2020] method and propose the new DNNET method, which introduces
some improvements.
In the DNNET method, the trafﬁc is initially submitted to the neural model, as
illustrated in Figure 1. The model generates outputs on two neurons, one corresponding
to the intrusive class and the other to the non-intrusive class. Each neuron generates an
output between 0 and 1. This value corresponds to the probability that the ﬂow belongs
to the class to which the neuron corresponds.
Figure 1. Illustration of the DNNET binary method.
The approach has a predeﬁned threshold for each neuron. If the neuron of the
intrusive class has an output bigger than its predeﬁned threshold, the analyzed trafﬁc
instance will automatically be classiﬁed as intrusive. The same procedure is applied to
the output of the non-intrusive class neuron. Instances whose outputs are below the
limits on both neurons are treated as cases where the neural model could not
classify with sufficient confidence. Such traffic is sent to a binary Extra Tree (ET).
Before being analyzed by the ET method, the instance has its attributes reduced, which
lowers instance complexity and ET processing cost. The attributes kept are those
selected during the DNNET training stage. The instance with reduced
attributes is then submitted to the ET, and the binary classification generated by the ET
is accepted as final. ET is a Decision Tree ensemble method. DTs are lightweight and
simple methods with good classiﬁcation performance. However, they are susceptible to
overﬁtting issues [T.K. et al. 2021]. Through the ensemble strategy, the ET technique
builds a more robust classiﬁcation model and reduces overﬁtting.
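The decision rule of the DNNET binary method can be sketched as below. This is a minimal illustration, not the authors' implementation: `dnn_probs` and `et_predict` are hypothetical callables standing in for the trained neural model and the trained binary ET, and `selected` is an assumed list of attribute indices kept by the attribute selection step.

```python
from typing import Callable, Sequence, Tuple


def dnnet_predict(
    instance: Sequence[float],
    dnn_probs: Callable[[Sequence[float]], Tuple[float, float]],
    et_predict: Callable[[Sequence[float]], int],
    selected: Sequence[int],
    benign_limit: float = 0.5,
    attack_limit: float = 0.5,
) -> int:
    """Return 1 (intrusive) or 0 (non-intrusive) for one traffic instance."""
    # The neural model outputs (non-intrusive probability, intrusive probability).
    p_benign, p_attack = dnn_probs(instance)
    if p_attack > attack_limit:
        return 1  # the intrusive neuron is confident enough
    if p_benign > benign_limit:
        return 0  # the non-intrusive neuron is confident enough
    # Neither neuron exceeded its limit: reduce the attributes and defer to ET.
    reduced = [instance[j] for j in selected]
    return et_predict(reduced)
```

With both limits at 0.5 almost every instance is decided by the neural model alone; raising a limit sends more instances to the ET.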
3.1.1. Deep Neural Network (DNN)
The DNN used is a feed-forward network of the Multilayer Perceptron type, chosen for its ability
to solve non-linearly separable problems. The number of neurons assigned to each hidden
layer of the network was deﬁned, according to [de Souza et al. 2020], as equal to the input
size. The hyperbolic tangent function was used as an activation function of hidden layer
neurons. It can be deﬁned from the ratio of hyperbolic sine to hyperbolic cosine, as seen
in Equation 1.
$$\varphi(v) = \tanh(v) = \frac{\sinh(v)}{\cosh(v)} \tag{1}$$
The output layer was fixed at two neurons with a softmax activation function,
so each neuron generates a value between 0 and 1. The softmax function exponentiates
the output of each neuron and divides it by the sum of the exponentiated
outputs, generating the probability that the input belongs to a specific class.
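The two activation functions can be written out numerically as follows. This is a small sketch using only the Python standard library; the function names are chosen for the example, and the max-subtraction in `softmax` is a common numerical-stability step rather than something stated in the text.

```python
import math
from typing import List, Sequence


def tanh_activation(v: float) -> float:
    """Hyperbolic tangent as the ratio sinh(v) / cosh(v), as in Equation 1."""
    # Fine for moderate v; math.tanh avoids overflow for very large inputs.
    return math.sinh(v) / math.cosh(v)


def softmax(z: Sequence[float]) -> List[float]:
    """Turn raw output-neuron values into probabilities that sum to 1."""
    m = max(z)  # subtracting the maximum does not change the result
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]
```

For a two-neuron output layer, `softmax` returns the pair of class probabilities that the thresholds of the binary method are compared against.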
3.1.2. Extra Tree (ET)
Extra Tree is an ensemble method aggregating the results of several decorrelated DTs
accumulated within a “forest” to produce the classiﬁcation results. DT is one of the most
widely used classiﬁcation algorithms in data mining [Rokach 2016]. DTs have a tree-like
structure and are composed of nodes. These nodes can be divided into a root node, a set of
intermediate nodes, and a set of leaf nodes [Breiman et al. 1984]. Starting with the entire
data set, the root node corresponds to the ﬁrst division that speciﬁes how the data will
be divided into separate parts. Successive intermediate nodes continue to divide data into
smaller partitions until no further partitioning is required. In this way, the structure’s leaf
nodes represent the final partitions [Rokach 2016]. DTs classify using a set of hierarchical
feature decisions; the decisions made at internal nodes are the division criteria.
The ET focuses on strongly randomizing both the attributes’ choice and the at-
tributes’ cutoff point while dividing a node in the tree. In the extreme case, it builds
totally random trees, whose structures are independent of the output values of the
learning sample [Geurts et al. 2006]. As in RF, a random subset of candidate attributes
is used. However, in ET, instead of searching for the most discriminating cut points,
cut points are drawn randomly for each candidate attribute, and the best of these
randomly generated cut points is chosen as the division rule.
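The node-splitting idea behind Extra Trees can be illustrated with the sketch below. It is a simplified, assumption-laden example: it draws one random cut point per candidate attribute and keeps the cut with the lowest weighted Gini impurity; the function names and the choice of a uniform draw between the attribute's minimum and maximum are illustrative.

```python
import random
from typing import Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def extra_tree_split(
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
    rng: random.Random,
) -> Tuple[int, float]:
    """Return (attribute index, cut point) chosen among k random candidates."""
    candidates = rng.sample(range(len(rows[0])), k)  # random attribute subset
    best = None
    for a in candidates:
        values = [r[a] for r in rows]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # a constant attribute cannot split the node
        cut = rng.uniform(lo, hi)  # cut point drawn at random, not optimized
        left = [y for r, y in zip(rows, labels) if r[a] < cut]
        right = [y for r, y in zip(rows, labels) if r[a] >= cut]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, a, cut)  # keep the best of the random cuts
    if best is None:
        raise ValueError("no candidate attribute can split this node")
    return best[1], best[2]
```

Because no search over cut points is performed, each split is cheap, and the randomness is averaged out across the trees of the ensemble.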
The logic behind the method is that the explicit cutoff and attribute random-
ization combined with the ensemble mean should be able to reduce the variance more
strongly than the weaker randomization schemes used by other methods. Using the original,
complete training data rather than bootstrap replicas is intended to minimize bias
[Verma and Ranga 2020].
The ET has some important parameters. The number of decision trees (a) present
in the structure was set to 10 (a = 10), the minimum number of training samples
required to split a node is 2 (n_min = 2), and the number of attributes considered when
searching for the best division is the square root of the number of existing attributes
(K = sqrt(N)). The Gini Index is used as the split criterion. The parameter p indicates
the number of depth levels each tree can grow. In the design of the ET structure of the
DNNET method, the depth limit was set at ten levels (p = 10). This parameter controls
the size of the trees. Failure to define this limit generates unpruned, fully grown
trees that can potentially be very large in some datasets [Geurts et al. 2006].
3.1.3. DNNET training
The training process of the DNNET approach includes the training of the neural model,
the selection of binary attributes, the training of the binary ET, and the adjustment process
of the limits of the output neurons of the neural model, as can be seen in the Algorithm 1.
Algorithm 1: Training of the DNNET method.
Input: x, y, acceptable_FP_rate, acceptable_FN_rate
model ← TrainingModelDNN(x, y);
reduced_x ← HybridAttributeSelection(x, y);
et ← TrainingBinaryET(reduced_x, y);
benign_neuron_limit ← 0.5;
attack_neuron_limit ← 0.5;
i ← 1;
while i <= 10 do
    pred_y, neurons_output ← DNNET(x);
    fp_rate, fn_rate ← CalculateMetrics(pred_y, y);
    if (fp_rate > acceptable_FP_rate) then
        attack_neuron_limit ← percentile(neurons_output.attack, i * 10);
    end
    if (fn_rate > acceptable_FN_rate) then
        benign_neuron_limit ← percentile(neurons_output.benign, i * 10);
    end
    if (fp_rate <= acceptable_FP_rate) and (fn_rate <= acceptable_FN_rate) then
        break;
    end
    i ← i + 1;
end
DNNET method training has two main parameters, the acceptable false-positive
rate (acceptable_FP_rate) and the acceptable false-negative rate (acceptable_FN_rate).
The process of defining the limits is iterative. It is carried out until the false-positive
(FP) and false-negative (FN) rates obtained are less than or equal to the acceptable
rates or until the limits are equal to 1, in which case all traffic instances would be
sent to be classified by ET.
Initially, the training of the DNN model is carried out. After this step, a strategy
is applied to select the attributes that can best contribute to the binary analysis. For
this purpose, we propose a Hybrid Attribute Selection strategy with two main steps. In
Step 1, a filter-type attribute selection algorithm, Information Gain (IG), is applied.
The set of attributes selected in Step 1 is then submitted to Step 2, composed of the
wrapper algorithms Recursive Feature Elimination (RFE) and Sequential Forward Feature
Selection (SFFS). They use the ET classifier during the selection step to assess the
importance of attributes. Finally, the set of best attributes generated by the second
step is taken as the final result.
Wrapper methods tend to obtain higher quality attribute sets for the detection process.
However, wrapper approaches demand more processing and generate higher computational
costs, which can be prohibitive when dealing with large amounts of data. Thus,
using a ﬁlter method in Step 1 is of great importance, as it allows submitting a partially
reduced data set to Step 2, making the process less costly.
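The filter-then-wrapper idea can be sketched as follows. This is a simplified illustration, not the strategy's actual implementation: Step 1 ranks discrete attributes by information gain, and Step 2 uses a plain greedy forward selection in place of the RFE and SFFS wrappers. The `evaluate` callable is hypothetical; it is assumed to return a quality score (for example, the validation accuracy of an ET trained on the given attribute subset).

```python
import math
from collections import Counter
from typing import Callable, List, Sequence


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column: Sequence, labels: Sequence[int]) -> float:
    """Information gain of one discrete attribute with respect to the labels."""
    n = len(labels)
    remainder = 0.0
    for value in set(column):
        subset = [y for v, y in zip(column, labels) if v == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def hybrid_selection(
    rows: Sequence[Sequence],
    labels: Sequence[int],
    n_filter: int,
    evaluate: Callable[[List[int]], float],
) -> List[int]:
    """Filter by information gain, then run a wrapper on the reduced pool."""
    n_attrs = len(rows[0])
    # Step 1 (filter): keep the n_filter attributes with the highest gain.
    gains = [information_gain([r[a] for r in rows], labels) for a in range(n_attrs)]
    pool = sorted(range(n_attrs), key=lambda a: gains[a], reverse=True)[:n_filter]
    # Step 2 (wrapper): greedy forward selection over the reduced pool only.
    selected: List[int] = []
    best_score = float("-inf")
    while pool:
        score, attr = max((evaluate(selected + [a]), a) for a in pool)
        if score <= best_score:
            break  # no remaining attribute improves the score
        selected.append(attr)
        pool.remove(attr)
        best_score = score
    return selected
```

The cost saving comes from Step 2 calling `evaluate` only on attributes that survived the filter, so attributes discarded in Step 1 never trigger a model training.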
The dataset with the attributes selected by the hybrid attribute selection strategy
is then used to build the binary ET structure. Thereafter, the benign neuron and attack
neuron thresholds are initially set at 0.5. In this case, all instances classiﬁed by DNN are
accepted, and none are sent to ET.
Figure 2. Training process of the proposed approach.
From this, an iterative process of up to 10 steps begins. In each iteration, the
DNNET method is applied to the training data to generate a list of predictions and two
sets of values. These values correspond to the outputs generated by the neurons for each
training instance: one set of output values for the benign-class neuron and one
set for the intrusive-class neuron. Subsequently, the metrics fp_rate and
fn_rate are computed from the predictions. The fp_rate metric, corresponding to the
false-positive rate, is compared with the acceptable_FP_rate parameter. If it is higher
than acceptable, a new threshold is defined for the attack neuron. This new limit will be
higher so that a smaller number of instances are classified only by the DNN, and a larger
number is sent to the ET. The same procedure occurs with the false-negative rate (fn_rate)
and the parameter acceptable_FN_rate. The definition of the new threshold is based on
the sets of benign neuron and attack neuron output values obtained during the DNNET
prediction with the training data. The new threshold for each neuron is the
value corresponding to the (i × 10)th percentile of the set of values generated by
that output neuron, where i corresponds to the current iteration. The threshold definition
process then continues to the next iteration, where it is repeated with the new
thresholds. Thus, the limits advance by ten percentiles at each iteration until reaching
limits that meet the acceptable FP and FN rates. As soon as the FP and FN rates are
acceptable, the ideal limits have been found, and the training is finished.
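The threshold adjustment loop can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the neuron outputs and the ET labels for the training instances are precomputed lists, the FP rate is taken as FP / (FP + TN) and the FN rate as FN / (FN + TP), and the percentile uses linear interpolation. All function names are illustrative.

```python
from typing import Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0 to 100) with linear interpolation."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def predict(p_benign, p_attack, et_label, benign_limit, attack_limit) -> int:
    """DNNET decision for one instance: DNN if confident, otherwise ET."""
    if p_attack > attack_limit:
        return 1
    if p_benign > benign_limit:
        return 0
    return et_label


def rates(pred: Sequence[int], y: Sequence[int]) -> Tuple[float, float]:
    """Return (false-positive rate, false-negative rate)."""
    fp = sum(1 for p, t in zip(pred, y) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, y) if p == 0 and t == 1)
    neg = sum(1 for t in y if t == 0) or 1
    pos = sum(1 for t in y if t == 1) or 1
    return fp / neg, fn / pos


def tune_limits(benign_out, attack_out, et_labels, y, max_fp, max_fn):
    """Adjust both neuron limits over at most 10 iterations."""
    benign_limit = attack_limit = 0.5
    for i in range(1, 11):
        pred = [predict(b, a, e, benign_limit, attack_limit)
                for b, a, e in zip(benign_out, attack_out, et_labels)]
        fp_rate, fn_rate = rates(pred, y)
        if fp_rate <= max_fp and fn_rate <= max_fn:
            break  # both rates are acceptable: keep the current limits
        if fp_rate > max_fp:
            attack_limit = percentile(attack_out, i * 10)
        if fn_rate > max_fn:
            benign_limit = percentile(benign_out, i * 10)
    return benign_limit, attack_limit
```

At the tenth iteration the percentile equals the largest observed output, so no instance exceeds the limit and every instance is passed to the ET.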
3.2. Second level - Ensemble Multiclass identiﬁcation
The second step of the proposed approach is the Identification phase, which is designed
to operate in the cloud computing layer. The method used in this step is the
multiclass ensemble approach proposed in [de Souza et al. 2022], composed of three
different machine learning techniques: an ET, an RF, and a DNN. Ensemble methods are
created by combining multiple models: the classifications of several
conceptually different base machine learning classifiers are combined to improve
generalization and robustness over single classifiers.
Due to the greater resource capacity of the cloud, there is no need to reduce the
structure of the method to conserve resources. So, unlike [de Souza et al. 2022], we
propose to use 100 decision trees in the RF and ET structures (a = 100), while limiting
the depth of each decision tree. Each tree can grow to a maximum of 100 levels
(p = 100). The greater resource capacity also allows for working with a DNN with a
more complex architecture, which was defined with two hidden layers of 150 neurons,
each with a ReLU activation function.
3.3. DNNET-Ensemble Training process
The training process is responsible for generating the models of the proposed approach
and making it able to analyze network trafﬁc to identify intrusions.
Initially, the training dataset goes through a pre-processing step to obtain the nor-
malized set of information for analysis. This set is then submitted to the class balancing
process, as shown in Figure 2.
Class balancing equalizes the number of training instances in each class. The challenge
of working with unbalanced datasets is that most machine learning techniques will
underperform in the minority class. One of the strategies for working with unbalanced
datasets is to create new data for the minority classes to equal the quantity of the majority
class. The SMOTE method can be used for this purpose. It selects nearby examples in
the feature space, draws a line between the examples in the feature space, and creates new
instances at a point along that line.
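The interpolation step at the core of SMOTE can be sketched in a few lines. This is a minimal illustration of generating one synthetic instance; the neighbour search that picks `a` and `b` is omitted, and the function name is chosen for the example.

```python
import random
from typing import List, Sequence


def smote_sample(a: Sequence[float], b: Sequence[float], rng: random.Random) -> List[float]:
    """Create one synthetic instance on the segment between neighbours a and b."""
    u = rng.random()  # position along the line, in [0, 1)
    return [ai + u * (bi - ai) for ai, bi in zip(a, b)]
```

Every synthetic instance lies on the straight line between two existing minority-class instances, so it stays inside the region of feature space already occupied by that class.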
Nevertheless, in scenarios where there is a large imbalance between classes and a
large number of classes, creating a very large number of synthetic records can damage the
machine learning model’s performance and make the training process more expensive.
As a solution, we propose Soft-SMOTE, a less aggressive balancing strategy
that seeks to deﬁne an adequate percentage of records for each class based on the total
number of records. The objective is to create new records for the minority classes until
they reach a minimum amount (minimum) that can contribute to the detection method
learning process. The strategy does not aim to equal the number of records of all classes
with the number of records of the majority class. In extremely unbalanced scenarios,
this would result in a huge amount of new synthetic records. The pseudo-code of the
proposed class balancing strategy is presented in Algorithm 2. Initially, the number of
existing classes in the dataset is identified (n_classes), and the appropriate percentage
of records for each class (percentage) is calculated. This percentage is used to
calculate each class's minimum number of records (minimum).
Algorithm 2: Proposed Soft-SMOTE strategy.
Input: x, y
Output: balanced_x, balanced_y
n_classes ← countNumberClasses(y);
percentage ← 100 / n_classes;
minimum ← (countData(x) / 100) * percentage;
qty_per_class ← countQtyDataPerClass(x, y);
i ← 1;
while i <= n_classes do
    if qty_per_class[i] < minimum then
        final_qty_per_class[i] ← minimum;
    else
        final_qty_per_class[i] ← qty_per_class[i];
    end
    i ← i + 1;
end
balanced_x, balanced_y ← SMOTE(x, y, final_qty_per_class);
For each of the classes, a check is performed. If the currently existing quantity
(qty_per_class) is less than the minimum (minimum), then the minimum quantity defined
earlier becomes the new quantity of records for the class (final_qty_per_class);
otherwise, the existing quantity is kept. These values are then used to indicate the
intended number of records for each class to SMOTE.
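The per-class target computation of Soft-SMOTE can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is illustrative, the minimum is truncated to an integer, and the oversampling itself is left to a SMOTE implementation that accepts per-class target counts.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def soft_smote_targets(y: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Return the intended number of records per class after balancing."""
    counts = Counter(y)
    n_classes = len(counts)
    percentage = 100 / n_classes
    # Minimum records per class, derived from the total number of records.
    minimum = int(len(y) / 100 * percentage)
    # Classes below the minimum are raised to it; the others keep their size.
    return {c: max(q, minimum) for c, q in counts.items()}
```

For example, with 900, 90 and 10 records in three classes the targets become 900, 333 and 333, whereas full balancing would raise both minority classes to 900. A dictionary of this shape could, for instance, be supplied as the `sampling_strategy` of imbalanced-learn's `SMOTE`, assuming that library is the one used.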
After balancing the original training data, a new dataset with balanced classes
is generated and submitted to two different flows: one for training the binary DNNET
method and another for training the multiclass method.
The balanced dataset is submitted to the attribute selection step in one of the ﬂows.
To perform the selection of attributes in this step, the same strategy presented in Section
3.1.3 is used, but considering a multiclass scenario.
Then, the dataset with only the selected attributes is used to train the Ensemble
classiﬁer. After training, an ensemble method capable of performing multiclass detection
on new data is obtained.
The other flow, shown in Figure 2, corresponds to the training of the DNNET
method. Initially, the balanced dataset goes through a binarization step that converts all
intrusive flows to label 1 and the benign ones to label 0. The binary dataset is used to
train the DNNET approach. After the complete training process of the DNNET approach,
the architecture, information, and weights of the neural model are sent for deployment
on the Fog Node.
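The binarization step described above can be illustrated with a short Python sketch. The
function name binarize_labels and the label string "benign" are hypothetical; the actual
class names depend on how the dataset is encoded.

```python
def binarize_labels(labels, benign_label="benign"):
    """Map benign flows to 0 and every intrusive flow to 1."""
    return [0 if label == benign_label else 1 for label in labels]


# Hypothetical multiclass labels for five flows.
multiclass = ["benign", "dos", "u2r", "benign", "r2l"]
binary = binarize_labels(multiclass)
print(binary)
```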
This section presents the methodology defined for the evaluation of the proposal. The
proposed approach and machine learning methods were evaluated through experiments with
the IoTID20 [Ullah and Mahmoud 2020] and NSL-KDD [Tavallaee et al. 2009] datasets.
Each experiment was run five times, with 70% of the data used for training and
30% for testing.
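A repeated 70/30 holdout of this kind could be set up as in the sketch below. The text
does not say how the records were shuffled, whether the split was stratified, or which
seeds were used, so the helper holdout_split, the per-run seeds, and the integer records
are all assumptions made for illustration.

```python
import random


def holdout_split(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into training and test parts."""
    rng = random.Random(seed)             # assumption: one seed per run
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical dataset of 1000 record identifiers, split five times.
records = list(range(1000))
splits = [holdout_split(records, seed=run) for run in range(5)]
for train, test in splits:
    print(len(train), len(test))
```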
4.1. Evaluation of binary classifiers
The NSL-KDD dataset [Tavallaee et al. 2009] is used in many current works
[Diro and Chilamkurti 2018, Rathore and Park 2018] to assess intrusion detection methods.
Each instance has 42 attributes (also called features), of which 41 are behavioral
characteristics of network connections extracted from packet analysis. The last
attribute is the instance label, indicating whether it refers to an attack or normal
behavior. The NSL-KDD dataset has 125,973 records.
Table 2 compares the detection metrics obtained by the binary approaches in experiments
with the NSL-KDD dataset. Both proposed approaches presented higher accuracy and
precision than the other approaches. The method proposed by
[Mohamed Omar et al. 2021] had a high recall rate (99.4%); however, its precision was
below 99%. The accuracy obtained by DNNKNN was 99.43%, and that of DNNET
was 99.64%. In addition, both achieved attack detection rates (DR) of over 99%.
Table 2. Binary detection results of approaches.
Approach                                   ACC (%)   PRE (%)   DR (%)   Train (s)   Test (s)
[Mohamed Omar et al. 2021]                 99.4      98.7      99.4     -           -
[Sahar et al. 2021]                        95.4      96.2      95.4     -           -
[Gopalakrishnan and Purusothaman 2022]     95.6      98.3      92.2     -           -
DNNKNN [de Souza et al. 2020]              99.43     99.36     99.43    247.47      7.97
DNNET                                      99.64     99.88     99.36    113.24      2.36
The DNNKNN method proposed in [de Souza et al. 2020] optimized the KNN
algorithm with respect to computational cost, achieving a 90% reduction in prediction
time; however, the approach remains expensive. Thus, the DNNET method was designed to
optimize the prediction time while maintaining the binary detection quality of
DNNKNN. In the experiments conducted, the DNNET approach achieved a binary detection
quality similar or superior to that of DNNKNN. The objective of reducing the
computational cost was also achieved. As shown in Table 2, the DNNKNN approach needed
approximately 247.5 seconds to complete the training process, while DNNET completed
it in 113.2 seconds, a reduction of approximately 54.2%. Furthermore, the DNNET approach
reduced the prediction time from 7.97 seconds to 2.36 seconds, a reduction of
approximately 70.4%.
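The relative reductions can be recomputed directly from the training and test times
listed in Table 2. The helper reduction_pct below is a hypothetical name used only for
this worked check.

```python
def reduction_pct(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Times in seconds taken from Table 2 (DNNKNN versus DNNET).
train_reduction = round(reduction_pct(247.47, 113.24), 1)
test_reduction = round(reduction_pct(7.97, 2.36), 1)
print(train_reduction, test_reduction)  # 54.2 70.4
```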
4.2. Evaluation of DNNET-Ensemble with NSL-KDD
Table 3 presents the results obtained in experiments with the NSL-KDD dataset. The
kNN, DNN, RF, ET, and DNNET-Ensemble approaches achieved excellent results in identifying
benign traffic, DoS, and probing attacks, with detection rates greater than 99%
in all three categories. In detecting R2L attacks, DNN achieved a recall of approximately
89%, which is inferior to the kNN, RF, and ET approaches and to the
proposed approach, all of which reached approximately 95% detection.
Table 3. Results of experiments with the NSL-KDD dataset.
(All values in %; the class columns give the per-class detection rate.)
Work                      BACC    Benign   DoS     Probe   R2L     U2R
DNN                       85.84   99.69    99.87   99.15   89.26   41.25
kNN                       88.30   99.71    99.92   99.34   95.03   47.50
RF                        86.52   99.94    99.98   99.55   95.64   37.50
ET                        87.71   99.92    99.96   99.49   94.16   45.00
[de Souza et al. 2022]    88.41   99.86    99.93   99.43   92.82   50.00
Our work                  92.60   99.89    99.95   99.19   95.23   68.75
Regarding R2L attacks, the proposed approach reached the second highest rate, behind
the approach of Blanco et al. [Blanco et al. 2018], which obtained 99.18%, as can be seen
in Figure 3. However, the approach of Blanco et al. failed to detect U2R attacks.
The U2R attack category has few flows in the dataset. State-of-the-art approaches
and classic machine learning methods have difficulty detecting these attacks. DNN
was able to identify 41%. The kNN detected approximately 47%, the RF 37%, and the
ET 45%. The approach proposed by Du et al. [Du et al. 2020] detected less than 40%
of U2R attacks. The DNNET-Ensemble surpassed these rates, achieving a recall of
approximately 69%, and was outperformed only by the approach of Almiani et al.
[Almiani et al. 2020], which obtained 77.2% detection. However, that approach does not
present a detection rate for normal flows. Furthermore, the works by Almiani et al.
[Almiani et al. 2020] and Liang et al. [Liang et al. 2022] had detection difficulties in
the R2L category, achieving only 65% and 86% detection, respectively, while the proposed
approach achieved 95%.
As presented in Section 2, several approaches have difficulty identifying benign
traffic [Blanco et al. 2018, Ieracitano et al. 2020, Zhao et al. 2022].
The approach of Blanco et al. [Blanco et al. 2018] identified 95.4% of benign
flows. This can be considered a weakness of these approaches, as it indicates that a
large amount of benign traffic may be mistakenly detected as an attack. False positives
are a major problem with anomaly-based techniques and can damage the network. The
DNNET-Ensemble obtained approximately 99.89% detection of normal flows, which is among
the highest rates of the compared approaches. Therefore, it detects attacks while
generating few false positives.
Considering the Balanced Accuracy (BACC), a metric that accounts for the imbalance
between classes, the proposed approach outperformed the others. The DNNET-Ensemble
reached a balanced accuracy of 92.6%, while DNN obtained 85.8%, kNN 88.3%, RF 86.5%,
ET 87.7%, and [de Souza et al. 2022] 88.4%.
Figure 3. Results of experiments with the NSL-KDD dataset.
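Balanced accuracy is commonly defined as the unweighted mean of the per-class recall, and
the BACC values in Table 3 are consistent with that definition. The sketch below
recomputes them from the per-class rates in Table 3; the dictionary layout and the
function name balanced_accuracy are illustrative choices, not taken from the paper.

```python
def balanced_accuracy(per_class_recall):
    """Unweighted mean of the per-class recall values."""
    return sum(per_class_recall) / len(per_class_recall)


# Per-class detection rates (Benign, DoS, Probe, R2L, U2R) from Table 3.
table3 = {
    "DNN": [99.69, 99.87, 99.15, 89.26, 41.25],
    "kNN": [99.71, 99.92, 99.34, 95.03, 47.50],
    "RF": [99.94, 99.98, 99.55, 95.64, 37.50],
    "ET": [99.92, 99.96, 99.49, 94.16, 45.00],
    "Our work": [99.89, 99.95, 99.19, 95.23, 68.75],
}
bacc = {name: balanced_accuracy(rates) for name, rates in table3.items()}
for name, value in bacc.items():
    print(f"{name}: {value:.2f}")
```

Because every class carries the same weight, the low U2R recall pulls the balanced
accuracy of each method well below its plain accuracy, which is why the gain in U2R
detection accounts for most of the difference between the proposed approach and the
others.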
4.3. Evaluation of DNNET-Ensemble with IoTID20
The IoTID20 dataset [Ullah and Mahmoud 2020] contains network traffic data from IoT
devices and interconnected structures typical of a Smart Home. The monitored
architecture includes devices such as security cameras. A variety of attacks are
recorded, including DoS, the Mirai botnet, Man in the Middle (MITM), and probing
activities.
Figure 4 highlights that some works presented difﬁculties detecting some types
of attacks. [Qaddoura et al. 2021a] had difﬁculties in identifying DoS and Mirai trafﬁc.
[Sarwar et al. 2022] had a recall of 50% in identifying Scan attacks and only 13% for
MITM. The authors [Qaddoura et al. 2021b] identiﬁed only 55% of DoS and 74% of
Figure 4. Results of experiments with the IoTID20 dataset.
The approach proposed by [Dat-Thinh et al. 2022] identified only 31.7% of benign
traffic. Other approaches also had difficulty identifying benign traffic,
obtaining recall rates below 90% [Qaddoura et al. 2021a, Sarwar et al. 2022]. This
indicates that these approaches suffer from false positives, where normal traffic is
erroneously detected as intrusive. In contrast, the DNNET-Ensemble detected approximately
99.4% of the existing normal traffic, thus generating a low rate of false positives.
Furthermore, the proposed approach achieved recall rates above 98% for all attack
categories. The good detection performance in each class is reflected in
the balanced accuracy metric, which reached 99.33%, the highest among the evaluated
techniques, as can be seen in Table 4. Together with the high recall for benign traffic,
this BACC suggests that the approach operates with a lower false positive rate than the
other techniques, which had greater difficulty identifying benign traffic.
Table 4. Results of experiments with the IoTID20 dataset.
(All values in %; the class columns give the per-class detection rate.)
Work                      BACC    Benign   DoS     Mirai   Scan    MITM
DNN                       96.97   98.72    99.85   97.93   97.42   90.95
kNN                       98.03   98.43    99.83   99.39   97.41   95.09
RF                        98.10   98.55    99.91   99.66   97.52   94.89
ET                        97.12   98.23    99.89   99.27   96.09   92.14
[de Souza et al. 2022]    98.99   99.21    99.89   99.47   98.78   97.67
Our work                  99.33   99.40    99.92   99.77   99.09   98.45
5. Conclusions and future work
This work proposes a two-level intrusion detection and identification approach for Fog
Computing and IoT environments called DNNET-Ensemble. We propose improvements
to a recent binary detection approach, resulting in the new DNNET binary approach.
The proposed Hybrid Attribute Selection strategy can find an optimal subset of attributes
through a wrapper method with a lower training cost due to pre-selection with a filter
method. Furthermore, the proposed Soft-SMOTE improvement allows operating with
a balanced dataset without a large increase in training time. The results
obtained in experiments with renowned intrusion datasets demonstrate that the approach
can achieve prediction metrics superior to those of classical machine learning
techniques. The DNNET-Ensemble achieved a balanced accuracy of 92.6%
for NSL-KDD and 99.3% for IoTID20. Furthermore, compared with state-of-the-art
approaches, the proposed approach's capacity to generate a low rate of false positives
is observed.
Future work includes proposing countermeasure mechanisms and a mapping
between attacks and countermeasures.
References
[Albulayhi et al. 2022] Albulayhi, K., Abu Al-Haija, Q., Alsuhibany, S. A., Jillepalli, A. A.,
Ashrafuzzaman, M., and Sheldon, F. T. (2022). Iot intrusion detection using machine
learning with a novel high performing feature selection method. Applied Sciences,
[Almiani et al. 2020] Almiani, M., AbuGhazleh, A., Al-Rahayfeh, A., Atiewi, S., and
Razaque, A. (2020). Deep recurrent neural network for iot intrusion detection system.
Simulation Modelling Practice and Theory, 101:102031. Modeling and Simulation of
[Blanco et al. 2018] Blanco, R., Malagón, P., Cilla, J. J., and Moya, J. M. (2018).
Multiclass network attack classifier using cnn tuned with genetic algorithms. In 2018 28th
International Symposium on Power and Timing Modeling, Optimization and Simulation
(PATMOS), pages 177–182.
[Bonomi et al. 2012] Bonomi, F., Milito, R., Zhu, J., and Addepalli, S. (2012). Fog com-
puting and its role in the internet of things. In Proceedings of the First Edition of the
MCC Workshop on Mobile Cloud Computing, MCC ’12, page 13–16, New York, NY,
USA. Association for Computing Machinery.
[Breiman et al. 1984] Breiman, L., Friedman, J., Stone, C. J., and Olshen, R. A. (1984).
Classiﬁcation and regression trees. CRC press.
[Dat-Thinh et al. 2022] Dat-Thinh, N., Xuan-Ninh, H., and Kim-Hung, L. (2022). Midsiot:
A multistage intrusion detection system for internet of things. Wireless Communica-
tions and Mobile Computing, 2022.
[de Souza et al. 2022] de Souza, C. A., Westphall, C. B., and Machado, R. B. (2022). Two-
step ensemble approach for intrusion detection and identiﬁcation in iot and fog com-
puting environments. Computers & Electrical Engineering, 98:107694.
[de Souza et al. 2022] de Souza, C. A., Westphall, C. B., Machado, R. B., Lofﬁ, L., West-
phall, C. M., and Geronimo, G. A. (2022). Intrusion detection and prevention in
fog based iot environments: A systematic literature review. Computer Networks,
[de Souza et al. 2020] de Souza, C. A., Westphall, C. B., Machado, R. B., Sobral, J. B. M.,
and Vieira, G. S. (2020). Hybrid approach to intrusion detection in fog-based iot envi-
ronments. Computer Networks, 180:107417.
[Diro and Chilamkurti 2018] Diro, A. A. and Chilamkurti, N. (2018). Distributed attack de-
tection scheme using deep learning approach for internet of things. Future Generation
Computer Systems, 82:761 – 768.
[Du et al. 2020] Du, R., Li, Y., Liang, X., and Tian, J. (2020). Support vector machine intru-
sion detection scheme based on cloud-fog collaboration. In International Conference
on Security and Privacy in New Computing Environments, pages 321–334. Springer.
[Geurts et al. 2006] Geurts, P., Ernst, D., and Wehenkel, L. (2006). Extremely randomized
trees. Machine learning, 63(1):3–42.
[Gopalakrishnan and Purusothaman 2022] Gopalakrishnan, B. and Purusothaman, P.
(2022). A new design of intrusion detection in iot sector using optimal feature se-
lection and high ranking-based ensemble learning model. Peer-to-Peer Networking
and Applications, pages 1–28.
[Ieracitano et al. 2020] Ieracitano, C., Adeel, A., Morabito, F. C., and Hussain, A. (2020).
A novel statistical analysis and autoencoder driven intelligent intrusion detection ap-
proach. Neurocomputing, 387:51 – 62.
[Liang et al. 2022] Liang, H., Liu, D., Zeng, X., and Ye, C. (2022). An intrusion detection
method for advanced metering infrastructure based on federated learning. Journal of
Modern Power Systems and Clean Energy, pages 1–11.
[Mohamed Omar et al. 2021] Mohamed Omar, H. O., Goyal, S. B., and Varadarajan, V.
(2021). Application of sliding window deep learning for intrusion detection in fog
computing. In 2021 Emerging Trends in Industry 4.0 (ETI 4.0), pages 1–6.
[Moustafa et al. 2021] Moustafa, N., Keshk, M., Choo, K. R., Lynar, T., Camtepe, S., and
Whitty, M. (2021). Dad: A distributed anomaly detection system using ensemble
one-class statistical learning in edge networks. Future Generation Computer Systems,
[Nguyen et al. 2019] Nguyen, T. G., Phan, T. V., Nguyen, B. T., So-In, C., Baig, Z. A., and
Sanguanpong, S. (2019). Search: A collaborative and intelligent nids architecture for
sdn-based cloud iot networks. IEEE Access, 7:107678–107694.
[Ni et al. 2018] Ni, J., Zhang, K., Lin, X., and Shen, X. (2018). Securing fog computing
for internet of things applications: Challenges and solutions. IEEE Communications
Surveys & Tutorials.
[Nobakht et al. 2016] Nobakht, M., Sivaraman, V., and Boreli, R. (2016). A host-based
intrusion detection and mitigation framework for smart home iot using openﬂow. In
2016 11th International Conference on Availability, Reliability and Security (ARES),
[Prabavathy et al. 2018] Prabavathy, S., Sundarakantham, K., and Shalinie, S. M. (2018).
Design of cognitive fog computing for intrusion detection in internet of things. Journal
of Communications and Networks, 20(3):291–298.
[Qaddoura et al. 2021a] Qaddoura, R., Al-Zoubi, A. M., Almomani, I., and Faris, H.
(2021a). Predicting different types of imbalanced intrusion activities based on a multi-
stage deep learning approach. In 2021 International Conference on Information Tech-
nology (ICIT), pages 858–863.
[Qaddoura et al. 2021b] Qaddoura, R., M. Al-Zoubi, A., Faris, H., and Almomani, I.
(2021b). A multi-layer classiﬁcation approach for intrusion detection in iot networks
based on deep learning. Sensors, 21(9).
[Rathore and Park 2018] Rathore, S. and Park, J. H. (2018). Semi-supervised learning based
distributed attack detection framework for iot. Applied Soft Computing, 72:79 – 89.
[Rokach 2016] Rokach, L. (2016). Decision forest: Twenty years of research. Information
Fusion, 27:111 – 125.
[Sahar et al. 2021] Sahar, N., Mishra, R., and Kalam, S. (2021). Deep learning approach-
based network intrusion detection system for fog-assisted iot. In Proceedings of in-
ternational conference on big data, machine learning and their applications, pages
[Samat et al. 2014] Samat, A., Du, P., Liu, S., Li, J., and Cheng, L. (2014). E2LMs : En-
semble extreme learning machines for hyperspectral image classiﬁcation. IEEE Jour-
nal of Selected Topics in Applied Earth Observations and Remote Sensing, 7(4):1060–
[Sarwar et al. 2022] Sarwar, A., Hasan, S., Khan, W. U., Ahmed, S., and Marwat, S. N. K.
(2022). Design of an advance intrusion detection system for iot networks. In 2022 2nd
International Conference on Artiﬁcial Intelligence (ICAI), pages 46–51.
[Tavallaee et al. 2009] Tavallaee, M., Bagheri, E., Lu, W., and Ghorbani, A. A. (2009). A
detailed analysis of the kdd cup 99 data set. In 2009 IEEE Symposium on Computa-
tional Intelligence for Security and Defense Applications, pages 1–6.
[T.K. et al. 2021] T.K., B., A., C. S. R., and B., A. (2021). Machine learning algorithms for
social media analysis: A survey. Computer Science Review, 40:100395.
[Ullah and Mahmoud 2020] Ullah, I. and Mahmoud, Q. H. (2020). A scheme for generating
a dataset for anomalous activity detection in iot networks. In Canadian Conference on
Artiﬁcial Intelligence, pages 508–520. Springer.
[Verma and Ranga 2020] Verma, A. and Ranga, V. (2020). Machine learning based in-
trusion detection systems for iot applications. Wireless Personal Communications,
[Vinayakumar et al. 2019] Vinayakumar, R., Alazab, M., Soman, K. P., Poornachandran, P.,
Al-Nemrat, A., and Venkatraman, S. (2019). Deep learning approach for intelligent
intrusion detection system. IEEE Access, 7:41525–41550.
[Zhao et al. 2022] Zhao, R., Mu, Y., Zou, L., and Wen, X. (2022). A hybrid intrusion detec-
tion system based on feature selection and weighted stacking classiﬁer. IEEE Access,