Advancing Network Threat Detection Through Standardized Feature Extraction and Dynamic Ensemble Learning

Jason Ford
Computer Science, Engineering and Mathematics
University of South Carolina Aiken
Aiken SC USA
jasonsf@usca.edu
Hala Strohmier Berry
Computer Science, Engineering and Mathematics
University of South Carolina Aiken
Aiken SC USA
hala.strohmier@usca.edu
Abstract—This research proposes an innovative approach to
network intrusion detection by employing an ensemble of ma-
chine learning models to identify suspicious network traffic.
Building on previous studies, this research focuses on three
key components: developing a standardized feature extraction
framework, selecting and training a set of machine learning
models, and designing a novel ensemble classification algorithm
that leverages their respective strengths.
The standardized feature extraction framework emphasizes
metadata and flow-level statistics, with the goal of reducing
bias and improving generalization across diverse network
environments. An ensemble of machine learning models (Random
Forest, Isolation Forest, Gaussian Mixture Models, Quadratic
Discriminant Analysis, AdaBoost, XGBoost, Convolutional
Neural Networks, and Recurrent Neural Networks) is evaluated
with the goal of leveraging their unique strengths in detecting
various types of network anomalies.
A novel ensemble classification algorithm, Ford Class Specific
Weighted Values (Ford-CSWV), is proposed to weight model
outputs based on class-specific validation performance, with the
goal of enhancing the ensemble’s overall detection capability.
Experimental results demonstrate that the ensemble classifier
achieved an accuracy of 97.92%, with balanced precision and
recall for both benign and malicious traffic. While gains over
individual classifiers were minimal, the ensemble provides stable
performance across different traffic types, reducing the risk of
false positives. This work provides a blueprint for developing
more robust and adaptable network intrusion detection systems.
Keywords—Network intrusion detection, machine learning,
neural networks, feature extraction, ensemble classifier, anomaly
detection, cybersecurity.
I. INTRODUCTION
Cyber attacks have evolved rapidly and grown significantly
more complex, making detection of suspicious network traffic
an even greater concern in cybersecurity
[1]. Traditional defense mechanisms such as firewalls and
reputation-based detection systems often struggle to identify
novel attack chains due to their reliance on pattern matching
and known threat signatures [2], [3]. This limitation necessi-
tates the development of a more advanced Network Intrusion
Detection System (NIDS) that can adapt to emerging threats
by analyzing network behavior patterns.
Machine learning (ML) has shown the ability to enhance
NIDS capabilities by detecting anomalies and classifying
malicious activities without prior knowledge of specific at-
tack signatures [4], [5]. However, individual classifiers face
challenges such as overfitting, high false positive rates, and
sensitivity to specific network environments [6], [7]. Previous
research shows that anomaly detection models require careful
feature extraction and tuning to avoid misclassifying normal
traffic. Similarly, binary classifiers may not generalize well
due to biases introduced by features such as IP addresses and
port numbers [4], [8].
Recent studies emphasize the critical role of feature selec-
tion in improving ML-based NIDS performance. Yang et al.
[4] highlight that the choice of features has a more significant
impact on detection accuracy than the specific ML model
employed. Features that focus on metadata, such as packet
sizes and TCP flags, have been shown to enhance the detection
of novel events while reducing bias [4], [8]–[10]. Additionally,
the curse of dimensionality underscores the need to select
relevant features to prevent model degradation [4].
Ensemble classification presents a potential solution to
address the limitations of individual classifiers by combining
their strengths and mitigating weaknesses [11]–[15]. These
methods have demonstrated improved accuracy and robust-
ness in NIDS by taking advantage of diverse classifiers and
aggregating their predictions. Work by Lavate and Srivastava
[10] achieved 99.5% accuracy using a hybrid approach that
combines Random Forest and Particle Swarm Optimization for
feature selection. Similarly, Chiba et al. [3] proposed a hybrid
NIDS integrating Suricata with an Isolation Forest algorithm,
highlighting the effectiveness of ensemble strategies.
Despite these advancements, challenges remain in stan-
dardizing feature extraction and effectively aggregating model
output in ensemble systems. Many existing studies focus on
specific protocols and the continued use of legacy datasets,
which limits the ability to generalize their approaches to more
modern and diverse network environments [1], [5], [8]. In
addition, aggregation mechanisms in ensemble models often
lack dynamic weighting schemes that can adapt to varying
model performance and network conditions.
Building upon these insights, this research proposes a new
approach to building an ensemble-based NIDS that integrates
multiple machine learning models. This collection of classi-
fiers is guided by a novel ensemble classification algorithm
designed to weight model outputs based on their class-specific
performance metrics. This approach aims to enhance detection
accuracy while reducing false positives, adapting to the com-
plexities of real-world network traffic. The key contributions
of this research include:
• Standardized Feature Extraction Framework: Development
  of a feature selection process that focuses on metadata
  and flow-level statistics, reducing bias and improving
  generalization across different network environments.
  This addresses the challenge of inconsistent detection
  accuracy due to inadequate feature selection.
• Curated Ensemble of Diverse Machine Learning Models:
  Selection and training of a collection of machine learning
  models: Isolation Forests, Gaussian Mixture Models
  (GMMs), Quadratic Discriminant Analysis (QDA), Random
  Forests, AdaBoost, XGBoost, Convolutional Neural
  Networks (CNN), and Recurrent Neural Networks (RNN)
  [1]–[11], [14]–[22], [26], [27]. This approach uses the
  unique strengths of each model to detect potentially
  malicious network traffic that might be missed by a single
  classifier.
• Ensemble Classification Algorithm: Design of an ensemble
  classifier (Ford-CSWV) that assigns weights to
  individual model outputs based on class-specific validation
  performance. This enhances the ensemble’s overall
  detection capability, improving accuracy by leveraging
  the unique strengths of each model’s performance.
The remainder of this paper is organized as follows:
• Section II: Reviews related work in machine learning,
  focusing on feature selection, model performance, and
  ensemble methods.
• Section III: Outlines the challenges in existing NIDS
  solutions and the gaps this research aims to fill.
• Section IV: Details the feature selection analysis and
  rationale, explaining the construction of the standardized
  feature set.
• Section V: Describes the methodology, including data
  sources, preparation, model training, and the development
  of the ensemble classification algorithm.
• Section VI: Presents the experimentation results,
  analyzing the performance of the individual classifiers and
  ensemble model.
• Section VII: Discusses the results and potential future
  work directions.
• Section VIII: Summarizes the contributions and findings
  of this research.
This work aims to demonstrate that a group of machine
learning models, guided by a standardized feature extraction
framework and an ensemble classification algorithm, can
provide a blueprint for a more robust and accurate network
intrusion detection solution that is suitable for real-world
deployments.
II. RELATED WORK
The development of ML-based NIDS has become a
focal point of recent cybersecurity research. Various models,
datasets, and methodologies have been explored to enhance
detection accuracy, reduce false positives, and address the
challenges of evolving network threats. This section reviews
key contributions in anomaly detection, binary classification,
feature selection, ensemble methods, and challenges identified
in deploying effective NIDS.
Sharafaldin et al. [1], creators of the CIC-IDS-2017 dataset,
identified ongoing challenges in network traffic analysis, par-
ticularly the lack of real-world traffic datasets with adequate
statistical characteristics. This gap highlights the necessity for
continued research in building datasets and developing models
that generalize across diverse network environments.
Ouiazzane et al. [2] proposed a hybrid NIDS by combining
Suricata with a Decision Tree-based anomaly detection model,
achieving 99.9% accuracy. Their research notes the risks
associated with anomaly detection such as the potential for
high false positive rates, emphasizing the need for careful
tuning and monitoring during model training.
Chiba et al. [3] presented a collaborative hybrid NIDS
combining Suricata with an Isolation Forest algorithm, of-
fering detailed insights into signature and anomaly detection
intricacies. Their work underscores the need for continuous
enhancement in hybrid NIDS models, particularly in real-time
traffic analysis.
Fuhnwi et al. [5] compared Isolation Forest and One-Class
Support Vector Machines (OCSVM) using the NSL-KDD
dataset. OCSVM outperformed Isolation Forest in F1 score,
detection rate, and false positive rate. Their work brings up
important questions about the effectiveness of models trained
on specific protocols, suggesting that such approaches may not
generalize well to real-world traffic.
Smolen and Benova [6] compared Autoencoder and Iso-
lation Forest models for network anomaly detection using
web server logs. Autoencoders showed a higher probability
of detecting zero-day attacks due to their ability to identify
missing features, while Isolation Forest required tuning of
estimators for optimal results.
Agustina et al. [7] analyzed the performance of the Random
Forest algorithm using different feature selection methods
including filter, wrapper, and embedded techniques for net-
work anomaly detection. They found that the wrapper method
yielded the highest accuracy of 91.51%, further confirming the
importance of effective feature selection.
Sarhan et al. [8] conducted a comprehensive comparison
of machine learning models, including Deep Feed Forward,
CNN, RNN, Decision Tree, Logistic Regression, and Naive
Bayes, combined with feature extraction algorithms like Prin-
cipal Component Analysis (PCA), Autoencoder, and LDA.
They found that feature extraction methods significantly im-
pact model performance and discussed potential bias intro-
duced by using source and destination IP addresses.
Kiran et al. [9] explored the detection of outliers in network
transfers by utilizing feature extraction techniques. They fo-
cused on applying PCA to reduce the dimensionality of net-
work traffic data and improve the performance of anomaly de-
tection models. Their study demonstrated that effective feature
extraction could significantly enhance the ability of machine
learning algorithms to detect anomalies in high-dimensional
network data. This work highlights the importance of feature
selection in enhancing the accuracy and efficiency of NIDS,
aligning with the emphasis on effective feature selection in
other studies.
Thockchom, Singh, and Nandi [11] introduced a novel
ensemble learning-based model combining Gaussian Naive
Bayes, Logistic Regression, Decision Tree, and Stochastic
Gradient Descent classifiers. Using datasets like KDDCup99,
UNSW-NB15, and CIC-IDS-2017, they demonstrated that en-
semble classifiers effectively handle unbalanced datasets and
improve detection rates.
Dietterich [12] and Polikar [13] provide insights into en-
semble systems, emphasizing that ensembles often outperform
single classifiers by reducing the risk of poor model selection
and improving generalization. Both advocate for classifier
diversity within ensembles to enhance performance, and their
respective works are foundational to the research conducted
as part of this effort.
Farooqi et al. [14] proposed an ensemble voting classifier
combining Decision Tree, Random Forest, and XGBoost,
tested on the NSL-KDD, UNSW-NB15, and CIC-IDS-2017
datasets. Their work emphasizes the importance of data pre-
processing through min-max normalization and recommends
an 80/20 training-validation split. They noted that ensemble
methods improved detection accuracy significantly.
Alserhani and Aljared [15] evaluated ensemble learning
mechanisms for predicting advanced cyber attacks, comparing
algorithms like Random Forest, Logistic Regression, Decision
Tree, KNN, SVM, XGBoost, CatBoost, and Artificial Neural
Networks using the UNSW-NB15 dataset. They reported ac-
curacies ranging from 88% to 94%, with Random Forest and
Logistic Regression achieving the highest accuracy.
Kiran et al. [16] investigated detecting anomalous packets
using Isolation Forest and Autoencoder models. They con-
cluded that Isolation Forest outperformed PCA and Autoen-
coders in identifying anomalies in TCP traffic, especially when
dealing with complex network behaviors.
Du et al. [17] introduced GPR-RF, a network attack de-
tection method based on Random Forest and Bayesian op-
timization. Their approach demonstrated higher accuracy in
detecting anomalous data with less time required for parameter
optimization, highlighting the potential for combining machine
learning models with optimization techniques.
Mondal et al. [18] evaluated six machine learning algo-
rithms, including Logistic Regression, KNN, Naive Bayes,
SVM, Decision Tree, and Random Forest using the NSL-KDD
dataset for DDoS attack detection. They found that Decision
Tree and Random Forest achieved high recall rates of 98%,
indicating their effectiveness in NIDS.
Li, Shi and Wu [19] explored various machine learning
techniques for network traffic anomaly detection, including
Prophet, RNN, CNN, Isolation Forest, and OmniAnomaly.
They discussed different anomaly types: point, contextual, and
collective, and stressed the importance of selecting appropriate
algorithms based on data characteristics.
Saran and Kesswani [20] conducted a comparative study of
supervised machine learning classifiers for intrusion detection.
They evaluated algorithms like KNN, SVM, Naive Bayes,
Random Forest, Decision Tree, and Stochastic Gradient De-
scent across multiple datasets, highlighting the need for diverse
datasets and models to improve detection accuracy.
Nixon, Sedky, and Hassan [21] demonstrated the use of
Autoencoder as a low-cost anomaly detection method for
network data streams. They highlighted that Autoencoder can
adapt to concept drift in data streams and emphasized the
challenges posed by the dynamic nature of network traffic.
Elsayed, Mohamed, and Madkour [22] conducted a compar-
ative study using deep learning algorithms, including DNN,
CNN, RNN, LSTM, GRU, and a hybrid CNN-LSTM using
the NSL-KDD dataset. The GRU model achieved the highest
accuracy at 99.54%, demonstrating deep learning’s potential
in NIDS.
Verma et al. [23] developed an ensemble approach us-
ing Random Forest and Gradient Boosting Machine (GBM)
algorithms, applying XGBoost for feature selection on the
CIC-AWS-2018 dataset. Their work emphasized documenting
hardware and software specifications for reproducibility, and
highlighted ensemble methods’ effectiveness in enhancing
detection accuracy.
Kuncheva’s essential work [24] discusses combining pattern
classifiers and the methods and algorithms involved,
highlighting that not all features are equally relevant and that
feature extraction improves classifier descriptions. The work
also notes the risk of overfitting when training data is reused
for testing, and emphasizes that no single classifier is best
for all problems.
Garcia et al. [25] performed an empirical analysis of bot-
net detection methods, noting the challenges associated with
suboptimal datasets and the accessibility of quality data in
netflow or PCAP format. Their work emphasizes the need for
improved security measures to safeguard network reliability
and integrity against DoS threats, and introduces the CTU-13
dataset used as part of this study.
Wang et al. [26] introduced a method for classifying mal-
ware network traffic using CNNs to automatically learn feature
representations. Their approach demonstrated that CNN-based
models show the potential for improving the accuracy of mal-
ware traffic classification compared to other machine learning
techniques. The authors are responsible for the creation of the
USTC-TFC2016 dataset used in this research.
Chen and Guestrin’s [27] introduction of the XGBoost algo-
rithm implements novel system optimizations and a sparsity-
aware learning approach. Their method efficiently handles
large-scale datasets and has become a widely adopted tool
in the machine learning community by enhancing computa-
tional speed and model performance for gradient boosting
algorithms.
Ford and Strohmier-Berry [28] investigated the use of CNNs
to enhance the detection of quick response (QR) code images
in email-based threats, highlighting the effectiveness of ma-
chine learning in identifying malicious content. This research
underscores the broad applicability of neural networks in
cybersecurity, demonstrating how deep learning models can
be leveraged to detect and mitigate various types of threats.
III. PROBLEM STATEMENT
Effectively identifying threats in dynamic network environ-
ments is a complex problem. Existing NIDS solutions rely
heavily on singular detection methods, which face significant
challenges in keeping pace with evolving cyber threats.
Traditional systems, including those that utilize both anomaly
detection and binary classification, suffer from inherent limitations.
A primary obstacle in improving NIDS performance is
the extraction of relevant features from network traffic data.
Research in this domain often lacks a standardized approach
to feature selection, leading to inconsistencies in detection
accuracy across different network environments. Additionally,
machine learning models’ effectiveness is intricately linked to
the quality and relevance of the features used during training.
Inadequate feature selection hampers model generalization and
increases computational overhead, limiting the practicality of
deploying these systems in real-world scenarios.
In light of these challenges, there is room for further
exploration of combining multiple machine learning mod-
els within an ensemble framework. Ensemble methods hold
promise for leveraging the strengths of individual models to
create a more robust detection system [3], [10]–[15]. However,
aggregating model outputs in a way that maximizes accuracy
and minimizes false positives for NIDS applications remains
an underdeveloped area of study. Specifically, a mechanism
to dynamically weight model scores and adapt to different
network traffic conditions is needed to create a more versatile
NIDS.
IV. FEATURE SELECTION ANALYSIS AND RATIONALE
This section outlines the selection process, guided by lit-
erature review findings and a focus on constructing a robust
feature set to generalize across varied network environments.
Prior research reveals that packet-level and flow-level char-
acteristics play a significant role in the accuracy of NIDS
solutions [4], [8]–[10]. By combining the findings from those
works, the following feature set was constructed to focus on
metadata and characteristics less tied to specific environments.
This decision is intended to reduce model bias and improve
the ability to detect novel threats in diverse network scenarios.
A. Packet-Level Features
• Counts of TCP Flags: FIN, SYN, RST, PSH, ACK, URG,
  ECE, and CWR
• Time To Live (TTL): The average TTL value of the
  packets.
B. Flow-Level Features
• Flow Statistics: Total Packets and Flow Duration
• Forward (FWD) Packet Statistics: Mean, Min, Max, and
  Total
• Backward (BWD) Packet Statistics: Mean, Min, Max, and
  Total
• Inter-Arrival Times (IAT):
  - FWD and BWD IAT Mean: Mean inter-arrival time
    of packets in the forward and backward directions.
  - BWD IAT Total: Total inter-arrival time of packets
    in the backward direction.
  - BWD IAT Standard Deviation: Standard deviation of
    inter-arrival times in the backward direction.
• Time-Based Metrics:
  - Packets in Last T Seconds: Number of packets
    observed in the last defined time window.
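The flow-level statistics listed above can be illustrated with a minimal Python sketch. It is not the authors' extraction library: the packet record layout, the function names, and the feature key names are hypothetical, and the standard library is used in place of scapy and numpy so the example stays self-contained. TTL is omitted for brevity, and IAT total and standard deviation are computed for both directions even though the feature set above lists them only for the backward direction.

```python
from statistics import mean, pstdev

# Hypothetical packet record: (timestamp_seconds, size_bytes, direction, tcp_flags)
# direction is "fwd" or "bwd"; tcp_flags is a set of flag names such as {"SYN", "ACK"}.
TCP_FLAGS = ("FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR")


def inter_arrival_times(timestamps):
    """Return the gaps between consecutive packet timestamps."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def flow_features(packets):
    """Summarize one flow as a dictionary of metadata-only features."""
    features = {"total_packets": len(packets)}
    times = [p[0] for p in packets]
    features["flow_duration"] = max(times) - min(times) if times else 0.0

    # Count how many packets carry each TCP flag.
    for flag in TCP_FLAGS:
        features[f"{flag.lower()}_count"] = sum(1 for p in packets if flag in p[3])

    # Per-direction packet size and inter-arrival statistics.
    for direction in ("fwd", "bwd"):
        sizes = [p[1] for p in packets if p[2] == direction]
        iats = inter_arrival_times([p[0] for p in packets if p[2] == direction])
        features[f"{direction}_pkt_mean"] = mean(sizes) if sizes else 0.0
        features[f"{direction}_pkt_min"] = min(sizes) if sizes else 0
        features[f"{direction}_pkt_max"] = max(sizes) if sizes else 0
        features[f"{direction}_pkt_total"] = sum(sizes)
        features[f"{direction}_iat_mean"] = mean(iats) if iats else 0.0
        features[f"{direction}_iat_total"] = sum(iats)
        features[f"{direction}_iat_std"] = pstdev(iats) if len(iats) > 1 else 0.0
    return features


# Illustrative five-packet flow (values are made up for the example).
example_flow = [
    (0.00, 60, "fwd", {"SYN"}),
    (0.05, 60, "bwd", {"SYN", "ACK"}),
    (0.10, 52, "fwd", {"ACK"}),
    (0.30, 500, "fwd", {"PSH", "ACK"}),
    (0.45, 1400, "bwd", {"ACK"}),
]
example_features = flow_features(example_flow)
```

Note that none of these features depend on IP addresses or port numbers, which matches the stated goal of keeping the feature set independent of any specific environment.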
V. METHODOLOGY
This section outlines the methodology used to develop
the proposed ensemble-based NIDS. It includes the selection
of machine learning classifiers, the rationale behind their
selection, and the development of the ensemble framework.
A. Model Selection
The following classifiers were selected based on their ability
to handle large datasets, suitability for binary classification
tasks, and model diversity to enhance the ensemble’s overall
performance.
1) Random Forest: An ensemble learning method that
operates by constructing a multitude of decision trees during
training and outputting the class that is the mode of the
classes or mean prediction of the individual trees. It
introduces randomness by selecting random subsets of features
and data samples, which improves its ability to generalize.
Random Forest is highly effective for binary
classification tasks, making it an ideal choice for distinguishing
between normal and malicious network traffic. It is known for
high accuracy and robustness against overfitting due to the
ensemble of multiple decision trees [1], [10], [14], [15], [17],
[18], [20].
2) Isolation Forest: An unsupervised anomaly detection
algorithm that isolates anomalies instead of profiling normal
data points. It constructs random binary trees and isolates
observations by randomly selecting a feature and a split value
between the maximum and minimum values of the selected
feature. Anomalies require fewer splits to isolate, resulting in
shorter path lengths in the tree structure. Isolation Forest excels
at detecting anomalies in large datasets and high-dimensional
spaces, making it suitable for network traffic analysis. Its low
memory requirements allow it to handle large-scale network
data efficiently [3], [5], [6], [16].
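The path-length intuition can be demonstrated with a short Python sketch. This is a simplified, single-feature illustration rather than a full Isolation Forest (which also picks a random feature at each split and averages over many trees built on subsamples); the function names and the packet-size values are hypothetical.

```python
import random


def isolation_depth(values, target, rng, depth=0, max_depth=50):
    """Count the random splits needed before `target` is alone in its partition."""
    if len(values) <= 1 or depth >= max_depth:
        return depth
    low, high = min(values), max(values)
    if low == high:
        return depth
    # Pick a split value between the minimum and maximum of the feature.
    split = rng.uniform(low, high)
    # Keep only the side of the split that contains the target.
    if target < split:
        side = [v for v in values if v < split]
    else:
        side = [v for v in values if v >= split]
    return isolation_depth(side, target, rng, depth + 1, max_depth)


def mean_isolation_depth(values, target, trials=500, seed=0):
    """Average the isolation depth over many random partitionings."""
    rng = random.Random(seed)
    return sum(isolation_depth(values, target, rng) for _ in range(trials)) / trials


# Hypothetical one-dimensional feature: packet sizes clustered near 500 bytes,
# plus a single unusually large value.
normal_sizes = [480 + i for i in range(40)]
outlier_size = 9000
data = normal_sizes + [outlier_size]

outlier_depth = mean_isolation_depth(data, outlier_size)
typical_depth = mean_isolation_depth(data, 500)
```

In this setup the outlier is usually separated by the first split, while a typical value needs several more splits, which is the property Isolation Forest turns into an anomaly score.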
3) Gaussian Mixture Models (GMMs): Probabilistic models
that assume all data points are generated from a mixture
of a finite number of Gaussian distributions with unknown
parameters. GMMs can model the underlying distribution of
network traffic data, identifying anomalies as data points that
have a low probability under the model. In this research, the
GMM is intended to contribute unsupervised anomaly detection
by modeling the probability density of normal network traffic.
GMMs are less frequently explored in NIDS research, adding
the potential for unique findings in this field of study [4].
4) Quadratic Discriminant Analysis (QDA): A classifica-
tion algorithm that models the conditional probability den-
sity functions as multivariate normal distributions with class-
specific means and covariance matrices. It effectively handles
binary classification tasks, which is essential for distinguishing
between normal and malicious network traffic. By modeling
quadratic boundaries, QDA can capture complex relationships
between features and classes. Like GMMs, this model is less
frequently explored in NIDS research, adding model diversity
and the potential for unique findings in this field of study [1].
5) Adaptive Boosting (AdaBoost): An ensemble learning
algorithm that combines multiple weak classifiers to form a
strong classifier, improving binary classification accuracy.
It also introduces a different ensemble mechanism (boosting)
compared to bagging methods like Random Forest. Utilizing
AdaBoost can potentially improve the model’s ability to detect
subtle and complex attack patterns in network traffic [1], [10].
6) Extreme Gradient Boosting (XGBoost): A scalable and
efficient implementation of gradient boosting machines, in-
troduced by Chen and Guestrin [27]. It employs gradient
boosting algorithms based on decision trees and includes
system optimizations and algorithmic enhancements to im-
prove speed and performance. XGBoost is highly effective for
binary classification tasks, making it suitable for distinguishing
between normal and malicious network traffic. This classifier
is another example of a model that has not been extensively
explored in NIDS research, offering potential for novel insights
and contributions to the field [14], [15].
7) Convolutional Neural Networks (CNN): A class of deep
learning models that employ convolutional layers to automat-
ically and adaptively learn spatial hierarchies of features from
input data. In the context of NIDS, CNNs can be applied
to capture spatial features of network traffic by treating the
traffic data as multidimensional inputs. Elsayed et al. [22]
reviewed CNNs in their comparative study and showed that
deep learning models are effective for NIDS. Including this
model in the ensemble adds diversity by incorporating the
ability to capture spatial dependencies in the data. Prior studies
have demonstrated that CNNs can achieve high accuracy
across a variety of cybersecurity use cases [8], [19], [22], [26],
[28].
8) Recurrent Neural Networks (RNN): Designed to recognize
patterns in sequences of data by utilizing temporal
dependencies. RNNs maintain an internal state that captures
information about previous inputs, making them well-suited
for capturing temporal patterns in sequential data, which is
valuable for analyzing network traffic over time. They have
demonstrated the ability to model complex temporal
relationships, which may aid in the detection of advanced persistent
threats and other novel attack chains that exhibit temporal
behaviors. Including RNNs in the ensemble introduces a focus
on temporal features that complements models that focus on
spatial or statistical features [8], [19], [22].
B. Ensemble Framework
The selected models are integrated into an ensemble framework
to leverage their complementary strengths. The proposed
ensemble algorithm, Ford Class Specific Weighted Values
(Ford-CSWV), provides an ensemble score $E$ by combining
the predictions from multiple classifiers. Each classifier’s
prediction is weighted based on two factors:
• The classifier’s validation accuracy for benign (0) and
  malicious (1), ensuring that classifiers with better
  performance contribute more to the final decision.
• A dynamic weight based on the uncertainty of the
  classifier’s prediction, giving higher importance to predictions
  that are less confident.
The class-specific weights $B_i$ (benign) and $M_i$ (malicious)
are derived from the validation accuracy of classifier $i$. These
weights ensure that the classifier’s influence is proportional
to its historical performance for each class. The algorithm
applies a dynamic weighting function based on the predicted
probability $p_i$ for each classifier. The dynamic weight $w_i$
assigns higher importance to predictions closer to 0.5,
emphasizing less confident predictions. This approach aims to
mitigate the impact of overconfident but incorrect predictions
by giving more consideration to areas of uncertainty. The
dynamic weighting function $w_i$ is defined as:
$$w_i = 1 - |0.5 - p_i|$$
This weighting function has the following properties:
• When $p_i = 0.5$ (maximum uncertainty), $w_i = 1$ (maximum
  weight).
• When $p_i = 0$ or $p_i = 1$ (maximum confidence), $w_i = 0.5$
  (minimum weight).
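As a brief worked illustration of the weighting function $w_i = 1 - |0.5 - p_i|$, the following evaluates it at three predicted probabilities. The probabilities are chosen for illustration only and do not come from the experiments.

```latex
\begin{align*}
p_i = 0.55 &\;\Rightarrow\; w_i = 1 - |0.5 - 0.55| = 0.95 \\
p_i = 0.90 &\;\Rightarrow\; w_i = 1 - |0.5 - 0.90| = 0.60 \\
p_i = 0.10 &\;\Rightarrow\; w_i = 1 - |0.5 - 0.10| = 0.60
\end{align*}
```

The weight is symmetric around 0.5, so equally confident predictions in either direction receive the same weight.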
The weighted benign score $B$ is given by:
$$B = \sum_{i=1}^{n} w_i \cdot p_i \cdot B_i$$
Similarly, the weighted malicious score $M$ is given by:
$$M = \sum_{i=1}^{n} w_i \cdot p_i \cdot M_i$$
where $n$ is the number of classifiers, $p_i$ is the predicted
probability for classifier $i$, and $B_i$ and $M_i$ are the benign and
malicious weights for classifier $i$, respectively. The ensemble
score $E$ is the average of the benign and malicious scores:
$$E = \frac{B + M}{2}$$
Figure 1 illustrates the ensemble classification process using
the Ford-CSWV algorithm. The classification decision is made
by comparing $E$ to a threshold of 0.5. If $E \geq 0.5$, the sample
is labeled as malicious. Otherwise, the sample is labeled as
benign.
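The scoring steps described in this subsection can be sketched in Python as follows. This is a literal reading of the formulas under stated assumptions, not the authors' implementation: the function names are hypothetical, the example probabilities and class-specific weights are invented, and the threshold comparison is assumed to be greater-than-or-equal. The text does not state whether the weights or sums are normalized, so the sketch applies the sums as written and the resulting score is not bounded by 1.

```python
def dynamic_weight(p):
    """w_i = 1 - |0.5 - p_i|: 1.0 at p = 0.5, falling to 0.5 at p = 0 or p = 1."""
    return 1.0 - abs(0.5 - p)


def ford_cswv_score(probabilities, benign_weights, malicious_weights):
    """Return (B, M, E) computed from the weighted sums given in the text."""
    benign = sum(
        dynamic_weight(p) * p * b for p, b in zip(probabilities, benign_weights)
    )
    malicious = sum(
        dynamic_weight(p) * p * m for p, m in zip(probabilities, malicious_weights)
    )
    return benign, malicious, (benign + malicious) / 2


def classify(ensemble_score, threshold=0.5):
    """Label a sample; the >= comparison is an assumption of this sketch."""
    return "malicious" if ensemble_score >= threshold else "benign"


# Hypothetical class-specific validation accuracies for three classifiers.
benign_weights = [0.98, 0.95, 0.90]
malicious_weights = [0.97, 0.96, 0.92]

# Hypothetical predicted probabilities for two samples.
_, _, high_score = ford_cswv_score([0.9, 0.8, 0.6], benign_weights, malicious_weights)
_, _, low_score = ford_cswv_score([0.1, 0.2, 0.05], benign_weights, malicious_weights)

high_label = classify(high_score)
low_label = classify(low_score)
```

With these example inputs the first sample scores well above the threshold and the second well below it, so they are labeled malicious and benign respectively.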
C. Data Sources and Preparation
To develop and evaluate the proposed ensemble-based
NIDS, several datasets were utilized to provide a variety of
network traffic scenarios. Datasets available as PCAP files
were selected to ensure full access to the features outlined
in Section IV, as many of the legacy datasets used in prior
work lack the ability to effectively capture these features.
The combination of the following datasets aims to enhance
the robustness and generalizability of the models by exposing
them to a diverse assortment of network behaviors, traffic types,
and attack patterns.
1) CTU-13 Dataset: A public dataset provided by the
Czech Technical University in Prague, capturing a variety of
malware-related traffic scenarios [25]. It includes real botnet
traffic as well as benign traffic and background noise, offering
valuable data for anomaly detection.
2) TON_IoT Dataset: Developed by the University of New
South Wales (UNSW), this dataset includes a collection of
telemetry datasets for Internet of Things (IoT) and
Industrial IoT (IIoT) environments [29]. It provides labeled PCAP
files for various types of network attacks such as DDoS,
ransomware, backdoors, and data exfiltration attacks.
3) USTC-TFC2016 Dataset: Provided by the University
of Science and Technology of China (USTC), this dataset
contains real network traffic samples including both normal
and abnormal flows [26]. It encompasses a variety of attack
types such as botnet, DDoS, port scanning, and infiltration
attacks, with corresponding labeled PCAP files.
4) Custom-Collected Benign Traffic: To balance the train-
ing and validation datasets with normal traffic, packet captures
were collected from home and academic networks. This data
aims to provide recent and diverse benign network behaviors.
The custom-collected data was destroyed following comple-
tion of the experiment at the request of the individuals and
organizations who volunteered their network data for partici-
pation in this study.
VI. EXPERIMENTATION
This section outlines the experimentation process conducted
to evaluate the proposed ensemble-based NIDS. The exper-
imentation involved data preparation, model training, and
performance evaluation using the datasets described in the
previous section.
A. Experimental Setup
CPU: Intel Core i7-7700K @ 4.20GHz
Memory: 32 GB DDR4 2400MHz
Local Disk: Samsung SSD 970 EVO Plus 1TB
Operating System: Ubuntu 22.04.3 LTS
B. Dataset Preparation
To prepare the data for training and evaluation, the labeled
PCAP files were divided into training and testing sets using
an 80/20 ratio.
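A minimal sketch of such a split is shown below. It assumes the split is made at the file level and separately for each label so that both sets keep the same class mix; the paper does not state either detail, and the function names, seed, and file names are hypothetical.

```python
import random


def split_files(paths, train_fraction=0.8, seed=42):
    """Shuffle a list of file paths reproducibly and cut it 80/20."""
    shuffled = sorted(paths)               # fixed starting order
    random.Random(seed).shuffle(shuffled)  # seeded, repeatable shuffle
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def split_by_label(files_by_label, train_fraction=0.8, seed=42):
    """Apply the split per label and return (path, label) pairs."""
    train, test = [], []
    for label, paths in sorted(files_by_label.items()):
        train_paths, test_paths = split_files(paths, train_fraction, seed)
        train += [(path, label) for path in train_paths]
        test += [(path, label) for path in test_paths]
    return train, test


# Hypothetical file names; 0 = benign, 1 = malicious.
files = {
    0: [f"benign_{i}.pcap" for i in range(10)],
    1: [f"malicious_{i}.pcap" for i in range(5)],
}
train_set, test_set = split_by_label(files)
print(len(train_set), len(test_set))
```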
Fig. 1. Flow of the ensemble classification process using the Ford-CSWV
algorithm
C. Feature Extraction Process
Feature extraction was implemented in Python with the
following key components:
• Libraries Used: scapy for packet parsing, numpy for
numerical computations, and collections for data
structures like defaultdict and deque.
• Flow Features Dictionary: A dictionary was used to store
flow features, allowing for dynamic creation of entries for
new flows.
• Time Window for Time-Based Features: A constant time
window of 10 seconds was defined to calculate time-based
metrics.
• Fixed Feature Length: The feature vector was standardized
to a fixed length of 1000 elements to maintain
uniformity across all samples.
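The components listed above can be sketched as follows. To stay runnable without third-party packages, the sketch works on already-parsed packet tuples instead of calling scapy and uses plain lists in place of NumPy arrays. The flow key, the three per-flow statistics, and the names extract_features and fit_length are illustrative assumptions; only the 10-second window and the 1000-element vector length come from the text.

```python
from collections import defaultdict, deque

TIME_WINDOW = 10.0      # seconds, as stated in the text
FEATURE_LENGTH = 1000   # fixed vector length, as stated in the text


def new_flow():
    """Initial statistics for a flow seen for the first time."""
    return {"packets": 0, "bytes": 0, "recent": deque(), "peak_count": 0}


def fit_length(values, length=FEATURE_LENGTH):
    """Zero-pad or truncate so every sample has the same length."""
    return (list(values) + [0.0] * length)[:length]


def extract_features(packets):
    """Build one fixed-length vector from parsed packets.

    packets: iterable of (timestamp, src, dst, sport, dport, proto, size),
    assumed to be sorted by timestamp.
    """
    flows = defaultdict(new_flow)  # entries are created on first access
    for ts, src, dst, sport, dport, proto, size in packets:
        flow = flows[(src, dst, sport, dport, proto)]
        flow["packets"] += 1
        flow["bytes"] += size
        recent = flow["recent"]
        recent.append(ts)
        # Drop timestamps that fall outside the sliding time window.
        while ts - recent[0] > TIME_WINDOW:
            recent.popleft()
        # Largest number of packets observed inside any one window.
        flow["peak_count"] = max(flow["peak_count"], len(recent))

    values = []
    for key in sorted(flows):  # sorted for a deterministic feature order
        flow = flows[key]
        values += [float(flow["packets"]), float(flow["bytes"]),
                   float(flow["peak_count"])]
    return fit_length(values)


sample = [
    (0.0, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 100),
    (1.0, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 50),
    (20.0, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 10),
]
vector = extract_features(sample)
print(len(vector), vector[:3])
```

Padding and truncating to a fixed length keeps every sample compatible with models that expect a constant input size, at the cost of discarding flows beyond the cutoff in large captures.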
D. Model Training
Python scripts were developed to train and test the pre-
diction accuracy of each model. The training process for the
Random Forest model is described below, with similar scripts
developed for training the other models.
• Feature Extraction: Features were extracted from the
PCAP files using the feature extraction library created
for the project, ensuring that each feature vector matched
the expected fixed length.
• Batch Processing: To handle large datasets efficiently,
PCAP files were processed in batches using Python’s
concurrent.futures module for parallel processing.
• Label Assignment: Labels were assigned to the feature
vectors, with ’0’ representing benign traffic and ’1’
representing malicious traffic.
• Data Aggregation: All feature vectors and labels were
aggregated into NumPy arrays for training.
• Data Scaling: Features were scaled using
StandardScaler to normalize the data.
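The data preparation steps above can be sketched as follows. The project's extractor is replaced by a hypothetical stub (extract_stub) because the library is not reproduced here. ThreadPoolExecutor is used for portability; ProcessPoolExecutor from the same module would be the likelier choice for CPU-bound parsing. The standardize helper re-implements by hand what scikit-learn's StandardScaler computes by default (zero mean and unit population variance per feature, with constant features left unscaled), using plain lists in place of NumPy arrays. Model fitting is omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean, pstdev


def extract_stub(path):
    """Hypothetical stand-in for the project's feature extractor."""
    return [float(len(path)), float(path.count("/"))]


def build_dataset(labeled_paths, batch_size=2, workers=2):
    """Extract features batch by batch and aggregate vectors and labels."""
    features, labels = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(labeled_paths), batch_size):
            batch = labeled_paths[start:start + batch_size]
            # map() preserves input order, so vectors line up with labels.
            vectors = pool.map(extract_stub, [path for path, _ in batch])
            for vector, (_, label) in zip(vectors, batch):
                features.append(vector)
                labels.append(label)  # 0 = benign, 1 = malicious
    return features, labels


def standardize(rows):
    """Scale each column to zero mean and unit population variance."""
    columns = list(zip(*rows))
    means = [fmean(column) for column in columns]
    # A constant column has zero deviation; divide by 1.0 instead.
    scales = [pstdev(column) or 1.0 for column in columns]
    return [[(value - mean) / scale
             for value, mean, scale in zip(row, means, scales)]
            for row in rows]


labeled = [("benign/a.pcap", 0), ("malicious/bb.pcap", 1),
           ("benign/ccc.pcap", 0)]
X, y = build_dataset(labeled)
X_scaled = standardize(X)
print(y, X_scaled[0])
```

In practice the scaling statistics would be computed on the training set only and then reused for the test set, so that no information from the test data leaks into training.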
VII. RESULTS AND FUTURE WORK
The performance of each model was evaluated using the
test dataset. The evaluation metrics include precision for
both Benign and Malicious classes, as well as the overall
accuracy for each model. Several of the supervised classification
models demonstrated high accuracy and precision in detecting
malicious network traffic. AdaBoost achieved the highest accuracy,
closely followed by Ford-CSWV, QDA, Random Forest, and
XGBoost. Model precision testing results are shown in Figure
2, and full evaluation metrics for each model are detailed in
Appendix 2 (Model Evaluation Results).
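The figures reported in Appendix 2 follow from the standard definitions of precision, recall, and F1. The sketch below reproduces two of the reported rows from confusion counts. The counts are not stated in the paper; they are small integer values inferred here because they are consistent with the reported numbers (an assumption). The convention of returning 1.0 for an undefined precision is also assumed, because it matches the Benign rows of Tables III and VIII.

```python
def class_report(tp, fp, fn, zero_division=1.0):
    """Precision, recall, and F1 for one class from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else zero_division
    recall = tp / (tp + fn) if (tp + fn) else zero_division
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Assumed counts matching Table I (Random Forest): 20 benign and
# 20 malicious samples, with one malicious sample labeled benign.
rf_benign = class_report(tp=20, fp=1, fn=0)
rf_malicious = class_report(tp=19, fp=0, fn=1)
rf_accuracy = (20 + 19) / 40

# Assumed counts matching Table III (GMM): 25 benign and 26 malicious
# samples, with every sample labeled malicious.
gmm_benign = class_report(tp=0, fp=0, fn=25)
gmm_malicious = class_report(tp=26, fp=25, fn=0)
gmm_accuracy = 26 / 51

for row in (rf_benign, rf_malicious, gmm_benign, gmm_malicious):
    print([round(value, 6) for value in row])
```

The second case shows why a precision of 1.0 paired with a recall of 0.0 should be read with care: no sample was predicted as benign, so the precision is a convention rather than a measurement.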
A. Evaluation of Individual Classifiers
Random Forest: Achieved an accuracy of 97.50%, demon-
strating high effectiveness in distinguishing between normal
and malicious network traffic. It attained a precision of 95.24%
for the Benign class and 100.00% for the Malicious class, indi-
cating excellent performance in correctly identifying malicious
activities while maintaining a low false-positive rate for benign
traffic.
Isolation Forest: Achieved an accuracy of 85.00%, with
precision of 79.17% for the Benign class and 93.75% for
the Malicious class. This model performed reasonably well
in detecting malicious traffic, albeit with lower accuracy com-
pared to the supervised classification models. This is expected
given its unsupervised nature and reliance on modeling normal
behavior.
Gaussian Mixture Models (GMM): This model performed
poorly, with an accuracy of 50.98%. It achieved a precision
of 100.00% for the Benign class, but only 50.98% for the
Malicious class. The low accuracy suggests that GMM may
not be suitable for this type of network intrusion detection task
in real-world scenarios. This could be due to the complexity
and high dimensionality of network traffic, or because GMM
density assumptions do not align with the malicious traffic
patterns present in the training data. It is also possible that
aspects of the benign and malicious traffic samples share
similar patterns, causing the GMM to place them in the
same cluster. This would explain the recall values in Table III:
the model labeled every malicious sample correctly (recall of
1.00) but labeled no sample as benign (Benign recall of 0.00), so
the 100.00% Benign precision reflects the absence of benign
predictions rather than correct benign classifications.
Quadratic Discriminant Analysis (QDA): Performed ex-
ceptionally well with an accuracy of 97.50%. It achieved a
precision of 95.24% for the Benign class and 100.00% for
the Malicious class, demonstrating the ability to effectively
distinguish between benign and malicious traffic.
AdaBoost: Achieved the highest accuracy among all models,
with an accuracy of 98.08%. The precision was 96.30%
for the Benign class and 100.00% for the Malicious class.
This suggests that AdaBoost is highly effective in enhancing
detection performance by combining weak learners to form a
strong classifier.
XGBoost: Like Random Forest and QDA, the XGBoost
model achieved an accuracy of 97.50%, with a precision of
95.24% for the Benign class and 100.00% for the Malicious
class. XGBoost’s performance demonstrates its capability in
handling large datasets and modeling complex patterns in
network traffic data.
CNN: The CNN model achieved an accuracy of 83.33%,
with a precision of 100.00% for the Benign class and 75.00%
for the Malicious class. While previous studies have shown
that CNNs can achieve high accuracy in network intrusion
detection, this model labeled a third of the benign samples as
malicious (Benign recall of 66.67% in Table VII), which lowered
its Malicious precision. The discrepancy in performance suggests that
additional optimization and experimentation are necessary to
enhance the CNN model’s effectiveness in this context.
RNN: Despite prior research demonstrating the potential
of RNNs in NIDS, this model performed poorly with an
accuracy of 50.00%. It achieved a precision of 100.00%
for the Benign class, but only 50.00% for the Malicious
class. The recall values in Table VIII (0.00 for Benign, 1.00 for
Malicious) show that the model labeled every sample as malicious.
This highlights the challenges of utilizing deep learning
models in the context of NIDS without careful tuning. Further
investigation is needed to determine whether adjustments to
the model training process could improve performance.
B. Evaluation of Ensemble Classifier
The proposed Ford-CSWV ensemble classifier offers balanced
precision for both benign (96.00%) and malicious
(100.00%) traffic, with a slight increase in accuracy (97.92%)
compared to Random Forest, QDA, and XGBoost (97.50% each),
though it remains marginally below AdaBoost (98.08%). This suggests
that while the ensemble is not dramatically outperforming
the individual models, it does provide a stable result across
different traffic types. This is a key characteristic for any classifier
that would be considered for use in a production network environment.
C. Future Work
This research reveals opportunities for further investigation
and exploration in several key areas that span the full scope
of this work.
• The poor performance of the GMM and RNN classifiers
in validation testing means they contribute little
to the ensemble’s predictions. Future research could
investigate retraining these models using different
hyperparameters, particularly in the case of the RNN classifier,
to align it more closely with those used in studies where
it performed more suitably for this type of task. Likewise,
the lower precision and recall for Isolation Forest
warrant consideration of whether it could be replaced in
future ensemble configurations to improve performance.
• The proposed feature extraction framework provides a
foundation for experimentation with additional datasets.
While the results of validation testing are promising,
production network traffic will ultimately reveal the
feasibility of utilizing this method with live data.
• While the balanced performance of the ensemble classifier
across traffic types is encouraging, it is important
to consider that its effectiveness in real-world scenarios
would depend on continuous re-training of individual
classifiers with new data to preserve performance and
adaptability to new and emerging threats. Future work
could explore development of a re-training mechanism
to provide an ongoing source of new traffic samples, as
well as a feedback loop to reclassify samples that were
mislabeled by the ensemble.
• Future endeavors could leverage this work to develop a
network sensor that identifies suspicious network traffic,
sending key details to a Security Information and Event
Management (SIEM) platform for further analysis. The
possibility of pairing the ensemble classifier with
signature-based detection like Suricata to form a hybrid detection
solution presents opportunities for exploring a NIDS platform
that works on low-cost hardware like the Raspberry
Pi as well as commodity virtualization solutions.
Fig. 2. Model Precision Testing Results
VIII. CONCLUSION
This research focused on three primary contributions: a
standardized feature extraction framework that emphasizes
metadata and flow-level statistics to reduce bias and enhance
generalization; training a diverse set of machine learning
models to detect benign and malicious network traffic; and
designing the Ford-CSWV ensemble classifier to dynamically
weight model outputs based on class-specific performance.
The experimental results demonstrate that the proposed ensem-
ble approach achieves high detection accuracy with balanced
precision and recall. While the gains over the best-performing
individual models were modest, the ensemble’s consistent
performance highlights its potential for real-world deployment.
As cyber threats become increasingly sophisticated and
pervasive, effective NIDS solutions will be key to
implementing robust defenses. With
ample opportunity to continue innovating and building upon
this work, the cybersecurity community is well positioned to
develop more resilient defenses against ever-evolving threats.
REFERENCES
[1] Sharafaldin, I., Lashkari, A. H., & Ghorbani, A. A. (2018). Toward
Generating a New Intrusion Detection Dataset and Intrusion Traffic
Characterization. In Proceedings of the 4th International Conference
on Information Systems Security and Privacy (pp. 108-116).
https://doi.org/10.5220/0006639801080116
[2] Ouiazzane, S., Addou, M., & Barramou, F. (2022). A Suricata
and Machine Learning Based Hybrid Network Intrusion Detection
System. In Lecture Notes in Networks and Systems (pp. 474-485).
https://doi.org/10.1007/978-3-030-91738-8_43
[3] Chiba, Z., Abghour, N., Moussaid, K., Omri, A. E., & Rida, M.
(2019). Newest Collaborative and Hybrid Network Intrusion Detection
Framework Based on Suricata and Isolation Forest Algorithm. In Pro-
ceedings of the 4th International Conference on Smart City Applications.
https://doi.org/10.1145/3368756.3369061
[4] Yang, K., Kpotufe, S., & Feamster, N. (2020). Feature Extraction for
Novelty Detection in Network Traffic. arXiv preprint arXiv:2006.16993.
https://arxiv.org/abs/2006.16993
[5] Fuhnwi, G. S., Adedoyin, V., & Agbaje, J. O. (2023). An Em-
pirical Internet Protocol Network Intrusion Detection Using Isola-
tion Forest and One-Class Support Vector Machines. International
Journal of Advanced Computer Science and Applications, 14(8).
https://doi.org/10.14569/IJACSA.2023.0140801
[6] Smolen, T., & Benova, L. (2023). Comparing Autoencoder and Isolation
Forest in Network Anomaly Detection. In 2023 46th International
Conference on Telecommunications and Signal Processing (TSP) (pp.
375-378). IEEE. https://doi.org/10.1109/TSP56665.2023.10143005
[7] Agustina, T., Masrizal, & Irmayanti. (2023). Performance
Analysis of Random Forest Algorithm for Network
Anomaly Detection Using Feature Selection. SinkrOn, 8(2).
https://doi.org/10.33395/sinkron.v8i2.13625
[8] Sarhan, M., Layeghy, S., Moustafa, N., Gallagher, M., & Portmann,
M. (2022). Feature Extraction for Machine Learning-Based Intrusion
Detection in IoT Networks. Digital Communications and Networks.
https://doi.org/10.1016/j.dcan.2022.08.012
[9] Kiran, M., Rao, B. P., Wang, L., & Mandal, A. (2018). Detecting Out-
liers in Network Transfers with Feature Extraction. Lawrence Berkeley
National Laboratory. https://doi.org/10.2172/1468101
[10] Lavate, S. H., & Srivastava, P. K. (2023). A Hybrid Feature
Selection Approach Based on Random Forest and Particle
Swarm Optimization for IoT Network Traffic Analysis. International
Journal of Electrical and Electronics Research, 11(2), 568-574.
https://doi.org/10.37391/ijeer.110244
[11] Thockchom, N., Singh, M. M., & Nandi, U. (2021). A Novel Ensemble
Learning-Based Model for Network Intrusion Detection. In Proceedings
of the International Conference on Computing and Communication Systems
(pp. 123-134). Springer. https://doi.org/10.1007/978-981-16-2758-712
[12] Dietterich, T. G. (2000). Ensemble Methods in Machine
Learning. In Multiple Classifier Systems (pp. 1-15). Springer.
https://doi.org/10.1007/3-540-45014-9_1
[13] Polikar, R. (2006). Ensemble Based Systems in Decision
Making. IEEE Circuits and Systems Magazine, 6(3), 21-45.
https://doi.org/10.1109/MCAS.2006.1688199
[14] Farooqi, M., Akhtar, F., Rahman, Z., Sadiq, A. S., & Abbass, H.
(2023). Enhancing Network Intrusion Detection Using an Ensem-
ble Voting Classifier for Internet of Things. Sensors, 24(1), 127.
https://doi.org/10.3390/s24010127
[15] Alserhani, F. M., & Aljared, A. M. (2022). Evaluating Ensemble Learn-
ing Mechanisms for Predicting Advanced Cyber Attacks. IEEE Access,
10, 82319-82331. https://doi.org/10.1109/ACCESS.2022.3198238
[16] Kiran, M., Wang, L., Papadimitriou, G., & Mandal, A. (2021). Detecting
Anomalous Packets in Network Transfers: Investigations Using PCA,
Autoencoder, and Isolation Forest in TCP. Machine Learning, 110,
1321-1340. https://doi.org/10.1007/s10994-020-05870-y
[17] Du, X., Lin, L., Han, Z., Zhang, C., & Du, Y. (2023).
GPR-RF: A Network Attack Traffic Detection Method Based
on Random Forest and Bayesian Optimization. Research Square.
https://doi.org/10.21203/rs.3.rs-3360166/v1
[18] Mondal, A., Koner, R., Chakraborty, S., & Gupta, S. (2022). Detection
and Investigation of DDoS Attacks in Network Traffic Using Machine
Learning Algorithms. International Journal of Computer Applications,
184(24), 30-34. https://doi.org/10.5120/ijca2022922028
[19] Li, X., Shi, G., & Wu, Y. (2023). Utilizing Machine Learning
Techniques for Network Traffic Anomaly Detection. arXiv preprint
arXiv:2309.14560. https://doi.org/10.48550/arXiv.2309.14560
[20] Saran, V., & Kesswani, N. (2023). A Comparative Study
of Supervised Machine Learning Classifiers for Intrusion
Detection in Internet. Procedia Computer Science, 201, 386-393.
https://doi.org/10.1016/j.procs.2022.12.225
[21] Nixon, S., Sedky, M., & Hassan, W. (2020). Autoencoders: A
Low Cost Anomaly Detection Method for Computer Network Data
Streams. International Journal of Computer Applications, 175(24), 1-8.
https://doi.org/10.5120/ijca2020920931
[22] Elsayed, M. S., Mohamed, R. A., & Madkour, M. A. (2020). A Compar-
ative Study of Using Deep Learning Algorithms in Network Intrusion
Detection. International Journal of Advanced Computer Science and
Applications, 11(7). https://doi.org/10.14569/IJACSA.2020.0110755
[23] Verma, A. K., Dumka, A., Singh, V. P., Ashok, A., Gehlot, A., Malik,
A., Gaba, G. S., & Hedabou, M. (2021). A Novel Intrusion Detection
Approach Using Machine Learning Ensemble for IoT Environments.
Applied Sciences, 11(21), 10268. https://doi.org/10.3390/app112110268
[24] Kuncheva, L. I. (2004). Combining Pattern Classifiers: Methods and
Algorithms. Wiley-Interscience. ISBN: 9780471210788
[25] Garcia, S., Grill, M., Stiborek, J., & Zunino, A. (2014). An empirical
comparison of botnet detection methods. Computers & Security, 45,
100-123. https://doi.org/10.1016/j.cose.2014.05.011
[26] Wang, W., Zhu, M., Zeng, X., Ye, X., & Sheng, Y. (2017).
Malware Traffic Classification Using Convolutional Neural Network
for Representation Learning. In 2017 International Conference
on Information Networking (ICOIN) (pp. 712-717). IEEE.
https://doi.org/10.1109/ICOIN.2017.7899569
[27] Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting
System. In Proceedings of the 22nd ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining (pp. 785-794). ACM.
https://doi.org/10.1145/2939672.2939785
[28] Ford, J., & Strohmier-Berry, H. (2024). Feasibility of Machine
Learning-Enhanced Detection for QR Code Images in Email-based
Threats. In 4th Cyber Awareness and Research Symposium. IEEE.
https://doi.org/10.1109/CARS61786.2024.10778732
[29] Moustafa, N., & Slay, J. (2019). The TON_IoT Datasets: A New
Generation of IoT and IIoT Intrusion Detection Datasets. Available at
https://research.unsw.edu.au/projects/toniot-datasets
IX. ACKNOWLEDGEMENTS
The authors would like to thank Selena Larson, Ken Ray,
Tyler Johnson and Ashley Ford. Their feedback and support
are greatly appreciated.
X. APPENDIX 1 - ADDITIONAL RESOURCES
The feature extraction framework developed as part
of this research is available under a GPLv3 license at
https://github.com/jasonsford.
XI. APPENDIX 2 - MODEL EVALUATION RESULTS
TABLE I
MODEL EVALUATION REPORT - RANDOM FOREST
              Precision  Recall    F1-Score
Benign        0.952381   1.000000  0.975610
Malicious     1.000000   0.950000  0.974359
Accuracy      0.975000   0.975000  0.975000
Macro Avg     0.976190   0.975000  0.974984
Weighted Avg  0.976190   0.975000  0.974984
TABLE II
MODEL EVALUATION REPORT - ISOLATION FOREST
              Precision  Recall    F1-Score
Benign        0.791667   0.950000  0.863636
Malicious     0.937500   0.750000  0.833333
Accuracy      0.850000   0.850000  0.850000
Macro Avg     0.864583   0.850000  0.848485
Weighted Avg  0.864583   0.850000  0.848485
TABLE III
MODEL EVALUATION REPORT - GMM
              Precision  Recall    F1-Score
Benign        1.000000   0.000000  0.000000
Malicious     0.509804   1.000000  0.675325
Accuracy      0.509804   0.509804  0.509804
Macro Avg     0.754902   0.500000  0.337662
Weighted Avg  0.750096   0.509804  0.344283
TABLE IV
MODEL EVALUATION REPORT - QDA
              Precision  Recall    F1-Score
Benign        0.952381   1.000000  0.975610
Malicious     1.000000   0.950000  0.974359
Accuracy      0.975000   0.975000  0.975000
Macro Avg     0.976190   0.975000  0.974984
Weighted Avg  0.976190   0.975000  0.974984
TABLE V
MODEL EVALUATION REPORT - ADABOOST
              Precision  Recall    F1-Score
Benign        0.962963   1.000000  0.981132
Malicious     1.000000   0.961538  0.980392
Accuracy      0.980769   0.980769  0.980769
Macro Avg     0.981481   0.980769  0.980762
Weighted Avg  0.981481   0.980769  0.980762
TABLE VI
MODEL EVALUATION REPORT - XGBOOST
              Precision  Recall    F1-Score
Benign        0.952381   1.000000  0.975610
Malicious     1.000000   0.950000  0.974359
Accuracy      0.975000   0.975000  0.975000
Macro Avg     0.976190   0.975000  0.974984
Weighted Avg  0.976190   0.975000  0.974984
TABLE VII
MODEL EVALUATION REPORT - CNN
              Precision  Recall    F1-Score
Benign        1.000000   0.666667  0.800000
Malicious     0.750000   1.000000  0.857143
Accuracy      0.833333   0.833333  0.833333
Macro Avg     0.875000   0.833333  0.828571
Weighted Avg  0.875000   0.833333  0.828571
TABLE VIII
MODEL EVALUATION REPORT - RNN
              Precision  Recall    F1-Score
Benign        1.000000   0.000000  0.000000
Malicious     0.500000   1.000000  0.666667
Accuracy      0.500000   0.500000  0.500000
Macro Avg     0.750000   0.500000  0.333333
Weighted Avg  0.750000   0.500000  0.333333
TABLE IX
MODEL EVALUATION REPORT - FORD-CSWV
              Precision  Recall    F1-Score
Benign        0.960000   1.000000  0.979592
Malicious     1.000000   0.958333  0.978723
Accuracy      0.979167   0.979167  0.979167
Macro Avg     0.980000   0.979167  0.979158
Weighted Avg  0.980000   0.979167  0.979158
... The ensemble, which included models like Random Forest, AdaBoost, and convolutional neural networks, achieved a high overall accuracy of 97.92% with balanced precision and recall across both benign and malicious traffic. While performance gains over the best individual models were modest, the ensemble offered more stable and generalised results across varied traffic types (Ford and Berry, 2025). Another application was credit card fraud detection, where logistic regression, random forest, and AdaBoost were combined to improve classification accuracy and reduce false positives. ...
Conference Paper
Full-text available
As QR codes become increasingly common in digital communication, cybercriminals have seized upon this technology as a vehicle for sophisticated URL-based email phishing attacks. These malicious QR codes, embedded within email messages, are designed to deceive recipients into revealing sensitive information. The primary challenge for cybersecurity vendors is efficiently detecting and analyzing these QR codes at scale, a task that is both computationally demanding and challenging within the high-throughput environment of email servers. This research investigates the application of convolutional neural networks (CNNs) to automate the detection of QR codes embedded in email images, addressing a growing vector for phishing attacks by integrating advanced image recognition techniques into existing email security frameworks. Through iterative development and refinement, a CNN model was designed to accurately differentiate between benign and malicious QR codes. The experimentation process revealed that while the model achieved high accuracy in early stages, it also encountered issues with overfitting as complexity increased, underscoring the need for careful balance in training processes. The study concludes with a proof of concept that demonstrates the effectiveness of CNNs in enhancing email security systems. It also emphasizes the importance of continuous model adaptation to address the evolving nature of phishing threats. This work represents a significant step toward scalable and efficient solutions for detecting QR code-based phishing attacks in the dynamic cybersecurity landscape.
Article
Full-text available
As the volume and complexity of computer network traffic continue to increase, network administrators face a growing challenge in monitoring and discovering unusual activity. To keep the network safe and functioning, detecting anomalies is essential. Machine learning-based anomaly detection techniques have become increasingly popular in recent years. This is due to the fact that conventional anomaly detection methods make it difficult to detect unknown and complex attacks. This research aims to conduct a performance analysis of two feature selection methods using the random forest algorithm using the UNSW-NB15 dataset to determine which model is most effective in detecting network traffic anomalies. The models evaluated were random forest with the filter method and random forest with the wrapper method. A number of metrics used for model performance assessment are accuracy, F1-score, receiver operating characteristic curve, and precision-recall. Dataset collection, data pre-processing, feature selection, model construction, and evaluation are the main components of the research methodology. The research results show that the Random Forest approach with the Filter method has an accuracy of 0.8950, F1-score of 0.8333, ROC score of 0.8928, and a precision-recall value of 0.8347. Meanwhile, the approach using the Wrapper method obtained an accuracy of 0.9151, F1-score of 0.8510, ROC score of 0.9136, and a precision-recall value of 0.8637. This shows that the performance of Random Forest with the Wrapper method is superior in all assessment metrics. Random Forest with the Wrapper Method is the right choice of model for detecting network traffic anomalies because of its stable performance and ability to handle complex patterns
Article
Full-text available
This study introduces a deep learning approach for network intrusion detection (NIDS), which excels in both binary and multi-classification tasks. This approach combines the strengths of six distinct deep learning algorithms: DNN, CNN, RNN, LSTM, GRU, and a Hybrid CNN-LSTM architecture. The NSL-KDD dataset, a widely recognized benchmark for intrusion detection research, was utilized for implementation and evaluation. In binary classification, the approach demonstrates exceptional capabilities, with the GRU approach outperforming others. Similarly, the DNN, LSTM, CNN, and RNN approaches exhibit robust performance, showcasing their efficacy in detecting anomalies within network data. In the multi-classification setting, the DNN approach stands out with outstanding performance. While other approaches, including RNN, CNN, LSTM, GRU, and the Hybrid CNN-LSTM approach, also maintain commendable results, the DNN approach proves to be the most effective in handling complex network patterns. This research provides valuable insights into the application of deep learning approaches using the NSL-KDD dataset for network anomaly detection, emphasizing their versatility and reliability across different classification scenarios. The findings lay the groundwork for further exploration and utilization of deep learning methodologies in enhancing network security.
Article
Full-text available
The landscape of network traffic anomaly detection has evolved considerably in recent times, driven largely by the advent of pioneering algorithms. This article undertakes an exhaustive comparative exploration of some of the most contemporary algorithms, namely Prophet, Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), Isolated Forest (IF), and OmniAnomaly. Delving deep into their distinctive features, functionalities, and practical applications, the paper sheds light on the factors that render these algorithms superior to their predecessors. The discourse commences with a prologue underscoring the escalating importance of network traffic anomaly detection, especially in the face of the burgeoning cyber threats of the modern era. This sets the stage for a presentation on the focal algorithms - Prophet, RNN, CNN, IF, and OmniAnomaly. Each algorithm is then dissected to provide readers with a nuanced understanding of its underlying mechanics and methodologies. Furthermore, the discourse amplifies the breakthroughs and innovations underpinning each algorithm, highlighting attributes such as heightened accuracy, lucid interpretability, proficiency in deciphering intricate patterns, and the agility to detect anomalies in real time. Factors like computational agility, resilience, structural intricacy, and versatility across varied operational terrains are assessed in a meticulous comparative framework. Drawing from empirical evidence available in extant literature, the article underscores the stellar performance of these algorithms, benchmarked using quantitative metrics like precision.
Article
Full-text available
In the context of 6G technology, the Internet of Everything aims to create a vast network that connects both humans and devices across multiple dimensions. The integration of smart healthcare, agriculture, transportation, and homes is incredibly appealing, as it allows people to effortlessly control their environment through touch or voice commands. Consequently, with the increase in Internet connectivity, the security risk also rises. However, the future is centered on a six-fold increase in connectivity, necessitating the development of stronger security measures to handle the rapidly expanding concept of IoT-enabled metaverse connections. Various types of attacks, often orchestrated using botnets, pose a threat to the performance of IoT-enabled networks. Detecting anomalies within these networks is crucial for safeguarding applications from potentially disastrous consequences. The voting classifier is a machine learning (ML) model known for its effectiveness as it capitalizes on the strengths of individual ML models and has the potential to improve overall predictive performance. In this research, we proposed a novel classification technique based on the DRX approach that combines the advantages of the Decision tree, Random forest, and XGBoost algorithms. This ensemble voting classifier significantly enhances the accuracy and precision of network intrusion detection systems. Our experiments were conducted using the NSL-KDD, UNSW-NB15, and CIC-IDS2017 datasets. The findings of our study show that the DRX-based technique works better than the others. It achieved a higher accuracy of 99.88% on the NSL-KDD dataset, 99.93% on the UNSW-NB15 dataset, and 99.98% on the CIC-IDS2017 dataset, outperforming the other methods. Additionally, there is a notable reduction in the false positive rates to 0.003, 0.001, and 0.00012 for the NSL-KDD, UNSW-NB15, and CIC-IDS2017 datasets.
Article
Full-text available
With the increased sophistication of cyber-attacks, there is a greater demand for effective network intrusion detection systems (NIDS) to protect against various threats. Traditional NIDS are incapable of detecting modern and sophisticated attacks due to the fact that they rely on pattern-matching models or simple activity analysis. Moreover, Intelligent NIDS based on Machine Learning (ML) models are still in the early stages and often exhibit low accuracy and high false positives, making them ineffective in detecting emerging cyber-attacks. On the other hand, improved detection and prediction frameworks provided by ensemble algorithms have demonstrated impressive outcomes in specific applications. In this research, we investigate the potential of ensemble models in the enhancement of NIDS functionalities in order to provide a reliable and intelligent security defense. We present a NIDS hybrid model that uses ensemble ML techniques to identify and prevent various intrusions more successfully than stand-alone approaches. A combination of several distinct machine learning methods is integrated into a hybrid framework. The UNSW-NB15 dataset is pre-processed, and its features are engineered prior to being used to train and evaluate the proposed model structure. The performance evaluation of the ensemble of various ML classifiers demonstrates that the proposed system outperforms individual model approaches. Using all the employed experimental combination forms, the designed model significantly enhances the detection accuracy attaining more than 99%, while false positives are reduced to less than 1%.
Preprint
Full-text available
Intrusion detection systems can identify intrusion processes which are attempting to intrude, in the process of intruding, or have already occurred.Intrusion detection is a proactive defense approach. Intrusion detection system models built by machine learning are very sensitive to hyper-parameter settings, and different combinations of hyper-parameters can dramatically affect the model's capabilities. In previous work, finding hyperparameters corresponding to advanced models is a difficult task. In order to deal with the huge amount of network traffic data, we here propose the Gaussian Process Regression and Random Forest (GPR-RF) method, which uses Bayesian optimization of Gaussian process regression to find more appropriate combinations of Random Forest hyperparameters, and realizes the following two advantages: 1. The accuracy of the model is greatly improved compared to the traditional methods. 2. The method can quickly converge to better configurations. Among several methods compared, our method performs best in both aspects.
Article
Full-text available
With the increasing reliance on web-based applications and services, network intrusion detection has become a critical aspect of maintaining the security and integrity of computer networks. This study empirically investigates internet protocol network intrusion detection using two machine learning techniques: Isolation Forest (IF) and One-Class Support Vector Machines (OC-SVM), combined with ANOVA F-test feature selection. This paper presents an empirical study comparing the effectiveness of two machine learning algorithms, Isolation Forest (IF) and One-Class Support Vector Machines (OC-SVM), with ANOVA F-test feature selection in detecting network intrusions using web services. The study used the NSL-KDD dataset, encompassing hypertext transfer protocol (HTTP), simple mail transfer protocol (SMTP), and file transfer protocol (FTP) web services attacks and normal traffic patterns, to comprehensively evaluate the algorithms. The performance of the algorithms is evaluated based on several metrics, such as the F1-score, detection rate (recall), precision, false alarm rate (FAR), and Area Under the Receiver Operating Characteristic (AUCROC) curve. Additionally, the study investigates the impact of different hyper-parameters on the performance of both algorithms. Our empirical results demonstrate that while both IF and OC-SVM exhibit high efficacy in detecting network intrusion attacks using web services of type HTTP, SMTP, and FTP, the One-Class Support Vector Machines outperform the Isolation Forest in terms of F1-score (SMTP), detection rate(HTTP, SMTP, and FTP), AUCROC, and a consistent low false alarm rate (HTTP). We used the t-test to determine that OCSVM statistically outperforms IF on DR and FAR.
Article
The complexity and volume of network traffic have increased significantly due to the emergence of the "Internet of Things" (IoT). The classification accuracy of network traffic depends on selecting the most pertinent features. In this paper, we present a hybrid feature selection method that combines Particle Swarm Optimization (PSO) and Random Forests (RF). The CIC-IDS2017 dataset contains a large number of attack and traffic instances. To improve the classification accuracy, we use the RF algorithm to identify the most important features. Then, the PSO algorithm is used to refine the selection process. According to our experiments, the proposed method performed better than the other methods in terms of classification accuracy, achieving ~99.9% accuracy when using the hybrid of Random Forest and PSO. The suggested method can be used by security analysts and network administrators to identify and prevent attacks on the IoT.
Conference Paper
Anomaly detection is essential to spot cyber-attacks within networks. Unsupervised anomaly detection methods are becoming more popular due to the difficult and expensive process of labeling network data, as well as their superior ability to detect unknown attacks when compared with supervised or signature-based solutions. In this paper, we first introduce an LSTM-based Autoencoder anomaly detection model trained in a fully unsupervised environment, with optimizations for minimal memory usage. Second, we compare the Autoencoder model with an Isolation Forest model by analysing their results. Our Autoencoder attempts to capture the profile of the data by dimensionality reduction, and the use of LSTM layers enables it to leverage the data from previous requests. The reconstruction error is calculated to decide whether a request is anomalous. We train the models on a dataset of requests towards a webserver in an unsupervised fashion. Before training, significant feature engineering is done to process multiple categorical attributes. We evaluated the results based on our analysis of the data as well as their statistical features. A manual analysis revealed differing focuses between numerical and categorical attributes. The Isolation Forest disregards most categorical attributes and emphasizes numerical values. The Autoencoder, on the other hand, detects missing features more effectively but largely disregards numerical attributes. As such, the Autoencoder might have a higher probability of detecting a zero-day attack when compared to Isolation Forest.
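The reconstruction-error decision rule described above can be sketched without a neural network. The example below deliberately swaps the LSTM Autoencoder for a linear autoencoder (a PCA projection computed with NumPy), so it illustrates only the thresholding step, not the model from the abstract. The function names, the number of components, and the 99th-percentile threshold are assumptions.

```python
import numpy as np


def fit_linear_autoencoder(X, n_components):
    """Learn a linear encoder/decoder: the top principal directions of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(X, mean, components):
    """Mean squared error between each row and its low-dimensional reconstruction."""
    centered = X - mean
    reconstructed = centered @ components.T @ components  # encode, then decode
    return np.mean((centered - reconstructed) ** 2, axis=1)


def flag_anomalies(train, test, n_components=2, quantile=0.99):
    """Flag test rows whose error exceeds a quantile of the training errors."""
    mean, components = fit_linear_autoencoder(train, n_components)
    threshold = np.quantile(reconstruction_error(train, mean, components), quantile)
    return reconstruction_error(test, mean, components) > threshold
```

The model is fitted on unlabeled traffic only, so a row that does not fit the learned low-dimensional profile reconstructs poorly and is flagged. The quantile sets the trade-off between missed anomalies and false alarms.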