Article

Abstract

Classification-oriented Machine Learning methods are a precious tool, in modern Intrusion Detection Systems (IDSs), for discriminating between suspected intrusion attacks and normal behaviors. Many recent proposals in this field leveraged Deep Neural Network (DNN) methods, capable of learning effective hierarchical data representations automatically. However, many of these solutions were validated on data featuring stationary distributions and/or large amounts of training examples. By contrast, in real IDS applications different kinds of attacks tend to occur over time, and only a small fraction of the data instances is labeled (usually with far fewer examples of attacks than of normal behavior). A novel ensemble-based Deep Learning framework is proposed here that tries to face the challenging issues above. Basically, the non-stationary nature of IDS log data is addressed by maintaining an ensemble consisting of a number of specialized base DNN classifiers, trained on disjoint chunks of the data instances' stream, plus a combiner model (reasoning on both the base classifiers' predictions and the original instance features). In order to learn deep base classifiers effectively from small training samples, an ad-hoc shared DNN architecture is adopted, featuring a combination of dropout capabilities and skip-connections, along with a cost-sensitive loss (for dealing with unbalanced data). Test results, conducted on two benchmark IDS datasets and involving several competitors, confirmed the effectiveness of our proposal (in terms of both classification accuracy and robustness to data scarcity), and allowed us to evaluate different ensemble combination schemes.
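To make the architectural ingredients above concrete, the following is a minimal sketch, not the authors' exact architecture: a base DNN with dropout, a skip-connection, and a cost-sensitive loss implemented as class-weighted cross-entropy. Layer sizes, dropout rate, and the 1:10 class-weight ratio are illustrative assumptions.

```python
# Minimal sketch of a base DNN with dropout, a skip-connection, and a
# cost-sensitive (class-weighted) loss; all sizes/weights are illustrative.
import torch
import torch.nn as nn

class BaseDNN(nn.Module):
    def __init__(self, n_features, n_classes, hidden=64, p_drop=0.3):
        super().__init__()
        self.inp = nn.Linear(n_features, hidden)
        self.block = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, hidden),
        )
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, x):
        h = torch.relu(self.inp(x))
        h = torch.relu(h + self.block(h))  # skip-connection around the block
        return self.out(h)

# Cost-sensitive loss: the minority (attack) class is weighted more heavily,
# e.g. by inverse class frequency; the 1:10 ratio here is a placeholder.
loss_fn = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 10.0]))
```

In an ensemble like the one described, one such network would be trained per data chunk, and the combiner would consume their outputs together with the original instance features.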


... To face the challenges above and try to overcome the limitations of some state-of-the-art approaches in the current literature, we propose an unsupervised, DL-based model for detecting DoS and DDoS attacks in a NIDS scenario. Our solution encompasses two main steps: (i) an ad-hoc preprocessing phase in which features with a certain degree of correlation are removed, and several additional features are created by applying non-linear functions to the original input (similarly to what was proposed in [12]); and (ii) an unsupervised learning phase in which a hybrid architecture combining Sparse AEs and U-Net-like models [13] (from now on referred to as Sparse U-Net) is trained against legitimate traffic only and then used to reveal the presence of abnormal (possibly attack-related) behaviors. In particular, following the intuition of [14], a data augmentation strategy is exploited in our neural architecture to mitigate the risk of overfitting and yield more reliable models. ...
... Finally, the Additional Feature Generation component is based on the idea proposed in [12], which consists of enriching the original input vector with a fixed number of additional features. The additional features are computed by applying several distinct non-linear functions to each original feature. ...
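As an illustration of this feature-enrichment idea, the sketch below applies a few example non-linear transforms to every original feature; the specific functions used in [12] may differ.

```python
# Sketch of additional-feature generation: enrich each input vector with
# non-linear transforms of every original feature (example functions only).
import numpy as np

def enrich(X):
    """X: (n_samples, n_features) array of non-negative, scaled features."""
    extras = [np.sqrt(X), np.square(X), np.log1p(X)]
    return np.hstack([X] + extras)

X = np.random.rand(5, 4)
X_enriched = enrich(X)   # shape (5, 16): original features + 3 transforms
```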
... In future works, we plan to investigate the usage of semi-supervised architectures to take advantage of both the better capability of unsupervised techniques to recognize zero-day attacks and the higher accuracy achieved by supervised methods. Moreover, since computer networks are evolving environments in which concept drift and data shift phenomena are expected to occur, we are considering the possibility of integrating our approach into an incremental ensemble learning scheme similar to that proposed in [12]. ...
Conference Paper
In the last few years, we have experienced exponential growth in the number of cyber-attacks performed against companies and organizations. In particular, because of their ability to mask themselves as legitimate traffic, DoS and DDoS have become two of the most common kinds of attacks on computer networks. Modern Intrusion Detection Systems (IDSs) represent a precious tool to mitigate the risk of unauthorized network access, as they allow for accurately discriminating between benign and malicious traffic. Among the plethora of approaches proposed in the literature for detecting network intrusions, Deep Learning (DL)-based IDSs have proven to be an effective solution because of their ability to analyze low-level data (e.g., flow and packet traffic) directly. However, many current solutions require large amounts of labeled data to yield reliable models. Unfortunately, in real scenarios, only small portions of data carry label information due to the cost of manual labeling conducted by human experts. Labels can even be completely missing for some reason (e.g., privacy concerns). To cope with the lack of labeled data, we propose an unsupervised DL-based intrusion detection methodology, combining an ad-hoc preprocessing procedure on input data with a sparse U-Net-like autoencoder architecture. The experimentation on an IDS benchmark dataset substantiates our approach's ability to recognize malicious behaviors correctly.
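The core unsupervised mechanism described above can be illustrated with a plain dense autoencoder; the paper's model is a sparse U-Net-like architecture, so sizes, data, and scoring here are placeholder assumptions. The idea: train on benign traffic only, then score new records by reconstruction error.

```python
# Sketch: autoencoder trained on benign traffic only; records with a high
# reconstruction error are flagged as possible attacks. Plain dense AE,
# not the paper's sparse U-Net; all sizes and data are placeholders.
import torch
import torch.nn as nn

ae = nn.Sequential(
    nn.Linear(20, 8), nn.ReLU(),   # encoder (20 input features assumed)
    nn.Linear(8, 20),              # decoder
)
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)

X_benign = torch.rand(1000, 20)    # placeholder legitimate-traffic features
for _ in range(50):
    opt.zero_grad()
    loss = ((ae(X_benign) - X_benign) ** 2).mean()
    loss.backward()
    opt.step()

def anomaly_score(x):              # higher error => more likely an attack
    with torch.no_grad():
        return ((ae(x) - x) ** 2).mean(dim=1)
```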
... We use ensemble averaging to merge model predictions, offering a detailed assessment of intrusion likelihood, reducing the impact of outliers, and creating smoother decision boundaries for better understanding of intrusion patterns [7][8][9]. The base classifier in our ensemble is a DNN, chosen for its ability to handle large and complex data, manage the increasing volume of network traffic, and adapt to evolving network conditions [10,11]. Additionally, we use DE as the weighting strategy for our ensemble methods. ...
... It is computed using Eq. (11). ...
Article
Full-text available
Detecting coordinated attacks in cybersecurity is challenging due to their sophisticated and distributed nature, making traditional Intrusion Detection Systems often ineffective, especially in heterogeneous networks with diverse devices and systems. This research introduces a novel Collaborative Intrusion Detection System (CIDS) using a Weighted Ensemble Averaging Deep Neural Network (WEA-DNN) designed to detect such attacks. The WEA-DNN combines deep learning techniques and ensemble methods to enhance detection capabilities by integrating multiple Deep Neural Network (DNN) models, each trained on different data subsets with varying architectures. Differential Evolution optimizes the model’s contributions by calculating optimal weights, allowing the system to collaboratively analyze network traffic data from diverse sources. Extensive experiments on real-world datasets like CICIDS2017, CSE-CICIDS2018, CICToNIoT, and CICBotIoT show that the CIDS framework achieves an average accuracy of 93.8%, precision of 78.6%, recall of 60.4%, and an F1-score of 62.4%, surpassing traditional ensemble models and matching the performance of local DNN models. This demonstrates the practical benefits of WEA-DNN in improving detection capabilities in real-world heterogeneous network environments, offering superior adaptability and robustness in handling complex attack patterns.
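The weighting step can be sketched as follows: Differential Evolution searches for ensemble weights that maximize a validation metric (here macro F1). The models, data, and bounds below are stand-ins, not the WEA-DNN implementation.

```python
# Sketch: Differential Evolution finds weights for averaging base-model
# probabilities so that validation macro-F1 is maximized (dummy data).
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.metrics import f1_score

def weighted_vote(weights, probas):
    w = np.asarray(weights) / np.sum(weights)            # normalize weights
    return np.argmax(sum(wi * p for wi, p in zip(w, probas)), axis=1)

def neg_f1(weights, probas, y_true):
    return -f1_score(y_true, weighted_vote(weights, probas), average="macro")

rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, 200)                          # dummy labels
probas = [rng.dirichlet([1, 1], 200) for _ in range(3)]  # 3 dummy models
res = differential_evolution(neg_f1, bounds=[(0.01, 1.0)] * 3,
                             args=(probas, y_val), seed=0, maxiter=30)
best_weights = res.x / res.x.sum()
```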
... As stated in paper [14], there are not many works that propose deep ensemble learning for intrusion detection systems. In [14], an ensemble made up of several specialized base DNN classifiers is used in order to deal with the non-stationary nature of IDS log data. The DNN classifiers were trained on separate segments of the data instances' stream. ...
Conference Paper
System and network security have become one of the main considerations in system and network design, implementation, and maintenance. The increased volume of traffic and the ever-changing form of internet communications have led to a steep rise in network attacks, hence the need to protect both sides of the communication is essential. Over the last couple of years, an upsurge in the use of machine learning and deep learning techniques for detecting network attacks can be noticed. Alongside individual machine and deep learning techniques, various ensemble techniques have been implemented in order to combine multiple models working on the same problem. However, ensemble techniques mostly focus on machine learning models, disregarding the fact that deep learning techniques have been proven to be efficient when handling large amounts of data. In this paper, a network intrusion detection system is implemented using a weighted voting ensemble deep learning model. Fifteen Deep Neural Network models have been trained and evaluated on the imbalanced multiclass CICIDS-2017 dataset. A weighted voting system has been implemented for combining the decisions of the various models in two ways. Evaluation results for multiple combinations are given. The proposed system has the ability to use heterogeneous models for the decision.
... To detect correlations among the characteristic data of ATC network security attack behavior, the cross-correlation test index is used to study the correlations among the characteristics, and a correlation of 0.3 is set as the standard for determining whether the characteristic data of ATC network security attack behavior is key characteristic data. The larger the correlation value of the characteristic data of network security attack behavior, the more critical the extracted characteristics are [20][21][22]. After adding the recursive features, the optimal features for the security attack behavior of the ATC network are extracted step by step. ...
... In formula (20), ϕ_i represents the traversal result [50,52]. ...
Article
Full-text available
In order to enhance the accuracy of Air Traffic Control (ATC) cybersecurity attack detection, in this paper, a new clustering detection method is designed for air traffic control network security attacks. The feature set for ATC cybersecurity attacks is constructed by setting the feature states, adding recursive features, and determining feature criticality. The expected information gain and entropy of the feature data are computed to determine the information gain of the feature data and reduce the interference of similar feature data. An autoencoder is introduced into the AI (artificial intelligence) algorithm to encode and decode the characteristics of ATC network security attack behavior, reducing the dimensionality of the attack behavior data. Based on the above processing, an unsupervised learning algorithm for clustering detection of ATC network security attacks is designed. First, the distance between the clusters of ATC network security attack behavior characteristics is determined, the clustering threshold is calculated, and the initial cluster centers are constructed. Then, the new average of all feature objects in each cluster is recalculated as the new cluster center, and all objects in each cluster of attack behavior feature data are traversed. Finally, the cluster detection of ATC network security attack behavior is completed by computing the objective functions. The experiments took three groups of attack behavior datasets as test objects, used the detection rate, false detection rate, and recall rate as test indicators, and selected three similar methods for comparative testing. The experimental results show that the detection rate of this method is about 98%, the false positive rate is below 1%, and the recall rate is above 97%. The research shows that this method can improve the detection performance of security attacks in air traffic control networks.
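The clustering loop described in this abstract is essentially a k-means-style iteration. A bare-bones sketch follows, omitting the paper's specific threshold computation and objective function, which are not reproducible from the abstract alone.

```python
# Sketch of the clustering step: assign feature vectors to the nearest
# center and recompute centers as cluster means (standard k-means loop;
# the paper's threshold/objective details are omitted).
import numpy as np

def cluster(X, k=3, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]      # initial centers
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                          # nearest center
        centers = np.array([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels, centers
```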
... The conventional feature extraction approaches are used in order to preserve the consistency level of the parameters. Neural network methods commonly used in the field of intrusion detection mainly include Long Short-Term Memory networks [24], Convolutional Neural Networks [25], Recurrent Neural Networks [26], and deep belief networks. In the literature, deep learning models perform even better when it comes to generalization; they are utilized in the final stage of identifying anomalous data types [27]. The author of [28] used a recognised gas pipeline dataset to conduct the experiment. A hybrid approach was utilized to finalize the parameter selection during the experiment, yielding more satisfactory results; however, the methods produced a high number of false alarms as a result of improper feature extraction from the dataset. ...
... The experimental parameters include the environmental configuration, description and allocation of data resources, baseline investigations, and performance metrics. To assess the efficacy of our intrusion detection model, we will conduct tests comparing it to other sophisticated studies, like Schneble's [26] and Nguyen's [21], utilizing our deep learning model. ...
Preprint
Full-text available
As industrial control systems (ICSs) are integrated with communication networks and the Internet of Things (IoT), they become more susceptible to cyberattacks, which can have catastrophic effects. However, the lack of sufficient high-quality attack examples has made it very difficult to withstand cyber threats in large-scale, sophisticated, and heterogeneous ICSs. Conventional intrusion detection systems (IDSs), designed primarily to assist IT systems, rely heavily on pre-established models and are mostly trained on particular types of cyberattacks. Furthermore, most intrusion detection systems suffer from low accuracy and high false-positive rates because they fail to take into account the imbalanced nature of datasets and feature redundancy. In this article, the Deep DenseAttention Learning Model (DDAnet), a novel and inventive deep learning scheme, is introduced to identify and detect cyberattacks that target industrial control systems. The intrusion detection model is a DenseNet-based network combined with an attention model, along with a random forest as the classifier. The DDAnet learning scheme has been extensively tested on a real industrial control system dataset. The results of these experiments reveal the great effectiveness of the scheme in identifying different types of data injection attacks on industrial control systems. Furthermore, the scheme has been found to have superior performance compared to state-of-the-art schemes and existing methodologies. The proposed strategy is a versatile method that can be easily deployed in the current ICS infrastructure with minimal effort.
... Ensemble-based DNN frameworks offer high accuracy but come with high computational complexity. Folino et al. [43] propose an ensemble for continuous intrusion detection and analyze ensemble aggregation strategies, focusing on unstructured data from NIDS logs. Sarath et al. [44] utilized the Enhanced Elman Spike Neural Network to enhance Intrusion Detection Systems (IDS). ...
... Utilizing the weights w_j (j = 1, 2, 3, 4) derived from Table 4, a weighted aggregation process was applied to the multiple IT3FLS outcomes using Eq. (43). This operation combined the k-barrier predictions, resulting in the final output. ...
Article
Full-text available
In an evolving defense landscape with persistent security threats, enhancing Wireless Sensor Networks (WSN) for border security and advancing Intrusion Detection Systems (IDS) are vital for national defense and data integrity. In this research, we present a structured and innovative Analytical Hierarchy Process (AHP) Multi-Attribute Decision-Making (MADM) approach that aggregates multiple Interval Type-3 Fuzzy Logic Systems (IT3FLS) for the accurate prediction of the number of k-barriers for fast intrusion detection and prevention within WSN. Four possible features—the rectangular region, the sensing range of the sensors, the transmission range of the sensors, and the number of sensors for uniform sensor distribution—were used in the training and evaluation of the suggested model. These features are retrieved using Monte Carlo simulation. The methodology is outlined in four stages. In Stage 1, multiple IT3FLS models are constructed from data collected through simulations. Stage 2 rigorously evaluates the IT3FLS models using statistical measures, culminating in a performance matrix. Stage 3 integrates this matrix, enhancing understanding via AHP-MADM to assign weights. In Stage 4, these weights optimize predictions through a weighted aggregation method. The system's results significantly enhance the accuracy of k-barrier predictions in intrusion detection. The model demonstrates its proficiency with a remarkable correlation coefficient (R) of 0.997, a minimal root mean square error (RMSE) of 5.36, and a low bias of 1.7. Furthermore, the research assesses the proposed system's performance against multiple benchmark methods, confirming its superior accuracy and computational efficiency.
... A BiGAN-inspired model combined with a custom loss function is adopted for identifying intrusions in computer networks. In [14], the authors propose a supervised incremental Deep Learning scheme to cope with concept drifts and data shifts: a number of Residual Neural Networks (ResNet) are trained against disjoint data chunks gathered in different time windows. Then, the single models are combined in an ensemble model, which is further fine-tuned on a subset of data extracted by each data chunk. ...
... Then, the single models are combined in an ensemble model, which is further fine-tuned on a subset of data extracted from each data chunk. As shown in [20], the performances of different ML-based detection methods (including [14]) can be further improved by embedding them in an Active Learning scheme. A different solution to the problem proposes the usage of the Federated Learning framework: the underlying idea of this approach is to distribute the computation of the model among different nodes, which are the owners of the data. ...
Chapter
In recent times, Machine Learning has played an important role in developing novel advanced tools for threat detection and mitigation. Intrusion Detection, Misinformation, Malware, and Fraud Detection are just some examples of cybersecurity fields in which Machine Learning techniques are used to reveal the presence of malicious behaviors. However, Out-of-Distribution data, i.e., a potential distribution gap between training and test set, can heavily affect the performance of traditional Machine Learning based methods. Indeed, they could fail in identifying out-of-distribution samples as possible threats; therefore, devising robust approaches to cope with this issue is a crucial and relevant challenge to mitigate the risk of undetected attacks. Moreover, a recently emerging line of research proposes to use generative models to yield synthetic likely examples to feed the learning algorithms. In this work, we first survey recent Machine Learning and Deep Learning based solutions to face both problems, i.e., outlier detection and generation; then we illustrate the main cybersecurity application scenarios in which these approaches have been adopted successfully. Keywords: Threat Detection, Outlier Generation, Deep Learning, Ensemble Learning, Generative Model, Autoencoder
... For analyzing non-stationary data like intrusion detection system logs, a novel ensemble-based deep learning framework was created by Folino and colleagues [18]. When employing ensembles of base learners, the capacity to construct a superior detection structure is essential in order to achieve a higher detection rate; when putting together an ensemble, one of the most difficult problems is choosing from the available base classifiers and combiners. ...
Article
Full-text available
This paper explores distributed denial of service (DDoS) attacks, their current threat level, and intrusion detection systems (IDS), which are one of the key techniques for mitigating them. It focuses on the problems and issues that IDS systems encounter while detecting DDoS attacks, as well as the difficulties and obstacles that they face nowadays when integrating with artificial intelligence systems. These ID systems enable the automatic and real-time identification of harmful threats. However, the network requires a highly sophisticated security solution due to the frequency with which malicious threats emerge and change. A significant amount of research is required to create an intelligent and trustworthy identification system; for research purposes, numerous ID datasets are freely accessible. Due to the rapid evolution of attack detection mechanisms and the complexity of malicious attacks, publicly available ID datasets must be updated regularly. A Convolutional Neural Network (CNN) was trained using four distinct training algorithms and tested on the CICDDoS2019 dataset, which contains the most recent DDoS attack types. According to the analysis, the "Gradient Descent with Momentum Backpropagation" algorithm could be trained quickly, and network data attacks were correctly detected 93.1 percent of the time. The results indicate that the Convolutional Neural Network can successfully support DDoS attack detection in intrusion detection systems (IDS), as evidenced by the high accuracy values obtained.
... Those ML algorithms are typically binary classifiers, which aim at distinguishing between normal and attack-related behavior by processing feature values. This has proven to be very effective for detecting a wide variety of attacks, and in the last two decades it has originated a huge amount of research papers and industrial applications [42], [43], [44], [45], [46], [47], [48] that have the potential to improve the security attributes of ICT systems. However, researchers and practitioners have to craft intrusion detectors for specific systems, network interfaces, and attack models, to name a few. ...
... Their performance is then evaluated and compared against potential competitors, and then the detection system is deployed and put into operation. This is a consolidated flow that has been proven effective in many studies [42], [44], [45], [46], [47], [48]. ...
Chapter
Full-text available
Exercising Machine Learning (ML) algorithms to detect intrusions is nowadays the de-facto standard for data-driven detection tasks. This activity requires the expertise of the researchers, practitioners, or employees of companies, who also have to gather labeled data to learn and evaluate the model that will then be deployed into a specific system. Reducing the expertise and time required to craft intrusion detectors is a tough challenge, which in turn would have an enormous beneficial impact in the domain. This paper conducts an exploratory study that aims at understanding to which extent it is possible to build an intrusion detector that is general enough to learn the model once and then be applied to different systems with minimal to no effort. Therefore, we recap the issues that may prevent building general detectors and propose software architectures that have the potential to overcome them. Then, we perform an experimental evaluation using several binary ML classifiers and a total of 16 feature learners on 4 public attack datasets. Results show that a model learned on one dataset or system does not generalize well as-is to other datasets or systems, showing poor detection performance. Instead, building a unique model that is then tailored to a specific dataset or system may achieve good classification performance, requiring less data and far less expertise from the final user. Keywords: Intrusion detection, General model, Transferability, Machine learning, Feature learning
... In addition, many researchers have recently used advanced boosting technology to process long-tail data, combining it with Deep Neural Network (DNN) architectures to improve the classification effect by taking advantage of the good generalization performance of advanced boosting technology. Folino et al. [20] adopted an ad-hoc shared DNN architecture, featuring a combination of dropout capabilities and skip connections, along with a cost-sensitive loss, to efficiently learn deep base classifiers from minority-class samples; this is also the first attempt to combine a block-based learning scheme with DNN ensemble techniques to handle long-tail classification tasks, and the experimental results confirm the feasibility and application prospects of the method. Bedi et al. [21] proposed an improved Siam-IDS (I-SiamIDS) algorithm-level method, using a collection of binary eXtreme Gradient Boosting, Siamese Neural Network, and Deep Neural Network models to improve the system's effectiveness in detecting intrusion attacks in an unbalanced network environment, but its computational time overhead is very large. ...
... The deep learning-based techniques in the literature [18,19] reconstruct the data well, but do not introduce new features for the data, which is also the biggest difference from GANs. In the literature [20][21][22], the methods based on advanced boosting and DNNs introduce the idea of modularization, and the scope of changes to the model structure is small; however, because they combine multiple models, they often increase the computational cost exponentially. In the literature [23][24][25][26][27][28][29][30], the methods based on GANs introduce new features while adding minority-class samples, which effectively improves the classification effect. ...
Article
Full-text available
With the rapid development and application of the mobile Internet, it is necessary to analyze and classify mobile traffic to meet the needs of users. Due to the difficulty of collecting some application data, mobile traffic data presents a long-tailed distribution, resulting in a decrease in classification accuracy. In addition, the original GAN is difficult to train and prone to "mode collapse". Therefore, this paper introduces the self-attention mechanism and gradient normalization into the auxiliary classifier generative adversarial network to form the SA-ACGAN-GN model, which addresses the long-tailed distribution and training stability problems of mobile traffic data. The method first converts the traffic into images; second, to improve the quality of the generated images, the self-attention mechanism is introduced into the ACGAN model to obtain the global geometric features of the images; finally, a gradient normalization strategy is added to SA-ACGAN to further improve the data augmentation effect and the training stability. The cross-validation experimental data show that, using the same classifier, the SA-ACGAN-GN algorithm proposed in this paper achieves the best precision, reaching 93.8%, compared with the other algorithms; after adding gradient normalization, the classification loss decreases rapidly during model training and the loss curve fluctuates less, indicating that the method proposed in this paper can not only effectively alleviate the long-tail problem of the dataset, but also enhance the stability of the model training.
... The designed novel FLbHTDN has five layers: input layer, hidden layer, classification layer, parameter tuning phase, and output layer, as described in fig. 3. The proposed FLbHTDN approach has been designed based on the frog leaping algorithm and the deep neural model (DNM) (Folino et al. 2021). The fitness of the frog is updated in all layers to tune the parameters and yield better results. ...
... The performance of the developed novel FLbHTDN method has been analyzed against other existing works, namely the Deep Belief Model (DBM) (Wang et al. 2021), the Recurrent Neural Model (RNM) (Imrana 2021), the Deep Neural Model (DNM) (Folino et al. 2021), and the Convolutional Neural Model (CNM) (Mendonça 2021). The existing models were implemented in the same Java platform, and the results are discussed as follows. ...
Preprint
Full-text available
Nowadays, Internet-of-Things (IoT) facilities are used worldwide in all digital applications. Hence, maintaining the security of the IoT communication system is crucial for the further advancement of IoT. However, harmful attacks can destroy security and degrade the IoT communication channel by causing network traffic, system shutdown, and collapse. The present work introduces a novel Frog Leap-based Hyper-parameter Tuned Deep Neural (FLbHTDN) model to overcome these issues and detect intrusions in the IoT communication paradigm. The NSL-KDD dataset has been utilized to validate the proposed model. Initially, a preprocessing step removes errors from the training dataset. Subsequently, the features present in the dataset are tracked, and the malicious features are extracted and classified into specific attack classes. The designed model is executed on the Java platform, and the improvement of the developed technique has been validated through comparative analysis. The proposed FLbHTDN approach obtained the finest attack prediction score in less time than the compared models.
... For the purpose of illustration, we assume that the TDS layer includes EBIDS (Ensemble Based IDS) [37], a ML-based Intrusion Detection technique adopting specialized ensembles of classification models to identify undetected attacks by analyzing traffic flow statistics extracted from network logs. Here, we use pcap format to share network flow information, but the proposed security event object, described in the previous section, is flexible and allows for supporting data shared in other formats. ...
... , TDS_N. These instances are initialized by using the same parameters described in [37]. For the architecture of the base model, (i) the Extended Input layer produces the transformations √ ...
Article
Sharing threat events and Indicators of Compromise (IoCs) enables quick and crucial decision making relative to effective countermeasures against cyberattacks. However, current threat information sharing solutions do not allow easy communication and knowledge sharing among threat detection systems (in particular Intrusion Detection Systems (IDS)) exploiting Machine Learning (ML) techniques. Moreover, the interaction with the expert, who represents an important component for gathering verified and reliable input data for the ML algorithms, is weakly supported. To address these issues, we propose ORISHA, a platform for ORchestrated Information SHaring and Awareness that enables cooperation among threat detection systems and other information awareness components. ORISHA is backed by a distributed Threat Intelligence Platform based on a network of interconnected Malware Information Sharing Platform instances, which enables communication with several Threat Detection layers belonging to different organizations. Within this ecosystem, Threat Detection Systems mutually benefit from sharing knowledge that allows them to refine their underlying predictive accuracy. Uncertain cases, i.e. examples with low anomaly scores, are proposed to the expert, who acts as an oracle in an Active Learning scheme. By interfacing with a honeynet, ORISHA allows for enriching the knowledge base with further positive attack instances and thus yielding robust detection models. An experimental evaluation conducted on a well-known Intrusion Detection benchmark demonstrates the validity of the proposed architecture.
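The Active Learning loop mentioned above can be sketched as follows: examples whose scores fall in an uncertain band are routed to the human oracle, labeled, and fed back for retraining. The band thresholds and the oracle/model interfaces below are hypothetical, not ORISHA's actual API.

```python
# Sketch of uncertainty-based sample selection for the expert/oracle loop;
# band thresholds and the oracle/model interfaces are hypothetical.
import numpy as np

def select_uncertain(scores, low=0.4, high=0.6):
    """Indices of examples the detector is least confident about."""
    return np.where((scores >= low) & (scores <= high))[0]

scores = np.random.rand(1000)           # anomaly scores from the detector
ask_expert = select_uncertain(scores)   # these examples go to the expert
# labels = expert.label(X[ask_expert])  # hypothetical oracle interface
# model.fit(X[ask_expert], labels)      # retrain with the new labels
```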
... By 2027, endpoint security is anticipated to be valued at over $29 billion [2]. Researchers recommend employing Machine Learning (ML) and Deep Learning (DL) in Intrusion Detection Systems (IDS) to detect intrusions in IIoT networks [3]. Nevertheless, there are unique difficulties associated with using ML and DL in IDS for this type of network. ...
Article
Full-text available
While conventional Intrusion Detection Systems (IDS) are essential for defending against intruders in the Industrial Internet of Things (IIoT), handling data from heterogeneous and streaming data sources should receive more attention. This work introduces a novel Optimized IForest-based Intrusion Detection System (OIFIDS), designed to handle both heterogeneous and streaming data efficiently. The suggested approach employs a collection of optimized binary trees, each of which is trained on a distinctive subset of data, and in which the location of empty leaves determines the anomaly score assigned to a given data point. Optimizing the isolation Forest (iForest) with a modified version of the Harris Hawks Optimization algorithm that exploits both an Exploration factor and Random walk strategies (ERHHO) decreases the dataset's dimension, reduces learning time, and enhances detection precision, accuracy, F1-score, FPR, and recall. To demonstrate how effective the proposed approach is, it is evaluated using three datasets: CICIDS-2018, NSL-KDD, and UNSW-NB15. The experimental results prove the ability of the suggested approach to handle both heterogeneous and streaming data efficiently, delivering results comparable to cutting-edge baseline techniques. Moreover, it performs effectively when there are no anomalies in the training sample and when dealing with challenging scenarios with several irrelevant features and high dimensions. Compared with various state-of-the-art IDSs, the suggested approach detects intrusions with greater accuracies of 95.6%, 94.8%, and 99% on the NSL-KDD, UNSW-NB15, and CICIDS-2018 datasets, respectively. Experiments on heterogeneous data reveal that the Area Under the ROC Curve (AUC) of OIFIDS beats the baseline approach on the UNSW-NB15 dataset and is higher than the second-best method by 8% and 2.4% for NSL-KDD and CICIDS-2018, respectively. Evaluating the proposed system on streaming data illustrates that it addresses the concept drift problem well, with high AUC values of 0.948, 0.97, and 0.922 on the NSL-KDD, CICIDS-2018, and UNSW-NB15 datasets, respectively.
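At its core, the iForest component scores a point by how quickly random trees isolate it. A minimal sketch with scikit-learn follows; it does not reproduce the ERHHO optimization layer, and the data are placeholders.

```python
# Sketch of the iForest core: anomaly scores from an Isolation Forest
# (the ERHHO optimization layer of OIFIDS is not reproduced here).
import numpy as np
from sklearn.ensemble import IsolationForest

X_train = np.random.rand(1000, 10)          # placeholder flow features
iforest = IsolationForest(n_estimators=100, random_state=0).fit(X_train)
scores = -iforest.score_samples(X_train)    # higher => more anomalous
flags = iforest.predict(X_train)            # -1 = anomaly, 1 = normal
```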
... ML and DL approaches have been widely used in network security due to their ability to discriminate data [18][19][20][21][22]. Other researchers have applied various intrusion detection (IDS) techniques based on KNN and SVM on various datasets to evaluate the efficiency of these algorithms on the NSL-KDD dataset [23,24]. ...
Article
Full-text available
Intrusion Detection Systems (IDS) play a vital role in network security by detecting and preventing malicious activities. Due to the dynamic and ever-changing networking environment, network intrusion data is dominated by a vast number of common occurrences. This results in a scarcity of training cases for rare classes and in detection outcomes with a significant percentage of false detections. Our suggested Network-IDS addresses the issue of data imbalance by integrating Deep Learning Networks (DLN) via hybrid sampling. We begin by collecting out-of-the-ordinary samples from the majority class and eliminating them using the Difficult-Set-Sampling-Technique (DSST); the next step is to increase the minority class's sample size using Deep-Convolutional-Generative-Adversarial-Networks (DCGAN). The second stage involves building a deep neural network model that extracts spatial features using DenseNet169 and captures temporal features using SAT-Net. This approach effectively represents the unique attributes of the dataset. Lastly, we deploy the EESNN to identify attack types. In addition, we conducted tests on the latest and most extensive intrusion datasets, the Telecommunications Network Internet of Things (ToN-IoT) dataset and the CICIDS2019 dataset, to verify the proposed approach. The findings indicate that our proposed system is superior to other attempts of a similar kind in terms of accuracy, false alarm rate, recall, and precision; we provide a detailed explanation of this in the comparative section.
... In formula (1), D represents the feature set of air traffic control network security attack behavior, d_1, d_2, ..., d_n represent the constituent characteristics of ATC network security attack behavior, and n represents the number of feature vectors [17]-[19]. ...
... Folino et al. [12] proposed a novel ensemble-based DL framework for Intrusion Detection Systems (IDSs). In order to deal with the non-stationary nature of IDS log data, an ensemble of specific base deep neural network (DNN) classifiers trained on discontinuous portions of the data instances' stream with a combiner model was developed. ...
... A large body of research on IDS relies on machine and deep learning techniques. For example, Folino et al. (2021) propose a novel ensemble-based deep learning framework for the analysis of non-stationary data, such as network-traffic logs. The results indicate better accuracy on two popular benchmark IDS datasets compared to other state-of-the-art solutions. ...
Article
Full-text available
The number of papers on machine learning and deep neural networks applied to intrusion detection systems (IDS) is ever-increasing. Differently from existing work on the topic, this paper explores the effect of training-time randomness of deep neural networks, which is overlooked by the related literature. Training-time randomness is regulated by the seed of the pseudorandom number generator, and affects the performance of IDS models. The seed selection is studied in conjunction with other critical learning parameters: to the best of our knowledge, there are no similar studies in IDS. The experiments are done with a recent and widely consolidated intrusion detection benchmark, which is used to train and test a neural network under different combinations of seeds and parameters both in supervised and semi-supervised learning modes. The results are inferred by a mixture of explorative analysis, design of experiments, and analysis of variance. According to the results, the choice of the seed yields either excellent or scarce detection metrics; more importantly, the seed selection might be as relevant as the other major learning parameters assessed.
... , D_{i-k}, respectively), which are fed with data instances gathered in different temporal intervals. Specifically, the incremental learning process adopted in our solution is loosely inspired by the work (Folino et al., 2021) proposing an ensemble-based deep learning approach trained on disjoint data chunks. Differently from this work, where each model is trained independently, in our solution the model M_i is trained (i.e., fine-tuned) from the weights of the model M_{i-1} and the sample D_i. ...
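A minimal sketch of this fine-tuning chain follows; the architecture, data, and hyper-parameters are placeholders, not the cited implementation.

```python
# Sketch: each model M_i starts from the weights of M_{i-1} and is
# fine-tuned on the new chunk D_i (illustrative PyTorch loop).
import copy
import torch
import torch.nn as nn

def fine_tune(model, X, y, epochs=5, lr=1e-4):
    model = copy.deepcopy(model)        # M_i initialized from M_{i-1}
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return model

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
chunks = []                             # stream of chunks (X_i, y_i): D_1, D_2, ...
for X_chunk, y_chunk in chunks:
    model = fine_tune(model, X_chunk, y_chunk)
```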
Article
Full-text available
Modern IoT ecosystems are the preferred target of threat actors wanting to incorporate resource-constrained devices within a botnet or leak sensitive information. A major research effort is then devoted to create countermeasures for mitigating attacks, for instance, hardware-level verification mechanisms or effective network intrusion detection frameworks. Unfortunately, advanced malware is often endowed with the ability of cloaking communications within network traffic, e.g., to orchestrate compromised IoT nodes or exfiltrate data without being noticed. Therefore, this paper showcases how different autoencoder-based architectures can spot the presence of malicious communications hidden in conversations, especially in the TTL of IPv4 traffic. To conduct tests, this work considers IoT traffic traces gathered in a real setting and the presence of an attacker deploying two hiding schemes (i.e., naive and “elusive” approaches). Collected results showcase the effectiveness of our method as well as the feasibility of deploying autoencoders in production-quality IoT settings.
... Wang Zhendong et al. [4] proposed an IGWO-BP neural network detection model to address the problem that error back-propagation neural networks are initialized with random values and tend to fall into local optima during training. Folino F. et al. [5] proposed a new ensemble-based deep learning framework in an attempt to solve the problem of data imbalance. ...
Article
Full-text available
Using deep learning and machine learning techniques for network intrusion detection is of great significance for enhancing the defense capability of network security systems. Generative adversarial networks produce samples approximately consistent with the input data distribution, yet randomly distributed within a certain bounded interval. In response to the insufficient classification performance and missed detections caused by imbalances in the categories and quantities of network intrusion traffic, and given that existing classification algorithms for unbalanced traffic data still leave room for improvement, this paper proposes a network intrusion detection strategy based on auxiliary classifier generative adversarial networks. Data expansion experiments are conducted with the intrusion detection dataset NSL-KDD. The data are classified into twenty-three categories before and after the expansion by binary classification validation. The results show that expanding the generated samples for unbalanced network traffic data improves the subsequent recognition effect significantly. Finally, five classification performance index verification experiments are conducted. The results prove that the proposed strategy performs better in accuracy, precision, recall rate, and F-value indexes, and is capable of obtaining a large number of features from limited samples and inferring the complete data distribution from fewer features. The model as a whole has stronger generalization ability and defense effect.
... A total of 98% of these datasets is regarded as normal, whereas the remaining 2% is categorized as attacks [11]. Folino et al. [12] suggested a novel deep-learning model based on ensemble learning for interpreting non-stationary datasets such as IDS logs. Being able to construct a better detection system is desirable, especially when utilizing ensemble classifiers. ...
Article
Full-text available
Attacks on networks are currently the most pressing issue confronting modern society. Network risks affect all networks, from small to large. An intrusion detection system must be present for detecting and mitigating hostile attacks inside networks. Machine Learning and Deep Learning are currently used in several sectors, particularly information security, to design efficient intrusion detection systems. These systems can quickly and accurately identify threats. However, because malicious threats emerge and evolve regularly, networks need an advanced security solution. Hence, building an intrusion detection system that is both effective and intelligent is one of the most cognizant research issues. There are several public datasets available for research on intrusion detection. Because of the complexity of attacks and the continually evolving attack detection methods, publicly available intrusion databases must be updated frequently. A convolutional recurrent neural network is employed in this study to construct a deep-learning-based hybrid intrusion detection system that detects attacks over a network. To boost the efficiency and predictability of the intrusion detection system, the convolutional neural network performs convolutions to collect local features, while a deep-layered recurrent neural network extracts the features in the proposed Hybrid Deep-Learning-Based Network Intrusion Detection System (HDLNIDS). Experiments are conducted using publicly accessible benchmark CICIDS-2018 data to determine the effectiveness of the proposed system. The findings of the research demonstrate that the proposed HDLNIDS outperforms current intrusion detection approaches with an average accuracy of 98.90% in detecting malicious attacks.
... The F-score is twice the product of precision and recall divided by their sum. The equation of the F1-score is presented in Equation (16). ...
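For reference, the F1-score mentioned here has the standard definition:

```latex
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```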
Article
Full-text available
Concept drift (CD) in data streaming scenarios such as networking intrusion detection systems (IDS) refers to the change in the statistical distribution of the data over time. There are five principal variants related to CD: incremental, gradual, recurrent, sudden, and blip. Genetic programming combiner (GPC) classification is an effective core candidate for data stream classification for IDS. However, its basic structure relies on the usage of traditional static machine learning models that receive one-time training, limiting its ability to handle CD. To address this issue, we propose an extended variant of the GPC using three main components. First, we replace existing classifiers with alternatives: online sequential extreme learning machine (OSELM), feature adaptive OSELM (FA-OSELM), and knowledge preservation OSELM (KP-OSELM). Second, we add two new components to the GPC, specifically, a data balancing and a classifier update. Third, the coordination between the sub-models produces three novel variants of the GPC: GPC-KOS for KP-OSELM; GPC-FOS for FA-OSELM; and GPC-OS for OSELM. This article presents the first data stream-based classification framework that provides novel strategies for handling CD variants. The experimental results demonstrate that both GPC-KOS and GPC-FOS outperform the traditional GPC and other state-of-the-art methods, and the transfer learning and memory features contribute to the effective handling of most types of CD. Moreover, the application of our incremental variants on real-world datasets (KDD Cup '99, CICIDS-2017, CSE-CIC-IDS-2018, and ISCX '12) demonstrates improved performance (GPC-FOS in connection with CSE-CIC-IDS-2018 and CICIDS-2017; GPC-KOS in connection with ISCX2012 and KDD Cup '99), with maximum accuracy rates of 100% and 98% by GPC-KOS and GPC-FOS, respectively. Additionally, our GPC variants do not show superior performance in handling blip drift.
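The OSELM family named above updates a random-hidden-layer network chunk by chunk via recursive least squares, which is what makes it suitable for streams with concept drift. A compact sketch of the standard OSELM update follows (not the GPC variants themselves; sizes, data, and the ridge term are illustrative assumptions).

```python
# Sketch of the standard OSELM update: a fixed random hidden layer plus a
# recursive least-squares update of the output weights per data chunk.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 20, 50, 2
W, b = rng.normal(size=(n_in, n_hidden)), rng.normal(size=n_hidden)
hid = lambda X: np.tanh(X @ W + b)          # fixed random hidden mapping

# Initial batch (X0, T0); T holds one-hot targets. Ridge term for stability.
X0 = rng.normal(size=(100, n_in))
T0 = np.eye(n_out)[rng.integers(0, n_out, 100)]
H0 = hid(X0)
P = np.linalg.inv(H0.T @ H0 + 1e-3 * np.eye(n_hidden))
beta = P @ H0.T @ T0                        # initial output weights

def oselm_update(P, beta, X, T):
    """Fold a new chunk into the model without revisiting old data."""
    H = hid(X)
    K = np.linalg.inv(np.eye(len(X)) + H @ P @ H.T)
    P = P - P @ H.T @ K @ H @ P
    beta = beta + P @ H.T @ (T - H @ beta)
    return P, beta
```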
... Another chunk-based learning scheme is presented in [21]. The authors use disjoint time-delimited chunks of the training dataset for training a series of DNN classifiers. ...
Preprint
Full-text available
The network security analyzers use intrusion detection systems (IDSes) to distinguish malicious traffic from benign traffic. Deep learning-based IDSes have been proposed to auto-extract high-level features and eliminate the time-consuming and costly signature extraction process. However, this new generation of IDSes still suffers from a number of challenges. One of the main issues of an IDS is facing traffic concept drift, which manifests itself as new (i.e., zero-day) attacks, in addition to the changing behavior of benign users/applications. Furthermore, a practical DL-based IDS needs to conform to a distributed architecture to handle big data challenges. We propose a framework for adapting DL-based models to changing attack/benign traffic behaviors, considering a more practical scenario (i.e., online adaptable IDSes). This framework employs continual deep anomaly detectors in addition to the federated learning approach to solve the above-mentioned challenges. Furthermore, the proposed framework implements sequential packet labeling for each flow, which provides an attack probability score for the flow by gradually observing each flow packet and updating its estimation. We evaluate the proposed framework by employing different deep models (including CNN-based and LSTM-based) over the CIC-IDS2017 and CSE-CIC-IDS2018 datasets. Through extensive evaluations and experiments, we show that the proposed distributed framework is well adapted to the traffic concept drift. More precisely, our results indicate that the CNN-based models are well suited for continually adapting to the traffic concept drift (i.e., achieving an average detection rate of above 95% while needing just 128 new flows for the updating phase), and the LSTM-based models are a good candidate for sequential packet labeling in practical online IDSes (i.e., detecting intrusions by just observing their first 15 packets).
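The sequential packet labeling idea can be sketched with a small LSTM that consumes one packet at a time and refines the flow's attack probability after each packet. The feature sizes and the early-exit threshold below are illustrative assumptions, not the paper's configuration.

```python
# Sketch: an LSTM observes a flow packet by packet and updates the flow's
# attack probability after each packet; sizes/threshold are illustrative.
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)

flow = torch.rand(1, 15, 8)     # 15 packets, 8 features each (placeholder)
state = None
for t in range(flow.size(1)):
    out, state = lstm(flow[:, t:t + 1, :], state)
    p_attack = torch.sigmoid(head(out[:, -1]))  # refined after each packet
    if p_attack.item() > 0.95:                  # optional early decision
        break
```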
... Also, Folino et al. [16] combined four base DNN classifiers, trained on disjoint chunks of the data instances' stream, with a meta classifier that uses both the base classifiers' predictions and the original instance features for training and the final prediction tasks. Using two datasets, the experimental results showed that the proposed ensemble model can act as a methodological basis robust and scalable enough for intelligent systems for the analysis of streaming IDS data. ...
... On the other hand, deep neural models that currently yield accurate decisions in several cybersecurity domains (e.g., [4]- [7]), perform as black-box models, while easier-to-explain models are becoming increasingly desirable in several domains. Several eXplainable Artificial Intelligence (XAI) techniques [8] have been recently explored to produce explanations of decisions of deep neural models also in cybersecurity applications [9]. ...
... On the other hand, the alternative is to get a great deal more knowledge about a single non-ensemble system. An ensemble system can be more efficient in terms of total accuracy improvement by spreading the same increase in computing, storage, or communication resources among two or more methods, rather than increasing the resource usage for a single method (Folino et al., 2021). ...
Article
Full-text available
In this paper, we have looked at how easy it is for users in an organisation to be given different roles, as well as how important it is to make sure that tasks are performed well using predictive analytical tools. As a result, an ensemble of classification and regression trees linked with a Neural Network was adopted for evaluating the effectiveness of role-based tasks associated with an organizational unit. A Human Resource Management System was designed and developed to obtain comprehensive information about employees' performance levels, as well as to ascertain their capabilities, skills, and the tasks they perform and how they perform them. Datasets were drawn from evaluation of the system and used for machine learning evaluation. Linear regression models, decision trees, and Genetic Algorithms have proven to be good at prediction in all cases. In this way, the research findings highlight the need to ensure that users' tasks are done in a timely way, as well as enhancing an organization's ability to assign individual duties.
... Despite having a high computational complexity, ensemble-based techniques achieve a high level of accuracy compared to the base models. An ensemble-based DNN framework for the continuous analysis of intrusion detection, with the ability to learn hierarchical data representations automatically, has been proposed in Folino, Folino, Guarascio, Pisani, and Pontieri (2021). Here, an ensemble is maintained over the log stream of an intrusion detection system, containing classifiers trained on disjoint chunks of the data instances plus a combiner model. ...
Article
Full-text available
Wireless Sensor Networks (WSNs) are a promising technology with enormous applications in almost every walk of life. One of the crucial applications of WSNs is intrusion detection and surveillance at border areas and in defense establishments. Border areas stretch over hundreds to thousands of miles, so it is not possible to patrol the entire border region. As a result, an enemy may enter from any point in the absence of surveillance and cause loss of lives or destroy military establishments. WSNs can be a feasible solution for the problem of intrusion detection and surveillance at border areas. Detection of an enemy at border areas and nearby critical areas such as military cantonments is a time-sensitive task, as a delay of a few seconds may have disastrous consequences. Therefore, it becomes imperative to design systems that are able to identify and detect the enemy as soon as it comes within the range of the deployed system. In this paper, we propose a deep learning architecture based on a fully connected feed-forward Artificial Neural Network (ANN) for the accurate prediction of the number of k-barriers for fast intrusion detection and prevention. We trained and evaluated the feed-forward ANN model using four potential features, namely the area of the circular region, the sensing range of the sensors, the transmission range of the sensors, and the number of sensors, for Gaussian and uniform sensor distributions. These features are extracted through Monte Carlo simulation. In doing so, we found that the model accurately predicts the number of k-barriers for both Gaussian and uniform sensor distributions, with correlation coefficient (R = 0.78) and Root Mean Square Error (RMSE = 41.15) for the former, and R = 0.79 and RMSE = 48.36 for the latter. Further, the proposed approach outperforms the other benchmark algorithms in terms of accuracy and computational time complexity.
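The regression setup described above maps four scalar features to a k-barrier count. A minimal sketch follows; the layer sizes and data are placeholders, not the paper's exact architecture.

```python
# Sketch: fully connected feed-forward ANN regressing the number of
# k-barriers from four features; sizes and data are placeholders.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(4, 32), nn.ReLU(),     # inputs: area, sensing range,
    nn.Linear(32, 32), nn.ReLU(),    # transmission range, sensor count
    nn.Linear(32, 1),                # output: predicted k-barrier count
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X = torch.rand(256, 4)               # placeholder Monte Carlo features
y = torch.rand(256, 1) * 100         # placeholder barrier counts
for _ in range(200):
    opt.zero_grad()
    loss_fn(net(X), y).backward()
    opt.step()
```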
... As a kind of combinational optimization learning method, ensemble learning can efficiently solve practical application problems [15]. Related studies have shown that simply training several neural networks and integrating their predictions can significantly improve the performance of neural networks [16][17]. However, ensemble learning has seldom been applied to diagnose lung-related diseases. ...
Article
Deep learning based analyses of computed tomography (CT) images contribute to the automated diagnosis of COVID-19, and ensemble learning may commonly provide a better solution. Here, we propose an ensemble learning method that integrates several component neural networks to jointly diagnose COVID-19. Two ensemble strategies are considered: combining the output scores of all component models with weights adjusted adaptively by cost-function back-propagation, and a voting strategy. A database containing 8347 CT slices of COVID-19, common pneumonia, and normal subjects was used as the training and testing sets. Results show that the novel method can reach a high accuracy of 99.37% (recall: 0.9981, precision: 0.9893), an increase of about 7% in comparison to single-component models. The average test accuracy is 95.62% (recall: 0.9587, precision: 0.9559), with a corresponding increase of 5.2%. Compared with several of the latest deep learning models on the identical test set, our method achieved an accuracy improvement of up to 10.88%. The proposed method may be a promising solution for the diagnosis of COVID-19.
... Intrusion detection is an important problem for both physical space [20][21][22] and cyberspace [23,24], for safety and security reasons. There are numerous previous studies attempting to address the difficulties of railway clearance intrusion detection. ...
Article
Full-text available
The efficiency and the effectiveness of railway intrusion detection are crucial to the safety of railway transportation. Most current methods of railway intrusion detection or obstacle detection are inappropriate for large-scale applications due to their high cost or limited coverage. In this study, we present a fast and low-cost solution to intrusion detection on high-speed railways. To address the heavy computational burden of current convolutional-neural-network-based detection methods, the proposed method is built around a novel neural network based on the SSD framework, which includes a feature extractor using an improved MobileNet and a lightweight and efficient feature fusion module. In addition, to improve the detection accuracy on small objects, feature map weights are introduced through a convolution operation to fuse features at different scales. TensorRT is employed to optimize and deploy the proposed network on a low-cost embedded GPU platform, the NVIDIA Jetson TX2, to enhance efficiency. The experimental results show that the proposed method achieved 89% mAP on the railway intrusion detection dataset, and the average processing time for a single frame was 38.6 ms on the Jetson TX2 module, which satisfies the needs of real-time processing.
... • Raw traffic: network traffic observed at an observation point, such as a line to which the probe is attached, an Ethernet-based LAN, or the ports of a switch or router [10]. • Flow (also called traffic flow (e.g., [10]), network connection (e.g., [11]), or internet stream (e.g., [12])): raw network traffic grouped according to the same properties, usually the 5-tuple: source and destination IP address, source and destination port number, and type of service. • Session: a bi-directional flow. ...
Article
Full-text available
The enormous growth of services and data transmitted over the internet, the bloodstream of modern civilization, has caused a remarkable increase in cyber attack threats. This fact has forced the development of methods of preventing attacks. Among them, an important and constantly growing role is that of machine learning (ML) approaches. Convolutional neural networks (CNN) belong to the hottest ML techniques that have gained popularity, thanks to the rapid growth of computing power available. Thus, it is no wonder that these techniques have started to also be applied in the network traffic classification domain. This has resulted in a constant increase in the number of scientific papers describing various approaches to CNN-based traffic analysis. This paper is a survey of them, prepared with particular emphasis on a crucial but often disregarded aspect of this topic—the data transformation schemes. Their importance is a consequence of the fact that network traffic data and machine learning data have totally different structures. The former is a time series of values—consecutive bytes of the datastream. The latter, in turn, are one-, two- or even three-dimensional data samples of fixed lengths/sizes. In this paper, we introduce a taxonomy of data transformation schemes. Next, we use this categorization to describe various CNN-based analytical approaches found in the literature.
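A typical transformation scheme of the kind this survey categorizes turns the first bytes of a flow into a fixed-size two-dimensional sample; the sketch below is a generic example with assumed sizes, not a scheme taken from the survey.

```python
# Hedged sketch of a common data transformation: truncate or zero-pad the
# first N bytes of a flow and reshape them into a square "image" for a CNN.
# The 28x28 size and the sample bytes are illustrative assumptions.
import numpy as np

def bytes_to_image(raw: bytes, side: int = 28) -> np.ndarray:
    n = side * side
    buf = np.frombuffer(raw[:n], dtype=np.uint8)   # truncate long flows
    buf = np.pad(buf, (0, n - buf.size))           # zero-pad short flows
    return buf.reshape(side, side) / 255.0         # normalize to [0, 1]

sample = bytes_to_image(b"\x16\x03\x01" * 300)     # fake TLS-like bytes
print(sample.shape)                                # (28, 28), CNN-ready
```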
... The authors found that XGBoost outperforms SVM-, NB-, and RF-based IDSs. The authors of [27] and [28] used deep-learning-based NIDS models, which are more resource-consuming and complex network intrusion detection systems. ...
... Misuse intrusion detection is also called feature-based intrusion detection. The basic principle is to collect a large number of network intrusion characteristics and establish a network intrusion signature database by building a misuse detection model [6]. The detection process can be simply understood as comparing the status of the monitored network data with the established network intrusion signature database to determine whether the current network behavior is abnormal. ...
Article
Full-text available
Intrusion Detection System (IDS) is an important part of ensuring network security. When the system faces network attacks, it can identify the source of threats in a timely and accurate manner and adjust strategies to prevent hackers from intruding. An efficient IDS can identify external threats well, but traditional IDSs have poor performance and low recognition accuracy. To improve the detection rate and accuracy of IDS, this paper proposes a novel ACGA-BPNN method based on an adaptive clonal genetic algorithm (ACGA) and a backpropagation neural network (BPNN). ACGA-BPNN is simulated on the KDD-CUP'99 and UNSW-NB15 data sets. The simulation results indicate that, in contrast to the methods based on simulated annealing (SA) and genetic algorithms (GA), the detection rate and accuracy of ACGA-BPNN are much higher than those of GA-BPNN and SA-BPNN. In the classification results on KDD-CUP'99, the classification accuracy of ACGA-BPNN is 11% higher than that of GA-BPNN and 24.2% higher than that of SA-BPNN, and the F-score reaches 99.0%. In addition, ACGA-BPNN has good global search ability, and its convergence speed is higher than that of GA-BPNN and SA-BPNN. Furthermore, ACGA-BPNN significantly improves the overall detection performance of IDS.
Article
Full-text available
As cyber threats evolve in complexity, traditional cybersecurity measures struggle to keep pace, often failing to detect sophisticated attacks. To address these challenges, this paper introduces a robust machine learning-based Intrusion Detection System (IDS) that integrates advanced deep learning models. By leveraging hybrid architectures, such as the combination of Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, the system enhances detection accuracy by capturing both spatial and temporal patterns in network traffic. The hybrid approach enables the model to analyze and classify real-time network anomalies and threats with high precision, reducing false positives and improving overall reliability. This research demonstrates the effectiveness of the proposed system by evaluating its performance against traditional methods, with hybrid CNN-LSTM and DCNN-LSTM models delivering superior results. The system is trained on a comprehensive dataset that includes normal behavior and diverse cyber threats, enabling it to detect both known and novel attacks. The results highlight the hybrid model's potential in not only enhancing intrusion detection but also minimizing false positives, ultimately providing a scalable, accurate, and adaptive solution for securing modern digital infrastructures against emerging cyber threats.
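A minimal version of the hybrid CNN-LSTM idea can be sketched as follows; layer widths, window length, and feature count are assumptions for illustration, not the architecture evaluated in the paper.

```python
# Hedged CNN-LSTM sketch: 1-D convolutions capture spatial patterns across
# features, an LSTM captures temporal patterns across packets in a window.
import torch
import torch.nn as nn

class CnnLstmIDS(nn.Module):
    def __init__(self, n_features=40, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=3, padding=1), nn.ReLU())
        self.lstm = nn.LSTM(64, 32, batch_first=True)
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):            # x: (batch, time, features)
        z = self.conv(x.transpose(1, 2)).transpose(1, 2)
        _, (h, _) = self.lstm(z)     # last hidden state summarizes the window
        return self.head(h[-1])

logits = CnnLstmIDS()(torch.randn(8, 20, 40))
print(logits.shape)                  # (8, 2) class logits
```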
Article
Full-text available
The Internet of Things (IoT) refers to a vast and interconnected network comprising smart objects with comprehensive capabilities. Unfortunately, the awareness-layer nodes of IoT devices are vulnerable to network intrusion. Therefore, it is crucial to detect new types of intrusions in the IoT environment. Current IoT intrusion detection models are also trained on samples with a balanced distribution, whereas the distribution of intercepted network samples is unbalanced in some specific scenarios. In addition, malicious traffic easily interferes with the IoT environment. As a result, detection efficiency and accuracy decrease. In this study, we propose a multi-constraint transfer approach with additional auxiliary domains for IoT intrusion detection under unbalanced sample distributions. First, we construct a high-precision, high-efficiency feature extractor using PointNet++ as a framework to perform attack feature extraction. We then design a multi-constraint transfer approach with additional auxiliary domains. In addition, we design a multi-scale and multi-level sample-augmented discriminator to complete the final IoT intrusion detection under unbalanced sample distributions. Finally, we validate our approach on four intrusion datasets from IoT networks, and it demonstrates excellent performance. In the comparison with all other approaches, the detection accuracy of our approach is the highest under all four unbalanced sample combinations, and the average accuracy is 96.398% on the four datasets. One of the biggest advantages of this approach is its very good convergence, efficiency and detection stability in the presence of noise. In particular, it can be used effectively for intrusion detection in real IoT environments.
Article
The rapid growth of Internet of Things (IoT) applications has raised concerns about the security of IoT communication systems, particularly due to a surge in malicious attacks leading to network disruptions and system failures. This study introduces a novel solution, the Hyper-Parameter Optimized Progressive Neural Network (HOPNET) model, designed to effectively detect intrusions in IoT communication networks. Validation using the Nsl-Kdd dataset involves meticulous data preprocessing for error rectification and feature extraction across diverse attack categories. Implemented on the Java platform, the HOPNET model undergoes comprehensive evaluation through comparative analysis with established intrusion detection methods. Results demonstrate the superiority of the HOPNET model, with improved attack prediction scores and significantly reduced processing times, highlighting the importance of advanced intrusion detection methods for enhancing IoT communication security. The HOPNET model contributes by establishing robust defense against evolving cyber threats, ensuring a safer IoT ecosystem, and paving the way for proactive security measures as the IoT landscape continues to evolve.
Article
Rapid technological advances and network progress have occurred in recent decades, as has the global growth of services via the Internet. Consequently, piracy has become more prevalent, and many modern systems have been infiltrated, making it vital to build information security tools to identify new threats. An intrusion detection system (IDS) is a critical information security technology that detects network fluctuations with the help of machine learning (ML) and deep learning (DL) approaches. However, conventional techniques struggle to deal with advanced attacks. So, this paper proposes an efficient DL approach for network intrusion detection (NID) using an optimal weight-based deep neural network (OWDNN). The network traffic data was initially collected from three openly available datasets: NSL-KDD, CSE-CIC-IDS2018 and UNSW-NB15. Then preprocessing was carried out on the collected data based on missing-value imputation, one-hot encoding, and normalization. After that, a data under-sampling process is performed using the butterfly-optimized k-means clustering (BOKMC) algorithm to balance the unbalanced dataset. The relevant features from the balanced dataset are selected using the inception version 3 with multi-head attention (IV3MHA) mechanism to reduce the computational burden of the classifier. After that, the dimensionality of the selected features is reduced based on principal component analysis (PCA). Finally, the classification is done using OWDNN, which classifies the network traffic as normal or anomalous. Experiments on the NSL-KDD, CSE-CIC-IDS2018 and UNSW-NB15 datasets show that the OWDNN performs better than the other intrusion detection methods.
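Two steps of this pipeline, cluster-based under-sampling and PCA, can be sketched as below; plain k-means stands in for the butterfly-optimized variant, and all sizes are illustrative.

```python
# Hedged sketch: cluster-based under-sampling of the majority class followed
# by PCA. Plain KMeans stands in for the butterfly-optimized BOKMC variant;
# data and sizes are synthetic and illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_major = rng.normal(size=(1000, 30))             # normal traffic (majority)
X_minor = rng.normal(1.5, 1.0, size=(100, 30))    # attacks (minority)

k = len(X_minor)                                  # shrink majority to minority size
centers = KMeans(n_clusters=k, n_init=10,
                 random_state=0).fit(X_major).cluster_centers_
X_bal = np.vstack([centers, X_minor])             # balanced 100 + 100 samples
y_bal = np.array([0] * k + [1] * len(X_minor))

X_red = PCA(n_components=10).fit_transform(X_bal) # dimensionality reduction
print(X_red.shape, y_bal.mean())                  # (200, 10), class ratio 0.5
```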
Article
The mission of an intrusion detection system (IDS) is to monitor network activities and assess whether or not they are malevolent. Specifically, anomaly-based IDS can discover irregular activities by discriminating between normal and anomalous deviations. Nonetheless, existing strategies for detecting anomalies generally rely on single classification models that are still incapable of reducing the false alarm rate and increasing the detection rate. This study introduces a dual ensemble model by combining two existing ensemble techniques, such as bagging and gradient boosting decision tree (GBDT). Multiple dual ensemble schemes involving various fine-tuned GBDT algorithms such as gradient boosting machine (GBM), LightGBM, CatBoost, and XGBoost, are extensively appraised using multiple publicly available data sets, such as NSL-KDD, UNSW-NB15, and HIKARI-2021. The results indicate that the proposed technique is a reasonable solution for the anomaly-based IDS task. Furthermore, we demonstrate that the combination of Bagging and GBM is superior to all alternative combination schemes. In addition, the proposed dual ensemble (e.g., Bagging-GBM) is considerably more competitive than similar techniques reported in the current literature.
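The Bagging-GBM combination reported as strongest can be approximated in scikit-learn as below; the library's GradientBoostingClassifier stands in for the fine-tuned GBM/LightGBM/CatBoost/XGBoost variants used in the study.

```python
# Hedged sketch of the dual ensemble: bagging whose base learner is itself a
# gradient boosting ensemble. Note: the keyword is `base_estimator` in
# scikit-learn < 1.2. Data and hyper-parameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
dual = BaggingClassifier(
    estimator=GradientBoostingClassifier(n_estimators=50),
    n_estimators=10, random_state=0)              # a bag of boosted trees
print(cross_val_score(dual, X, y, cv=3, scoring="f1").mean())
```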
Article
Malicious traffic detection is one of the most important parts of cyber security. Approaches that use the flow as the detection object are recognized as effective. Benefitting from the development of deep learning techniques, raw traffic can be directly used as a feature to detect malicious traffic. Most existing work converts raw traffic into images or long sequences to express a flow and then uses deep learning technology to extract features and classify them, but the generated features contain much redundant or even useless information, especially for encrypted traffic. The packet header fields contain most of the packet characteristics except the payload content, and they are also an important element of the flow. In this paper, we only use the fields of the packet header in the raw traffic to construct the characteristic representation of the traffic, and we propose a novel flow-vector generation approach for malicious traffic detection. The preprocessed header fields are embedded as field vectors, and then a two-layer attention network is used to progressively generate the packet vectors and the flow vector containing context information. The flow vector is regarded as the abstraction of the raw traffic and is used for classification. The experimental results illustrate that the accuracy rate can reach up to 99.48% in the binary classification task and the average AUC-ROC can reach 0.9988 in the multi-classification task.
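The attention-based pooling at the core of this flow-vector idea can be sketched as follows; embedding size, field count, and the single-layer scoring function are assumptions, and only the field-to-packet level of the two-layer network is shown.

```python
# Hedged attention-pooling sketch: field vectors are weighted by learned
# attention scores and summed into a packet vector. Applying the same
# mechanism over packet vectors would yield the flow vector.
import torch
import torch.nn as nn

fields = torch.randn(1, 12, 16)              # 12 header fields, 16-d embeddings
score = nn.Linear(16, 1)                     # assumed simple scoring layer
alpha = torch.softmax(score(fields), dim=1)  # attention weights over fields
packet_vec = (alpha * fields).sum(dim=1)     # (1, 16) packet representation
print(alpha.squeeze(-1).shape, packet_vec.shape)
```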
Article
The goal of this systematic and broad survey is to present and discuss the main challenges that are posed by the implementation of Artificial Intelligence and Machine Learning in the form of Artificial Neural Networks in Cybersecurity, specifically in Intrusion Detection Systems. Based on the results of the state-of-the-art analysis with a number of bibliographic methods, as well as their own implementations, the authors provide a survey of the answers to the posed problems as well as effective, experimentally-found solutions to those key issues. The issues include hyperparameter tuning, dataset balancing, increasing the effectiveness of an ANN, securing the networks from adversarial attacks, and a range of non-technical challenges of applying ANNs for IDS, such as societal, ethical and legal dilemmas, and the question of explainability. Thus, it is a systematic review and a summary of the body of knowledge amassed around implementations of Artificial Neural Networks in Network Intrusion Detection, guided by an actual, real-world implementation.
Article
Nowadays, several kinds of attacks exist in cyberspace, and hence comprehensive research has been carried out to overcome these threats. One method to provide security in WSNs (Wireless Sensor Networks) is the Intrusion Detection System. However, the detection of unknown attacks remains a major challenge for intrusion detection systems. Hence, the usage of deep learning methodologies remains an active area of cyber security research. However, prevailing deep learning algorithms possess limitations such as comparatively low accuracy and a heavy dependence on manual feature selection. These problems have been analysed and, correspondingly, the present study proposes an enhanced empirical-based component analysis to select relevant features. This proposed feature selection model integrates the advantages of both empirical mode decomposition and principal component analysis to retain most of the relevant features. The classification of the attack node with the selected features was performed with LSTM (Long Short Term Memory). The proposed framework was validated on the NSL-KDD, CICIDS 2017, UNSW NB 2015, and KDD99 datasets and compared with state-of-the-art methods. The comparative analysis with the prevailing methods proved the effectiveness of the presented system in terms of performance metrics such as accuracy, F1 score, Recall, FPR, FAR, etc.
Article
Network Intrusion Detection (NID) systems are one of the most powerful forms of defense for protecting public and private networks. Most of the prominent methods applied to NID problems consist of Deep Learning methods that have achieved outstanding accuracy performance. However, even though they are effective, these systems are still too complex to interpret and explain. In recent years this lack of interpretability and explainability has begun to be a major drawback of deep neural models, even in NID applications. With the aim of filling this gap, we propose ROULETTE: a method based on a new neural model with attention for an accurate, explainable multi-class classification of network traffic data. In particular, attention is coupled with a multi-output Deep Learning strategy that helps to discriminate better between network intrusion categories. We report the results of extensive experimentation on two benchmark datasets, namely NSL-KDD and UNSW-NB15, which show the beneficial effects of the proposed attention mechanism and multi-output learning strategy on both the accuracy and explainability of the decisions made by the method.
Article
Internet of things (IoT) security is a prerequisite for the rapid development of the IoT to enhance human well-being. Machine learning-based intrusion detection systems (IDS) have good protection capabilities. However, it is difficult to identify attack information in massive amounts of data, which leads to inefficient model detection when samples for certain types of attacks are insufficient. In this regard, this paper fuses deep learning methods and statistical ideas to address the problem of minority-sample attack detection, and proposes an intrusion detection method for the IoT based on an Improved Conditional Variational Autoencoder (ICVAE) and the Borderline Synthetic Minority Oversampling Technique (BSM), called ICVAE-BSM. An auxiliary network is introduced into the Conditional Variational Autoencoder (CVAE) to adjust the output probability distribution of the encoder and to learn the posterior distribution of the different classes of samples, so that the distributions of samples of the same class are concentrated and the distributions of different classes of samples are scattered in the latent space. Then, based on BSM, edge latent variables are adaptively synthesized in the latent space of the ICVAE, and the new synthetic edge latent variables are fed to the ICVAE's decoder to generate representative new samples that balance the data set. Finally, the output of the encoder is connected to a Softmax classifier, and the original samples are mixed with the generated samples to fine-tune it and enhance its generalization ability for intrusion detection on minority samples. We use the NSL-KDD, CIC-IDS2017 and CSE-CIC-IDS2018 data sets to simulate and evaluate the model; the experimental results show that the proposed method can more effectively improve the accuracy of IoT attack detection under the condition of unbalanced samples.
Article
Full-text available
Intrusion detection tools have largely benefitted from the usage of supervised classification methods developed in the field of data mining. However, the data produced by modern system/network logs pose many problems, such as the streaming and non-stationary nature of such data, their volume and velocity, and the presence of imbalanced classes. Classifier ensembles look like a valid solution for this scenario, owing to their flexibility and scalability. In particular, data-driven schemes for combining the predictions of multiple classifiers have been shown to be superior to traditional fixed aggregation criteria (e.g., predictions' averaging and weighted voting). In intrusion detection settings, however, such schemes must be devised in an efficient way, since (part of) the ensemble may need to be re-trained frequently. A novel ensemble-based framework is proposed here for online intrusion detection, where the ensemble is updated through an incremental stream-oriented learning scheme upon the detection of concept drifts. Differently from mainstream ensemble-based approaches in the field, our proposal relies on deriving, through an efficient genetic programming (GP) method, an expressive kind of combiner function defined in terms of (non-trainable) aggregation functions. This approach is supported by a system architecture, which integrates different kinds of functionalities, ranging from drift detection, to the induction and replacement of base classifiers, up to the distributed computation of GP-based combiners. Experiments on both artificial and real-life datasets confirmed the validity of the approach.
Article
Full-text available
Intrusion detection system (IDS) is one of the most extensively used techniques in a network topology to safeguard the integrity and availability of sensitive assets in the protected systems. Although many supervised and unsupervised learning approaches from the field of machine learning have been used to increase the efficacy of IDSs, it is still a problem for existing intrusion detection algorithms to achieve good performance. First, the large amount of redundant and irrelevant data in high-dimensional datasets interferes with the classification process of an IDS. Second, an individual classifier may not perform well in the detection of each type of attack. Third, many models are built for stale datasets, making them less adaptable to novel attacks. Thus, in this paper we propose a new intrusion detection framework based on feature selection and ensemble learning techniques. In the first step, a heuristic algorithm called CFS-BA is proposed for dimensionality reduction, which selects the optimal subset based on the correlation between features. Then, we introduce an ensemble approach that combines the C4.5, Random Forest (RF), and Forest by Penalizing Attributes (Forest PA) algorithms. Finally, a voting technique is used to combine the probability distributions of the base learners for attack recognition. The experimental results, using the NSL-KDD, AWID, and CIC-IDS2017 datasets, reveal that the proposed CFS-BA-Ensemble method is able to exhibit better performance than other related and state-of-the-art approaches under several metrics.
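The soft-voting stage of such an ensemble can be reproduced with scikit-learn as below; CART and ExtraTrees stand in for C4.5 and Forest PA, which the library lacks, and the CFS-BA selection step is omitted.

```python
# Hedged sketch of probability-averaging ("soft") voting over three tree
# learners, in the spirit of the CFS-BA-Ensemble. Data is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1500, n_features=25, random_state=0)
vote = VotingClassifier(
    estimators=[("dt", DecisionTreeClassifier()),
                ("rf", RandomForestClassifier(n_estimators=100)),
                ("et", ExtraTreesClassifier(n_estimators=100))],
    voting="soft")             # average the class probability distributions
print(cross_val_score(vote, X, y, cv=3).mean())
```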
Article
Full-text available
In recent years, advanced threat attacks have been increasing, but traditional network intrusion detection systems based on feature filtering have drawbacks that make it difficult to find new attacks in time. This paper takes the NSL-KDD data set as the research object, analyses the latest progress and existing problems in the field of intrusion detection technology, and proposes an adaptive ensemble learning model. By adjusting the proportion of training data and setting up multiple decision trees, we construct a MultiTree algorithm. In order to improve the overall detection effect, we choose several base classifiers, including decision tree, random forest, kNN and DNN, and design an ensemble adaptive voting algorithm. We use NSL-KDD Test+ to verify our approach: the accuracy of the MultiTree algorithm is 84.2%, while the final accuracy of the adaptive voting algorithm reaches 85.2%. Comparison with other research papers shows that our ensemble model effectively improves detection accuracy. In addition, through the analysis of the data, it is found that the quality of data features is an important factor in determining the detection effect. In the future, we should optimize the feature selection and preprocessing of intrusion detection data to achieve better results.
Article
Full-text available
Many Intrusion Detection Systems (IDS) have been proposed in the current decade. To evaluate the effectiveness of IDSs, the Canadian Institute of Cybersecurity presented a state-of-the-art dataset named CICIDS2017, consisting of the latest threats and features. The dataset has drawn the attention of many researchers, as it represents threats which were not addressed by older datasets. While undertaking experimental research on CICIDS2017, it was found that the dataset has a few major shortcomings. These issues are sufficient to bias the detection engine of any typical IDS. This paper explores the detailed characteristics of the CICIDS2017 dataset and outlines the issues inherent to it. Finally, it also presents a combined dataset, built by eliminating such issues, for better classification and detection by any future intrusion detection engine.
Article
Full-text available
Identification of network attacks is a matter of great concern for network operators due to the extensive number of vulnerabilities in computer systems and the creativity of attackers. Anomaly-based Intrusion Detection Systems (IDSs) present a significant opportunity to identify possible incidents, log information and report attempts. However, these systems suffer from a low detection accuracy rate when network environments or services change. To overcome this problem, we present a deep neural network architecture based on the combination of a stacked denoising autoencoder and a softmax classifier. Our architecture can extract important features from data and learn a model for detecting abnormal behaviors. The model is trained locally to denoise corrupted versions of its inputs, based on stacking layers of denoising autoencoders, in order to achieve reliable intrusion detection. Experimental results on the real KDD-CUP'99 dataset show that our architecture outperformed shallow learning architectures and other deep neural network architectures.
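The denoising pre-training plus softmax classification described here can be condensed into a short PyTorch sketch; one denoising layer is shown (stacking repeats the step), and all sizes are assumptions.

```python
# Hedged sketch: train an autoencoder to reconstruct clean inputs from
# noise-corrupted ones, then reuse its encoder under a softmax head.
# Feature count (41), noise level, and layer sizes are illustrative.
import torch
import torch.nn as nn

x = torch.rand(256, 41)                       # e.g. 41 KDD-style features
enc, dec = nn.Linear(41, 16), nn.Linear(16, 41)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
for _ in range(200):                          # denoising pre-training
    noisy = x + 0.1 * torch.randn_like(x)     # corrupt the input
    loss = nn.functional.mse_loss(dec(torch.relu(enc(noisy))), x)
    opt.zero_grad(); loss.backward(); opt.step()

classifier = nn.Sequential(enc, nn.ReLU(), nn.Linear(16, 5))  # softmax head
print(classifier(x).shape)                    # (256, 5) class logits
```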
Article
Full-text available
Humans and animals have the ability to continually acquire and fine-tune knowledge throughout their lifespan. This ability, referred to as lifelong learning, is mediated by a rich set of neurocognitive mechanisms that together contribute to the development and specialization of our sensorimotor skills as well as to long-term memory consolidation and retrieval. Consequently, lifelong learning capabilities are crucial for computational learning systems and autonomous agents interacting in the real world and processing continuous streams of information. However, lifelong learning remains a long-standing challenge for machine learning and neural network models since the continual acquisition of incrementally available information from non-stationary data distributions generally leads to catastrophic forgetting or interference. This limitation represents a major drawback also for state-of-the-art deep and shallow neural network models that typically learn representations from stationary batches of training data, thus without accounting for situations in which the number of tasks is not known a priori and the information becomes incrementally available over time. In this review, we critically summarize the main challenges linked to lifelong learning for artificial learning systems and compare existing neural network approaches that alleviate, to different extents, catastrophic forgetting. Although significant advances have been made in domain-specific learning with neural networks, extensive research efforts are required for the development of robust lifelong learning on autonomous agents and robots. We discuss well-established and emerging research motivated by lifelong learning factors in biological systems such as neurosynaptic plasticity, critical developmental stages, multi-task transfer learning, intrinsically motivated exploration, and crossmodal learning.
Article
Full-text available
The evaluation of algorithms and techniques to implement intrusion detection systems heavily relies on the existence of well designed datasets. In recent years, a lot of effort has been put into building these datasets. Yet, there is still room for improvement. In this paper, a comprehensive review of existing datasets is first done, placing emphasis on their main shortcomings. Then, we present a new dataset that is built with real traffic and up-to-date attacks. The main advantage of this dataset over previous ones is its usefulness for evaluating IDSs that consider long-term evolution and traffic periodicity. Models that consider differences between daytime/night or weekdays/weekends can also be trained and evaluated with it. We discuss all the requirements for a modern IDS evaluation dataset and analyze how the one presented here meets the different needs.
Article
Full-text available
A model of an intrusion-detection system capable of detecting attacks in computer networks is described. The model is based on a deep learning approach to learn the best features of network connections, and on a Memetic algorithm as the final classifier for the detection of abnormal traffic. One of the problems in intrusion detection systems is the large scale of features, which makes typical data mining methods ineffective in this area. Deep learning algorithms have succeeded in image and video mining, which involve high feature dimensionality, so using them to solve the large-scale feature problem of intrusion detection systems seems possible. The model offered in this paper tries to use deep learning to detect the best features. An evaluation algorithm is used to produce a final classifier that works well in multi-density environments. We use the NSL-KDD and KDD99 datasets to evaluate our model; our findings show a 98.11% detection rate. The NSL-KDD estimation shows the proposed model has succeeded in classifying 92.72% of the R2L attack group.
Article
Full-text available
Ensemble-based methods are among the most widely used techniques for data stream classification. Their popularity is attributable to their good performance in comparison to strong single learners while being relatively easy to deploy in real-world applications. Ensemble algorithms are especially useful for data stream learning as they can be integrated with drift detection algorithms and incorporate dynamic updates, such as selective removal or addition of classifiers. This work proposes a taxonomy for data stream ensemble learning as derived from reviewing over 60 algorithms. Important aspects such as combination, diversity, and dynamic updates, are thoroughly discussed. Additional contributions include a listing of popular open-source tools and a discussion about current data stream research challenges and how they relate to ensemble learning (big data streams, concept evolution, feature drifts, temporal dependencies, and others).
Article
Full-text available
In this work, we introduce a novel interpretation of residual networks showing they are exponential ensembles. This observation is supported by a large-scale lesion study that demonstrates they behave just like ensembles at test time. Subsequently, we perform an analysis showing these ensembles mostly consist of networks that are each relatively shallow. For example, contrary to our expectations, most of the gradient in a residual network with 110 layers comes from an ensemble of very short networks, i.e., only 10-34 layers deep. This suggests that in addition to describing neural networks in terms of width and depth, there is a third dimension: multiplicity, the size of the implicit ensemble. Ultimately, residual networks do not resolve the vanishing gradient problem by preserving gradient flow throughout the entire depth of the network - rather, they avoid the problem simply by ensembling many short networks together. This insight reveals that depth is still an open research question and invites the exploration of the related notion of multiplicity.
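The "implicit ensemble" view can be checked numerically: two stacked residual blocks y = x + f(x) unroll into 2^2 = 4 paths whose contributions sum to the network output, as the toy linear example below shows.

```python
# Tiny numeric illustration of the unrolled-paths view of residual networks.
# The "blocks" are random linear maps purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))  # linear "blocks"
x = rng.normal(size=4)

out = (np.eye(4) + f2) @ ((np.eye(4) + f1) @ x)            # stacked residuals
paths = x + f1 @ x + f2 @ x + f2 @ (f1 @ x)                # the 4 unrolled paths
print(np.allclose(out, paths))                             # True
```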
Conference Paper
Full-text available
A Network Intrusion Detection System (NIDS) helps system administrators to detect network security breaches in their organizations. However, many challenges arise while developing a flexible and efficient NIDS for unforeseen and unpredictable attacks. We propose a deep learning based approach for developing such an efficient and flexible NIDS. We use Self-taught Learning (STL), a deep learning based technique, on NSL-KDD, a benchmark dataset for network intrusion. We present the performance of our approach and compare it with previous works. Compared metrics include accuracy, precision, recall, and f-measure values.
Article
Full-text available
The prevalence of mobile phones, the internet-of-things technology, and networks of sensors has led to an enormous and ever increasing amount of data that are now more commonly available in a streaming fashion [1]-[5]. Often, it is assumed - either implicitly or explicitly - that the process generating such a stream of data is stationary, that is, the data are drawn from a fixed, albeit unknown probability distribution. In many real-world scenarios, however, such an assumption is simply not true, and the underlying process generating the data stream is characterized by an intrinsic nonstationary (or evolving or drifting) phenomenon. The nonstationarity can be due, for example, to seasonality or periodicity effects, changes in the users' habits or preferences, hardware or software faults affecting a cyber-physical system, thermal drifts or aging effects in sensors. In such nonstationary environments, where the probabilistic properties of the data change over time, a non-adaptive model trained under the false stationarity assumption is bound to become obsolete in time, and perform sub-optimally at best, or fail catastrophically at worst.
Article
Full-text available
Anomaly detection in communication networks provides the basis for the uncovering of novel attacks, misconfigurations and network failures. Resource constraints for data storage, transmission and processing make it beneficial to restrict input data to features that are (a) highly relevant for the detection task and (b) easily derivable from network observations without expensive operations. Removing strong correlated, redundant and irrelevant features also improves the detection quality for many algorithms that are based on learning techniques. In this paper we address the feature selection problem for network traffic based anomaly detection. We propose a multi-stage feature selection method using filters and stepwise regression wrappers. Our analysis is based on 41 widely-adopted traffic features that are presented in several commonly used traffic data sets. With our combined feature selection method we could reduce the original feature vectors from 41 to only 16 features. We tested our results with five fundamentally different classifiers, observing no significant reduction of the detection performance. In order to quantify the practical benefits of our results, we analyzed the costs for generating individual features from standard IP Flow Information Export records, available at many routers. We show that we can eliminate 13 very costly features and thus reducing the computational effort for on-line feature generation from live traffic observations at network nodes.
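A simplified version of the first (filter) stage of such a multi-stage selection is sketched below: it drops one feature from every highly correlated pair; the 0.95 threshold and the injected redundant feature are assumptions.

```python
# Hedged sketch of a correlation filter over 41 traffic features: keep a
# feature only if it is not strongly correlated with an already-kept one.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 41))              # 41 traffic features
X[:, 5] = 0.98 * X[:, 3] + 0.02 * rng.normal(size=1000)  # inject redundancy

corr = np.abs(np.corrcoef(X, rowvar=False))  # 41 x 41 correlation matrix
keep = []
for j in range(X.shape[1]):
    if all(corr[j, k] < 0.95 for k in keep): # keep j unless redundant
        keep.append(j)
print(len(keep), "kept; dropped:", sorted(set(range(41)) - set(keep)))
```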
Conference Paper
Full-text available
Transfer Learning is a paradigm in machine learning to solve a target problem by reusing the learning with minor modifications from a different but related source problem. In this paper we propose a novel feature transference approach, especially when the source and the target problems are drawn from different distributions. We use deep neural networks to transfer either low or middle or higher-layer features for a machine trained in either unsupervised or supervised way. Applying this feature transference approach on Convolutional Neural Network and Stacked Denoising Autoencoder on four different datasets, we achieve lower classification error rate with significant reduction in computation time with lower-layer features trained in supervised way and higher-layer features trained in unsupervised way for classifying images of uppercase and lowercase letters dataset.
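The basic mechanics of transferring lower-layer features can be sketched in PyTorch as below; network shapes are illustrative, and only the "copy and freeze the lower layer" step is shown.

```python
# Hedged sketch: copy the lower layer of a source network into a target
# network and freeze it, so only the upper layers are retrained.
import torch.nn as nn

source = nn.Sequential(nn.Linear(41, 32), nn.ReLU(), nn.Linear(32, 5))
target = nn.Sequential(nn.Linear(41, 32), nn.ReLU(), nn.Linear(32, 3))

target[0].load_state_dict(source[0].state_dict())  # transfer lower layer
for p in target[0].parameters():
    p.requires_grad = False                        # freeze transferred features
print(sum(p.requires_grad for p in target.parameters()), "tensors left trainable")
```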
Article
Full-text available
Mixture of experts (ME) is one of the most popular and interesting combining methods, which has great potential to improve performance in machine learning. ME is established based on the divide-and-conquer principle in which the problem space is divided between a few neural network experts, supervised by a gating network. In earlier works on ME, different strategies were developed to divide the problem space between the experts. To survey and analyse these methods more clearly, we present a categorisation of the ME literature based on this difference. Various ME implementations were classified into two groups, according to the partitioning strategies used and both how and when the gating network is involved in the partitioning and combining procedures. In the first group, The conventional ME and the extensions of this method stochastically partition the problem space into a number of subspaces using a special employed error function, and experts become specialised in each subspace. In the second group, the problem space is explicitly partitioned by the clustering method before the experts’ training process starts, and each expert is then assigned to one of these sub-spaces. Based on the implicit problem space partitioning using a tacit competitive process between the experts, we call the first group the mixture of implicitly localised experts (MILE), and the second group is called mixture of explicitly localised experts (MELE), as it uses pre-specified clusters. The properties of both groups are investigated in comparison with each other. Investigation of MILE versus MELE, discussing the advantages and disadvantages of each group, showed that the two approaches have complementary features. Moreover, the features of the ME method are compared with other popular combining methods, including boosting and negative correlation learning methods. As the investigated methods have complementary strengths and limitations, previous researches that attempted to combine their features in integrated approaches are reviewed and, moreover, some suggestions are proposed for future research directions.
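The conventional ME combination rule is compact enough to write out: a gating network produces input-dependent softmax weights over the experts' outputs. In the sketch below, both experts and gate are untrained random linear maps, purely for illustration.

```python
# Hedged mixture-of-experts sketch: gated, input-dependent combination.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
experts = [rng.normal(size=(3, 8)) for _ in range(4)]   # 4 experts, 3 outputs
gate = rng.normal(size=(4, 8))                          # gating network

x = rng.normal(size=8)
g = softmax(gate @ x)                                   # weights sum to 1
y = sum(w * (E @ x) for w, E in zip(g, experts))        # gated combination
print(g.round(2), y.round(2))
```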
Article
Full-text available
We introduce an ensemble of classifiers-based approach for incremental learning of concept drift, characterized by nonstationary environments (NSEs), where the underlying data distributions change over time. The proposed algorithm, named Learn++.NSE, learns from consecutive batches of data without making any assumptions on the nature or rate of drift; it can learn from such environments that experience constant or variable rate of drift, addition or deletion of concept classes, as well as cyclical drift. The algorithm learns incrementally, as other members of the Learn++ family of algorithms, that is, without requiring access to previously seen data. Learn++.NSE trains one new classifier for each batch of data it receives, and combines these classifiers using a dynamically weighted majority voting. The novelty of the approach is in determining the voting weights, based on each classifier's time-adjusted accuracy on current and past environments. This approach allows the algorithm to recognize, and act accordingly, to the changes in underlying data distributions, as well as to a possible reoccurrence of an earlier distribution. We evaluate the algorithm on several synthetic datasets designed to simulate a variety of nonstationary environments, as well as a real-world weather prediction dataset. Comparisons with several other approaches are also included. Results indicate that Learn++.NSE can track the changing environments very closely, regardless of the type of concept drift. To allow future use, comparison and benchmarking by interested researchers, we also release our data used in this paper.
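A simplified version of the voting rule at the heart of Learn++.NSE is sketched below: each classifier's errors on recent batches are averaged with a time-discounting weight profile, converted into log-odds voting weights, and used in weighted majority voting. The numbers and the exact discounting profile are illustrative, not the paper's parameterization.

```python
# Hedged sketch of time-adjusted, dynamically weighted majority voting.
import numpy as np

errors = np.array([[0.30, 0.25, 0.40],       # errors[t, k]: error of
                   [0.28, 0.20, 0.35],       # classifier k on batch t
                   [0.33, 0.15, 0.10]])      # (illustrative numbers)
sig_w = 1 / (1 + np.exp(np.arange(len(errors))[::-1] - 1))  # favor recent batches
eps = (sig_w @ errors) / sig_w.sum()         # time-adjusted error per classifier
w = np.log((1 - eps) / eps)                  # log-odds voting weights

votes = np.array([1, 1, 0])                  # current predictions of the 3 models
score = np.bincount(votes, weights=w, minlength=2)
print(w.round(2), "->", score.argmax())      # weighted majority decision
```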
Article
Full-text available
This paper introduces stacked generalization, a scheme for minimizing the generalization error rate of one or more generalizers. Stacked generalization works by deducing the biases of the generalizer(s) with respect to a provided learning set. This deduction proceeds by generalizing in a second space whose inputs are (for example) the guesses of the original generalizers when taught with part of the learning set and trying to guess the rest of it, and whose output is (for example) the correct guess. When used with multiple generalizers, stacked generalization can be seen as a more sophisticated version of cross-validation, exploiting a strategy more sophisticated than cross-validation's crude winner-takes-all for combining the individual generalizers. When used with a single generalizer, stacked generalization is a scheme for estimating (and then correcting for) the error of a generalizer which has been trained on a particular learning set and then asked a particular question. After introducing stacked generalization and justifying its use, this paper presents two numerical experiments. The first demonstrates how stacked generalization improves upon a set of separate generalizers for the NETtalk task of translating text to phonemes. The second demonstrates how stacked generalization improves the performance of a single surface-fitter. With the other experimental evidence in the literature, the usual arguments supporting cross-validation, and the abstract justifications presented in this paper, the conclusion is that for almost any real-world generalization problem one should use some version of stacked generalization to minimize the generalization error rate. This paper ends by discussing some of the variations of stacked generalization, and how it touches on other fields like chaos theory.
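Stacked generalization maps directly onto scikit-learn's StackingClassifier, which feeds out-of-fold level-0 predictions to a level-1 combiner; the choice of learners below is arbitrary.

```python
# Hedged sketch: out-of-fold guesses of the level-0 generalizers become the
# training inputs of the level-1 generalizer, as described above.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100)),
                ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(),    # the level-1 generalizer
    cv=5)                                    # out-of-fold level-0 predictions
print(cross_val_score(stack, X, y, cv=3).mean())
```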
Conference Paper
Full-text available
We investigate potential simulation artifacts and their effects on the evaluation of network anomaly detection systems in the 1999 DARPA/MIT Lincoln Laboratory off-line intrusion detection evaluation data set. A statistical comparison of the simulated background and training traffic with real traffic collected from a university departmental server suggests the presence of artifacts that could allow a network anomaly detection system to detect some novel intrusions based on idiosyncrasies of the underlying implementation of the simulation, with an artificially low false alarm rate. The evaluation problem can be mitigated by mixing real traffic into the simulation. We compare five anomaly detection algorithms on simulated and mixed traffic. On mixed traffic they detect fewer attacks, but the explanations for these detections are more plausible.
Article
In this work we propose a new deep learning based approach for online classification on streams of high-dimensional data. While requiring very little historical data storage, our approach is able to alleviate catastrophic forgetting in the scenario of continual learning with no assumption on the stationarity of the data in the stream. To make up for the absence of historical data, we propose a new generative autoencoder endowed with an auxiliary loss function that ensures fast task-sensitive convergence. To evaluate our approach we perform experiments on two well-known image datasets, MNIST and LSUN, in a continuous streaming mode. We extend the experiments to a large multi-class synthetic dataset that allows to check the performance of our method in more challenging settings with up to 1000 distinct classes. Our approach is able to perform classification on dynamic data streams with an accuracy close to the results obtained in the offline classification setup where all the data are available for the full duration of training. In addition, we demonstrate the ability of our method to adapt to unseen data classes and new instances of already known data categories, while avoiding catastrophic forgetting of previously acquired knowledge.
Article
The use of deep learning models for the network intrusion detection task has been an active area of research in cybersecurity. Although several excellent surveys cover the growing body of research on this topic, the literature lacks an objective comparison of the different deep learning models within a controlled environment, especially on recent intrusion detection datasets. In this paper, we first introduce a taxonomy of deep learning models in intrusion detection and summarize the research papers on this topic. Then we train and evaluate four key deep learning models - feed-forward neural network, autoencoder, deep belief network and long short-term memory network - for the intrusion classification task on two legacy datasets (KDD 99, NSL-KDD) and two modern datasets (CIC-IDS2017, CIC-IDS2018). Our results suggest that deep feed-forward neural networks yield desirable evaluation metrics on all four datasets in terms of accuracy, F1-score and training and inference time. The results also indicate that two popular semi-supervised learning models, autoencoders and deep belief networks do not perform better than supervised feed-forward neural networks. The implementation and the complete set of results have been released for future use by the research community. Finally, we discuss the issues in the research literature that were revealed in the survey and suggest several potential future directions for research in machine learning methods for intrusion detection.
Article
It is clear that the learning speed of feedforward neural networks is in general far slower than required, and this has been a major bottleneck in their applications for the past decades. Two key reasons behind this may be: (1) slow gradient-based learning algorithms are extensively used to train neural networks, and (2) all the parameters of the networks are tuned iteratively by using such learning algorithms. Unlike these conventional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single-hidden-layer feedforward neural networks (SLFNs), which randomly chooses hidden nodes and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide good generalization performance at extremely fast learning speed. The experimental results, based on a few artificial and real benchmark function approximation and classification problems, including very large complex applications, show that the new algorithm can produce good generalization performance in most cases and can learn thousands of times faster than conventional popular learning algorithms for feedforward neural networks.
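The core of ELM fits in a few lines: hidden weights are drawn at random and never tuned, and the output weights are obtained analytically with a pseudoinverse, as in the toy regression below.

```python
# Hedged ELM sketch: random, untrained hidden layer plus a closed-form
# least-squares solve for the output weights. Data is a toy problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))               # inputs
y = np.sin(X).sum(axis=1)                    # a toy regression target

W = rng.normal(size=(10, 200))               # random hidden weights, never tuned
b = rng.normal(size=200)
H = np.tanh(X @ W + b)                       # hidden-layer activations
beta = np.linalg.pinv(H) @ y                 # analytic output weights

print(np.mean((H @ beta - y) ** 2))          # training MSE
```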
Article
Network traffic anomaly detection is an important technique for ensuring network security. However, there are usually three problems with existing machine learning based anomaly detection algorithms. First, most of the models are built for stale data sets, making them less adaptable in real-world environments; second, most of the anomaly detection algorithms do not have the ability to learn new models based on changes in the attack environment; third, from the perspective of data multi-dimensionality, a single detection algorithm has a performance peak and cannot be well adapted to the needs of a complex network attack environment. Thus, we propose a new anomaly detection framework based on the organic integration of multiple deep learning techniques. In the first step, we use the Damped Incremental Statistics algorithm to extract features from network traffic; second, we train an Autoencoder with a small amount of labeled data; third, we use the Autoencoder to assign an abnormality score to network traffic; fourth, the data with the abnormality score labels are used to train an LSTM; finally, a weighted method is used to get the final abnormality score. The experimental results show that our HELAD algorithm has better adaptability and accuracy than other state-of-the-art algorithms.
Article
The massive growth of data that are transmitted through a variety of devices and communication protocols have raised serious security concerns, which have increased the importance of developing advanced intrusion detection systems (IDSs). Deep learning is an advanced branch of machine learning, composed of multiple layers of neurons that represent the learning process. Deep learning can cope with large-scale data and has shown success in different fields. Therefore, researchers have paid more attention to investigating deep learning for intrusion detection. This survey comprehensively reviews and compares the key previous deep learning-focused cybersecurity surveys. Through an extensive review, this survey provides a novel fine-grained taxonomy that categorizes the current state-of-the-art deep learning-based IDSs with respect to different facets, including input data, detection, deployment, and evaluation strategies. Each facet is further classified according to different criteria. This survey also compares and discusses the related experimental solutions proposed as deep learning-based IDSs. By analysing the experimental studies, this survey discusses the role of deep learning in intrusion detection, the impact of intrusion detection datasets, and the efficiency and effectiveness of the proposed approaches. The findings demonstrate that further effort is required to improve the current state-of-the art. Finally, open research challenges are identified, and future research directions for deep learning-based IDSs are recommended.
Article
Cutting edge Deep Learning (DL) techniques have been widely applied to areas like image processing and speech recognition so far. Likewise, some DL work has been done in the area of cybersecurity. In this survey, we focus on recent DL approaches that have been proposed in the area of cybersecurity, namely intrusion detection, malware detection, phishing/spam detection, and website defacement detection. First, preliminary definitions of popular DL models and algorithms are described. Then, a general DL framework for cybersecurity applications is proposed and explained based on the four major modules it consists of. Afterward, related papers are summarized and analyzed with regard to the focus area, methodology, model applicability, and feature granularity. Finally, concluding remarks and future work are discussed including the possible research topics that can be taken into consideration to enhance various cybersecurity applications using DL models.
Chapter
In a fast-growing digital era, the increase in devices connected to the internet has raised many security issues. To provide security, a variety of systems are available in the IT sector; the Intrusion Detection System is one such system. The design of an efficient intrusion detection system is an open problem for the research community. In this paper, various machine learning algorithms have been used for detecting different types of Denial-of-Service attacks. The performance of the models has been measured on the basis of binary and multi-class classification. Furthermore, a parameter tuning algorithm is discussed. On the basis of the performance parameters, XGBoost finds intrusions efficiently and robustly. The proposed method, i.e. XGBoost, has been compared with other classifiers such as AdaBoost, Naïve Bayes, Multi-layer Perceptron (MLP) and K-Nearest Neighbour (KNN) on network traffic recently captured by the Canadian Institute of Cybersecurity (CIC). In this research, the average class error and overall error have been calculated for the multi-classification problem.
Chapter
The chapter is devoted to illustrating the basic principles and the current results that characterize research on Deep Learning. The term refers to the theory and practice of devising and training complex neural networks for supervised and unsupervised tasks. Within the chapter, we illustrate the basic principle underlying a single neural unit and show how these units can be combined to realize a complex network. We discuss the basic algorithms for training a network and the recent advances proposed in the literature for scaling up training to deep architectures. The chapter concludes with an overview of the most successful deep architectures proposed in the literature, for both supervised and unsupervised learning.
Article
Network intrusion detection systems (NIDSs) play a crucial role in defending computer networks. However, there are concerns regarding the feasibility and sustainability of current approaches when faced with the demands of modern networks. More specifically, these concerns relate to the increasing levels of required human interaction and the decreasing levels of detection accuracy. This paper presents a novel deep learning technique for intrusion detection, which addresses these concerns. We detail our proposed nonsymmetric deep autoencoder (NDAE) for unsupervised feature learning. Furthermore, we also propose our novel deep learning classification model constructed using stacked NDAEs. Our proposed classifier has been implemented in graphics processing unit (GPU)-enabled TensorFlow and evaluated using the benchmark KDD Cup ’99 and NSL-KDD datasets. Promising results have been obtained from our model thus far, demonstrating improvements over existing approaches and the strong potential for use in modern NIDSs.
Conference Paper
Modern intrusion detection systems must handle many complicated issues in real-time, as they have to cope with a real data stream; indeed, for the task of classification, the classes are typically unbalanced and, in addition, the systems have to cope with distributed attacks and react quickly to changes in the data. Data mining techniques and, in particular, ensembles of classifiers permit combining different classifiers that together provide complementary information and can be built in an incremental way. This paper introduces the architecture of a distributed intrusion detection framework and, in particular, its detector module based on a meta-ensemble, used to cope with the problem of detecting intrusions, in which the number of attacks is typically smaller than the number of normal connections. To this aim, we explore the usage of ensembles specialized to detect particular types of attack or normal connections, and Genetic Programming is adopted to generate a non-trainable function to combine each specialized ensemble. Non-trainable functions can be evolved without any extra phase of training and, therefore, they are particularly apt to handle concept drifts, also in the case of real-time constraints. Preliminary experiments, conducted on the well-known KDD dataset and on a more up-to-date dataset, ISCX IDS, show the effectiveness of the approach.
Article
In many applications of information systems learning algorithms have to act in dynamic environments where data are collected in the form of transient data streams. Compared to static data mining, processing streams imposes new computational requirements for algorithms to incrementally process incoming examples while using limited memory and time. Furthermore, due to the non-stationary characteristics of streaming data, prediction models are often also required to adapt to concept drifts. Out of several new proposed stream algorithms, ensembles play an important role, in particular for non-stationary environments. This paper surveys research on ensembles for data stream classification as well as regression tasks. Besides presenting a comprehensive spectrum of ensemble approaches for data streams, we also discuss advanced learning concepts such as imbalanced data streams, novelty detection, active and semi-supervised learning, complex data representations and structured outputs. The paper concludes with a discussion of open research problems and lines of future research.
Conference Paper
Recently, deep learning has gained prominence due to the potential it portends for machine learning. For this reason, deep learning techniques have been applied in many fields, such as pattern recognition and classification. Intrusion detection analyzes data obtained from monitoring security events to assess the situation of a network. Many traditional machine learning methods have been applied to intrusion detection, but detection performance and accuracy still need to be improved. This paper discusses different methods used to classify network traffic. We applied these methods to an open dataset and experimented with them to find the best approach to intrusion detection.
Article
Gradient descent optimization algorithms, while increasingly popular, are often used as black-box optimizers, as practical explanations of their strengths and weaknesses are hard to come by. This article aims to provide the reader with intuitions with regard to the behaviour of different algorithms that will allow her to put them to use. In the course of this overview, we look at different variants of gradient descent, summarize challenges, introduce the most common optimization algorithms, review architectures in a parallel and distributed setting, and investigate additional strategies for optimizing gradient descent.
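As a concrete instance of the update rules this overview covers, the sketch below runs SGD with momentum on a toy ill-conditioned quadratic; the learning rate and momentum coefficient are arbitrary illustrative choices.

```python
# Hedged sketch of one update rule from this family: SGD with momentum,
# minimizing the quadratic f(x) = x^T A x / 2 with exact gradients.
import numpy as np

A = np.diag([1.0, 25.0])                     # ill-conditioned quadratic
grad = lambda x: A @ x

x, v = np.array([1.0, 1.0]), np.zeros(2)
lr, mu = 0.03, 0.9
for _ in range(100):
    v = mu * v - lr * grad(x)                # momentum accumulates velocity
    x = x + v
print("momentum:", x)                        # converges near the optimum (0, 0)
```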
Article
While both cost-sensitive learning and online learning have been studied separately, these two issues have seldom been addressed simultaneously. Yet there are many applications where both aspects are important. This paper investigates a class of algorithmic approaches suitable for online cost-sensitive learning, designed for such problems. The basic idea is to leverage existing methods for online ensemble algorithms, and combine these with batch mode methods for cost-sensitive bagging/boosting algorithms. Within this framework, we describe several theoretically sound online cost-sensitive bagging and online cost-sensitive boosting algorithms, and show that the convergence of the proposed algorithms is guaranteed under certain conditions. We then present extensive experimental results on benchmark datasets to compare the performance of the various proposed approaches.
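One standard way to make online bagging cost-sensitive, in the spirit of the algorithms studied here, is to scale the Poisson parameter that governs how many times each arriving example is presented to each base learner by the example's misclassification cost; the costs below are assumptions.

```python
# Hedged sketch: online bagging presents each arriving example to each base
# learner k ~ Poisson(lambda) times; scaling lambda by the example's class
# cost makes the procedure cost-sensitive (attacks cost more than normal).
import numpy as np

rng = np.random.default_rng(0)
cost = {0: 1.0, 1: 5.0}                      # class 1 (attack) is costlier

def presentations(label, n_models=10, base_lambda=1.0):
    lam = base_lambda * cost[label]
    return rng.poisson(lam, size=n_models)   # per-model training repetitions

print(presentations(0))                      # normal example: ~1 copy per model
print(presentations(1))                      # attack: ~5 copies per model
```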