
Anomaly-Based Intrusion Detection Using Extreme Learning Machine and Aggregation of Network Traffic Statistics in Probability Space

  • Nokia Bell Labs, Espoo, Finland

Abstract and Figures

With the increased use of network communication, the risk of information being compromised has grown immensely. Intrusions have become more sophisticated, and few methods achieve efficient results while network behavior constantly changes. This paper proposes an intrusion detection system that models distributions of network statistics and uses an Extreme Learning Machine (ELM) to achieve high intrusion detection rates. The proposed model aggregates network traffic at the IP subnetwork level, and distributions of statistics are collected for the IPv4 addresses most frequently encountered as destinations. The resulting probability distributions are learned by the ELM. The model is evaluated on the ISCX-IDS 2012 dataset, which was collected using a real-time testbed, and is compared against leading approaches on the same dataset. Experimental results show that the presented method achieves an average detection rate of 91% and a misclassification rate of 9%. They also show that our methods significantly improve on the performance of the simple ELM, at the cost of a trade-off between performance and time complexity. Furthermore, our methods perform well in comparison with the few other state-of-the-art approaches evaluated on the ISCX-IDS 2012 dataset.
Cognitive Computation (2018) 10:848–863
Buse Gul Atli¹ · Yoan Miche² · Aapo Kalliola² · Ian Oliver² · Silke Holtmanns² · Amaury Lendasse³
Received: 1 November 2017 / Accepted: 22 May 2018 / Published online: 5 June 2018
© Springer Science+Business Media, LLC, part of Springer Nature 2018
Keywords: Intrusion detection · Network behavior analysis · Probability density function · Hierarchical clustering · Extreme learning machine
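
The aggregation idea summarized in the abstract (grouping traffic by destination subnetwork and turning a per-flow statistic into a probability distribution) can be illustrated with a minimal sketch. This is not the authors' implementation: the /24 prefix length, the use of flow byte counts as the statistic, the fixed bin edges, and the helper names `subnet_of`, `empirical_distribution`, and `aggregate` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from ipaddress import ip_network


def subnet_of(ip, prefix_len=24):
    """Return the IPv4 subnet (as a string) containing `ip`. The /24 default is an assumption."""
    return str(ip_network(f"{ip}/{prefix_len}", strict=False))


def empirical_distribution(values, bin_edges):
    """Normalized histogram over half-open bins [edge_i, edge_{i+1}).

    Values that fall outside every bin are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def aggregate(flows, top_k, bin_edges, prefix_len=24):
    """Map each of the `top_k` most frequent destination subnets to a distribution of flow sizes."""
    sizes = defaultdict(list)
    for dst_ip, n_bytes in flows:
        sizes[subnet_of(dst_ip, prefix_len)].append(n_bytes)
    frequent = Counter({net: len(v) for net, v in sizes.items()}).most_common(top_k)
    return {net: empirical_distribution(sizes[net], bin_edges) for net, _ in frequent}


# Hypothetical flows: (destination IPv4 address, bytes transferred).
flows = [("10.0.0.5", 120), ("10.0.0.9", 900), ("10.0.0.7", 150), ("192.168.1.2", 4000)]
dists = aggregate(flows, top_k=1, bin_edges=[0, 500, 1000, 10000])
print(dists)
```

Each resulting vector sums to one and could serve as one input sample for a classifier; how the paper actually bins, weights, or clusters the statistics is not specified in this excerpt.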
In recent years, advances in networking technology,
especially cloud services and the Internet of Things (IoT),
have created new businesses and connected the world by
converting it into a massive information system. This has
also drawn the attention of hackers, since more and more
personal and private information is stored in hosting
devices [6]. Security practices have therefore been the
focus of intense research, driven by the need for a safe,
secure environment.
Yoan Miche
¹ Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland
² Nokia Bell Labs, Espoo, Finland
³ The University of Iowa, Iowa City, IA 52242, USA
Network behavior analysis (NBA) and intrusion detection
systems (IDS) play an important role in cybersecurity.
They act as additional defense layers that monitor the
network and detect intrusions when user identification and
authentication mechanisms fail to do so. Intrusion detection
systems recognize malicious activities and respond by
triggering an alert or logging the results [4]. Anomaly-based
intrusion detection systems analyze network events
and capture security problems by finding unusual activities
that do not conform to the normal baseline. To support
anomaly detection systems, NBA tools are deployed
for capturing, aggregating, and comparing different network
behaviors [38].
Anomaly-based intrusion detection has been the focus
of intense research in recent years [24,30]. Despite the
significant number of existing studies in this area, more
research is needed because attacks evolve continuously.
To address this problem, a practical intrusion detection
system should be able to update itself to detect novel and
stealthier attacks, as well as handle large amounts of
streaming data [11,37].
... Hybrid ML algorithms were developed by merging two or more ML algorithms for IDS but the model complexity is high which makes them less efficient. Similarly, the artificial neural networks (ANN) [16] and extreme learning machine (ELM) [17] also increase the complexity issues. Some variants of ELM [18,19] were found to be effective for intrusion detection but the extensive training time is often a big problem. ...
... Sumaiya Thaseen et al. [50] used an integrated model of Neural Networks with correlation-based feature selection for IDS on the NSL-KDD and UNSWNB15 datasets. This neural network achieved 98.45% accuracy with 500 s of computation time for NSL-KDD and 96.4% accuracy with 660 s of computation time for UNSWNB15. Atli et al. [17] used an extreme learning machine (ELM) on the ISCX-IDS 2012 dataset with 99% detection accuracy and a 1% false-positive rate with 42.12 s training time. Singh et al. [18] utilized online sequential ELM (OSELM) on NSL-KDD, achieving an accuracy of 98.66% and a false-positive rate of 1.74% in 2.43 s detection time, while Gao et al. [19] used incremental ELM (I-ELM) on NSL-KDD with 81.22% accuracy, a 30.03% false alarm rate and 19.97 s detection time, and on the UNSWNB15 dataset with 77.36% accuracy, a 36.09% false alarm rate and 476.18 s detection time. ...
... Experiments are performed to compare the efficiency of OCNN-HMLSTM on the three datasets individually. The comparisons are made against the existing SVM [10], NN [50], ELM [17], CNN [24], LSTM [26], Conv-LSTM [29], DNN [30] and MSCNN [31] based IDS models from literature. Besides, the individual performance of OCNN and HMLSTM for the intrusion detection problem is also evaluated separately and compared with the proposed unified model of OCNN-HMLSTM. ...
Intrusion detection systems (IDS) differentiate malicious entries from legitimate entries in network traffic data and help secure networks. Deep learning algorithms have been widely employed in the network security field for large-scale data in modern cyberspace networks because of their ability to learn deeply integrated features. However, learning both the spatial and temporal aspects of system information is very challenging for any individual deep learning model. While Convolutional Neural Networks (CNN) effectively capture the spatial aspects, Long Short-Term Memory (LSTM) neural networks perform better on temporal features. Integrating the benefits of these models has the potential to improve large-scale IDS. In this paper, a highly accurate IDS model is proposed that uses a unified model of Optimized CNN (OCNN) and Hierarchical Multi-scale LSTM (HMLSTM) for effective extraction and learning of spatial–temporal features. The proposed IDS model performs pre-processing, feature extraction through network training and network testing, and final classification. In the OCNN-HMLSTM model, Lion Swarm Optimization (LSO) is used to tune the hyper-parameters of the CNN for the optimal configuration for learning spatial features. The HMLSTM learns the hierarchical relationships between the different features and extracts the temporal features. Lastly, the unified IDS approach uses the extracted spatial–temporal features to categorize the network data. Tests are performed on the public IDS datasets NSL-KDD, ISCX-IDS and UNSWNB15. Compared with contemporary IDS methods, the proposed model achieves better intrusion detection, with accuracy above 90%, fewer false values and better classification coefficients.
... Intrusion detection is the task of observing, analysing and identifying activities aiming to violate a network's security policy. The key success factor for identifying such activities relies on an appropriate monitoring of the network by diagnosing its usage chronically [1]. In the past, organisations used specific authentication policies articulating various levels of accessing. ...
... ELM has also been combined with probabilistic algorithms for IDS. This is shown in the work of [1], where a probability density function is learned based on flow features for frequent communications. The authors used a hierarchical heavy hitters algorithm for clustering network statistics and learned the probability density function of each feature using ELM. ...
Intrusion Detection Systems (IDS) are an active research topic in machine learning. A single-hidden layer feedforward neural network (SLFN) trained with the extreme learning machine (ELM) approach is used for IDS. Its appeal lies in its fast learning and its support for sequential learning in the online sequential extreme learning machine (OSELM) variant. An issue with OSELM that researchers have addressed is the random nature of its input-hidden layer weights. Most approaches use metaheuristic optimisation to determine the optimal weights of OSELM and resolve the randomness. However, metaheuristic approaches require many trials to determine the optimal weights, which raises concerns about convergence and speed. This article proposes a novel approach for finding the optimal weights of the input-hidden layer by integrating OSELM with back-propagation, designated OSELM-BP. After integration, BP changes the random weights iteratively and uses an iterated evaluation of the generated error for feedback correction of the weights. The approach is evaluated under various scenarios of activation functions for OSELM on the one hand and the number of iterations for BP on the other. An extensive evaluation of the approach and comparison with the original OSELM reveal the superiority of OSELM-BP in reaching optimal accuracy with a small number of iterations.
... Extreme learning machine (ELM) is an advanced ML algorithm based on the parallel programming single layer feed-forward neural networks. Atli et al. (2018) used ELM on the ISCX-IDS 2012 dataset with 91% detection accuracy. Roshan et al. (2018) employed ELM on NSL-KDD with 81% known attack detection and 89% unknown attack detection. ...
The recent advancements in information and communication technologies have led to an increasing number of online systems and services. These online systems can utilize Intrusion detection systems (IDS) to ensure their trustworthiness by preventing cyber security threats. Hence it has become necessary for any system to design advanced and intelligent IDS models. However, most existing IDS models are based on traditional machine learning algorithms with weak, shallow learning behaviours providing less efficient feature selection and classification performance of new attacks. Another problem is that these approaches are either network-based or host-based intrusion detection and it often leads to many known attacks being unrecognized by the detection module. Additionally, they lack flexible and scalable handling of the massive amounts of network traffic data due to high model complexity. To overcome these issues, an efficient hybrid IDS model is presented which is built using MapReduce based Black Widow Optimized Convolutional-Long Short-Term Memory (BWO-CONV-LSTM) network. The first stage of this IDS model is the feature selection by the Artificial Bee Colony (ABC) algorithm. The second stage is the hybrid deep learning classifier model of BWO-CONV-LSTM on a MapReduce framework for intrusion detection from the system traffic data. The proposed BWO-CONV-LSTM network is the combination of Convolutional and LSTM neural networks whose hyper-parameters are optimized by BWO to obtain the ideal architecture. Performance evaluations of the BWO-CONV-LSTM based IDS model are performed over the NSL-KDD, ISCX-IDS, UNSW-NB15, and CSE-CIC-IDS2018 datasets. The results indicate that the proposed BWO-CONV-LSTM model has high intrusion detection performance with 98.67%, 97.003%, 98.667% and 98.25% accuracy for NSL-KDD, ISCX-IDS, UNSW-NB15, and CSE-CIC-IDS2018 datasets, respectively, with fewer false values, less computation time and better classification coefficients.
... There is a large body of research on anomaly detection. A host-based anomaly detection system monitors and analyzes a computing system's internals, integrating data mining of the audit records and system logs [10]. A previous study [11] integrated a single-variable time series to measure data performance on the basis of end-to-end round-trip time and packet loss probability. ...
This study aimed at improving the performance of classifiers when trained to identify signatures of unknown attacks. It addresses the following objectives: (1) to establish and examine the classifiers most commonly used in the implementation of IDSs (KNN and Bayes); (2) to evaluate the performance of the individual classifiers independently; and (3) to model a hybrid classifier based on the strengths of the two classifiers. The study adopted a quantitative methodology of collecting and interpreting data and used the NSL-KDD and the original KDD 1999 datasets. The devised mechanisms were evaluated over virtualised networked environments and traffic workloads. SVM was used for detecting cycle numbers, whereas coefficients and signal shifts were used for completing period detection. The paper also presents rare data for detecting anomalies. Anticipated events that have not occurred and unanticipated events can be detected at various sampling frequencies based on a hybrid approach, since no one has proposed a hybrid approach for detecting anomalies. The paper ranks features from a network traffic database based on a combination of feature selection wrappers and filters and determines that 16 features showed a strong contribution to the anomaly detection task.
... In scientific research, the use of Neural Networks (NN) in IDS is rather popular: from a multilayer perceptron (MLP) [5] to extreme learning machines (ELM) [6] and recurrent neural networks (RNN) with long short-term memory (LSTM) [7]. As a rule, even simple methods can achieve high detection accuracy: support vector machine (SVM) -95%, decision tree -97% [8]. ...
Conference Paper
Modern information security systems are not always able to withstand constantly evolving computer attacks. Using machine learning, attackers can carry out complex and unknown attacks. Intrusion detection systems based on the search for anomalies allow us to detect unknown attacks, but give a high percentage of false results. Small classes of attacks are worse detected by classifiers when the training data sets are not balanced. In this paper, we propose to use generative adversarial networks (GAN) to generate anomalous samples, which will balance the data set and make the classifier more resistant to adversarial attacks. For the experiment, a WGAN model was chosen, which has a better learning convergence compared to GAN, and allows minimizing the repetition of generated samples (modal collapse). According to the results of the experiment, a higher percentage of correctly detected anomalies was obtained.
... The machine learning method is one of the accepted methods in this subject. However, traditional machine learning and ANN algorithms like [3][4][5][6] did not have enough functionality to extract the complex and nonlinear patterns commonly observed in big data [7]. Recent developments in artificial intelligence have transformed deep learning into the forefront of the new generation of data analysis. ...
One of the most important parameters that hackers have always considered is obtaining information about the status of computer networks, such as hacking into databases and computer networks used in defense systems. Hence, these networks are always exposed to dangerous attacks. On the other hand, networks and hosts face a large amount of data every second. Hence, intrusion detection mechanisms have to mine this growing mountain of data for possible intrusive patterns from the security perspective. This environment and conditions make it hard to detect intrusions fast and accurately. Therefore, to identify such intrusions, it is necessary to design an intrusion detection system using big data techniques that can handle these types of data that have big data nature in detecting unauthorized access to a communication network. Therefore, this article employs a big data-aware deep learning method to design an efficient and effective Intrusion Detection System (IDS) to cope with these challenges. We designed a specific architecture of Long Short-Term Memory (LSTM), and this model can detect complex relationships and long-term dependencies between incoming traffic packets. Through this way, we could reduce the number of false alarms and increase the accuracy of the designed intrusion detection system. Moreover, using big data analytic techniques can improve the speed of deep learning algorithms in this paper, which have low execution speed due to their high complexity. Actually, using these techniques increases the speed of execution of our complex model. Our extensive experiments are on the BigDL directly on top of the Spark framework and train with the NSL-KDD dataset. Results show that the proposed algorithm, called BDL-IDS, outperforms other IDS schemes, such as traditional machine learning and Artificial Neural Network, in terms of detection rate (20%), false alarm rate (60%), accuracy (15%), and training time (70%).
... Jour. 2018 [154] A129 "Ramp loss one-class support vector machine; A robust and effective approach to anomaly detection problems" ...
Anomaly detection has been used for decades to identify and extract anomalous components from data. Many techniques have been used to detect anomalies. One of the increasingly significant techniques is Machine Learning (ML), which plays an important role in this area. In this research paper, we conduct a Systematic Literature Review (SLR) which analyzes ML models that detect anomalies in their application. Our review analyzes the models from four perspectives; the applications of anomaly detection, ML techniques, performance metrics for ML models, and the classification of anomaly detection. In our review, we have identified 290 research articles, written from 2000-2020, that discuss ML techniques for anomaly detection. After analyzing the selected research articles, we present 43 different applications of anomaly detection found in the selected research articles. Moreover, we identify 29 distinct ML models used in the identification of anomalies. Finally, we present 22 different datasets that are applied in experiments on anomaly detection, as well as many other general datasets. In addition, we observe that unsupervised anomaly detection has been adopted by researchers more than other classification anomaly detection systems. Detection of anomalies using ML models is a promising area of research, and there are a lot of ML models that have been implemented by researchers. Therefore, we provide researchers with recommendations and guidelines based on this review.
... Sarker et al. proposed an ML-based multilayered framework for promoting the security of network systems, which aims at applicability to data-driven intelligent decisions and protects network systems and devices from network attacks [16]. Atli proposed a traffic classification method that identifies normal traffic and encrypted traffic by analyzing network flow based on decision tree (DT) and K-Nearest Neighbor (KNN) algorithms [17]. The work by D'hooge and Kayes concluded that the results of anomaly detection with machine learning algorithms as a basis are not ideal among different datasets [18], which provides a research impetus to increase the generalization of the model with limited data. ...
With the increase of Internet visits and connections, it is becoming essential and arduous to protect the networks and different devices of the Internet of Things (IoT) from malicious attacks. The intrusion detection systems (IDSs) based on supervised machine learning (ML) methods require a large number of labeled samples. However, the number of abnormal behaviors is far less than that of normal behaviors, let alone that the shots of malicious behavior samples which can be intercepted as training dataset are actually limited. Consequently, it is a key research topic to conduct the anomaly detection for the small number of abnormal behavior samples. This paper proposes an anomaly detection model with a few abnormal samples to solve the problem in few-shot detection based on convolutional neural networks (CNN) and autoencoder (AE). This model mainly consists of the CNN-based supervised pretraining module and the AE-based data reconstruction module. Only a few abnormal samples are utilized to the pretrain module to build the structure of extracting deep features. The data reconstruction module simply chooses the deep features of normal samples as training data. There also exist some effective attention mechanisms in the pretraining module. Through the pretraining of small samples, the accuracy of abnormal detection is improved compared with merely training normal samples with AE. The simulation results prove that this solution can solve the above problems occurring in network behavior anomaly detection. In comparison to the original AE model and other clustering methods, the proposed model advances the detection results in a visible way.
The web is used broadly in all parts of life, and the disruption of web connections can have a huge impact. Hence, the role of a network Intrusion Detection System (IDS) in detecting cyberattacks is vital. A suspicious connection needs to be blocked immediately, before anything further happens. As the volume of data transmitted daily grows, protecting that data and preventing intrusions becomes increasingly important, so a good intrusion detection system must be developed to prevent attacks. This paper presents a novel approach to classifying intrusion attacks. The central idea is to apply different machine learning algorithms such as SVM, Naive Bayes, Neural Networks, Random Forest, and Logistic Regression. We apply these supervised and unsupervised learning techniques to classify the attack classes. The performance of the various models was analyzed using all of the features and using the best-chosen features, and was evaluated using confusion matrices.
Network abnormal traffic detection can monitor the network environment in real time by extracting and analysing network traffic characteristics, and plays an important role in network security protection. Existing detection methods cannot fully learn the spatio-temporal characteristics of data, their classification accuracy is not high, and their detection time and accuracy are susceptible to redundant data in the sample. To solve these problems, this paper proposes a network abnormal detection method (PCSS) integrating principal component analysis (PCA) and the single-stage headless face detector algorithm (SSH). PCSS applies the PCA algorithm during data preprocessing to eliminate the interference of redundant data. PCSS also combines feature fusion and SSH to enhance feature extraction for data with unclear features and to improve detection speed and accuracy. Simulation experiments are carried out on the IDS2017 and IDS2012 data sets. Experimental results show that PCSS is clearly superior to other detection models in detection speed and accuracy, which provides a new method for efficiently detecting traffic attacks. © 2022 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
In real applications of cognitive computation, data with imbalanced classes are usually collected sequentially. In this situation, some current machine learning algorithms, e.g., support vector machines, obtain weak classification performance, especially on the minority class. To solve this problem, a new hybrid sampling online extreme learning machine (ELM) for sequential imbalanced data is proposed in this paper. The key idea is to keep the majority and minority classes balanced while preserving a sequential distribution similar to that of the original data. The method includes two stages. At the offline stage, we introduce the principal curve to build confidence regions for the minority and majority classes respectively. Based on these two confidence regions, over-sampling of the minority class and under-sampling of the majority class are both conducted to generate new synthetic samples, and then the initial ELM model is established. At the online stage, we first choose the most valuable of the synthetic majority-class samples in terms of sample importance. Afterwards, a new online fast leave-one-out cross validation (LOO CV) algorithm utilizing Cholesky decomposition is proposed to determine whether or not to update the ELM network weights at the online stage. We also prove theoretically that the proposed method has an upper bound on information loss. Experimental results on seven UCI datasets and one real-world air pollutant forecasting dataset show that, compared with ELM, OS-ELM, meta-cognitive OS-ELM, and OS-ELM with the SMOTE strategy, the proposed method can simultaneously improve the classification performance on the minority and majority classes in terms of accuracy, G-mean value, and ROC curve. In conclusion, the proposed hybrid sampling online extreme learning machine can be effectively applied to the sequential data imbalance problem with better generalization performance and numerical stability.
Most of the existing image blurriness assessment algorithms are proposed based on measuring image edge width, gradient, high-frequency energy, or pixel intensity variation. However, these methods are content sensitive with little consideration of image content variations, which causes variant estimations for images with different contents but same blurriness degrees. In this paper, a content-insensitive blind image blurriness assessment metric is developed utilizing Weibull statistics. Inspired by the property that the statistics of image gradient magnitude (GM) follows Weibull distribution, we parameterize the GM using β (scale parameter) and ɣ (shape parameter) of Weibull distribution. We also adopt skewness (η) to measure the asymmetry of the GM distribution. In order to reduce the influence of image content and achieve more robust performance, divisive normalization is then incorporated to moderate the β, ɣ, and η. The final image quality is predicted using a sparse extreme learning machine. Performances evaluation on the blur image subsets in LIVE, CSIQ, TID2008, and TID2013 databases demonstrate that the proposed method is highly correlated with human perception and robust with image contents. In addition, our method has low computational complexity which is suitable for online applications.
Numerous state-of-the-art perceptual image quality assessment (IQA) algorithms share a common two-stage process: distortion description followed by distortion effects pooling. As for the first stage, the distortion descriptors or measurements are expected to be effective representatives of human visual variations, while the second stage should well express the relationship among quality descriptors and the perceptual visual quality. However, most of the existing quality descriptors (e.g., luminance, contrast, and gradient) do not seem to be consistent with human perception, and the effects pooling is often done in ad-hoc ways. In this paper, we propose a novel full-reference IQA metric. It applies non-negative matrix factorization (NMF) to measure image degradations by making use of the parts-based representation of NMF. On the other hand, a new machine learning technique [extreme learning machine (ELM)] is employed to address the limitations of the existing pooling techniques. Compared with neural networks and support vector regression, ELM can achieve higher learning accuracy with faster learning speed. Extensive experimental results demonstrate that the proposed metric has better performance and lower computational complexity in comparison with the relevant state-of-the-art approaches.
Intrusion Detection is the identification of malicious activities in a given network by analyzing its traffic. Data mining techniques used for this analysis study the traffic traces and identify hostile flows in the traffic. Dimensionality Reduction in data mining focuses on representing data with minimum number of dimensions such that its properties are not lost and hence reducing the underlying complexity in processing the data. Principal Component Analysis (PCA) is one of the prominent dimensionality reduction techniques widely used in network traffic analysis. In this paper, we focus on the efficiency of PCA for intrusion detection and determine its Reduction Ratio (RR), ideal number of Principal Components needed for intrusion detection and the impact of noisy data on PCA. We carried out experiments with PCA using various classifier algorithms on two benchmark datasets namely, KDD CUP and UNB ISCX. Experiments show that the first 10 Principal Components are effective for classification. The classification accuracy for 10 Principal Components is about 99.7% and 98.8%, nearly same as the accuracy obtained using original 41 features for KDD and 28 features for ISCX, respectively.
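
As a rough illustration of the dimensionality reduction discussed in the abstract above, the following sketch projects a feature matrix onto its first principal components using NumPy's SVD. The function name `pca_project` and the random placeholder data (200 samples with 41 features, matching only the feature count mentioned for KDD) are assumptions; no real intrusion data or results from that study are reproduced here.

```python
import numpy as np


def pca_project(X, n_components):
    """Project rows of X onto the first `n_components` principal components.

    Returns the projected data and the fraction of variance explained by each kept component.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)        # variance ratio per component
    return Xc @ Vt[:n_components].T, explained[:n_components]


# Placeholder data standing in for a traffic feature matrix (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 41))
Z, ratio = pca_project(X, n_components=10)
print(Z.shape, float(ratio.sum()))
```

The projected matrix `Z` would then be passed to whichever classifier is being evaluated; the choice of 10 components follows the figure quoted in the abstract.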
It is clear that the learning speed of feedforward neural networks is in general far slower than required, and this has been a major bottleneck in their applications for past decades. Two key reasons may be: (1) slow gradient-based learning algorithms are extensively used to train neural networks, and (2) all the parameters of the networks are tuned iteratively by such learning algorithms. Unlike these conventional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single-hidden layer feedforward neural networks (SLFNs), which randomly chooses hidden nodes and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide good generalization performance at extremely fast learning speed. Experimental results on a few artificial and real benchmark function approximation and classification problems, including very large complex applications, show that the new algorithm can produce good generalization performance in most cases and can learn thousands of times faster than conventional popular learning algorithms for feedforward neural networks.
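
The two steps named in the abstract above (random hidden nodes, analytically determined output weights) can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the class name `ELM`, the tanh activation, the Gaussian initialization, and the use of the Moore-Penrose pseudoinverse via `np.linalg.pinv` are choices made here for the example.

```python
import numpy as np


class ELM:
    """Single-hidden layer feedforward network trained in the ELM style (illustrative sketch)."""

    def __init__(self, n_hidden, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden layer output; tanh is an assumed activation.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # Step 1: draw input weights and biases at random; they are never tuned.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Step 2: solve for the output weights in one least-squares step.
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Toy XOR-style data (hypothetical), with more hidden nodes than samples.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])
model = ELM(n_hidden=10).fit(X, y)
print(np.round(model.predict(X), 3))
```

Because no iterative gradient descent is involved, training cost is dominated by a single pseudoinverse computation, which is the source of the speed advantage the abstract describes.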
Conference Paper
Modern intrusion detection systems must handle many complicated issues in real time, as they have to cope with a real data stream. For the classification task, the classes are typically unbalanced; in addition, the systems have to cope with distributed attacks and react quickly to changes in the data. Data mining techniques, and ensembles of classifiers in particular, make it possible to combine different classifiers that together provide complementary information and can be built incrementally. This paper introduces the architecture of a distributed intrusion detection framework and, in particular, the detector module based on a meta-ensemble, which is used to cope with the problem of detecting intrusions when the number of attacks is typically smaller than the number of normal connections. To this aim, we explore the use of ensembles specialized to detect particular types of attack or normal connections, and Genetic Programming is adopted to generate a non-trainable function to combine each specialized ensemble. Non-trainable functions can be evolved without any extra training phase and are therefore particularly apt to handle concept drift, including under real-time constraints. Preliminary experiments, conducted on the well-known KDD dataset and on a more up-to-date dataset, ISCX IDS, show the effectiveness of the approach.