Conference Paper

Using Wavelets for Compression and Detecting Events in Anomalous Network Traffic

Abstract

This is a conference paper [© IEEE] and is also available online at http://dx.doi.org/10.1109/ICSNC.2009.70. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes, or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE.

Monitoring and measuring various metrics of high data rate networks produces a vast amount of information over a long period of time, making the storage of the monitored data a serious issue. Furthermore, for the collected monitoring data to be useful to network analysts, these measurements need to be processed in order to detect interesting characteristics. In this paper, wavelet analysis is used as a multi-resolution analysis tool for the compression of data rate measurements. Two known thresholds are suggested for lossy compression and event detection purposes. Results show that high compression ratios are achievable while preserving the quality (quantitative and visual aspects) and the energy of the signal, and that sudden changes can be detected.
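A minimal sketch of the kind of scheme the abstract describes, assuming PyWavelets is available: decompose a data-rate signal with a multi-level discrete wavelet transform, zero out small detail coefficients (the lossy compression step), and reconstruct. The wavelet, decomposition level, and quantile-based threshold below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
import pywt

def compress(signal, wavelet="db4", level=5, keep_fraction=0.05):
    """Decompose, zero out small detail coefficients, reconstruct."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Flatten the detail coefficients to pick a magnitude threshold
    # that retains roughly `keep_fraction` of them.
    details = np.concatenate([c for c in coeffs[1:]])
    thresh = np.quantile(np.abs(details), 1.0 - keep_fraction)
    kept = [coeffs[0]] + [pywt.threshold(c, thresh, mode="hard")
                          for c in coeffs[1:]]
    return pywt.waverec(kept, wavelet)[:len(signal)]

# Hypothetical usage on a stored data-rate measurement:
# rate = np.loadtxt("data_rate.txt")
# approx = compress(rate)
```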
... The wavelet transform has been widely used for the characterization of different types of data, including biological data [10][11], images, sound, and network traffic measurements [12]. However, its effectiveness in network anomaly detection has not been explicitly proven. ...
... wavelet analysis) to internet traffic analysis helps isolate the characteristics of the traffic by extracting hidden patterns of high- and low-frequency information [1]. Many researchers have used wavelet analysis to identify network anomalies by reconstructing network traffic data [23], compressing the traffic data by applying two different thresholds to the wavelet coefficients [27], and designing better wavelet filters to extract better local frequency information [3,33]. In the work by Barford et al. [3], wavelet transformations are used to extract flow-based traffic abnormalities by splitting the input signals into three frequency ranges (low, mid, and high). ...
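A rough sketch of the band-splitting idea attributed to Barford et al. above, under the assumption that a standard multi-level DWT is used and its levels are grouped into low-, mid-, and high-frequency reconstructions; the level boundaries chosen here are illustrative, not those of the original paper.

```python
import numpy as np
import pywt

def band_split(signal, wavelet="db4", level=9):
    coeffs = pywt.wavedec(signal, wavelet, level=level)

    def keep(indices):
        # Reconstruct using only the selected coefficient arrays.
        kept = [c if i in indices else np.zeros_like(c)
                for i, c in enumerate(coeffs)]
        return pywt.waverec(kept, wavelet)[:len(signal)]

    # coeffs[0] is the approximation; coeffs[1] is the coarsest detail.
    low  = keep({0, 1, 2})              # approximation + coarsest details
    mid  = keep({3, 4, 5})              # intermediate scales
    high = keep(set(range(6, level + 1)))  # finest scales
    return low, mid, high
```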
Article
Detection of abnormal internet traffic has become a significant area of research in network security. Due to its importance, many predictive models have been designed utilizing machine learning algorithms. The models are well designed to show high performance in detecting abnormal internet traffic behaviors. However, they may not guarantee reliable detection performance for new incoming abnormal internet traffic because they are designed using raw features from imbalanced internet traffic data. Since internet traffic is non-stationary time-series data, it is difficult to identify abnormal internet traffic from the raw features. In this study, we propose a new approach to detecting abnormal internet traffic. Our approach begins with extracting hidden, but important, features by utilizing the discrete wavelet transform. Then, statistical analysis is performed to filter out irrelevant and less important features. Only statistically significant features are used to design a reliable predictive model with logistic regression. A comparative analysis is conducted to determine the importance of our approach by measuring accuracy, sensitivity, and the area under the receiver operating characteristic curve. From the analysis, we find that our model detects abnormal internet traffic successfully with high accuracy.
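A minimal sketch of the pipeline this abstract outlines (DWT features, a statistical significance filter, then logistic regression), assuming PyWavelets, SciPy, and scikit-learn; the per-band energy features, the t-test filter, and the p-value cutoff are illustrative stand-ins for the paper's exact statistical analysis.

```python
import numpy as np
import pywt
from scipy import stats
from sklearn.linear_model import LogisticRegression

def dwt_features(window, wavelet="db4", level=4):
    # Energy of each sub-band as a hidden feature of a traffic window.
    coeffs = pywt.wavedec(window, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs])

def significant_columns(X, y, alpha=0.05):
    # Keep features whose means differ significantly between classes.
    return [j for j in range(X.shape[1])
            if stats.ttest_ind(X[y == 0, j], X[y == 1, j]).pvalue < alpha]

# Hypothetical usage, with `windows` as traffic windows and `y` as 0/1 labels:
# X = np.vstack([dwt_features(w) for w in windows])
# cols = significant_columns(X, y)
# model = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
```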
Article
CoMo (Continuous Monitoring) is a passive monitoring system. CoMo has been designed to be the basic building block of an open network monitoring infrastructure that would allow researchers and network operators to easily process and share network traffic statistics over multiple sites. This paper identifies the challenges that lie ahead in the deployment of such an open infrastructure. The main challenges are: (1) the system must allow any generic metric to be computed on the incoming traffic stream; (2) it must provide privacy and security guarantees to the owner of the monitored link, the network users, and the CoMo users; and (3) it must be robust in the face of anomalous traffic patterns. We describe the high-level architecture of CoMo and, in greater detail, the resource management, query processing, and security aspects.
Conference Paper
Monitoring and measuring various metrics of high-speed, high-capacity networks produces a vast amount of information over a long period of time. For the collected monitoring data to be useful to administrators, these measurements need to be analyzed and processed in order to detect interesting characteristics such as sudden changes. In this paper, wavelet analysis is used along with the universal threshold proposed by Donoho and Johnstone in order to detect abrupt changes in computer network measurements. Experimental results are obtained to compare the behaviour of the algorithm on delay and data rate signals. Both types of signal are measurements from real networks, not the output of a simulation tool. Results show that detection of anomalies is achievable in a variety of signals.
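The Donoho-Johnstone universal threshold mentioned here is λ = σ√(2 ln n). Below is a minimal sketch of applying it to the finest-scale detail coefficients to flag abrupt changes, using the standard MAD-based noise estimate; the wavelet choice is an assumption.

```python
import numpy as np
import pywt

def detect_changes(signal, wavelet="db4"):
    cA, cD = pywt.dwt(signal, wavelet)        # one-level decomposition
    sigma = np.median(np.abs(cD)) / 0.6745    # robust (MAD) noise estimate
    lam = sigma * np.sqrt(2.0 * np.log(len(signal)))
    # Detail coefficients above the universal threshold flag
    # sudden changes (indices are at the coarser, half-rate scale).
    return np.flatnonzero(np.abs(cD) > lam)
```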
Conference Paper
The rapid and accurate detection of network traffic anomalies is one of the preconditions for the effective operation of a network. To address the shortcomings of existing methods of network traffic anomaly detection, we propose a scale-adaptive method based on wavelet packets. By means of wavelet packet decomposition, our method can adjust the decomposition process adaptively and has the same detection ability for anomalies at all frequencies, in particular the middle- and high-frequency anomalies that cannot be detected by multi-resolution analysis. By adaptively reconstructing the wavelet packet coefficients of the wavelet domains in which anomalies appear, our method is able to confirm the characteristics of an anomaly and enhance the reliability of detection. Simulation results show that the method can detect network traffic anomalies efficiently.
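A minimal sketch of a wavelet packet decomposition of a traffic signal, assuming PyWavelets' WaveletPacket; the per-node energy summary is an illustrative stand-in for the paper's adaptive selection criterion. Unlike plain multi-resolution analysis, the packet tree also splits the detail branches, exposing the mid- and high-frequency bands the abstract refers to.

```python
import numpy as np
import pywt

def packet_energies(signal, wavelet="db4", maxlevel=4):
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet,
                            mode="symmetric", maxlevel=maxlevel)
    # Energy of each terminal node, ordered by frequency; a band whose
    # energy deviates strongly from its baseline is a candidate anomaly.
    return {node.path: float(np.sum(np.asarray(node.data) ** 2))
            for node in wp.get_level(maxlevel, order="freq")}
```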
Article
This paper presents Swing, a closed-loop, network-responsive traffic generator that accurately captures the packet interactions of a range of applications using a simple structural model. Starting from observed traffic at a single point in the network, Swing automatically extracts distributions for user, application, and network behavior. It then generates live traffic corresponding to the underlying models in a network emulation environment running commodity network protocol stacks. We find that the generated traffic is statistically similar to the original traffic. Furthermore, to the best of our knowledge, we are the first to reproduce burstiness in traffic across a range of time-scales using a model applicable to a variety of network settings. An initial sensitivity analysis reveals the importance of our individual model parameters to accurately reproduce such burstiness. Finally, we explore Swing's ability to vary user characteristics, application properties, and wide-area network conditions to project traffic characteristics into alternate scenarios.
Conference Paper
Automated detection of anomalies in network traffic is an important and challenging task. In this work we propose an automated system to detect volume-based anomalies in network traffic caused by Denial of Service (DoS) attacks. The system has a two-stage architecture that combines more traditional approaches (Adaptive Threshold and Cumulative Sum) with a novel one based on the Continuous Wavelet Transform. Thanks to the proposed architecture, we obtain good results in terms of the trade-off between correct detections and false alarms, estimation of anomaly duration, and the ability to distinguish between subsequent anomalies. We test our system using a set of publicly available traffic traces on which we superimpose anomalies generated by real DoS attack tools. Extensive test results show that the proposed system accurately detects a wide range of anomalies and how the performance indicators are affected by anomaly characteristics (i.e., amplitude and duration).
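A minimal sketch of the Cumulative Sum (CUSUM) stage of such a detector: accumulate positive deviations of the traffic volume from its mean and raise an alarm when the sum crosses a threshold. The drift and alarm parameters are illustrative, and the standardisation step is an assumption rather than the paper's exact formulation.

```python
import numpy as np

def cusum(x, drift=0.5, threshold=5.0):
    """Return indices where the one-sided CUSUM statistic alarms."""
    mu, sd = np.mean(x), np.std(x)
    z = (x - mu) / sd                  # standardised volume series
    s, alarms = 0.0, []
    for i, zi in enumerate(z):
        s = max(0.0, s + zi - drift)   # one-sided cumulative sum
        if s > threshold:
            alarms.append(i)
            s = 0.0                    # restart after an alarm
    return alarms
```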
Article
With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable-knot spline, or variable-bandwidth kernel, to the unknown function. Estimation with the aid of an oracle offers dramatic advantages over traditional linear estimation by nonadaptive kernels; however, it is a priori unclear whether such performance can be obtained by a procedure relying on the data alone. We describe a new principle for spatially adaptive estimation: selective wavelet reconstruction. We show that variable-knot spline fits and piecewise-polynomial fits, when equipped with an oracle to select the knots, are not dramatically more powerful than selective wavelet reconstruction with an oracle. We develop a practical spatially adaptive method, RiskShrink, which works by shrinkage of empirical wavelet coefficients. RiskShrink mimics the performance of an oracle for selective wavelet reconstruction as well as it is possible to do so. A new inequality in multivariate normal decision theory, which we call the oracle inequality, shows that attained performance differs from ideal performance by at most a factor of 2 log n, where n is the sample size. Moreover, no estimator can give a better guarantee than this. Within the class of spatially adaptive procedures, RiskShrink is essentially optimal. Relying only on the data, it comes within a factor of log² n of the performance of piecewise-polynomial and variable-knot spline methods equipped with an oracle. In contrast, it is unknown how or if piecewise-polynomial methods could be made to function this well when denied access to an oracle and forced to rely on data alone.
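A minimal sketch of wavelet shrinkage in the spirit described here: soft-threshold the empirical detail coefficients and invert the transform. The universal threshold σ√(2 log n) is used below as a simple stand-in for RiskShrink's oracle-mimicking threshold, and the wavelet and level are assumptions.

```python
import numpy as np
import pywt

def shrink(signal, wavelet="db4", level=5):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Estimate the noise level from the finest-scale details (MAD).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    lam = sigma * np.sqrt(2.0 * np.log(len(signal)))
    # Soft-threshold every detail band; keep the approximation intact.
    den = [coeffs[0]] + [pywt.threshold(c, lam, mode="soft")
                         for c in coeffs[1:]]
    return pywt.waverec(den, wavelet)[:len(signal)]
```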
Article
The Hurst exponent is a dimensionless estimator of the self-similarity of a time series. Initially defined by Harold Edwin Hurst to develop a law for regularities of the Nile water level, it now has applications in medicine and finance. Meaningful values lie in the range [0, 1]. Different methods for estimating the Hurst exponent have been evaluated: the classical "Rescaled Range" method developed by Harold Edwin Hurst and, in addition to today's standard method, two wavelet-based methods, one of which, developed by Gloter and Hoffmann, is proven to have the best convergence [4]. A core part of the project was to write software to implement and compare the different algorithms.
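A minimal sketch of the classical rescaled-range (R/S) estimator mentioned above: compute R/S over windows of increasing size and estimate H as the slope of log(R/S) versus log(window size). The window sizes are illustrative.

```python
import numpy as np

def hurst_rs(x, sizes=(16, 32, 64, 128, 256)):
    rs = []
    for n in sizes:
        chunks = [x[i:i + n] for i in range(0, len(x) - n + 1, n)]
        vals = []
        for c in chunks:
            dev = np.cumsum(c - np.mean(c))   # cumulative deviations
            r = dev.max() - dev.min()         # range of the deviations
            s = np.std(c)                     # standard deviation
            if s > 0:
                vals.append(r / s)
        rs.append(np.mean(vals))
    # Slope of the log-log fit estimates the Hurst exponent H.
    return np.polyfit(np.log(sizes), np.log(rs), 1)[0]
```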
Article
Network Denial-of-Service (DoS) attacks that disable network services by flooding them with spurious packets are on the rise. Criminals with large networks (botnets) of compromised nodes (zombies) use the threat of DoS attacks to extort legitimate companies. To fight these threats and ensure network reliability, early detection of these attacks is critical. Many methods have been developed, with limited success to date. This paper presents an approach that identifies change points in the time series of network packet arrival rates. The proposed process has two stages: (i) statistical analysis that finds the rate of increase of network traffic, and (ii) wavelet analysis of the network statistics that quickly detects the sudden increases in packet arrival rates characteristic of botnet attacks. Most intrusion detection methods are tested using data sets from special security testing configurations, which leads to unacceptable false positive rates when they are used in the real world. We test our approach using data from both network simulations and a large operational network. The true and false positive detection rates are determined for both data sets, and receiver operating characteristic curves use these rates to find optimal parameters for our approach. Evaluation using operational data proves the effectiveness of our approach.
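A heavily simplified sketch of the two-stage idea (rate-of-increase statistics, then wavelet detection of sudden jumps), assuming PyWavelets; the Haar wavelet, the k·σ threshold, and the index mapping below are assumptions for illustration, not the paper's exact method.

```python
import numpy as np
import pywt

def detect_attack(arrival_rates, k=3.0):
    slope = np.gradient(arrival_rates)        # stage 1: rate of increase
    _, cD = pywt.dwt(arrival_rates, "haar")   # stage 2: finest details
    lam = k * np.std(cD)                      # illustrative threshold
    spikes = np.flatnonzero(np.abs(cD) > lam)
    # Map detail indices back to the original time axis (scale 2) and
    # keep only jumps where traffic is actually increasing.
    return [2 * i for i in spikes
            if slope[min(2 * i, len(slope) - 1)] > 0]
```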
Conference Paper
The ability to generate repeatable, realistic network traffic is critical in both simulation and testbed environments. Traffic generation capabilities to date have been limited to either simple sequenced packet streams typically aimed at throughput testing, or to application-specific tools focused on, for example, recreating representative HTTP requests. In this paper we describe Harpoon, a new application-independent tool for generating representative packet traffic at the IP flow level. Harpoon generates TCP and UDP packet flows that have the same byte, packet, temporal and spatial characteristics as measured at routers in live environments. Harpoon is distinguished from other tools that generate statistically representative traffic in that it can self-configure by automatically extracting parameters from standard Netflow logs or packet traces. We provide details on Harpoon's architecture and implementation, and validate its capabilities in controlled laboratory experiments using configurations derived from flow and packet traces gathered in live environments. We then demonstrate Harpoon's capabilities in a router benchmarking experiment that compares Harpoon with commonly used throughput test methods. Our results show that the router subsystem load generated by Harpoon is significantly different, suggesting that this kind of test can provide important insights into how routers might behave under actual operating conditions.