Article

An Adaptive Traffic Sampling Method for Anomaly Detection


Abstract

The random packet sampling method is the simplest way to reduce the number of packets that a network monitoring system has to process. However, the accuracy of anomaly detection suffers because this method is biased toward large IP flows. To reduce the impact of sampled traffic on network anomaly detection, we propose an adaptive traffic sampling method based on time stratification. The scheme divides time into strata and samples each incoming packet with a probability given by a decreasing function f of the predicted size of the flow the packet belongs to. Instead of data streaming algorithms, we use packet samples and the sampling probability to estimate flow sizes, thus saving resources. Forced sampling is also employed to increase the accuracy of the estimates for smaller flows. Experimental results show that our scheme estimates anomalous traffic more accurately than traditional random packet sampling, and anomaly detection performance is therefore improved.
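As a rough illustration of the scheme outlined in the abstract, the sketch below stratifies time, samples each packet with a probability that decreases in the flow size predicted from the previous stratum, and force-samples periodically so small flows are not missed. The class name, the specific form of the decreasing function f, and all parameter values are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of the adaptive scheme outlined in the abstract. The class name,
# the form of the decreasing function f, and all defaults are illustrative assumptions.
import random
from collections import defaultdict

class AdaptiveStratifiedSampler:
    def __init__(self, base_prob=0.1, force_every=100):
        self.base_prob = base_prob          # probability applied to unseen/small flows
        self.force_every = force_every      # force-sample every Nth packet of a flow
        self.predicted = {}                 # flow sizes predicted from the previous stratum
        self.seen = defaultdict(int)        # packets observed per flow in this stratum
        self.samples = []                   # (flow_key, packet, sampling probability)

    def _prob(self, flow_key):
        """Decreasing function f of the predicted flow size (assumed form)."""
        return self.base_prob / (1.0 + self.predicted.get(flow_key, 0.0))

    def offer(self, flow_key, packet):
        self.seen[flow_key] += 1
        p = self._prob(flow_key)
        forced = self.seen[flow_key] % self.force_every == 1   # forced sampling keeps small flows visible
        if forced or random.random() < p:
            self.samples.append((flow_key, packet, p))

    def close_stratum(self):
        """End of a time stratum: estimate flow sizes from the samples and their probabilities."""
        counts, probs = defaultdict(int), {}
        for flow_key, _, p in self.samples:
            counts[flow_key] += 1
            probs[flow_key] = p
        # Simple inversion: sampled count divided by sampling probability
        # (forced samples are treated like regular ones to keep the sketch short).
        self.predicted = {k: counts[k] / max(probs[k], 1e-9) for k in counts}
        self.seen.clear()
        self.samples.clear()
```

Per packet, the sampler only needs the prediction table and one random draw; the table itself is rebuilt once per stratum from the samples, which is what lets the method avoid per-packet data streaming structures.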

... By design, this sampling approach ensures higher detection accuracy; however, it requires the deployment of scoring mechanisms throughout a network. Other algorithms include Fast filtered sampling [32], the Adaptive Traffic Sampling Method [20], Selective flow-based sampling [3], and the Adaptive weighted sampling algorithm based on least squares [38], which were developed specifically for the detection of network-level DoS attacks. ...
... [17]: Flow, No, Large. Sample-and-hold [19]: Hybrid, No, Large (packet sampling decision is based on an individual packet; the amount of sampling is capped on a flow basis). Sketch-guided sampling [25]: Hybrid, No, Small/Medium (packet sampling decision is based on the entire flow). Selective flow sampling [3]: Flow, Yes, Small. Fast filtered sampling [32]: Hybrid, Yes, Small (packet sampling decision is based on the entire flow). IP flow-based sampling [37]: Hybrid, Yes, Small/Medium (packet sampling decision is based on the entire flow). Adaptive weighted sampling [38]: Packet, Yes, no preference. Adaptive traffic sampling [20]: Hybrid, Yes, Small (packet sampling decision is based on the entire flow). ...
... The adaptive traffic sampling method adjusts the packet sampling probability according to the estimated flow size [20]. The sampling interval is divided into blocks, and incoming packets are sampled with a probability assessed from the previous block. ...
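A minimal sketch of the per-block update described in the excerpt above; the functional form and the default base probability are assumptions consistent with the sketch shown earlier on this page.

```python
# The probability used for a flow's packets in block k is derived from the flow size
# estimated in block k-1 (function shape and base_prob are assumptions).
def next_block_probability(sampled_count: int, prev_prob: float, base_prob: float = 0.1) -> float:
    estimated_size = sampled_count / prev_prob if prev_prob > 0 else 0.0
    return base_prob / (1.0 + estimated_size)   # decreasing in the estimated flow size
```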
Article
A recent escalation of application layer Denial of Service (DoS) attacks on the Internet has quickly shifted the interest of the research community traditionally focused on network-based DoS attacks. A number of studies came forward showing the potency of attacks, introducing new varieties and discussing potential detection strategies. The underlying problem that triggered all this research is the stealthiness of application layer DoS attacks. Since they usually do not manifest themselves at the network level, these types of attacks commonly avoid traditional network-layer based detection mechanisms. In this work we turn our attention to this problem and present a novel detection approach for application layer DoS attacks based on nonparametric CUSUM algorithm. We explore the effectiveness of our detection on various types of these attacks in the context of modern web servers. Since in production environments detection is commonly performed on a sampled subset of network traffic, we also study the impact of sampling techniques on detection of application layer DoS attack. Our results demonstrate that the majority of sampling techniques developed specifically for intrusion detection domain introduce significant distortion in the traffic that minimizes a detection algorithm’s ability to capture the traces of these stealthy attacks.
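The abstract above centers on a nonparametric CUSUM detector. A minimal one-sided CUSUM sketch follows; the feature (requests per interval), drift term, and threshold are illustrative assumptions, not the paper's tuned values.

```python
# A minimal one-sided (nonparametric) CUSUM sketch: accumulate positive deviations
# above a drift level and alarm when the statistic crosses a threshold.
def cusum_alarm(series, drift, threshold):
    """Return the first index where the CUSUM statistic crosses the threshold, else None."""
    s = 0.0
    for i, x in enumerate(series):
        s = max(0.0, s + x - drift)   # reset to zero when the evidence drops
        if s > threshold:
            return i
    return None

# Example: a step increase in per-second request counts triggers an alarm.
normal = [10, 12, 9, 11, 10, 13, 11]
attack = [40, 45, 50, 48]
print(cusum_alarm(normal + attack, drift=15.0, threshold=60.0))
```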
... The probabilistic function can be uniform, where all packets have an equal probability to be selected, or non-uniform, where the packets have different probability of selection. In [13] it is introduced a sampling technique based on a probabilistic scheme for anomaly detection, namely network scans, SYN flooding and worms. This technique divides time into strata and then selects an incoming packet with a probability, which is a decreasing function f of the predicted size of the flow the packet belongs to. ...
Article
Traffic sampling is viewed as a prominent strategy contributing to lightweight and scalable network measurements. Although multiple sampling techniques have been proposed and used to assist network engineering tasks, these techniques tend to address a single measurement purpose, without detailing the network overhead and computational costs involved. The lack of a modular approach when defining the components of traffic sampling techniques also makes their analysis difficult. Providing a modular view of sampling techniques and classifying their characteristics is, therefore, an important step to enlarge the sampling scope, improve the efficiency of measurement systems, and sustain forthcoming research in the area. Thus, this paper defines a taxonomy of traffic sampling techniques resorting to a comprehensive analysis of the inner components of existing proposals. After identifying granularity, selection scheme, and selection trigger as the main components differentiating sampling proposals, the study goes deeper on characterizing these components, including insights into their computational weight. Following this taxonomy, a general-purpose architecture is established to sustain the development of flexible sampling-based measurement systems. Traveling inside packet sampling techniques, this paper contributes to a clearer positioning and comparison of existing proposals, providing a road map to assist further research and deployments in the area.
Conference Paper
Full-text available
Packet sampling methods such as Cisco's NetFlow are widely employed by large networks to reduce the amount of traffic data measured. A key problem with packet sampling is that it is inherently a lossy process, discarding (potentially useful) information. In this paper, we empirically evaluate the impact of sampling on anomaly detection metrics. Starting with unsampled flow records collected during the Blaster worm outbreak, we reconstruct the underlying packet trace and simulate packet sampling at increasing rates. We then use our knowledge of the Blaster anomaly to build a baseline of normal traffic (without Blaster), against which we can measure the anomaly size at various sampling rates. This approach allows us to evaluate the impact of packet sampling on anomaly detection without being restricted to (or biased by) a particular anomaly detection method. We find that packet sampling does not disturb the anomaly size when measured in volume metrics such as the number of bytes and number of packets, but grossly biases the number of flows. However, we find that recently proposed entropy-based summarizations of packet and flow counts are affected less by sampling, and expose the Blaster worm outbreak even at higher sampling rates. Our findings suggest that entropy summarizations are more resilient to sampling than volume metrics. Thus, while not perfect, sampling still preserves sufficient distributional structure, which, when harnessed by tools like entropy, can expose hard-to-detect scanning anomalies.
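The evaluation idea above, thinning a packet trace at increasing sampling rates, can be mimicked in a few lines. In this toy sketch (synthetic trace, arbitrary rates), packet counts rescale well by 1/p while the rescaled flow count is grossly inflated, mirroring the bias reported in the abstract.

```python
# Thin a synthetic trace at rate p, then invert counts by 1/p and compare
# volume metrics (packets) with the flow count. Trace contents are made up.
import random

def sample_trace(trace, p, seed=0):
    rng = random.Random(seed)
    return [pkt for pkt in trace if rng.random() < p]

def metrics(trace, scale=1.0):
    flows = {pkt["flow"] for pkt in trace}
    return {"packets": len(trace) * scale, "flows": len(flows) * scale}

trace = [{"flow": f"10.0.0.{i % 50}"} for i in range(10_000)]   # 50 flows, 200 packets each
for p in (1.0, 0.1, 0.01):
    sampled = sample_trace(trace, p)
    print(p, metrics(sampled, scale=1.0 / p))
```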
Conference Paper
Full-text available
The traffic volume between origin/destination (OD) pairs in a network, known as the traffic matrix, is essential for efficient network provisioning and traffic engineering. Existing approaches to estimating the traffic matrix, based on statistical inference and/or packet sampling, usually cannot achieve very high estimation accuracy. In this work, we take a brand new approach to attacking this problem. We propose a novel data streaming algorithm that can process a traffic stream at very high speed (e.g., 40 Gbps) and produce traffic digests that are orders of magnitude smaller than the traffic stream. By correlating the digests collected at any OD pair using Bayesian statistics, the volume of traffic flowing between the OD pair can be accurately determined. We also establish principles and techniques for optimally combining this streaming method with sampling, when sampling is necessary due to stringent resource constraints. In addition, we propose another data streaming algorithm that estimates the flow matrix, a finer-grained characterization than the traffic matrix. The flow matrix is concerned with not only the total traffic between an OD pair (traffic matrix), but also how it splits into flows of various sizes. Through rigorous theoretical analysis and extensive synthetic experiments on real Internet traffic, we demonstrate that these two algorithms can produce very accurate estimates of the traffic matrix and flow matrix respectively.
Conference Paper
Full-text available
Knowing the distribution of the sizes of traffic flows passing through a network link helps a network operator to characterize network resource usage, infer traffic demands, detect traffic anomalies, and accommodate new traffic demands through better traffic engineering. Previous work on estimating the flow size distribution has been focused on making inferences from sampled network traffic. Its accuracy is limited by the (typically) low sampling rate required to make the sampling operation affordable. In this paper we present a novel data streaming algorithm to provide much more accurate estimates of flow distribution, using a "lossy data structure" which consists of an array of counters fitted well into SRAM. For each incoming packet, our algorithm only needs to increment one underlying counter, making the algorithm fast enough even for 40 Gbps (OC-768) links. The data structure is lossy in the sense that sizes of multiple flows may collide into the same counter. Our algorithm uses Bayesian statistical methods such as Expectation Maximization to infer the most likely flow size distribution that results in the observed counter values after collision. Evaluations of this algorithm on large Internet traces obtained from several sources (including a tier-1 ISP) demonstrate that it has very high measurement accuracy (within 2%). Our algorithm not only dramatically improves the accuracy of flow distribution measurement, but also contributes to the field of data streaming by formalizing an existing methodology and applying it to the context of estimating the flow-distribution.
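A minimal sketch of the counter-array idea described above: one hash and one increment per packet, with collisions left to a later inference step. The EM estimation is omitted here, and the array size is an arbitrary assumption.

```python
# "Lossy data structure": an array of counters indexed by a hash of the flow key.
# Each packet costs exactly one counter increment; colliding flows share a counter.
import hashlib

class CounterArray:
    def __init__(self, size=1 << 20):
        self.size = size
        self.counters = [0] * size

    def _index(self, flow_key: str) -> int:
        digest = hashlib.sha1(flow_key.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.size

    def add(self, flow_key: str) -> None:
        self.counters[self._index(flow_key)] += 1   # single update per packet

    def raw_estimate(self, flow_key: str) -> int:
        # Upper bound on the flow's size; the paper's Bayesian/EM step resolves collisions.
        return self.counters[self._index(flow_key)]
```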
Conference Paper
Full-text available
Monitoring the traffic in high-speed networks is a data-intensive problem. Uniform packet sampling is the most popular technique for reducing the amount of data the network monitoring hardware/software has to process. However, uniform sampling captures far less information than can be potentially obtained with the same overall sampling rate. This is because uniform sampling (unnecessarily) draws the vast majority of samples from large flows, and very few from small and medium flows. This information loss on small and medium flows significantly affects the accuracy of the estimation of various network statistics. In this work, we develop a new packet sampling methodology called "sketch-guided sampling" (SGS), which offers better statistics than obtainable from uniform sampling, given the same number of raw samples gathered. Its main idea is to make the probability with which an incoming packet is sampled a decreasing function of the size of the flow the packet belongs to. This way our scheme is able to significantly increase the packet sampling rate of the small and medium flows at slight expense of the large flows, resulting in much more accurate estimations of various network statistics. However, the exact sizes of all flows are available only if we keep per-flow information for every flow, which is prohibitively expensive for high-speed links. Our SGS scheme solves this problem by using a small (lossy) synopsis data structure called a counting sketch to encode the approximate sizes of all flows. Our evaluation on real-world Internet traffic traces shows that our sampling theory based on the approximate flow size estimates from the counting sketch works almost as well as if we knew the exact sizes of the flows.
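A rough sketch of the SGS idea: a small counting sketch supplies approximate flow sizes, and a packet is kept with a probability that decreases in that estimate. The count-min layout, hash choice, and threshold below are assumptions for illustration, not the paper's exact construction.

```python
# Sketch-guided sampling, roughly: estimate the flow's size from a small sketch,
# then keep the packet with probability ~threshold/estimate for large flows.
import random, hashlib

class CountMinSketch:
    def __init__(self, width=4096, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _indexes(self, key):
        for d in range(self.depth):
            h = hashlib.md5(f"{d}:{key}".encode()).digest()
            yield d, int.from_bytes(h[:8], "big") % self.width

    def add(self, key):
        for d, i in self._indexes(key):
            self.rows[d][i] += 1

    def estimate(self, key):
        return min(self.rows[d][i] for d, i in self._indexes(key))

def sgs_keep(sketch, flow_key, threshold=100):
    """Keep small flows with probability 1; large flows are sampled ever more lightly."""
    sketch.add(flow_key)
    est = sketch.estimate(flow_key)
    return random.random() < min(1.0, threshold / max(est, 1))
```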
Conference Paper
Full-text available
Sampling techniques are widely used for traffic measurements at high link speed to conserve router resources. Traditionally, sampled traffic data is used for network management tasks such as traffic matrix estimation, but recently it has also been used in numerous anomaly detection algorithms, as security analysis becomes increasingly critical for network providers. While the impact of sampling on traffic engineering metrics such as flow size and mean rate is well studied, its impact on anomaly detection remains an open question. This paper presents a comprehensive study on whether existing sampling techniques distort traffic features critical for effective anomaly detection. We sampled packet traces captured from a Tier-1 IP backbone using four popular methods: random packet sampling, random flow sampling, smart sampling, and sample-and-hold. The sampled data is then used as input to detect two common classes of anomalies: volume anomalies and port scans. Since it is infeasible to enumerate all existing solutions, we study three representative algorithms: a wavelet-based volume anomaly detection and two portscan detection algorithms based on hypothesis testing. Our results show that all four sampling methods introduce fundamental bias that degrades the performance of the three detection schemes; however, the degradation curves are very different. We also identify the traffic features critical for anomaly detection and analyze how they are affected by sampling. Our work demonstrates the need ...
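For reference, two of the four sampling methods compared above can be expressed in a few lines; the sampling rates and trace format here are arbitrary assumptions, and smart sampling and sample-and-hold are omitted.

```python
# Random packet sampling vs. random flow sampling over a list of packet dicts
# that carry a "flow" key (synthetic format for illustration).
import random

def random_packet_sampling(packets, p=0.01, seed=0):
    """Each packet is kept independently with probability p."""
    rng = random.Random(seed)
    return [pkt for pkt in packets if rng.random() < p]

def random_flow_sampling(packets, p=0.1, seed=0):
    """All packets of a flow are kept or dropped together with probability p."""
    rng = random.Random(seed)
    kept_flows = {}
    out = []
    for pkt in packets:
        key = pkt["flow"]
        if key not in kept_flows:
            kept_flows[key] = rng.random() < p
        if kept_flows[key]:
            out.append(pkt)
    return out
```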
Article
Full-text available
Identifying anomalies rapidly and accurately is critical to the efficient operation of large computer networks. Accurately characterizing important classes of anomalies greatly facilitates their identification; however, the subtleties and complexities of anomalous traffic can easily confound this process. In this paper we report results of signal analysis of four classes of network traffic anomalies: outages, flash crowds, attacks and measurement failures. Data for this study consists of IP flow and SNMP measurements collected over a six month period at the border router of a large university. Our results show that wavelet filters are quite effective at exposing the details of both ambient and anomalous traffic. Specifically, we show that a pseudo-spline filter tuned at specific aggregation levels will expose distinct characteristics of each class of anomaly. We show that an effective way of exposing anomalies is via the detection of a sharp increase in the local variance of the filtered data. We evaluate traffic anomaly signals at different points within a network based on topological distance from the anomaly source or destination. We show that anomalies can be exposed effectively even when aggregated with a large amount of additional traffic. We also compare the difference between the same traffic anomaly signals as seen in SNMP and IP flow data, and show that the more coarse-grained SNMP data can also be used to expose anomalies effectively.
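The detection rule above, a sharp increase in the local variance of filtered data, can be approximated with a plain sliding window. This sketch skips the wavelet/pseudo-spline filtering step, and the window length and jump factor are assumed values.

```python
# Flag indices where the windowed variance of a (pre-filtered) signal jumps well
# above the running mean of earlier window variances.
from statistics import pvariance

def local_variance_spikes(signal, window=12, factor=5.0):
    variances = [pvariance(signal[i - window:i]) for i in range(window, len(signal) + 1)]
    alarms = []
    for i, v in enumerate(variances[1:], start=1):
        baseline = sum(variances[:i]) / i          # running mean of past local variances
        if baseline > 0 and v > factor * baseline:
            alarms.append(i + window - 1)          # index into the original signal
    return alarms
```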
Article
Full-text available
Many routers can generate and export statistics on flows of packets that traverse them. Increasingly, high-end routers form flow statistics from only a sampled packet stream in order to manage the resource consumption involved.
Conference Paper
Accurate network traffic measurement is required for accounting, bandwidth provisioning, and detecting DoS attacks. However, keeping a counter to measure the traffic sent by each of a million concurrent flows is too expensive (using SRAM) or slow (using DRAM). The current state-of-the-art (e.g., Cisco NetFlow) methods, which count periodically sampled packets, are slow, inaccurate, and memory-intensive. Our paper introduces a paradigm shift by concentrating on the problem of measuring only "heavy" flows --- i.e., flows whose traffic is above some threshold such as 1% of the link. After showing that a number of simple solutions based on cached counters and classical sampling do not work, we describe two novel and scalable schemes for this purpose which take a constant number of memory references per packet and use a small amount of memory. Further, unlike NetFlow estimates, we have provable bounds on the accuracy of measured rates and the probability of false negatives. We also propose a new form of accounting called threshold accounting in which only flows above threshold are charged by usage while the rest are charged a fixed fee. Threshold accounting generalizes the familiar notions of usage-based and duration-based pricing.
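One of the scalable schemes associated with this line of work, commonly referred to as sample-and-hold (it also appears in the table excerpt near the top of this page), fits in a short sketch: a packet from an untracked flow is sampled with probability p, and once a flow has an entry every later packet is counted. Parameter values are illustrative.

```python
# Sample-and-hold: probabilistic entry creation, deterministic counting afterwards.
import random

class SampleAndHold:
    def __init__(self, p=0.001, seed=0):
        self.p = p
        self.rng = random.Random(seed)
        self.counters = {}   # flow_key -> packets counted since the flow was picked up

    def process(self, flow_key):
        if flow_key in self.counters:
            self.counters[flow_key] += 1          # hold: count every packet of tracked flows
        elif self.rng.random() < self.p:
            self.counters[flow_key] = 1           # sample: start tracking this flow

    def heavy_flows(self, threshold):
        return {k: v for k, v in self.counters.items() if v >= threshold}
```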
Conference Paper
Passive traffic measurement increasingly employs sampling at the packet level. Many high-end routers form flow statistics from a sampled substream of packets. Sampling controls the consumption of resources by the measurement operations. However, knowledge of the statistics of flows in the unsampled stream remains useful, for understanding both characteristics of source traffic and consumption of resources in the network. This paper provides methods that use flow statistics formed from a sampled packet stream to infer the frequencies of the number of packets per flow in the unsampled stream. A key task is to infer the properties of flows of original traffic that evaded sampling altogether. We achieve this through statistical inference, and by exploiting protocol-level detail reported in flow records. We investigate the impact on our results of different versions of packet sampling.
Conference Paper
The increasing practicality of large-scale flow capture makes it possible to conceive of traffic analysis methods that detect and identify a large and diverse set of anomalies. However the challenge of effectively analyzing this massive data source for anomaly diagnosis is as yet unmet. We argue that the distributions of packet features (IP addresses and ports) observed in flow traces reveals both the presence and the structure of a wide range of anomalies. Using entropy as a summarization tool, we show that the analysis of feature distributions leads to significant advances on two fronts: (1) it enables highly sensitive detection of a wide range of anomalies, augmenting detections by volume-based methods, and (2) it enables automatic classification of anomalies via unsupervised learning. We show that using feature distributions, anomalies naturally fall into distinct and meaningful clusters. These clusters can be used to automatically classify anomalies and to uncover new anomaly types. We validate our claims on data from two backbone networks (Abilene and Geant) and conclude that feature distributions show promise as a key element of a fairly general network anomaly diagnosis framework.
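Entropy as a summarization tool reduces a packet-feature distribution to a single number per measurement interval. A minimal sketch follows, using destination ports as the example feature and synthetic counts.

```python
# Normalized empirical entropy of a feature distribution (0 = concentrated, 1 = uniform).
from collections import Counter
from math import log2

def normalized_entropy(values):
    counts = Counter(values)
    total = sum(counts.values())
    h = -sum((c / total) * log2(c / total) for c in counts.values())
    return h / log2(len(counts)) if len(counts) > 1 else 0.0

# A port scan spreads traffic over many destination ports, which shows up as a
# jump in destination-port entropy relative to the baseline mix.
baseline_ports = [80] * 900 + [443] * 90 + [53] * 10
scan_ports = list(range(1, 1001))
print(normalized_entropy(baseline_ports), normalized_entropy(scan_ports))
```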
Article
Passive traffic measurement increasingly employs sampling at the packet level. Many high-end routers form flow statistics from a sampled substream of packets. Sampling is necessary in order to control the consumption of resources by the measurement operations. However, knowledge of the statistics of flows in the unsampled stream remains useful, for understanding both characteristics of source traffic and consumption of resources in the network. This paper provides methods that use flow statistics formed from a sampled packet stream to infer the absolute frequencies of lengths of flows in the unsampled stream. A key part of our work is inferring the numbers and lengths of flows of original traffic that evaded sampling altogether. We achieve this through statistical inference, and by exploiting protocol-level detail reported in flow records. The method has applications to detection and characterization of network attacks: we show how to estimate, from sampled flow statistics, the number of compromised hosts that are sending attack traffic past the measurement point. We also investigate the impact on our results of different implementations of packet sampling.
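Two back-of-the-envelope relations underlie this kind of inference when independent packet sampling at rate p is assumed: an observed per-flow count is inverted by 1/p, and a flow of n packets evades sampling entirely with probability (1-p)^n. The sketch below only illustrates these relations, not the paper's protocol-aware estimators.

```python
# Simple relations for independent packet sampling at rate p.
def estimated_original_packets(sampled_packets: int, p: float) -> float:
    """Unbiased inversion of an observed per-flow packet count."""
    return sampled_packets / p

def prob_flow_missed(flow_length: int, p: float) -> float:
    """Probability that a flow of the given length evades sampling entirely."""
    return (1.0 - p) ** flow_length

print(estimated_original_packets(3, 0.01))   # ~300 original packets
print(prob_flow_missed(5, 0.01))             # short flows are very likely to be missed
```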
Article
Sampling has become an integral part of passive network measurement. This role is driven by the need to control the consumption of resources in the measurement infrastructure under increasing traffic rates and the demand for detailed measurements from applications and service providers. Classical sampling methods play an important role in the current practice of Internet measurement. The aims of this review are (i) to explain the classical sampling methodology in the context of the Internet to readers who are not necessarily acquainted with either, (ii) to give an account of newer applications and sampling methods for passive measurement and (iii) to identify emerging areas that are ripe for the application of statistical expertise.
Conference Paper
Considerable research has been done on detecting and blocking portscan activities that are typically conducted by infected hosts to discover other vulnerable hosts. However, the focus has been on enterprise gateway-level intrusion detection systems where the traffic volume is low and network configuration information is readily available. This paper investigates the effectiveness of existing portscan detection algorithms in the context of a large transit backbone network and proposes a new algorithm that meets the demands of aggregated high-speed backbone traffic. Specifically, we evaluate two existing approaches: the portscan detection algorithm in SNORT, and a modified version of the TRW algorithm that is a part of the intrusion detection tool BRO. We then propose a new approach, TAPS, which uses sequential hypothesis testing to detect hosts that exhibit abnormal access patterns in terms of destination hosts and destination ports. We perform a comparative analysis of these three approaches using real backbone packet traces, and find that TAPS exhibits the best performance in terms of catching the maximum number of true scanners and yielding the least number of false positives. We have a working implementation of TAPS on our monitoring platform. Further implementation optimizations using Bloom filters are identified and discussed.
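TAPS builds on sequential hypothesis testing in the spirit of TRW. The sketch below shows a generic TRW-style likelihood-ratio test, not the TAPS algorithm itself; the success/failure probabilities and error targets are assumed values.

```python
# Sequential hypothesis test: scanners see connection attempts fail more often,
# so each failed attempt pushes the log-likelihood ratio toward "scanner".
from math import log

class SequentialScanTest:
    def __init__(self, p_benign=0.8, p_scanner=0.2, alpha=0.01, beta=0.01):
        self.llr = 0.0
        self.p_benign, self.p_scanner = p_benign, p_scanner   # Pr[connection succeeds | class]
        self.upper = log((1 - beta) / alpha)    # declare scanner above this
        self.lower = log(beta / (1 - alpha))    # declare benign below this

    def observe(self, connection_succeeded: bool):
        if connection_succeeded:
            self.llr += log(self.p_scanner / self.p_benign)
        else:
            self.llr += log((1 - self.p_scanner) / (1 - self.p_benign))
        if self.llr >= self.upper:
            return "scanner"
        if self.llr <= self.lower:
            return "benign"
        return None   # keep observing
```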
Article
We present a parameterizable methodology for profiling Internet traffic flows at a variety of granularities. Our methodology differs from many previous studies that have concentrated on end-point definitions of flows in terms of state derived from observing the explicit opening and closing of TCP connections. Instead, our model defines flows based on traffic satisfying various temporal and spatial locality conditions, as observed at internal points of the network. This approach to flow characterization helps address some central problems in networking based on the Internet model. Among them are route caching, resource reservation at multiple service levels, usage-based accounting, and the integration of IP traffic over an ATM fabric. We first define the parameter space and then concentrate on metrics characterizing both individual flows as well as the aggregate flow profile. We consider various granularities of the definition of a flow, such as by destination network, host-pair, or host and port quadruple. We include some measurements based on case studies we undertook, which yield significant insights into some aspects of Internet traffic, including demonstrating (i) the brevity of a significant fraction of IP flows at a variety of traffic aggregation granularities, (ii) that the number of host-pair IP flows is not significantly larger than the number of destination network flows, and (iii) that schemes for caching traffic information could significantly benefit from using application information.
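A timeout-based flow definition of the kind described above can be sketched directly; the host-pair key and the 64-second inactivity timeout below are just one parameter choice within such a methodology, picked for illustration.

```python
# Group time-ordered packets into flows at a chosen granularity, closing a flow
# after a period of inactivity longer than the timeout.
def aggregate_flows(packets, timeout=64.0, key=lambda pkt: (pkt["src"], pkt["dst"])):
    """packets: iterable of dicts with 'src', 'dst', 'ts' (seconds), sorted by time."""
    active = {}          # flow key -> [first_ts, last_ts, packet_count]
    finished = []
    for pkt in packets:
        k = key(pkt)
        state = active.get(k)
        if state and pkt["ts"] - state[1] > timeout:
            finished.append((k, state))     # inactivity gap: close the old flow
            state = None
        if state is None:
            active[k] = [pkt["ts"], pkt["ts"], 1]
        else:
            state[1] = pkt["ts"]
            state[2] += 1
    finished.extend(active.items())
    return finished
```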
Article
Knowing the distribution of the sizes of traffic flows passing through a network link helps a network operator to characterize network resource usage, infer traffic demands, detect traffic anomalies, and accommodate new traffic demands through better traffic engineering. Previous work on estimating the flow size distribution has been focused on making inferences from sampled network traffic. Its accuracy is limited by the (typically) low sampling rate required to make the sampling operation affordable. In this paper we present a novel data streaming algorithm to provide much more accurate estimates of flow distribution, using a "lossy data structure" which consists of an array of counters fitted well into SRAM. For each incoming packet, our algorithm only needs to increment one underlying counter, making the algorithm fast enough even for 40 Gbps (OC-768) links. The data structure is lossy in the sense that sizes of multiple flows may collide into the same counter. Our algorithm uses Bayesian statistical methods such as Expectation Maximization to infer the most likely flow size distribution that results in the observed counter values after collision. Evaluations of this algorithm on large Internet traces obtained from several sources (including a tier-1 ISP) demonstrate that it has very high measurement accuracy (within 2%). Our algorithm not only dramatically improves the accuracy of flow distribution measurement, but also contributes to the field of data streaming by formalizing an existing methodology and applying it to the context of estimating the flow distribution.
Time-Series Analysis
  • J Gottman
Is sampled data sufficient for anomaly detection?//Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement
  • Jianning Mai
  • Chen-Nee Chuah
  • Ashwin Sridharan
Impact of Packet Sampling on Anomaly Detection Metrics//Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement
  • Daniela Brauckhoff
  • Bernhard Tellenbach
  • Arno Wagner