A Fast and Compact Method for Unveiling Significant Patterns in High Speed Networks
ABSTRACT Identification of significant patterns in network traffic, such as IPs or flows that contribute large volume (heavy hitters) or introduce large changes (heavy changers), has many applications in accounting and network anomaly detection. As network speed and the number of flows grow rapidly, tracking per-IP or per-flow statistics becomes infeasible due to both the computational overhead and memory requirements. In this paper, we propose a novel sequential hashing scheme that requires only O(H log N) memory and computational overhead, both of which are close to optimal, where N is the number of all possible keys (e.g., flows, IPs) and H is the maximum number of heavy keys. Moreover, the generalized sequential hashing scheme makes it possible to trade off among memory, update cost, and detection cost over a large range, which different computer architectures can exploit to optimize overall performance. In addition, we propose statistically efficient algorithms for estimating the values of heavy hitters and heavy changers. Using both theoretical analysis and experimental studies of Internet traces, we demonstrate that our approach achieves the same accuracy as existing methods while using much less memory and computational overhead.
Tian Bu∗, Jin Cao∗, Aiyou Chen∗, Patrick P. C. Lee†
∗Bell Laboratories, Alcatel-Lucent, NJ, USA
†Department of Computer Science, Columbia University, New York, NY, USA
{tbu, cao, aychen}@research.bell-labs.com, pclee@cs.columbia.edu
I. INTRODUCTION
Monitoring and detecting significant behaviors in a network,
such as the presence of persistent large flows or a sudden
increase in network traffic due to the emergence of new
flows, are essential for network provisioning, management and
security because significant behaviors often imply events of
interest.
In this paper, we focus on the detection of two important
significant behaviors known as heavy hitters and heavy
changers. A heavy hitter is a key whose traffic exceeds a
pre-defined threshold, whereas a heavy changer is a key whose
change in traffic volume between two monitoring intervals
exceeds a pre-defined threshold (see footnote 1). Here a key
represents a source IP address/port, a destination IP
address/port, or their combinations such as the five-tuple flow.
For instance, a flow that accounts for more than 10% of total
traffic, which is a heavy hitter by flows, may suggest the
violation of a service agreement. On the other hand, a sudden
increase of traffic volume flowing to a destination, which is a
heavy changer by destination, may indicate a hot spot, the
beginning of a DoS attack, or traffic rerouting due to link
failures elsewhere. The goal of the heavy key detection problem
is to identify all heavy keys (i.e., either heavy hitters or
heavy changers) and estimate their associated values with a low
error rate while minimizing both memory usage and computational
overhead.
Footnote 1: There are more sophisticated definitions of "change"
that account for traffic forecast models. However, the technique
we develop in the paper would also apply to such definitions with
linear forecast models. Using the simple definition of change
allows us to explain the technique more clearly.
However, as the Internet continues to grow in size and
complexity, the ever-increasing network bandwidth poses great
challenges for monitoring heavy keys in real time due to
computational and storage constraints. To identify any network
flow that causes significant volume change, the system should
scale up to at least $2^{104}$ keys (see footnote 2). Some
fundamental requirements for monitoring and detecting significant
patterns in real time for high bandwidth links are discussed below.
• Fast per-packet update. The per-packet update speed has
to keep up with the link bandwidth even in the worst case
when all packets are of the smallest possible size.
Otherwise the real-time constraint is violated.
• Fast discovery of significant patterns. The detection delay
of significant patterns should be short so that important
events like network attacks and link failures can be
responded to in time, before any serious damage is done.
• High accuracy. Both false positive and false negative
rates should be minimized. A false negative may miss an
important event and thus delay the necessary reaction. A
false positive, on the other hand, may trigger unnecessary
responses that waste resources.
Data monitoring algorithms based on efficient data structures
have been studied for heavy hitter detection (e.g., [7])
or traffic-volume query (e.g., [1]). Estan and Varghese [5] use
parallel hash tables to identify large flows using a memory
that is only a small constant larger than the number of
large flows. However, such schemes address only heavy hitter
detection, not heavy changer detection. Both [6] and [12]
address heavy changer detection. In particular, [12] proposes a
modularized hashing scheme to narrow down a candidate set
of heavy keys. Our proposed approach improves on the scheme in
[12] in accuracy as well as in computational overhead
and memory usage.
Footnote 2: This number is calculated based on the number of
possible five-tuple flows: source IP address (32 bits), source
port (16 bits), destination IP address (32 bits), destination
port (16 bits), and protocol (8 bits). The number may be
significantly smaller for realistic traffic since not all
combinations of these fields occur.
In summary, the main contributions of this paper are:
• We derive a lower bound of memory usage when applying
parallel hash tables for heavy key detection for a given
error rate.
• We propose a sequential hashing scheme that uses multi-level
hash arrays for fast and accurate detection of heavy
keys while minimizing computational overhead and memory
usage. Moreover, we demonstrate that our scheme can
trade off between computational overhead and memory
usage so as to maximize the overall system performance
when implemented on different hardware architectures.
• We design efficient yet accurate methods for estimating
the values of heavy keys. We also demonstrate that our
estimation methods can further reduce errors introduced
in the detection stage. With the help of the estimation
step, our detection scheme can be more memory efficient
by allowing a high error rate in the detection stage and
then eliminating the errors in the estimation stage.
• Through extensive simulation using real Internet traces
collected from a high speed link, we show that our
scheme yields more accurate results, yet is more memory
and computationally efficient, than existing work.
The rest of the paper is organized as follows. We derive
a lower bound of memory requirement when using parallel
hash tables for heavy hitter/changer detection in Section II.
Section III focuses on the multi-level hash techniques for
heavy hitter/changer detection. In Section IV, we present
efficient algorithms for estimating the values of heavy
hitters/changers. Section V shows the evaluation results using
Internet traces. We conclude the paper in Section VI.
II. LOWER BOUND OF MEMORY REQUIREMENT OF A HASH ARRAY
We model the set of network traffic within a measurement
interval as a stream of data items that arrive sequentially, where
each item $(x, v_x)$ consists of a key $x \in \{0, 1, \ldots, N-1\}$
and an associated value $v_x$. The identification of heavy keys
(i.e., either heavy hitters or heavy changers) is straightforward
if all values of $v_x$ are known. However, tracking the exact
values of $v_x$ for all $x$ may not be feasible for large $N$. To
overcome this, as proposed in [5], [12], we introduce here the use
of a single hash array for approximating the heavy keys, where a
hash array consists of $M$ hash tables each with $K$ buckets. The
hash functions for each table are chosen independently from
a class of 2-universal hash functions, and so the $K$ buckets
of each table form a random partition of the $N$ keys. We define
$y_{m,j}$ as the sum of $v_x$ for all $x$ in the $j$th bucket of
the $m$th table. Table I summarizes the important notation used in
this paper.
In this section, we derive the lower bound of memory (in
terms of the total number of buckets in a hash array) required
for identifying the heavy keys in network traffic using a single
hash array. We first describe our analysis for the case of heavy
hitter detection. We will then show how the results may also
apply to heavy changer detection.
TABLE I
NOTATION
Symbol                 Meaning
x, v_x                 key and the value associated with key x in the stream
N, N_i                 size of key set
M, M_i                 number of hash tables in one hash array
U                      memory size (total number of buckets)
H                      true number of heavy hitters/changers
K                      size of a hash table
γ                      H/K
ε, α                   expected number of false positives divided by H (Eq. 2)
D                      number of hash arrays (also the number of words in a key)
C, C_i                 candidate set of heavy hitters
y_{m,j}, y_{i,m,j}     sum of v_x for all x mapped to bucket j of table m
(Notation with subscript i denotes the corresponding quantities for the ith
hash array in the sequential hashing scheme presented in Section III.)
A. Memory lower bound for heavy hitter detection
Recall that a heavy hitter is a key $x$ whose traffic value $v_x$
exceeds a pre-specified threshold $t$. Suppose there are $H$ heavy
hitters. Call a bucket heavy iff its $y$ value crosses the threshold
$t$. For any heavy hitter, it is easy to see that the bucket that it
falls into in each of the $M$ tables is a heavy bucket. Therefore,
a superset of heavy hitter keys, say $C$, can be formed by using
the intersection of $M$ subsets, each of which consists of the keys
in the heavy buckets corresponding to one table.
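The construction of the candidate set $C$ from a single hash array can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the hash family `((a*x + b) mod P) mod K`, the helper names `make_hash`, `build_hash_array` and `candidate_set`, and the toy traffic are all hypothetical, and "heavy" is taken to mean a bucket value strictly above $t$.

```python
import random

P = (1 << 61) - 1  # prime modulus for the assumed 2-universal hash family


def make_hash(K, rng):
    """Draw h(x) = ((a*x + b) mod P) mod K with random a != 0 and b."""
    a, b = rng.randrange(1, P), rng.randrange(P)
    return lambda x: ((a * x + b) % P) % K


def build_hash_array(stream, M, K, seed=0):
    """Return (hash functions, tables) after adding every (x, v_x) item."""
    rng = random.Random(seed)
    hashes = [make_hash(K, rng) for _ in range(M)]
    tables = [[0] * K for _ in range(M)]
    for x, v in stream:
        for h, table in zip(hashes, tables):
            table[h(x)] += v          # y_{m,j} accumulates v_x
    return hashes, tables


def candidate_set(N, hashes, tables, t):
    """Keys whose bucket is heavy (value > t) in every one of the M tables."""
    return {x for x in range(N)
            if all(table[h(x)] > t for h, table in zip(hashes, tables))}


# Toy example: 3 heavy hitters among N = 4096 keys; other traffic is negligible.
N, t = 4096, 100
heavy = {17: 500, 1234: 300, 4000: 250}
stream = list(heavy.items()) + [(x, 0.001) for x in range(0, N, 7)]
hashes, tables = build_hash_array(stream, M=8, K=8)
C = candidate_set(N, hashes, tables, t)
print(sorted(heavy), len(C))
```

Every true heavy hitter is guaranteed to appear in `C`; any additional keys in `C` are false positives caused by collisions with heavy buckets. Note that `candidate_set` enumerates the whole key space, which is exactly the cost that Section III sets out to avoid.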
In order to derive the lower bound, we assume that the
traffic distribution is very skewed such that the sum of any set
of non-heavy hitter key values is less than the threshold, i.e.,
the contributions of non-heavy hitters are negligible. Assume
that $H \ll N$. Let $Z$ be the number of heavy hitters contained
in an arbitrary bucket, and let $\gamma = H/K$, i.e.,
$K = \gamma^{-1} H$. The following two lemmas describe
the distribution of $Z$ and the expected size $E|C|$ of set $C$ in
the lower bound case.
Lemma 1: $Z \approx \mathrm{Binomial}(1/K, H)$, i.e., $H$ trials each
with success probability $1/K$. When $H$ is large (say greater than
100), $Z \approx \mathrm{Poisson}(\gamma)$.
The proof is straightforward and is omitted. When $\gamma = \log 2$
(see Theorem 1 below), Lemma 1 indicates that about 50% of
the buckets do not contain any heavy hitters and that among
heavy buckets about 70% of them contain exactly one heavy
hitter.
Lemma 2: $E|C| \approx H + (N - H)\left(1 - (1 - 1/K)^{H}\right)^{M}$.
When $H$ is large, then
$$E|C| \approx H + (N - H)(1 - e^{-\gamma})^{M}. \qquad (1)$$
Proof: Let $p_e$ be the probability that a non-heavy hitter
falls into the set $C$. Notice that the probability that a non-heavy
hitter falls into the heavy buckets of the $l$-th table is
$p_l \approx 1 - (1 - 1/K)^{H}$, since each heavy hitter can be
treated independently as an approximation due to $H \ll N$. The
result follows readily from $p_e = \prod_{l=1}^{M} p_l$ and
$E|C| = H + (N - H) p_e$.
For the set $C$, let $\epsilon$ be the expected normalized false
positives, defined as the expected number of false positives divided
by $H$ (see footnote 3), i.e.,
$$E|C| = H + \epsilon H. \qquad (2)$$
Then by (1), for a given value $\epsilon$ and a large $H$, the
required number of tables of the hash array is
$$M = -\frac{\log(N \epsilon^{-1} H^{-1})}{\log(1 - e^{-\gamma})}. \qquad (3)$$
Footnote 3: The expected false positive error of the set $C$,
defined by the number of false positives divided by the size of $C$,
is $\epsilon/(1 + \epsilon)$.
Therefore, the required memory, say $U \equiv MK$, is logarithmic
in $N$ and linear in $H$. The following theorem states the
minimal memory requirement for achieving a specified false
positive error.
Theorem 1: Given an expected normalized number of false positives
$\epsilon$, the memory size $U$ is minimized when $K = H/\log 2$ and
$M = \log_2(N \epsilon^{-1} H^{-1})$ for a large $H$ (say larger
than 100).
The proof is based on minimizing the memory size directly,
but the details are omitted due to space limitations. In fact, this
memory optimization problem is essentially the same as the one
that arises in the design of Bloom filters [2]. A nice
survey of Bloom filters and their network-related applications is
given in [3].
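The quantities in Lemma 1, Lemma 2, Eq. (3) and Theorem 1 are easy to evaluate numerically. The sketch below is only an illustration: the function names and the parameter values ($N = 2^{32}$, $H = 1000$, $\epsilon = 10^{-3}$) are chosen for the example, and $N - H$ is approximated by $N$ as in the derivation of (3).

```python
import math


def expected_candidates(N, H, K, M):
    """E|C| from Lemma 2 (large-H form, Eq. (1)), with gamma = H / K."""
    gamma = H / K
    return H + (N - H) * (1.0 - math.exp(-gamma)) ** M


def required_tables(N, H, K, eps):
    """Number of tables M from Eq. (3) for normalized false positives eps."""
    gamma = H / K
    return -math.log(N / (eps * H)) / math.log(1.0 - math.exp(-gamma))


def optimal_design(N, H, eps):
    """Theorem 1: K = H / log 2 and M = log2(N / (eps * H))."""
    return H / math.log(2), math.log2(N / (eps * H))


N, H, eps = 2**32, 1000, 1e-3
K_opt, M_opt = optimal_design(N, H, eps)
U_opt = K_opt * M_opt
print("K =", round(K_opt), " M =", round(M_opt, 2), " U =", round(U_opt))

# Memory needed for the same eps when K deviates from the optimum.
for scale in (0.5, 2.0):
    K = scale * K_opt
    print("K scaled by", scale, "-> U =", round(K * required_tables(N, H, K, eps)))

# Lemma 1 with gamma = log 2: Z ~ Poisson(gamma).
p_empty = math.exp(-math.log(2))               # P(Z = 0)
p_single = math.log(2) * p_empty               # P(Z = 1)
print("empty buckets:", round(p_empty, 3),
      " single-hitter share of heavy buckets:", round(p_single / (1 - p_empty), 3))
```

Under these assumptions both scaled choices of $K$ need more buckets than the choice from Theorem 1, and the Poisson probabilities reproduce the "about 50%" and "about 70%" figures quoted after Lemma 1.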
There is a trade-off between the memory requirement and
the number of hash computations for achieving a fixed false
positive error. Figure 1 shows the trade-off between $M$ and $U$
for the case where $N = 2^{32}$ and $H = 1000$ in the lower bound
case. The circles represent the optimal pair $(M, U)$ such that $U$
is minimized. To achieve the same expected normalized false
positive error ($\epsilon = 10^{-6}$ or $\epsilon = 10^{-3}$), we
can in fact use just half of the optimal number of hash tables at
the price of increasing the memory size by only about 20%. This may
be desirable when hash operations are considered expensive.
Fig. 1. Trade-off between the number of hashing tables and memory size.
[Plot: x-axis, number of hashing functions (m); y-axis, ratio w.r.t.
required minimal memory size; curves for ε = 10^{-6} and ε = 10^{-3}.]
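The "about 20%" figure can be checked directly from Eq. (3) and Theorem 1. The following sketch is an illustrative calculation, not part of the original analysis; `memory_ratio` and the fraction `f` (the number of tables used relative to the optimal number) are names introduced here, and $N - H$ is approximated by $N$.

```python
import math


def memory_ratio(f):
    """U / U_min when M = f * M_opt tables are used (f = 1 is the optimum).

    With M fixed, Eq. (3) requires (1 - exp(-gamma)) ** M = eps * H / N,
    which gives 1 - exp(-gamma) = 2 ** (-1 / f) regardless of eps, H and N.
    """
    gamma = -math.log(1.0 - 2.0 ** (-1.0 / f))
    # U = M * K = (f * M_opt) * (H / gamma), and U_min = M_opt * H / log 2.
    return f * math.log(2) / gamma


for f in (0.5, 0.75, 1.0, 1.5):
    print("fraction of optimal tables:", f,
          " memory ratio:", round(memory_ratio(f), 3))
```

For `f = 0.5` the ratio is $\tfrac{1}{2}\log 2 / \log(4/3) \approx 1.20$, in line with the statement that halving the number of tables costs about 20% more memory.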
B. Memory for heavy changer detection
For the $(m,j)$th bucket, let $y^{(1)}_{m,j}$ and $y^{(2)}_{m,j}$ be
the bucket values in intervals 1 and 2 respectively, and let
$y_{m,j} = y^{(2)}_{m,j} - y^{(1)}_{m,j}$ be the change in the
bucket value. For the heavy changer case, a bucket is considered
heavy iff $|y_{m,j}|$ crosses a pre-specified threshold $t$. When
the values of non-heavy changers are negligible, unlike the heavy
hitter case presented above, it is now possible that some positive
changers and negative changers collide in the same bucket such that
the bucket is not heavy (i.e., $|y_{m,j}|$ is less than $t$).
Therefore, the outcome of the threshold test does not fully reflect
the values of heavy keys, and there will be a false negative error
in addition to the false positive error when using the intersections
of heavy buckets to identify the heavy changers. To control the
false negative error, a notion of misses has been introduced in
[6] and [12] to refer to those non-heavy buckets, so that a key
is included in the candidate set if it falls into at least $M - r$
heavy buckets, where $r$ is the number of allowed misses. We
refine this criterion by using an additional constraint: for a
miss (i.e., a non-heavy bucket) to be considered legitimate, the
bucket value in either $y^{(1)}_{m,j}$ or $y^{(2)}_{m,j}$ has to
cross the threshold $t$. We found the inclusion of this refined
criterion very useful in reducing the false positives. With the
allowed $r$ misses, the false positive rate will increase, and hence
the memory requirement will increase. It is also clear that when
the values of non-heavy hitters or changers become significant, both
false negative and false positive rates will increase using the same
hash array, and so does the memory requirement for a given
false positive rate.
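The refined miss criterion can be written as a small predicate. The sketch below is one reading of the rule described above, under the assumptions that "crosses the threshold" means strictly greater than $t$ and that the caller has already looked up, for a given key, the pair of bucket values in each of the $M$ tables; the function name and argument layout are hypothetical.

```python
def is_candidate_changer(buckets, t, r):
    """Decide whether a key stays in the heavy-changer candidate set.

    buckets: list of (y1, y2) pairs, the values in intervals 1 and 2 of
             the bucket the key maps to in each of the M tables.
    t:       change threshold.
    r:       number of allowed misses.
    """
    misses = 0
    for y1, y2 in buckets:
        if abs(y2 - y1) > t:
            continue                  # heavy bucket, nothing more to check
        # Non-heavy bucket: the miss is legitimate only if the bucket value
        # crossed t in at least one interval (changes may have cancelled).
        if not (y1 > t or y2 > t):
            return False              # non-legitimate miss
        misses += 1
        if misses > r:
            return False              # more than r legitimate misses
    return True


t = 100
print(is_candidate_changer([(0, 200), (10, 300), (150, 160)], t, r=1))
print(is_candidate_changer([(0, 200), (10, 300), (150, 160)], t, r=0))
print(is_candidate_changer([(0, 200), (5, 8)], t, r=1))
```

In the first call the third bucket is a legitimate miss (its value exceeds $t$ in both intervals although the change is small), so the key is kept; the second call allows no misses, and the third contains a bucket that is light in both intervals.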
III. SEQUENTIAL HASHING SCHEME FOR IDENTIFYING HEAVY KEYS
To identify the heavy keys in a total of N keys using a
single hash array, one has to enumerate the entire key space
to see if each key falls into some heavy bucket in each of
the tables in the hash array. Such an approach, however, is
computationally expensive or even infeasible if the key space
is very large.
In this section, we propose a general framework of using a
multi-level hashing scheme for recovering H heavy elements
in N keys when enumerating the entire key space becomes
computationally prohibitive. The multi-level hashing scheme
allows us to divide the original problem into much smaller sub-
problems where the exhaustive search can be applied. We then
focus on a special version of the general multi-level hashing
scheme called sequential hashing, which has a few desirable
properties. Lastly, we present a mathematical analysis of
its complexity in terms of memory and computation, and
discuss the design parameter optimization for memory and
computation cost for a targeted false positive rate.
A. Multi-level hashing
To illustrate the general idea of multi-level hashing, for a
key $x$ with $n = \log_2 N$ bits, we first focus on identifying a
sub-key of $x$ with $b$ bits that belongs to a heavy key. We assume
$b$ is sufficiently small (say 4 or 8) such that enumeration of this
sub-key space for the identification of the heavy sub-keys is
now trivial using a hash array as described in Section II. Next,
we combine the heavy sub-keys that have just been found
with some remaining bits (say 2 or 4 bits) of the key to form a
larger sub-key with more bits, say $b'$ bits. Enumeration of this
larger sub-key space (with $b'$ bits) is now significantly reduced
because the smaller sub-keys (with $b$ bits) for heavy keys are
already known. Therefore, we can again use a new hash array
to identify the larger sub-keys of the heavy keys. Repeating
the process, we can eventually discover the key values of the
heavy keys in the original key space.
For easy enumeration, a similar idea is proposed in [12]
that divides a large key into smaller words that are enumerated
first. However, that scheme directly combines all enumerated
words of the heavy keys in one step, without intermediate
steps. This makes it not only more computationally expensive,
but also less flexible in trading off between computation and
space, as will be described later.
B. Sequential Hashing Scheme
We now propose a sequential hashing scheme for identifying
heavy keys, which is a special version of the multi-level hashing
scheme discussed above. Our sequential hashing scheme
consists of two major steps: (1) update step, which adds the
value of a key into the associated buckets of the hash arrays,
and (2) detection step, which determines the set of heavy keys.
Fig. 2. Relationship between a key and the hash arrays in the sequential
hashing scheme. [Diagram: the key is mapped by hash functions f_{i,j}
into hash tables T_{i,j} of K buckets each; the tables are grouped into
Array 1 (T_{1,1} to T_{1,4}), Array 2 (T_{2,1} to T_{2,3}), ..., Array D
(T_{D,1} to T_{D,5}).]
Figure 2 depicts the relationship between a key and the hash
arrays in the sequential hashing scheme. We partition a key $x$
into $D$ words $w_1 w_2 \cdots w_D$ such that each word $w_i$ has
$b_i$ bits, where $1 \le i \le D$. We now consider the sub-key
$w_1 \cdots w_i$, formed by the first $i$ words of key $x$. Let
$N_i = 2^{\sum_{r=1}^{i} b_r}$, and let $\mathcal{N}_i$ be the
corresponding sub-key space $\{0, 1, \ldots, N_i - 1\}$,
which contains all possible values of sub-key $w_1 \cdots w_i$. In
each sub-key space $\mathcal{N}_i$, let $\mathcal{H}_i$ denote the
set of sub-keys of those heavy keys in the original key space. Note
that $\mathcal{H}_i$ is at most of size $H$. In addition, we
construct a set of $D$ hash arrays, in which the $i$th hash array
corresponds to sub-key $w_1 \cdots w_i$ and contains $M_i$ hash
tables $T_{i,1}, \ldots, T_{i,M_i}$.
Algorithm 1 Update step
Input: a key x with value v
Partition key x into D words as w_1 w_2 ··· w_D, where word w_i
  has b_i bits for 1 ≤ i ≤ D
for i = 1 to D do
    for j = 1 to M_i do
        Increment the counter of bucket f_{i,j}(w_1 ··· w_i) in hash
          table T_{i,j} by value v
To begin with, Algorithm 1 outlines the update step. For
each incoming key $x = w_1 \cdots w_D$ with value $v$, we map
the sub-key $w_1 \cdots w_i$ using hash function $f_{i,j}$ to bucket
$f_{i,j}(w_1 \cdots w_i) \in \{1, \ldots, K\}$ in hash table
$T_{i,j}$, where $1 \le i \le D$ and $1 \le j \le M_i$. We then
increment the counter in the bucket by value $v$.
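A minimal Python sketch of the update step follows. It is an illustration under stated assumptions rather than the authors' code: the class name `SequentialHashArrays`, the hash family `((a*x + b) mod P) mod K`, and the example word sizes are hypothetical, and indices are 0-based where the paper uses 1-based indices.

```python
import random

P = (1 << 61) - 1  # prime modulus for the assumed hash family


class SequentialHashArrays:
    def __init__(self, word_bits, tables_per_array, K, seed=0):
        rng = random.Random(seed)
        self.word_bits = word_bits            # b_1, ..., b_D
        self.K = K                            # buckets per table
        # One (a, b) pair per hash function f_{i,j}.
        self.params = [[(rng.randrange(1, P), rng.randrange(P))
                        for _ in range(M_i)] for M_i in tables_per_array]
        # tables[i][j] is hash table T_{i,j} with K counters.
        self.tables = [[[0] * K for _ in range(M_i)]
                       for M_i in tables_per_array]

    def bucket(self, i, j, subkey):
        a, b = self.params[i][j]
        return ((a * subkey + b) % P) % self.K

    def update(self, x, v):
        """Algorithm 1: add v to one bucket in every table of every array."""
        remaining = sum(self.word_bits)
        for i, b_i in enumerate(self.word_bits):
            remaining -= b_i
            subkey = x >> remaining           # words w_1 ... w_{i+1} of x
            for j in range(len(self.tables[i])):
                self.tables[i][j][self.bucket(i, j, subkey)] += v


# A 16-bit key split into words of 8, 4 and 4 bits; arrays with 3, 2, 4 tables.
s = SequentialHashArrays(word_bits=[8, 4, 4], tables_per_array=[3, 2, 4], K=16)
s.update(0xABCD, 5)   # touches sub-keys 0xAB, 0xABC and 0xABCD
```

Each update performs $\sum_i M_i$ hash computations and counter increments, one per table, which is the per-packet update cost discussed in the introduction.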
Algorithm 2 summarizes the detection step for the case of
heavy hitter detection. The main idea is to decompose the
original problem of finding $H$ heavy keys into a sequence of
$D$ nested sub-problems, each of which determines a candidate
set $C_i$ from subspace $\mathcal{N}_i$ as an approximation of
$\mathcal{H}_i$. We first identify $C_1$ by searching for all values
in $\mathcal{N}_1$ that have all their associated buckets in
$T_{1,1}, \ldots, T_{1,M_1}$ considered to be heavy, i.e., the
counter of a bucket exceeds a pre-specified threshold. To determine
$C_i$, where $2 \le i \le D$, we first concatenate each sub-key
$x' \in C_{i-1}$ with an arbitrary word
$w_i \in \{0, \ldots, 2^{b_i} - 1\}$ to form $x''$. We then include
$x''$ into $C_i$ if all its associated buckets in
$T_{i,1}, \ldots, T_{i,M_i}$ are heavy (i.e., the variable flag
remains TRUE). We continue this process and finally return the
candidate set $C_D$.
Algorithm 2 Detection step
Inputs: hash tables {T_{i,j}}, 1 ≤ i ≤ D, 1 ≤ j ≤ M_i, with heavy buckets
Output: a set of heavy keys
1: Set C_0 = {0} and C_i = ∅ for 1 ≤ i ≤ D
2: for i = 1 to D do
3:     for all x' ∈ C_{i−1} do
4:         for w_i = 0 to 2^{b_i} − 1 do
5:             x'' = x' × 2^{b_i} + w_i
6:             Set flag = TRUE
7:             for j = 1 to M_i do
8:                 if bucket f_{i,j}(x'') in T_{i,j} NOT heavy then
9:                     Set flag = FALSE
10:                    Exit the for-loop of lines 7-10
11:            if flag == TRUE then
12:                Add x'' to C_i
13: return C_D
Note that Algorithm 2 is illustrated for heavy hitter detection.
For heavy changer detection, we include $r_i$ allowed
misses for the $i$th hash array and modify Line 8 as follows:
we set flag to FALSE if bucket $f_{i,j}(x'')$ is a non-legitimate
miss, or if the number of legitimate misses over the $i$th hash
array exceeds $r_i$ (see Section II-B for details).
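The heavy-hitter version of the detection step can be sketched as follows. This is a self-contained illustration, not the authors' implementation: the helper names, the hash family, the toy traffic and the parameter choices are assumptions made for the example, indices are 0-based, and a bucket is treated as heavy when its counter is strictly above $t$. The update loop is repeated here only so that the block runs on its own.

```python
import random

P = (1 << 61) - 1  # prime modulus for the assumed hash family


def make_arrays(tables_per_array, K, seed=0):
    rng = random.Random(seed)
    params = [[(rng.randrange(1, P), rng.randrange(P)) for _ in range(m)]
              for m in tables_per_array]
    tables = [[[0] * K for _ in range(m)] for m in tables_per_array]
    return params, tables


def bucket(params, K, i, j, subkey):
    a, b = params[i][j]
    return ((a * subkey + b) % P) % K


def update(params, tables, K, word_bits, x, v):
    remaining = sum(word_bits)
    for i, b_i in enumerate(word_bits):
        remaining -= b_i
        for j, table in enumerate(tables[i]):
            table[bucket(params, K, i, j, x >> remaining)] += v


def detect(params, tables, K, word_bits, t):
    """Heavy-hitter detection: grow candidate sub-keys one word at a time."""
    candidates = [0]                                   # C_0 = {0}
    for i, b_i in enumerate(word_bits):
        next_candidates = []
        for prefix in candidates:
            for w in range(1 << b_i):
                sub = (prefix << b_i) | w              # x'' = x' * 2^{b_i} + w_i
                # all() stops at the first non-heavy bucket (early exit).
                if all(table[bucket(params, K, i, j, sub)] > t
                       for j, table in enumerate(tables[i])):
                    next_candidates.append(sub)
        candidates = next_candidates
    return set(candidates)                             # C_D


word_bits, K, t = [8, 4, 4], 16, 100
params, tables = make_arrays([4, 4, 6], K)
heavy = {0xBEEF: 400, 0x1234: 250}
for x, v in heavy.items():
    update(params, tables, K, word_bits, x, v)
for x in range(0, 1 << 16, 257):                       # light background traffic
    update(params, tables, K, word_bits, x, 0.01)
found = detect(params, tables, K, word_bits, t)
print(sorted(hex(x) for x in found))
```

Because every prefix of a heavy key lands in heavy buckets, the true heavy keys are always contained in `found`; any extra keys are false positives whose expected number is controlled by the choice of $M_i$ and $K$ analyzed below.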
In the following, we present a mathematical complexity
analysis of our sequential hashing scheme in terms of memory
and computation, and discuss the design choice to achieve the
most savings in both memory and computation for a targeted
false positive rate. We first analyze the situation when the
non-heavy keys have negligible contribution to the counter
values, and then discuss how our result can be extended to the
situation of significant non-heavy keys. We show that with the
right design choice, our scheme can reduce the computation
in the detection step from $O(N)$ (by enumerating all $N$ keys)
to $O(H \log_2 N)$ with very little increase in total memory. We
also conduct a complexity comparison between our scheme and
the competing schemes, and show that our scheme is superior
in terms of both memory and computation.
C. Mathematical complexity analysis when non-heavy keys are negligible
Assume the heavy keys are distributed randomly in the key
space. Then it can be shown that the expected size of
$\mathcal{H}_i$ (i.e., the number of distinct first $i$ words of
the $H$ heavy keys) is
$$E|\mathcal{H}_i| \approx N_i \left[ 1 - \left(1 - \frac{1}{N_i}\right)^{H} \right] \approx H, \qquad (4)$$
where the approximation holds when $N_i \gg H$ (see footnote 4).
When the non-heavy keys have negligible contribution to the counter
values, the optimal value of $K$ which minimizes the memory
requirement is $K = \gamma^{-1} H$ with $\gamma = \log 2$, which is
independent of the size of the key space. Therefore, we can
choose the same number of buckets $K$ for the hash tables in
each hash array.
Footnote 4: In practice, this will be satisfied when $N_i \ge 64H$.
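As a quick numerical check of (4) and of the $N_i \ge 64H$ rule of thumb, the expression can be evaluated for a few sub-key space sizes. The function name and the value $H = 500$ are chosen here for illustration only.

```python
def expected_distinct_subkeys(N_i, H):
    """Eq. (4): expected number of distinct sub-keys when H heavy keys
    fall uniformly at random into a sub-key space of size N_i."""
    return N_i * (1.0 - (1.0 - 1.0 / N_i) ** H)


H = 500
for factor in (1, 4, 16, 64):
    N_i = factor * H
    ratio = expected_distinct_subkeys(N_i, H) / H
    print("N_i =", factor, "* H ->  E|H_i| / H =", round(ratio, 4))
```

When $N_i = H$ a large share of the heavy keys share a sub-key, whereas at $N_i = 64H$ the expected number of distinct sub-keys is within about 1% of $H$, which is why the approximation is considered satisfied there.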
For the $i$th sub-problem, where $1 \le i \le D$, suppose that
the expected number of false positives normalized by $H$ is
$\alpha_i$ for $1 \le i \le D - 1$ and $\epsilon$ for $i = D$, i.e.,
$$E|C_i| = H + \alpha_i H, \quad 1 \le i < D, \qquad E|C| = E|C_D| = H + \epsilon H.$$
Therefore the expected number of keys to be enumerated for
each sub-problem is $2^{b_1}$ for $i = 1$, and
$(1 + \alpha_{i-1}) H 2^{b_i}$ for $2 \le i \le D$. Since the
complexity of each sub-problem is determined by the number of keys
to be enumerated, it is natural to let all the sub-problems have
the same expected number of keys to be enumerated. This can be
achieved by letting $\alpha_i = \alpha$, and dividing the whole key
into $D$ words such that
$$2^{b_1} = (1 + \alpha) H 2^{b}, \quad \text{and} \quad b_i = b, \ 2 \le i \le D. \qquad (5)$$
Under this setting, we now consider two main quantities for
the complexity study when the non-heavy keys are negligible:
update memory and detection cost, and we list the results for
other quantities in Table II. We are interested in how the
complexity grows as a function of $H$ and $N$.
1) Update Memory: By applying (3) to each sub-problem
$i$ (replacing $N$ with $(1 + \alpha)H$), $1 \le i \le D - 1$, the
required total number of hash tables with a size
$K = \gamma^{-1} H$ is
$$M = \sum_{i=1}^{D-1} r \log_2\left((1 + \alpha) H \alpha^{-1} H^{-1}\right) + r \log_2(N \epsilon^{-1} H^{-1}) = r \log_2 \frac{N}{\epsilon H} + r (D - 1) \log_2(1 + \alpha^{-1}), \qquad (6)$$
where $r = -1/\log_2(1 - e^{-\gamma})$. Notice that the first
quantity in (6) is the total number of tables required to recover
the $H$ heavy keys using a single random hash array by enumerating
all the keys in the original space, for the same normalized
false positive number $\epsilon$. Therefore, the latter quantity in
(6) is the additional number of tables required for the sequential
hashing scheme, which decreases when $\alpha$ increases.
2) Detection Cost: We define the detection cost as the
number of hash operations needed to recover all heavy keys.
Since the number of keys to be enumerated is $(1 + \alpha) H 2^{b}$
for each sub-problem under our setting (5), and in the worst case,
for each sub-key, we need to check all $M_i$ tables to include
or exclude it, the total hash computation required is
$$\text{Computation} \le (1 + \alpha) 2^{b} H M = \gamma (1 + \alpha) 2^{b} \times (\text{Memory}), \qquad (7)$$
where Memory denotes the total number of buckets $U = MK$ and the
last equality uses $H = \gamma K$.
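Equations (6) and (7) translate directly into two small helper functions. The sketch below is illustrative only: the function names are hypothetical, $\gamma$ defaults to $\log 2$ (so that $r = 1$), and the numeric arguments are example values rather than results reported in the paper.

```python
import math


def total_tables(N, H, eps, alpha, D, gamma=math.log(2)):
    """Eq. (6), returned as (single-array term, sequential-hashing overhead)."""
    r = -1.0 / math.log2(1.0 - math.exp(-gamma))
    single_array = r * math.log2(N / (eps * H))
    overhead = r * (D - 1) * math.log2(1.0 + 1.0 / alpha)
    return single_array, overhead


def detection_cost_bound(H, alpha, b, M):
    """Eq. (7): worst-case number of hash operations in the detection step."""
    return (1.0 + alpha) * 2**b * H * M


single, extra = total_tables(N=2**32, H=500, eps=0.002, alpha=9, D=9)
print("single-array tables:", round(single, 2), " extra tables:", round(extra, 2))
print("worst-case hash operations:", detection_cost_bound(H=500, alpha=9, b=2, M=33))

# A larger alpha lowers the table overhead but raises the detection bound.
_, extra_large_alpha = total_tables(N=2**32, H=500, eps=0.002, alpha=20, D=9)
print("extra tables with alpha = 20:", round(extra_large_alpha, 2))
```

The two functions make the tension explicit: the overhead term in (6) shrinks as $\alpha$ grows, while the bound in (7) grows linearly in $1 + \alpha$ and exponentially in $b$. Note that (7) is a worst-case bound; the early exit in the detection step can make the actual cost smaller.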
D. Design choices when non-heavy keys are negligible
Given a normalized false positive number $\epsilon$, our sequential
hashing scheme has two tuning parameters: $\alpha$, the intermediate
normalized false positives, and $b$, the number of bits of each
word except the first one. Notice that by (5), the total number of
words $D$ is a function of $\alpha$ and $b$ since
$$\log_2(1 + \alpha) + bD = \log_2(H^{-1} N). \qquad (8)$$
Now we formulate the design problem as an optimization
problem where we try to minimize both the memory increase
and the computational cost, i.e., following (3) and (7), we want
to
minimize $(D - 1)\log_2(1 + \alpha^{-1})$ and $(1 + \alpha) 2^{b}$,
given the constraint (8) and $(1 + \alpha) 2^{b} \ge 64$ so that (4)
will be satisfied. Notice that the computation is exponential in
$b$; therefore we should keep $b$ small. For a fixed small $b$, if
$\alpha = O(\log_2 N)$, then the memory increase is bounded by a
constant and the computation is $O((\log_2 N)^2)$. If $\alpha$ is
$O(1)$, then the memory increase is $O(\log N)$ and the computation
is $O(\log N)$ as well. For practical values of $\log_2 N$ (say 32
bits), we found that there is little difference in the memory
increase when $b$ is between 1 and 5 bits if we set
$(1 + \alpha) 2^{b} \ge 64$ (the number of tables differs by at
most 2).
Fig. 3. Trade-off between update memory and detection cost with
N = 2^{32} and H = 500. [Plot: x-axis, update memory (in M = number of
tables); y-axis, detection cost (in 10^3 hash operations); curves for
b = 1, b = 2 and b = 4.]
To understand the above results, Figure 3 illustrates how our
sequential hashing scheme trades off between update memory
and detection cost. Here, we evaluate the values of $b$ for the
case when non-heavy keys have negligible contribution to the
counter values, and hence $\gamma = \log 2$. We assume that
$N = 2^{32}$, $H = 500$ (and hence $K \approx 722$), $N_1 = 2^{16}$
(and hence $N_i \ge 64H$), and $\epsilon = 0.2\%$. We then vary
$\alpha$ to obtain the corresponding update memory (in terms of $M$)
and detection cost. As shown in the figure, when $b = 1$ or 2, a
smaller detection cost is obtained as compared to $b = 4$, while the
difference between $b = 1$ and 2 is very small. For example, when
$b = 2$ and $\alpha = 9$, we have $M \approx 33$ (where $M_1 = 4$,
$M_i = 2$ for $2 \le i \le D - 1$, $M_D = 15$, and $D = 9$), while
the detection cost is about 400K hash operations, which is twice the
minimum detection cost achieved by larger update memory. Note that
the number of tables in the lower-bound memory requirement is
$\log_2 \frac{N}{\epsilon H} = 32$, where the heavy-key detection is
done by enumeration of the entire key space. Thus, with only one
extra table, we can recover all heavy keys with manageable detection
cost.
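The parameters of this example can be reproduced with a few lines of arithmetic. The sketch below is a hedged illustration: the variable names are introduced here, $r = 1$ is assumed because $\gamma = \log 2$, and the table counts are computed from the closed form (6) without the per-array rounding used in the text.

```python
import math

N_bits, H, eps = 32, 500, 0.002          # N = 2^32 keys, H heavy keys
first_word_bits, b, alpha = 16, 2, 9     # N_1 = 2^16, later words of b bits

# Number of words: the first word covers 16 bits, the rest b bits each.
D = 1 + (N_bits - first_word_bits) // b

# Buckets per table for gamma = log 2, rounded up to an integer.
K = math.ceil(H / math.log(2))

# Single-array requirement and sequential-hashing overhead from Eq. (6), r = 1.
lower_bound_tables = math.log2(2**N_bits / (eps * H))
extra_tables = (D - 1) * math.log2(1 + 1 / alpha)

print("D =", D, " K =", K)
print("tables:", round(lower_bound_tables, 2), "+", round(extra_tables, 2))
print("N_1 >= 64 H:", 2**first_word_bits >= 64 * H)
```

The closed form gives roughly 32 tables plus a little over one extra table, consistent with the $M \approx 33$ quoted above.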
E. Extension in the presence of significant non-heavy keys
In Section II-A, we show that the required memory for detecting
$H$ heavy hitters in $N$ keys is at least
$O(H \log(N/\epsilon H))$ for a given normalized false positive
number $\epsilon$. In the presence of significant non-heavy keys,
[12] introduced the so-called $\epsilon$-approximate heavy keys and
non-heavy keys (see [6]) and studied the corresponding false
negatives and false positives, respectively. By plugging
$\epsilon$ into Theorem 2 of [12], we can in fact show that
the memory requirement can achieve $O(H \log(N/\epsilon H))$ for
the normalized false positive and false negative number
$\epsilon$. In this case, the complexity results in (3) and (7)
still hold but with different values of $r$ in (6). Therefore the
design choice studied in Section III-D also applies here.