BotCop: An Online Botnet Traffic Classifier
Wei Lu, Mahbod Tavallaee, Goaletsa Rammidi and Ali A. Ghorbani
Faculty of Computer Science
University of New Brunswick
Fredericton, NB E3B 5A3, Canada
{wlu,m.tavallaee, g.rammidi, ghorbani}@unb.ca
Abstract
A botnet is a network of compromised computers
infected with malicious code that can be controlled
remotely under a common command and control (C&C)
channel. As one of the most serious security threats to the
Internet, a botnet can not only be implemented with
existing network applications (e.g. IRC, HTTP, or
Peer-to-Peer) but can also be constructed with unknown or
creative applications, which makes botnet detection a
challenging problem. In this paper, we propose a new
online botnet traffic classification system, called BotCop,
in which the network traffic is fully classified into
different application communities by using payload
signatures and a novel decision tree model. Then, on
each obtained application community, the temporal-frequent
characteristic of flows is studied and analyzed to
differentiate the malicious communication traffic created
by bots from normal traffic generated by human beings.
We evaluate our approach with about 30 million flows
collected over one day on a large-scale WiFi ISP network
and results show that the proposed approach successfully
detects an IRC botnet from about 30 million flows with a
high detection rate and a low false alarm rate.
1. Introduction
Over the past few years botnets have differentiated
themselves as the main source of malicious activities such
as distributed-denial-of-service (DDoS) attacks, phishing,
spamming, keylogging, click fraud, identity theft and
information exfiltration. Similar to the other malicious
software, botnets use a self-propagating application to
infect vulnerable hosts. They, however, take advantage of
a command and control (C&C) channel through which
they can be updated and directed. According to the
command and control (C&C) models, botnets are divided
into two groups of centralized (e.g., IRC and HTTP) and
distributed (e.g., P2P). Centralized botnets employ two
mechanisms to receive the command from the server,
namely push and pull. In the push mechanism, bots are
connected to the C&C server (e.g., IRC server) and wait
for the commands from the botmaster. In contrast, in the
pull mechanism, the botmaster sets the commands in a file
at C&C server (e.g., HTTP server), and the bots
frequently connect to the server to read the latest
commands. While in centralized structure all bots receive
the commands from a specific server, in distributed
structure the command files will be shared over P2P
networks by botmaster, and bots can use specific search
keys to find the published command files.
In reality, however, detecting and blocking such an IRC
botnet is not a difficult task, since the whole botnet can
be taken down by blacklisting the IRC server. To overcome
this issue, botnets have evolved by allowing more
flexibility in the applied protocols, and now they are even
moving from the centralized structure to the advanced
distributed strategy to remove the weakness of having a
single point of failure. Compared to the traditional
centralized C&C model, the distributed (Peer-to-Peer)
botnet is much harder to detect and destroy
because the bots' communication does not heavily depend
on a few selected servers, and thus shutting down a single
bot or even a couple of bots does not necessarily lead to the
complete destruction of the whole botnet.
Early research on botnet detection is mainly based on
honeypots [1,2,3]. Setting up and installing honeypots on
the Internet is very helpful to capture malware and
understand the basic behavior of botnets, and, as a result,
makes it possible to create bot binaries or botnet
signatures. However, this analysis is always based on the
existing botnets and provides no solution for the new
botnets. To overcome this issue, new methods are
proposed to automatically detect the botnets. These
approaches can be categorized into two major groups: (1)
passive anomaly analysis [e.g. 4,5]; and (2) traffic
classification [e.g. 6]. Botnet detection based on the
passive anomaly analysis is usually independent of the
traffic content and has the potential to find different types
of botnets (e.g., HTTP, IRC and P2P). This approach is,
however, limited to a specific botnet structure (e.g.
centralized only). In contrast, traffic classification focuses
on classifying network traffic into the corresponding
applications, and then distinguishing between normal and
malicious activities. The biggest challenge of this
approach is classification of traffic into appropriate
application groups.
2009 Seventh Annual Communications Networks and Services Research Conference
978-0-7695-3649-1/09 $25.00 © 2009 IEEE
DOI 10.1109/CNSR.2009.21
70
Addressing the aforementioned challenges, we propose
a hierarchical framework for the next generation botnet
detection, which consists of two levels: (1) in the higher
level all unknown network traffic is labeled and
classified into different network application communities,
such as P2P community, HTTP Web community, Chat
community, DataTransfer community, Online Games
community, Mail Communication community, Multimedia
(streaming and VoIP) community and Remote Access
community; (2) in the lower level focusing on each
application community, we investigate and apply the
temporal-frequent characteristics of network flows to
differentiate the malicious botnet behavior from the
normal application traffic.
The major contributions of this paper include: (1) we
propose a novel application discovery approach for
automatically classifying network applications on a large-
scale WiFi ISP network; and (2) we develop a generic
algorithm to discriminate general botnet behavior from the
normal network traffic on a specific application
community, which is based on n-gram (frequent
characteristics) of flow payload over a time period
(temporal characteristics).
The rest of the paper is organized as follows. Section 2
introduces related work, in which we discuss typical
literature on current botnet detection approaches.
The proposed online traffic classification method is
discussed in Section 3. Section 4 presents the temporal-
frequent characteristic and then explains our botnet
detection approach. Section 5 is the experimental
evaluation for our detection model with a mixture of
around 30 million flows collected on a large-scale WiFi
ISP network and a botnet traffic trace collected on a
honeynet deployed on the public Internet. Finally, in
Section 6 we make some concluding remarks and discuss
the future work.
2. Related work
Previous attempts to detect botnets are mainly based on
honeypots, passive anomaly analysis and traffic
classification. In order to get a full understanding of
botnet behavior, honeypots are widely installed and set up
on the Internet to capture malware and consequently
track and analyze the bots [1,2,3]. A typical example is
the Nepenthes honeypot that is commonly used to collect
the shell code or bot binaries by mimicking a reply that
can be generated by a vulnerable service. Rajab et al. in [1]
deployed Nepenthes to collect malware in their unused IP
address space. A honeynet consisting of VMWare virtual
machines running Windows XP is used to capture any
exploits that may be missed by Nepenthes. Once all
binaries are collected, they use greybox testing that runs
the collected binary on a clean image of Windows XP
virtual machine while logging all traffic, to try and get
details of how a compromised host will join that particular
botnet in the wild. During this testing, network
fingerprints are created to capture network information
such as DNS requests, destination IP addresses, contacted
ports and the presence of default scanning behavior.
IRC-related features are also extracted by running an IRC
server on the testing hosts; any attempted
connections are logged and an IRC fingerprint consisting
of PASS, NICK, USER, MODE and JOIN values is
created. Botnets are then tracked by joining a modified
IRC tracker to the actual IRC server and observing it, and
also by DNS cache probing. Although the honeypot based
approach is quite helpful in creating bot binaries and bot
signatures, it is always limited to the existing botnets and
provides no solution for the new bots.
To overcome this shortcoming two botnet detection
approaches have been proposed recently, namely traffic
classification and passive anomaly analysis. A typical
work on traffic classification based botnet detection using
machine learning algorithms is presented in [6], in which
Strayer et al. propose an approach for detecting botnets by
examining flow characteristics such as bandwidth,
duration, and packet timing in order to look for the
evidence of the botnet command and control activities.
They propose an architecture that first eliminates traffic
that is unlikely to be a part of a botnet, then classifies the
remaining traffic into a group that is likely to be part of a
botnet, and finally correlates the likely traffic to find
common communications patterns that would suggest the
activity of a botnet.
Typical approaches of passive anomaly based botnet
detection are discussed in [4,5]. In [4], Karasaridis et al.
study network flows and detect IRC botnet controllers in a
fashion of four steps, in which the most important one is
to identify hosts with suspicious behavior and isolate flow
records to/from those hosts. In [5], Gu et al. investigate
the spatial-temporal correlation and similarity in network
traffic and implement a prototype system, BotSniffer, to
detect botnets. All the above-mentioned botnet detection
techniques are limited either to specific C&C
protocols or to specific botnet structures.
3. Traffic classification
Early common techniques for identifying network
applications rely on the association of a particular port with
a particular protocol. Such a port number based traffic
classification approach has proved to be ineffective
due to: (1) the constant emergence of new peer-to-peer
networking applications for which IANA does not define
corresponding port numbers [7], (2) the dynamic port
number assignment for some applications (e.g. FTP for
data transfer), and (3) the encapsulation of different
services into the same application (e.g. chat or streaming can
be encapsulated into the same HTTP protocol). Recent
studies on network traffic application classification
include "applying machine learning algorithms for
clustering and classifying traffic flows based on a set of
statistical features" [8,9], "modeling payload content
signatures for traffic application classification" [10,11]
and "identifying traffic based on heuristics derived from
analysis of communication patterns of hosts" [12,13].
Although existing traffic classification mechanisms
generate a number of good ideas, they are far from
complete due to the limited number of applications
they can identify and their coarse application scopes (e.g.
BLINC in [13] attempts to identify general P2P traffic
instead of the specific underlying P2P applications like
eDonkey or BitTorrent). Moreover, comparing all the
above-mentioned methods is difficult because of the lack of
shareable datasets and appropriate metrics [14].
Addressing these limitations, we propose in this paper
a hybrid mechanism for classifying flow applications on
the fly, in which we first model and generate signatures
for more than 470 applications according to port numbers
and protocol specifications of these applications and then
concentrating on unknown flows that cannot be identified
by signatures, we investigate their temporal-frequent
characteristics in order to differentiate them into the
already labeled applications based on a decision tree
trained by corresponding temporal-frequent characteristics
of known flows. Next we discuss the online traffic
classification system in more detail.
3.1. Signatures based classifier
The payload signature based classifier investigates
the characteristics of bit strings in the packet payload. For
most applications, their initial protocol handshake steps
are usually different and thus can be used for classification.
Moreover, the protocol signatures can be modeled through
either public documents like RFC or empirical analysis for
deriving the distinct bit strings on both TCP and UDP
traffic. The signatures based classifier is deployed on
Fred-eZone, a free wireless fidelity (WiFi) network
service provider being operated by the City of Fredericton
[15]. Table 1 lists the general workload dimensions for the
Fred-eZone network capacity. From Table 1, we see, for
example, that the number of unique source IP addresses
(SrcIP) appearing over one day is about 1,055 thousand
and the total number of packets is about 944 million. All
the flows are bi-directional and we remove all
uni-directional flows before applying the classifier. Table 2
lists the classification results over one hour of traffic
collected on Fred-eZone.
From Table 2, we see that about 249,000 flows can be
identified by the application payload signatures and about
215,000 flows cannot be identified. A general result is that
about 40% of flows cannot be classified by the current
payload signatures based classification method. In the next
section we build a module that works in parallel with the
signatures based application detection engine. The new
module focuses only on those applications that the
signatures based classifier could not identify and that
therefore appear to it as unknown.
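As a rough illustration of how a payload signature lookup of this kind can work, the following Python sketch matches the first bytes of a flow payload against a handful of handshake patterns. It is not the signature base used in this paper: the four patterns, the `SIGNATURES` table and the function name `classify_by_signature` are assumptions chosen for illustration, based on the publicly documented handshakes of the respective protocols.

```python
import re

# Illustrative handshake signatures only (assumed for this sketch);
# the signature base described in the paper covers more than 470 applications.
SIGNATURES = {
    "BitTorrent": re.compile(rb"^\x13BitTorrent protocol"),
    "HTTPWeb": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]"),
    "SSH": re.compile(rb"^SSH-\d\.\d+-"),
    "IRC": re.compile(rb"^(NICK|USER|PASS) "),
}


def classify_by_signature(payload: bytes) -> str:
    """Return the first application whose signature matches the start
    of the payload, or "unknown" if no signature matches."""
    for app, pattern in SIGNATURES.items():
        if pattern.match(payload):
            return app
    return "unknown"


if __name__ == "__main__":
    print(classify_by_signature(b"GET /index.html HTTP/1.1\r\n"))
    print(classify_by_signature(b"\x01\x02\x03"))
```

Flows for which such a lookup returns "unknown" are the ones that would be handed to the decision tree based classifier.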
Table 1. Workload of Fred-eZone WiFi network over 1 day
SrcIP   DstIP   Flows    Packets  Bytes
1055K   1228K   30783K   994M     500G
Table 2. Classification results with one hour traffic on Fred-eZone
Known Applications               Unknown Applications
Flows  SrcIPs  DstIPs  App.      Flows  SrcIPs  DstIPs
249K   102K    202K    82        215K   1001K   1055K
3.2. Decision tree based classifier
N-gram byte distribution has proven its efficiency in
detecting network anomalies. Wang et al. examine the 1-gram
byte distribution of the packet payload, represent each
packet as a 256-dimensional vector describing the
occurrence frequency of each of the 256 ASCII characters
in the payload, and then construct the normal packet
profile by calculating the statistical average and
deviation of normal packets for a specific application
service (e.g. HTTP) [16]. An anomaly is alerted once
the Mahalanobis distance of the testing data from the
normal profile exceeds a predefined threshold. Gu et al.
improve this approach and apply it to detecting malware
infection in their recent work [17]. Unlike previous
n-gram based approaches for network intrusion detection,
in this paper we extend the n-gram frequency into the temporal
domain and generate a set of 256-dimensional vectors
representing the temporal-frequent characteristics of the
256 ASCII byte values in the payload over a predefined
time interval. By observing and analyzing the known
network traffic applications, labeled by the signatures
based classifier, over a long period on a large-scale WiFi
ISP network, we found that the n-gram (n = 1 in
particular) over a one second time interval for both source
flow payload and destination flow payload is a strong
enough feature to differentiate traffic
applications. As an example, Figures 1 to 5 illustrate this
novel temporal-frequent metric for the applications
BitTorrent (P2P), Gnutella (P2P), LimeWire (P2P),
HTTPWeb (WEB) and SecureWeb (WEB), respectively.
The X axis in all five figures is the ASCII characters from
0 to 255 on the source flow payload. The Y axis stands for the
frequency of each ASCII character appearing over a
predefined time interval (i.e. 1 second).
Figure 1. Temporal-frequent metric for source flow payload
of BitTorrent application.
Figure 2. Temporal-frequent metric for source flow payload
of Gnutella application.
Figure 3. Temporal-frequent metric for source flow payload
of LimeWire application.
Figure 4. Temporal-frequent metric for source flow payload
of HTTPWeb application.
Figure 5. Temporal-frequent metric for source flow payload
of SecureWeb application.
By comparing Figures 1 to 3 with Figures 4 and 5,
we see that the temporal-frequent metric of flow payload
is very different for P2P and WEB applications. At a more
fine-grained level, by comparing Figures 1 to 3 we see that
the temporal-frequent metrics of flow payload for the
applications BitTorrent, Gnutella and LimeWire differ as
well. Similar results also apply to differentiating
the two applications (i.e. HTTPWeb and SecureWeb) in
the same application group (i.e. WEB).
We denote the 256-dimensional n-gram byte
distribution as a vector
$\langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$,
where $f_j^{t_i}$ stands for the frequency of the $j$-th ASCII
character on the flow payload over a time window $t_i$
($j = 1, 2, \ldots, 256$; $i = 0, 1, 2, \ldots$) (i.e. the
temporal-frequent metric of the flow payload). Given n
historical known flows for each specific application, we
define an $n \times 256$ matrix, $p_{app}$, for profiling
applications, which is illustrated as follows:

$$
p_{app} =
\begin{bmatrix}
f_1^{t_1} & f_2^{t_1} & \cdots & f_{256}^{t_1} \\
f_1^{t_2} & f_2^{t_2} & \cdots & f_{256}^{t_2} \\
\vdots & \vdots & \ddots & \vdots \\
f_1^{t_n} & f_2^{t_n} & \cdots & f_{256}^{t_n}
\end{bmatrix}_{n \times 256}
$$
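To make the temporal-frequent metric concrete, the following Python sketch computes one 256-dimensional byte-frequency vector per time window for one direction of a flow. It is a minimal sketch under stated assumptions, not the implementation used in this paper: the input format (a list of `(timestamp, payload)` pairs), the function names, and the choice of relative frequency (byte count divided by the total number of payload bytes in the window) are all assumptions.

```python
from collections import defaultdict


def byte_frequency(payload: bytes) -> list:
    """Relative frequency of each of the 256 byte values in the payload.
    Normalizing by payload length is an assumption of this sketch."""
    counts = [0] * 256
    for b in payload:
        counts[b] += 1
    total = len(payload)
    if total == 0:
        return [0.0] * 256
    return [c / total for c in counts]


def temporal_frequent_vectors(packets, window=1.0):
    """Group (timestamp_seconds, payload_bytes) pairs of one flow direction
    into fixed time windows and return {window index i: 256-dim vector}."""
    buckets = defaultdict(bytearray)
    for ts, payload in packets:
        buckets[int(ts // window)] += payload
    return {i: byte_frequency(bytes(buf)) for i, buf in sorted(buckets.items())}


if __name__ == "__main__":
    # Hypothetical packets: two in the first second, one in the second.
    packets = [(0.1, b"AAB"), (0.7, b"B"), (1.2, b"\x00\x00")]
    vectors = temporal_frequent_vectors(packets)
    # Stacking the per-window vectors row by row gives an n x 256 matrix.
    profile = [vectors[i] for i in sorted(vectors)]
    print(len(profile), len(profile[0]))
```

Stacking such vectors from n known flows of one application, one row per flow window, gives a matrix with the same shape as $p_{app}$ above.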
We create over 470 application profiling matrices, one for
each application in the signature base. Unknown flows
that cannot be identified by the signatures based classifier
can therefore be labeled using the application
profiling matrices: even though an unknown flow with
payload matches no signature in the signature
base, its temporal-frequent characteristics can always be
modeled and thus used for unknown traffic
classification.
The decision tree technique is a good candidate for
unknown traffic classification in this case due
to its low computational complexity and its ability to
train on large datasets. A typical decision tree is
represented in the form of a tree structure (e.g. Figure 6), in
which each node is either a leaf node or a decision node.
A leaf node indicates the value of the target class, such as
Application = Gnutella in Figure 6, and a decision
node specifies some test to be carried out on a single
attribute value, with one branch and sub-tree for each
possible outcome of the test, for instance the decision node
$f_5$ with a branch test $f_5 \leq 0.3$ in Figure 6.
A decision tree can be used to classify an example by
starting at the root of the tree and moving through it until
a leaf node is reached, which provides the classification of the
instance. Suppose Figure 6 is the decision tree for
application classification trained on the 256-dimensional
attribute $\langle f_1, f_2, \ldots, f_{256} \rangle$. Then an
unknown flow with a new 256-dimensional vector is compared
starting from the root node $f_1$ to see whether it is bigger
than 0.1; if the test result is $f_1 \leq 0.1$, then $f_5$ is
selected to see whether it is bigger than 0.3, and if it is, the
unknown flow is labeled as the Gnutella application. The
training of the decision tree for obtaining a decision model
is based on the 470 historical application profiling matrices,
and each application profiling matrix includes at least
1,000 instances (i.e. the size of the matrix is
$1000 \times 256$).
The decision tree algorithm we apply is C4.5, proposed
by Quinlan [18], since it is well known and frequently used
over the years.
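The traversal described above can be sketched in a few lines of Python. The tree below encodes only the one path spelled out in the text ($f_1 \leq 0.1$, then $f_5 > 0.3$, giving Gnutella); every other leaf is marked "unspecified" because the text does not describe it. The dictionary layout and the function name `classify` are assumptions of this sketch, and the code shows tree traversal only, not C4.5 training.

```python
# Placeholder for branches whose outcome is not described in the text.
UNSPECIFIED = {"label": "unspecified"}

# Partial tree: only the path f_1 <= 0.1 -> f_5 > 0.3 -> Gnutella
# comes from the description above; the rest is a placeholder.
EXAMPLE_TREE = {
    "attr": 1, "threshold": 0.1,
    "le": {
        "attr": 5, "threshold": 0.3,
        "le": UNSPECIFIED,
        "gt": {"label": "Gnutella"},
    },
    "gt": UNSPECIFIED,
}


def classify(tree, vector):
    """Walk a binary decision tree from the root to a leaf.
    `vector` is a 256-element list; attribute f_j is vector[j - 1]."""
    node = tree
    while "label" not in node:
        value = vector[node["attr"] - 1]
        node = node["le"] if value <= node["threshold"] else node["gt"]
    return node["label"]


if __name__ == "__main__":
    flow = [0.0] * 256
    flow[0] = 0.05   # f_1 <= 0.1
    flow[4] = 0.40   # f_5 > 0.3
    print(classify(EXAMPLE_TREE, flow))
```

Because every internal node has exactly two outgoing branches, every input vector ends at some leaf, which is why a classifier of this kind always assigns a label.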
Figure 6. A typical decision tree for traffic classification
4. Botnet detection
The temporal-frequent characteristic based on n-gram
over a time period can not only be applied to train the
decision tree model for traffic classification, but can also
discriminate the malicious traffic generated by bots from the
normal traffic created by human beings. The temporal feature is
important in botnet detection due to two empirical
observations of botnet behavior: (1) the response time of
bots is usually immediate and accurate once they receive
commands from the botmaster, while a human
might perform an action with various possibilities after a
reasonable thinking time, and (2) bots basically have
preprogrammed activities based on the botmaster's commands,
and thus all bots might be synchronized with each other.
These two observations have been confirmed by a
preliminary experiment conducted in [19]. As an example,
Figures 7 and 8 illustrate the average byte frequency over
the normal IRC flows and IRC botnet flows, respectively.
By comparing Figures 7 and 8, we see the average byte
frequency over a specific time period for normal IRC
traffic is much smaller than average byte frequency over a
specific time period for botnet IRC traffic.
After obtaining the n-gram (n = 1 in this case) features
for flows over a time window, we then apply an
agglomerative hierarchical clustering algorithm to cluster
the data objects with 256 features. We do not construct
the normal profiles because normal traffic is sensitive to
the practical networking environment and a high false
positive rate might be generated when deploying the
training model on a new environment. In contrast,
agglomerative hierarchical clustering is unsupervised and
does not require a threshold that needs to be tuned in
different cases. In our approach, the final number of
clusters is set to 2.
Given a set of $N$ data objects
$\tilde{F} = \{F_i \mid i = 1, 2, \ldots, N\}$, where
$F_i = \langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$,
the detection approach is described in Algorithm 1.
In practice, labeling clusters is always a challenging
problem when applying an unsupervised algorithm for
intrusion detection. Previous intrusive cluster labeling
methods are based on two assumptions: (1) there are only
two clusters, one normal and the other intrusive, and
(2) the number of instances in the normal cluster is much
bigger than the number of instances in the intrusive cluster
[20], and thus the cluster with the smaller number of instances
is usually labeled as the intrusive cluster. We apply the same
labeling strategy in this paper.
Figure 7. Average byte frequency over 256 ASCIIs for
normal IRC flows
Figure 8. Average byte frequency over 256 ASCIIs for
botnet IRC flows
[Figure 6 node labels: decision nodes $f_1$ (branches $f_1 \leq 0.1$, $f_1 > 0.1$), $f_5$ ($f_5 \leq 0.3$, $f_5 > 0.3$), $f_{20}$ ($f_{20} \leq 0.45$, $f_{20} > 0.45$) and $f_{64}$ ($f_{64} < 0.05$, $f_{64} \geq 0.05$); leaf nodes App=Gnutella, App=BitTorrent, App=LimeWire, App=Httpweb, App=Secureweb]
Algorithm 1. Implementation of Botnet detection approach
Function BotDel (F) returns botnet cluster
Inputs: Collection of data objects
    $F_i = \langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$, $i = 1, 2, \ldots, N$
Initialization:
    initialize number of clusters $k$ (i.e. $k = N$) by
    assigning each data instance to a cluster so that
    each cluster contains only one data instance
Repeat:
    $k \leftarrow k - 1$
    find the closest pair of clusters and then merge
    them into a single cluster
    compute distance between new clusters and each data
    of old clusters
Until: $k = 2$
calculate number of instances in each cluster,
    $g_1, \ldots, g_m$, $1 \leq m \leq k$
If $g_b = \min(g_1, g_2, \ldots, g_m)$ then cluster $b$ is labeled as
    botnet cluster
Return the botnet cluster $b$ with $g_b$.
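The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the prototype's implementation: Algorithm 1 does not fix a distance measure or a linkage criterion, so Euclidean distance and average linkage are assumed here, and the function names are hypothetical. The naive pair search is cubic in the number of objects and is meant for small examples only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(c1, c2, data):
    """Average-linkage distance between two clusters of indices (assumed linkage)."""
    total = sum(euclidean(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def bot_detect(data, k_final=2):
    """Agglomerative clustering down to k_final clusters; returns the indices
    of the smallest cluster, which the labeling strategy above marks as botnet."""
    clusters = [[i] for i in range(len(data))]  # k = N singleton clusters
    while len(clusters) > k_final:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b], data)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])  # merge the closest pair
        del clusters[b]                  # b > a, so index a stays valid
    return min(clusters, key=len)


if __name__ == "__main__":
    # Hypothetical 2-dimensional data standing in for 256-dimensional vectors.
    data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0], [5.1, 5.0]]
    print(sorted(bot_detect(data)))
```

In practice the input rows would be the 256-dimensional temporal-frequent vectors of the flows in one application community.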
5. Experimental evaluation
We implement a prototype system for the approach and
then evaluate it on a large-scale WiFi ISP network over
one day. The botnet traffic is collected on a honeypot
deployed on a real network and aggregated into 243
flows. The time interval for flow aggregation is 1 second.
When evaluating the prototype system, we randomly
insert and replay botnet traffic flows on the normal daily
traffic. Since our approach is a two-stage process (i.e.
unknown traffic classification first and botnet detection on
application communities next), the evaluation is
accordingly divided into two parts: (1) the performance
testing for unknown traffic classification, in which we focus
not only on the capability of our approach to classify the
unknown IRC traffic but also on the
classification accuracy for other unknown applications
(e.g. new P2P), since we expect that the algorithm could be
extended to detect any newly appearing decentralized botnet;
(2) the performance evaluation of the system in discriminating
malicious IRC botnet traffic from normal human
IRC traffic.
5.1. Evaluation on traffic classification
The traffic trace data set used in the experimental
evaluation is collected over three consecutive days on a
large-scale WiFi ISP network, in which we achieve a 60%
classification rate over 100 million flows. The workload
for the Fred-eZone network is illustrated in Table 1. In order
to create the training dataset for learning the decision tree
based classifier, 11 typical applications belonging to 8
typical application groups are modeled from known
labeled flows, as illustrated in Table 3. The size of the
input data for training the decision tree is $11000 \times 256$. In
order to validate the decision tree model we conduct a
realtime classification evaluation in which the traffic traces
collected over 2 days are used for training and the
realtime traffic flows collected on the 3rd day are used for
testing.
Table 3. Applications in training dataset
Application ID  Application Name    Application Group  Size of Matrix
2006            BitTorrent          P2P                1000 × 256
2000            Gnutella            P2P                1000 × 256
2008            LimeWire            P2P                1000 × 256
1010            HTTPWeb             WEB                1000 × 256
1011            SecureWeb           WEB                1000 × 256
1008            POP                 MAIL               1000 × 256
1004            SMTP                MAIL               1000 × 256
1002            FTP                 DataTransfer       1000 × 256
5672            MSN                 CHAT               1000 × 256
1005            SSH                 RemoteAccess       1000 × 256
5005            WindowsMediaPlayer  Streaming          1000 × 256
During the online evaluation, the decision tree based
classifier is deployed on a large-scale WiFi ISP network
and works in parallel with the signature based classifier.
More than 90,000 flows are collected over the testing day
on the network and are forced to be treated as
unknown; their real labels are illustrated in Table
4. Tables 5 and 6 describe the detailed classification
accuracy for each specific application using the source flow
based classifier and the destination flow based classifier,
respectively. The general classification accuracy is
illustrated in Table 7 for both classifiers.
The online evaluation results show that the decision
tree classifier based on destination flows achieves a 92.6%
classification accuracy, which is higher than the 89.4%
accuracy obtained by the source flow based classifier. All
unknown flows are assigned to specific applications and
no flows remain unclassified, due to the deterministic
mechanism of the decision tree structure.
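As a worked check of how the overall accuracy relates to the per-application counts, the following Python sketch sums the source flow based results reported in Table 5 and divides the number of correctly labeled flows by the number of unknown flows. The numbers are copied from Table 5; the dictionary and function names are hypothetical.

```python
# (number of unknown flows, number of flows correctly labeled), from Table 5.
TABLE_5 = {
    "BitTorrent": (29739, 27777),
    "FTP": (224, 193),
    "Gnutella": (15109, 11929),
    "HTTPWeb": (16216, 12635),
    "LimeWire": (141, 131),
    "MSN": (4049, 4021),
    "POP": (26, 26),
    "SecureWeb": (12886, 12097),
    "SMTP": (11522, 11512),
    "SSH": (2197, 2181),
    "WindowsMediaPlayer": (722, 481),
}


def overall_accuracy(table):
    """Return (correct, total, accuracy in percent) over all applications."""
    total = sum(n for n, _ in table.values())
    correct = sum(c for _, c in table.values())
    return correct, total, 100.0 * correct / total


if __name__ == "__main__":
    correct, total, acc = overall_accuracy(TABLE_5)
    print(correct, total, round(acc, 1))
```

The sums give 82,983 correctly labeled flows out of 92,831, which rounds to the 89.4% reported for the source flow based classifier.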
5.2. Evaluation on botnet detection
During the evaluation of botnet detection, the proposed
approach is evaluated with one day of traffic. Table 8 shows
the flow distribution for the application community with
bot flows and the total number of flows after the traffic
classification step. As illustrated in Table 8, the total
number of flows is 32,693K and the number of flows
labeled by the payload signature based classifier is 20,596K.
The remaining unknown flows number 12,097K, of which 243
unknown flows are classified into the known IRC community
(i.e. they actually represent the IRC C&C bot flows).
Since we know all these unknown flows actually
belong to IRC, our approach obtains 100% accuracy for
classifying these malicious bot C&C flows into their own
application community. Next, we evaluate the capability
of our approach for discriminating the bot generated
traffic from normal traffic in the same application
community. As illustrated in Table 9, we show the
detection results in terms of the number of correctly detected
bot C&C flows and the number of falsely detected bot
flows over the actual number of bot flows and normal
flows on the specific community.
From Table 8, we see that the total number of flows we
collect for one day is over 30 million and the total
number of known flows which can be labeled by the
payload signatures is over 20 million. The number of
IRC C&C flows is a very small part of the total flows. Our
traffic classification approach can classify the unknown
(malicious) IRC flows into the IRC application community
with a 100% classification rate on the evaluation. All the
IRC C&C flows are differentiated from the normal traffic
with a low false alarm rate, i.e. only 4 false alarms on the
evaluation.
Table 4. Distribution of "unknown" application flows
Applications        Number of Flows
BitTorrent          29739
FTP                 224
Gnutella            15109
HTTPWeb             16216
LimeWire            141
MSN                 4049
POP                 26
SecureWeb           12886
SMTP                11522
SSH                 2197
WindowsMediaPlayer  722
Table 5. Classification results with source flow based
decision tree classifier
Applications        Number of Unknown Flows  Number of Flows Correctly Labeled
BitTorrent          29739                    27777
FTP                 224                      193
Gnutella            15109                    11929
HTTPWeb             16216                    12635
LimeWire            141                      131
MSN                 4049                     4021
POP                 26                       26
SecureWeb           12886                    12097
SMTP                11522                    11512
SSH                 2197                     2181
WindowsMediaPlayer  722                      481
Table 6. Classification results with destination flow based
decision tree classifier
Applications        Number of Unknown Flows  Number of Flows Correctly Labeled
BitTorrent          29739                    27796
FTP                 224                      181
Gnutella            15109                    13992
HTTPWeb             16216                    13996
LimeWire            141                      108
MSN                 4049                     4012
POP                 26                       26
SecureWeb           12886                    11809
SMTP                11522                    11424
SSH                 2197                     2170
WindowsMediaPlayer  722                      81
Table 7. General classification accuracy for both classifiers
Classifier | Total Number of Flows | Correctly Identified | Classification Accuracy (%)
Decision Tree Classifier Based on Source Flows | | 82983 | 89.4
Decision Tree Classifier Based on Destination Flows | | 85995 | 92.6

Table 8. Description of application community
Total Flows | Known Flows | Flows in Botnet Communities
32693K | 20596K | 264 IRC {21 normal}

Table 9. Detection performance
Normal IRC Flows | Bot C&C Flows | Correctly Detected Bot C&C Flows | Number of Falsely Identified Bot C&C Flows
21 | 243 | 243 | 4

6. Conclusions
In this paper, we present a novel generic botnet traffic
classification framework. Traffic from unknown applications
on the network is first classified into application
communities, such as the Chat (more specifically, IRC)
community, the P2P community, and the Web community.
Within each application community, a novel temporal-frequent
characteristic is then used to discriminate traffic generated
by bots from normal traffic generated by human users. Since
botnets usually exploit existing application protocols, our
approach can be extended to find different types of botnets,
and it has the potential to find new botnets by specifically
examining the traffic in the "unknown" community. We
evaluated the framework on the IRC chat community. The
results show a very high detection rate (approaching 100%
for IRC bots; all 243 bot C&C flows in Table 9 were detected)
with a low false alarm rate (4 flows falsely identified as
bot C&C flows) when detecting IRC botnet traffic. In the
immediate future, we will evaluate our approach on the P2P
community and measure its performance on P2P-based botnets.
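
The detection figures quoted for the IRC community can be recomputed from the counts in Table 9. The following is a minimal sketch, not the authors' evaluation code. The class and function names are hypothetical. The false alarm rate is assumed here to be the number of falsely identified flows divided by the number of normal IRC flows; this section does not state the definition, and a different denominator would give a different value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Flow counts taken from Table 9 (field names are illustrative)."""
    normal_flows: int        # normal IRC flows
    bot_flows: int           # bot C&C flows
    detected_bot_flows: int  # bot C&C flows correctly detected
    false_alarms: int        # flows falsely identified as bot C&C


def detection_rate(c: DetectionCounts) -> float:
    """Fraction of bot C&C flows that were correctly detected."""
    return c.detected_bot_flows / c.bot_flows


def false_alarm_rate(c: DetectionCounts) -> float:
    """Assumed definition: false alarms divided by normal flows."""
    return c.false_alarms / c.normal_flows


# Counts as reported in Table 9.
table9 = DetectionCounts(normal_flows=21, bot_flows=243,
                         detected_bot_flows=243, false_alarms=4)

print(f"detection rate:   {detection_rate(table9):.1%}")
print(f"false alarm rate: {false_alarm_rate(table9):.1%}")
```

Because the sample contains only 21 normal IRC flows, the false alarm ratio computed this way is coarse and changes noticeably with each additional misclassified flow.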
Acknowledgement
The authors gratefully acknowledge the funding from the
Atlantic Canada Opportunities Agency (ACOA) through the
Atlantic Innovation Fund (AIF) to Dr. Ali Ghorbani.
References
[1]
M.A. Rajab, J. Zarfoss, F. Monrose, and A. Terzis,
"A multifaceted approach to understanding the botnet
phenomenon," In Proceedings of the 6
th
ACM
SIGCOMM Conference on Internet measurement, pp.
41-52, 2006.
[2]
V. Yegneswaran, P. Barford, and V. Paxson, "Using
honeynets for internet situational awareness," In
Proceedings of the 4
th
Workshop on Hot Topics in
Networks, College Park, MD, 2005.
[3]
F. Freiling, T. Holz, and G. Wicherski. "Botnet
tracking: exploring a root-cause methodology to
prevent Denial of Service attacks". In Proceedings of
10
th
European Symposium on Research in Computer
Security (ESORICS’05), 2005.
[4]
A. Karasaridis, B. Rexroad, and D. Hoeflin, "Wide-
scale botnet detection and characterization," In
Proceedings of the 1
st
Conference on 1
st
Workshop on
Hot Topics in Understanding Botnets, Cambridge,
MA, 2007.
[5]
G.F. Gu, J.J. Zhang, and W.K. Lee, "BotSniffer:
detecting botnet command and control channels in
network traffic," In Proceedings of the 15
th
Annual
Network and Distributed System Security Symposium,
San Diego, CA, February 2008.
[6]
T. Strayer, D. Lapsley, R. Walsh, and C. Livadas,
"Botnet detection based on network behavior," Botnet
Detection: Countering the Largest Security Threat, in
Series: Advances in Information Security, Vol. 36, W.
K. Lee, C. Wang, D. Dagon, (Eds.), Springer, 2008.
[7]
IANA port numbers, available and retrieved in Dec.
2008.http://www.iana.org/assignments/port-numbers
[8]
J. Erman, A. Mahanti, M. Arlitt,, I. Cohen, and C.
Williamson, "Offline/realtime traffic classification
using semi-supervised learning", Performance
Evaluation, Vol. 64, No. 9-12., 1194-1213, 2007.
[9]
L. Bernaille, R. Teixeira, I. Akodkenou, A. Soule,
and K. Salamatian, "Traffic classification on the fly",
ACM SIGCOMM Computer Communication Review,
Vol. 36, Issue 2, 23-26,2006.
[10]
L. Bernaille and R. Teixeira, "Early recognition of
encrypted applications". In Proceedings of Passive
and Active Measurement Conference (PAM 2007),
Louvain-la-neuve, Belgium, 165-175, 2007.
[11]
S. Sen, and J. Wang, "Analyzing peer-to-peer traffic
across large networks". In Proceedings of ACM
SIGCOMM Internet Measurement Workshop,
Marseilles, France, 2002.
[12]
A. Moore and K. Papagiannaki, "Toward the accurate
identification of network applications", In
Proceedings of 6th Passive and Active Measurement
Workshop (PAM 2005), 2005.
[13]
T. Karagiannis, K. Papagiannaki, and M. Faloutsos.
"BLINC: multilevel traffic classification in the dark",
In Proceedings of the 2005 Conference on
Applications, Technologies, Architectures, and
Protocols for Computer Communications,
Philadelphia, Pennsylvania, 229-240, 2005.
[14]
L. Salgarelli, F. Gringoli, and T. Karagiannis,
"Comparing traffic classifiers", ACM SIGCOMM
Computer Communication Review, Volume 37, Issue
3, 65-68, 2008.
[15]
Fred-eZone WiFi ISP, available and retrieved in
December2008, http://www.fred-ezone.ca/
[16]
K. Wang, and S. Stolfo, "Anomalous payload-based
network intrusion detection", In Proceedings of the
7th International Symposium on Recent Advances in
Intrusion Detection (RAID), Sophia Antipolis, France,
2004.
[17]
G.. F. Gu, P. Porras, V. Yegneswaran, M. Fong, and
W.K. Lee, "BotHunter: detecting malware infection
through IDS-Driven dialog correlation". In
Proceedings of the 16th USENIX Security Symposium,
Boston, MA, 2007.
[18]
J. R. Quinlan, C4.5: Programs for Machine Learning.
Morgan Kaufmann Publishers, 1993.
[19]
M. Akiyama, T. Kawamoto, M. Shimamura, T.
Yokoyama, Y. Kadobayashi, and S. Yamaguchi, "A
proposal of metrics for botnet detection based on its
cooperative behavior," In Proceedings of the 2007
International Symposium on Applications and the
Internet Workshops, pp. 82-85, 2007.
[20]
E. Eskin, "Anomaly detection over noisy data using
learned probability distributions," In Proceedings of
17
th
International Conference on Machine Learning,
pp. 255-262, Palo Alto, 2000.