Network intrusion detection using hardware techniques: A review

Conference Paper · April 2016
DOI: 10.1109/LISAT.2016.7494100
Conference: 2016 IEEE Long Island Systems, Applications and Technology Conference (LISAT)
Abstract
The increasing amount of network throughput and security threat makes intrusion detection a major research problem. In the literature, intrusion detection has been approached by either a hardware or software technique. This paper reviews and compares hardware based techniques that are commonly used in intrusion detection systems with a special emphasis on modern hardware platforms such as FPGA, GPU, many-core processors and ASIC. It also provides a detailed comparison between these hardware solution platforms. Our approach to classify modern hardware-based Intrusion Detection System (IDS) techniques is based on the detection approach. In addition, we provide a comparison between the classified detection approaches based on essential criteria such as definition, update process, detection ability, features of the system, and implementation requirements. Finally, a classification tree of hardware-based NIDS platforms is given.
Razan Abdulhammed, Miad Faezipour and Khaled M. Elleithy
Computer Science and Engineering Department
University of Bridgeport, Bridgeport, CT
rabdulha@my.bridgeport.edu,
mfaezipo@bridgeport.edu, elleithy@bridgeport.edu
Index Terms — Intrusion detection system, FPGA, GPU,
NFA, DFA, Pattern matching, TCAM, ASIC, Many-Core
Processors.
I. INTRODUCTION
The concept of an Intrusion Detection System (IDS) was first described in the early eighties [1]. An intrusion detection system is a security measure that helps to identify malicious actions that compromise the integrity, confidentiality, and availability of information resources [2]. From a general perspective, IDSs can be divided by detection approach into two categories: anomaly-based and signature-based detection. The fundamental concept of anomaly-based detection is to define the normal behavior of the targeted network and then compare the actual network behavior against this predefined profile. Anomaly-based detection is used to detect both known and unknown attacks. In contrast, signature-based systems are used to detect known attacks and require prior knowledge of attack signatures. Table I shows a comparison between these approaches.
The IDS gathers and observes data before the analysis phase using one of several strategies: host-based, network-based, or hybrid. The analysis takes place either while the sessions are ongoing, yielding a real-time IDS, or after the data has been collected, yielding an offline IDS. An offline IDS is easier to implement than a real-time one; however, a real-time IDS is useful for understanding the attacker’s behavior. The performance of an intrusion detection system depends on two metrics: throughput and the total number of patterns that can fit on a device.
A Network Intrusion Detection System (NIDS) requires pattern matching, string matching, multi-match packet classification, and regular expression matching to perform its functions. In spite of the great progress that has been made, several factors contribute to widening the gap between the processing requirements of IDSs and what their software implementations can deliver, such as the increase in network throughput, the increasing number and sophistication of attacks, and the performance limitations of sequential software execution. Given these factors, developers consider building IDSs with hardware implementation techniques to secure the network infrastructure. The foremost reason for shifting from software to hardware is to enable real-time implementation of IDS. The major performance difference between hardware and software solutions lies in their dissimilar execution paradigms. Essentially, hardware implementations allow many operations to execute in parallel, while a pure software implementation executes serially or with limited parallelism [3]. As a matter of fact, the design boundary between hardware and software is illusory at this point [4], [5].
The primary goal of this paper is to investigate the common hardware-based techniques in the literature for building and implementing NIDS functions. The rest of the paper is organized as follows: hardware-based techniques for packet classification are described in Section II as a basis for understanding the methods. Hardware-based techniques for string and pattern matching are reviewed in Section III, and hardware-based techniques for regular expression matching are discussed in Section IV. A sample of the literature on hardware-based intrusion detection systems is given in Section V. A brief conclusion is given in the final part of the paper.
II. HARDWARE-BASED TECHNIQUES FOR PACKET CLASSIFICATION
A search of the literature revealed a few studies that provide multi-match packet classification solutions. To start with, Song et al. [6] presented BV-TCAM, an FPGA-based packet classification architecture for network intrusion detection. The BV-TCAM architecture combines Ternary Content Addressable Memory (TCAM) with the Bit Vector (BV) algorithm. It eliminates the need for prefix expansion of port ranges by using the TCAM together with a multi-bit trie algorithm, implemented as a tree bitmap, for the source or destination port lookup. The architecture gives an acceptable performance. However, the BV-TCAM algorithm suffers from an O(N^2) memory requirement.
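The core idea of the Bit Vector (BV) algorithm can be illustrated with a minimal Python sketch. It is not the BV-TCAM design of [6]: the rule set, the field layout, and the function names are hypothetical, and the per-field vectors are computed on the fly instead of being precomputed and stored as a hardware implementation would do. Each field lookup yields an N-bit vector with one bit per rule, and the bitwise AND of the per-field vectors gives every matching rule at once, which is what multi-match classification needs.

```python
# Hypothetical rule format: (src_port_range, dst_port_range, protocol).
# A protocol of None acts as a wildcard. Values are illustrative only.
RULES = [
    ((0, 65535), (80, 80), "tcp"),       # rule 0
    ((1024, 65535), (80, 443), "tcp"),   # rule 1
    ((0, 65535), (53, 53), "udp"),       # rule 2
    ((0, 65535), (0, 65535), None),      # rule 3: catch-all
]


def port_vector(field_index, port):
    """Return a bit vector; bit i is set when rule i's range contains port."""
    vec = 0
    for i, rule in enumerate(RULES):
        low, high = rule[field_index]
        if low <= port <= high:
            vec |= 1 << i
    return vec


def proto_vector(proto):
    """Return a bit vector; bit i is set when rule i accepts this protocol."""
    vec = 0
    for i, rule in enumerate(RULES):
        if rule[2] is None or rule[2] == proto:
            vec |= 1 << i
    return vec


def classify(src_port, dst_port, proto):
    """Return the indices of ALL rules that match the packet header."""
    hits = port_vector(0, src_port) & port_vector(1, dst_port) & proto_vector(proto)
    return [i for i in range(len(RULES)) if (hits >> i) & 1]


print(classify(5000, 80, "tcp"))
```

In a stored-table version, each field keeps one N-bit vector per elementary value interval, and the number of intervals itself grows with N, which is where the quadratic memory growth of the plain BV scheme comes from.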
Likewise, Fang Yu et al. [7] developed a new algorithm that splits the rule set into two groups so that separate TCAM lookups can be performed on each group. The new algorithm, called the Set Splitting Algorithm (SSA), met the requirements of both low TCAM memory usage and low power consumption compared with BV-TCAM. SNORT rules were used to verify the algorithm. Moreover, SSA can be combined with SRAM/TCAM hybrid approaches to reduce energy consumption further. However, the worst-case memory requirement is still O(N^2), which means that this approach is expensive in terms of power and memory consumption when the rule set has many intersections. By the same token, a TCAM-based multi-match packet classification partitioning approach was proposed in [8]. This approach is called the Maximum-Minimum Insertion Partitioning approach (MX-MN-IP). MX-MN-IP partitions the entire rule set into several disjoint partitions. This not only reduces power consumption but also leaves only one partition active when classifying a packet. Furthermore, each partition is further divided so that each sub-partition can output one matching result in a single clock cycle. One drawback of this approach is that it may need a large number of small TCAMs, which can make the actual implementation impractical. Moreover, the number of partitions depends on the characteristics of the rule set, so the power reduction ratio varies.
Jiang et al. [9] proposed an SRAM-based architecture called field-split parallel bit vector (FSBV). In this architecture, the authors build on a key observation about the original BV algorithm: some header fields of a packet can be further split into bit-level subfields. To clarify, a W-bit field is split into W subfields, each of which takes only 1 bit, so the number of unique values in each subfield is no more than 2. As a result, FSBV reduces the memory requirement compared with the original BV algorithm: the memory requirement of FSBV grows linearly with the number of rules for a fixed W. The field-splitting process of the FSBV architecture was efficiently implemented on an FPGA. The FSBV memory requirement is O(WN), where W is the number of subfields and N is the number of rules, which is also the length of each bit vector. Table II shows a comparison of the hardware-based packet classification NIDS techniques.

TABLE II. COMPARISON OF THE PERFORMANCE OF THE STATE-OF-THE-ART HARDWARE SOLUTIONS FOR MULTI-MATCH PACKET CLASSIFICATION IN NIDS.
Authors | Algorithm | Platform | Throughput | Storage | Power | Support SNORT Feature
Song | BV-TCAM | FPGA | 10 | 73.8 | 17.7 | No
Fang | TCAM-SSA | TCAM | 20 | 13 | 312 | No
Faezipour | MX-MN-IP | TCAM | 80 | 13 | 296 | No
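The field-splitting idea behind FSBV can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the architecture of [9]: rules are written as hypothetical ternary strings over '0', '1', and '*', and the names build_fsbv and lookup are made up for the example. Every 1-bit subfield stores only two N-bit vectors, one for input bit 0 and one for input bit 1, so the table holds 2·W·N bits in total.

```python
def build_fsbv(rules):
    """Build the per-bit table from equal-length ternary rule strings.

    table[j][b] is an N-bit vector; bit i is set when rule i accepts
    value b at bit position j (the rule has str(b) or '*' there).
    """
    width = len(rules[0])
    table = [[0, 0] for _ in range(width)]
    for i, rule in enumerate(rules):
        for j, symbol in enumerate(rule):
            for b in (0, 1):
                if symbol == "*" or symbol == str(b):
                    table[j][b] |= 1 << i
    return table


def lookup(table, key_bits, n_rules):
    """AND one stored vector per key bit; return indices of matching rules."""
    hits = (1 << n_rules) - 1
    for j, bit in enumerate(key_bits):
        hits &= table[j][int(bit)]
    return [i for i in range(n_rules) if (hits >> i) & 1]


# Hypothetical 4-bit field and four rules.
RULES = ["10**", "1***", "0101", "****"]
TABLE = build_fsbv(RULES)

# Two vectors of N bits for each of the W subfields: linear in N for fixed W.
MEMORY_BITS = 2 * len(RULES[0]) * len(RULES)

print(lookup(TABLE, "1011", len(RULES)), MEMORY_BITS)
```

Adding one more rule lengthens each stored vector by one bit, so the table grows by 2·W bits per rule instead of gaining new rows.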
III. HARDWARE TECHNIQUES FOR STRING AND PATTERN MATCHING
String and pattern matching techniques are commonly used in network-based intrusion detection systems, in which attack patterns are modeled, matched, and identified based on the packet header, the packet content, or both. The NIDS has to scan the incoming traffic in real time and on fast links. Much work has been done in this field, and the state-of-the-art solutions in the literature for string and pattern matching used in IDS can be broadly divided by implementation platform into these categories: multi-core processor-based, Application-Specific Integrated Circuit (ASIC)-based, Field Programmable Gate Array (FPGA)-based, TCAM-based, pipelined Non-Deterministic Finite Automaton (NFA), and pipelined Deterministic Finite Automaton (DFA).
The most common way to implement pattern matching in hardware is to use Finite Automata (FA), either Deterministic Finite Automata (DFA) or Non-deterministic Finite Automata (NFA). Each approach has its own advantages and drawbacks. DFA-based pattern matching is fast but requires more memory. In contrast, NFA-based pattern matching takes comparatively less memory, but matching is much slower. Many approaches have been proposed to overcome the drawbacks of both types of finite automata. A good comparative study of some FA-based techniques for pattern matching in NIDS is reported by Rathod et al. [10].
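The memory-versus-speed trade-off between NFA and DFA can be made concrete with a small Python sketch. The pattern family used here ("the (k+1)-th symbol from the end is a", over the alphabet {a, b}) is a textbook example chosen for illustration and is not taken from any of the reviewed systems; all function names are hypothetical. The NFA needs only k + 2 states but must track a set of active states per input symbol, while the equivalent DFA follows a single transition per symbol at the cost of 2^(k+1) states.

```python
def make_nfa(k):
    """NFA with states 0..k+1; state 0 loops, state k+1 accepts."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for s in range(1, k + 1):
        delta[(s, "a")] = {s + 1}
        delta[(s, "b")] = {s + 1}
    return delta, {0}, k + 1


def nfa_accepts(nfa, text):
    """Simulate the NFA: every step updates a whole SET of active states."""
    delta, active, accept = nfa
    for ch in text:
        active = set().union(*(delta.get((s, ch), set()) for s in active))
    return accept in active


def determinize(nfa):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    delta, start, accept = nfa
    start = frozenset(start)
    table, todo, seen = {}, [start], {start}
    while todo:
        cur = todo.pop()
        for ch in "ab":
            nxt = frozenset(t for s in cur for t in delta.get((s, ch), ()))
            table[(cur, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return table, start, {q for q in seen if accept in q}


def dfa_accepts(dfa, text):
    """Run the DFA: exactly one table lookup per input symbol."""
    table, state, accepting = dfa
    for ch in text:
        state = table[(state, ch)]
    return state in accepting


def dfa_state_count(dfa):
    return len({state for state, _ in dfa[0]})


for k in (1, 3, 7):
    print(k, k + 2, dfa_state_count(determinize(make_nfa(k))))
```

The loop prints, for each k, the NFA state count next to the DFA state count, showing linear growth on one side and exponential growth on the other.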
Due to the huge growth of network traffic, multi-pattern and string matching has become a major performance bottleneck in network intrusion detection systems; what follows presents a sample of the literature. Scarpazza et al. [11] parallelized the Aho-Corasick string searching algorithm using the IBM Cell/B.E. processor as the hardware platform. The goal was to perform exact string matching against large dictionaries. Since memory congestion plays a crucial role in determining the performance of the original algorithm, the authors proposed a new approach that fully exploits the DMA-based communication mechanisms of the Cell/B.E. in order to provide an unprecedented level of aggregate performance with irregular access patterns.

TABLE I. A COMPARISON BETWEEN IDS DETECTION APPROACHES.
Criteria | Anomaly-Based | Signature-Based
Update | No | Yes
Detection ability | Both known and unknown attacks | Only known attacks with very high accuracy
Definition | Use deviation from normal usage pattern to identify intrusions | Use patterns of well-known attacks to identify intrusions
Feature of the system | High False Alarm | Low False Alarm
Implementation requirement | Requires less computation and resources | Requires more computation and resources
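For readers unfamiliar with the Aho-Corasick algorithm that [11] parallelizes, the following is a minimal sequential Python sketch of the textbook algorithm. It does not reflect the Cell/B.E. implementation or its DMA scheme, and the function names are chosen for this example only. The automaton is a trie of the dictionary with failure links, so the text is scanned once regardless of how many patterns the dictionary holds.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie edges, failure links, and per-state output sets."""
    goto = [{}]      # goto[state][char] -> child state
    fail = [0]       # failure link of each state
    out = [set()]    # patterns recognized on reaching each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pat)
    # Breadth-first pass: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            queue.append(child)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            out[child] |= out[fail[child]]
    return goto, fail, out


def search(text, automaton):
    """Return sorted (start_index, pattern) pairs for every occurrence."""
    goto, fail, out = automaton
    state, hits = 0, []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((pos - len(pat) + 1, pat))
    return sorted(hits)


AUTOMATON = build_automaton(["he", "she", "his", "hers"])
print(search("ushers", AUTOMATON))
```

The state tables are read in a data-dependent order while scanning, which is the irregular memory access pattern mentioned above.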
Gnort was reported as the first systematic study of
using GPU to build IDS in 2008 [12]. In this study, the
authors used SNORT open-source NIDS in order to
exploit the underutilized computational power of modern
graphics cards, specifically the NVIDIA GeForce 8 Series
(G80) cards using CUDA. The goal was to offload the
costly pattern matching operations from the CPU, and
hence increase the overall processing throughput. This
study showed that modern graphics cards can be used
effectively to speed up intrusion detection systems, as
well as other systems that involve pattern matching
operations.
Tian Song et al. [13] presented the ACC multiple pattern matching algorithm; the acronym ACC stands for Aho-Corasick with counters. The algorithm is based on a Cached Deterministic Finite Automaton (CDFA), which extends a DFA by associating some memory with it as a cache, and uses a next-state addressing (NSA) scheme to store and access transition rules in memory. The authors use several optimization methods, such as a set-associative policy, fine-grain multithreading, and entry combination, to address the issues associated with this kind of implementation, such as critical path length and memory utilization efficiency. The CDFA model can be used for multiple regular expression matching with acceptable performance and efficient memory utilization.
Jung et al. [14] implemented a bit-split string-
matching architecture on FPGA towards addressing
memory efficiency, pin count, and optimizing the number
of keywords per rule module. One of the characteristics
of the bit-split algorithm is that it allows large hardware
state machines to be converted into a form with much
higher memory efficiency [15]. This was done through
using a new architecture that utilizes a priority encoder to
reduce the number of external IO pins. The results
showed that the architecture can be effectively optimized
for FPGA implementation.
Lunteren [16] presented BFSM, a hardware-based scheme for pattern matching. The scheme exploits a hardware-based programmable state machine technology in order to achieve deterministic processing rates that are independent of both input and pattern characteristics. Furthermore, BFSM can exploit hardware parallelism. The implementation achieved processing speeds on the order of 10 Gb/s on FPGA and at least 20 Gb/s on ASIC.
E. Yang et al. [18] proposed a pipelined field-merge architecture that enhances the bit-split approach [14] for Large-Scale String Matching (LSSM); it is memory-efficient and yields high throughput. The fundamental concept of the pipelined architecture is to partition each 8-bit input character into several bit-field inputs of smaller width (usually 2 bits). Each bit-field input is then matched in a partial state machine (PSM) pipeline constructed from the respective bit-field patterns. Finally, in every pipeline stage, an auxiliary table (ATB) is used to combine the matching results from all the bit fields.
One of the problems associated with AC-DFA (Aho-Corasick Deterministic Finite Automaton) is that it requires a large number of state transitions. This problem inhibits efficient hardware implementations that achieve high performance. Recent work [18], [19] has shown that the AC-DFA can be reduced to a character trie that contains only the forward transitions by incorporating pipelined processing. However, these designs are limited either in handling long patterns or in extending to multi-character input per clock cycle to achieve high throughput.
Weirong Jiang et al. [20] generalized the previously mentioned problem and provided a better solution by presenting a scalable pipeline architecture. The architecture can be easily extended to support multi-character input per clock cycle by mapping a compressed AC-DFA [21] onto multiple pipelines.
Hoang Le et al. [22] proposed the "leaf-attaching" approach to preprocess a given dictionary such that the resulting set of post-processed patterns can be searched using any tree-search data structure without increasing the number of patterns. Furthermore, the work proposed an architecture based on a pipelined binary search tree for large-scale string matching (LSSM). The new architecture is scalable in that the MASM module can be duplicated to accept multiple characters per cycle. One advantage of this work is that updating the dictionary involves simply rewriting the contents of the memory, which can be done without reconfiguring the FPGA chip. A study carried out by Gharae [23] provides deep insight into the pattern matching algorithms used in intrusion detection systems. Table III provides a comparison of different hardware-based string matching techniques.
TABLE III. COMPARISON OF VARIOUS HARDWARE-BASED STRING MATCHING PERFORMANCE SOLUTIONS
Authors | Approach | Platform | Number of patterns | Pattern Length | Throughput
Jung | Bit-Split | FPGA | 1316 | No limit | 1.76
Vasiliadis | AC-DFA | GeForce 8600GT | 4000 | <25 | 2.3
Lunteren | B_FSM | FPGA | 8000 | No limit | 2.2
Scarpazza | AC-DFA | Cell/B.E | 8400 | <10 | 2.5
E Yang | Field Merge | FPGA | 6944 | <64 | 4.56
Song | CDFA | ASIC | 1785 | No limit | 6.1
Jiang | Depth-Bounded Pipeline | FPGA | 9033 | No limit | 11.4
Hoang Le | leaf-attaching | FPGA | 10856 | <64 | 11.8
IV. HARDWARE-BASED TECHNIQUES FOR REGULAR EXPRESSION
Regular expression (regex) matching is an important mechanism used by NIDS to perform deep packet inspection against potential threats. Regex matching is becoming both a bottleneck and a vulnerability of the NIDS because of both the increasing bandwidth of network traffic and the huge number of patterns that need to be scanned. A basic regular expression matching engine can be implemented in hardware as a state machine, either a deterministic or a non-deterministic finite automaton [24].
Recently, a considerable literature has grown up around the theme of regular expression matching; a sample of which is provided in the following paragraphs.
Yamagaki et al. [25] presented a high-throughput approach for matching regular expressions. The approach converts the regular expressions into a multi-character Non-deterministic Finite Automaton (NFA) that processes multiple characters per clock cycle. The work was implemented on an FPGA platform. The results showed that the design can implement the range match operation efficiently.
Clark et al. [26] presented a scalable design
methodology for searching network packet payloads for
complex regular expressions. The basic idea is to use a
multi-character decoder based on the NFA technique that
produced high-performance circuits over a wide range of
pattern set sizes. The result showed that this methodology
was able to offer flexible trade-offs among throughput,
character capacity, and data bus rate.
Sourdis et al. [27] described an NFA-based regular expression pattern matching approach for reconfigurable hardware. The approach introduced three new basic building blocks to support constrained repetition syntax more efficiently. Optimization techniques were employed to maximize performance and reduce the area cost of the designs. Table IV provides a comparison of a few hardware-based regular expression matching engines.
TABLE IV. COMPARISON OF THE MOST COMMON HARDWARE-BASED REGULAR EXPRESSION MATCHING ENGINE PERFORMANCE SOLUTIONS
Authors | Approach | Non-Meta Chars | Number of LUT per state | Multi-Char Per Cycle | Throughput
Clark et al. | FPGA | 17,537 | 3.1 | 4 | 73.8
Sourdis et al. | FPGA | 69,127 | 0.66 | 1 | 2.42
Yamagaki et al. | FPGA | 40,896 | 0.94 | 4 | 3.63
Bispo et al. | FPGA | 19,580 | 1.28 | 1 | 2.9
V. HARDWARE-BASED INTRUSION DETECTION SYSTEMS
A hardware-based intrusion detection system is a scalable method, as it is able to inspect packets in high-speed networks. The hardware platforms reviewed earlier include general-purpose processors such as the Central Processing Unit (CPU), Graphics Processing Units (GPU), network processors, multi-core processors, Application Specific Integrated Circuits (ASIC), Field Programmable Gate Arrays (FPGA), and Ternary Content Addressable Memory (TCAM). Processors are highly programmable, but as a result of the Von Neumann architecture, the bandwidth between the processor and external memory is the limiting factor. Therefore, they are less effective for network processing. In contrast, ASICs provide better performance; however, they cannot be reconfigured. FPGAs, on the other hand, are reconfigurable hardware platforms that offer better performance than ASICs and processors. Most developers prefer FPGAs as a hardware platform for their superior performance and reprogrammability. The reported research work in this area has been well summarized by Chen et al. [5].
Ternary Content Addressable Memory (TCAM) [6, 28, 29] is a special ASIC that developers use in network search engines and multi-match packet classification engines. It has been adopted to solve the multi-match classification problem because of its ability to perform fast parallel matching. However, TCAMs are expensive, do not scale with respect to clock rate, and have high power consumption and large circuit area. In recent years, there has been an increasing amount of literature on packet classification. Taylor [30] provided an excellent survey of the seminal and recent solutions to the problem. The TCAM-based approach is expensive with respect to memory and power consumption when the rule set has many intersections [5]. Furthermore, researchers have attempted to reduce the power consumption of TCAM-based approaches [29].
Graphics Processing Units have been used to speed up computationally intensive tasks in intrusion detection systems. For example, Vasiliadis et al. reported the first systematic study of using a GPU in 2008 [31]. Each of these hardware solutions has its own advantages and disadvantages. Table V shows a comparison among hardware platform solutions.
The performance of a hardware implementation will, in general, exceed that of software. However, there are still many constraints that must be taken into account when designing an IDS based on hardware detection approaches, because these hardware solutions are complex to customize and are usually tied to a specific application, making it difficult to update the design [32].
TABLE V. COMPARISON AMONG HARDWARE PLATFORM SOLUTIONS
Multi-Core Processors | ASIC | FPGA
Can enhance the aggregate throughput dramatically by using a large number of threads to process multiple input streams in parallel | Provide impressively high per-stream throughput | Provide desirable high performance, flexibility of software, and reconfigurable programming
Additional complexity is introduced in scheduling, buffering, ordering, and load balancing | The applicability is limited by the high implementation cost and low reprogrammability | It takes considerable time to resynthesize the design and reprogram the FPGA device
VI. CLASSIFICATION TREE OF THE SURVEY
We summarize our findings in Table VI. Despite our best attempts, this table does not cover all the available literature. The IDS platform is listed in column 2. The audit features column describes the features a system works on. The analysis time column indicates whether the IDS is real-time or offline. The audit material column specifies the type of data collection mechanism.
In [33], the authors developed a different, traffic-aware, modular approach to the design of an FPGA-based NIDS. The approach classifies and groups homogeneous traffic and dispatches it to different hardware blocks, each supporting a (smaller) rule set tailored to the specific traffic category. To support the dispatcher, the SNORT rules were split into subdivisions based on protocol, port, and traffic direction. The string matching circuit in this work is based on the shift-and-compare architecture presented in [34]. The IDS was implemented on the INVEA COMBO-LXT, a PCI Express x8 mother card equipped with a XILINX Virtex5 XC5VLX155T, two QDR RAM memories, and up to 4 GB of DDR2 memory. The authors stated that their approach produced an 80% reduction in the number of lookup tables (LUTs).
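The traffic-aware dispatching idea can be illustrated in software with a short Python sketch. It is only an analogy under stated assumptions: the rule records are hypothetical and do not follow SNORT syntax, the names build_blocks and inspect are invented for the example, and Python's substring test stands in for the shift-and-compare matching circuit of [34]. Rules are grouped by (protocol, port, direction), and each packet is checked only against the small group that fits its traffic category.

```python
from collections import defaultdict

# Hypothetical, simplified rule records (not actual SNORT rule syntax).
RULES = [
    {"id": 1, "proto": "tcp", "port": 80, "direction": "to_server",
     "pattern": b"/etc/passwd"},
    {"id": 2, "proto": "tcp", "port": 80, "direction": "to_server",
     "pattern": b"cmd.exe"},
    {"id": 3, "proto": "tcp", "port": 25, "direction": "to_server",
     "pattern": b"VRFY root"},
    {"id": 4, "proto": "udp", "port": 53, "direction": "to_client",
     "pattern": b"\x00\x00\xff"},
]


def build_blocks(rules):
    """Group rules into per-category blocks keyed by (proto, port, direction)."""
    blocks = defaultdict(list)
    for rule in rules:
        key = (rule["proto"], rule["port"], rule["direction"])
        blocks[key].append(rule)
    return dict(blocks)


def inspect(packet, blocks):
    """Dispatch the packet to its block and return the ids of matching rules."""
    key = (packet["proto"], packet["port"], packet["direction"])
    return [rule["id"] for rule in blocks.get(key, [])
            if rule["pattern"] in packet["payload"]]


BLOCKS = build_blocks(RULES)
web_packet = {"proto": "tcp", "port": 80, "direction": "to_server",
              "payload": b"GET /../../etc/passwd HTTP/1.0"}
print(inspect(web_packet, BLOCKS))
```

Because each block holds only the rules for one traffic category, a matcher built for it is smaller than one built for the full rule set, which is the intuition behind the reported LUT reduction.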
In the same fashion, the authors of [35] implemented a real-time intrusion detection system prototype on a field-programmable gate array (FPGA) that supports a high-frequency sampling rate. The FPGA is equipped with a software-driven engine, which is implemented based on an internally evolvable, improved Block Based Neural Network (BBNN), an intelligent learning algorithm, to perform attack-related pattern recognition. This combination provides the flexibility to learn and detect unknown attacks. The experiment compared the improved system with four major Support Vector Machine (SVM) schemes and the Naive Bayes algorithm. The authors used flow records extracted from a DARPA packet-based dataset after converting it into NetFlow format, because the DARPA dataset is in tcpdump format. tcpdump is a common command-line packet analyzer that lets the user display TCP/IP and other packets being transmitted or received over a network to which the computer is attached. The results indicated that the improved BBNN outperforms the other algorithms with respect to classification and detection performance. The system's false alarm rate was reduced to as low as 5.14% with a detection rate of 99.92%. This work uses a large-scale Altera Cyclone III FPGA Starter Board with a Universal Serial Bus (USB) interface. It is worth noting that an SVM-based IDS can classify only discrete features; thus, preprocessing of those features is required.
Similarly, the work in [36] examined a hybrid network intrusion detection system (SFAOENIDS) for Solarflare AOE devices that couples an inline FPGA with a network adapter to provide hardware support for pattern matching and software support for post-processing. The system was implemented using a hardware description language. SFAOENIDS was experimentally tested in a real network environment; operating at 200 MHz, it handled a 10 Gbps data rate without dropping packets while simultaneously minimizing the server CPU load. SFAOENIDS performance was tested and compared against SNORT. Since SFAOENIDS lacks TCP stream reassembly features, these were disabled in SNORT. The CPU-only SNORT implementation dropped 90% of packets at speeds near 10 Gbps. In contrast, SFAOENIDS dropped zero packets at these speeds.
Equally important, the authors of [37] proposed an anomaly-based IDS using an FPGA architecture. First, a feature extractor module (FEM) was developed. By using an FPGA, a significant performance gain was achieved compared with software solutions such as SNORT; the work showed that a throughput of around 22 Gbps can be achieved. The proposed IDS also used Principal Component Analysis (PCA) to reduce data dimensionality. The analysis produced acceptable accuracy scores on the KDDCup dataset. The FPGA used in the work was from the Xilinx Virtex family.
In contrast to [37], the authors of [38] considered a support vector machine intrusion detection method that used a Graphics Processing Unit (GPU)-based parallel computing model. The work used the KDDCup1999 dataset to evaluate the proposed model. The simulation results showed that the implemented system reduced the time consumed by the training procedure while keeping the detection performance of the IDS unchanged. This work was based on the anomaly detection method.
By the same token, the work in [39] studied a layered, simple-statistics, anomaly-based network intrusion detection system. The study used a parallel GPU implementation of the training algorithm. The work was done in two stages: a training stage and a detection stage. In the training stage, similar operations were performed for all of the packets in the training set. These operations included threshold setting, bin statistics computation, packet scoring, and bin counting. In the detection stage, the computation included bin counting and packet scoring. The study tested the performance of the parallel implementation using the Compute Unified Device Architecture (CUDA) framework on an Nvidia GTX 780 and on a CPU, using a machine with an AMD FX-8350, 16 GB RAM, and Ubuntu Linux 12.04.2.
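The four training operations named above (bin counting, bin statistics computation, packet scoring, and threshold setting) can be sketched for a single feature in plain Python. This is a minimal sequential illustration and not the method of [39]: the feature (packet length), the bin width, the smoothed negative-log-frequency score, and the quantile-based threshold are all assumptions made for the example, and the function names are hypothetical.

```python
import math
from collections import Counter

BIN_WIDTH = 100  # assumed bin width for a packet-length feature


def bin_of(value):
    """Map a feature value to its bin index."""
    return value // BIN_WIDTH


def train(values, quantile=0.99):
    """Return (score_function, threshold) learned from training values."""
    counts = Counter(bin_of(v) for v in values)       # bin counting
    total = len(values)
    n_bins = len(counts) + 1                          # one extra "unseen" bin

    def score(value):                                 # packet scoring
        # Bin statistic: add-one smoothed relative frequency of the bin.
        freq = (counts.get(bin_of(value), 0) + 1) / (total + n_bins)
        return -math.log(freq)

    scores = sorted(score(v) for v in values)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    threshold = scores[index]                         # threshold setting
    return score, threshold


def is_anomalous(value, model):
    """Detection stage: bin the value, score it, compare with the threshold."""
    score, threshold = model
    return score(value) > threshold


# Hypothetical training set of packet lengths.
TRAINING = [60] * 500 + [1500] * 480 + [700] * 20
MODEL = train(TRAINING)
print(is_anomalous(9000, MODEL), is_anomalous(60, MODEL))
```

Each packet is scored independently of the others, which is the property that makes this kind of computation a natural fit for a data-parallel GPU implementation.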
  • Conference Paper
    Many network security applications such as Intrusion Detection System (IDS), Firewall and Data Loss Prevention System (DLPS) are based on deep packet inspection, in this packets header as well as payload of the packets are checked with predefined attack signature to identify whether it contains malicious traffic or not. To perform this checking different pattern matching methods are used by NIDS. The most popular method to implement pattern matching is to use of Finite Automata (FA). Generally, regular expressions are used to represent most of the attack signatures defined by NIDS. They are implemented using finite automata, which takes the payload of packet as input string. However, existing approaches of Finite Automata (FA), both deterministic finite automata (DFA) and non-deterministic finite automata (NFA) for pattern matching are having their own advantages and some drawbacks. The DFA based pattern matching methods are fast enough but require more memory. However, NFA based pattern matching methods are comparatively takes less memory but the speed of matching is very slow, to overcome these drawbacks of finite automata there are many approaches have been proposed. This paper discuses comparative study of some Finite Automata (FA) based techniques for pattern matching in network intrusion detection system (NIDS).
  • In this paper, we present the design and implementation of Haetae, a high-performance Suricata-based NIDS on many-core processors (MCPs). Haetae achieves high performance with three design choices. First, Haetae extensively exploits high parallelism by launching NIDS engines that independently analyze the incoming flows at high speed as much as possible. Second, Haetae fully leverages programmable network interface cards to offload common packet processing tasks from regular cores. Also, Haetae minimizes redundant memory access by maintaining the packet metadata structure as small as possible. Third, Haetae dynamically offloads flows to the host-side CPU when the system experiences a high load. This dynamic flow offloading utilizes all processing power on a given system regardless of processor types. Our evaluation shows that Haetae achieves up to 79.3 Gbps for synthetic traffic or 48.5 Gbps for real packet traces. Our system outperforms the best-known GPU-based NIDS by 2.4 times and the best-performing MCP-based system by 1.7 times. In addition, Haetae is 5.8 times more power efficient than the state-of-the-art GPU-based NIDS.
  • Article
    To optimize training procedure of IDS based on SVM and reduce time consumption, a SVM intrusion detection method based on GPU is proposed in the study. During the simulation experiments with KDD Cup 1999 data, GPU-based parallel computing model is adopted. Results of the simulation experiments demonstrate that time consumption in the training procedure of IDS is reduced, and performance of IDS is kept as usual.
  • Conference Paper
    A network intrusion detection system (NIDS), such as SNORT, analyzes incoming packets to identify potential security threats. Pattern matching is arguably the most important and most computationally intensive component of a NIDS. Software-based NIDS implementations drop up to 90% of packets during increased network load even at lower network bandwidth. We propose an alternative hybrid NIDS that couples an FPGA with a network adapter to provide hardware support for pattern matching and software support for post-processing. The proposed system, SFAOENIDS, offers an extensible open-source NIDS for Solarflare AOE devices. The pattern matching engine, the primary component of the hardware architecture, was designed based on the requirements of typical NIDS implementations. In testing on a real network environment, the SFAOENIDS hardware implementation, operating at 200 MHz, handles a 10 Gbps data rate without dropping packets while simultaneously minimizing the server CPU load.
  • Article
    Data security is a major responsibility for sensor networks, as there are various ways in which security can be breached, enabling hackers to access sensitive data. Threats to wireless sensor networks (WSNs) are numerous and potentially devastating. Security issues ranging from session hijacking to Denial of Service (DoS) can plague a WSN. To aid in the defense against and detection of these potential threats, WSNs employ a security solution that includes an intrusion detection system (IDS). Different neural methods have been proposed in recent years for the development of intrusion detection systems. In this paper, we survey denial of service attacks that disrupt a WSN in such a way that the network is temporarily paralyzed, and we propose a hybrid intrusion detection approach based on stream flow and session state transition analysis that monitors and analyzes the stream flow of data, identifies abnormal network activity, and detects policy violations against SYN flood attacks.
  • Article
    In network intrusion detection, anomaly-based solutions complement signature-based solutions in mitigating zero-day attacks, but require extensive training and learning to effectively model what the normal pattern for a given system (or service) looks like. Though the training typically happens off-line, and its processing speed is not as important as that of the detection stage (which occurs on-line in real time), continuous analysis and retuning may be attractive depending on the deployment scenario. The different types of computation required to perform automatic retuning (or retraining) of the system may compete for resources with other important system tasks. Thus, a mechanism by which the retuning can take place without affecting the actual system workload is important. In this paper, we describe a layered anomaly detection algorithm based on simple statistics, with a parallel implementation of the training algorithm. We focus on the use of graphics processing units (GPUs) to allow cost-efficient implementation with minimal impact on CPU load, so that day-to-day server workloads are affected as little as possible. Our results show potential for significant performance improvements.
  • Conference Paper
    As high-speed networks are becoming commonplace, it is increasingly challenging to prevent the attack attempts at the edge of the Internet. While many high-performance intrusion detection systems (IDSes) employ dedicated network processors or special memory to meet the demanding performance requirements, it often increases the cost and limits functional flexibility. In contrast, existing software-based IDS stacks fail to achieve a high throughput despite modern hardware innovations such as multicore CPUs, manycore GPUs, and 10 Gbps network cards that support multiple hardware queues. We present Kargus, a highly-scalable software-based IDS that exploits the full potential of commodity computing hardware. First, Kargus batch processes incoming packets at network cards and achieves up to 40 Gbps input rate even for minimum-sized packets. Second, it exploits high processing parallelism by balancing the pattern matching workloads with multicore CPUs and heterogeneous GPUs, and benefits from extensive batch processing of multiple packets per each IDS function call. Third, Kargus adapts its resource usage depending on the input rate, significantly saving the power in a normal situation. Our evaluation shows that Kargus on a 12-core machine with two GPUs handles up to 33 Gbps of normal traffic and achieves 9 to 10 Gbps even when all packets contain attack signatures, a factor of 1.9 to 4.3 performance improvements over the existing state-of-the-art software IDS. We design Kargus to be compatible with the most popular software IDS, Snort.
  • Conference Paper
    Future large-scale complex computing environments present challenges to real-time intrusion detection systems (IDSs). In this paper, we design a prototype with a hybrid software-enabled detection engine based on our improved block-based neural network (BBNN), and integrate it with a high-frequency FPGA board to form a real-time intrusion detection system. The established prototype can seamlessly feed large-scale NetFlow data obtained from Cisco routers directly into the improved BBNN-based IDS. The corresponding BBNN structure and parameter settings have been improved and experimentally tested. Experimental performance comparisons have been conducted against four major Support Vector Machine (SVM) schemes and the Naive Bayes algorithm. The results show that the improved BBNN outperforms the other algorithms in classification and detection performance. The false alarm rate is reduced to as low as 5.14% while a genuine detection rate of 99.92% is maintained.
  • Article
    The security of today's networks relies heavily on network intrusion detection systems (NIDSs). The ability to promptly update the supported rule sets and detect newly emerging attacks makes field-programmable gate arrays (FPGAs) a very appealing technology. An important issue is how to scale FPGA-based NIDS implementations to ever faster network links. A trivial approach is to balance traffic over multiple, functionally equivalent hardware blocks, each implementing the whole rule set (several thousand rules), but its obvious drawback is the linear increase in resource occupation. In this work, we promote a different, traffic-aware, modular approach to the design of FPGA-based NIDS. Instead of purely splitting traffic across equivalent modules, we classify and group homogeneous traffic, and dispatch it to differently capable hardware blocks, each supporting a (smaller) rule set tailored to the specific traffic category. We implement and validate our approach using the rule set of the well-known Snort NIDS, and we experimentally investigate the emerging trade-offs and advantages, showing resource savings of up to 80 percent based on real-world traffic statistics gathered from an operator's backbone.
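The traffic-aware, modular idea in the abstract above (classify traffic, then dispatch each category to a block that holds only the rules relevant to it) can be illustrated in software. The following is a minimal sketch, not the authors' FPGA design: the rules, the port-based classifier, the names `RULES`, `PORT_CATEGORY`, `build_modules`, `classify` and `inspect`, and the fallback of checking every rule for unclassified traffic are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rule set: (rule_id, traffic category, byte pattern).
RULES = [
    (1, "http", b"/etc/passwd"),
    (2, "http", b"cmd.exe"),
    (3, "dns", b"\x00\x00\xfc"),
    (4, "smtp", b"VRFY root"),
]

# Hypothetical classifier: map a destination port to a traffic category.
PORT_CATEGORY = {80: "http", 53: "dns", 25: "smtp"}


def build_modules(rules):
    """Group rules by category so each module holds only its own subset."""
    modules = defaultdict(list)
    for rule_id, category, pattern in rules:
        modules[category].append((rule_id, pattern))
    return dict(modules)


def classify(dst_port):
    """Return the traffic category for a destination port, or None."""
    return PORT_CATEGORY.get(dst_port)


def inspect(modules, dst_port, payload):
    """Dispatch a packet to its category's module; return matching rule ids."""
    category = classify(dst_port)
    if category is None:
        # Assumed fallback: unclassified traffic is checked against all rules.
        candidates = [rule for subset in modules.values() for rule in subset]
    else:
        candidates = modules.get(category, [])
    return [rule_id for rule_id, pattern in candidates if pattern in payload]


modules = build_modules(RULES)
print(inspect(modules, 80, b"GET /etc/passwd HTTP/1.1"))  # checked by the HTTP module
print(inspect(modules, 53, b"GET /etc/passwd HTTP/1.1"))  # DNS module holds no HTTP rules
```

In this sketch each module stores only its category's patterns, so the per-module rule count is smaller than in a design that replicates the whole rule set in every block. How far that reduces hardware resources depends on the real traffic mix and rule set, which this example does not model.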