A Design Method of a Regular Expression Matching Circuit Based on Decomposed Automaton.

IEICE Transactions 01/2012; 95-D:364-373. DOI: 10.1587/transinf.E95.D.364
Source: DBLP

ABSTRACT This paper shows a design method for a regular expression matching
circuit based on a decomposed automaton. To implement a regular
expression matching circuit, first, we convert a regular expression into
a non-deterministic finite automaton (NFA). Then, to reduce the number
of states, we convert the NFA into a merged-states non-deterministic
finite automaton with unbounded string transition (MNFAU) using a greedy
algorithm. Next, to realize it with a feasible amount of hardware, we
decompose the MNFAU into a deterministic finite automaton (DFA) and an
NFA. The DFA part is implemented by an off-chip memory and a simple
sequencer, while the NFA part is implemented by a cascade of logic
cells. We also show that the MNFAU-based implementation has lower area
complexity than the DFA-based and NFA-based ones.
Experiments using regular expressions from SNORT show that, in terms of
embedded memory size per character, the MNFAU is 17.17-148.70 times
smaller than DFA methods, and in terms of the number of logic cells
(LCs) per character, it is 1.56-5.12 times smaller than NFA methods.
This paper describes the details of the MEMOCODE2010 HW/SW co-design
contest, in which we won the first place award.
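
To make the "cascade of logic cells" idea concrete, here is a minimal software sketch of how the NFA for a plain string can be simulated with one bit per state, so that each input character advances all active states at once. It is an illustration only: it uses the well-known shift-and technique for a single literal pattern, not the paper's MNFAU construction or its DFA/NFA decomposition, and the function names `build_masks` and `nfa_match_positions` are hypothetical.

```python
def build_masks(pattern: str) -> dict[str, int]:
    """For each character, set bit i if pattern[i] equals that character."""
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    return masks


def nfa_match_positions(pattern: str, text: str) -> list[int]:
    """Return the end positions in text where pattern matches.

    Bit i of `state` plays the role of one cell in the cascade: it is 1
    when pattern[0..i] matches the text ending at the current character.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    masks = build_masks(pattern)
    accept = 1 << (len(pattern) - 1)  # bit of the final NFA state
    state = 0
    hits = []
    for pos, ch in enumerate(text):
        # Shift every active state one cell forward, re-inject the start
        # state, then keep only the cells whose character matches.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos)
    return hits


print(nfa_match_positions("aba", "ababa"))  # [2, 4]
```

The shift corresponds to wiring each cell's output to the next cell's input, and the mask lookup corresponds to the per-character match signal that gates each cell.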

  • Source
    ABSTRACT: We introduce a family of simple and fast algorithms for solving the classical string matching problem, string matching with don't care symbols and complement symbols, and multiple patterns. In addition we solve the same problems allowing up to k mismatches. Among the features of these algorithms are that they are real time algorithms, they don't need to buffer the input, and they are suitable to be implemented in hardware.
    ACM SIGIR Forum 01/1988; 23(SI):168-175.
  • Source
    ABSTRACT: This paper describes a simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the pattern matching machine to process the text string in a single pass. Construction of the pattern matching machine takes time proportional to the sum of the lengths of the keywords. The number of state transitions made by the pattern matching machine in processing the text string is independent of the number of keywords. The algorithm has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
    Commun. ACM. 01/1975; 18:333-340.
  • Source
    ABSTRACT: We present an extensible automation framework for constructing and optimizing large-scale regular expression matching (REM) circuits on FPGA. Paralleling the technique used by software compilers, we divide our framework into two parts: a frontend that parses each PCRE-formatted regular expression (regex) into a modular non-deterministic finite automaton (RE-NFA), followed by a backend that generates the REM circuit design for a multi-pipeline architecture. With such organization, various pattern and circuit level optimizations can be applied to the frontend and backend, respectively. The multi-pipeline architecture utilizes both logic slices and on-chip BRAM for optimized character matching; in addition, it can be configured at compile-time to produce concurrent matching outputs from multiple RE-NFAs. Our framework prototype handles up to 64k "regular" regexes with arbitrary complexity and number of states, limited only by the hardware resources of the target device. Running on a commodity 2.3 GHz PC (AMD Opteron 1356), it takes less than a minute for the framework to convert ~1800 regexes used by the Snort IDS into RTL-level designs with optimized logic and memory usage. Such an automation framework could be invaluable to REM systems to update regex definitions with minimal human intervention.
    International Conference on Field Programmable Logic and Applications, FPL 2010, August 31 2010 - September 2, 2010, Milano, Italy; 01/2010
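
The keyword-matching machine described in the second abstract above (Commun. ACM, 1975), commonly known as the Aho-Corasick algorithm, can be sketched as follows. This is a minimal illustration assuming non-empty, distinct keywords; the names `build_machine` and `find_all` are hypothetical, and the data layout (lists of dicts) is chosen for readability rather than speed.

```python
from collections import deque


def build_machine(keywords):
    """Build goto, failure and output tables for a set of keywords."""
    goto = [{}]   # goto[s][ch] -> next state (trie edges)
    fail = [0]    # fail[s] -> state to fall back to on a mismatch
    out = [[]]    # out[s] -> keywords recognized on entering state s
    for word in keywords:
        s = 0
        for ch in word:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(word)
    # Breadth-first pass: failure links of shallower states are final
    # before any deeper state needs them.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def find_all(text, keywords):
    """Return (start_index, keyword) for every occurrence, in one pass."""
    goto, fail, out = build_machine(keywords)
    s = 0
    hits = []
    for pos, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for word in out[s]:
            hits.append((pos - len(word) + 1, word))
    return hits


print(find_all("ushers", ["he", "she", "his", "hers"]))
# [(1, 'she'), (2, 'he'), (2, 'hers')]
```

The text is scanned once from left to right, and the failure links let the machine reuse the longest matching suffix instead of restarting from the beginning after a mismatch.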
