Conference Paper

A Parallel Branching Program Machine for Emulation of Sequential Circuits

DOI: 10.1007/978-3-642-00641-8_26
Conference: Reconfigurable Computing: Architectures, Tools and Applications, 5th International Workshop, ARC 2009, Karlsruhe, Germany, March 16-18, 2009. Proceedings
Source: DBLP

ABSTRACT

The parallel branching program machine (PBM128) consists of 128 branching program machines (BMs) and a programmable interconnection.
To represent logic functions on BMs, we use quaternary decision diagrams. To evaluate functions, we use 3-address quaternary
branch instructions. We emulated many benchmark circuits on the PBM128, and compared its memory size and computation time with
those of Intel's Core2Duo microprocessor. The PBM128 requires approximately a quarter of the memory of the Core2Duo, and is 21.4-96.1
times faster than the Core2Duo.
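
As a rough illustration of how a single BM might evaluate a function stored as a quaternary decision diagram, the following minimal sketch interprets a tiny program built from branch and output instructions. The instruction encoding is an assumption made for illustration only: each branch tests one pair of input bits (a quaternary value 0-3), holds three explicit jump addresses for the values 1, 2, and 3, and falls through to the next instruction for the value 0. The names `Branch`, `Output`, `run_bm`, and `PROGRAM` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class Branch:
    pair: int                      # index of the 2-bit input pair to test
    targets: Tuple[int, int, int]  # jump addresses for pair values 1, 2, 3


@dataclass(frozen=True)
class Output:
    value: int                     # function value emitted at a terminal node


Instruction = Union[Branch, Output]


def run_bm(program: Sequence[Instruction], bits: Sequence[int]) -> int:
    """Follow branches from address 0 until an output instruction is reached."""
    pc = 0
    while True:
        ins = program[pc]
        if isinstance(ins, Output):
            return ins.value
        # Combine two binary inputs into one quaternary value (0..3).
        v = 2 * bits[2 * ins.pair] + bits[2 * ins.pair + 1]
        # Assumed encoding: value 0 falls through, values 1..3 jump explicitly.
        pc = pc + 1 if v == 0 else ins.targets[v - 1]


# Hypothetical program for f = (x0 XOR x1) AND (x2 OR x3).
PROGRAM = [
    Branch(pair=0, targets=(2, 2, 1)),  # 0: x0x1 = 01 or 10 -> 2, else -> 1
    Output(0),                          # 1: f = 0
    Branch(pair=1, targets=(4, 4, 4)),  # 2: x2x3 = 00 -> 3, else -> 4
    Output(0),                          # 3: f = 0
    Output(1),                          # 4: f = 1
]

if __name__ == "__main__":
    print(run_bm(PROGRAM, [1, 0, 0, 1]))
```

Because each branch consumes two input bits at once, this four-input function is evaluated with at most two branch steps, where a binary decision diagram could need up to four.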

Available from: Hiroki Nakahara
  • Source
    • "The throughput for the state-of-the-art packet classifier using the MPU is at most hundreds mega bits per second [4], so it cannot keep up with accelerated speed up of the Internet. This paper shows a packet classifier using a parallel branching program machine (PBM) [12]. A branching program machine (BM) is a special purpose processor that uses only two instructions [2], [1], [21]. "
    ABSTRACT: A branching program machine (BM) is a special purpose processor that uses only two kinds of instructions: branch and output instructions. Thus, the architecture of the BM is much simpler than that of a general purpose processor (MPU). Since the BM uses dedicated instructions for a special purpose application, it is faster than the MPU. This paper presents a packet classifier using a parallel branching program machine (PBM). To reduce computation time and code size, a set of rules for the packet classifier is first partitioned into groups. Then, the groups are evaluated by the PBM in parallel. This paper also shows a method to estimate the number of BMs necessary to realize the packet classifier. The PBM32, consisting of 32 BMs, has been implemented on an FPGA and compared with Intel's Core2Duo. The PBM32 is 8.1-11.1 times faster than the Core2Duo, and requires only 0.2-10.3 percent of the memory of the Core2Duo.
    Full-text · Conference Paper · Sep 2010
  • Source
    • "6 concludes the paper. This paper builds on the previous publication [18] "
    ABSTRACT: The parallel branching program machine (PBM128) consists of 128 branching program machines (BMs) and a programmable interconnection. To represent logic functions on BMs, we use quaternary decision diagrams. To evaluate functions, we use 3-address quaternary branch instructions. We realized many benchmark functions on the PBM128, and compared its memory size, computation time, and power consumption with those of Intel's Core2Duo microprocessor. The PBM128 requires approximately a quarter of the memory of the Core2Duo, and is 21.4-96.1 times faster than the Core2Duo. It dissipates a quarter of the power of the Core2Duo. Also, we realized packet filters such as an access controller and a firewall, and compared their performance with software on the Core2Duo. For these packet filters, the PBM128 requires approximately 17% of the memory of the Core2Duo, and is 21.3-23.7 times faster than the Core2Duo.
    Full-text · Article · Aug 2010 · IEICE Transactions on Information and Systems
  • Source
    • "Applications for DDMs include industrial process controllers [17], and logic simulators [5]. In [11], a parallelized DDM has been proposed. Compared with the Intel's Core2Duo microprocessor, it requires a quarter of the memory for the Core2Duo, while is about 100 times faster at its peak performance. "
    ABSTRACT: This paper compares six decision diagram machines (DDMs) with respect to area-time complexity, throughput, and compatibility with existing memory. First, six types of decision diagrams (DDs) are introduced: BDD, MDD, QRBDD, QRMDD, heterogeneous MDD (HMDD), and QRHMDD. Second, the corresponding DDMs are developed. Third, memory sizes and average path lengths (APLs) for these DDs are compared. As for area-time complexity, the QDDM is the best; as for throughput, the QRQDDM is the best; and as for compatibility with existing memory, the HMDDM is the best.
    Full-text · Conference Paper · Jan 2010