Fig 5 - uploaded by Johann Glaser
Area proportions of the TR-FSM parts, shown for the three instance types. MUX:NS: large multiplexer to select next state, MUX:O: large multiplexer to select the output.


Source publication
Article
Full-text available
Finite State Machines (FSMs) are a key element of integrated circuits. Hard-coded FSMs do not allow changes after the ASIC production. While an embedded FPGA IP core provides flexibility, it is a complex circuit, requires difficult synthesis tools, and is expensive. This article presents and evaluates a novel architecture that is specifically optim...

Contexts in source publication

Context 1
... An evaluation of the area share of the individual building blocks of a TR-FSM (see Fig. 5) shows that the TRs (SSG, ISM, IPG, NSR, OPR) account for 91.3 %, 92.9 % and 93.7 % of the area for instance 1, instances 2 and 3, and instance 4, respectively. The ISM and the OPR grow disproportionately because of their strong dependency on the number of inputs $n_I$ and outputs $n_O$, respectively: All ISMs of TR-FSM instance 1 require 20 % of the area, ...
Context 2
... 2 and 3 and to 37.8 % for the largest instance 4. The relative OPR size increases from 16.6 % to 19.5 % and 22.4 %. Building Blocks. The state register requires considerably less than 1 % of the total area because it consists of just $n_S$ D flip-flops. The large multiplexers for the next state and the output value (denoted by "MUX:NS" and "MUX:O" in Fig. 5) depend linearly on the number of state bits $n_S$ and outputs $n_O$, respectively, as well as on the number of transition rows T ...

Similar publications

Article
Full-text available
Cuckoo hashing has proven to be an efficient option to implement exact matching in networking applications. It provides good memory utilization and deterministic worst case access time. The continuous increase in speed and complexity of networking devices creates a need for higher throughput exact matching in many applications. In this paper, a new...

Citations

... The behavior of a sequential block can be represented using a finite state machine (FSM) model [6,7]. These models are widely used for representing, for example, such sequential blocks as (1) the control units of computers and microcontrollers [8][9][10][11][12][13]; (2) the hardware-software interfaces and communication protocols of embedded systems [14,15]; (3) various cryptoprocessors [16]; (4) hypertangent and exponential functions [17]; (5) various integral stochastic computing blocks [18]; (6) the activation functions of deep neural networks [19,20]; (7) different stages of cascaded digital processing systems [21,22]. ...
Article
Full-text available
Methods for reducing power consumption in circuits of finite state machines (FSMs) are discussed in this review. The review outlines the main approaches to solving this problem that have been developed over the last 40 years. The main sources of power dissipation in CMOS circuits are shown; the static and dynamic components of this phenomenon are analyzed. Power savings can be achieved by using coarse-grained methods common to all digital systems. These methods are based on voltage and/or clock frequency scaling. The review shows the main structural diagrams generated by the use of these methods when optimizing the power characteristics of FSM circuits. Also, there are various known fine-grained methods taking into account the specifics of both FSMs and the logic elements used. Three groups of the fine-grained methods targeting FPGA-based FSM circuits are analyzed. These groups include clock gating, state assignment, and replacing look-up table (LUT) elements by embedded memory blocks (EMBs). The clock gating involves a separate or joint use of such approaches as the (1) decomposition of FSM inputs and (2) disabling FSM inputs. The aim of the power-saving state assignment is to reduce the switching activity of a resulting FSM circuit. The replacement of LUTs by EMBs allows a reduction in the power consumption due to a decrease in the number of FSM circuit elements and their interconnections. We hope that the review will help experts to use known methods and develop new ones for reducing power consumption. We think that a good knowledge and understanding of existing methods of reducing power consumption is a prerequisite for the development of new, more effective methods to solve this very important problem. Although the methods considered are mainly aimed at FPGA-based FSMs, they can be modified, if necessary, and used for the power consumption optimization of FSM circuits implemented with other logic elements.
... Very often, the model of a Mealy finite state machine (FSM) [4,5] is used to specify sequential blocks. For example, FSMs specify: 1) the complex hyper-tangent and exponential functions and other complex functions [6]; 2) the hardware-software interfaces of embedded systems [7]; 3) the blocks of integral stochastic computing [8]; 4) the activation functions for deep neural networks [9,10]; 5) different stages of cascaded digital processing systems [11,12]; 6) the control units of computers and other complex digital systems [5,13-19]. Based on the popularity of FSMs, we also use this model in our current research. ...
... To find SBFs (2)-(3), it is necessary to transform the initial STT into a direct structure table (DST). It has the following columns [12,39]: $a_m$ is a current state; $K(a_m)$ is a code of the state $a_m \in A$; $a_s$ is a state of transition; $K(a_s)$ is a code of the state $a_s \in A$; $X_h$ is an input signal causing the transition $\langle a_m, a_s \rangle$ and equal to a conjunction of some inputs $x_l \in X$ (or their complements); $Y_h$ is a collection of outputs $y_n \in Y$ generated during the transition $\langle a_m, a_s \rangle$; $\Phi_h$ is a set of IMFs equal to 1 to replace the code $K(a_m)$ by the code $K(a_s)$ in RG; $h$ is a number of transition ($h \in \{1, \ldots, H\}$). ...
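The DST columns listed in this context can be pictured as one record per transition. The following is only a minimal sketch, not the cited authors' procedure: the names `DstRow`, `build_dst`, `STT` and `CODES` are hypothetical, the state codes are made up, and it assumes the register RG is built from D flip-flops, so that the IMFs equal to 1 are exactly the bit positions where the code of the state of transition is 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DstRow:
    h: int             # number of the transition, 1..H
    a_m: str           # current state
    k_am: str          # code K(a_m) as a bit string
    a_s: str           # state of transition
    k_as: str          # code K(a_s) as a bit string
    x_h: str           # conjunction of inputs causing the transition
    y_h: frozenset     # outputs generated during the transition
    phi_h: frozenset   # IMFs equal to 1


def imfs_for_d_flip_flops(k_as):
    """ASSUMPTION: D flip-flops, so D_r = 1 wherever bit r of K(a_s) is 1."""
    return frozenset(f"D{r}" for r, bit in enumerate(k_as, start=1) if bit == "1")


def build_dst(stt, codes):
    """Turn STT rows (a_m, a_s, X_h, Y_h) into DST rows using the given state codes."""
    rows = []
    for h, (a_m, a_s, x_h, y_h) in enumerate(stt, start=1):
        rows.append(DstRow(h, a_m, codes[a_m], a_s, codes[a_s],
                           x_h, frozenset(y_h),
                           imfs_for_d_flip_flops(codes[a_s])))
    return rows


# Hypothetical three-state example; "1" denotes an unconditional transition.
STT = [
    ("a1", "a2", "x1", ["y1"]),
    ("a1", "a3", "~x1", ["y2"]),
    ("a2", "a3", "1", ["y1", "y2"]),
    ("a3", "a1", "1", []),
]
CODES = {"a1": "00", "a2": "01", "a3": "10"}
DST = build_dst(STT, CODES)
```

With these made-up codes, the first row moves from code 00 to code 01, so only the second flip-flop input has to be driven to 1.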
... Ref. [9] considers the problem of state encoding and optimization of the combinational part upon the implementation of high-performance FSMs in complex programmable logic devices (CPLD). Ref. [10] presents a novel architecture that is particularly optimized for implementation of reconfigurable FSMs; this architecture is called the transition-based reconfigurable FSM (TR-FSM) and shows a significant reduction in area, speed, and power consumption in relation to FPGA architectures. In [11], the implementation of finite state machines in FPGA with the application of integral blocks of read-only memory (ROM) is described. ...
Article
Full-text available
Different strategies for the combination of merging and splitting transformation procedures for incompletely specified finite state machines implemented on field-programmable logic devices are offered. In these methods, such optimization criteria as the speed of operation, power consumption and implementation cost are considered already in the early phase of finite state machine synthesis. The methods also take into account the technological features of programmable logic devices and the state assignment method. The transformation quality ratio is calculated on the basis of estimates of consumed power, critical path delay and number of utilized logic cells. The user is also able to choose the order of merging and splitting procedures and the direction of the optimization by setting weights for each criterion. The methods for estimating the optimization criteria values are described, and the experimental results are also discussed.
... Therefore, the area of the look-up table grows substantially as Ni or Ns grows. A developer can use the following approaches to reduce the area: decomposition of the initial finite state machine into sub-automata [12]; additional multiplexing schemes for input signals, which decrease the number of input signals (Ni) connected to the look-up table compared with the number of input signals of the finite state machine [13], [14], [15]. ...
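The growth described in this context can be illustrated with a rough count of storage bits. This is a sketch under stated assumptions, not a figure from the cited paper: it assumes a single look-up table addressed by all Ni input bits and Ns state bits, storing Ns next-state bits plus a hypothetical number of output bits `n_o` per word. The function name `lut_bits` is made up for illustration.

```python
def lut_bits(n_i, n_s, n_o):
    """Storage bits of one look-up table addressed by n_i inputs and n_s state bits.

    ASSUMPTION: each of the 2**(n_i + n_s) words holds the next state
    (n_s bits) and the outputs (n_o bits).
    """
    return 2 ** (n_i + n_s) * (n_s + n_o)


# All 8 FSM inputs connected to the table.
full = lut_bits(n_i=8, n_s=4, n_o=4)

# Input multiplexing: only 2 selected inputs reach the table.
# The area of the added multiplexers is not counted here.
muxed = lut_bits(n_i=2, n_s=4, n_o=4)

print(full, muxed)
```

Every input removed from the table address halves the number of words, which is why input multiplexing can pay off even though the multiplexers add some area of their own.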
Conference Paper
Full-text available
Transport Protocol Controllers are parts of most terminal nodes of local networks widely used in information and telecommunication systems. Typically, a terminal node (a node of a local network) is in service for several years. New applications, new versions of transport protocols, or new transport protocols could be developed during this time. Supporting these new features without changing equipment is very important today. The set of tasks could also change dynamically, which leads to changes in data flows. These changes in turn lead to changes in the transport layer protocols used or in their profiles. The Transport Protocol Controller unit should be dynamically reconfigurable to meet these requirements. Today, dynamically reconfigurable components are usually developed with Field Programmable Gate Arrays (FPGAs). However, the power consumption, area, and timing characteristics (e.g. achievable clock frequency) of FPGA implementations are substantially worse than the same parameters of Application Specific Integrated Circuit (ASIC) implementations. These factors substantially constrain the application area of FPGA-based dynamically reconfigurable systems. In this paper, we consider existing approaches to the development of dynamically reconfigurable systems with ASICs and evaluate their applicability to the Transport Protocol Controller unit. We propose an approach to the development of a dynamically reconfigurable Transport Protocol Controller unit. This approach allows us to take into account the specific requirements for this unit. In the paper we present several examples of the proposed approach. We have evaluated the achievable parameters and overheads for these examples.
... Very often, sequential blocks are represented using models of finite state machines (FSMs) (Baranov, 2008; Micheli, 1994). For example, FSMs are used for implementing: (i) hardware-software interfaces of embedded systems (Gajski et al., 2009), (ii) complex functions such as hyper-tangent and exponentiation functions (Brown and Card, 2001; Li et al., 2014), (iii) activation functions in deep neural networks (Li et al., 2017; Xie et al., 2017), (iv) some blocks for integral stochastic computing (Ardakani et al., 2017), (v) different stages of cascaded digital processing systems (Rafla and Gauba, 2010; Glaser et al., 2011; Das and Priya, 2018). Also, they are used for the synthesis of control units of digital systems (Czerwiński and Kania, 2013; Sklyarov et al., ... of LUT-based FSMs, the chip area is proportional to the number of LUTs in the circuit. ...
... For the security cluster (green color), we found that AES has a strong relationship with security, encryption, cryptography and also with low power implementations [470,1134]. Similarly, we found that FPGA implementations have helped to add security to WSN nodes [1139][1140][1141] and low power processing capabilities [1470][1471][1472]. ...
Article
Full-text available
Field Programmable Gate Array (FPGA) is a general purpose programmable logic device that can be configured by a customer after manufacturing to perform anything from simple logic gate operations to complex systems on chip or even artificial intelligence systems. Scientific publications related to FPGAs started in 1992 and, up to now, we found more than 70,000 documents in the two leading scientific databases (Scopus and Clarivate Web of Science). These publications show the vast range of applications based on FPGAs, from the new mechanism that enables the magnetic suspension system for the kilogram redefinition, to the Mars rovers’ navigation systems. This paper reviews the top FPGA applications by a scientometric analysis in ScientoPy, covering publications related to FPGAs from 1992 to 2018. Here we found the top 150 applications that we divided into the following categories: digital control, communication interfaces, networking, computer security, cryptography techniques, machine learning, digital signal processing, image and video processing, big data, computer algorithms and other applications. Also, we present an evolution and trend analysis of the related applications.
... The applications mentioned earlier can be viewed as cascaded stages (i.e., multistage) of operations [2], where each stage requires a specific FSM. Hence, a Reconfigurable FSM is investigated in the literature for optimal performance in such applications [6,7]. A Reconfigurable FSM is defined as a single FSM that acts as one of the FSMs from the set (i.e., the set of FSMs for a specific application) when particular mode bits are applied. ...
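The definition in this context, one machine that behaves as a selected member of a set of FSMs depending on the mode bits, can be sketched behaviorally. This is only an illustration, not the architecture of the cited paper: the class `ReconfigurableFsm`, the two example Mealy machines, and the choice to reset the state on a mode change are all assumptions.

```python
class ReconfigurableFsm:
    """Behavioral model: the mode selects which transition table is active."""

    def __init__(self, tables, initial_states):
        self.tables = tables                  # one Mealy table per mode
        self.initial_states = initial_states  # one start state per mode
        self.mode = 0
        self.state = initial_states[0]

    def set_mode(self, mode):
        if not 0 <= mode < len(self.tables):
            raise ValueError(f"unknown mode {mode}")
        self.mode = mode
        # ASSUMPTION: switching the mode restarts the selected FSM.
        self.state = self.initial_states[mode]

    def step(self, x):
        """Consume one input symbol and return the Mealy output."""
        self.state, out = self.tables[self.mode][(self.state, x)]
        return out


# Mode 0: rising-edge detector. Mode 1: running parity of the input bits.
EDGE = {("lo", 0): ("lo", 0), ("lo", 1): ("hi", 1),
        ("hi", 0): ("lo", 0), ("hi", 1): ("hi", 0)}
PARITY = {("even", 0): ("even", 0), ("even", 1): ("odd", 1),
          ("odd", 0): ("odd", 1), ("odd", 1): ("even", 0)}


def run(fsm, mode, inputs):
    """Select a mode, then feed the inputs and collect the outputs."""
    fsm.set_mode(mode)
    return [fsm.step(x) for x in inputs]
```

In hardware the tables would share one state register and one block of transition logic, with the mode bits acting as extra select inputs; the dictionary lookup above only mimics that behavior.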
... The objective is thus confined to minimizing the cost function given in (7). Evidently, it is a discrete optimization problem, where each state can take only one particular binary state code. ...
Article
Full-text available
Recently, the Reconfigurable FSM has drawn the attention of the researchers for multistage signal processing applications. The optimal synthesis of Reconfigurable finite state machine with input multiplexing (Reconfigurable FSMIM) architecture is done by the iterative greedy heuristic based Hungarian algorithm (IGHA). The major problem concerning IGHA is the disintegration of a state encoding technique. This paper proposes the integration of IGHA with the state assignment using logarithmic barrier function based gradient descent approach to reduce the hardware consumption of Reconfigurable FSMIM. Experiments have been performed using MCNC FSM benchmarks which illustrate a significant area and speed improvement over other architectures during field programmable gate array (FPGA) implementation.
... The mathematical model for designing a controller for applications such as microprocessor control units, circuit testing, and digital signal processing (DSP) is a finite state machine (FSM). Consequently, designing such systems requires an efficient synthesis technique for high-speed FSM [1,2]. Applications such as DSP [3,4] and built-in self-test (BIST) [5] require specific operations to be performed only in the particular instances. ...
... Different control units are required to complete each operation. Hence, to optimally perform these operations, a single control unit is defined which can configure itself depending upon the applied mode of operation; it is also known as a reconfigurable FSM [1]. The mode of operation for such an FSM is controlled by a counter, timer, or any user-defined control signals based on the application requirements. ...
... The mode of operation for such an FSM is controlled by a counter, timer, or any user-defined control signals based on the application requirements. An example of a reconfigurable FSM is given in [1] as a test chip for a wireless sensor network. In this example, the Transition-Based Reconfigurable FSM (TR-FSM) [1] is configured into one of the MCNC FSM benchmark circuits (i.e., dk15, s386, or cse) at different instances. ...
Article
Full-text available
The mathematical model for designing a complex digital system is a finite state machine (FSM). Applications such as digital signal processing (DSP) and built-in self-test (BIST) require specific operations to be performed only at particular instances. Hence, the optimal synthesis of such systems requires a reconfigurable FSM. The objective of this paper is to create a framework for a reconfigurable FSM with input multiplexing and state-based input selection (Reconfigurable FSMIM-S) architecture. The Reconfigurable FSMIM-S architecture is constructed by combining the conventional FSMIM-S architecture and an optimized multiplexer bank (which defines the mode of operation). For this, the descriptions of a set of FSMs are taken for a particular application. The problem of obtaining the required optimized multiplexer bank is transformed into a weighted bipartite graph matching problem where the objective is to iteratively match the descriptions of the FSMs in the set with minimal cost. As a solution, an iterative greedy-heuristic-based Hungarian algorithm is proposed. The experimental results from MCNC FSM benchmarks demonstrate a significant speed improvement of 30.43% compared with the variation-based reconfigurable multiplexer bank (VRMUX) and of 9.14% compared with the combination-based reconfigurable multiplexer bank (CRMUX) during field programmable gate array (FPGA) implementation.
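The weighted bipartite matching problem named in this abstract can be shown on a toy instance. The sketch below is not the paper's iterative greedy heuristic and not the Hungarian algorithm itself: it swaps in an exhaustive search over all assignments, which reaches the same minimum cost on small inputs but takes factorial time, whereas the Hungarian algorithm solves the problem in polynomial time. The cost matrix `COST` is made up; in the paper's setting a cost would describe how expensive it is to pair elements of two FSM descriptions.

```python
from itertools import permutations


def min_cost_matching(cost):
    """Exhaustive minimum-cost perfect matching on a square cost matrix.

    Row i is matched to column perm[i]. Only suitable for small matrices.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(enumerate(best_perm)), best_cost


# Hypothetical 3x3 cost matrix: COST[i][j] is the cost of pairing row i with column j.
COST = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
PAIRS, TOTAL = min_cost_matching(COST)
print(PAIRS, TOTAL)
```

Note that the greedy choice of the cheapest single entry (row 1 with column 1, cost 0) is not part of the optimal matching here, which is why a global method is needed rather than picking the cheapest pair first.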
... Most existing overlays have focused on applying these tradeoffs on pipelined datapaths with minimal control requirements through the interconnection structure (e.g., [6], [7], [10], [17]). Previous work in control-specialized overlays primarily focused on memory-based FSM implementations [5] using general techniques for FSM synthesis [15] and decomposition techniques from reconfigurable FSM studies (e.g., [3], [8], [9]). The Multi-RAM overlay expands from the 3-RAM overlay [5] with memory decomposition to reduce memory requirements and comparatively requires 15% to 28% fewer lookup tables for individual FSMs, and 77% to 99% fewer lookup tables for more common overlay use cases. ...
Article
Productivity issues such as lengthy compilation and limited code reuse have restricted usage of field-programmable gate arrays (FPGAs), despite significant technical advantages. Recent work on overlays (virtual coarse-grained architectures implemented atop FPGAs) has aimed to address these concerns through abstraction, but has mostly focused on pipelined applications with minimal control requirements. Although research has introduced overlays for finite-state machines, those architectures suffer from limited scalability and flexibility, which we address with a new overlay architecture using memory decomposition on transitional logic. Although our overlay provides modest average improvements of 15% to 29% fewer lookup tables for individual finite-state machines, for the more common usage of an overlay supporting different finite-state machines, our overlay achieves a 77% to 99% reduction in lookup tables. In addition, our overlay reduces compilation time to tenths of a second to enable rapid iterative-development methodologies.
... The specification includes information on the employed blocks (also called functional units) (e.g., adders, FSMs, ...), which can be reconfigurable themselves (e.g., an adder can be reconfigured as a subtracter, reconfigurable FSM [7]). Additionally, it includes the number of instances of each block as well as details on the connections among them. ...
... In the manual design flow described in [5], these cell types are instantiated in a hierarchical and structured design. Three connection types (bit, byte and word) were used to connect the peripherals (ADC, serial bus masters, ...) with the control logic (using a TR-FSM [7]), the byte memory and the arithmetic unit. The interconnect is also implemented hierarchically using large multiplexers and switch boxes. ...
Conference Paper
Full-text available
Low power consumption or high execution speed is achieved by making an application specific design. However, today's systems also require flexibility in order to allow running similar or updated applications (e.g. due to changing standards). Finding a good trade-off between reconfigurability and performance is a challenge. This paper presents a tool that analyzes a given set of applications (as netlists) and generates a heterogeneous coarse-grain reconfigurable architecture that matches their requirements. Its main task is to optimize the interconnect by hierarchically grouping the functional units. Additional resources can be added to enable future applications. The tool generates the HDL source for a module with the instances of all blocks and the reconfigurable interconnect. The feasibility of the methodology is demonstrated by the design of a reconfigurable architecture for digital filters and simple logic networks.