FIFO2 partitioning with asynchronous pointer comparison logic

Source publication

Simulation and Synthesis Techniques for Asynchronous FIFO Design with Asynchronous Pointer Comparisons

Article

Jan 2002

An interesting technique for doing FIFO design is to perform asynchronous comparisons between the FIFO write and read pointers that are generated in clock domains that are asynchronous to each other. The asynchronous FIFO pointer comparison technique uses fewer synchronization flip-flops to build the FIFO. The asynchronous FIFO comparison method re...
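
The abstract above is truncated, so the paper's actual circuit is not shown here. As a rough behavioral sketch of the general idea only: when the write and read pointers are compared directly and found equal, the FIFO is either empty or full, and a direction flag can tell the two cases apart. The sketch below assumes binary pointers, a quadrant-based direction flag, and no clocks or metastability; the class name `AsyncCompareFifoModel` and all of its members are hypothetical and are not taken from the source.

```python
class AsyncCompareFifoModel:
    """Behavioral model only (assumption): binary pointers, no clock domains."""

    def __init__(self, addr_bits=4):
        assert addr_bits >= 2  # two MSBs are needed to form four quadrants
        self.addr_bits = addr_bits
        self.depth = 1 << addr_bits
        self.mem = [None] * self.depth
        self.wptr = 0
        self.rptr = 0
        # False: pointers last approached from the "going empty" side.
        # True: pointers last approached from the "going full" side.
        self.direction = False

    def _quadrant(self, ptr):
        # The two most significant address bits select one of four quadrants.
        return ptr >> (self.addr_bits - 2)

    def _update_direction(self):
        wq, rq = self._quadrant(self.wptr), self._quadrant(self.rptr)
        if (wq + 1) % 4 == rq:
            # Write pointer is one quadrant behind the read pointer.
            self.direction = True
        elif (rq + 1) % 4 == wq:
            # Read pointer is one quadrant behind the write pointer.
            self.direction = False

    @property
    def empty(self):
        return self.wptr == self.rptr and not self.direction

    @property
    def full(self):
        return self.wptr == self.rptr and self.direction

    def write(self, value):
        """Store a value; return False if the FIFO is full."""
        if self.full:
            return False
        self.mem[self.wptr] = value
        self.wptr = (self.wptr + 1) % self.depth
        self._update_direction()
        return True

    def read(self):
        """Return the oldest value, or None if the FIFO is empty."""
        if self.empty:
            return None
        value = self.mem[self.rptr]
        self.rptr = (self.rptr + 1) % self.depth
        self._update_direction()
        return value
```

In this model, pointer equality alone never decides the status; the flag records which pointer was last seen closing in on the other. A hardware design would additionally have to deal with the two clock domains, which this sketch deliberately ignores.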

Context 1

... block diagram for FIFO style #2 is shown in Figure 5. To facilitate static timing analysis of the style #2 FIFO design, the design has been partitioned into the following five Verilog modules with the following functionality and clock domains: ...

Hardware Implementation of an Automatic Rendering Tone Mapping Algorithm for a Wide Dynamic Range Display

Article

Dec 2013

Tone mapping algorithms are used to adapt captured wide dynamic range (WDR) scenes to the limited dynamic range of available display devices. Although there are several tone mapping algorithms available, most of them require manual tuning of their rendering parameters. In addition, the high complexities of some of these algorithms make it difficult...

A Multiscroll Chaotic Attractors with Arrangement of Saddle-Shapes and Its Field Programmable Gate Array (FPGA) Implementation

Article

Jul 2020

Based on the step function and signum function, a chaotic system which can generate multiscroll chaotic attractors with arrangement of saddle-shapes is proposed and the stability of its equilibrium points is analyzed. The underlying mechanism for the generation of multiscroll chaotic attractors and the reason for the arrangement of saddle shapes and bei...

ChipGPT: How far are we from natural language hardware design

Preprint

May 2023

As large language models (LLMs) like ChatGPT have exhibited unprecedented machine intelligence, they also show great performance in assisting hardware engineers to realize higher-efficiency logic design via natural language interaction. To estimate the potential of the hardware design process assisted by LLMs, this work attempts to demonstrate an automat...

Synthesizable Verilog Code Generator for Variable-Width Tree Multipliers

Article

Jul 2021

Tree multipliers are fast multipliers which are important for timing-critical applications. However, due to the irregular multiplier structure, the process of coding a tree multiplier is often very time-consuming. In addition, it is difficult to generalize the multiplier codes for variable-width inputs. In this paper, the authors used Python script...

SoC Design Based on a FPGA for a Configurable Neural Network Trained by Means of an EKF

Article

Jul 2019

This work presents a configurable architecture for an artificial neural network implemented with a Field Programmable Gate Array (FPGA) in a System on Chip (SoC) environment. This architecture can reproduce the transfer function of different Multilayer Feedforward Neural Network (MFNN) configurations. The functionality of this configurable architec...

An Enhanced Reconfigurable Dual-Clock FIFO for Inter-IP Data Transmission

Article

Sep 2023
IEICE ELECTRON EXPR

In integrated designs of multiple IP cores across clock domains, signal metastability can occur due to unequal wiring and variations in PVT. This leads to inconsistency between the signals obtained by the target and the signals at the source. Establishing a FIFO is one of the crucial methods for addressing data inconsistency. Therefore, this paper proposes a novel array structure based on one-hot coding, where the row and column codes generated by Johnson counters are XORed to create the address pointer. This innovation reduces the area for the FIFO and enables rapid control logic using one-hot coding. Furthermore, a state-based approach is employed to mitigate the impact of memory size on the empty/full detection circuit. It only records the read-and-write addresses, enhancing the reconfigurability of the FIFO. Using the SMIC 0.18 µm process, the synthesis and simulation results demonstrate that the FIFO can achieve a maximum operating frequency of 830 MHz. Additionally, compared to a similar synchronous FIFO, it exhibits a significant 30% reduction in area. When considering different FIFO depths and widths, the method proposed in the paper shows an area reduction of 30% to 47% compared to similar synchronous methods. For a depth of 16 and a data width of one word, the power consumption is about 6.8 mW. The FIFO presented in this paper can serve as a reference for data transmission between different clock domains.

A flexible control system for atomic, molecular and optical physics experiments

Preprint

Jun 2021

We have implemented a control system for experiments in atomic, molecular and optical physics based on a commercial low-cost board, featuring a field-programmable gate array as part of a system-on-a-chip on which a Linux operating system is running. The board features Gigabit Ethernet, allowing for fast data transmission and operation of remote experimental systems. A single board can control a set of devices generating digital, analog and radio frequency signals with a precise timing given either by an external or internal clock. Contiguous output and input sampling rates of up to 40 MHz are achievable. Several boards can run synchronously with a timing error approaching 1 ns. For this purpose, a novel auto-synchronization scheme is demonstrated, with possible application in complex distributed experimental setups with demanding timing requests.

Latency Minimization using Mesochronous Scheduling in MPSoC Operation

Article

Feb 2020

High-speed computing is an upcoming challenge for next-generation applications. To cope with high-speed operations, new processing architectures are evolving, and multiprocessor design is one suitable approach. In VLSI design, this has led to the Multi-Processor System-on-Chip (MPSoC). MPSoCs are designed to process multiple instructions and handle data simultaneously. This parallel processing makes the unit faster and well suited to upcoming applications. However, MPSoC operation suffers from latency in clock allocation and resource utilization, which affects processing efficiency and introduces delay and resource overhead in the MPSoC interface. This paper outlines a mesochronous operation in MPSoC design that minimizes latency in clock allocation and resource allocation, thereby improving the speed of operation.

Asynchronous 1R-1W dual-port SRAM by using single-port SRAM in 28nm UTBB-FDSOI technology

Conference Paper

Sep 2017

With the advancement in technology nodes, the number of components operating in different clock domains in a System on Chip (SoC) increases. Asynchronous multi-port memory with dedicated write and read ports is used to allow data to cross clock domain boundaries. The dual-port memory architecture introduced in this paper is based on the Single-Port SRAM (SP-SRAM), which can be generated in larger capacities with better performance statistics compared to the Dual-Port SRAM (DP-SRAM). The proposed design has been evaluated by comparing existing dual-port 1R-1W and 2RW designs in 28nm Ultra Thin Body and Box Fully Depleted Silicon on Insulator (UTBB-FDSOI) technology. A memory with a capacity of 2048 words of 64 bits shows 15%, 35%, 28% and 4.5% improvement in read power, write power, read-write power consumption and performance, respectively, over a conventional 1R-1W DP-SRAM with equal area. Synthesis with area optimizations applied instead shows an area advantage of 50% over a conventional 1R-1W DP-SRAM, but with a degradation in performance.

GALS Technology to Improve Throughput of FIFO

Article

Jan 2017

A novel asynchronous first-in-first-out adapting to multi-synchronous network-on-chips

Article

Feb 2015

The integration of a variety of IP cores into a single chip to meet the high demand of new applications leads to many challenges in timing issues, especially at the interface between different clock domains. The Globally Asynchronous, Locally Synchronous (GALS) approach addresses these challenges by dividing a chip into several independent subsystems working with different clock signals. In a multi-synchronous Network-on-Chip (NoC) based on the GALS architecture, the network routers run at different frequencies, so the problem is how to transfer data safely and efficiently between them. To build a synchronization unit that tackles this problem, in this paper we propose a novel, efficient asynchronous First-In-First-Out architecture targeting multi-synchronous NoCs. A token ring structure, register-based memory, and modified Asynchronous Assertion-Synchronous De-assertion techniques are applied to improve the performance of the proposed asynchronous FIFO. After simulating and verifying the design, we have implemented our asynchronous FIFO architecture with CMOS 180nm technology from AMS. Implementation results are analyzed and compared with previous works to show the strong points of our design.

Design and verification of an efficient WISHBONE-based network interface for network on chip

Article

Aug 2014
COMPUT ELECTR ENG

In this paper, a generic asynchronous First In First Out (FIFO) based, WISHBONE-compatible, plug-and-play Network Interface (NI) for Network on Chip (NoC) is designed and verified. Four different types of encoded asynchronous FIFOs, namely binary, Gray, one-hot and Johnson, are designed and analyzed. It is found that the Gray-code asynchronous FIFO is the best at handling the asynchronous clock domain issues in the NI. The control signals of the WISHBONE bus wrappers from/to the asynchronous FIFOs and packing/unpacking modules are asserted concurrently at the same rising edge of the respective router and IP clocks to reduce the latency. The same NI has been utilized for transferring data between synchronous as well as asynchronous clock domains irrespective of clock frequency and phase differences. The proposed NI ensures seamless, high data throughput between the routers and IP cores with minimal latency, and achieves higher speed and smaller area than the existing design.
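
As background to the Gray-code finding above: consecutive Gray code values differ in exactly one bit, which is the property usually relied on when a pointer is sampled from another clock domain. The following minimal sketch shows the standard binary/Gray conversions; the function names `bin_to_gray` and `gray_to_bin` are illustrative and do not come from the cited paper.

```python
def bin_to_gray(value):
    """Convert a non-negative binary integer to its reflected Gray code."""
    return value ^ (value >> 1)


def gray_to_bin(gray):
    """Convert a reflected Gray code back to the binary integer."""
    value = 0
    while gray:
        # XOR together all right-shifted copies of the Gray code.
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a, b):
    """Count the bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
```

For example, a binary pointer stepping from 7 to 8 changes four bits at once, whereas `bin_to_gray(7)` is 4 and `bin_to_gray(8)` is 12, which differ in a single bit.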

Implementation of USB 3.0 SuperSpeed Physical Layer using Verilog HDL

Article

Jun 2014

The proposed design mainly comprises the USB 3.0 Physical Layer, with USB 2.0 functionality alongside SuperSpeed functionality. The Physical Layer mainly contains the PCI Express and PIPE interfaces. The design transfers data serially from transmitter to receiver at either 2.5 GT/s or 5.0 GT/s, depending on the mode and rate. It generates a clock that runs at two different frequencies, 125 MHz and 250 MHz, which is used to transfer data on the parallel interface. The design captures data arriving asynchronously and locks the receiver clock to the incoming asynchronous serial data. The architecture for the USB 3.0 Physical Layer is proposed in this paper, and the proposed model is implemented and verified using Verilog HDL.

High Speed Low Power Ping Pong Buffering Based Network Interface for Network on Chip

Article

Aug 2013
J Low Power Electron

Connecting different Intellectual Property (IP) cores with the Network on Chip (NoC) router using a Network Interface (NI) is a challenging task due to its asynchronous nature and data width. In this paper, a generic high-speed NI for NoC using Ping Pong Buffers is proposed, in order to ensure low power and seamless high throughput between the router and processing IP core. The proposed scheme uses simple control logic to handle the read and write operations simultaneously from/to the memory modules and, to achieve low power, disables the unused memory banks according to the data flow required. The proposed method is analyzed against existing Asynchronous First In First Out (FIFO) based NIs with different encoding schemes such as One-Hot encoding and Johnson encoding. The NI is implemented using the asynchronous FIFOs and the ping-pong (double buffering) scheme in an Altera Stratix III FPGA. The synthesis results show that the proposed architecture enhances the speed of the NI by 30% when the memory depth is 8 and by 11% when the memory depth is 256. For transactions from PE to router banks and from router to PE banks, respectively, the power reduction is 12% and 5% when the memory size is 8, and 30% and 17% when the memory size is 256.
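
The ping-pong (double buffering) idea mentioned above can be illustrated with a small software model: a producer fills one bank while a consumer drains the other, and the two banks swap roles once the read bank is empty. This is only a sketch under assumptions; the class `PingPongBuffer`, its methods, and the swap policy are hypothetical, and the bank-disabling power control described in the abstract is not modeled.

```python
class PingPongBuffer:
    """Two-bank buffer model (assumption): one bank is written, the other read."""

    def __init__(self, bank_size):
        self.bank_size = bank_size
        self.banks = [[], []]
        self.write_bank = 0  # index of the bank currently being filled

    def write(self, item):
        """Append to the write bank; return False if that bank is full."""
        bank = self.banks[self.write_bank]
        if len(bank) >= self.bank_size:
            return False
        bank.append(item)
        return True

    def read(self):
        """Pop the oldest item from the read bank, or None if it is empty."""
        bank = self.banks[1 - self.write_bank]
        return bank.pop(0) if bank else None

    def swap(self):
        """Exchange bank roles; allowed only once the read bank is drained."""
        if self.banks[1 - self.write_bank]:
            return False
        self.write_bank = 1 - self.write_bank
        return True
```

Because reads and writes always touch different banks, the two sides never contend for the same storage; the cost is that data becomes visible to the reader only after a swap.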

Design of a low power network interface for Network on chip

Conference Paper

May 2013

In this paper, a low-power, flexible Network Interface (NI) architecture for Network on Chip (NoC) is proposed. The flexible run-time configuration controller in the proposed NI plays a vital role in reducing power by enabling and disabling entire asynchronous First In First Outs (FIFOs) based on the traffic conditions between the router and the processing elements (PEs). The NI has been implemented in a Xilinx Virtex-5 XC5VLX110T FPGA. Experimental results show that the proposed low-power NI offers a power improvement of 37% when both FIFOs are inactive and 32% when only one FIFO is active.
