Impact of Cluster Size on Efficient LUT-FPGA Architecture for Best Area and Delay Trade-Off.
ABSTRACT The delay of a circuit implemented in a lookup table (LUT) based field-programmable gate array (FPGA) is a combination of
routing delays and logic block delays. However, most of an FPGA’s area is devoted to programmable routing. When logic blocks
are replaced with logic clusters, the fraction of delay due to the cluster has a significant impact on total delay. This paper
investigates the impact of logic cluster size once the most favorable LUT size has been chosen. As a result, a fast and
area-efficient FPGA architecture can be proposed that combines the logic blocks into logic clusters. In a LUT-based FPGA
architecture, area and delay are the main factors to be addressed, and the best value for each parameter depends on complex
trade-offs. If an FPGA is built with smaller LUTs to minimize area, the result is poor speed. On the other hand, if an FPGA
includes larger LUTs, speed may increase but area is wasted. In this experimental work, 20 benchmark circuits
were tested to calculate the delay and area metrics. The results show that increasing the logic cluster size has no further
effect on delay or area once a suitable LUT size is established.
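The abstract does not say how area and delay are combined into a single figure of merit. As an illustration only, the sketch below assumes an area-delay product taken over geometric means across the benchmark circuits, which is one common way to compare candidate architectures. The function names and the `(lut_size, cluster_size)` keys are hypothetical, and any numbers passed in are placeholders rather than results from the paper.

```python
from math import prod


def geometric_mean(values):
    """Geometric mean of positive per-benchmark measurements."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one positive value")
    return prod(values) ** (1.0 / len(values))


def area_delay_product(areas, delays):
    """Assumed figure of merit: geomean(area) * geomean(delay). Lower is better."""
    return geometric_mean(areas) * geometric_mean(delays)


def best_architecture(results):
    """Pick the architecture with the lowest area-delay product.

    `results` maps a hypothetical (lut_size, cluster_size) key to a pair
    (areas, delays), one entry per benchmark circuit.
    """
    return min(results, key=lambda key: area_delay_product(*results[key]))
```

With measurements for each candidate, `best_architecture` returns the `(lut_size, cluster_size)` pair that minimizes the assumed metric. A different weighting of area against delay would need a different cost function.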
Article: Field Programmable Gate-Arrays. IEEE Design and Test of Computers 02/1998; · 1.62 Impact Factor
Conference Proceeding: An SoC design methodology using FPGAs and embedded microprocessors.
ABSTRACT: In System on Chip (SoC) design, growing design complexity has forced designers to start designs at higher abstraction levels. This paper proposes an SoC design methodology that makes full use of FPGA capabilities. Design modules at different abstraction levels are all combined and run together in an FPGA prototyping system that fully emulates the target SoC. The higher abstraction level design modules run on microprocessors embedded in the FPGAs, while lower-level synthesizable RTL design modules are directly mapped onto FPGA reconfigurable cells. We made a hardware wrapper that lets the embedded microprocessors interface with the fully synthesized modules through IBM CoreConnect buses. Using this methodology, we developed an image processor SoC with cryptographic functions, and we verified the design by running real firmware and application programs. For designs that are too large to fit into an FPGA, a dynamic reconfiguration method is used.
Proceedings of the 41st Design Automation Conference, DAC 2004, San Diego, CA, USA, June 7-11, 2004; 01/2004
ABSTRACT: Programmable devices containing lookup tables (LUTs) and programmable logic arrays (PLAs) provide a heterogeneous target platform for user designs. Present commercial tools, which target these hybrid devices, require hand partitioning of user designs to isolate logic for each type of logic resource. In this paper, an automated technology mapping tool, hybridmap, is presented that identifies design logic partitions as suitable for either LUT or PLA implementation. A breadth-first search-based subgraph extraction and evaluation heuristic is integrated with product term (Pterm) count, area, and delay estimators to guide the technology mapping process. Hybridmap can be adapted to target a variety of PLA architectures and can accommodate user-provided timing constraints. It is shown that when timing constrained, hybridmap reduces LUT consumption for Apex20KE devices by 8% and when unconstrained by 14% by migrating logic from LUTs to Pterm structures. Hybridmap is shown to outperform previous mapping approaches for Apex20KE-type devices by up to 22%.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 06/2003; · 1.09 Impact Factor