ABSTRACT: This paper presents a complete design of a reconfigurable architecture support system, called ACS (an addressless configuration support), which provides efficient access to non-contiguous reconfigurable locations in reconfigurable systems. ACS reduces the amount of partial reconfiguration information required by removing a large amount of addressing information and padding as found in Virtex-4 bitstreams. ACS improves significantly on the distTree architecture previously proposed by us. ACS introduces the selector block which connects the leaf nodes to a consecutive block of reconfiguration locations called a frame set. The system allows any number of leaf nodes customised to the size of the device, thereby providing much more flexibility. The hardware costs have also been reduced significantly over the distTree design. Together with the new marker loading mechanism, ACS is readily applicable to SRAM-based FPGAs. This new ACS system is benchmarked using eight real-world applications against a Virtex-4 device and the results show 6.83%-15.07% speedups when the reconfiguration granularity is set to a Virtex-4 frame.
ICECE Technology, 2008. FPT 2008. International Conference on; 01/2009
ABSTRACT: In line with Shannon's ideas, we define the entropy of FPGA reconfiguration to be the amount of information needed to configure a given circuit onto a given device. We propose using entropy as a gauge of the maximum configuration compression that can be achieved and determine the entropy of a set of 24 benchmark circuits for the Virtex device family. We demonstrate that simple off-the-shelf compression techniques such as Golomb encoding and hierarchical vector compression achieve compression results that are within 1-10% of the theoretical bound. We present an enhanced configuration memory system based on the hierarchical vector compression technique that accelerates reconfiguration in proportion to the amount of compression achieved. The proposed system demands little additional chip area and can be clocked at the same rate as the Virtex configuration clock.
Field Programmable Logic and Applications, 2006. FPL '06. International Conference on; 09/2006
ABSTRACT: This paper presents a configuration memory architecture that offers fast FPGA reconfiguration. The underlying principle behind the design is the use of fine-grained partial reconfiguration that allows significant configuration re-use while switching from one circuit to another. The proposed configuration memory works by reading on-chip configuration data into a buffer, modifying them based on the externally supplied data and writing them back to their original registers. A prototype implementation of the proposed design in a 90nm cell library indicates that the new memory adds less than 1% area to a commercially available FPGA implemented using the same library. The proposed design reduces the reconfiguration time for a wide set of benchmark circuits by 63%. However, power consumption during reconfiguration increases by a factor of 2.5 because the read-modify-write strategy results in more switching in the memory array.
Field Programmable Logic and Applications, 2005. International Conference on; 09/2005
ABSTRACT: Dynamic FPGA reconfiguration represents an overhead that can be critical to the performance of a realised circuit. To address this problem, this work presents a technique that is applicable at the time of loading the configuration data onto the device. The technique involves reusing the on-chip configuration fragments to implement the next configuration, thereby reducing the amount of data that must be externally transferred to the configuration memory. This work provides an analysis of the effect of circuit placement and configuration granularity on configuration reuse. The problem of finding placements of each circuit in a sequence of circuits so as to maximize configuration re-use is considered in detail. A greedy solution to this NP-complete problem was found to reduce configuration overheads by less than 5% for a benchmark set. The effect of configuration granularity on configuration reuse was also considered, and it was found that reducing the size of the unit of configuration allowed us to reduce the size of the benchmark configurations by 41%.
Field-Programmable Technology, 2004. Proceedings. 2004 IEEE International Conference on; 01/2005
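The effect of configuration granularity on reuse, as described in this abstract, can be sketched as follows. This is a minimal illustration and not the authors' method: it models a configuration as a flat byte string, treats a unit as reusable only when it is byte-for-byte identical at the same offset, and ignores the per-unit addressing overhead that a real device would incur. The function names are hypothetical.

```python
def changed_units(current, nxt, unit_size):
    """Offsets of the units (of unit_size bytes) that differ between the
    on-chip configuration and the next one."""
    if len(current) != len(nxt):
        raise ValueError("configurations must have the same length")
    if unit_size < 1:
        raise ValueError("unit_size must be positive")
    return [
        offset
        for offset in range(0, len(nxt), unit_size)
        if current[offset:offset + unit_size] != nxt[offset:offset + unit_size]
    ]


def bytes_to_load(current, nxt, unit_size):
    """Bytes that must be transferred externally when only the changed
    units are rewritten (addressing overhead not counted)."""
    return sum(
        min(unit_size, len(nxt) - offset)
        for offset in changed_units(current, nxt, unit_size)
    )


# Two 16-byte configurations that differ in only two bytes.
current = bytes(16)
nxt = bytearray(16)
nxt[3] = 1
nxt[12] = 1
nxt = bytes(nxt)
```

With these inputs, a unit of 8 bytes forces the whole configuration to be reloaded, while smaller units leave more of the on-chip data in place. In practice the saving is offset by the cost of addressing each unit individually, which this sketch leaves out.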
ABSTRACT: The Circal process algebra is being used to explore the behavioural specification of systems that are mapped to field programmable logic circuits. In this paper we report on the implementation and performance of an interpreter for system specifications given in the Circal language. In contrast to the typical design flow for field programmable technology in which designs are statically partitioned, synthesised, and mapped to pre-allocated resources, in this system the specified circuits are extracted from behavioural specifications that are partitioned, elaborated, mapped, and configured at run time as control passes through them. We report on the details of a design that targets the Celoxica RC1000 co-processor and assess preliminary performance results for this implementation. The results clearly demonstrate our method is a practical approach to overcome resource constraints, particularly in applications where these change at run time. The results also establish a benchmark against which to measure future improvements and alternative methods.