Figure 3 - uploaded by Alexander Fell
Source publication
In this thesis, a Network-on-Chip (NoC) router implementation called RECONNECT, realized in Bluespec System Verilog (BSV), is presented. It is highly configurable in terms of flit size, the number of provided Input Port/Output Port (IP/OP) pairs, and support for runtime configuration, to name a few. Depending on the number of available IP/OP p...
Similar publications
Photonic Networks-on-Chip are a new generation of Networks-on-Chip. They have been proposed as a novel communication infrastructure for chip multiprocessors and as a way to eliminate the limitations of Networks-on-Chip. Photonic Networks-on-Chip have important properties such as increasing communication bandwidth, lowering...
Citations
... This allows designers to freely consider the interconnect architecture best suited for the resulting inter-PE communication pattern (e.g. NoC [20], bus, etc.), then applying this result to the interconnect constraints set, yielding the final schedule and making the configuration derivable. Let P = {pe_1, pe_2, ... ...
Coarse-Grained Reconfigurable Architectures (CGRA) are proven to be advantageous over fine-grained architectures, massively parallel GPUs and generic CPUs in terms of energy and flexibility. However, the key challenge of programmability is preventing widespread adoption. To exploit the instruction-level parallelism inherent to such architectures, optimal scheduling and mapping of algorithmic kernels is essential. Transforming an input algorithm in the form of a Data Flow Graph (DFG) into a CGRA schedule and mapping configuration is very challenging, due to the necessity to consider architectural details such as memory bandwidth requirements, communication patterns, pipelining and heterogeneity in order to extract maximum performance.
In this paper, an algorithm is proposed that employs Force-Directed Scheduling concepts to solve such scheduling and resource minimization problems. Our heuristic extensions are flexible enough for generic heterogeneous CGRAs, allowing the execution time of an algorithm to be estimated for different configurations while maximizing the utilization of available hardware. In addition to our own experiments, we compare against CGRA configurations introduced by state-of-the-art mapping algorithms such as EPIMap; our schedule achieves optimal resource utilization for these configurations while reducing the overall DFG execution time by 39% on average.
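To make the Force-Directed Scheduling concept concrete, the following is a minimal sketch of the textbook technique (as introduced by Paulin and Knight for high-level synthesis), not of the heuristic extensions proposed in the paper. It assumes unit-latency operations, a latency bound no shorter than the critical path, and a DFG given as a dictionary listed in topological order. All names (`time_frames`, `distribution`, `force_directed_schedule`, the example DFG) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def time_frames(order, preds, latency, fixed):
    """Return {op: (asap, alap)} for unit-latency ops; fixed ops keep their step."""
    succs = defaultdict(list)
    for op, ps in preds.items():
        for p in ps:
            succs[p].append(op)
    asap, alap = {}, {}
    for op in order:  # forward pass: earliest possible step
        lo = max((asap[p] + 1 for p in preds.get(op, [])), default=0)
        asap[op] = fixed.get(op, lo)
    for op in reversed(order):  # backward pass: latest possible step
        hi = min((alap[s] - 1 for s in succs[op]), default=latency - 1)
        alap[op] = fixed.get(op, hi)
    return {op: (asap[op], alap[op]) for op in order}


def prob(frame, t):
    """Uniform probability that an op with this time frame executes in step t."""
    lo, hi = frame
    return 1.0 / (hi - lo + 1) if lo <= t <= hi else 0.0


def distribution(ops, frames, latency):
    """Distribution graph: expected number of ops of each type in each step."""
    dg = defaultdict(lambda: [0.0] * latency)
    for op, kind in ops.items():
        for t in range(latency):
            dg[kind][t] += prob(frames[op], t)
    return dg


def force_directed_schedule(ops, preds, latency):
    """Fix one op per iteration at the (op, step) pair with the lowest total force."""
    order = list(ops)  # assumed to be in topological order
    fixed = {}
    while len(fixed) < len(ops):
        frames = time_frames(order, preds, latency, fixed)
        dg = distribution(ops, frames, latency)
        best = None
        for op in order:
            if op in fixed:
                continue
            lo, hi = frames[op]
            for t in range(lo, hi + 1):
                # Tentatively fix op at t; shrinking frames of neighbours
                # accounts for predecessor and successor forces.
                trial = time_frames(order, preds, latency, {**fixed, op: t})
                force = sum(
                    dg[ops[o]][s] * (prob(trial[o], s) - prob(frames[o], s))
                    for o in order
                    for s in range(latency)
                )
                if best is None or force < best[0]:
                    best = (force, op, t)
        _, op, t = best
        fixed[op] = t
    return fixed


# Hypothetical example DFG: two independent multiply -> add chains.
example_ops = {"m1": "mul", "m2": "mul", "a1": "add", "a2": "add"}
example_preds = {"a1": ["m1"], "a2": ["m2"]}
schedule = force_directed_schedule(example_ops, example_preds, latency=3)
```

The total force of a tentative assignment is the distribution graph weighted by the change in execution probability of every operation, so a low force means the assignment moves work away from crowded steps. In the example, one extra step of slack lets the two chains be staggered so that they do not compete for the same functional unit in the same step.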