
James Coole
- Doctor of Philosophy
- University of Florida
- Publications: 11
- Reads: 1,530
- Citations: 282
Publications (11)
Recent FPGA research has increasingly focused on overlays (virtual coarse-grained architectures) to address widely known application-design productivity problems such as lengthy compilation times and a lack of portability. However, existing overlay research has not yet been adopted due to several key limitations: 1) a general focus on datapath-ce...
With FPGAs emerging as a promising accelerator for general-purpose computing, there is a strong demand to make them accessible to software developers. Recent advances in OpenCL compilers for FPGAs pave the way for synthesizing FPGA hardware from OpenCL kernel code. To enable broader adoption of this paradigm, significant challenges remain. This pap...
Previous work has shown that virtual architectures, or overlays, can greatly reduce lengthy FPGA compile times by providing application-specialized resources along with a flexible interconnect to support application changes. However, retaining full configurability of interconnect has also required significant area overhead. In this paper, we introd...
High-level synthesis from OpenCL has shown significant potential, but current approaches conflict with mainstream OpenCL design methodologies owing to orders-of-magnitude longer field-programmable gate array compilation times and limited support for changing or adding kernels after system compilation. In this article, the authors introduce a back-e...
Numerous studies have shown the advantages of hardware and software co-design using FPGAs. However, increasingly lengthy place-and-route times represent a barrier to the broader adoption of this technology by significantly reducing designer productivity and turns-per-day, especially compared to more traditional design environments offered by compet...
Field-programmable gate arrays (FPGAs) suffer from lower application design productivity than other devices, which is largely due to compilation taking hours or even days. Making FPGA compilation comparable to software compilation is critical for continued FPGA usage due to competitive technologies, such as graphics-processing units, that use lan...
A study that involved extending and combining existing FPGA tools to create a tool flow that addresses these bottlenecks is presented. The key contributions of the tool flow include formulation techniques for rapid design-space exploration, a coordination framework for communication and synchronization between tasks in different languages and devic...
Although hardware/software partitioning of embedded applications onto FPGAs is widely known to have performance and power advantages, FPGA usage has been typically limited to hardware experts, due largely to several problems: 1) difficulty of integrating hardware design tools into well-established software tool flows, 2) increasingly lengthy FPGA d...
Field-programmable gate arrays (FPGAs) and other reconfigurable computing (RC) devices have been widely shown to have numerous advantages, including order-of-magnitude performance and power improvements compared to microprocessors for some applications. Unfortunately, FPGA usage has largely been limited to applications exhibiting sequential memory a...
Numerous studies have shown that field-programmable gate arrays (FPGAs) often achieve large speedups compared to microprocessors. However, one significant limitation of FPGAs that has prevented their use on important applications is the requirement for regular memory access patterns. Traversal caches were previously introduced to improve the perfor...
Field-programmable gate arrays (FPGAs) often achieve order-of-magnitude speedups compared to microprocessors, but typically have been unable to improve the performance of applications with irregular memory access patterns, such as traversals of pointer-based data structures. Due to the common use of these data structures, the applicability and wid...