DeCO : A DSP block based FPGA accelerator overlay with low overhead interconnect
Jain, Abhishek Kumar, Li, Xiangwei, Singhai, Pranjul, Maskell, Douglas L. and Fahmy, Suhaib A. (2016) DeCO : A DSP block based FPGA accelerator overlay with low overhead interconnect. In: IEEE International Symposium on Field-Programmable Custom Computing Machines, Washington, DC, 1–3 May 2016. Published in: 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 1-8.
PDF: WRAP_fccm2016-jain-preprint.pdf - Accepted Version (1021Kb)
Official URL: http://dx.doi.org/10.1109/FCCM.2016.10
Abstract
Coarse-grained FPGA overlay architectures paired with general purpose processors offer a number of advantages for general purpose hardware acceleration because of software-like programmability, fast compilation, application portability, and improved design productivity. However, the area overheads of these overlays, and in particular architectures with island-style interconnect, negate many of these advantages, preventing their use in practical FPGA-based systems. Crucially, the interconnect flexibility provided by these overlay architectures is normally over-provisioned for accelerators based on feed-forward pipelined datapaths, which in many cases have the general shape of inverted cones. We propose DeCO, a cone shaped cluster of FUs utilizing a simple linear interconnect between them. This reduces the area overheads for implementing compute kernels extracted from compute-intensive applications represented as directed acyclic dataflow graphs, while still allowing high data throughput. We perform design space exploration by modeling programmability overhead as a function of overlay design parameters, and compare to the programmability overhead of island-style overlays. We observe 87% savings in LUT requirements using the proposed approach compared to DSP block based island-style overlays. Our experimental evaluation shows that the proposed overlay exhibits an achievable frequency of 395MHz, close to the DSP theoretical limit on the Xilinx Zynq. We also present an automated tool flow that provides a rapid and vendor-independent mapping of the high level compute kernel code to the proposed overlay.
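The "inverted cone" observation in the abstract can be illustrated with a small sketch. This is not the authors' tool flow; it is a minimal example, assuming a kernel is given as a dictionary that maps each operation to its operands, and the names `level_widths`, `is_inverted_cone`, and the example kernel are hypothetical. It assigns each operation to the earliest pipeline level its operands allow and counts the operations per level. A kernel whose counts never grow from one level to the next has the cone shape the abstract describes.

```python
from collections import defaultdict


def level_widths(dag):
    """Count operations at each as-soon-as-possible level of a dataflow graph.

    `dag` maps an operation name to the list of names it depends on.
    Names that are not keys of `dag` are treated as kernel inputs
    (level 0) and are not counted as operations.
    """
    levels = {}

    def level(node):
        if node not in dag:
            return 0  # kernel input, not an operation
        if node not in levels:
            levels[node] = 1 + max((level(p) for p in dag[node]), default=0)
        return levels[node]

    widths = defaultdict(int)
    for node in dag:
        widths[level(node)] += 1
    return [widths[lvl] for lvl in sorted(widths)]


def is_inverted_cone(widths):
    """True if the operation count never increases from level to level."""
    return all(a >= b for a, b in zip(widths, widths[1:]))


# Hypothetical kernel (an assumption for illustration): (a*b + c*d) * (e + f)
kernel = {
    "m1": ["a", "b"],
    "m2": ["c", "d"],
    "s1": ["e", "f"],
    "s2": ["m1", "m2"],
    "out": ["s2", "s1"],
}

widths = level_widths(kernel)
print(widths, is_inverted_cone(widths))
```

For this example the first level holds three operations and each later level holds one, so fewer functional units are needed at each successive stage.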
Item Type: | Conference Item (Paper)
---|---
Subjects: | Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software; T Technology > TK Electrical engineering. Electronics. Nuclear engineering
Divisions: | Faculty of Science, Engineering and Medicine > Engineering > Engineering
Library of Congress Subject Headings (LCSH): | Field programmable gate arrays, Computer architecture, Signal processing--Digital techniques
Journal or Publication Title: | 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
Publisher: | IEEE
Official Date: | 18 August 2016
Page Range: | pp. 1-8
Status: | Peer Reviewed
Publication Status: | Published
Date of first compliant deposit: | 22 April 2016
Date of first compliant Open Access: | 24 August 2016
Conference Paper Type: | Paper
Title of Event: | IEEE International Symposium on Field-Programmable Custom Computing Machines
Type of Event: | Conference
Location of Event: | Washington, DC
Date(s) of Event: | 1–3 May 2016