Abstract
We can exploit the standardization of communication abstractions provided by modern high-level synthesis tools like Vivado HLS, Bluespec and SCORE to provide stable system interfaces between the host and PCIe-based FPGA accelerator platforms. At a high level, our FPGA driver attempts to provide CUDA-like driver behavior, and more, to FPGA programmers. On the FPGA fabric, we develop an AXI-compliant, lightweight interface switch coupled to multiple physical interfaces (PCIe, Ethernet, DRAM) to provide programmable, portable routing capability between the host and user logic on the FPGA. On the host, we adapt the RIFFA 1.0 driver to provide enhanced communication APIs along with bitstream configuration capability allowing low-latency, high-throughput communication and safe, reliable programming of user logic on the FPGA. Our driver only consumes 21% BRAMs and 14% logic overhead on a Xilinx ML605 platform or 9% BRAMs and 8% logic overhead on a Xilinx V707 board. We are able to sustain DMA transfer throughput (to DRAM) of 1.47GB/s (74% peak) of the PCIe (x4 Gen2) bandwidth, 120.2MB/s (96%) of the Ethernet (1G) bandwidth and 5.93GB/s (92.5%) of DRAM bandwidth.
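The efficiency percentages quoted in the abstract can be sanity-checked against nominal link rates. The sketch below is not from the paper: the peak values are assumptions (PCIe Gen2 at 5 GT/s per lane with 8b/10b encoding, 1G Ethernet at 1000 Mbit/s), and the helper names are hypothetical. For DRAM it only derives the peak implied by the quoted 92.5% figure.

```python
# Back-of-the-envelope check of the throughput figures quoted in the abstract.
# All peak rates are nominal assumptions, not values taken from the paper.

def pcie_gen2_peak_gbs(lanes: int) -> float:
    """Nominal PCIe Gen2 data rate in GB/s (assumes 5 GT/s per lane, 8b/10b encoding)."""
    gbit_per_s = lanes * 5.0 * (8 / 10)
    return gbit_per_s / 8  # bits -> bytes


def efficiency_percent(measured: float, peak: float) -> float:
    """Measured throughput as a percentage of peak (same units for both)."""
    return 100.0 * measured / peak


# PCIe x4 Gen2: measured 1.47 GB/s, quoted as 74% of peak.
pcie_peak_gbs = pcie_gen2_peak_gbs(4)
pcie_eff = efficiency_percent(1.47, pcie_peak_gbs)

# 1G Ethernet: measured 120.2 MB/s, quoted as 96% of peak.
eth_peak_mbs = 1000 / 8  # assumed 1000 Mbit/s line rate, in MB/s
eth_eff = efficiency_percent(120.2, eth_peak_mbs)

# DRAM: measured 5.93 GB/s, quoted as 92.5% of peak; derive the implied peak.
dram_implied_peak_gbs = 5.93 / 0.925

print(f"PCIe x4 Gen2 peak: {pcie_peak_gbs:.2f} GB/s, efficiency: {pcie_eff:.1f}%")
print(f"1G Ethernet peak: {eth_peak_mbs:.1f} MB/s, efficiency: {eth_eff:.1f}%")
print(f"Implied DRAM peak: {dram_implied_peak_gbs:.2f} GB/s")
```

Under these assumptions the quoted percentages are consistent with the measured throughputs to within rounding.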
Original language | English |
---|---|
Title of host publication | FPT 2013 - Proceedings of the 2013 International Conference on Field Programmable Technology |
Pages | 128-135 |
Number of pages | 8 |
DOIs | 10.1109/FPT.2013.6718342 |
Publication status | Published - 2013 |
Externally published | Yes |
Event | 2013 12th International Conference on Field-Programmable Technology, FPT 2013 - Kyoto, Japan. Duration: Dec 9 2013 → Dec 11 2013 |
Conference
Conference | 2013 12th International Conference on Field-Programmable Technology, FPT 2013 |
---|---|
Country | Japan |
City | Kyoto |
Period | 12/9/13 → 12/11/13 |
ASJC Scopus subject areas
- Software
Cite this
System-level FPGA device driver with high-level synthesis support. / Vipin, Kizheppatt; Shreejith, Shanker; Gunasekera, Dulitha; Fahmy, Suhaib A.; Kapre, Nachiket.
FPT 2013 - Proceedings of the 2013 International Conference on Field Programmable Technology. 2013. p. 128-135 6718342.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - System-level FPGA device driver with high-level synthesis support
AU - Vipin, Kizheppatt
AU - Shreejith, Shanker
AU - Gunasekera, Dulitha
AU - Fahmy, Suhaib A.
AU - Kapre, Nachiket
PY - 2013
Y1 - 2013
UR - http://www.scopus.com/inward/record.url?scp=84894164347&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84894164347&partnerID=8YFLogxK
U2 - 10.1109/FPT.2013.6718342
DO - 10.1109/FPT.2013.6718342
M3 - Conference contribution
AN - SCOPUS:84894164347
SN - 9781479921990
SP - 128
EP - 135
BT - FPT 2013 - Proceedings of the 2013 International Conference on Field Programmable Technology
ER -