ThreadPoolComposer - An Open-Source FPGA Toolchain for Software Developers
This extended abstract presents ThreadPoolComposer, a high-level synthesis-based development framework and meta-toolchain that provides a uniform programming interface for FPGAs portable across multiple platforms.
Research Summary
ThreadPoolComposer is presented as an open-source, high-level synthesis (HLS)-based meta-toolchain that aims to bridge the gap between software developers and FPGA acceleration. The authors identify the traditional FPGA development workflow (HDL coding, timing closure, board-specific constraints) as a major obstacle for software engineers, even in the presence of commercial HLS tools. To address this, they introduce a "thread-pool" abstraction that maps software-style parallel tasks onto hardware accelerators, providing a uniform programming interface that works across heterogeneous FPGA platforms.
The architecture of ThreadPoolComposer consists of three main components: a front-end, a back-end, and a platform description file (PDF). The front-end parses user-written kernels in C/C++ or OpenCL, extracts data-flow graphs, and partitions the computation into independent work-items that correspond to thread-pool tasks. These work-items are translated into an intermediate representation (IR) that can be consumed by any supported HLS engine (e.g., Xilinx Vivado HLS, Intel Quartus HLS). The back-end invokes the selected HLS tool, performs automatic pipelining, scheduling, and DMA mapping, and finally generates a bitstream tailored to the target device. The PDF declaratively captures the target board's interconnect topology, memory map, clock domains, and I/O interfaces, allowing the same kernel to be re-compiled for different hardware simply by swapping the PDF.
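To make the kernel model concrete, a function handed to the front-end could look like the plain C routine below. This is a hypothetical illustration, not code from the ThreadPoolComposer distribution: the function name and signature are assumptions, chosen to show the kind of independent, loop-oriented work-item that HLS engines pipeline well.

```c
#include <stddef.h>

/* Hypothetical user kernel: one work-item scales and offsets a chunk of
 * an input array. Each invocation touches only its own data, so the
 * front-end could map separate invocations onto separate slots of the
 * hardware thread pool. The simple inner loop with regular memory
 * access is the pattern that automatic pipelining handles best. */
void scale_offset_kernel(const int *in, int *out, size_t n,
                         int scale, int offset)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] * scale + offset;
}
```

A kernel with this shape needs no hardware-specific annotations; re-targeting it to a different board is then a matter of supplying a different platform description file.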
At runtime, ThreadPoolComposer supplies a lightweight library that mimics POSIX thread APIs (e.g., tp_create, tp_join). Existing multi-threaded applications can therefore be ported to FPGA acceleration with minimal code changes: a thread creation call is redirected to the library, which schedules the work-item on the hardware thread pool, handles data movement via automatically configured DMA engines, and synchronizes completion. The library also abstracts streaming versus buffered data transfers, selecting the most efficient mode based on kernel characteristics.
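The porting story can be sketched as follows. Only the names tp_create and tp_join appear in the abstract; their exact signatures, the handle type, and the pthread-backed bodies below are assumptions made purely for illustration, standing in for the library's actual hardware dispatch:

```c
#include <pthread.h>

/* Hypothetical handle type. The real library would track a hardware
 * thread-pool slot and its DMA buffers, not a POSIX thread. */
typedef struct { pthread_t t; } tp_thread_t;

/* Stand-in for tp_create: here it merely spawns a POSIX thread, whereas
 * the real library would dispatch the work-item to an FPGA slot and set
 * up DMA for its arguments. The signature is an assumption, modeled on
 * pthread_create as the abstract suggests. */
static int tp_create(tp_thread_t *tp, void *(*fn)(void *), void *arg)
{
    return pthread_create(&tp->t, NULL, fn, arg);
}

/* Stand-in for tp_join: waits for the work-item to finish; the real
 * library would also copy result buffers back from device memory. */
static int tp_join(tp_thread_t *tp, void **ret)
{
    return pthread_join(tp->t, ret);
}

/* Example work-item: squares the integer pointed to by arg in place. */
static void *square_item(void *arg)
{
    int *v = (int *)arg;
    *v = (*v) * (*v);
    return NULL;
}
```

Under this model, an existing pthread_create/pthread_join call site changes only in the function names, which is what makes the claimed "minimal code changes" plausible for already-threaded applications.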
The authors evaluate the framework on two representative platforms: a Xilinx Zynq UltraScale+ MPSoC and an Intel Arria 10 FPGA. Benchmarks include image-filtering kernels and dense matrix multiplication. Across all tests, ThreadPoolComposer delivers an average speed-up of 2.3× on Arria 10 and 3.1× on Zynq compared with a pure-CPU implementation, while reducing memory-transfer overhead to less than 12% of total execution time. These gains are attributed to the automatic pipeline insertion, optimal DMA scheduling, and the ability to overlap computation with data movement.
The entire source code, documentation, and example projects are released under the GPL-3.0 license and hosted on GitHub. The modular design encourages community contributions: new HLS back-ends, additional board PDFs, or custom scheduling policies can be added as plug-ins without modifying the core. Current limitations include sub-optimal handling of kernels with complex control flow or irregular memory access patterns, which remain challenging for existing HLS tools, and the need for board-specific IP cores in some cases, which can increase initial setup effort.
In conclusion, ThreadPoolComposer offers a novel, software-centric workflow for FPGA acceleration. By abstracting hardware details behind a familiar thread-pool API and automating the generation of platform-specific bitstreams, it lowers the entry barrier for software developers and promotes portability across FPGA families. Future work outlined by the authors includes support for dynamic reconfiguration, multi-FPGA orchestration, and specialized scheduling strategies for machine-learning workloads, all of which aim to broaden the applicability and performance of the framework.