SL: a "quick and dirty" but working intermediate language for SVP systems
The CSA group at the University of Amsterdam has developed SVP, a framework to manage and program many-core and hardware multithreaded processors. In this article, we introduce the intermediate language SL, a common vehicle to program SVP platforms. SL is designed as an extension to the standard C language (ISO C99/C11). It includes primitive constructs to bulk create threads, bulk synchronize on termination of threads, and communicate using word-sized dataflow channels between threads. It is intended for use as target language for higher-level parallelizing compilers. SL is a research vehicle; as of this writing, it is the only interface language to program a main SVP platform, the new Microgrid chip architecture. This article provides an overview of the language, to complement a detailed specification available separately.
💡 Research Summary
The paper presents SL, an intermediate language designed to program SVP (Self‑adaptive Virtual Processor) platforms, most notably the Microgrid many‑core architecture developed by the CSA group at the University of Amsterdam. SVP is a programming model that abstracts a class of hardware‑multithreaded processors by providing three core operations: bulk creation of lightweight threads, bulk synchronization on termination, and word‑sized data‑flow channels for inter‑thread communication. The authors argue that a language sitting between high‑level parallelizing compilers and the low‑level hardware is essential to expose these operations without burdening the programmer with architecture‑specific details.
SL is deliberately built as an extension of ISO C99/C11, preserving the full C syntax, type system, and pre‑processor facilities. The language adds only a handful of constructs, each mapping directly onto an SVP primitive:
- Bulk‑spawn – expressed with a `sl_spawn(N) { … }` block, which instructs the runtime to instantiate N identical threads that execute the enclosed code. Internally the compiler expands this into a loop that calls a low‑level `sl_thread_create` routine, and each thread receives a unique index that can be used for data partitioning.
- Bulk‑sync – written as `sl_sync(N);` (or as an implicit barrier at the end of a spawn block). This operation blocks the calling context until all N previously spawned threads have completed. The implementation leverages hardware counters present in the Microgrid core, guaranteeing sub‑microsecond latency for the barrier.
- Channels – declared with `sl_channel<T> ch;` where `T` is a word‑size type (typically `uint32_t` or `uint64_t`). The channel provides `sl_write(ch, value);` for producers and `sl_read(ch, &value);` for consumers. Reads are blocking until a value is available, which makes the channel a natural way to express data dependencies and enables the compiler to construct static data‑flow graphs.
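To make the semantics of the three constructs concrete, here is a minimal sketch that emulates them in portable C with POSIX threads. It is not SL code and does not reflect how the Microgrid implements anything: the names `word_chan`, `chan_write`, `chan_read`, and `chain_sum` are hypothetical, and a mutex with a condition variable stands in for the hardware dataflow channel. Each thread in the group blocks on a word from its predecessor, adds its own index, and forwards the result.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical one-slot channel carrying a single machine word. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    uint64_t        value;
    int             full;
} word_chan;

static void chan_init(word_chan *c) {
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->ready, NULL);
    c->value = 0;
    c->full = 0;
}

static void chan_destroy(word_chan *c) {
    pthread_cond_destroy(&c->ready);
    pthread_mutex_destroy(&c->lock);
}

/* Producer side: publish the word and wake a blocked reader. */
static void chan_write(word_chan *c, uint64_t v) {
    pthread_mutex_lock(&c->lock);
    c->value = v;
    c->full = 1;
    pthread_cond_signal(&c->ready);
    pthread_mutex_unlock(&c->lock);
}

/* Consumer side: block until a word has been written. */
static uint64_t chan_read(word_chan *c) {
    pthread_mutex_lock(&c->lock);
    while (!c->full)
        pthread_cond_wait(&c->ready, &c->lock);
    uint64_t v = c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}

typedef struct {
    size_t     index;  /* unique index within the thread group */
    word_chan *in;     /* word arriving from the previous thread */
    word_chan *out;    /* word forwarded to the next thread */
} thread_ctx;

static void *body(void *p) {
    thread_ctx *t = p;
    uint64_t acc = chan_read(t->in);      /* blocks on the dependency */
    chan_write(t->out, acc + t->index);
    return NULL;
}

/* Returns seed + 0 + 1 + ... + (n - 1), computed along a chain of n threads. */
uint64_t chain_sum(size_t n, uint64_t seed) {
    assert(n > 0);
    pthread_t  *tid = malloc(n * sizeof *tid);
    thread_ctx *ctx = malloc(n * sizeof *ctx);
    word_chan  *ch  = malloc((n + 1) * sizeof *ch);
    assert(tid && ctx && ch);

    for (size_t i = 0; i <= n; i++)
        chan_init(&ch[i]);

    /* "Bulk create": n threads running the same body, each with its own index. */
    for (size_t i = 0; i < n; i++) {
        ctx[i] = (thread_ctx){ i, &ch[i], &ch[i + 1] };
        int rc = pthread_create(&tid[i], NULL, body, &ctx[i]);
        assert(rc == 0);
    }

    chan_write(&ch[0], seed);             /* feed the first thread */

    /* "Bulk sync": wait for every thread in the group to terminate. */
    for (size_t i = 0; i < n; i++)
        pthread_join(tid[i], NULL);

    uint64_t result = chan_read(&ch[n]);  /* already written, returns at once */

    for (size_t i = 0; i <= n; i++)
        chan_destroy(&ch[i]);
    free(ch);
    free(ctx);
    free(tid);
    return result;
}
```

Calling `chain_sum(4, 10)` runs four threads whose indices 0 through 3 are added to the seed in order. The blocking read is what serializes the chain, so no additional locking of shared data is needed.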
Because SL does not introduce a new memory model, it encourages programmers to avoid shared mutable memory and to rely on channels for communication. This design choice simplifies reasoning about memory consistency on hardware that may not provide a coherent cache hierarchy. The language also supports standard C features such as structs, pointers, and arithmetic, allowing existing codebases to be incrementally ported.
The authors describe the complete tool chain: a front‑end parses C+SL source, performs type checking, and translates the new constructs into an intermediate representation (IR) that captures thread groups and channel connections. The IR is then fed to a backend that emits Microgrid assembly, which the Microgrid SDK assembles into a binary executable for the target chip. The paper includes a small benchmark that spawns 1024 threads to compute a parallel reduction, synchronizes them, and passes partial results through channels. Measured performance shows near‑linear speed‑up up to the hardware limit, and the barrier overhead is reported to be less than 0.2 % of total execution time.
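The shape of the reduction benchmark (spawn workers, give each a slice chosen by its index, synchronize, combine partial results) can be sketched in plain C with POSIX threads. This is an illustration under stated assumptions, not the authors' benchmark: `parallel_sum`, `slice`, and `MAX_WORKERS` are hypothetical names, the worker cap is far below the 1024 threads mentioned above, and one single‑writer result slot per worker stands in for a channel.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WORKERS 64  /* illustrative cap for this emulation */

typedef struct {
    const uint32_t *data;
    size_t          begin, end;  /* half-open slice [begin, end) */
    uint64_t        partial;     /* written only by the owning worker */
} slice;

/* Worker body: sum one slice and publish the partial result. */
static void *sum_slice(void *p) {
    slice *s = p;
    uint64_t acc = 0;
    for (size_t i = s->begin; i < s->end; i++)
        acc += s->data[i];
    s->partial = acc;
    return NULL;
}

/* Sums data[0..len) using up to MAX_WORKERS threads. */
uint64_t parallel_sum(const uint32_t *data, size_t len, size_t workers) {
    assert(workers > 0 && workers <= MAX_WORKERS);
    pthread_t tid[MAX_WORKERS];
    slice     job[MAX_WORKERS];
    int       live[MAX_WORKERS];

    /* Spawn phase: the worker index w selects the slice boundaries. */
    for (size_t w = 0; w < workers; w++) {
        job[w] = (slice){ data, len * w / workers, len * (w + 1) / workers, 0 };
        live[w] = pthread_create(&tid[w], NULL, sum_slice, &job[w]) == 0;
        if (!live[w])
            sum_slice(&job[w]);  /* thread creation failed: run the slice inline */
    }

    /* Sync phase: wait for each worker, then fold in its partial sum. */
    uint64_t total = 0;
    for (size_t w = 0; w < workers; w++) {
        if (live[w])
            pthread_join(tid[w], NULL);
        total += job[w].partial;
    }
    return total;
}
```

Reading `job[w].partial` only after `pthread_join` is what makes the unsynchronized slot safe here. On hardware without coherent caches, that ordering guarantee is the role the summary assigns to channels and the bulk‑sync operation.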
Beyond the current implementation, the paper outlines future research directions. The most prominent is the integration of SL as the target language for automatic parallelizing compilers. High‑level languages (e.g., Python, Java, or domain‑specific languages) could be compiled to SL, where static analysis would insert optimal sl_spawn, sl_sync, and channel declarations based on data‑dependence graphs. Another avenue is extending SL to support hierarchical thread groups and multi‑level channels, which would map more naturally onto future SVP architectures with deeper memory hierarchies.
In summary, SL is positioned as a “quick and dirty” yet functional research vehicle that bridges the gap between conventional C programming and the specialized requirements of SVP hardware. Its minimalist extensions, tight coupling with the Microgrid runtime, and compatibility with existing C tooling make it a practical choice for exploring many‑core parallelism, while also providing a solid foundation for future compiler‑driven automation and broader SVP platform support.