Idempotent Slices with Applications to Code-Size Reduction
Given a value computed within a program, an idempotent backward slice with respect to this value is a maximal subprogram that computes it. An informal notion of an idempotent slice has previously been used by Guimarães et al. to transform eager into lazy evaluation in the LLVM intermediate representation. However, that algorithm cannot be correctly applied to general control-flow graphs. This paper addresses this limitation by formalizing the notion of idempotent backward slices and presenting a sound and efficient algorithm for extracting them from programs in Gated Static Single Assignment (GSA) form. As an example of their practical use, the paper describes how identifying and extracting idempotent backward slices enables a sparse code-size reduction optimization; that is, one capable of merging non-contiguous sequences of instructions within the control-flow graph of a single function or across functions. Experiments with the LLVM test suite show that, in specific benchmarks, this new algorithm achieves code-size reductions of up to 7.24% on programs already highly optimized by the -Os sequence of passes from clang 17.
Research Summary
The paper introduces a formal definition of an “idempotent backward slice” – a maximal subprogram that computes a given value and can be executed repeatedly without changing the program’s observable state. The authors point out that earlier work by Guimarães and Pereira used an informal notion of such slices to lazily evaluate function arguments, but their algorithm fails on programs that do not satisfy conventional SSA properties or that lack a hammock (single‑entry‑single‑exit) structure. To overcome these limitations, the authors adopt the Gated Static Single Assignment (GSA) form, which replaces φ‑functions with three gate instructions (µ, γ, η) that make both data and control dependencies explicit.
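To see why explicit gates matter for slicing, consider a minimal Python sketch. The `Instr` class and the `dependencies` helper are hypothetical names introduced here for illustration; they are not taken from the paper or from LLVM. A γ gate carries the branch predicate as an ordinary operand, so following operands alone recovers the control dependence that a φ‑function leaves implicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical three-address instruction: dest = opcode(operands)."""
    dest: str
    opcode: str
    operands: tuple


def dependencies(instr):
    """Names this instruction reads, as seen by a backward slicer."""
    return set(instr.operands)


# SSA: x = phi(a, b). Which value flows in depends on the branch taken,
# but the branch condition is not an operand of the phi.
phi_x = Instr("x", "phi", ("a", "b"))

# GSA: x = gamma(p, a, b). The predicate p is an explicit operand.
gamma_x = Instr("x", "gamma", ("p", "a", "b"))

phi_deps = dependencies(phi_x)
gamma_deps = dependencies(gamma_x)
print(sorted(phi_deps), sorted(gamma_deps))
```

A slicer that only walks operands would miss `p` in the φ version and would have to consult the control-flow graph separately; in the γ version the same walk finds it directly.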
In Section 2 they formalize idempotent backward slices with two constraints: (1) the set of basic blocks containing the slice must form a single‑entry region, and (2) the slice must be idempotent, meaning that for any fixed binding of its free variables the slice always yields the same result and does not modify memory or raise exceptions. They also define idempotent execution and illustrate it with LLVM IR examples, excluding stores, exception‑raising operations, and loads from mutable memory.
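The exclusion rules above can be sketched as a simple predicate over instructions. This is a rough illustration under stated assumptions: the opcode sets `WRITES_MEMORY` and `MAY_RAISE` and the `from_immutable_memory` flag are hypothetical and deliberately conservative, and they are not the classification used by the paper's LLVM implementation.

```python
from dataclasses import dataclass

# Assumed, illustrative opcode classes (not the paper's actual lists).
WRITES_MEMORY = {"store"}
MAY_RAISE = {"call", "invoke"}


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record used only for this sketch."""
    opcode: str
    from_immutable_memory: bool = False  # meaningful only for loads


def is_idempotent(instr):
    """True if re-executing instr cannot change observable state."""
    if instr.opcode in WRITES_MEMORY or instr.opcode in MAY_RAISE:
        return False
    if instr.opcode == "load":
        # A load is admitted only when its source is known to be immutable.
        return instr.from_immutable_memory
    return True


checks = {
    "add": is_idempotent(Instr("add")),
    "store": is_idempotent(Instr("store")),
    "mutable load": is_idempotent(Instr("load")),
    "immutable load": is_idempotent(Instr("load", from_immutable_memory=True)),
}
print(checks)
```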
Section 3 presents a linear‑time algorithm for extracting such slices. The program is first converted to GSA form using the path‑numbering technique of Tu and Padua, which assigns unique numbers to basic blocks and makes dominance relationships easy to compute. Then, starting from the variable that serves as the slice criterion, the algorithm traverses the sparse backward dependence graph, adding an instruction only if it satisfies the idempotence criteria (no side effects, no exceptions, loads only from immutable memory). The traversal enforces the single‑entry requirement by checking that all visited blocks share a unique entry block. Because the algorithm works directly on the GSA graph, it does not require the program to be in conventional SSA (CSSA) form or to have a hammock decomposition, and it runs in O(|E|) time, where |E| is the number of CFG edges.
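The backward traversal can be approximated with a worklist, as in the sketch below. It is not the paper's algorithm: the `defs` encoding, the `backward_slice` function, and the set of rejected opcodes are hypothetical; the single‑entry check is omitted; and treating a variable with a non‑idempotent (or missing) definition as a free input of the slice is an assumption of this sketch.

```python
from collections import deque

# Assumed set of opcodes the sketch refuses to include in a slice.
NON_IDEMPOTENT = {"store", "load", "call"}


def backward_slice(defs, criterion):
    """Collect the idempotent definitions reachable backward from criterion.

    defs maps a variable name to (opcode, operand names).
    Returns (variables defined inside the slice, free input variables).
    """
    in_slice, inputs = set(), set()
    seen = {criterion}
    worklist = deque([criterion])
    while worklist:
        var = worklist.popleft()
        instr = defs.get(var)
        if instr is None or instr[0] in NON_IDEMPOTENT:
            # Function arguments and rejected definitions stay outside.
            inputs.add(var)
            continue
        in_slice.add(var)
        for operand in instr[1]:
            if operand not in seen:
                seen.add(operand)
                worklist.append(operand)
    return in_slice, inputs


defs = {
    "t1": ("add", ("a", "b")),
    "t2": ("load", ("p",)),
    "t3": ("mul", ("t1", "t2")),
}
slice_vars, slice_inputs = backward_slice(defs, "t3")
print(sorted(slice_vars), sorted(slice_inputs))
```

Each variable enters the worklist at most once and each operand edge is inspected once, so the sketch runs in time linear in the size of the dependence graph it visits.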
The extracted slices are then used for a novel code‑size reduction optimization described in Section 4. The idea is to identify as many idempotent slices as possible, group together slices that are isomorphic (i.e., have the same structure and the same set of inputs), and outline each group into a single helper function. Unlike previous function‑merging techniques that rely on contiguous instruction sequences or ordered code, this approach can merge non‑contiguous or even cross‑function instruction groups, because the slices are defined by data‑flow rather than textual proximity.
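One way to picture the grouping step is to compute a canonical key per slice and bucket slices by key. The sketch below is only an illustration: `canonical_key` and `group_isomorphic` are hypothetical names, inputs are abstracted to positional parameters (an assumption that may differ from the paper's exact isomorphism criterion), commutative operands are not reordered, and shared subexpressions are expanded into a tree, which is acceptable only for small slices.

```python
from collections import defaultdict


def canonical_key(defs, root):
    """Structural key of a slice, independent of variable names.

    defs maps a variable name to (opcode, operand names); any name
    without a definition is treated as an input of the slice.
    """
    numbering = {}

    def visit(var):
        if var not in defs:
            # Inputs are numbered in order of first appearance.
            index = numbering.setdefault(var, len(numbering))
            return ("input", index)
        opcode, operands = defs[var]
        return (opcode,) + tuple(visit(o) for o in operands)

    return visit(root)


def group_isomorphic(slices):
    """Return groups (lists of slice names) that share a canonical key."""
    groups = defaultdict(list)
    for name, (defs, root) in slices.items():
        groups[canonical_key(defs, root)].append(name)
    return [sorted(g) for g in groups.values() if len(g) > 1]


slices = {
    "s1": ({"t1": ("add", ("a", "b")), "t2": ("mul", ("t1", "a"))}, "t2"),
    "s2": ({"u1": ("add", ("x", "y")), "u2": ("mul", ("u1", "x"))}, "u2"),
    "s3": ({"v1": ("add", ("x", "y")), "v2": ("mul", ("v1", "y"))}, "v2"),
}
merge_groups = group_isomorphic(slices)
print(merge_groups)
```

Here `s1` and `s2` compute `(in0 + in1) * in0` under different names and land in one group, so a single outlined helper could serve both call sites; `s3` multiplies by the second input instead and stays separate.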
The authors implemented the slicing pass and the subsequent merging pass in LLVM 17.0.6 and evaluated them on the full LLVM test suite (2007 programs). The code‑size reduction pass succeeded on 29 programs, achieving a geometric mean reduction of 7.24% in the .text section on top of the standard -Os optimization pipeline. The largest gain was on the AMGmk benchmark, where the pass reduced code size by 12.49% relative to clang -Os. For comparison, the earlier function‑merging technique by Rocha et al. achieved only a 5.59% reduction on the same benchmark, and the LLVM IROutliner achieved 0.94%. The authors emphasize that their optimization is complementary to existing techniques; it does not subsume them, and each method can contribute unique reductions.
In summary, the paper makes three main contributions: (1) a rigorous definition of idempotent backward slices, (2) a sound, linear‑time algorithm for extracting them from GSA programs without requiring special CFG structures, and (3) a practical application of these slices to a sparse code‑size reduction optimization that can merge non‑contiguous code regions. The work demonstrates that making control dependencies explicit via GSA enables more precise program analyses and opens the door for further optimizations such as automatic parallelization, redundancy elimination, and hot‑cold code splitting.