AIRE-Prune: Asymptotic Impulse-Response Energy for State Pruning in State Space Models

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

State space models (SSMs) often sacrifice capacity, search space, or stability to offset the memory and compute costs of large state dimensions. We introduce AIRE-Prune (Asymptotic Impulse-Response Energy for State PRUN(E)), a structured post-training pruning method for SSMs that reduces each layer’s state dimension by directly minimizing long-run output-energy distortion. AIRE-Prune assigns every state a closed-form score based on its asymptotic impulse-response energy, i.e., the total impulse-response energy it contributes over an infinite time horizon, and normalizes these scores layer-wise to enable global cross-layer comparison and selection. This extends modal truncation from single systems to deep stacks and aligns pruning with asymptotic response energy rather than worst-case gain. Across diverse sequence benchmarks, AIRE-Prune reveals substantial redundancy in SISO and MIMO SSMs, reaching an average pruning ratio of 60.8% with an average accuracy drop of 0.29% without retraining, while significantly lowering compute. Code: https://github.com/falcon-arrow/AIRE-Prune.


💡 Research Summary

State‑space models (SSMs) have become a powerful backbone for modern sequence processing, offering compact representations of long‑range dependencies through linear dynamical cores wrapped with nonlinearities. Nevertheless, the state dimension n remains the dominant factor governing memory consumption and compute cost. Existing post‑training compression techniques either prune individual weight matrices (unstructured) or remove whole channels, but they rarely address the redundancy of the latent state space itself. The most recent structured approach, LAST (Gwak et al., 2025), scores each state by its worst‑case H∞ gain and normalizes scores layer‑wise to enable global pruning. While LAST demonstrates that SSMs are highly compressible, its reliance on a worst‑case metric can be overly conservative because typical inputs (impulses or white noise) excite the system across all frequencies rather than probing a single resonant mode.

AIRE‑Prune (Asymptotic Impulse‑Response Energy for State PRUN(E)) proposes a fundamentally different importance criterion: the total output energy generated by a unit impulse over an infinite horizon. For a discrete‑time diagonal SSM layer Σ = (Λ, B, C) with poles λ_i satisfying |λ_i| < 1, the impulse response at time k is H_k = C Λ^k B. The infinite‑horizon energy is ∑_{k=0}^{∞}‖H_k‖_F², which by Parseval’s theorem equals the squared H₂ norm of the transfer function. Because Λ is diagonal, this sum decomposes into independent contributions from each mode i.
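The summary breaks off before stating the per‑mode expression, so the following is only a minimal sketch under a stated assumption: mode i on its own produces the rank‑one response c_i λ_i^k b_iᵀ, where b_i is row i of B and c_i is column i of C. Its energy is then a geometric series that sums to ‖c_i‖²‖b_i‖² / (1 − |λ_i|²). The function names and the toy values are hypothetical, and the paper's actual score and its layer‑wise normalization may differ.

```python
# Sketch only: energy of a single mode of a diagonal SSM, taken in isolation.
# Assumption (not confirmed by the source): mode i responds as c_i * lam_i**k * b_i^T.
# All names and numbers below are hypothetical illustrations.

def mode_energy(lam, b_row, c_col):
    """Closed form: ||c_i||^2 * ||b_i||^2 / (1 - |lam|^2), valid for |lam| < 1."""
    if abs(lam) >= 1.0:
        raise ValueError("mode must be stable: |lambda| < 1")
    b_sq = sum(abs(x) ** 2 for x in b_row)
    c_sq = sum(abs(x) ** 2 for x in c_col)
    return c_sq * b_sq / (1.0 - abs(lam) ** 2)


def truncated_mode_energy(lam, b_row, c_col, steps=2000):
    """Numerical check: sum over k < steps of ||c_i * lam**k * b_i^T||_F^2."""
    total = 0.0
    scale = 1.0  # holds lam**k, updated multiplicatively
    for _ in range(steps):
        total += sum(abs(c * scale * b) ** 2 for c in c_col for b in b_row)
        scale *= lam
    return total


# Toy layer with three modes, two inputs, two outputs.
lams = [0.9 + 0.1j, 0.5, -0.2 + 0.3j]
B = [[1.0, 0.5], [0.2, -0.1], [2.0, 1.0]]  # B[i] is the input row of mode i
C = [[0.3, 1.0], [1.5, 0.0], [0.1, 0.1]]   # C[i] is the output column of mode i

energies = [mode_energy(l, b, c) for l, b, c in zip(lams, B, C)]

# Mode indices ordered from lowest to highest energy (lowest = first pruning candidate).
ranking = sorted(range(len(lams)), key=lambda i: energies[i])
```

In this toy layer the third mode has the largest input row but the smallest energy, because its output column is small and its pole decays quickly; the score depends on the product of all three factors rather than on any one of them.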

