Structural Disentanglement in Bilinear MLPs via Architectural Inductive Bias

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Selective unlearning and long-horizon extrapolation remain fragile in modern neural networks, even when tasks have underlying algebraic structure. In this work, we argue that these failures arise not solely from optimization or unlearning algorithms, but from how models structure their internal representations during training. Using Bilinear MLPs, we explore whether explicit multiplicative interactions, as an architectural inductive bias, promote structural disentanglement. We show analytically that bilinear parameterizations possess a “non-mixing” property under gradient flow, where functional components separate into orthogonal subspace representations. This provides a mathematical foundation for surgical model modification. We validate this hypothesis through a series of controlled experiments spanning modular arithmetic, cyclic reasoning, Lie-group dynamics, and targeted unlearning benchmarks. Unlike pointwise nonlinear networks, multiplicative architectures recover operators aligned with the underlying algebraic structure. Our results suggest that model editability and generalization are constrained by representational structure, and that architectural inductive bias plays a central role in enabling reliable unlearning.


💡 Research Summary

The paper tackles two increasingly important capabilities of modern neural networks—selective unlearning and long‑horizon extrapolation—and argues that their fragility stems not from optimization tricks but from the way internal representations are structured. The authors introduce the notion of structural disentanglement, a property where a model’s learned operator decomposes into orthogonal, independent components that align with the underlying algebraic or compositional structure of the task. They hypothesize that an architectural inductive bias that explicitly models multiplicative interactions can promote such disentanglement.

To investigate this, the study focuses on Bilinear MLPs, a class of networks where each layer computes a pointwise product of two linear projections: g(x) = (Wx) ⊙ (Vx). This yields a quadratic form xᵀMₖx for each output unit, with Mₖ = wₖvₖᵀ a rank‑1 interaction matrix. By aggregating over outputs, the model defines a global interaction operator Q = ∑ₖαₖMₖ, which can be treated as a symmetric matrix without loss of generality. The symmetry guarantees real eigenvalues and an orthogonal eigenbasis, providing a natural substrate for disentanglement.
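As a concrete illustration, the layer definition can be checked numerically. The sketch below (with hypothetical dimensions, not taken from the paper) confirms that each bilinear output unit is exactly a quadratic form in the input, and that symmetrizing the interaction matrix leaves that quadratic form unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 4  # hypothetical input dimension and number of output units

W = rng.normal(size=(k, d))
V = rng.normal(size=(k, d))
x = rng.normal(size=d)

# Bilinear layer: elementwise product of two linear projections
g = (W @ x) * (V @ x)

# Each output unit is the quadratic form x^T M_k x with M_k = w_k v_k^T
for i in range(k):
    M = np.outer(W[i], V[i])
    assert np.isclose(g[i], x @ M @ x)

# Symmetrizing M_k leaves the quadratic form unchanged, hence "without loss of generality"
M0 = np.outer(W[0], V[0])
S0 = 0.5 * (M0 + M0.T)
assert np.isclose(x @ M0 @ x, x @ S0 @ x)
```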

The theoretical contribution derives the gradient‑flow dynamics of Q under a squared‑Frobenius loss L = ½‖Q − Q*‖²_F, where Q* is the ground‑truth operator. Parameterizing Q as UVᵀ, the flow obeys

  ˙Q = −(Q − Q*) VVᵀ − UUᵀ (Q − Q*).

Assuming Q* has an SVD Q* = ∑_i s_i u_i v_iᵀ and that the network is initialized with small random weights, the authors show that the learned operator remains aligned with the singular vectors: Q(t) = ∑_i c_i(t) u_i v_iᵀ. Substituting into the flow yields independent scalar ODEs

  ˙c_i = −(a_i² + b_i²)(c_i − s_i),

where a_i = ‖Uᵀu_i‖ and b_i = ‖Vᵀv_i‖. Crucially, no cross-terms appear, meaning each mode evolves in isolation. This “non-mixing” property guarantees that setting a target singular value to zero (i.e., unlearning a specific functional component) does not disturb the remaining modes. Moreover, because Q is symmetric it admits an eigendecomposition Q = PΛPᵀ, so repeated application is exact: Qᵏ = PΛᵏPᵀ, enabling long-term composition without error accumulation. This stands in stark contrast to piecewise-linear ReLU networks, whose local affine maps differ across regions, so errors compound multiplicatively under composition.
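The non-mixing dynamics can be checked numerically. The sketch below (illustrative sizes, spectrum, and step size, not the paper's experimental setup) runs discretized gradient flow on Q = UVᵀ toward a diagonal target, so the singular vectors are the standard basis; each diagonal coefficient relaxes to its target independently while cross-terms stay near zero:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
# Ground-truth operator with a known spectrum; a diagonal target means u_i = v_i = e_i
s = np.array([4.0, 3.0, 2.0, 1.0, 0.5])
Qstar = np.diag(s)

# Small random initialization of the factorization Q = U V^T
U = 0.01 * rng.normal(size=(d, d))
V = 0.01 * rng.normal(size=(d, d))

# Discretized gradient flow: dU/dt = -(Q - Q*) V, dV/dt = -(Q - Q*)^T U
lr = 0.01
for _ in range(30000):
    E = U @ V.T - Qstar
    U, V = U - lr * (E @ V), V - lr * (E.T @ U)

Q = U @ V.T
# Each mode c_i has relaxed to its target s_i independently...
assert np.allclose(np.diag(Q), s, atol=1e-2)
# ...and no off-diagonal mixing has appeared
assert np.max(np.abs(Q - np.diag(np.diag(Q)))) < 1e-2
```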

Empirically, the authors evaluate four families of tasks:

  1. Modular arithmetic (addition and multiplication modulo a prime p = 97). For addition, the true operator is circulant and diagonalizable by the discrete Fourier transform. The authors measure Fourier entropy of the learned interaction matrices; Bilinear models achieve entropy close to the theoretical maximum (log p), indicating they recover the uniform spectral distribution, whereas ReLU baselines exhibit higher entropy and noisy spectra. For multiplication, they examine singular-value decay, showing Bilinear networks capture a low-rank structure with rapid decay, while ReLU models retain a more diffuse spectrum.

  2. Cyclical reasoning / long‑horizon extrapolation. Models are trained to predict the successor function f(a) = (a + 1) mod p and then evaluated on i‑step iterates by raising the extracted transition matrix T to the i‑th power. Bilinear models learn a near‑deterministic permutation matrix (low column entropy) and maintain high accuracy even for 20‑step predictions. ReLU models, however, produce diffuse transition matrices, leading to rapid degradation of multi‑step accuracy.

  3. Lie‑group dynamics. The paper simulates continuous transformations on a Lie group and shows that Bilinear MLPs learn the underlying generator matrix, preserving the group structure across multiple steps, whereas pointwise networks drift away from the true manifold.

  4. Selective unlearning benchmarks. The authors target removal of a specific class (e.g., a particular residue modulo p) by directly zeroing the corresponding singular value in Q. Bilinear networks achieve high unlearning selectivity: the targeted class accuracy drops to chance while overall performance remains unchanged. ReLU networks suffer substantial collateral damage, confirming that entangled representations hinder precise edits.
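The circulant claim in item 1 can be verified directly: the ground-truth operator for adding a fixed residue mod p is a cyclic shift, a circulant matrix's eigenvalues are the DFT of its first column, and a circulant permutation spreads power uniformly across Fourier modes, so its Fourier entropy attains the maximum log p. This is a minimal sketch, not the paper's measurement code:

```python
import numpy as np

p = 97
# Ground-truth operator for "+1 mod p" on one-hot codes: a cyclic shift (circulant)
S = np.roll(np.eye(p), 1, axis=0)
assert S[:, 5].argmax() == 6  # S maps one-hot(5) to one-hot(6)

# A circulant matrix is diagonalized by the DFT; its eigenvalues are the DFT of its first column
eigs = np.fft.fft(S[:, 0])
power = np.abs(eigs) ** 2
power /= power.sum()

# All Fourier modes carry equal power, so Fourier entropy hits the maximum log p
entropy = -np.sum(power * np.log(power))
assert np.isclose(entropy, np.log(p))
```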
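The evaluation protocol of item 2 can be mimicked with an idealized transition matrix (a hypothetical stand-in for an extracted T, not the paper's trained model): an operator that is exactly the cyclic permutation composes perfectly under matrix powers, and its column entropy, the paper's determinism measure, is zero:

```python
import numpy as np

p = 97
# Idealized extracted transition matrix for f(a) = (a + 1) mod p: an exact permutation
T = np.roll(np.eye(p), 1, axis=0)

# i-step extrapolation by matrix power: T^i maps one-hot(a) to one-hot((a + i) mod p)
a, steps = 5, 20
pred = np.linalg.matrix_power(T, steps) @ np.eye(p)[a]
assert pred.argmax() == (a + steps) % p

# Column entropy is zero for a deterministic permutation (the "low column entropy" regime)
cols = np.where(T > 0, T, 1.0)  # avoid log(0); zero entries contribute nothing
col_entropy = -np.sum(cols * np.log(cols), axis=0)
assert np.allclose(col_entropy, 0.0)
```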
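The surgical edit in item 4 amounts to spectral surgery on a symmetric operator (for symmetric Q, the eigendecomposition coincides with the SVD up to signs). The sketch below, with a random symmetric Q standing in for a trained interaction operator, zeros one eigenvalue and checks that every other mode is left unchanged:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 6
# Random symmetric operator standing in for a trained interaction operator Q
A = rng.normal(size=(d, d))
Q = 0.5 * (A + A.T)

# Symmetric => real eigenvalues and an orthogonal eigenbasis
vals, vecs = np.linalg.eigh(Q)

# "Unlearn" the dominant mode by zeroing its eigenvalue, then reassemble
target = int(np.argmax(np.abs(vals)))
vals_edit = vals.copy()
vals_edit[target] = 0.0
Q_edit = vecs @ np.diag(vals_edit) @ vecs.T

# The targeted component is annihilated...
assert np.allclose(Q_edit @ vecs[:, target], 0.0, atol=1e-10)
# ...while every remaining mode is preserved
for i in range(d):
    if i != target:
        assert np.allclose(Q_edit @ vecs[:, i], vals[i] * vecs[:, i])
```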

Across all experiments, Bilinear MLPs match or exceed standard ReLU models in raw task performance while dramatically improving both editability (the ability to surgically modify a single functional component) and extrapolation robustness (stable behavior under repeated composition). The authors stress that these benefits arise from the architectural inductive bias itself, not from any specialized unlearning algorithm.

In conclusion, the work provides a compelling argument that architectural choices dictate the geometry of learned representations. By embedding multiplicative interactions, Bilinear MLPs naturally enforce a non‑mixing gradient flow that aligns learned operators with the true algebraic structure of the data. This alignment yields two practical advantages: (1) selective unlearning becomes a tractable, structurally guaranteed operation, and (2) models generalize reliably to long‑range compositional tasks. The paper opens several avenues for future research, including extending the non‑mixing analysis to deeper, more complex architectures (e.g., transformers with bilinear attention), exploring regularization schemes that further promote orthogonal mode separation, and investigating how initialization and normalization affect the emergence of structural disentanglement.

