A General Algorithm for Detecting Higher-Order Interactions via Random Sequential Additions
Many systems exhibit complex interactions between their components: some features or actions amplify each other’s effects, others provide redundant information, and some contribute independently. We present a simple geometric method for discovering interactions and redundancies: when elements are added in random sequential orders and their contributions plotted over many trials, characteristic L-shaped patterns emerge that directly reflect interaction structure. The approach quantifies how the contribution of each element depends on those added before it, revealing patterns that distinguish interaction, independence, and redundancy on a unified scale. When pairwise contributions are visualized as two–dimensional point clouds, redundant pairs form L–shaped patterns where only the first-added element contributes, while synergistic pairs form L–shaped patterns where an element contributes only when added after its partner. Independent elements show order–invariant distributions. We formalize this with the L–score, a continuous measure ranging from $-1$ (perfect synergy, e.g. $Y=X_1X_2$) to $0$ (independence) to $+1$ (perfect redundancy, $X_1 \approx X_2$). The relative scaling of the L–shaped arms reveals feature dominance, i.e. which element consistently provides more information. Although computed only from pairwise measurements, higher–order interactions among three or more elements emerge naturally through consistent cross–pair relationships (e.g. AB, AC, BC). The method is metric–agnostic and broadly applicable to any domain where performance can be evaluated incrementally over non-repeating element sequences, providing a unified geometric approach to uncovering interaction structure.
💡 Research Summary
The paper introduces a novel, geometry‑based framework for detecting feature interactions—specifically synergy, redundancy, and independence—using random sequential addition of features to a predictive model. The core idea is simple: run many trials in which all available features are added one by one in a random order, and after each addition record the incremental improvement in a chosen performance metric (the authors use mean‑squared‑error reduction, but any monotonic metric works). For any pair of features (X₁, X₂) the trials are split into two groups depending on which feature appeared first. Plotting the marginal contributions of the “first” and “second” feature as points in a 2‑D plane yields characteristic point clouds:
- Redundancy – the first‑added feature supplies almost all the gain, while the second contributes little. The two clouds form an L‑shape with one arm horizontal (feature 1 added first) and the other vertical (feature 2 added first).
- Synergy – substantial gain occurs only when both features are present; each feature contributes mainly when it is added after the other. The clouds are again L‑shaped with perpendicular arms, but with the arms swapped relative to the redundancy case: the cloud for a feature added first is now vertical rather than horizontal.
- Independence – contribution does not depend on order; points are scattered symmetrically around the diagonal, showing no L‑shape.
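A minimal sketch of the trial procedure for a single pair may clarify how these clouds arise. The subset‑score tables below are hypothetical stand‑ins for a real model‑evaluation routine (the names `REDUNDANT`, `SYNERGISTIC`, and `trial_clouds` are illustrative, not from the paper); they encode the two limiting cases directly, with small noise added to each marginal gain:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical performance score for each feature subset (0 = no gain).
# Redundancy: either feature alone already achieves the full score.
REDUNDANT = {(): 0.0, (1,): 0.9, (2,): 0.9, (1, 2): 0.9}
# Synergy: the gain appears only once both features are present.
SYNERGISTIC = {(): 0.0, (1,): 0.0, (2,): 0.0, (1, 2): 0.9}

def trial_clouds(scores, n_trials=200, noise=0.02):
    """Add the two features in random order, record each feature's marginal
    gain, and split the trials by which feature was added first."""
    red, blue = [], []  # red: feature 1 first; blue: feature 2 first
    for _ in range(n_trials):
        order = [1, 2] if rng.random() < 0.5 else [2, 1]
        subset, prev, gains = (), scores[()], {}
        for f in order:
            subset = tuple(sorted(subset + (f,)))
            gains[f] = scores[subset] - prev + rng.normal(0, noise)
            prev = scores[subset]
        point = (gains[1], gains[2])  # (contribution of X1, contribution of X2)
        (red if order[0] == 1 else blue).append(point)
    return np.array(red), np.array(blue)
```

Under `REDUNDANT`, the red cloud concentrates near (0.9, 0) (a horizontal arm) and the blue cloud near (0, 0.9) (a vertical arm); under `SYNERGISTIC`, the arms are swapped.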
To turn these visual patterns into a quantitative metric the authors define the L‑score, ranging from –1 (perfect synergy) through 0 (independence) to +1 (perfect redundancy). The score is computed as
L_score = skinny_red × skinny_blue × (horiz_red − horiz_blue) / 2
where:
- Skinniness measures how elongated each cloud is, derived from the first and second PCA eigenvalues (e.g. 1 − λ₂/λ₁). Values close to 1 indicate a thin, line‑like cloud.
- Horizontalness maps the orientation of the cloud’s principal axis to a value between –1 (vertical) and +1 (horizontal) via cos(2θ), with θ the angle of the first principal component.
- The signed difference of the two horizontalness values determines the regime: it is positive when the first‑added feature’s cloud is horizontal (redundancy) and negative when it is vertical (synergy), while the factor 1/2 normalizes the score to the range [–1, +1]. (Note that a squared difference would lose this sign distinction.)
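The geometric quantities above translate directly into a few lines of linear algebra. The sketch below is one plausible implementation under the definitions just given (skinniness as 1 − λ₂/λ₁, horizontalness as cos(2θ)); the function names are illustrative, not the paper’s:

```python
import numpy as np

def cloud_stats(points):
    """PCA of an (n, 2) point cloud: returns (skinniness, horizontalness)."""
    cov = np.cov(points.T)                      # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    lam2, lam1 = eigvals                        # lam1 = largest eigenvalue
    skinny = 1.0 - lam2 / lam1                  # ~1 for a thin, line-like cloud
    theta = np.arctan2(eigvecs[1, -1], eigvecs[0, -1])  # angle of first PC
    horiz = np.cos(2 * theta)                   # +1 horizontal, -1 vertical
    return skinny, horiz

def l_score(red_cloud, blue_cloud):
    """L-score in [-1, +1]: +1 redundancy, 0 independence, -1 synergy."""
    s_r, h_r = cloud_stats(red_cloud)
    s_b, h_b = cloud_stats(blue_cloud)
    return s_r * s_b * (h_r - h_b) / 2.0
```

Note that cos(2θ) is invariant to the sign ambiguity of the eigenvector, and a round, independent cloud drives skinniness (and hence the score) toward zero regardless of orientation.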
Two computational strategies are offered. The exhaustive permutation approach evaluates every possible ordering (n! permutations for n features), guaranteeing a complete point cloud but quickly becoming infeasible as n grows. The path‑based (sampling) approach draws a modest number k of random permutations, records contributions, and approximates the clouds; this reduces complexity from factorial to O(k·n) while preserving the essential geometry.
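The two enumeration strategies can be sketched in a few lines; the helper name `orderings` and the 0‑based feature indexing are assumptions for illustration:

```python
import random
from itertools import permutations

def orderings(n_features, k=None, seed=0):
    """All n! feature orderings (exhaustive) or k random ones (sampled)."""
    feats = list(range(n_features))
    if k is None:
        return list(permutations(feats))  # n! orders: infeasible for large n
    rng = random.Random(seed)
    # k random permutations: O(k·n) work instead of factorial
    return [tuple(rng.sample(feats, n_features)) for _ in range(k)]
```

For 4 features the exhaustive list has 24 orderings; at 15 features it already exceeds 10¹², which is why the sampled variant is the practical default.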
The authors validate the method on synthetic datasets designed to isolate each interaction type. For synergy they use functions such as Y = A·B, Y = A³·B, and Y = sin(A·B); for redundancy they generate a latent variable A and create transformed copies (A², cos(πA), |A|). In both cases the L‑score correctly identifies the interaction: strong negative values (≈ –0.98) for synergistic pairs and strong positive values (≈ +0.95) for redundant pairs. A “dominance coefficient” derived from the average contribution of each feature further reveals which member of a pair is more informative.
Comparisons with established measures—Pearson correlation, mutual information, Sobol indices, Friedman’s H‑statistic, and SHAP interaction values—show that the L‑score uniquely captures both redundancy and synergy on a single scale. Correlation and mutual information detect redundancy but miss pure synergy because the underlying features are statistically independent. SHAP interactions detect synergy but not redundancy, illustrating the need for a unified metric.
Key advantages of the proposed framework include:
- Unified scale – a single continuous value distinguishes all three interaction regimes.
- Intuitive visualization – L‑shaped point clouds provide immediate qualitative insight.
- Metric‑agnostic – any performance improvement measure can be substituted, making the method applicable to regression, classification, reinforcement learning, or control tasks.
- Higher‑order inference – consistent pairwise L‑scores across triples (AB, AC, BC) can suggest three‑way interactions without explicitly computing k‑way terms.
The paper also discusses limitations. The sampling‑based approach requires enough random permutations to achieve statistical stability, which may be challenging in very high‑dimensional settings. The current formulation focuses on pairwise clouds; extending the geometry to directly capture three‑ or higher‑order L‑shapes remains an open problem. Sensitivity to noise is noted: small differences in marginal contributions can inflate the L‑score, suggesting the need for confidence intervals or bootstrap validation. Finally, the method assumes that the performance metric is additive across sequential additions, which may not hold for all model families.
In summary, the authors present a simple yet powerful technique—random sequential feature addition combined with geometric analysis—to detect and quantify feature interactions. By converting order‑dependent contribution patterns into the L‑score, they provide a unified, interpretable, and computationally tractable tool for researchers and practitioners dealing with complex, interdependent variables across a wide range of scientific and engineering domains. Future work is suggested on optimizing sampling strategies, extending the metric to direct k‑way interactions, and applying the framework to real‑world datasets in genomics, neuroscience, and reinforcement learning.