GRIP2: A Robust and Powerful Deep Knockoff Method for Feature Selection
Identifying truly predictive covariates while strictly controlling false discoveries remains a fundamental challenge in nonlinear, highly correlated, and low signal-to-noise regimes, precisely where deep learning–based feature selection methods are most attractive. We propose Group Regularization Importance Persistence in 2 Dimensions (GRIP2), a deep knockoff feature importance statistic that integrates first-layer feature activity over a two-dimensional regularization surface controlling both sparsity strength and sparsification geometry. To approximate this surface integral in a single training run, we introduce efficient block-stochastic sampling, which aggregates feature activity magnitudes across diverse regularization regimes along the optimization trajectory. The resulting statistics are antisymmetric by construction, ensuring finite-sample FDR control. In extensive experiments on synthetic and semi-real data, GRIP2 demonstrates improved robustness to feature correlation and noise level: in high-correlation and low signal-to-noise-ratio regimes where standard deep learning–based feature selectors may struggle, our method retains high power and stability. Finally, on real-world HIV drug resistance data, GRIP2 recovers known resistance-associated mutations with higher power than established linear baselines, confirming its reliability in practice.
💡 Research Summary
The paper introduces GRIP2, a novel deep knockoff‑based feature‑selection statistic designed to retain high power and rigorous false‑discovery‑rate (FDR) control in challenging regimes characterized by strong feature correlation and low signal‑to‑noise ratios. Building on the Model‑X knockoff framework, the authors augment the data with synthetic knockoff variables and train a multilayer perceptron (MLP) that receives both original and knockoff features. The key insight is to use the ℓ₂ norm of each first‑layer weight vector w_j as a proxy for the activity of feature j. Rather than evaluating this activity at a single regularization setting, GRIP2 integrates it over a two‑dimensional regularization surface defined by (λ, a): λ controls overall sparsity strength, while a ∈ (0, 1] shapes the geometry of the group penalty (a = 1 corresponds to the convex group‑Lasso, a < 1 yields a non‑convex, ℓ₀‑like penalty).
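A common way to realize such a two-parameter group penalty is to raise the per-feature group norms to the power a (a bridge-type ℓ₂,ₐ penalty). The sketch below assumes this form for illustration; the paper's exact penalty may differ in parameterization:

```python
import numpy as np

def group_penalty(W, lam, a):
    """Sketch of a two-parameter group penalty (assumed bridge-type form).

    W   : (d, h) first-layer weight matrix; row j is the weight vector w_j
          of feature j across the h hidden units.
    lam : overall sparsity strength (the lambda axis).
    a   : geometry parameter in (0, 1]; a = 1 recovers the convex group
          Lasso, while a < 1 gives a non-convex, l0-like penalty that more
          aggressively zeroes out whole feature groups.
    """
    group_norms = np.linalg.norm(W, axis=1)  # ||w_j||_2 for each feature j
    return lam * np.sum(group_norms ** a)
```

For a = 1 this reduces exactly to λ·Σⱼ‖w_j‖₂, the standard group Lasso on first-layer rows; decreasing a flattens the penalty's growth in the norm, mimicking an ℓ₀ count of active feature groups.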
The “soft regularization persistence” score for feature j is defined as the expectation of its activity over a prior distribution μ on the (λ, a) domain:
(S_j = \mathbb{E}_{(λ,a)∼μ}\left[\, \lVert w_j(λ, a) \rVert_2 \,\right]), where ‖w_j(λ, a)‖₂ is the ℓ₂ norm of feature j's first-layer weight vector under regularization setting (λ, a).
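The expectation over (λ, a) can be approximated by Monte Carlo sampling from the prior μ. The sketch below uses a hypothetical closed-form surrogate for the trained norms ‖w_j(λ, a)‖₂ and an assumed uniform prior, purely to illustrate the averaging; in GRIP2 these norms come from the MLP's first layer along the optimization trajectory, not from a formula:

```python
import numpy as np

rng = np.random.default_rng(0)

def activity(j, lam, a):
    """Hypothetical surrogate for ||w_j(lambda, a)||_2 (illustration only).

    Feature 0 stands in for a relevant feature whose weight norm persists
    across regularization settings; feature 1 for a null feature that is
    quickly shrunk toward zero.
    """
    base = 1.0 if j == 0 else 0.05
    return base * np.exp(-lam / a)

def persistence_score(j, n_samples=10_000):
    """Monte Carlo estimate of S_j = E_{(lambda,a)~mu}[ ||w_j(lambda,a)||_2 ]
    under an assumed prior mu = Uniform(0, 1) x Uniform(0.1, 1)."""
    lams = rng.uniform(0.0, 1.0, n_samples)
    avals = rng.uniform(0.1, 1.0, n_samples)
    return float(np.mean(activity(j, lams, avals)))
```

A persistently active feature accumulates a large score across the whole (λ, a) surface, while a null feature's score stays near zero; pairing each S_j with the score of its knockoff copy then yields the antisymmetric statistic used for FDR control.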