Optimal Estimation in Orthogonally Invariant Generalized Linear Models: Spectral Initialization and Approximate Message Passing
We consider the problem of parameter estimation from a generalized linear model with a random design matrix that is orthogonally invariant in law. Such a model allows the design to have an arbitrary distribution of singular values and only assumes that its singular vectors are generic. It is a vast generalization of the i.i.d. Gaussian design typically considered in the theoretical literature, and it is motivated by the fact that real data often have a complex correlation structure, so methods relying on i.i.d. assumptions can be highly suboptimal. Building on the paradigm of spectrally initialized iterative optimization, this paper proposes optimal spectral estimators and combines them with an approximate message passing (AMP) algorithm, establishing rigorous performance guarantees for both algorithmic steps. Both the spectral initialization and the subsequent AMP meet existing conjectures on the fundamental limits of estimation: the former on the optimal sample complexity for efficient weak recovery, and the latter on the optimal estimation errors. Numerical experiments suggest that our methods remain effective and our theory remains accurate even beyond orthogonally invariant data.
💡 Research Summary
This paper addresses the fundamental problem of estimating a high‑dimensional parameter vector β∗ from a generalized linear model (GLM) when the design matrix X is random but only assumed to be orthogonally invariant in law. Orthogonal invariance means that the singular vectors of X are Haar‑distributed (i.e., generic) while the singular values may follow any prescribed distribution. This model captures a wide range of realistic correlation structures that are not covered by the classical i.i.d. Gaussian design assumption, yet it remains analytically tractable through tools from free probability and random matrix theory.
The authors propose a two‑stage algorithmic pipeline. In the first stage they construct a spectral estimator by forming the data‑dependent matrix
D = Xᵀ diag(T(y₁),…,T(y_n)) X
and taking its leading eigenvector as an initial estimate β̂⁽⁰⁾. The preprocessing function T(·) is chosen from a broad class. The main theoretical contribution here is Theorem 4.1, which identifies a “criticality condition” (equation 4.13) under which the top eigenvalue of D separates from the bulk of the spectrum (a BBP‑type phase transition). When this condition holds, the associated eigenvector achieves a non‑vanishing asymptotic overlap with β∗, thereby satisfying the weak‑recovery criterion
lim inf_{d→∞} |⟨β̂⁽⁰⁾,β∗⟩|/(‖β̂⁽⁰⁾‖‖β∗‖) > 0.
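A minimal numerical sketch of this first stage is given below, under illustrative assumptions that are not taken from the paper: an i.i.d. Gaussian design (a special case of an orthogonally invariant one), a noiseless phase-retrieval link y = (Xβ∗)², and a simple trimming preprocessing T in place of the paper's optimal T★.

```python
import numpy as np

def spectral_initializer(X, y, T):
    """Spectral estimate: leading eigenvector of D = X^T diag(T(y)) X."""
    D = X.T @ (T(y)[:, None] * X)           # overall scaling of D does not change the eigenvector
    eigvals, eigvecs = np.linalg.eigh(D)    # D is symmetric; eigenvalues come in ascending order
    return eigvecs[:, -1], eigvals[-1]      # top eigenvector and eigenvalue

def overlap(beta_hat, beta_star):
    """Normalized correlation |<beta_hat, beta_star>| / (||beta_hat|| ||beta_star||)."""
    return abs(beta_hat @ beta_star) / (np.linalg.norm(beta_hat) * np.linalg.norm(beta_star))

# Toy usage: noiseless phase retrieval with an i.i.d. Gaussian design (illustrative only).
rng = np.random.default_rng(0)
n, d = 4000, 500                            # sample complexity delta = n/d = 8
beta_star = rng.standard_normal(d)
beta_star /= np.linalg.norm(beta_star)
X = rng.standard_normal((n, d))             # rows x_i ~ N(0, I_d)
y = (X @ beta_star) ** 2                    # y_i = <x_i, beta*>^2
T = lambda v: np.minimum(v, 5.0)            # illustrative trimming preprocessing, not the paper's T*
beta0, _ = spectral_initializer(X, y, T)
print("overlap with beta*:", overlap(beta0, beta_star))
```

In this regime the printed overlap is well above zero, illustrating the weak-recovery criterion; as δ shrinks toward the phase-transition threshold, the overlap of such a spectral estimator degrades.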
Moreover, Theorem 4.2 shows that among all admissible T, there exists an optimal choice T★ that minimizes the sample complexity δ = n/d required for weak recovery. This minimal δ coincides with the conjectured computational threshold δ* from prior work, establishing that the spectral estimator achieves the sample complexity conjectured to be optimal among polynomial-time methods.
In the second stage the authors introduce Generalized Vector Approximate Message Passing (GV‑AMP), a flexible extension of the classic GAMP algorithm. GV‑AMP allows (i) arbitrary trace‑free spectral transformations of the singular values of X and (ii) entry‑wise, divergence‑free nonlinearities applied to the iterates. The dynamics of GV‑AMP are rigorously tracked by a deterministic state‑evolution (SE) recursion (Theorem 4.3). The SE equations predict, at each iteration, the mean‑squared error and the overlap with β∗, and they hold for any orthogonally invariant design matrix under the stated assumptions.
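To make the role of a state-evolution recursion concrete, here is a minimal sketch for the simplest textbook special case rather than the paper's GV-AMP: AMP on a linear Gaussian channel y = Xβ∗ + w with an i.i.d. Gaussian design and a standard Gaussian prior, where the SE reduces to a scalar recursion for the effective noise variance τ_t². The general SE of Theorem 4.3 additionally depends on the spectral distribution of X.

```python
def state_evolution(delta, sigma2, n_iter=50, tau2_init=10.0):
    """Scalar state evolution for AMP on y = X beta* + w with an i.i.d. Gaussian
    design of aspect ratio delta = n/d, noise variance sigma2, and a standard
    Gaussian prior on the entries of beta*.

    Each AMP iterate behaves like beta* + tau_t * g with g ~ N(0, I); the
    posterior-mean denoiser for a N(0, 1) prior has MMSE tau^2 / (1 + tau^2),
    giving the recursion tau_{t+1}^2 = sigma2 + (1/delta) * MMSE(tau_t^2).
    """
    tau2 = tau2_init
    history = [tau2]
    for _ in range(n_iter):
        mmse = tau2 / (1.0 + tau2)   # Gaussian-prior MMSE at effective noise level tau2
        tau2 = sigma2 + mmse / delta
        history.append(tau2)
    return history

# The fixed point of the recursion predicts the asymptotic estimation error.
traj = state_evolution(delta=2.0, sigma2=0.1)
print("first iterates:", [round(t, 4) for t in traj[:5]], "fixed point ~", round(traj[-1], 4))
```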
By specializing the generic GV‑AMP to a Bayes‑optimal denoiser (i.e., using the posterior mean under the assumed prior on β∗), the authors obtain the Bayes‑GV‑AMP algorithm. Theorem 4.6 proves that the fixed point of this algorithm attains the asymptotic per‑coordinate Bayes risk lim_{d→∞} (1/d) E‖β̂ − β∗‖², matching the error conjectured to be optimal for efficient algorithms.
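As an illustration of what a Bayes-optimal (posterior-mean) denoiser looks like inside such an iteration, the sketch below assumes a toy Rademacher (±1) prior on the entries of β∗ and the scalar Gaussian channel r = β + τg that state evolution associates with each iterate; this prior is an assumption chosen for simplicity, not the paper's setting.

```python
import numpy as np

def bayes_denoiser_rademacher(r, tau2):
    """Posterior mean E[beta | beta + tau*g = r] for a uniform +/-1 prior: tanh(r / tau2)."""
    return np.tanh(r / tau2)

def bayes_denoiser_divergence(r, tau2):
    """Average derivative of the denoiser, the quantity entering Onsager-type corrections."""
    return np.mean(1.0 - np.tanh(r / tau2) ** 2) / tau2

# Toy check: the posterior-mean denoiser reduces the error of the noisy iterate.
rng = np.random.default_rng(1)
beta = rng.choice([-1.0, 1.0], size=100_000)
tau2 = 0.5
r = beta + np.sqrt(tau2) * rng.standard_normal(beta.shape)
mse_in = np.mean((r - beta) ** 2)
mse_out = np.mean((bayes_denoiser_rademacher(r, tau2) - beta) ** 2)
print(f"MSE before denoising: {mse_in:.3f}, after: {mse_out:.3f}")
```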