Large Scale High-Dimensional Reduced-Rank Linear Discriminant Analysis


Reduced-rank linear discriminant analysis (RRLDA) is a foundational dimension-reduction method for classification that has been useful in a wide range of applications. The goal is to identify an optimal subspace to project the observations onto that simultaneously maximizes between-group variation and minimizes within-group variation. The solution is straightforward when the number of observations exceeds the number of features, but computational difficulties arise both in the high-dimensional setting, where there are more features than observations, and when the data are very large. Many works have proposed solutions for the high-dimensional setting, and these frequently involve additional assumptions or tuning parameters. We propose a fast and simple iterative algorithm, RRLDA-RK, for both classical and high-dimensional RRLDA on large data that is free from these additional requirements and that comes with guarantees. We also explain how RRLDA-RK provides implicit regularization toward the least-norm solution without explicitly incorporating penalties. We demonstrate our algorithm on real data and highlight some results.


💡 Research Summary

The paper tackles two practical challenges of Reduced‑Rank Linear Discriminant Analysis (RRLDA): (1) the high‑dimensional regime where the number of features d exceeds the number of observations n, and (2) the large‑scale regime where both n and d are huge. Classical RRLDA solves a generalized eigenvalue problem that requires O(d³) operations, which becomes infeasible in these settings. Existing high‑dimensional solutions typically impose structural assumptions on the scatter matrices (e.g., diagonal, spiked, low‑rank) or add regularization terms with tunable hyper‑parameters, leading to additional computational overhead and the need for cross‑validation.

The authors first reformulate RRLDA as a least‑squares problem. Let X∈ℝ^{n×d} be the centered data matrix and Y∈ℝ^{n×g} the centered class‑indicator matrix (g is the number of classes). The goal becomes

  min_W ½‖XW – Y‖_F², W∈ℝ^{d×g}.

When n≥d this is an over‑determined system with the usual solution W = (XᵀX)^{-1}XᵀY. When n<d the system is under‑determined and admits infinitely many solutions; the minimum‑norm solution W_LN = Xᵀ(XXᵀ)^{†}Y coincides with the Moore‑Penrose-based solution used in prior high‑dimensional RRLDA work. This equivalence guarantees that the subspace spanned by the columns of W_LN is the same as the one obtained from the generalized eigenvalue formulation, up to orthogonal rotation and zero padding.
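The agreement between the minimum‑norm least‑squares solution and the Moore‑Penrose solution can be checked numerically. The sketch below is our own illustration (a generic random under‑determined system standing in for the centered data and indicator matrices; the paper does not supply this code) and relies on the pseudoinverse identity A^{†} = Aᵀ(AAᵀ)^{†}:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, g = 30, 100, 3   # under-determined: more features (d) than observations (n)

# Stand-ins for the centered data matrix X and centered class-indicator matrix Y.
X = rng.standard_normal((n, d))
Y = rng.standard_normal((n, g))

# Minimum-norm least-squares solution W_LN = X^T (X X^T)^+ Y ...
W_ln = X.T @ np.linalg.pinv(X @ X.T) @ Y
# ... equals the Moore-Penrose solution X^+ Y via the identity A^+ = A^T (A A^T)^+.
W_pinv = np.linalg.pinv(X) @ Y

print(np.allclose(W_ln, W_pinv))  # True
```

Both routes recover the same d×g matrix, so whichever is cheaper to form can be used as the reference solution.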

To solve the least‑squares problem efficiently, the authors adopt the Randomized Kaczmarz (RK) method. Classical RK iteratively selects a row i of X with probability proportional to its squared Euclidean norm, then updates a vector estimate w_k via a simple projection onto the hyperplane defined by that row. The paper extends RK to the matrix‑right‑hand‑side case: each iteration updates the entire matrix estimate W_k using the selected row x_i and the corresponding row Y_i of the label matrix. The update rule is

  W_{k+1} = W_k + (x_i / ‖x_i‖²)·(Y_i – x_iᵀW_k).
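A minimal NumPy sketch of this matrix‑right‑hand‑side RK iteration follows; the function name, defaults, and stopping rule are our own choices, not the authors' reference implementation:

```python
import numpy as np

def rrlda_rk(X, Y, n_iters=50_000, seed=0):
    """Matrix-right-hand-side Randomized Kaczmarz for min ||XW - Y||_F^2.

    Illustrative sketch only; a fixed iteration budget stands in for
    whatever stopping criterion the paper uses.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    row_norms = np.einsum('ij,ij->i', X, X)   # ||x_i||^2 for each row
    probs = row_norms / row_norms.sum()       # p_i = ||x_i||^2 / ||X||_F^2
    W = np.zeros((d, Y.shape[1]))             # W_0 = 0 targets the min-norm solution
    for _ in range(n_iters):
        i = rng.choice(n, p=probs)
        # W_{k+1} = W_k + x_i (Y_i - x_i^T W_k) / ||x_i||^2
        W += np.outer(X[i], Y[i] - X[i] @ W) / row_norms[i]
    return W
```

Started from W_0 = 0 on a consistent under‑determined system, the iterates approach the minimum‑norm solution `np.linalg.pinv(X) @ Y`, which is the implicit regularization the paper describes.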

Because the selection probabilities are p_i = ‖x_i‖² / ‖X‖_F², the expected convergence rate depends on the scaled condition number κ(X) = ‖X‖_F² / σ_min⁺(X)², where σ_min⁺(X) denotes the smallest nonzero singular value of X. Proposition 1 proves that after K iterations

  E‖W_K – W_LN‖_F² ≤ (1 – 1/κ(X))^K ‖W_0 – W_LN‖_F²,

so the iterates converge linearly in expectation to the minimum‑norm solution W_LN.


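This geometric rate can be probed empirically. The following sketch is our own construction (the dimensions and iteration count are arbitrary): it computes κ(X), runs the matrix RK iteration from W_0 = 0, and reports the observed squared error against the minimum‑norm solution alongside the theoretical contraction factor:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, g = 15, 40, 2
X = rng.standard_normal((n, d))
Y = rng.standard_normal((n, g))
W_ln = np.linalg.pinv(X) @ Y                        # minimum-norm target

row_norms = np.einsum('ij,ij->i', X, X)
probs = row_norms / row_norms.sum()
sigma_min = np.linalg.svd(X, compute_uv=False)[-1]  # smallest (nonzero) singular value
kappa = row_norms.sum() / sigma_min**2              # kappa(X) = ||X||_F^2 / sigma_min^2

K = 20_000
W = np.zeros((d, g))
for _ in range(K):
    i = rng.choice(n, p=probs)
    W += np.outer(X[i], Y[i] - X[i] @ W) / row_norms[i]

err = np.linalg.norm(W - W_ln) ** 2
rate = (1 - 1 / kappa) ** K                          # contraction factor after K steps
print(err, rate * np.linalg.norm(W_ln) ** 2)         # observed error vs. predicted scale
```

With W_0 = 0 the bound predicts the squared error shrinks like (1 – 1/κ(X))^K ‖W_LN‖_F², and on a well‑conditioned random instance the observed error is driven down to numerical noise well within this budget.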