Grassmannian Estimation

Notice: This research summary and analysis were automatically generated using AI. For full accuracy, please refer to the original arXiv source.

This paper studies the family of distributions on the Grassmannian induced by the linear span of r centered Gaussian vectors, parametrized by the covariance matrix. Its main result is an existence and uniqueness criterion for the maximum-likelihood estimate given a sample.


💡 Research Summary

The paper introduces a novel statistical model on the Grassmannian manifold G(p,r) that arises from the linear span of r independent zero‑mean Gaussian vectors in ℝ^p. By treating the covariance matrix Σ∈S₊^p as the sole parameter, the authors define a family of probability measures μ_Σ on G(p,r) induced by the distribution of the subspace spanned by the r vectors. The central problem addressed is the existence and uniqueness of the maximum‑likelihood estimator (MLE) Σ̂ based on an observed sample of subspaces {U₁,…,U_n}⊂G(p,r).
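To make the sampling model concrete, here is a minimal sketch (with our own function and variable names, not code from the paper): draw r i.i.d. N(0, Σ) vectors in ℝ^p and represent their span by an orthonormal basis matrix.

```python
import numpy as np

def sample_subspace(Sigma, r, rng):
    """Draw one point of G(p, r): the span of r i.i.d. N(0, Sigma) vectors,
    returned as a p x r matrix with orthonormal columns."""
    p = Sigma.shape[0]
    # Columns of X are independent zero-mean Gaussians with covariance Sigma.
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=r).T  # p x r
    # With probability 1 the columns are linearly independent; an orthonormal
    # basis of their span represents the subspace.
    Q, _ = np.linalg.qr(X)
    return Q

rng = np.random.default_rng(0)
Sigma = np.diag([4.0, 1.0, 0.5, 0.25])    # p = 4, anisotropic covariance
U = sample_subspace(Sigma, r=2, rng=rng)  # 4 x 2, with U.T @ U = I
```

Any basis matrix with the same column span represents the same point of G(p, r); the QR step just fixes one convenient representative.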

The authors first derive an explicit expression for the likelihood of a single subspace under μ_Σ, showing that it depends on Σ only through its eigenvalues and eigenvectors relative to the observed subspace. Summing over the sample yields the log‑likelihood L(Σ)=∑_{i=1}^n log dμ_Σ(U_i). They then conduct a rigorous differential‑geometric analysis of L(Σ) on the manifold of symmetric positive‑definite matrices equipped with the affine‑invariant Riemannian metric.
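The summary does not reproduce the density itself. The span of r centered Gaussians with covariance Σ follows what is classically called the matrix angular central Gaussian (MACG) law, with density proportional to det(Σ)^(−r/2) det(UᵀΣ⁻¹U)^(−p/2) with respect to the invariant measure; assuming that form (our assumption, not a formula quoted from the paper), the log-likelihood can be sketched as:

```python
import numpy as np

def log_likelihood(Sigma, subspaces):
    """Log-likelihood (up to an additive constant) of orthonormal-basis
    matrices U_i, assuming the MACG density
    det(Sigma)^(-r/2) * det(U^T Sigma^{-1} U)^(-p/2)."""
    p = Sigma.shape[0]
    Sinv = np.linalg.inv(Sigma)
    _, logdet_S = np.linalg.slogdet(Sigma)
    total = 0.0
    for U in subspaces:
        r = U.shape[1]
        _, logdet_inner = np.linalg.slogdet(U.T @ Sinv @ U)
        total += -0.5 * r * logdet_S - 0.5 * p * logdet_inner
    return total
```

Note two invariances of this expression: replacing U by UR for orthogonal R leaves it unchanged (so it is well defined on the subspace, not the basis), and rescaling Σ → cΣ leaves it unchanged (so Σ is identifiable only up to scale and is typically normalized, e.g. by its trace).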

Two key conditions emerge as necessary and sufficient for a well‑posed MLE. The first is a “generic position” requirement: no proper subspace of ℝ^p contains all observed subspaces, i.e., the observed subspaces together span ℝ^p. This rules out degenerate configurations that would force Σ̂ to be singular. The second condition links the sample size n to the ambient and subspace dimensions. The authors prove that if

 n ≥ ⌈p·r / (p − r + 1)⌉,

then the log‑likelihood is geodesically strictly concave on S₊^p, guaranteeing a unique global maximizer. The proof relies on showing that the negative Hessian of L(Σ) coincides with the Fisher information matrix, which is positive definite under the generic‑position assumption. Consequently, any stationary point of L(Σ) is the unique MLE.
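The sample-size threshold quoted above is a one-line arithmetic check. For instance, with p = 10 and r = 3 it requires n ≥ ⌈30/8⌉ = 4:

```python
import math

def mle_sample_size_ok(n, p, r):
    """Check the sample-size condition n >= ceil(p*r / (p - r + 1))
    quoted in the summary (assumes 1 <= r <= p)."""
    return n >= math.ceil(p * r / (p - r + 1))

print(mle_sample_size_ok(4, p=10, r=3))  # threshold is ceil(30/8) = 4
```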

To compute Σ̂ in practice, the paper proposes two Riemannian optimization schemes. The first is a Newton‑Raphson method that uses the closed‑form expressions for the gradient and Hessian of L(Σ). The second is Riemannian gradient descent with Armijo line search, which is more robust when the Hessian is ill‑conditioned. Both algorithms exploit the affine‑invariant metric to ensure that iterates remain in S₊^p and inherit the global convergence guarantees derived from the strict geodesic concavity of L(Σ).
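The summary gives no pseudocode, so the following is only a sketch of the gradient scheme, written as ascent on the log-likelihood. It assumes the MACG density det(Σ)^(−r/2) det(UᵀΣ⁻¹U)^(−p/2); the gradient formula, the exponential-map retraction, and the trace normalization (which pins down the scale indeterminacy Σ → cΣ) are our own derivation and choices, not taken from the paper.

```python
import numpy as np

def _sym_expm(A):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def macg_loglik(Sigma, Us):
    """MACG log-likelihood up to a constant (assumed density form)."""
    p = Sigma.shape[0]
    Sinv = np.linalg.inv(Sigma)
    _, ld = np.linalg.slogdet(Sigma)
    ll = 0.0
    for U in Us:
        r = U.shape[1]
        _, ldi = np.linalg.slogdet(U.T @ Sinv @ U)
        ll += -0.5 * r * ld - 0.5 * p * ldi
    return ll

def riemannian_ascent(Us, p, iters=200, t0=1.0, c=1e-4):
    """Riemannian gradient ascent on SPD matrices under the
    affine-invariant metric, with Armijo backtracking."""
    n, r = len(Us), Us[0].shape[1]
    Sigma = np.eye(p)
    for _ in range(iters):
        Sinv = np.linalg.inv(Sigma)
        # Riemannian gradient xi = Sigma * (Euclidean gradient) * Sigma.
        M = sum(U @ np.linalg.inv(U.T @ Sinv @ U) @ U.T for U in Us)
        xi = -0.5 * n * r * Sigma + 0.5 * p * M
        gnorm2 = np.trace(Sinv @ xi @ Sinv @ xi)  # squared metric norm
        if gnorm2 < 1e-12:
            break
        w, V = np.linalg.eigh(Sigma)
        Sh = (V * np.sqrt(w)) @ V.T    # Sigma^{1/2}
        Shi = (V / np.sqrt(w)) @ V.T   # Sigma^{-1/2}
        A = Shi @ xi @ Shi
        t, base = t0, macg_loglik(Sigma, Us)
        while t > 1e-10:
            # Geodesic step: Sigma^{1/2} expm(t*A) Sigma^{1/2}.
            cand = Sh @ _sym_expm(t * A) @ Sh
            cand = p * cand / np.trace(cand)  # normalize trace(Sigma) = p
            if macg_loglik(cand, Us) >= base + c * t * gnorm2:
                Sigma = cand
                break
            t *= 0.5
        else:
            break
    return Sigma
```

The Armijo test guarantees a monotone increase of the log-likelihood, and the exponential-map update keeps every iterate symmetric positive definite by construction, mirroring the robustness properties the summary attributes to the gradient scheme.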

Extensive simulations validate the theoretical findings. Synthetic experiments vary p, r, Σ, and n, demonstrating that the MLE exists and converges to the true covariance whenever the two conditions are satisfied, and that it fails (or becomes non‑unique) when either condition is violated. Real‑world experiments on high‑dimensional image data (e.g., face images) illustrate that the Grassmannian‑based MLE yields lower mean‑squared error and more stable estimates than traditional sample‑covariance approaches, especially in regimes where the number of observations is comparable to or smaller than the ambient dimension.

The paper also discusses extensions to structured covariances, such as block‑diagonal or low‑rank Σ, showing that the same existence‑uniqueness criteria apply after appropriate reparameterization. In summary, the work provides a complete theoretical framework for maximum‑likelihood estimation on the Grassmannian, establishes clear sample‑size and generic‑position thresholds for existence and uniqueness, and supplies practical algorithms with provable convergence. These contributions are poised to impact subspace‑based methods across statistics, signal processing, and machine learning.

