Smoothed Analysis of Tensor Decompositions

Notice: This research summary and analysis were generated automatically using AI. For authoritative details, please refer to the original arXiv paper.

Low rank tensor decompositions are a powerful tool for learning generative models, and uniqueness results give them a significant advantage over matrix decomposition methods. However, tensors pose significant algorithmic challenges, and tensor analogs of much of the matrix algebra toolkit are unlikely to exist because of hardness results. Efficient decomposition in the overcomplete case (where rank exceeds dimension) is particularly challenging. We introduce a smoothed analysis model for studying these questions and develop an efficient algorithm for tensor decomposition in the highly overcomplete case (rank polynomial in the dimension). In this setting, we show that our algorithm is robust to inverse polynomial error – a crucial property for applications in learning since we are only allowed a polynomial number of samples. While algorithms are known for exact tensor decomposition in some overcomplete settings, our main contribution is in analyzing their stability in the framework of smoothed analysis. Our main technical contribution is to show that tensor products of perturbed vectors are linearly independent in a robust sense (i.e. the associated matrix has singular values that are at least an inverse polynomial). This key result paves the way for applying tensor methods to learning problems in the smoothed setting. In particular, we use it to obtain results for learning multi-view models and mixtures of axis-aligned Gaussians where there are many more “components” than dimensions. The assumption here is that the model is not adversarially chosen, formalized by a perturbation of model parameters. We believe this is an appealing way to analyze realistic instances of learning problems, since this framework allows us to overcome many of the usual limitations of using tensor methods.


💡 Research Summary

The paper tackles one of the most challenging regimes in tensor decomposition: the highly over‑complete case where the rank R can be polynomial in the ambient dimension n. Classical algorithms for tensor decomposition (e.g., the method of Leurgans et al.) work only when R ≤ n, and they become unstable in the presence of noise. To overcome these limitations, the authors introduce a smoothed‑analysis framework: each factor vector is perturbed by independent Gaussian noise of variance ρ²/n per coordinate, and the observed tensor may also contain bounded additive noise. This model captures realistic learning scenarios in which the underlying generative parameters are not adversarially chosen.
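As a concrete illustration, the perturbation model can be sketched in a few lines of NumPy. This is our own illustration, not code from the paper, and the function name `rho_perturb` is ours: each entry of a factor matrix receives independent Gaussian noise of standard deviation ρ/√n, so every column is perturbed by a vector of expected squared norm ρ².

```python
import numpy as np

def rho_perturb(A, rho, rng=None):
    """Add independent Gaussian noise of variance rho^2/n to every entry
    of the n x R factor matrix A, as in the smoothed-analysis model.
    (Function name is ours, for illustration only.)"""
    rng = np.random.default_rng() if rng is None else rng
    n, _ = A.shape
    return A + rng.normal(scale=rho / np.sqrt(n), size=A.shape)

# Usage: perturb the unit-norm columns of a random 10 x 25 factor matrix.
rng = np.random.default_rng(0)
A = rng.normal(size=(10, 25))
A /= np.linalg.norm(A, axis=0)      # unit-norm columns, as in the theorem
A_tilde = rho_perturb(A, rho=0.01, rng=rng)
```

Note that R = 25 exceeds n = 10 here, matching the over-complete regime the paper studies.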

The technical heart of the work is a robust Kruskal‑rank theorem. Let A^{(1)},…,A^{(ℓ)} be n×R matrices with unit‑norm columns, and let \tilde A^{(j)} denote their ρ‑perturbed versions. The authors prove that the Khatri‑Rao product \tilde A^{(1)}⊙…⊙\tilde A^{(ℓ)} retains full robust Kruskal rank R with robustness parameter τ = (n/ρ)^{3ℓ}, i.e., the smallest singular value of the associated matrix is at least the inverse polynomial 1/τ. Moreover, this holds with probability at least 1−exp(−C n^{1/(3ℓ)}). In other words, the perturbed tensor factors remain “strongly linearly independent” despite the over‑completeness. The proof proceeds by analyzing a leave‑one‑out distance that serves as a surrogate for the smallest singular value, and by constructing a collection of matrices whose quadratic forms certify a non‑negligible projection of a random rank‑one tensor onto any large subspace.
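The phenomenon is easy to check empirically. The sketch below (our own illustration, not code from the paper) forms the ℓ = 2 Khatri‑Rao product of two perturbed n×R matrices with R > n and verifies that its smallest singular value is bounded away from zero, even though each individual factor matrix has linearly dependent columns:

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise tensor (Khatri-Rao) product: column j of the result is
    the flattened outer product of the j-th columns of every input matrix."""
    out = mats[0]
    for M in mats[1:]:
        out = np.einsum('ir,jr->ijr', out, M).reshape(-1, M.shape[1])
    return out

# With n = 8 and R = 20 > n, each factor matrix alone has linearly
# dependent columns, yet the 2-fold Khatri-Rao product of the perturbed
# factors (20 columns in R^64) is robustly full rank.
rng = np.random.default_rng(1)
n, R, rho = 8, 20, 0.1
factors = [rng.normal(size=(n, R)) for _ in range(2)]
factors = [F / np.linalg.norm(F, axis=0) +
           rng.normal(scale=rho / np.sqrt(n), size=(n, R)) for F in factors]
K = khatri_rao(factors)
sigma_min = np.linalg.svd(K, compute_uv=False)[-1]
```

The theorem's content is quantitative: not merely that sigma_min is nonzero, but that it is at least inverse polynomial in n and 1/ρ with high probability.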

Armed with this robust independence guarantee, the authors revisit the Leurgans‑style algorithm. By flattening an order‑ℓ tensor into a third‑order tensor, decomposition reduces to two matrix eigen‑problems involving Khatri‑Rao products. The robust Kruskal‑rank result ensures that the matrices involved have condition numbers bounded by a polynomial in n and 1/ρ, even when R = poly(n). Consequently, the algorithm tolerates additive noise of magnitude ε·(ρ/n)^{3ℓ} and recovers each rank‑one component up to additive error ε in time O(n^{3ℓ}), polynomial in n. The failure probability is exponentially small in n^{1/(3ℓ)}.
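For intuition, here is a minimal sketch of the underlying simultaneous‑diagonalization idea for a third‑order tensor with linearly independent factors (the classical Jennrich/Leurgans approach); the function name and the noiseless setting are our simplifications, not the paper's robust version:

```python
import numpy as np

def jennrich_factors(T, rank, rng):
    """Recover the mode-1 factors of a third-order tensor with linearly
    independent components via two random contractions and one
    eigen-decomposition (the classical simultaneous-diagonalization idea)."""
    x, y = rng.normal(size=T.shape[2]), rng.normal(size=T.shape[2])
    M1 = np.einsum('ijk,k->ij', T, x)        # M1 = A diag(C^T x) B^T
    M2 = np.einsum('ijk,k->ij', T, y)        # M2 = A diag(C^T y) B^T
    # M1 M2^+ = A diag(...) A^+, so its top eigenvectors are columns of A.
    vals, vecs = np.linalg.eig(M1 @ np.linalg.pinv(M2))
    top = np.argsort(-np.abs(vals))[:rank]
    return vecs[:, top].real

# Usage on a synthetic noiseless rank-3 tensor in dimension 5.
rng = np.random.default_rng(2)
A, B, C = (rng.normal(size=(5, 3)) for _ in range(3))
T = np.einsum('ir,jr,kr->ijk', A, B, C)
A_hat = jennrich_factors(T, rank=3, rng=rng)
```

The paper's contribution is precisely the stability analysis of this scheme: the eigen-problems remain well conditioned under the smoothed model, so the recovery degrades only polynomially with noise.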

The paper then demonstrates how this machinery yields new learning algorithms for two important classes of models:

  1. Multi‑view latent variable models – Each sample consists of ℓ conditionally independent views generated from a hidden component. Under the smoothed‑analysis assumption that the view‑specific distributions are ρ‑perturbed, the authors can learn mixing weights and view distributions for up to R ≤ n^{⌊(ℓ−1)/2⌋}/2 components, far beyond the previous limit R ≤ n. The algorithm’s runtime and sample complexity are polynomial in n, 1/ε, and 1/ρ.

  2. Mixtures of axis‑aligned Gaussians – The means of the Gaussian components are assumed to be ρ‑perturbed, while covariances remain diagonal. Using the same tensor flattening and robust Kruskal‑rank argument, the authors obtain a polynomial‑time algorithm for learning mixtures with k = poly(n) components, a regime where prior methods required exponential time in k. The algorithm recovers mixture weights, perturbed means, and diagonal covariances to accuracy ε with high probability.
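To make the multi‑view setup concrete, the following sketch (our own illustration; names such as `mus` and `T_hat` are assumptions) estimates the cross‑view third moment from samples. Under conditional independence given the hidden component, this moment tensor equals Σ_r w_r μ_r^{(1)} ⊗ μ_r^{(2)} ⊗ μ_r^{(3)}, i.e., exactly the low‑rank object on which the decomposition machinery is run:

```python
import numpy as np

# Synthetic 3-view model: hidden component z ~ w; view t observes a noisy
# copy of that component's mean mus[t][:, z]. All names are illustrative.
rng = np.random.default_rng(3)
n, k, m = 6, 4, 20_000
w = np.ones(k) / k
mus = [rng.normal(size=(n, k)) for _ in range(3)]    # per-view means

z = rng.choice(k, size=m, p=w)                       # hidden components
views = [M[:, z].T + 0.1 * rng.normal(size=(m, n)) for M in mus]

# Empirical cross-view third moment; since the views are conditionally
# independent given z, its population value is the rank-k tensor
# sum_r w[r] * mus[0][:, r] (x) mus[1][:, r] (x) mus[2][:, r].
T_hat = np.einsum('mi,mj,mk->ijk', *views) / m
T_true = np.einsum('r,ir,jr,kr->ijk', w, *mus)
rel_err = np.linalg.norm(T_hat - T_true) / np.linalg.norm(T_true)
```

The inverse-polynomial robustness is what makes this pipeline feasible: a polynomial number of samples drives the estimation error below the noise level the decomposition algorithm tolerates.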

Both applications illustrate that the smoothed‑analysis model eliminates the need for strong separation or full‑rank assumptions that have traditionally limited tensor‑based learning methods. Instead, a modest random perturbation suffices to guarantee the algebraic conditions required for efficient decomposition.

The authors also discuss related work on prior over‑complete tensor decomposition results.

