Alignment and matching tests for high-dimensional tensor signals via tensor contraction

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We consider two hypothesis testing problems for low-rank and high-dimensional tensor signals, namely the tensor signal alignment and tensor signal matching problems. These problems are challenging due to the high dimension of tensors and the lack of suitable test statistics. By exploiting a recent tensor contraction method, we propose and validate relevant test statistics using eigenvalues of a data matrix resulting from the tensor contraction. The matrix entries exhibit long-range dependence, which makes the analysis of the matrix challenging, involved, and distinct from standard random matrix theory. Our approach provides a novel framework for addressing hypothesis testing problems in the context of high-dimensional tensor signals.


💡 Research Summary

This paper tackles two hypothesis-testing problems that arise in high-dimensional, low-rank tensor models: (i) testing whether a tensor signal aligns with a set of prescribed directional vectors (tensor signal alignment) and (ii) testing whether a tensor signal matches a reference tensor (tensor signal matching). The authors start from the $d$-fold spiked tensor model
$$T=\sum_{r=1}^{R}\beta_{r}\, x^{(r,1)}\otimes\cdots\otimes x^{(r,d)}+N^{-1/2}X,$$
where the noise tensor $X$ has i.i.d. zero-mean, unit-variance entries. Direct recovery of the factors $x^{(r,\ell)}$ is impossible when the signal-to-noise ratios (SNRs) fall below a phase-transition threshold, yet many applications (e.g., video action recognition, fMRI classification) require a statistical decision about alignment rather than exact recovery.
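The spiked model above is straightforward to simulate. Below is a minimal numpy sketch (the function name `spiked_tensor` and the dimensions are illustrative choices, not from the paper), assuming Gaussian noise as one instance of the i.i.d. zero-mean, unit-variance assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def spiked_tensor(beta, factors, rng):
    # Rank-R spiked model: T = sum_r beta_r x^(r,1) (x) ... (x) x^(r,d) + N^{-1/2} X,
    # where N = n_1 + ... + n_d and X has i.i.d. standard normal entries.
    shape = tuple(len(v) for v in factors[0])
    N = sum(shape)
    T = np.zeros(shape)
    for b, vecs in zip(beta, factors):
        outer = vecs[0]
        for v in vecs[1:]:
            outer = np.multiply.outer(outer, v)  # build the d-fold outer product
        T += b * outer
    return T + rng.standard_normal(shape) / np.sqrt(N)

# d = 3, rank R = 1 example with unit-norm factors (dimensions are illustrative)
dims = (30, 40, 50)
x = [rng.standard_normal(m) for m in dims]
x = [v / np.linalg.norm(v) for v in x]
T = spiked_tensor([2.0], [x], rng)
```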

To construct a tractable test statistic, the authors employ a recent tensor-contraction operator $\Phi_{d}$ (Liu & Wang, 2023). Given the tensor $T$ and a collection of unit vectors $a^{(\ell)}\in\mathbb{R}^{n_{\ell}}$, $\Phi_{d}$ collapses all but two modes of the tensor, producing a symmetric $N\times N$ matrix $R=\Phi_{d}(T,a^{(1)},\dots,a^{(d)})$ with $N=n_{1}+\dots+n_{d}$. Under the null hypothesis $H_{0}$ (no alignment) the signal component of $R$ vanishes, so $R$ equals a noise matrix $M$; under the alternative $H_{1}$ a low-rank signal matrix $S\neq 0$ is added. Crucially, the squared Frobenius norm of the signal part, $\|S\|_{F}^{2}$, equals
$$\sum_{r=1}^{R}\beta_{r}^{2}\prod_{\ell=1}^{d}\langle x^{(r,\ell)},a^{(\ell)}\rangle^{2},$$
which directly encodes the alignment of each factor with the prescribed directions. Hence $\|R\|_{F}^{2}$ serves as a natural test statistic.
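The identity above makes the population alignment strength explicit. Here is a small numpy sketch of that quantity (the name `alignment_strength` is chosen here for illustration; dimensions are arbitrary):

```python
import numpy as np

def alignment_strength(beta, factors, a):
    # Signal part of ||R||_F^2: sum_r beta_r^2 * prod_l <x^(r,l), a^(l)>^2
    total = 0.0
    for b, vecs in zip(beta, factors):
        prod = 1.0
        for x, al in zip(vecs, a):
            prod *= float(np.dot(x, al)) ** 2
        total += b**2 * prod
    return total

# Rank-1, d = 3 example: factors are first standard basis vectors
e = [np.eye(m)[0] for m in (3, 4, 5)]
perfect = alignment_strength([2.0], [e], e)        # a^(l) aligned with x^(1,l)
ortho = [np.eye(m)[1] for m in (3, 4, 5)]          # orthogonal directions
none = alignment_strength([2.0], [e], ortho)
```

With perfectly aligned directions the statistic target equals $\beta_1^2 = 4$, and with orthogonal directions it vanishes, matching the product structure of the identity.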

The main technical contribution is a rigorous random-matrix analysis of the contracted noise matrix $M$. Unlike classical Wigner matrices, the entries of $M$ are long-range correlated because each entry aggregates many noise-tensor elements sharing the same contraction vectors. The authors derive the resolvent $Q(z)=(M-zI)^{-1}$, decompose it into $d\times d$ blocks, and study the block-wise normalized traces $\rho_{i}(z)$. Defining the vector $m(z)=(\mathbb{E}\rho_{1}(z),\dots,\mathbb{E}\rho_{d}(z))$, they show that $m(z)$ approximately satisfies a perturbed vector Dyson equation
$$-c\,m(z)=z+S_{d}\,m(z)+\varepsilon(z),$$
with $S_{d}=\mathbf{1}_{d\times d}-I_{d}$ and a perturbation $\varepsilon(z)$ that vanishes as $N\to\infty$. The limiting equation $-c\,g(z)=z+S_{d}\,g(z)$ has a unique analytic solution $g(z)=(g_{1}(z),\dots,g_{d}(z))$; the sum $\sum_{i=1}^{d}g_{i}(z)$ is the Stieltjes transform of a probability measure $\nu$, which is proved to be the limiting spectral distribution (LSD) of $M$. The authors also establish entrywise local laws for $Q(z)$ and a bound on the spectral norm of $M$.

Armed with the LSD, the authors treat $\|R\|_{F}^{2}$ as a linear spectral statistic (LSS) of $R$. They define the centered statistic
$$b_{T}^{(d)}(N)=\|R\|_{F}^{2}-N\int x^{2}\,\nu(dx)$$
and prove a central limit theorem (CLT): under $H_{0}$,
$$\frac{b_{T}^{(d)}(N)-\xi_{N}^{(d)}}{\sigma_{N}^{(d)}}\;\xrightarrow{d}\;N(0,1),$$
where $\xi_{N}^{(d)}$ and $\sigma_{N}^{(d)}$ are explicit quantities computable from $\nu$ and the model parameters. Under $H_{1}$ the same normalized statistic converges to a normal distribution with mean shift $D^{(d)}/\sigma_{N}^{(d)}$, where $D^{(d)}$ captures the aggregate alignment strength. Consequently, a test at any significance level $\alpha$ can be built by comparing the normalized statistic to the standard normal quantile, and the asymptotic power is a simple function of the mean shift.
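The resulting decision rule is elementary once $\xi_{N}^{(d)}$ and $\sigma_{N}^{(d)}$ are in hand. A minimal sketch of the one-sided test and the approximate power $\Phi(D^{(d)}/\sigma_{N}^{(d)} - z_{1-\alpha})$, using only the standard library (function names are illustrative, not the paper's):

```python
import math

def normal_cdf(t):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def alignment_test(b_stat, xi, sigma, alpha=0.05):
    # One-sided test: reject H0 when (b - xi)/sigma exceeds the (1 - alpha) normal quantile
    z = (b_stat - xi) / sigma
    p_value = 1.0 - normal_cdf(z)
    return p_value < alpha, p_value

def asymptotic_power(shift, alpha=0.05):
    # Approximate power when the normalized statistic has mean shift D/sigma under H1.
    # z_{1-alpha} is found by bisection on the normal CDF (avoids a scipy dependency).
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if normal_cdf(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    z_alpha = 0.5 * (lo + hi)
    return 1.0 - normal_cdf(z_alpha - shift)
```

For a zero mean shift the power reduces to the level $\alpha$, and it increases monotonically in $D^{(d)}/\sigma_{N}^{(d)}$, as the mean-shift characterization predicts.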

The paper also extends the framework to the "tensor signal matching" problem, where the prior information is a reference tensor rather than a set of vectors. By applying the same contraction operator to both the observed tensor and the reference tensor, the authors obtain two matrices, and the Frobenius norm of their difference serves as the test statistic. The same CLT machinery applies, provided the reference tensor's signal strength exceeds a mild threshold.
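The shape of the matching statistic can be sketched as follows. Note the contraction used here is a deliberately simplified stand-in (a single mode-3 contraction of a 3-way tensor), not the paper's actual $\Phi_{d}$, which produces a symmetric $N\times N$ block matrix; only the "contract both tensors, compare the resulting matrices" structure is being illustrated:

```python
import numpy as np

rng = np.random.default_rng(1)

def contract_to_matrix(T, a):
    # Stand-in for the paper's Phi_d: contract the third mode of a 3-way tensor
    # with a unit vector a, leaving an n1 x n2 matrix.
    return np.einsum('ijk,k->ij', T, a)

def matching_statistic(T_obs, T_ref, a):
    # Squared Frobenius norm of the difference of the two contracted matrices
    diff = contract_to_matrix(T_obs, a) - contract_to_matrix(T_ref, a)
    return float(np.sum(diff**2))

T_ref = rng.standard_normal((6, 7, 8))
a = rng.standard_normal(8)
a /= np.linalg.norm(a)
same = matching_statistic(T_ref, T_ref, a)                               # identical tensors
perturbed = matching_statistic(T_ref + 0.1 * rng.standard_normal((6, 7, 8)), T_ref, a)
```

An exact match drives the statistic to zero, while any perturbation of the observed tensor pushes it upward; the paper's CLT then calibrates how large a value is statistically significant.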

Extensive simulations validate the theoretical CLT (empirical distributions match the Gaussian limit) and demonstrate high power across a range of dimensions, ranks, and SNRs. Real‑data experiments include (a) human action recognition from video clips, where the proposed alignment test successfully discriminates actions using learned template directions, and (b) fMRI classification, where matching tests differentiate patient and control groups. In both cases, the proposed methods achieve comparable or superior performance to deep‑learning classifiers while offering interpretable statistical evidence.

In summary, the authors introduce a novel, mathematically rigorous framework for hypothesis testing on high‑dimensional tensor data. By leveraging tensor contraction, a vector Dyson equation, and linear spectral statistics, they overcome the challenges posed by long‑range dependence in the contracted noise matrix. The resulting tests are computationally feasible, theoretically justified, and empirically effective, filling a gap in the literature where low‑SNR tensor signals render traditional recovery methods ineffective. Future work may explore non‑Gaussian or heteroscedastic noise, multiple spikes, and online tensor streams.
