On the Impact of Sample Size in Reconstructing Noisy Graph Signals: A Theoretical Characterisation
Reconstructing a signal on a graph from noisy observations of a subset of the vertices is a fundamental problem in the field of graph signal processing. This paper investigates how sample size affects reconstruction error in the presence of noise via an in-depth theoretical analysis of the two most common reconstruction methods in the literature, least-squares reconstruction (LS) and graph-Laplacian regularised reconstruction (GLR). Our theorems show that at sufficiently low signal-to-noise ratios (SNRs), under these reconstruction methods we may simultaneously decrease sample size and decrease average reconstruction error. We further show that at sufficiently low SNRs, for LS reconstruction, we have a $Λ$-shaped error curve and, for GLR reconstruction, a sample size of $O(\sqrt{N})$, where $N$ is the total number of vertices, results in lower reconstruction error than near full observation. We present thresholds on the SNRs, $τ$ and $τ_{GLR}$, below which the error is non-monotonic, and illustrate these theoretical results with experiments across multiple random graph models, sampling schemes and SNRs. These results demonstrate that any decision in sample-size choice has to be made in light of the noise levels in the data.
💡 Research Summary
This paper tackles a fundamental yet under‑explored question in graph signal processing: how does the number of observed vertices (the sample size) affect the reconstruction error when the observations are corrupted by noise? While most prior work either assumes noiseless measurements or studies a fixed sampling budget, the authors provide a comprehensive theoretical characterisation of the mean‑squared error (MSE) as a function of sample size for the two most widely used reconstruction schemes—Least‑Squares (LS) and Graph‑Laplacian Regularised (GLR) reconstruction.
Problem setting
- The underlying signal x is a random k‑bandlimited graph signal (i.e., a linear combination of the first k eigenvectors of the combinatorial Laplacian L).
- Observations are y = x + n, where n = σ·ε and ε is zero‑mean with either full‑band covariance I_N or bandlimited covariance Π_bl(K).
- The signal‑to‑noise ratio (SNR) is defined as E
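
The setting above can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes a ring-plus-random-edges graph (chosen only so the graph is connected), Gaussian full-band noise, the textbook LS estimator restricted to the span of the first k Laplacian eigenvectors, and the textbook GLR objective ||x_S - y_S||^2 + mu * x^T L x. The function names, the regularisation weight `mu`, and all numerical values are hypothetical choices made for this example.

```python
import numpy as np

def ring_plus_random_graph(N, p, rng):
    """Adjacency matrix of a ring with extra random edges (connected by construction)."""
    A = np.zeros((N, N))
    idx = np.arange(N)
    A[idx, (idx + 1) % N] = 1.0                    # ring edges keep the graph connected
    extra = np.triu(rng.random((N, N)) < p, k=1)   # random extra edges, upper triangle
    A = np.maximum(A, extra)
    return np.maximum(A, A.T)                      # symmetrise

def combinatorial_laplacian(A):
    """L = D - A, with D the diagonal degree matrix."""
    return np.diag(A.sum(axis=1)) - A

def ls_reconstruct(U_k, S, y_S):
    """Least squares fit of the samples within the span of the first k eigenvectors."""
    coeffs, *_ = np.linalg.lstsq(U_k[S, :], y_S, rcond=None)
    return U_k @ coeffs

def glr_reconstruct(L, S, y_S, mu):
    """Minimise ||x[S] - y_S||^2 + mu * x^T L x via its normal equations."""
    N = L.shape[0]
    MtM = np.zeros((N, N))
    MtM[S, S] = 1.0        # M^T M: ones on the diagonal at sampled vertices
    rhs = np.zeros(N)
    rhs[S] = y_S           # M^T y_S
    return np.linalg.solve(MtM + mu * L, rhs)

# Illustrative values only (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
N, k, sigma, mu = 30, 3, 0.5, 0.1

A = ring_plus_random_graph(N, 0.1, rng)
L = combinatorial_laplacian(A)
eigvals, U = np.linalg.eigh(L)        # eigenvalues in ascending order
U_k = U[:, :k]                        # first k eigenvectors

x = U_k @ rng.standard_normal(k)      # random k-bandlimited signal
S = rng.choice(N, size=10, replace=False)             # sampled vertices
y_S = x[S] + sigma * rng.standard_normal(S.size)      # noisy observations on S

x_ls = ls_reconstruct(U_k, S, y_S)
x_glr = glr_reconstruct(L, S, y_S, mu)
print("LS  MSE:", np.mean((x_ls - x) ** 2))
print("GLR MSE:", np.mean((x_glr - x) ** 2))
```

Repeating the last few lines for different sizes of `S` and values of `sigma`, and averaging over many draws, gives an empirical error-versus-sample-size curve that can be compared against the behaviour the abstract describes.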