Nearly Optimal Sparse Fourier Transform
We consider the problem of computing the k-sparse approximation to the discrete Fourier transform of an n-dimensional signal. We show:

- An O(k log n)-time randomized algorithm for the case where the input signal has at most k non-zero Fourier coefficients, and
- An O(k log n log(n/k))-time randomized algorithm for general input signals.

Both algorithms achieve o(n log n) time, and thus improve over the Fast Fourier Transform, for any k = o(n). They are the first known algorithms that satisfy this property. Moreover, if one assumes that the Fast Fourier Transform is optimal, the algorithm for the exactly k-sparse case is optimal for any k = n^{Ω(1)}. We complement our algorithmic results by showing that any algorithm for computing the sparse Fourier transform of a general signal must use at least Ω(k log(n/k)/log log n) signal samples, even if it is allowed to perform adaptive sampling.
💡 Research Summary
The paper addresses the problem of computing a k‑sparse approximation of the discrete Fourier transform (DFT) of an n‑dimensional signal, a task that arises in many areas such as signal processing, data compression, and scientific computing. The authors present two randomized algorithms that run in sub‑linear time when k = o(n), thereby beating the classic Fast Fourier Transform (FFT) which requires Θ(n log n) operations.
The first algorithm handles the "exactly k‑sparse" case, in which the signal's Fourier spectrum contains at most k non‑zero coefficients. By applying a random permutation to the frequency domain and then hashing frequencies into O(k) buckets, the algorithm isolates each large coefficient with high probability. A carefully designed flat‑window filter suppresses contributions from other frequencies within each bucket. To reduce the impact of collisions, the process is repeated with O(log n) independent hash functions, and the per‑bucket estimates are combined robustly (e.g., by taking medians, which tolerate the occasional bad hash). The total number of time‑domain samples accessed is O(k log n), and the overall running time is also O(k log n). The algorithm succeeds with probability at least 1 − δ for any constant δ > 0, and for k polynomial in n its sample complexity is within an O(log log n) factor of the lower bound.
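The bucketization step rests on a standard aliasing identity: subsampling a signal in time folds its frequencies together modulo the number of samples kept, which acts as a hash of the n frequencies into B buckets. The sketch below (the name `hash_to_buckets` is ours, not the paper's) illustrates only this aliasing step; the actual algorithm additionally applies the random spectral permutation and the flat‑window filter described above.

```python
import numpy as np

def hash_to_buckets(x, B):
    """Alias the n frequencies of x into B buckets (B must divide n).

    Keeping every (n//B)-th time sample and taking a B-point DFT folds
    frequency f into bucket f mod B. The factor n//B undoes the DFT's
    amplitude scaling, so an isolated coefficient lands in its bucket
    with its original value.
    """
    n = len(x)
    L = n // B
    return np.fft.fft(x[::L]) * L

# A 2-sparse spectrum over n = 16 frequencies, hashed into B = 4 buckets:
# frequency 3 lands in bucket 3 mod 4 = 3, frequency 9 in bucket 9 mod 4 = 1.
n, B = 16, 4
X = np.zeros(n, dtype=complex)
X[3] = 5.0
X[9] = 2 + 1j
x = np.fft.ifft(X)

buckets = hash_to_buckets(x, B)
```

Note that the sketch touches only B = 4 of the 16 time samples, which is the source of the sublinear sample complexity; the price is that frequencies sharing a bucket collide, which is exactly what the random permutation and the repetitions are there to disentangle.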
The second algorithm extends the technique to arbitrary signals, aiming to recover the k largest Fourier coefficients (in magnitude) up to a prescribed error tolerance. It employs a multistage refinement strategy: an initial coarse pass identifies a superset of candidate frequencies, and subsequent passes progressively narrow this set by adjusting the sampling rate and bucket size. Each stage uses a new random hash and a tighter filter, thereby reducing collision probability and improving the accuracy of the amplitude estimates. The total runtime is O(k log n log(n/k)), and the number of samples required is O(k log n log(n/k)). This algorithm also achieves a success probability of 1 − δ, with error guarantees that are comparable to those of the best known sparse recovery methods.
Beyond algorithmic contributions, the authors prove a fundamental lower bound on the number of signal samples needed for any algorithm (adaptive or non‑adaptive) that computes a k‑sparse Fourier approximation of a general signal. They show that at least Ω(k log(n/k)/log log n) samples are necessary. This lower bound matches the sample complexity of their second algorithm up to a polylogarithmic factor, establishing near‑optimality in both time and sample usage.
Finally, under the widely believed conjecture that the FFT is optimal for dense transforms, the authors argue that their O(k log n) algorithm for the exactly k‑sparse case is also optimal for any k = n^{Ω(1)}. In other words, no algorithm can achieve a substantially better asymptotic runtime for this regime without violating the assumed optimality of the FFT.
Overall, the paper delivers the first known algorithms that compute a sparse Fourier transform in o(n log n) time for any non‑trivial sparsity level, provides rigorous probabilistic guarantees, and establishes matching lower bounds. These results have immediate practical relevance for applications where signals are naturally sparse or can be approximated sparsely, offering a concrete pathway to replace the FFT with faster, sample‑efficient methods in large‑scale data analysis.