Adaptive Sparse Möbius Transforms for Learning Polynomials


We consider the problem of exactly learning an $s$-sparse real-valued Boolean polynomial of degree $d$ of the form $f:\{0,1\}^n \rightarrow \mathbb{R}$. This problem corresponds to decomposing functions in the AND basis and is known as taking a Möbius transform. While the analogous problem for the parity basis (Fourier transform) $f:\{-1,1\}^n \rightarrow \mathbb{R}$ is well-understood, the AND basis presents a unique challenge: the basis vectors are coherent, precluding standard compressed sensing methods. We overcome this challenge by exploiting adaptive group testing to provide a constructive, query-efficient implementation of the Möbius transform (also known as Möbius inversion) for sparse functions. We present two algorithms based on this insight. The Fully-Adaptive Sparse Möbius Transform (FASMT) uses $O(sd \log(n/d))$ adaptive queries in $O((sd + n) sd \log(n/d))$ time, which we show is near-optimal in query complexity. We also present the Partially-Adaptive Sparse Möbius Transform (PASMT), which uses $O(sd^2\log(n/d))$ queries, trading a factor of $d$ in query count to reduce the number of adaptive rounds to $O(d^2\log(n/d))$, with no dependence on $s$. When applied to hypergraph reconstruction from edge-count queries, our results improve upon baselines by avoiding the combinatorial explosion in the rank $d$. We demonstrate the practical utility of our method for hypergraph reconstruction by applying it to learning real hypergraphs in simulations.


💡 Research Summary

The paper addresses the exact learning of an $s$‑sparse real‑valued Boolean polynomial of degree $d$, i.e., a function $f:\{0,1\}^n\to\mathbb{R}$ expressed in the AND (Möbius) basis. Unlike the well‑studied Fourier (parity) basis, the AND basis is highly coherent because basis functions corresponding to nested variable sets overlap, which prevents the use of standard compressed‑sensing techniques. To overcome this, the authors exploit the connection between the Möbius transform and group testing. In the additive query model, a query $x$ returns $f(x)=\sum_{k\le x}F(k)$, where $F(k)$ are the Möbius coefficients. The condition $k\le x$ is equivalent to the group‑testing condition $(\neg x)^{\top}k=0$ over the Boolean semiring (no coordinate of $k$ is set where $x$ is zero), so queries can be designed to act as group‑testing measurements.
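The additive query model and the group‑testing form of $k\le x$ can be made concrete with a small sketch. Everything below is illustrative: the function names (`query`, `contained_group_testing`, `dense_mobius`) and the toy coefficients are assumptions, and the dense inversion shown uses all $2^n$ queries, which is the baseline the sparse algorithms are designed to avoid, not the paper's method.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def query(F, x):
    """Additive query: f(x) = sum of F[k] over all k contained in x."""
    return sum(c for k, c in F.items() if k <= x)

def contained_group_testing(k, x, n):
    """Group-testing form of k <= x: the complement of x misses k entirely."""
    not_x = frozenset(range(n)) - x
    return not (not_x & k)

def dense_mobius(f, n):
    """Textbook Moebius inversion: F(k) = sum_{x <= k} (-1)^(|k|-|x|) f(x)."""
    F = {}
    for k in subsets(range(n)):
        val = sum((-1) ** (len(k) - len(x)) * f(x) for x in subsets(k))
        if val != 0:
            F[k] = val
    return F

# Toy 3-sparse polynomial on n = 4 variables (hypothetical coefficients):
# f(x) = 0.5 + 2*x0*x1 - x2
n = 4
F_true = {frozenset(): 0.5, frozenset({0, 1}): 2.0, frozenset({2}): -1.0}
F_rec = dense_mobius(lambda x: query(F_true, x), n)
```

Here sets of variable indices stand in for binary vectors, so `k <= x` is Python's subset test. The sparse algorithms summarized here aim to recover the same coefficients with far fewer than $2^n$ queries.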

Two algorithms are proposed:

  1. Fully‑Adaptive Sparse Möbius Transform (FASMT).
    This algorithm performs a depth‑first search reminiscent of generalized binary splitting. For each active “bin” (a set of candidate coefficient indices) it adaptively selects a test vector, queries the appropriate transformed input, and subtracts contributions of already discovered coefficients. Each non‑zero coefficient is isolated after $O(d\log (n/d))$ queries, leading to a total query complexity of $O(s d \log (n/d))$ and a running time of $O((sd+n)sd\log (n/d))$. The adaptivity is fully sequential, with the number of rounds proportional to $s$.

  2. Partially‑Adaptive Sparse Möbius Transform (PASMT).
    PASMT fixes a $d$‑disjunct matrix $H\in\{0,1\}^{n\times b}$ with $b=O(d^{2}\log n)$ columns in advance. In each of $b$ rounds, all currently active bins are split simultaneously using the same column $h_t$ of $H$. The $d$‑disjunct property guarantees that after $b$ rounds each surviving bin contains exactly one non‑zero coefficient of degree at most $d$. Consequently, PASMT uses $O(s d^{2}\log (n/d))$ queries but only $O(d^{2}\log (n/d))$ adaptive rounds, independent of $s$.
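To illustrate the group‑testing primitive that both algorithms build on, here is a minimal sketch for the simplest special case only: a single monomial ($s=1$), $f(x)=c\prod_{i\in k}x_i$ with $c\neq 0$. Zeroing a set $T$ of variables makes the query vanish exactly when $T$ meets $k$, so each query is a group test and binary splitting finds one variable of $k$ in about $\log_2 n$ queries. This is not FASMT or PASMT themselves; the function name and the test polynomial are assumptions made for illustration.

```python
def find_monomial_support(f, n):
    """Recover k for f(x) = c * prod_{i in k} x_i (c != 0) by binary splitting.

    Returns (sorted support, number of queries used).
    """
    queries = 0

    def hits(T):
        # Group test: zero the variables in T; f vanishes iff T intersects k.
        nonlocal queries
        queries += 1
        x = [1] * n
        for i in T:
            x[i] = 0
        return f(tuple(x)) == 0

    support = []
    remaining = list(range(n))
    # Each pass finds one variable of k that is still in `remaining`.
    while remaining and hits(remaining):
        candidates = remaining
        while len(candidates) > 1:
            half = candidates[: len(candidates) // 2]
            # Invariant: `candidates` always contains a variable of k.
            candidates = half if hits(half) else candidates[len(candidates) // 2:]
        found = candidates[0]
        support.append(found)
        remaining = [i for i in remaining if i != found]
    return sorted(support), queries

# Hypothetical target: f(x) = 2.5 * x3 * x7 * x12 on n = 16 variables.
def example_f(x):
    return 2.5 if all(x[i] for i in (3, 7, 12)) else 0.0

support, used = find_monomial_support(example_f, 16)
```

Each recovered variable costs one test of the remaining pool plus at most $\lceil\log_2 n\rceil$ halving tests, so the sketch stays within $(d+1)(1+\lceil\log_2 n\rceil)$ queries for a degree‑$d$ monomial. Handling $s>1$ coefficients, where contributions can overlap or cancel, is what the bin bookkeeping in FASMT and PASMT is for.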

The authors prove an information‑theoretic lower bound of $\Omega\!\bigl(\frac{s d\log (n/d)}{\log s}\bigr)$ on any algorithm’s query complexity, showing that FASMT is near‑optimal up to a $\log s$ factor. They also establish that PASMT’s round complexity is essentially optimal given the fixed test matrix constraint.
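For a feel of how these bounds compare, the sketch below evaluates only the leading terms of the stated asymptotic expressions, with all hidden constants set to 1 and logarithms taken base 2. Both choices are assumptions, so the numbers indicate relative scaling, not actual query counts. The helper name `leading_terms` is hypothetical, and $s\ge 2$ is required so that $\log s$ is non‑zero.

```python
import math

def leading_terms(n, s, d):
    """Leading terms of the stated bounds with constants dropped (assumption)."""
    log_term = math.log2(n / d)
    return {
        "fasmt_queries": s * d * log_term,                # O(s d log(n/d))
        "pasmt_queries": s * d**2 * log_term,             # O(s d^2 log(n/d))
        "pasmt_rounds": d**2 * log_term,                  # O(d^2 log(n/d))
        "lower_bound": s * d * log_term / math.log2(s),   # Omega(s d log(n/d) / log s)
    }

terms = leading_terms(n=1024, s=16, d=4)
gap = terms["fasmt_queries"] / terms["lower_bound"]  # equals log2(s)
```

Under these assumptions the gap between the FASMT term and the lower‑bound term is exactly $\log_2 s$, and the PASMT query term exceeds the FASMT term by the factor $d$ that it trades for fewer adaptive rounds.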

A major application discussed is hypergraph reconstruction from edge‑count (additive) queries. In this setting, each hyperedge corresponds to a monomial of degree at most $d$, and a query returns the total weight of hyperedges contained in the queried vertex subset. Prior work using Boolean “edge‑detection” or Fourier‑based methods suffers exponential dependence on $d$ or $s$. By applying FASMT or PASMT, any hypergraph with $n$ vertices, $s$ hyperedges, and maximum edge size $d$ can be exactly recovered with $O(s d\log n)$ or $O(s d^{2}\log n)$ additive queries respectively. Simulations on hypergraphs derived from digital logic circuits and metabolic networks demonstrate that the algorithms achieve high reconstruction accuracy while scaling nearly linearly in $n$ and $s$, confirming practical viability.
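The correspondence between edge‑count queries and polynomial evaluation can be sketched as follows. The toy hypergraph, its weights, and the function names are assumptions for illustration. `edge_weight` checks a single candidate hyperedge by inclusion‑exclusion over its subsets, which costs $2^{|e|}$ queries per candidate and is shown only to make the correspondence explicit, not as the paper's reconstruction procedure.

```python
from itertools import combinations
from math import prod

# Hypothetical weighted hypergraph on vertices 0..4 (edge -> weight).
edges = {
    frozenset({0, 1, 2}): 1.0,
    frozenset({2, 3}): 1.0,
    frozenset({1, 4}): 3.0,
}

def edge_count(S):
    """Additive query: total weight of hyperedges fully contained in S."""
    S = frozenset(S)
    return sum(w for e, w in edges.items() if e <= S)

def polynomial(x):
    """Same quantity as a Boolean polynomial: one monomial per hyperedge."""
    return sum(w * prod(x[i] for i in e) for e, w in edges.items())

def edge_weight(candidate):
    """Moebius coefficient of one candidate edge via inclusion-exclusion."""
    cand = sorted(candidate)
    total = 0.0
    for r in range(len(cand) + 1):
        for sub in combinations(cand, r):
            total += (-1) ** (len(cand) - r) * edge_count(sub)
    return total

# Querying the vertex subset {0, 1, 2, 3} is evaluating f at its indicator vector.
same = edge_count({0, 1, 2, 3}) == polynomial((1, 1, 1, 1, 0))
```

A candidate that is not a hyperedge, such as `{1, 2, 4}`, gets weight zero even though it contains the hyperedge `{1, 4}`, because the inclusion‑exclusion sum cancels the contribution of the smaller edge.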

In summary, the paper introduces a novel, group‑testing‑driven framework for learning sparse Boolean polynomials in the AND basis, delivering query‑optimal (up to logarithmic factors) and computationally efficient algorithms. The work bridges a gap between theoretical learning of non‑orthogonal representations and practical combinatorial reconstruction tasks, offering both strong theoretical guarantees and empirical validation.

