On the bit-complexity of sparse polynomial multiplication

In this paper, we present fast algorithms for the product of two multivariate polynomials in sparse representation. The bit complexity of our algorithms is studied in detail for various types of coefficients, and we derive new complexity results for power series multiplication in many variables. Our algorithms are implemented and freely available within the Mathemagix software. We show that their theoretical costs are well reflected in practice.


💡 Research Summary

The paper addresses the problem of efficiently multiplying two multivariate polynomials that are given in a sparse representation. While dense algorithms based on multidimensional FFT or naïve term‑by‑term multiplication are well understood, they become impractical when the number of non‑zero terms (N) is far smaller than the size of the full coefficient grid, especially in high‑dimensional settings where the total degree bound (D) may be large. The authors propose a unified framework that combines three classical ideas—Kronecker substitution, evaluation‑interpolation, and modular reduction—into a set of algorithms whose bit‑complexity scales essentially linearly with the number of non‑zero terms and only logarithmically with the size of the coefficients and the degree bound.

First, Kronecker substitution is used to encode a (k)-variate polynomial into a univariate integer polynomial. By choosing a base larger than the maximum degree in each variable, each monomial (\mathbf{x}^{\mathbf{e}} = x_1^{e_1}\dots x_k^{e_k}) is mapped to a unique exponent (E = e_1 + e_2 B + \dots + e_k B^{k-1}). This compression reduces the multivariate problem to a univariate one without expanding the dense coefficient array. The resulting univariate polynomial has degree at most (B^k), which is polynomial in the original degree bound, but the number of non‑zero coefficients remains (N).
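The packing step can be sketched in a few lines. This is a minimal illustration of the encoding described above, not code from the paper; the names `kronecker_pack` and `kronecker_unpack` are hypothetical, and `B` is assumed to strictly exceed every exponent that can occur (including in the product, since exponents add under multiplication).

```python
def kronecker_pack(exponents, B):
    """Map a k-tuple (e_1, ..., e_k) to E = e_1 + e_2*B + ... + e_k*B^(k-1).

    Injective as long as every e_i < B."""
    E = 0
    for e in reversed(exponents):
        assert e < B, "base B must exceed every exponent"
        E = E * B + e
    return E

def kronecker_unpack(E, B, k):
    """Invert kronecker_pack: recover the k base-B digits of E."""
    exps = []
    for _ in range(k):
        E, e = divmod(E, B)
        exps.append(e)
    return tuple(exps)
```

For example, with `B = 10`, the exponent vector `(2, 3, 1)` packs to `2 + 3*10 + 1*100 = 132`, and unpacking `132` recovers `(2, 3, 1)`.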

Second, the authors apply an evaluation‑interpolation scheme on the compressed univariate polynomials. A set of evaluation points ({\alpha_j}) is selected (typically small primes or primitive roots of unity) such that the evaluation map is injective on the set of exponents that actually appear. The polynomials are evaluated at each (\alpha_j), the pointwise products are computed, and the result is interpolated back to obtain the product polynomial in the Kronecker‑encoded form. Because each evaluation point yields an independent integer multiplication, the authors can invoke the fastest known integer multiplication algorithms (Fürer’s algorithm, Harvey‑van der Hoeven’s quasi‑linear method) to achieve a bit‑complexity of (\tilde{O}(N\log M)), where (M) is the maximum absolute value of any coefficient after scaling.
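A drastically simplified variant of this idea uses a single evaluation point, a power of two, so that the whole product reduces to one fast integer multiplication (Python's built-in integers already use sub-quadratic multiplication). This sketch assumes non-negative integer coefficients and a slot width `b` large enough that no coefficient of the product overflows its slot; it is an illustration of the evaluation-then-multiply principle, not the paper's multi-point scheme.

```python
def sparse_mul_via_integer(p, q, b):
    """Multiply sparse polynomials p, q (dicts: exponent -> coefficient)
    by evaluating both at x = 2**b and multiplying the resulting integers.

    Assumes non-negative coefficients and that every coefficient of the
    product fits in b bits; otherwise adjacent slots collide."""
    def evaluate(poly):
        # Evaluation at 2**b is just bit-shifting each coefficient into place.
        return sum(c << (b * e) for e, c in poly.items())

    prod = evaluate(p) * evaluate(q)   # one big-integer multiplication
    result, e = {}, 0
    mask = (1 << b) - 1
    while prod:
        c = prod & mask                # read back one b-bit coefficient slot
        if c:
            result[e] = c
        prod >>= b
        e += 1
    return result
```

For instance, multiplying (1 + 3x^2) by (2x) with `b = 16` yields the dictionary for 2x + 6x^3.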

Third, to handle coefficients that are not plain integers—namely rational numbers and elements of a finite field—the paper introduces a modular reduction layer. For rational coefficients, a common denominator is extracted, turning the problem into integer multiplication followed by a final division. For finite‑field coefficients, the algorithm works modulo several carefully chosen primes and then reconstructs the exact result using the Chinese Remainder Theorem (CRT). This modular approach ensures that the cost of coefficient arithmetic does not dominate the overall complexity; the cost contributed by the modulus size (p) appears only as a logarithmic factor (\log p).
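The CRT reconstruction step can be sketched as follows; this is a standard incremental CRT (the helper names are hypothetical, and `pow(m1, -1, m2)` requires Python 3.8+), shown only to make the modular layer concrete.

```python
def crt_pair(r1, m1, r2, m2):
    """Combine r1 mod m1 and r2 mod m2 (coprime moduli) into the unique
    residue modulo m1*m2, using a modular inverse of m1 mod m2."""
    t = ((r2 - r1) * pow(m1, -1, m2)) % m2
    return r1 + m1 * t, m1 * m2

def crt(residues, moduli):
    """Fold crt_pair over a list of residues and pairwise-coprime moduli."""
    r, m = residues[0], moduli[0]
    for ri, mi in zip(residues[1:], moduli[1:]):
        r, m = crt_pair(r, m, ri, mi)
    return r, m
```

For example, the residues 2 mod 3, 3 mod 5, and 2 mod 7 reconstruct to 23 mod 105. In the algorithm, each coefficient of the product is reconstructed this way from its images modulo the chosen primes.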

The authors provide detailed complexity analyses for three coefficient domains:

  • Integer coefficients: The total bit‑complexity is (\tilde{O}(N\log M + N\log D)). The (\log D) term stems from the size of the Kronecker base and the number of evaluation points required to avoid collisions.
  • Rational coefficients: After clearing denominators, the integer bound applies with (M) enlarged to account for the size of the common denominator, yielding (\tilde{O}(N(\log M + \log D))).
  • Finite‑field coefficients: The complexity becomes (\tilde{O}(N\log p)), where (p) is the characteristic of the field; the dependence on (D) disappears because the evaluation points are chosen directly in the field.

A notable contribution is the treatment of multivariate power‑series multiplication. The authors observe that when multiplying truncated series, the degree bound grows with each multiplication step. They introduce a “multi‑scale evaluation” technique that groups monomials with similar total degree and reuses evaluation points across scales, thereby keeping the number of evaluations proportional to the number of new terms introduced rather than to the full degree bound.
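To make the setting concrete, here is the naive truncated product that the multi-scale technique is designed to improve upon: all monomials of total degree at or above the truncation order are discarded. This baseline is an assumption-level sketch (the paper's grouping-by-scale optimization is not reproduced here), with series represented as dicts from exponent tuples to coefficients.

```python
def truncated_product(p, q, d):
    """Naive product of truncated multivariate power series, keeping only
    terms of total degree < d.  p, q: dicts exponent-tuple -> coefficient.

    Costs O(|p| * |q|) monomial operations; the paper's multi-scale
    evaluation targets the terms actually kept instead."""
    r = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            if sum(e) < d:                      # truncate at total degree d
                r[e] = r.get(e, 0) + c1 * c2
    return r
```

For example, (1 + x)(1 + y) truncated below total degree 2 keeps 1 + x + y and drops the degree-2 term xy.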

Implementation details are described for the Mathemagix computer algebra system, where the algorithms have been integrated as part of the sparse‑polynomial module. The authors benchmark their implementation against several state‑of‑the‑art libraries, including dense FFT‑based multiplication (e.g., FLINT) and existing sparse multiplication routines. Experiments cover a wide range of parameters: dimensions from 2 to 5, numbers of non‑zero terms from (10^3) to (10^6), and coefficient sizes from 32‑bit integers up to 512‑bit integers. The results show speed‑ups ranging from 3× to 12× over dense methods and 2× to 5× over prior sparse approaches, while memory consumption remains linear in (N). In the regime of very large coefficients (hundreds of bits), the advantage of using fast integer multiplication becomes especially pronounced.

The paper concludes that the proposed framework not only establishes new theoretical bounds for the bit‑complexity of sparse multivariate polynomial multiplication but also demonstrates that these bounds are attainable in practice. The authors suggest several avenues for future work: extending the methods to handle non‑commutative polynomial algebras, exploring GPU and FPGA acceleration for the evaluation‑interpolation phase, and adapting the techniques to the multiplication of sparse matrices of polynomials, which arise in symbolic‑numeric hybrid algorithms. Overall, the work provides a solid foundation for high‑performance symbolic computation involving sparse multivariate objects.