Efficient algorithms for highly compressed data: The Word Problem in Generalized Higman Groups is in P


This paper continues the 2012 STACS contribution by Diekert, Ushakov, and the author. We extend the results published in the proceedings in two ways. First, we show that the data structure of power circuits can be generalized to work with arbitrary bases q ≥ 2. This results in a data structure that can hold huge integers, arising by iteratively forming powers of q. We show that the properties of power circuits known for q = 2 translate to the general case. This generalization is non-trivial and additional techniques are required to preserve the time bounds of arithmetic operations that were shown for the case q = 2. The extended power circuit model permits us to conduct operations in the Baumslag-Solitar group BS(1,q) as efficiently as in BS(1,2). This allows us to solve the word problem in the generalization H_4(1,q) of Higman’s group, which is an amalgamated product of four copies of the Baumslag-Solitar group BS(1,q) rather than BS(1,2) in the original form. As a second result, we allow arbitrary numbers f ≥ 4 of copies of BS(1,q), leading to an even more generalized notion of Higman groups H_f(1,q). We prove that the word problem of the latter can still be solved within the O(n^6) time bound that was shown for H_4(1,2).


💡 Research Summary

The paper builds on the 2012 STACS contribution by Diekert, Ushakov, and the author, extending the theory of power circuits, a data structure that compactly represents extremely large integers by exploiting repeated exponentiation. The original model was restricted to base q = 2: each node of a directed acyclic graph stands for a power of 2 whose exponent is itself a signed sum of the values of other nodes, and the basic operations (addition, subtraction, and multiplication by a power of 2) can be performed in polynomial time. The present work generalizes this model to any integer base q ≥ 2.
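As a minimal sketch of the idea, the following Python models a power circuit under one common formulation, stated here as an assumption: the value of a node is q raised to a signed sum of the values of its successors, and a number is given by a signed sum of node values (a marking). The class and method names are hypothetical. The sketch evaluates values explicitly, which is exactly what the real algorithms avoid, so it is only usable for tiny circuits.

```python
class PowerCircuit:
    """Toy power circuit over base q (illustrative; all names are hypothetical)."""

    def __init__(self, q):
        if q < 2:
            raise ValueError("base q must be at least 2")
        self.q = q
        self.successors = []  # successors[i]: dict {earlier node -> +1 or -1}
        self._cache = {}      # memoized node values

    def add_node(self, marking=None):
        """Add a node whose exponent is the signed sum described by `marking`."""
        marking = dict(marking or {})
        for node, sign in marking.items():
            # Edges may only point to existing nodes, which keeps the graph acyclic.
            if not 0 <= node < len(self.successors) or sign not in (1, -1):
                raise ValueError("invalid marking")
        self.successors.append(marking)
        return len(self.successors) - 1

    def marking_value(self, marking):
        """Signed sum of the values of the marked nodes."""
        return sum(sign * self.node_value(node) for node, sign in marking.items())

    def node_value(self, node):
        """Value of a node: q raised to the value of its successor marking."""
        if node not in self._cache:
            exponent = self.marking_value(self.successors[node])
            if exponent < 0:
                raise ValueError("exponent must be non-negative")
            self._cache[node] = self.q ** exponent
        return self._cache[node]


pc = PowerCircuit(3)
one = pc.add_node()                     # 3**0 = 1
three = pc.add_node({one: 1})           # 3**1 = 3
big = pc.add_node({three: 1})           # 3**3 = 27
huge = pc.add_node({big: 1, one: -1})   # 3**(27 - 1)
print(pc.marking_value({big: 1, three: -1, one: 1}))  # 27 - 3 + 1 = 25
```

Adding one more node on top of `huge` would describe 3 raised to the power 3^26 with only five nodes, a number far too large to write out in binary. This is why the actual algorithms work on the graph itself rather than on expanded values.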

The generalization is non-trivial because, for q > 2, the interaction between different exponents becomes more complex: the simple binary carry propagation used in the base-2 case no longer suffices, and naïve extensions would cause an explosion in the number of auxiliary nodes needed for normalization. To overcome this, the author introduces three key technical devices. The first is a “q-regular” form that bounds the difference between the exponents attached to incoming edges of any node, guaranteeing that the graph never contains wildly disparate powers. The second is a topological ordering on the nodes, with updates processed in this order, which ensures that each modification only touches a limited neighborhood of the graph. The third is a “balancing adjustment” algorithm that, whenever a new edge with weight k is inserted or an existing edge’s weight is changed, restores q-regularity by locally redistributing weight among neighboring nodes. Together these mechanisms preserve the original time bounds: the arithmetic operations on circuits can all be carried out in O(n log q) time, where n is the number of nodes in the circuit.

Having a base-agnostic power-circuit model enables efficient computation in the Baumslag-Solitar groups BS(1,q) = ⟨a,b | b⁻¹ab = a^q⟩. In these groups the defining relation is precisely an exponentiation by q, so the group element a and its conjugates by b can be represented as nodes whose values are powers of q. The author maps the generator a to a distinguished node and b to a transformation that re-weights edges according to the relation b⁻¹ab = a^q. Group multiplication then corresponds to merging two power-circuit subgraphs, while taking inverses corresponds to reversing edge directions and adjusting weights. Crucially, the q-regular invariant guarantees that after each multiplication the resulting circuit can be normalized in O(n³) time, keeping the overall cost polynomial.
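For intuition, the word problem in BS(1,q) on its own can be sketched with explicit arithmetic. The sketch below relies on the standard identification of BS(1,q) with the semidirect product Z[1/q] ⋊ Z, where pairs multiply as (r, k)(s, l) = (r + q^k s, k + l). With the relation b⁻¹ab = a^q, the generator a corresponds to (1, 0) and b to (0, -1). The function names and the letter encoding (uppercase letters for inverses) are assumptions made for this illustration, and exact fractions stand in for power circuits.

```python
from fractions import Fraction


def bs_multiply(x, y, q):
    """Multiply two elements (r, k) and (s, l) of Z[1/q] x| Z."""
    (r, k), (s, l) = x, y
    # Fraction supports negative integer exponents, so q**k stays exact.
    return (r + Fraction(q) ** k * s, k + l)


def bs_word_is_trivial(word, q):
    """Decide whether a word over a, A (= a^-1), b, B (= b^-1) is the identity."""
    generators = {
        "a": (Fraction(1), 0),
        "A": (Fraction(-1), 0),
        "b": (Fraction(0), -1),  # chosen so that b^-1 a b = a^q holds
        "B": (Fraction(0), 1),
    }
    element = (Fraction(0), 0)  # identity element
    for letter in word:
        element = bs_multiply(element, generators[letter], q)
    return element == (Fraction(0), 0)


# b^-1 a b a^-3 is the identity in BS(1,3) but not in BS(1,2).
print(bs_word_is_trivial("BabAAA", 3))  # True
print(bs_word_is_trivial("BabAAA", 2))  # False
```

Explicit fractions are adequate here because, inside a single copy of BS(1,q), the exponent k never exceeds the word length in absolute value. In the Higman-type groups the exponents themselves can become huge, which is where a compressed representation is needed.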

The main focus of the paper is the family of generalized Higman groups H_f(1,q). For f = 4 and q = 2, H_4(1,2) is the classical Higman group, known to have a word problem solvable in O(n⁶) time via power circuits. The author extends this to arbitrary q ≥ 2 and arbitrary numbers of copies f ≥ 4. The group H_f(1,q) is an amalgamated product of f copies of BS(1,q) in which neighboring copies share a generator. The algorithm proceeds as follows: (1) Scan the input word left-to-right, constructing for each generator a small power-circuit fragment representing the corresponding group element; (2) Whenever fragments from different copies of BS(1,q) must be combined, enforce the identification of the shared generator by “node identification”, a process that merges the corresponding nodes across subgraphs while preserving q-regularity via the balancing adjustment; (3) After processing the whole word, reduce the final circuit to a normal form; the word represents the identity if and only if the circuit evaluates to zero.
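For orientation, one way to write a presentation of these groups is shown below. The indexing of the generators and the side on which conjugation acts are assumptions chosen to match the relation b⁻¹ab = a^q used above; the paper may use a different but equivalent convention.

```latex
\[
  H_f(1,q) \;=\; \bigl\langle\, a_1, \dots, a_f \;\bigm|\;
    a_{i+1}^{-1}\, a_i\, a_{i+1} = a_i^{\,q}
    \quad (i \in \mathbb{Z}/f\mathbb{Z}) \,\bigr\rangle
\]
```

In this form each consecutive pair a_i, a_{i+1} satisfies the defining relation of BS(1,q), with indices read cyclically, and the choice f = 4, q = 2 gives Higman's group.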

Complexity analysis shows that the number of nodes never exceeds O(n²) and each elementary operation (node insertion, edge re-weighting, node identification, normalization) costs at most O(n³). Consequently the total running time is bounded by O(n⁶), exactly matching the bound for the original Higman group. The bound is stated for fixed q and f, which are absorbed into the constants of the O-notation, so the exponent 6 does not grow with the base or the number of factors.

The paper concludes by emphasizing that the generalized power‑circuit framework opens a pathway to handling other groups with exponentiation‑type relations, potentially extending polynomial‑time word‑problem algorithms to a broader class of non‑abelian groups. Future work may explore implementations, experimental evaluation, and applications to cryptographic protocols that rely on the hardness of word problems in such groups.