Complexity of Splits Reconstruction for Low-Degree Trees
Given a vertex-weighted tree T, the split of an edge xy in T is min{s_x(xy), s_y(xy)} where s_u(uv) is the sum of all weights of vertices that are closer to u than to v in T. Given a set of weighted vertices V and a multiset of splits S, we consider the problem of constructing a tree on V whose splits correspond to S. The problem is known to be NP-complete, even when all vertices have unit weight and the maximum vertex degree of T is required to be no more than 4. We show that the problem is strongly NP-complete when T is required to be a path, the problem is NP-complete when all vertices have unit weight and the maximum degree of T is required to be no more than 3, and it remains NP-complete when all vertices have unit weight and T is required to be a caterpillar with unbounded hair length and maximum degree at most 3. We also design polynomial time algorithms for the variant where T is required to be a path and the number of distinct vertex weights is constant, and the variant where all vertices have unit weight and T has a constant number of leaves. The latter algorithm is not only polynomial when the number of leaves, k, is a constant, but also fixed-parameter tractable when parameterized by k. Finally, we briefly discuss the problem when the vertex weights are not given but can be freely chosen by an algorithm. The considered problem is related to building libraries of chemical compounds used for drug design and discovery. In these inverse problems, the goal is to generate chemical compounds having desired structural properties, as there is a strong correlation between structural properties, such as the Wiener index, which is closely connected to the considered problem, and biological activity.
💡 Research Summary
The paper studies the inverse problem of reconstructing a vertex‑weighted tree from a multiset of edge “splits”. For a tree T with vertex weights w(v)≥0, the split of an edge xy is defined as min{s_x(xy), s_y(xy)} where s_u(uv) is the sum of the weights of all vertices that are strictly closer to u than to v. Given a set V of weighted vertices and a multiset S of splits, the task is to build a tree on V whose edge splits exactly match S. This problem originates from cheminformatics: many molecular descriptors (e.g., the Wiener index) can be expressed in terms of splits, so being able to generate trees with prescribed split patterns is directly relevant to designing libraries of chemical compounds with desired physicochemical properties.
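To make the definition concrete, the following minimal sketch computes the split of every edge of a given vertex-weighted tree. The function names `edge_splits` and `wiener_from_splits` and the input layout are illustrative assumptions, not taken from the paper. The second helper relies on the standard identity that, for unit weights, the Wiener index equals the sum over all edges of the product of the vertex counts on the two sides of the edge, which is s·(n − s) for an edge with split s.

```python
from collections import defaultdict

def edge_splits(weights, edges):
    """Return {(parent, child): split} for a vertex-weighted tree.

    weights: dict mapping vertex -> non-negative weight
    edges:   list of (u, v) pairs forming a tree on the keys of weights
    (Hypothetical helper for illustration only.)
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = sum(weights.values())
    root = next(iter(weights))
    # Iterative DFS: record each vertex's parent and a preorder.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)
    # Accumulate subtree weights, children before parents.
    subtree = dict(weights)
    for u in reversed(order):
        if parent[u] is not None:
            subtree[parent[u]] += subtree[u]
    # Edge (parent[u], u) separates subtree[u] from the rest of the tree.
    return {(parent[u], u): min(subtree[u], total - subtree[u])
            for u in order if parent[u] is not None}

def wiener_from_splits(splits, n):
    """Wiener index of an n-vertex tree with unit weights, from its splits."""
    return sum(s * (n - s) for s in splits)

if __name__ == "__main__":
    path = edge_splits({0: 1, 1: 1, 2: 1, 3: 1}, [(0, 1), (1, 2), (2, 3)])
    print(sorted(path.values()), wiener_from_splits(path.values(), 4))
```

Computing splits from a tree is the easy forward direction and takes linear time; the reconstruction problem studied in the paper asks for the inverse.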
Main Complexity Results
- Strong NP-completeness for paths – The authors reduce the strongly NP-hard 3-Partition problem to split reconstruction restricted to paths. Each integer in the 3-Partition instance becomes a vertex weight, and the prescribed split values are derived from the target sum of each triple. On a path, the split of an edge is the smaller of the total weights on its two sides, that is, a prefix sum or a suffix sum of the weights in path order, so the prescribed splits constrain the partial sums of the weights along the path. In the reduction, a feasible path exists if and only if the original numbers can be partitioned into triples of equal sum. Hence the problem remains strongly NP-complete even when the underlying tree is a simple path.
- NP-completeness for degree-3 trees with unit weights – Using a reduction from 1-in-3-SAT, the paper shows that when all vertices have weight 1 and the maximum degree of the tree is limited to three, the reconstruction problem is still NP-complete. Variables are represented by gadgets that can be placed in one of two configurations (true/false). Each clause gadget connects to three variable gadgets and is constructed so that its split constraint is satisfied only when exactly one of the three incident variables is set to true. The degree bound of three is respected throughout the construction.
- NP-completeness for caterpillars with degree 3 – The authors adapt the 1-in-3-SAT reduction to a caterpillar topology (a central spine with pendant "hairs"). The spine encodes the variables, while each hair encodes a clause. Even though the hairs can be arbitrarily long, the overall maximum degree never exceeds three. This demonstrates that the problem stays hard for a very restricted class of trees that are often used as models for linear polymers with side chains.
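For the path case, the splits are easy to write down: if the weights are listed in path order, the edge after position i has split min(P_i, W − P_i), where P_i is the i-th prefix sum and W the total weight. The sketch below computes these values and, purely for illustration, searches all orderings for one that realises a given split multiset. The names `path_splits` and `brute_force_path` are hypothetical, and the exhaustive search is not the authors' method; its factorial running time simply reflects that no efficient general algorithm is expected.

```python
from collections import Counter
from itertools import accumulate, permutations

def path_splits(order):
    """Splits of the path whose vertex weights, in path order, are `order`."""
    total = sum(order)
    prefix = list(accumulate(order))[:-1]  # one prefix sum per edge
    return [min(p, total - p) for p in prefix]

def brute_force_path(weights, splits):
    """Return some ordering of `weights` realising `splits`, or None.

    Tries every permutation, so it is only usable on tiny inputs.
    """
    target = Counter(splits)
    for order in permutations(weights):
        if Counter(path_splits(order)) == target:
            return list(order)
    return None

if __name__ == "__main__":
    print(path_splits([5, 1, 2]))
    print(brute_force_path([1, 2, 5], [2, 3]))
```

Checking a proposed ordering takes linear time, which is why the problem lies in NP; the difficulty is entirely in finding the ordering.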
Positive Algorithmic Results
- Path case with a constant number of distinct weights – When the tree must be a path and the vertex weights take only a constant number k of distinct values, a dynamic-programming algorithm runs in polynomial time. The DP state records the position along the path and the multiset of remaining weights, stored as one counter per distinct value. There are at most (n+1)^k such multisets, so the state space, and with it the running time, is polynomial in n whenever k is constant.
- Constant-leaf case for unit-weight trees – If all vertices have weight 1 and the number of leaves k is a fixed constant, the authors present an FPT algorithm parameterized by k. The tree is decomposed into a "core" formed by the internal vertices and k leaf-subtrees. All possible split assignments for each leaf-subtree are enumerated (a number depending only on k), and a bounded search tree or integer-linear-programming formulation is used to combine them consistently. The total running time is f(k)·poly(n), where f depends only on k, so the exponential part of the cost is confined to the parameter and the algorithm is polynomial for every fixed k.
Weight‑Free Variant
The paper briefly discusses a version where the algorithm may assign vertex weights arbitrarily rather than being given them in advance. In this setting the question becomes: does there exist any weight assignment that yields the prescribed split multiset? The authors conjecture that the problem remains NP‑hard in general, but note that if the split values are highly constrained (e.g., all equal or bounded within a narrow interval) polynomial‑time algorithms might be possible, leaving this as an open direction.
Implications and Applications
The results delineate a clear boundary between tractable and intractable instances of split reconstruction. Even severe structural restrictions (maximum degree three, caterpillar shape, or path topology) do not suffice to make the problem easy, underscoring the intrinsic combinatorial difficulty. Conversely, natural parameters such as the number of distinct weight values or the number of leaves provide avenues for efficient algorithms, which is encouraging for practical applications. In cheminformatics, many target molecules can be modeled as trees with a limited number of functional groups (leaves) or a small palette of atomic weights, so the FPT and DP algorithms could plausibly be used to generate candidate structures that satisfy desired Wiener-index-derived split patterns.
In summary, the paper establishes strong hardness results for split reconstruction under low‑degree and highly constrained tree topologies, while also delivering polynomial‑time and fixed‑parameter algorithms for realistic parameter regimes. This dual contribution advances both the theoretical understanding of graph‑reconstruction problems and their practical utility in the design of chemically relevant tree‑like structures.