The comparison of tree-sibling time consistent phylogenetic networks is graph isomorphism-complete
In a previous work, we gave a metric on the class of semibinary tree-sibling time consistent phylogenetic networks that is computable in polynomial time; in particular, the problem of deciding if two networks of this kind are isomorphic is in P. In this paper, we show that if we remove the semibinarity condition above, then the problem becomes much harder. More precisely, we prove that the isomorphism problem for generic tree-sibling time consistent phylogenetic networks is polynomially equivalent to the graph isomorphism problem. Since the latter is believed to be neither in P nor NP-complete, the chances are that it is impossible to define a metric on the class of all tree-sibling time consistent phylogenetic networks that can be computed in polynomial time.
💡 Research Summary
The paper investigates the computational complexity of the isomorphism problem for tree‑sibling time‑consistent (TSTC) phylogenetic networks when the semibinary restriction is lifted. In earlier work the authors introduced a polynomial‑time computable metric for semibinary TSTC networks, which implied that deciding whether two such networks are isomorphic lies in P. The current study asks what happens if we consider the full class of TSTC networks, where a hybrid (reticulation) node may have more than two parents.
First, the authors recall the definition of a TSTC network. A phylogenetic network is a rooted directed acyclic graph whose leaves are labelled by taxa. A node with indegree at most 1 is a tree node; a node with indegree 2 or more is a hybrid (reticulation) node. The “tree‑sibling” condition requires that every hybrid node has at least one sibling, i.e., a node sharing a parent with it, that is a tree node. The “time‑consistent” condition requires an integer time label on each node such that time strictly increases along every arc into a tree node and is equal at both ends of every arc into a hybrid node, so that the parents of a reticulation coexist in time and no temporal paradox arises. In the semibinary setting each hybrid node has exactly two parents, which dramatically limits the combinatorial possibilities.
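The two conditions can be checked mechanically. The following is a minimal sketch, not code from the paper, and it assumes the definitions usual in the literature: a hybrid node has indegree at least 2, tree‑sibling means every hybrid node has a sibling of indegree 1, and a time labelling must increase strictly along arcs into tree nodes and stay constant along arcs into hybrid nodes. The function names, the arc-list representation, and the example network `ARCS` with labelling `TAU` are all hypothetical.

```python
def parent_child_maps(arcs):
    """Return (parents, children) dicts of sets for a list of (u, v) arcs."""
    parents, children = {}, {}
    for u, v in arcs:
        parents.setdefault(v, set()).add(u)
        parents.setdefault(u, set())
        children.setdefault(u, set()).add(v)
        children.setdefault(v, set())
    return parents, children


def is_tree_sibling(arcs):
    """True if every hybrid node has at least one sibling that is a tree node."""
    parents, children = parent_child_maps(arcs)
    for node, node_parents in parents.items():
        if len(node_parents) < 2:
            continue  # tree node (or root): nothing to check
        # Siblings are the other children of any parent of this hybrid node.
        siblings = set().union(*(children[p] for p in node_parents)) - {node}
        if not any(len(parents[s]) <= 1 for s in siblings):
            return False
    return True


def is_temporal_representation(arcs, tau):
    """True if tau is strictly increasing on arcs into tree nodes
    and constant on arcs into hybrid nodes (assumed definition)."""
    parents, _ = parent_child_maps(arcs)
    for u, v in arcs:
        if len(parents[v]) >= 2:      # arc into a hybrid node
            if tau[u] != tau[v]:
                return False
        elif not tau[u] < tau[v]:     # arc into a tree node
            return False
    return True


# Hypothetical example: hybrid node "h" has parents "a" and "b".
ARCS = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "h"),
        ("b", "h"), ("b", "3"), ("h", "2")]
TAU = {"r": 0, "a": 1, "b": 1, "h": 1, "1": 2, "2": 2, "3": 2}

if __name__ == "__main__":
    print(is_tree_sibling(ARCS), is_temporal_representation(ARCS, TAU))
```

Both checks run in time polynomial in the number of arcs, so membership in the TSTC class is easy to verify once a labelling is given; the hardness discussed in the summary concerns comparing two networks, not recognising one.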
The authors then demonstrate that, without the semibinary constraint, the isomorphism problem becomes as hard as the classic Graph Isomorphism (GI) problem. Their proof proceeds via two polynomial‑time reductions.
- GI → TSTC‑ISO: Given an arbitrary simple graph G, they construct a TSTC network N(G) that encodes G’s adjacency structure while respecting the tree‑sibling and time‑consistent constraints. Each vertex of G is represented by a gadget consisting of a tree node together with a sibling reticulation node; edges of G become directed paths that connect the corresponding gadgets in a way that preserves the original graph’s automorphism group. The construction can be carried out in time polynomial in |V(G)| + |E(G)|, and it holds that G₁ ≅ G₂ if and only if N(G₁) ≅ N(G₂).
- TSTC‑ISO → GI: Conversely, for any TSTC network N they build an undirected graph G(N) that captures the full structure of N. Nodes of N are turned into labelled vertices of G(N); directed parent‑child arcs become undirected edges, and the time‑labeling is encoded by adding auxiliary edges that enforce the same partial order. The tree‑sibling condition is reflected by special “sibling” edges that guarantee any automorphism of G(N) must map tree‑sibling pairs to each other. This transformation is also polynomial‑time and satisfies N₁ ≅ N₂ ⇔ G(N₁) ≅ G(N₂).
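To make the relation N₁ ≅ N₂ used in both reductions concrete, here is a brute-force sketch of what an isomorphism of leaf-labelled networks must satisfy: a bijection on nodes that preserves arcs and sends each leaf to the leaf carrying the same taxon. This is only an illustration under the assumption that taxa are distinct and arcs are not repeated; it is not the authors' construction, it takes exponential time, and the function name and the example networks `N1`, `N2` are hypothetical.

```python
from itertools import permutations


def are_isomorphic(arcs1, labels1, arcs2, labels2):
    """Brute-force isomorphism test for small leaf-labelled networks.

    arcs*:   list of (parent, child) pairs, with no repeated arcs
    labels*: dict mapping each leaf node to its (distinct) taxon
    """
    nodes1 = list({n for arc in arcs1 for n in arc})
    nodes2 = list({n for arc in arcs2 for n in arc})
    if len(nodes1) != len(nodes2) or len(arcs1) != len(arcs2):
        return False
    if sorted(labels1.values()) != sorted(labels2.values()):
        return False

    # Leaves are forced: each leaf must go to the leaf with the same taxon.
    leaf_of_taxon = {taxon: node for node, taxon in labels2.items()}
    fixed = {leaf: leaf_of_taxon[taxon] for leaf, taxon in labels1.items()}

    free1 = [n for n in nodes1 if n not in labels1]
    free2 = [n for n in nodes2 if n not in labels2]
    target = set(arcs2)

    # Try every bijection between the unlabelled (internal) nodes.
    for perm in permutations(free2):
        mapping = dict(fixed)
        mapping.update(zip(free1, perm))
        if {(mapping[u], mapping[v]) for u, v in arcs1} == target:
            return True
    return False


# Hypothetical example: the same network with renamed internal nodes.
N1 = [("r", "a"), ("r", "b"), ("a", "x"), ("a", "h"),
      ("b", "h"), ("b", "z"), ("h", "y")]
L1 = {"x": "1", "y": "2", "z": "3"}
N2 = [("root", "p"), ("root", "q"), ("p", "l1"), ("p", "hy"),
      ("q", "hy"), ("q", "l3"), ("hy", "l2")]
L2 = {"l1": "1", "l2": "2", "l3": "3"}

if __name__ == "__main__":
    print(are_isomorphic(N1, L1, N2, L2))
```

The loop over all permutations of internal nodes is exactly the cost that a polynomial-time method would have to avoid, which is why the complexity of the problem matters.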
Since each reduction is polynomial, the two problems are polynomially equivalent; therefore the isomorphism problem for generic TSTC networks is GI‑complete. Graph isomorphism is in NP, but it is not known to be in P and is widely believed not to be NP‑complete, and the same now applies to the TSTC isomorphism problem. Consequently, the existence of a polynomial‑time computable metric for the entire class of TSTC networks is highly unlikely, because such a metric would give a polynomial‑time solution to the isomorphism problem.
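The step from a metric to an isomorphism test can be spelled out as follows. The symbol $d$ is introduced here only for illustration, and the sketch assumes that a metric on networks is understood up to isomorphism, so that its separation axiom reads:

```latex
\[
  d(N_1, N_2) = 0 \iff N_1 \cong N_2 .
\]
```

Under that assumption, evaluating $d(N_1, N_2)$ once and comparing the result with zero decides TSTC‑ISO. Composing this with the reduction $G \mapsto N(G)$ would then decide whether $G_1 \cong G_2$ in polynomial time, which is why a polynomial‑time metric on all TSTC networks would place graph isomorphism in P.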
The paper concludes by discussing the practical implications for phylogenetics. Many real‑world datasets produce networks that are not semibinary, so the hardness result suggests that exact, efficiently computable distances between arbitrary TSTC networks may be infeasible. Researchers may need to focus on restricted subclasses (e.g., semibinary, bounded reticulation number, or bounded treewidth) where polynomial‑time metrics are known or may still be attainable, or resort to approximation and heuristic methods for broader classes. The work thus connects phylogenetic network theory with classical graph‑theoretic complexity and marks a clear boundary between the cases of network comparison known to be tractable and those that are at least as hard as graph isomorphism.