Unique perfect phylogeny is NP-hard


We answer, in the affirmative, the following question proposed by Mike Steel as a $100 challenge: “Is the following problem NP-hard? Given a ternary phylogenetic X-tree T and a collection Q of quartet subtrees on X, is T the only tree that displays Q?”


💡 Research Summary

The paper addresses a question posed by Mike Steel as a $100 challenge: given a ternary phylogenetic X‑tree T and a collection Q of quartet subtrees on the same leaf set X, is T the unique tree that displays all quartets in Q? The authors prove that this decision problem is NP‑hard.

First, the authors formalize the problem. A quartet is a tree on four taxa that specifies a particular split, written ab|cd when it separates the pair {a, b} from the pair {c, d}; a phylogenetic tree T displays a set Q of quartets if every quartet in Q is obtained by restricting T to the four taxa involved. The “unique perfect phylogeny” problem takes a tree T and a set Q as input and asks whether T is, up to isomorphism, the only tree that displays Q. The existence version (does any tree display Q?) is the quartet compatibility problem, which is known to be NP‑complete; the complexity of the uniqueness version, in which one displaying tree is already given, had remained open.
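The definition of "displays" can be made concrete with a small sketch. It assumes an unrooted tree given as an edge list and a quartet ab|cd given as a pair of pairs; the function names and the example tree are illustrative and do not come from the paper. It relies on the standard fact that a tree displays ab|cd exactly when the path from a to b and the path from c to d share no vertex.

```python
from collections import deque

def build_adjacency(edges):
    """Turn an undirected edge list into an adjacency map."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def tree_path(adj, start, goal):
    """Return the unique path between two vertices of a tree."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def displays(adj, quartet):
    """True if the tree displays the quartet ((a, b), (c, d)), i.e. ab|cd."""
    (a, b), (c, d) = quartet
    return set(tree_path(adj, a, b)).isdisjoint(tree_path(adj, c, d))

def displays_all(adj, quartets):
    """True if the tree displays every quartet in the collection."""
    return all(displays(adj, q) for q in quartets)

# Hypothetical ternary tree on X = {a, b, c, d, e} with internal nodes u, v, w.
edges = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"),
         ("v", "w"), ("d", "w"), ("e", "w")]
adj = build_adjacency(edges)

print(displays(adj, (("a", "b"), ("c", "d"))))  # ab|cd
print(displays(adj, (("a", "c"), ("b", "d"))))  # ac|bd
```

Checking that a given T displays Q is therefore cheap; the difficulty the paper addresses lies in ruling out every other tree.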

To establish NP‑hardness, the authors construct a polynomial‑time reduction from the classic NP‑complete problem 3‑SAT. The reduction proceeds in several stages. For each Boolean variable they create a “choice gadget” inside the ternary tree: an internal node with two possible local configurations, each corresponding to assigning the variable TRUE or FALSE. For each clause they introduce a “clause gadget” consisting of a carefully designed set of quartets that can be simultaneously satisfied only if at least one of its three literals is set to true. The key technical device is the notion of a quartet conflict: two quartets that involve the same four leaves but prescribe incompatible splits. By arranging the quartets so that any conflict forces a specific configuration of the adjacent choice gadgets, the reduction forces consistency across the whole tree.
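The notion of a quartet conflict described above (two quartets on the same four leaves with incompatible splits) can be sketched as follows. This is only an illustration of the definition, not the gadget construction from the paper; the representation of a quartet as a pair of pairs and the helper names are assumptions.

```python
def normalize(quartet):
    """Represent ab|cd so that ab|cd, ba|cd and cd|ab all compare equal."""
    (a, b), (c, d) = quartet
    return frozenset({frozenset({a, b}), frozenset({c, d})})

def leaf_set(split):
    """Return the four leaves a normalized quartet is defined on."""
    return frozenset().union(*split)

def find_conflicts(quartets):
    """Group quartets by leaf set and keep groups with more than one split."""
    by_leaves = {}
    for split in map(normalize, quartets):
        by_leaves.setdefault(leaf_set(split), set()).add(split)
    return {ls: splits for ls, splits in by_leaves.items() if len(splits) > 1}

# Hypothetical input: ab|cd and ac|bd conflict; dc|ba is just ab|cd again.
quartets = [
    (("a", "b"), ("c", "d")),
    (("a", "c"), ("b", "d")),
    (("d", "c"), ("b", "a")),
    (("a", "b"), ("c", "e")),
]
conflicts = find_conflicts(quartets)
for leaves, splits in conflicts.items():
    print(sorted(leaves), "has", len(splits), "incompatible splits")
```

A collection containing such a conflict cannot be displayed by any tree, since a tree restricted to four leaves induces at most one of the three possible splits.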

The authors prove two lemmas. Lemma 1 shows that if the original 3‑SAT instance is satisfiable, then there exists a unique assignment of the choice gadgets that satisfies all clause gadgets, and consequently the constructed tree T displays every quartet in Q and no other tree can do so. Lemma 2 proves the converse: if T is the unique tree displaying Q, then the configuration of the choice gadgets yields a satisfying assignment for the original formula. Together the two lemmas show that the 3‑SAT instance is satisfiable if and only if T is the unique tree displaying Q.

Because the reduction runs in polynomial time, the decision problem “Is T the only tree that displays Q?” is at least as hard as 3‑SAT, which establishes NP‑hardness. The paper also discusses the implications of this result. In practical phylogenetics, quartet data are often incomplete or noisy, and uniqueness of a perfect phylogeny is a desirable property for downstream evolutionary inference. The NP‑hardness result means that no algorithm can check uniqueness in polynomial time on all inputs unless P = NP. Nevertheless, the authors point out that special cases (such as bounded degree trees, limited numbers of quartets, or quartets derived from highly constrained biological models) may admit polynomial‑time algorithms or effective heuristics.
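To see why uniqueness is expensive to check naively, the following sketch counts, by exhaustive enumeration, the binary (ternary) trees on a small leaf set that display a quartet collection. It is a brute-force illustration and not a method from the paper; all names are hypothetical. Restricting the search to binary trees is an assumption of the sketch: it suffices when the candidate T is ternary, because any non-binary tree displaying Q has several binary refinements that also display Q. The number of trees enumerated is (2n-5)!! for n leaves, so this is usable only for very small inputs.

```python
from itertools import count

def all_binary_trees(leaves):
    """Yield the edge list of every unrooted binary tree on >= 3 leaves."""
    leaves = list(leaves)
    ids = count()

    def fresh():
        # Internal nodes get tuple labels so they cannot clash with leaf names.
        return ("internal", next(ids))

    def extend(edges, remaining):
        if not remaining:
            yield edges
            return
        leaf, rest = remaining[0], remaining[1:]
        # Attach the next leaf by subdividing each existing edge in turn.
        for i, (u, v) in enumerate(edges):
            mid = fresh()
            new_edges = edges[:i] + edges[i + 1:] + [(u, mid), (mid, v), (leaf, mid)]
            yield from extend(new_edges, rest)

    center = fresh()
    yield from extend([(leaf, center) for leaf in leaves[:3]], leaves[3:])

def tree_path(adj, start, goal):
    """Return the vertices on the unique path between two tree vertices."""
    parent = {start: None}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                stack.append(nxt)
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def displays_all(edges, quartets):
    """True if the tree displays every quartet ((a, b), (c, d)) in the list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return all(
        set(tree_path(adj, a, b)).isdisjoint(tree_path(adj, c, d))
        for (a, b), (c, d) in quartets
    )

def count_displaying_trees(leaves, quartets):
    """Count binary trees on the leaves that display all the quartets."""
    return sum(1 for edges in all_binary_trees(leaves) if displays_all(edges, quartets))

# Hypothetical instance on five taxa: ab|cd together with ac|de.
Q = [(("a", "b"), ("c", "d")), (("a", "c"), ("d", "e"))]
print(count_displaying_trees("abcde", Q))
```

If a ternary tree T displays Q, then T is the unique tree displaying Q exactly when this count is 1. The NP‑hardness result indicates that, unless P = NP, no approach avoids super-polynomial work on every input.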

Finally, the authors suggest several avenues for future work: (i) designing approximation schemes that can certify uniqueness with high probability; (ii) exploring parameterized complexity with respect to natural parameters like the number of taxa, the number of conflicting quartets, or the treewidth of an associated conflict graph; and (iii) developing empirical methods that combine quartet consistency checks with Bayesian or maximum‑likelihood frameworks to mitigate the hardness in realistic data sets. In summary, the paper settles an open complexity question in phylogenetics by proving that the unique perfect phylogeny problem is NP‑hard, thereby shaping the direction of algorithmic research in the field.

