Computing supertrees is a central problem in phylogenetics. The supertree method that is by far the most widely used today was introduced in 1992 and is called Matrix Representation with Parsimony analysis (MRP). Matrix Representation using Flipping (MRF), which was introduced in 2002, is an interesting variant of MRP: MRF is arguably more relevant than MRP, and various efficient implementations of MRF have been presented. From a theoretical point of view, implementing MRF or MRP is solving NP-hard optimization problems. The aim of this paper is to study the approximability and the fixed-parameter tractability of the optimization problem corresponding to MRF, namely Minimum-Flip Supertree. We prove strongly negative results.
Deep Dive into Intractability of the Minimum-Flip Supertree problem and its variants.
When studying the evolutionary relatedness of current taxa, the discovered relations are usually represented as rooted trees, called phylogenies. Phylogenies for various taxa sets are routinely inferred from various kinds of molecular and morphological data sets. A subsequent problem is computing supertrees [4], i.e., amalgamating phylogenies for non-identical but overlapping taxon sets to obtain more comprehensive phylogenies. Constructing supertrees is easy if no contradictory information is contained in the data [1]. However, incompatible input phylogenies are the rule rather than the exception in practice. The major problem for supertree methods is thus dealing with incompatibilities.
The supertree method that is by far the most widely used today was independently proposed by Baum [3] and Ragan [26] in 1992; it is called Matrix Representation with Parsimony analysis (MRP) [4]. From a theoretical point of view, implementing MRP is designing an algorithm for an NP-hard optimization problem [14,18], so the running times of MRP algorithms are sometimes prohibitive for large data sets.
In 2002, Chen et al. proposed a variant of MRP [11], which was later called Matrix Representation using Flipping (MRF) [9]. MRF is arguably more relevant than MRP [4] (see also [12,16]), and various efficient implementations of MRF have been presented [10,13,16]. However, as in the case of MRP, implementing MRF is designing an algorithm for an NP-hard optimization problem [12], namely Minimum-Flip Supertree. The aim of the present paper is to study the approximability and the fixed-parameter tractability [17] of Minimum-Flip Supertree. We prove strongly negative results.
Let S be a finite set. A (rooted) phylogeny for S is a subset T of the power set of S that satisfies the following properties: ∅ ∈ T, S ∈ T, {s} ∈ T for all s ∈ S, and X ∩ Y ∈ {∅, X, Y} for all X, Y ∈ T. The elements of S are the leaves of T. The elements of T are the clusters of T. The most natural representation of T is, of course, a rooted graph-theoretic tree with |T| − 1 nodes (the empty cluster does not correspond to any vertex).
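The cluster axioms above can be checked mechanically. The following is a minimal sketch, not taken from the paper; the function name `is_phylogeny` and the representation of clusters as Python sets are assumptions made for illustration only.

```python
from itertools import combinations


def is_phylogeny(S, T):
    """Return True if the set family T is a rooted phylogeny for leaf set S.

    Hypothetical helper: checks the axioms stated in the text, namely
    that the empty set, S, and every singleton are clusters, and that
    any two clusters are either disjoint or nested.
    """
    S = frozenset(S)
    clusters = {frozenset(X) for X in T}
    # Every cluster must be a subset of S.
    if any(not X <= S for X in clusters):
        return False
    # The empty set, S itself, and all singletons must be clusters.
    if frozenset() not in clusters or S not in clusters:
        return False
    if any(frozenset({s}) not in clusters for s in S):
        return False
    # X ∩ Y must be the empty set, X, or Y for all pairs of clusters.
    return all(
        (X & Y) in (frozenset(), X, Y) for X, Y in combinations(clusters, 2)
    )


S = {"a", "b", "c"}
T = [set(), S, {"a"}, {"b"}, {"c"}, {"a", "b"}]
print(is_phylogeny(S, T))                 # True: a tree with |T| - 1 = 5 nodes
print(is_phylogeny(S, T + [{"b", "c"}]))  # False: {a, b} and {b, c} overlap
```

In the second call, the clusters {a, b} and {b, c} intersect in {b}, which is neither empty nor one of the two clusters, so the family is not a phylogeny.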
Given two phylogenies T₁ and T₂ for S, T₁ is a subset of T₂ if, and only if, the graph representation of T₁ can be obtained from the graph representation of T₂ by contracting (internal) edges. If T₁ is a subset of T₂ and if we assume that hard polytomies never occur, then T₂ is at least as informative as T₁. Let M(G) denote the set of all quintuples (s, c, s … ∈ E, and (c′, s) ∉ E. The latter conditions state that the bipartite graph depicted in [25, Figure 4] is an induced subgraph of G. A perfect phylogeny for G is a phylogeny T for S such that N_G(c) is a cluster of T for every c ∈ C. We say that G is M-free [5, 11-13, 22] (or Σ-free [4, 9, 25]) if the following three equivalent conditions are met:
For each e ∈ C × S, the magnitude of F(e) is the edit cost of e in H. An edition of H is a bipartite graph G of the form G = (C, S, E) for some subset E ⊆ C × S. A conflict between G and H is an element e ∈ C × S that satisfies one of the following two conditions:
1. e is an edge of G and e is a non-edge of H, or
2. e is a non-edge of G and e is an edge of H.
The sum of the edit costs in H over all conflicts between G and H is denoted ∆(G, H): ∆(G, H) = Σ |F(e)|, where the sum ranges over all conflicts e between G and H.
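As an illustration, the cost ∆(G, H) can be computed in a few lines. This is a sketch under assumptions that the excerpt only implies: the weight function F is stored as a dictionary on C × S, a pair e is an edge of H when F(e) > 0, a non-edge when F(e) < 0, and a joker-edge when F(e) = 0. The name `edit_cost` is hypothetical.

```python
def edit_cost(F, E):
    """Sum of edit costs |F(e)| over all conflicts between G and H.

    F: dict mapping each pair (c, s) to an integer weight (the draft-graph H).
    E: set of pairs (c, s) that are edges of the edition G.
    Assumed sign convention: F(e) > 0 is an edge of H, F(e) < 0 a non-edge,
    and F(e) = 0 a joker-edge, which never causes a conflict.
    """
    total = 0
    for e, weight in F.items():
        in_G = e in E
        if (in_G and weight < 0) or (not in_G and weight > 0):
            total += abs(weight)
    return total


F = {("c1", "a"): 2, ("c1", "b"): -1, ("c2", "a"): 0, ("c2", "b"): 3}
E = {("c1", "a"), ("c1", "b"), ("c2", "a")}
print(edit_cost(F, E))  # 4: inserting (c1, b) costs 1, deleting (c2, b) costs 3
```

The joker-edge (c2, a) contributes nothing whether or not it belongs to G, which matches its edit cost of zero.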
The following minimization problem and its (parameterized) decision version generalize several previously studied problems:
Name: M-free Edition or Edit.
Input: A bipartite draft-graph H and an integer k ≥ 0.
Question: Is there an M-free edition G of H such that ∆(G, H) ≤ k?
Parameter: k.
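For very small instances, the decision problem can be explored by exhaustive search. The sketch below is not an algorithm from the paper; it enumerates all 2^(|C|·|S|) editions and is only meant to make the problem statement concrete. It assumes the sign convention that F(e) > 0 denotes an edge of H, F(e) < 0 a non-edge, and F(e) = 0 a joker-edge, and it tests M-freeness through the assumed characterization that the neighborhoods N_G(c) are pairwise disjoint or nested (which is what the existence of a perfect phylogeny requires). All function names are hypothetical.

```python
from itertools import combinations, product


def is_m_free(C, S, E):
    """Assumed test: neighborhoods of elements of C are pairwise disjoint or nested."""
    nbr = {c: frozenset(s for s in S if (c, s) in E) for c in C}
    return all(
        (nbr[c] & nbr[d]) in (frozenset(), nbr[c], nbr[d])
        for c, d in combinations(C, 2)
    )


def min_edit_brute_force(C, S, F):
    """Minimum cost of an M-free edition of H = (C, S, F), by enumeration."""
    pairs = [(c, s) for c in C for s in S]
    best = None
    for bits in product((False, True), repeat=len(pairs)):
        E = {e for e, keep in zip(pairs, bits) if keep}
        if not is_m_free(C, S, E):
            continue
        # Cost of E: sum |F(e)| over pairs where G and H disagree.
        cost = sum(
            abs(F[e])
            for e in pairs
            if (e in E and F[e] < 0) or (e not in E and F[e] > 0)
        )
        if best is None or cost < best:
            best = cost
    return best  # never None: the edgeless graph is always M-free


def edit(C, S, F, k):
    """Decision version: is there an M-free edition of cost at most k?"""
    return min_edit_brute_force(C, S, F) <= k


C, S = ["c1", "c2"], ["a", "b", "c"]
F = {("c1", "a"): 1, ("c1", "b"): 1, ("c1", "c"): -1,
     ("c2", "a"): -1, ("c2", "b"): 1, ("c2", "c"): 1}
print(edit(C, S, F, 0), edit(C, S, F, 1))  # False True
```

In this example the neighborhoods {a, b} and {b, c} overlap without being nested, so at least one flip is needed; deleting the edge (c1, b) gives the disjoint neighborhoods {a} and {b, c}.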
For each subset X ⊆ Z, define Min Edit-X as the restriction of Min Edit to those bipartite draft-graphs whose weight ranges are subsets of X, and similarly, define Edit-X as the restriction of Edit to those instances (H, k) such that the weight range of H is a subset of X. Notably, Min Edit-{-1, 0, +1} is the Minimum-Flip Supertree problem and its restriction Min Edit-{-1, +1} is the Minimum-Flip Consensus Tree problem [4, 5, 9-13, 16, 22].
Modeling. Incomplete and/or possibly erroneous character data sets are naturally modeled by bipartite draft-graphs: joker-edges represent missing data, and edit costs allow parsimonious error correction.
Supertrees. The most interesting feature of Min Edit is that it can be thought of as a supertree construction problem, and more precisely, as the optimization problem underlying MRF [4,9,11,12].
Min Edit-X has been studied for several subsets X ⊆ Z [4, 5, 9, 10, 12, 13, 16, 20-22, 25], sometimes implicitly. Let H = (C, S, F) be a bipartite draft-graph and let k be a nonnegative integer.
Put Z+ = {n ∈ Z : n ≥ 0} and Z− = {n ∈ Z : n ≤ 0}. If H has no non-edge, or equivalently, if the weight range of H is a subset of Z+, then the complete bipartite graph … Edit-I, Edit-D, and Edit-U are NP-complete [12].
Put Z* = Z \ {0}. Min Edit-Z* is the restriction of Min Edit to those bipartite draft-graphs that have no joker-edge. The most positive r
…(Full text truncated)…