Finding a Path is Harder than Finding a Tree
I consider the problem of learning an optimal path graphical model from data and show the problem to be NP-hard for the maximum likelihood and minimum description length approaches and a Bayesian approach. This hardness result holds despite the fact that the problem is a restriction of the polynomially solvable problem of finding the optimal tree graphical model.
💡 Research Summary
The paper investigates the computational difficulty of learning optimal directed graphical models when the admissible structures are restricted to paths, i.e., directed graphs in which exactly one node has indegree zero (the start of the path), every other node has indegree one, and no node has more than one child, so the edges form a single chain. While learning optimal tree structures (each node having at most one parent) is known to be solvable in polynomial time using classic algorithms such as Chow‑Liu or Edmonds’ optimum branching, the author shows that imposing the stricter “path” constraint makes the problem NP‑hard under three widely used scoring criteria: maximum likelihood (ML), minimum description length (MDL), and a Bayesian score with a uniform (uninformative) prior over both structures and parameters.
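The structural constraint above can be checked mechanically. The sketch below is a hypothetical helper (not from the paper) that tests whether a parent assignment defines a directed path: one root, every other node with indegree one, at most one child per node, and a single chain reaching all nodes.

```python
def is_path_structure(parents):
    """Check whether a parent map {node: parent or None} defines a
    directed path: exactly one root (indegree 0), every other node
    with indegree 1, and each node the parent of at most one child."""
    nodes = set(parents)
    roots = [v for v, p in parents.items() if p is None]
    if len(roots) != 1:
        return False
    child_count = {}
    for v, p in parents.items():
        if p is not None:
            if p not in nodes:
                return False                      # parent must be a node
            child_count[p] = child_count.get(p, 0) + 1
            if child_count[p] > 1:
                return False                      # branching => tree, not path
    # Connectivity: walking child links from the root must visit all nodes.
    child_of = {p: v for v, p in parents.items() if p is not None}
    seen, cur = {roots[0]}, roots[0]
    while cur in child_of:
        cur = child_of[cur]
        if cur in seen:                           # cycle guard
            return False
        seen.add(cur)
    return seen == nodes
```

For example, `{'a': None, 'b': 'a', 'c': 'b'}` is the path a → b → c, while `{'a': None, 'b': 'a', 'c': 'a'}` is a tree but not a path.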
The analysis begins by formalising the three scoring functions. All three decompose into a sum of local scores, each depending only on a variable and its parent set. For ML the local score is N times the negative empirical conditional entropy of the variable given its parents; MDL subtracts a penalty of (log N)/2 per free parameter, where N is the number of data cases; the Bayesian score uses a Dirichlet prior that yields a closed‑form marginal likelihood. The key observation is that, because the scores are additive, the global optimum can be expressed as a combinatorial optimisation over possible parent sets.
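As an illustration of this additive decomposition, here is a minimal Python sketch of the ML and MDL local scores for discrete data (the function names and bookkeeping are ours, not the paper's; the Bayesian score would follow the same pattern with Dirichlet terms):

```python
import math
from collections import Counter

def ml_local_score(child_col, parent_col=None):
    """ML local score: the maximised conditional log-likelihood of the
    child column given its (single) parent column, i.e. N times the
    negative empirical conditional entropy."""
    n = len(child_col)
    if parent_col is None:
        parent_col = [0] * n                      # empty parent set
    joint = Counter(zip(parent_col, child_col))
    marg = Counter(parent_col)
    return sum(c * math.log(c / marg[p]) for (p, x), c in joint.items())

def mdl_local_score(child_col, parent_col=None, arity=3):
    """MDL local score: ML score minus (log N)/2 per free parameter.
    With ternary variables, a node with one ternary parent has
    3 * (3 - 1) = 6 free parameters; a root has 2."""
    n = len(child_col)
    n_parent_states = arity if parent_col is not None else 1
    n_params = n_parent_states * (arity - 1)
    return ml_local_score(child_col, parent_col) - 0.5 * math.log(n) * n_params
```

A deterministic dependence gives an ML local score of zero (zero conditional entropy), while the MDL score for the same counts is strictly lower because of the parameter penalty.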
To prove hardness, the author reduces the Hamiltonian Path (HP) problem—known to be NP‑complete—to the decision version of the optimal‑path problem (OP): given a data set D and a threshold k, does there exist a directed path model whose score is at least k? For an arbitrary undirected graph G = (V,E) with |V| = n, the reduction constructs a data set over n ternary variables X₁,…,Xₙ. For each unordered pair {i,j} the construction adds eight cases; the exact pattern of 0/1/2 entries depends on whether {i,j}∈E. This yields a data set whose size is polynomial in n and satisfies five crucial properties: (i) all variables have identical marginal counts, (ii) local scores for empty parent sets are equal, (iii) the local score for a single parent is either a constant α (if the edge exists) or a strictly smaller constant (if it does not), (iv) the score is symmetric with respect to the direction of the edge, and (v) the total score of a path equals the common empty‑parent score plus (n − 1)·α precisely when the underlying undirected graph contains a Hamiltonian path visiting the variables in that order. Consequently, a Hamiltonian path exists in G iff the OP decision problem answers “yes” for the constructed D with k set to that value.
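The iff at the heart of the reduction can be sanity‑checked by brute force on toy graphs, using stylised constants in place of the scores the constructed data set would induce (the values ALPHA, BETA, and S_EMPTY below are assumed purely for illustration):

```python
from itertools import permutations

# Stylised local scores: ALPHA for an edge present in G, BETA < ALPHA
# otherwise, S_EMPTY for the root's empty parent set.
ALPHA, BETA, S_EMPTY = -1.0, -2.0, -3.0

def path_score(order, edges):
    """Score of the path order[0] -> order[1] -> ...: the root's
    empty-parent score plus one single-parent score per link."""
    s = S_EMPTY
    for u, v in zip(order, order[1:]):
        s += ALPHA if frozenset((u, v)) in edges else BETA
    return s

def has_hamiltonian_path(n, edges):
    """The maximum path score reaches S_EMPTY + (n-1)*ALPHA iff some
    ordering uses only edges of G, i.e. iff G has a Hamiltonian path."""
    target = S_EMPTY + (n - 1) * ALPHA
    return any(path_score(p, edges) >= target
               for p in permutations(range(n)))
```

On a 4‑node chain the threshold is reached; on a 4‑node star (which has no Hamiltonian path) every ordering pays at least one BETA and falls short.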
Because the reduction is polynomial, OP is at least as hard as HP, establishing NP‑hardness for the ML score. The same construction works for MDL, since the penalty term depends only on the number of parents and is identical for all single‑parent configurations, and for the Bayesian score, because the chosen uniform prior makes the marginal likelihood depend solely on the same local counts. Thus, all three scoring schemes inherit the NP‑hardness.
The paper also notes that the result extends to undirected path models, whose scoring functions are identical to the directed case when expressed as sums of local contributions. The author discusses practical implications: although optimal path learning is intractable, optimal tree learning remains easy, and the tree’s score provides an upper bound for any path. Consequently, heuristic approaches that first find the optimal tree (e.g., Chow‑Liu) and then search for a high‑scoring Hamiltonian path using algorithms such as Held‑Karp can yield useful approximations.
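The Held‑Karp step of such a heuristic can be sketched as a dynamic programme over subsets of visited nodes (O(n²·2ⁿ), so only viable for modest n). The weight convention below is an assumption for illustration: `weights[j][k]` stands for the single‑parent local score of the link j → k.

```python
def best_hamiltonian_path(weights):
    """Held-Karp dynamic programme for the highest-weight Hamiltonian
    path. dp[(mask, j)] is the best score of a path that covers exactly
    the nodes in `mask` and ends at node j."""
    n = len(weights)
    NEG = float("-inf")
    dp = {(1 << j, j): 0.0 for j in range(n)}     # single-node paths
    for mask in range(1, 1 << n):
        for j in range(n):
            if not mask & (1 << j) or (mask, j) not in dp:
                continue
            for k in range(n):                    # extend path to node k
                if mask & (1 << k):
                    continue
                nm = mask | (1 << k)
                cand = dp[(mask, j)] + weights[j][k]
                if cand > dp.get((nm, k), NEG):
                    dp[(nm, k)] = cand
    full = (1 << n) - 1
    return max(dp[(full, j)] for j in range(n) if (full, j) in dp)
```

On three nodes with symmetric weights `[[0, 5, 1], [5, 0, 2], [1, 2, 0]]`, the best path 0 → 1 → 2 scores 5 + 2 = 7.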
In conclusion, the work demonstrates a striking contrast: restricting the model class from trees to paths—seemingly a simplification—actually makes the learning problem computationally harder. This insight cautions researchers against assuming that narrower model families are automatically easier to learn and motivates future work on identifying broader yet tractable subclasses (e.g., polytrees, bounded‑degree networks) where optimal structure discovery remains feasible.