Computing Geodesic Distances in Tree Space

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

We present two algorithms for computing the geodesic distance between phylogenetic trees in tree space, as introduced by Billera, Holmes, and Vogtmann (2001). We show that the possible combinatorial types of shortest paths between two trees can be compactly represented by a partially ordered set. We calculate the shortest distance along each candidate path by converting the problem into one of finding the shortest path through a certain region of Euclidean space. In particular, we show there is a linear-time algorithm for finding the shortest path between a point in the all-positive orthant and a point in the all-negative orthant of ℝ^k contained in the subspace of ℝ^k consisting of all orthants with the first i coordinates non-positive and the remaining coordinates non-negative, for 0 ≤ i ≤ k.

💡 Research Summary

The paper tackles the problem of computing the geodesic distance between two phylogenetic trees in the Billera‑Holmes‑Vogtmann (BHV) tree space, a non‑Euclidean space formed by gluing together orthants that correspond to different tree topologies. In this space, each orthant is a Euclidean cone whose coordinates are the lengths of the splits (edges) present in a particular tree. The geodesic between two trees is the shortest path that moves from the orthant of the first tree to that of the second while possibly crossing a sequence of orthants where splits are added or removed. Prior algorithms either enumerated all possible orthant‑transition sequences (exponential in the number of splits) or relied on high‑dimensional convex optimisation, making them impractical for large trees.
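To make the orthant picture concrete, here is a minimal Python sketch of the two easy cases. It assumes a hypothetical representation in which a tree is a dict mapping each interior split (a frozenset of leaf labels) to its positive length; the function names are illustrative and not taken from the paper. When two trees share a topology they lie in the same orthant, so the geodesic is a straight segment. For any two trees, the "cone path" that shrinks every split to zero and then grows the splits of the second tree is always a valid path, so its length is an upper bound on the geodesic distance.

```python
import math

def same_orthant_distance(tree1, tree2):
    """Euclidean distance between two trees with identical split sets.

    Each tree is a dict {split: length}; this representation is an
    assumption made for illustration only.
    """
    if tree1.keys() != tree2.keys():
        raise ValueError("trees must have the same splits to share an orthant")
    return math.sqrt(sum((tree1[s] - tree2[s]) ** 2 for s in tree1))

def cone_path_length(tree1, tree2):
    """Length of the path through the origin (the star tree).

    This is ||A|| + ||B||, an upper bound on the geodesic distance.
    """
    norm1 = math.sqrt(sum(x * x for x in tree1.values()))
    norm2 = math.sqrt(sum(x * x for x in tree2.values()))
    return norm1 + norm2

# Hypothetical trees on leaves a..e, described by their interior splits.
t1 = {frozenset("ab"): 3.0, frozenset("abc"): 4.0}
t2 = {frozenset("ab"): 6.0, frozenset("abc"): 8.0}   # same topology as t1
t3 = {frozenset("bc"): 6.0, frozenset("de"): 8.0}    # different topology

d_same = same_orthant_distance(t1, t2)
d_cone = cone_path_length(t1, t3)
```

The interesting cases lie between these two extremes, where the geodesic crosses several orthants without passing through the origin.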

The authors introduce two new algorithms that dramatically reduce the computational burden. Their first contribution is a combinatorial representation of all possible shortest‑path types. By examining the split sets of the two input trees, they construct a partially ordered set (poset) whose elements correspond to individual split‑addition or split‑deletion events. A chain in this poset encodes a feasible order in which splits can appear or disappear along a geodesic. Crucially, they prove that every geodesic must follow a chain of this poset, so the poset captures the entire search space without redundancy.
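The order in which splits can appear or disappear is governed by which splits can coexist in a single tree. The sketch below shows only that underlying compatibility test, not the authors' poset construction. It relies on the standard criterion that two splits A|A′ and B|B′ of the same leaf set are compatible exactly when at least one of the four intersections A∩B, A∩B′, A′∩B, A′∩B′ is empty. The function names and the encoding of a split by one of its sides are assumptions made for illustration.

```python
from itertools import product

def is_compatible(split_a, split_b, leaves):
    """Return True if the two splits can occur in the same tree.

    A split is given by one of its sides (a set of leaf labels);
    the other side is its complement within `leaves`.
    """
    leaves = frozenset(leaves)
    a, b = frozenset(split_a), frozenset(split_b)
    sides_a = (a, leaves - a)
    sides_b = (b, leaves - b)
    # Compatible iff some pair of sides is disjoint.
    return any(not (x & y) for x, y in product(sides_a, sides_b))

def incompatible_pairs(splits1, splits2, leaves):
    """List the (s, t) pairs, s from tree 1 and t from tree 2, that conflict."""
    return [(s, t) for s in splits1 for t in splits2
            if not is_compatible(s, t, leaves)]

leaves = "abcde"
nested = is_compatible("ab", "abc", leaves)      # {a,b} sits inside {a,b,c}
crossing = is_compatible("ab", "bc", leaves)     # all four intersections non-empty
conflicts = incompatible_pairs(["ab", "abc"], ["bc", "de"], leaves)
```

A split of the second tree can only be added once every split of the first tree that conflicts with it has been removed, which is the kind of ordering constraint a chain has to respect.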

Given a candidate chain, the second contribution translates the geometric problem into a Euclidean shortest‑path problem with sign constraints. The length vectors of the two trees become points A (all coordinates non‑negative) and B (all coordinates non‑positive) in ℝ^k, where k is the total number of distinct splits across both trees. The admissible region consists of all orthants in which the first i coordinates are non‑positive and the remaining k‑i coordinates are non‑negative, for some i ∈ {0, 1, …, k}.
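A small Python sketch can illustrate the sign-constrained problem. Write A = (a_1, …, a_k) and B = (−b_1, …, −b_k) with all a_j, b_j > 0. Coordinate j must reach zero no later than coordinate j+1, and on the straight segment from A to B this holds exactly when the ratios a_j/b_j are non-decreasing in j. For k = 2 one can check by hand that an out-of-order pair forces the path through the origin, giving length ‖A‖ + ‖B‖. The sketch generalises this by pooling adjacent coordinates whose ratios are out of order into a single block. That pooling rule is an assumption of this illustration rather than a procedure quoted from the paper, and the function name is hypothetical.

```python
import math

def constrained_path_length(a, b):
    """Shortest-path length from A=(a_j) to B=(-b_j) under the sign constraints.

    Assumes all a_j, b_j > 0. Adjacent coordinates whose ratios a/b are
    out of order are pooled into one block (an assumed rule, see lead-in).
    Each coordinate is pushed once and merged at most once, so the loop
    runs in linear time overall.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    blocks = []  # each block stores [sum of a_j^2, sum of b_j^2]
    for aj, bj in zip(a, b):
        if aj <= 0 or bj <= 0:
            raise ValueError("all entries must be strictly positive")
        blocks.append([aj * aj, bj * bj])
        while len(blocks) >= 2:
            (pa, pb), (qa, qb) = blocks[-2], blocks[-1]
            # Ratios are in order when pa/pb <= qa/qb (compared without dividing).
            if pa * qb <= qa * pb:
                break
            blocks[-2:] = [[pa + qa, pb + qb]]
    # Each block contributes (||A_block|| + ||B_block||)^2 to the squared length.
    return math.sqrt(sum((math.sqrt(sa) + math.sqrt(sb)) ** 2 for sa, sb in blocks))

straight = constrained_path_length([1.0, 2.0], [2.0, 1.0])   # ratios 0.5, 2: in order
bent = constrained_path_length([2.0, 1.0], [1.0, 2.0])       # ratios 2, 0.5: pooled
```

When the ratios are already in order, the result is the ordinary Euclidean distance between A and B. When every adjacent pair is out of order, all coordinates pool into one block and the result is ‖A‖ + ‖B‖, the length of the path through the origin.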
